Network Working Group                                      S. Midtskogen
Internet-Draft                                               A. Fuldseth
Intended status: Standards Track                                      M.
                                                                  Zanaty
Expires: September 19, 2016                                        Cisco
                                                          March 18, 2016


                      Constrained Low Pass Filter
                     draft-midtskogen-netvc-clpf-01

Abstract

   This document describes a low complexity filtering technique which is
   being used as a low pass loop filter in the Thor video codec.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 19, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
     2.1.  Requirements Language
     2.2.  Terminology
   3.
       Filtering Process
   4.  Complexity considerations
   5.  Performance
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  Normative References
   Authors' Addresses

1.  Introduction

   Modern video coding standards such as Thor [I-D.fuldseth-netvc-thor]
   include in-loop filters which correct artifacts introduced in the
   encoding process.  Thor includes a deblocking filter which corrects
   artifacts introduced by the block-based nature of the encoding
   process, and a low pass filter which corrects artifacts not corrected
   by the deblocking filter, in particular artifacts introduced by
   quantisation errors of transform coefficients and by the
   interpolation filter.  Since in-loop filters have to be applied in
   both the encoder and decoder, it is highly desirable that these
   filters have low computational complexity.

2.  Definitions

2.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.2.  Terminology

   This document will refer to a pixel X and six of its neighbouring
   pixels A, B, C, D, E, F ordered in the following pattern.

                         +---+---+---+---+---+
                         |   |   | A |   |   |
                         +---+---+---+---+---+
                         | B | C | X | D | E |
                         +---+---+---+---+---+
                         |   |   | F |   |   |
                         +---+---+---+---+---+

                    Figure 1: Filter pixel positions

   In Thor the frames are divided into filter blocks (FB) of 128x128
   pixels and each FB can be divided into a quadtree of coding blocks
   (CB) which can range from 8x8 to 128x128.  The filter described in
   this draft can be switched on or off for the entire frame or
   optionally on or off for each FB.  CBs that have been coded using
   the skip mode are not filtered, and if all CBs within an FB have
   been coded in skip mode, the FB will not be filtered and no signal
   will be transmitted to indicate this.

3.  Filtering Process

   Given a pixel X and its neighbouring pixels described above we can
   define a general non-linear filter as:

   X' = X + clip(a*clip(A-X,-s,s) + b*clip(B-X,-s,s) + c*clip(C-X,-s,s) +
            d*clip(D-X,-s,s) + e*clip(E-X,-s,s) + f*clip(F-X,-s,s),-g,g)

                          Figure 2: Equation 1

   If a neighbour pixel is outside the image frame, it is given the same
   value as the closest pixel within the frame.  To avoid dependencies
   prohibiting parallel processing, all neighbour pixels must be the
   unfiltered pixels of the frame being filtered.

   Experiments in Thor have shown that a good compromise between
   complexity and performance is a=f=0.25, b=e=0.0625, c=d=0.1875, with
   the filter strength s being 1, 2 or 4, signalled at frame level.
   These values eliminate the need for the outer clipping to +/-g.  The
   rounding is to the nearest integer.

   Multiplying these weights by 16 gives the integer weights 4, 1 and 3,
   which gives us the equation:

   X' = X + (4*clip(A-X,-s,s) + clip(B-X,-s,s) + 3*clip(C-X,-s,s) +
             3*clip(D-X,-s,s) + clip(E-X,-s,s) + 4*clip(F-X,-s,s)) / 16

                          Figure 3: Equation 2

   It can be noted that a=c=d=f=0.25, b=e=0 and s=1 give a slightly
   simpler filter which is very similar to the one described in the
   first version of this draft.
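
   As an illustration of Equation 2 together with the border rule, the
   following is a minimal sketch only, not the Thor implementation.  It
   assumes a single 8-bit plane stored row by row with a given stride.
   The names clpf_plane_sketch and clamp_coord are hypothetical, and
   skipped CBs and per-FB switching are not modelled.  Ties in the
   rounding are taken away from zero, which is an assumption chosen to
   match the C function in Section 4.

```c
#include <assert.h>
#include <stdint.h>

/* Limit x to the range [lo, hi]. */
static int clip(int x, int lo, int hi)
{
  return x < lo ? lo : (x > hi ? hi : x);
}

/* Hypothetical helper: clamp a coordinate to [0, max], so that a
   neighbour outside the frame takes the value of the closest pixel
   inside the frame. */
static int clamp_coord(int v, int max)
{
  return v < 0 ? 0 : (v > max ? max : v);
}

/* Hypothetical sketch: apply Equation 2 to every pixel of one plane.
   src holds the unfiltered pixels and is never written, so no pixel
   depends on an already filtered neighbour. */
void clpf_plane_sketch(const uint8_t *src, uint8_t *dst,
                       int width, int height, int stride, int s)
{
  assert(width > 0 && height > 0 && stride >= width);
  for (int y = 0; y < height; y++) {
    int ya = clamp_coord(y - 1, height - 1);   /* row of A */
    int yf = clamp_coord(y + 1, height - 1);   /* row of F */
    for (int x = 0; x < width; x++) {
      int X = src[y * stride + x];
      int A = src[ya * stride + x];
      int B = src[y * stride + clamp_coord(x - 2, width - 1)];
      int C = src[y * stride + clamp_coord(x - 1, width - 1)];
      int D = src[y * stride + clamp_coord(x + 1, width - 1)];
      int E = src[y * stride + clamp_coord(x + 2, width - 1)];
      int F = src[yf * stride + x];
      int delta =
        4*clip(A - X, -s, s) + clip(B - X, -s, s) + 3*clip(C - X, -s, s) +
        3*clip(D - X, -s, s) + clip(E - X, -s, s) + 4*clip(F - X, -s, s);
      /* Round delta/16 to the nearest integer, ties away from zero. */
      int corr = delta >= 0 ? (delta + 8) / 16 : -((8 - delta) / 16);
      /* The weights sum to 16 and each clipped difference never
         overshoots its neighbour, so X + corr stays within 0..255. */
      dst[y * stride + x] = (uint8_t)(X + corr);
    }
  }
}
```

   Writing to a separate destination buffer is one simple way to honour
   the requirement that all neighbour pixels are unfiltered; an in-place
   variant would need to keep copies of the rows and columns it
   overwrites.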

   The filter leaves the encoder seven different choices for a frame:

   1.  The frame is not filtered.

   2.  The frame is filtered with s=1 and all non-skip CBs are
       filtered.

   3.  The frame is filtered with s=2 and all non-skip CBs are
       filtered.

   4.  The frame is filtered with s=4 and all non-skip CBs are
       filtered.

   5.  The frame is filtered with s=1 and one bit per FB is sent to
       indicate whether all non-skip CBs in the FB must be filtered.

   6.  The frame is filtered with s=2 and one bit per FB is sent to
       indicate whether all non-skip CBs in the FB must be filtered.

   7.  The frame is filtered with s=4 and one bit per FB is sent to
       indicate whether all non-skip CBs in the FB must be filtered.

   The decisions at both frame level and FB level may be based on
   rate-distortion optimisation (RDO), but an encoder running in a
   low-complexity mode, or possibly a low-delay mode, may instead assume
   that a fixed mode will be beneficial.  In general, using s=2 and RDO
   only at the FB level gives good results.  Applying the filter to all
   non-skip CBs with no RDO at either frame level or FB level gives a
   poorer result, and will not infrequently lower the PSNR of the frame,
   in particular if the frame already had near lossless quality.

   However, because of the low complexity of the filter, fully RDO based
   decisions are not costly.  The distortion of the six configurations
   of the filter can easily be computed in a single pass.

   The filter is applied after the deblocking filter.

4.  Complexity considerations

   The filter has been designed to offer the best compromise between low
   complexity and performance.
   A single pixel can be filtered with simple operations as illustrated
   by this C function:

   /* clip(x, lo, hi) limits x to the range [lo, hi]. */
   /* Returns the rounded correction that Equation 2 adds to X. */
   int clpf_sample(int X, int A, int B, int C, int D, int E, int F,
                   int s)
   {
     int delta =
       4*clip(A - X, -s, s) + clip(B - X, -s, s) + 3*clip(C - X, -s, s) +
       3*clip(D - X, -s, s) + clip(E - X, -s, s) + 4*clip(F - X, -s, s);
     return (8 + delta - (delta < 0)) >> 4;
   }

                            Figure 4: C code

   Also, these operations are easily vectorised in architectures
   supporting SIMD instructions, such as x86/SSE4 and ARM/NEON.  The
   pixel difference needs 9 bits, but it can be computed by adding an
   8 bit offset and using 8 bit saturated signed subtraction.  This
   means that 16 pixels per core can be filtered in parallel on these
   architectures.  Clipping at frame borders can be implemented using
   shuffle instructions.

   A C implementation using x86/SSE4 intrinsics required 6.8
   instructions per pixel to filter a single 8x8 block.  The
   corresponding number for ARM/NEON (armv7) was 4.9.  The compiler was
   gcc 4.8.4 in both cases.

   Since the filter only needs to look up pixels in the line directly
   above and below the pixel to be filtered, the line buffer requirement
   in hardware implementations is very low.

5.  Performance

   The table below shows the filter's effect on the bandwidth for a
   selection of 10 second video sequences encoded in Thor with
   uni-prediction only.  The numbers have been computed using the
   Bjontegaard Delta Rate (BDR).  BDR-low and BDR-high indicate the
   effect at low and high bitrates respectively.  The effect of the
   filter was tested in two encoder configurations: high complexity, in
   which the encoder strongly favours compression efficiency over CPU
   usage, and medium complexity, which is more suited for real-time
   applications.  The bandwidth reduction is somewhat less in the high
   complexity configuration.

   +----------------+--------------------+--------------------+
   |                | MEDIUM COMPLEXITY  |  HIGH COMPLEXITY   |
   +----------------+------+------+------+------+------+------+
   |                |      | BDR- | BDR- |      | BDR- | BDR- |
   |Sequence        | BDR  | low  | high | BDR  | low  | high |
   +----------------+------+------+------+------+------+------+
   |Kimono          | -2.6%| -2.3%| -3.1%| -1.8%| -1.9%| -1.7%|
   |BasketballDrive | -3.1%| -2.5%| -4.0%| -2.0%| -1.7%| -2.5%|
   |BQTerrace       | -7.0%| -4.9%| -8.4%| -5.1%| -3.6%| -6.0%|
   |FourPeople      | -5.5%| -3.9%| -7.9%| -3.7%| -2.6%| -5.3%|
   |Johnny          | -5.4%| -3.9%| -8.0%| -3.9%| -3.3%| -5.0%|
   |ChangeSeats     | -6.3%| -3.6%|-10.5%| -4.3%| -2.8%| -6.4%|
   |HeadAndShoulder | -7.9%| -2.8%|-16.6%| -5.3%| -2.5%| -9.4%|
   |TelePresence    | -5.9%| -3.3%|-10.2%| -4.0%| -2.2%| -6.6%|
   +----------------+------+------+------+------+------+------+
   |Average         | -5.5%| -3.4%| -8.6%| -3.8%| -2.6%| -5.4%|
   +----------------+------+------+------+------+------+------+

         Figure 5: Compression Performance without Biprediction

   Whilst the filter objectively performs better at relatively high
   bitrates, the subjective effect seems better at relatively low
   bitrates, and overall the subjective effect seems better than what
   the objective numbers suggest.

   If bi-prediction is allowed, there is generally less bandwidth
   reduction, as the table below shows.

   +----------------+--------------------+--------------------+
   |                | MEDIUM COMPLEXITY  |  HIGH COMPLEXITY   |
   +----------------+------+------+------+------+------+------+
   |                |      | BDR- | BDR- |      | BDR- | BDR- |
   |Sequence        | BDR  | low  | high | BDR  | low  | high |
   +----------------+------+------+------+------+------+------+
   |Kimono          | -2.1%| -2.0%| -2.4%| -1.3%| -1.4%| -1.3%|
   |BasketballDrive | -2.4%| -2.6%| -2.2%| -1.4%| -1.7%| -0.9%|
   |BQTerrace       | -3.7%| -3.2%| -3.9%| -2.4%| -2.5%| -2.0%|
   |FourPeople      | -3.9%| -2.9%| -5.1%| -2.5%| -2.2%| -2.8%|
   |Johnny          | -3.4%| -3.2%| -4.0%| -2.2%| -1.7%| -2.7%|
   |ChangeSeats     | -4.2%| -3.2%| -5.7%| -2.6%| -2.2%| -2.9%|
   |HeadAndShoulder | -3.9%| -3.0%| -5.4%| -2.4%| -2.1%| -2.7%|
   |TelePresence    | -2.6%| -2.0%| -3.6%| -1.5%| -1.1%| -2.1%|
   +----------------+------+------+------+------+------+------+
   |Average         | -3.3%| -2.8%| -4.0%| -2.0%| -1.9%| -2.2%|
   +----------------+------+------+------+------+------+------+

           Figure 6: Compression Performance with Biprediction

6.  IANA Considerations

   This document has no IANA considerations yet.  TBD

7.  Security Considerations

   This document has no security considerations yet.  TBD

8.  Acknowledgements

   The authors would like to thank Gisle Bjontegaard for reviewing this
   document and design, and providing constructive feedback and
   direction.

9.  Normative References

   [I-D.fuldseth-netvc-thor]
              Fuldseth, A., Bjontegaard, G., Midtskogen, S., Davies, T.,
              and M. Zanaty, "Thor Video Codec",
              draft-fuldseth-netvc-thor-01 (work in progress),
              October 2015.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

Authors' Addresses

   Steinar Midtskogen
   Cisco
   Lysaker
   Norway

   Email: stemidts@cisco.com


   Arild Fuldseth
   Cisco
   Lysaker
   Norway

   Email: arilfuld@cisco.com


   Mo Zanaty
   Cisco
   RTP, NC
   USA

   Email: mzanaty@cisco.com