Network Working Group                                      S. Midtskogen
Internet-Draft                                               A. Fuldseth
Intended status: Standards Track                               M. Zanaty
Expires: September 11, 2017                                        Cisco
                                                          March 10, 2017

                      Constrained Low Pass Filter
                     draft-midtskogen-netvc-clpf-04

Abstract

   This document describes a low complexity filtering technique which is
   being used as a low pass loop filter in the Thor video codec.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 11, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
     2.1.  Requirements Language
     2.2.  Terminology
   3.  Filtering Process
   4.  Further complexity considerations
   5.  Performance
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   Modern video coding standards such as Thor [I-D.fuldseth-netvc-thor]
   include in-loop filters which correct artifacts introduced in the
   encoding process.  Thor includes a deblocking filter which corrects
   artifacts introduced by the block based nature of the encoding
   process, and a low pass filter correcting artifacts not corrected by
   the deblocking filter, in particular artifacts introduced by
   quantisation errors of transform coefficients and by the
   interpolation filter.  Since in-loop filters have to be applied in
   both the encoder and decoder, it is highly desirable that these
   filters have low computational complexity.

2.  Definitions

2.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.2.  Terminology

   This document will refer to a pixel X and eight of its neighbouring
   pixels A - H ordered in the following pattern.

                         +---+---+---+---+---+
                         |   |   | A |   |   |
                         +---+---+---+---+---+
                         |   |   | B |   |   |
                         +---+---+---+---+---+
                         | C | D | X | E | F |
                         +---+---+---+---+---+
                         |   |   | G |   |   |
                         +---+---+---+---+---+
                         |   |   | H |   |   |
                         +---+---+---+---+---+

                    Figure 1: Filter pixel positions

   In Thor the frames are divided into filter blocks (FB) of 128x128,
   64x64 or 32x32 pixels; the FB size is signalled for each frame to be
   filtered.  Also, each frame is divided into coding blocks (CB) which
   range from 8x8 to 128x128 independent of the FB size.
   The filter described in this draft can be switched on or off for the
   entire frame or optionally on or off for each FB.  CBs that have been
   coded using the skip mode are not filtered, and if an FB only
   contains CBs that have been coded in skip mode, the FB will not be
   filtered and no signal will be transmitted for this FB.

   If the frame can't fit a whole number of FBs, the FBs at the right
   and bottom edges are clipped to fit.  For instance, if the frame
   resolution is 1920x1080 and the FB size is 128x128, the size of the
   FBs at the bottom of the frame becomes 128x56.

3.  Filtering Process

   Given a pixel X and its neighbouring pixels described above we can
   define a general non-linear filter as:

   X' = X + a*constrain(A-X) + b*constrain(B-X) +
            c*constrain(C-X) + d*constrain(D-X) +
            e*constrain(E-X) + f*constrain(F-X) +
            g*constrain(G-X) + h*constrain(H-X)

                          Figure 2: Equation 1

   where constrain(x) is a function limiting the range of x.

   If a neighbour pixel is outside the image frame, it is given the same
   value as the closest pixel within the frame.  To avoid dependencies
   prohibiting parallel processing, all neighbour pixels must be the
   unfiltered pixels of the frame being filtered.

   Experiments in Thor have shown that a good compromise between
   complexity and performance is a=c=f=h=1/16, b=d=e=g=3/16, and a good
   constrain function has been found to be:

   constrain(x, s, d) =
       sign(x) * max(0, abs(x) - max(0, abs(x) - s +
                                        (abs(x) >> (d - log2(s)))))

                          Figure 3: Equation 2

   where sign(x) returns 1 or -1 if x is positive or negative
   respectively, s denotes the strength of the filter by which x will be
   clipped, and d further constrains the range of x so that the output
   of the function will linearly approach 0 as abs(x) approaches 2^d.
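
   As an illustration, the constrain function of Equation 2 and the
   general filter of Equation 1 with the weights quoted above might be
   sketched in Python as follows.  This is a minimal sketch, not the
   Thor implementation: the names sign, constrain, WEIGHTS and
   filter_pixel are hypothetical, s is assumed to be a power of two so
   that log2(s) is an integer, d is simply taken as an input, and the
   rounding of the division by 16 (to nearest, ties away from zero) is
   an assumption.

```python
def sign(x):
    """Return -1 for negative x and 1 otherwise (sign(0) is irrelevant here)."""
    return -1 if x < 0 else 1


def constrain(x, s, d):
    """Equation 2. Assumes s is a power of two and d >= log2(s)."""
    log2_s = s.bit_length() - 1          # exact log2 for a power of two
    ax = abs(x)
    return sign(x) * max(0, ax - max(0, ax - s + (ax >> (d - log2_s))))


# Weights in sixteenths, from a=c=f=h=1/16 and b=d=e=g=3/16.
WEIGHTS = {"A": 1, "B": 3, "C": 1, "D": 3, "E": 3, "F": 1, "G": 3, "H": 1}


def filter_pixel(x, neighbours, s, d):
    """Equation 1 for one pixel; neighbours maps 'A'..'H' to unfiltered values."""
    total = sum(w * constrain(neighbours[k] - x, s, d)
                for k, w in WEIGHTS.items())
    # Assumption: divide by 16, rounding to nearest, ties away from zero.
    rounded = (abs(total) + 8) >> 4
    return x + (rounded if total >= 0 else -rounded)


if __name__ == "__main__":
    flat = {k: 102 for k in WEIGHTS}     # neighbours slightly brighter than X
    far = {k: 200 for k in WEIGHTS}      # neighbours far away from X
    print(filter_pixel(100, flat, s=2, d=4))
    print(filter_pixel(100, far, s=2, d=4))
```

   With s=2 and d=4, a pixel of value 100 whose neighbours are all 102
   is moved to 102, whereas neighbours of value 200 leave it unchanged,
   because differences of 2^d or more contribute nothing.
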
   d depends on the frame quality (qp) and is computed as

   d = bitdepth - 4 + qp/16

                          Figure 4: Equation 4

   for the luma plane and

   d = bitdepth - 5 + qp/16

                          Figure 5: Equation 5

   for the chroma planes.

   The constrain function can be visualised as follows

     s                               ----
                                    /    ----
                                   /         ----
     0  ------------              /              ------------
                    ----         /
                        ----    /
    -s                      ----
                 -2^d        -s   0  s         2^d

                           Figure 6: Graph 1

   The filter strength s, signalled at frame level, can be 1, 2 or 4
   when the bitdepth is 8.  The strengths are scaled according to the
   bitdepth, so they become 4, 8 and 16 when the bitdepth is 10, and 16,
   32 and 64 when the bitdepth is 12.  The rounding is to the nearest
   integer.

   This gives us the equation:

   X' = X + (1*constrain(A-X, s, d) + 3*constrain(B-X, s, d) +
             1*constrain(C-X, s, d) + 3*constrain(D-X, s, d) +
             3*constrain(E-X, s, d) + 1*constrain(F-X, s, d) +
             3*constrain(G-X, s, d) + 1*constrain(H-X, s, d)) / 16

                          Figure 7: Equation 6

   The filter leaves the encoder 13 different choices for a frame.  The
   filter can be disabled for the entire frame, or the frame can be
   filtered using any distinct combination of strength (1, 2 or 4 scaled
   for bitdepth), non-skip FB signal (enabled/disabled) and FB size
   (32x32, 64x64 or 128x128).  Since the FB size only matters when FB
   signalling is in use, each strength has four variants (one without
   FB signalling and three with it), giving 1 + 3 * 4 = 13 choices.

   The decisions at both frame level and FB level may be based on
   rate-distortion optimisation (RDO), but an encoder running in a
   low-complexity mode, or possibly a low-delay mode, may instead assume
   that a fixed mode will be beneficial.  In general, using s=2, a QP
   dependent FB size and RDO only at the FB level gives good results.

   However, because of the low complexity of the filter, fully RDO based
   decisions are not costly.
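
   The parameter derivation and the count of 13 frame-level choices
   described above can be checked with a short Python sketch.  The
   function names are hypothetical, qp/16 is assumed to be an integer
   division, and the configuration tuples are only an illustrative
   encoding, not a bitstream syntax.

```python
def scaled_strength(base, bitdepth):
    """Scale a base strength (1, 2 or 4 at 8 bits) to the given bitdepth."""
    return base << (bitdepth - 8)


def damping(bitdepth, qp, is_luma=True):
    """Parameter d of the constrain function (Equations 4 and 5).

    Assumption: qp/16 is an integer (floor) division.
    """
    return bitdepth - (4 if is_luma else 5) + qp // 16


def frame_configurations():
    """Enumerate the choices an encoder has for one frame."""
    configs = [("off",)]
    for strength in (1, 2, 4):
        # Whole frame filtered, no per-FB signal, so the FB size is irrelevant.
        configs.append(("on", strength, "no FB signalling"))
        # Per-FB signal enabled, one entry per FB size.
        for fb_size in (32, 64, 128):
            configs.append(("on", strength, "FB signalling", fb_size))
    return configs


if __name__ == "__main__":
    print([scaled_strength(b, 10) for b in (1, 2, 4)])
    print(damping(8, 32), damping(8, 32, is_luma=False))
    print(len(frame_configurations()))
```

   An RDO based encoder would evaluate the distortion and bit cost of
   each enumerated entry and keep the cheapest one.
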
   The distortion of the 13 configurations of the filter can easily be
   computed in a single pass by keeping track of the distortions of the
   three different strengths and the bit costs for different FB sizes.

   The filter is applied after the deblocking filter.

4.  Further complexity considerations

   The filter has been designed to offer the best compromise between low
   complexity and performance.  All operations are easily vectorised
   with SIMD instructions and if the video input is 8 bit, all SIMD
   operations can have 8 bit lanes in architectures such as x86/SSE4 and
   ARM/NEON.  Clipping at frame borders can be implemented using shuffle
   instructions.

5.  Performance

   The table below shows the filter's effect on the bandwidth for a
   selection of 10 second video sequences encoded in Thor with
   uni-prediction only.  The numbers have been computed using the
   Bjontegaard Delta Rate (BDR).  BDR-low and BDR-high indicate the
   effect at low and high bitrates, respectively, as described in
   [BDR].

   The effect of the filter was tested in two encoder low-delay
   configurations: high complexity in which the encoder strongly favours
   compression efficiency over CPU usage, and medium complexity which is
   more suited for real-time applications.  The bandwidth reduction is
   somewhat less in the high complexity configuration.

   +----------------+--------------------+--------------------+
   |                | MEDIUM COMPLEXITY  |  HIGH COMPLEXITY   |
   +----------------+------+------+------+------+------+------+
   |                |      | BDR- | BDR- |      | BDR- | BDR- |
   |Sequence        | BDR  | low  | high | BDR  | low  | high |
   +----------------+------+------+------+------+------+------+
   |Kimono          | -2.6%| -2.3%| -3.2%| -1.7%| -1.7%| -1.9%|
   |BasketballDrive | -3.9%| -3.1%| -5.0%| -2.8%| -2.4%| -3.5%|
   |BQTerrace       | -7.4%| -4.0%|-10.2%| -4.8%| -2.4%| -6.8%|
   |FourPeople      | -5.5%| -3.9%| -8.2%| -3.8%| -2.9%| -5.1%|
   |Johnny          | -5.2%| -3.5%| -8.2%| -3.3%| -2.7%| -4.5%|
   |ChangeSeats     | -6.4%| -3.9%|-10.2%| -4.7%| -2.6%| -6.9%|
   |HeadAndShoulder | -9.3%| -3.4%|-19.6%| -6.2%| -2.6%|-11.8%|
   |TelePresence    | -5.8%| -3.6%|-10.0%| -4.5%| -3.3%| -6.6%|
   +----------------+------+------+------+------+------+------+
   |Average         | -5.8%| -3.5%| -9.3%| -4.0%| -2.6%| -5.9%|
   +----------------+------+------+------+------+------+------+

         Figure 8: Compression Performance without Biprediction

   While the filter objectively performs better at relatively high
   bitrates, the subjective effect seems better at relatively low
   bitrates, and overall the subjective effect seems better than what
   the objective numbers suggest.

   If biprediction is allowed, there is generally less bandwidth
   reduction as the table below shows.  These results reflect low-delay
   biprediction without frame reordering.

   +----------------+--------------------+--------------------+
   |                | MEDIUM COMPLEXITY  |  HIGH COMPLEXITY   |
   +----------------+------+------+------+------+------+------+
   |                |      | BDR- | BDR- |      | BDR- | BDR- |
   |Sequence        | BDR  | low  | high | BDR  | low  | high |
   +----------------+------+------+------+------+------+------+
   |Kimono          | -2.2%| -2.0%| -2.7%| -1.4%| -1.3%| -1.5%|
   |BasketballDrive | -3.1%| -3.0%| -3.3%| -1.9%| -2.0%| -1.7%|
   |BQTerrace       | -5.4%| -4.3%| -6.5%| -3.9%| -3.6%| -3.8%|
   |FourPeople      | -3.8%| -2.8%| -5.2%| -2.4%| -1.8%| -3.0%|
   |Johnny          | -3.8%| -3.1%| -4.8%| -2.4%| -2.2%| -2.7%|
   |ChangeSeats     | -4.4%| -3.1%| -6.5%| -3.2%| -2.6%| -3.9%|
   |HeadAndShoulder | -4.8%| -3.0%| -8.1%| -3.0%| -2.7%| -3.7%|
   |TelePresence    | -3.4%| -2.3%| -5.5%| -2.2%| -1.7%| -3.1%|
   +----------------+------+------+------+------+------+------+
   |Average         | -3.9%| -2.9%| -5.3%| -2.5%| -2.2%| -2.9%|
   +----------------+------+------+------+------+------+------+

          Figure 9: Compression Performance with Biprediction

6.  IANA Considerations

   This document has no IANA considerations yet.  TBD

7.  Security Considerations

   This document has no security considerations yet.  TBD

8.  Acknowledgements

   The authors would like to thank Gisle Bjontegaard for reviewing this
   document and design, and providing constructive feedback and
   direction.

9.  References

9.1.  Normative References

   [I-D.fuldseth-netvc-thor]
              Fuldseth, A., Bjontegaard, G., Midtskogen, S., Davies, T.,
              and M. Zanaty, "Thor Video Codec",
              draft-fuldseth-netvc-thor-03 (work in progress),
              October 2016.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

9.2.  Informative References

   [BDR]      Bjontegaard, G., "Calculation of average PSNR differences
              between RD-curves", ITU-T SG16 Q6 VCEG-M33, April 2001.

Authors' Addresses

   Steinar Midtskogen
   Cisco
   Lysaker
   Norway

   Email: stemidts@cisco.com

   Arild Fuldseth
   Cisco
   Lysaker
   Norway

   Email: arilfuld@cisco.com

   Mo Zanaty
   Cisco
   RTP, NC
   USA

   Email: mzanaty@cisco.com