Network Working Group                                          JM. Valin
Internet-Draft                                                   Mozilla
Intended status: Standards Track                          March 21, 2016
Expires: September 22, 2016


                      Directional Deringing Filter
                     draft-valin-netvc-deringing-01

Abstract

   This document describes a deringing filter that takes into account
   the direction of edges and patterns being filtered. The filter works
   by identifying the direction of each block and then adaptively
   filtering along the identified direction. In a second pass, the
   blocks are also filtered in a different direction, with more
   conservative thresholds to avoid blurring edges. The proposed
   deringing filter is shown to improve the quality of both Daala and
   the Alliance for Open Media (AOM) video codec.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 22, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may not be modified, and derivative works of it may not
   be created, and it may not be published except as an Internet-Draft.

Table of Contents

   1.  Introduction
   2.  Direction Search
   3.  Directional Filter
   4.  Second Stage Filter
   5.  Setting Thresholds
   6.  Superblock Filtering
   7.  Results
   8.  Conclusion
   9.  IANA Considerations
   10. Security Considerations
   11. Informative References
   Author's Address

1. Introduction

   This document describes a deringing filter that takes into account
   the direction of edges and patterns being filtered. The filter works
   by identifying the direction of each block and then adaptively
   filtering along the identified direction. In a second pass, the
   blocks are also filtered in a different direction, with more
   conservative thresholds to avoid blurring edges. The deringing
   filter is implemented for both Daala and the Alliance for Open Media
   (AOM) codec.

2. Direction Search

   The first step is to divide the image into blocks of fixed or
   variable size. Variable-size blocks make it possible to use large
   blocks on long, continuous edges and small blocks where edges
   intersect or change direction. A fixed block size is easier to
   implement and does not require signaling the sizes on a
   block-by-block basis. For this work, we consider a fixed block size
   of 8x8.

   Once the image is divided into blocks, we determine which direction
   best matches the pattern in each block. One way to determine the
   direction is to minimize the mean squared difference (MSD) between
   the input block and a perfectly directional block. A perfectly
   directional block is a block for which each line along a certain
   direction has a constant value. For each direction, we assign a line
   number to each pixel, as shown below.

         +---+---+---+---+---+---+---+---+
         | 0 | 0 | 1 | 1 | 2 | 2 | 3 | 3 |
         +---+---+---+---+---+---+---+---+
         | 1 | 1 | 2 | 2 | 3 | 3 | 4 | 4 |
         +---+---+---+---+---+---+---+---+
         | 2 | 2 | 3 | 3 | 4 | 4 | 5 | 5 |
         +---+---+---+---+---+---+---+---+
         | 3 | 3 | 4 | 4 | 5 | 5 | 6 | 6 |
         +---+---+---+---+---+---+---+---+
         | 4 | 4 | 5 | 5 | 6 | 6 | 7 | 7 |
         +---+---+---+---+---+---+---+---+
         | 5 | 5 | 6 | 6 | 7 | 7 | 8 | 8 |
         +---+---+---+---+---+---+---+---+
         | 6 | 6 | 7 | 7 | 8 | 8 | 9 | 9 |
         +---+---+---+---+---+---+---+---+
         | 7 | 7 | 8 | 8 | 9 | 9 |10 |10 |
         +---+---+---+---+---+---+---+---+

   For each direction d, we compute the value s_d, which is equal to a
   direction-independent offset minus the MSD (see [Deringing-Note] for
   details), as:

      \[
        s_d = \sum_{k \in \mathrm{block}} \frac{1}{N_{d,k}}
              \Bigl( \sum_{p \in P_{d,k}} x_p \Bigr)^{2} ,
      \]

   where k ranges over the lines of direction d in the block, x_p is
   the value of pixel p, P_{d,k} is the set of pixels in line k along
   direction d, and N_{d,k} is the cardinality of P_{d,k}. From there,
   the direction is computed as the value of d that maximizes s_d.
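
   The search described in this section can be sketched in Python as
   follows. This is a minimal illustration, not the Daala code: the
   function names are hypothetical, and the per-direction line
   numbering in line_index() is an assumption derived from the
   (d_x, d_y) pairs of Table 1. Only one numbering is given in the
   figure of this section, and entry 1 of the sketch reproduces it.

```python
from collections import defaultdict


def line_index(d, i, j):
    """Line number k of pixel (row i, column j) for direction d.

    Assumed numbering, derived from the (d_x, d_y) pairs of Table 1;
    entry 1 matches the 8x8 figure in this section.
    """
    return (
        i + j,           # direction 0
        i + j // 2,      # direction 1
        i,               # direction 2
        3 + i - j // 2,  # direction 3
        7 + i - j,       # direction 4
        3 - i // 2 + j,  # direction 5
        j,               # direction 6
        i // 2 + j,      # direction 7
    )[d]


def direction_scores(block):
    """Return [s_0, ..., s_7] for an 8x8 block given as a list of rows."""
    scores = []
    for d in range(8):
        line_sum = defaultdict(int)    # sum of x_p over the pixels of line k
        line_count = defaultdict(int)  # number of pixels in line k
        for i in range(8):
            for j in range(8):
                k = line_index(d, i, j)
                line_sum[k] += block[i][j]
                line_count[k] += 1
        scores.append(sum(line_sum[k] ** 2 / line_count[k] for k in line_sum))
    return scores


def find_direction(block):
    """Direction d that maximizes s_d (ties go to the lowest d)."""
    scores = direction_scores(block)
    return max(range(8), key=scores.__getitem__)
```

   For a block whose rows are constant, such as block[i][j] = 10*i,
   every line of direction 2 is constant, so s_2 reaches the sum of
   the squared pixel values and find_direction() returns 2.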

3. Directional Filter

   The directional filter for pixel (i,j) is defined as the 7-tap
   non-linear filter

      \[
        y(i,j) = x(i,j) + \frac{1}{W} \sum_{k=1}^{3} w_k \Bigl[
            f\bigl( x(i + \lfloor k d_y \rfloor,\, j + \lfloor k d_x \rfloor)
                    - x(i,j),\; T \bigr)
          + f\bigl( x(i - \lfloor k d_y \rfloor,\, j - \lfloor k d_x \rfloor)
                    - x(i,j),\; T \bigr) \Bigr] ,
      \]

   where d_x and d_y define the direction, W is a constant normalizing
   factor, T is the filtering threshold for the block, and f(d,T) is
   defined as

      \[
        f(d, T) = \begin{cases}
                    d , & |d| < T \\
                    0 , & \text{otherwise.}
                  \end{cases}
      \]

   Differences smaller than T in magnitude are treated as ringing and
   pull the pixel toward its neighbors along the direction; larger
   differences are treated as genuine edges and contribute nothing.

   The direction parameters are shown in the table below. The weights
   w_k can be chosen so that W is a power of two. For example, Daala
   currently uses w=[3 2 2] with W=16. Since the direction is constant
   over 8x8 blocks, all operations in this filter are directly
   vectorizable over the blocks.

                      +-----------+------+------+
                      | Direction | d_x  | d_y  |
                      +-----------+------+------+
                      | 0         | 1    | -1   |
                      | 1         | 1    | -1/2 |
                      | 2         | 1    | 0    |
                      | 3         | 1    | 1/2  |
                      | 4         | 1    | 1    |
                      | 5         | 1/2  | 1    |
                      | 6         | 0    | 1    |
                      | 7         | -1/2 | 1    |
                      +-----------+------+------+

                                Table 1
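
   As a rough illustration, the directional filter can be written per
   pixel as below. The sketch is not the Daala implementation: the
   names are hypothetical, the image is a plain list of rows, and no
   integer rounding is performed. Differences are taken as neighbor
   minus center so that thresholded differences pull the pixel toward
   its neighbors; treat that sign convention as an assumption of the
   sketch. Taps that fall outside the image contribute zero, following
   the rule given in Section 6.

```python
from fractions import Fraction
from math import floor

HALF = Fraction(1, 2)
# Table 1: direction -> (d_x, d_y)
TABLE_1 = [(1, -1), (1, -HALF), (1, 0), (1, HALF),
           (1, 1), (HALF, 1), (0, 1), (-HALF, 1)]


def f(d, T):
    """Pass small differences through; zero out anything at or above T."""
    return d if abs(d) < T else 0


def directional_filter(x, i, j, direction, T, w=(3, 2, 2), W=16):
    """Filtered value y(i,j) for the pixel at row i, column j of image x."""
    d_x, d_y = TABLE_1[direction]
    total = 0
    for k, w_k in enumerate(w, start=1):
        di, dj = floor(k * d_y), floor(k * d_x)
        for sign in (1, -1):
            ni, nj = i + sign * di, j + sign * dj
            # Taps outside the image are skipped, i.e. f(d,T)=0 for them.
            if 0 <= ni < len(x) and 0 <= nj < len(x[0]):
                total += w_k * f(x[ni][nj] - x[i][j], T)
    return x[i][j] + total / W
```

   With w=(3, 2, 2) and W=16, a pixel of value 110 on a row of 100s is
   pulled to 101.25 along direction 2 when T=20, and is left at 110
   when T=5 because every difference then reaches the threshold.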

4. Second Stage Filter

   The 7-tap directional filter is sometimes not enough to eliminate all
   ringing, so we use an additional filtering step that operates across
   the direction lines used in the first filter. Considering that the
   input of the second filter has considerably less ringing than the
   input of the first filter, and the fact that the second filter risks
   blurring edges, the position-dependent threshold T_2(i,j) for the
   second filter is set lower than that of the first filter T. The
   filter structure is the same as the one used for the directional
   filter. The direction parameters for the second stage filter are
   shown in the table below and the filter weights are w=[1 1] (so the
   sum runs over k=1,2) with W=16/3.

                       +-----------+-----+-----+
                       | Direction | d_x | d_y |
                       +-----------+-----+-----+
                       | 0         | 1   | 0   |
                       | 1         | 0   | 1   |
                       | 2         | 0   | 1   |
                       | 3         | 0   | 1   |
                       | 4         | 1   | 0   |
                       | 5         | 1   | 0   |
                       | 6         | 1   | 0   |
                       | 7         | 1   | 0   |
                       +-----------+-----+-----+

                                Table 2
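
   A per-pixel sketch of the second stage, under the same caveats as
   any illustration here: the names are hypothetical, the input y is
   assumed to already hold the first-stage output, T2 is the threshold
   for the pixel being filtered, differences are taken as neighbor
   minus center (an assumed sign convention), and the handling of
   pixels from other superblocks described in Section 6 is not modeled.

```python
# Table 2: first-stage direction -> (d_x, d_y) used by the second stage
TABLE_2 = [(1, 0), (0, 1), (0, 1), (0, 1),
           (1, 0), (1, 0), (1, 0), (1, 0)]


def second_stage(y, i, j, direction, T2, w=(1, 1)):
    """Second-stage output for the pixel at row i, column j of image y."""
    d_x, d_y = TABLE_2[direction]
    total = 0
    for k, w_k in enumerate(w, start=1):
        for sign in (1, -1):
            ni, nj = i + sign * k * d_y, j + sign * k * d_x
            # Taps outside the image contribute nothing.
            if 0 <= ni < len(y) and 0 <= nj < len(y[0]):
                diff = y[ni][nj] - y[i][j]
                if abs(diff) < T2:
                    total += w_k * diff
    # Dividing by W = 16/3 is the same as multiplying by 3/16.
    return y[i][j] + 3 * total / 16
```

   For a single column [100, 100, 108, 100, 100] filtered at its middle
   pixel with direction 2 and T2=10, the four taps each contribute -8,
   so the result is 108 - 3*32/16 = 102.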

5. Setting Thresholds

   The thresholds T and T_2 must be set high enough to smooth out
   ringing artefacts, but low enough to avoid blurring important details
   in the image. Although the ringing is roughly proportional to the
   quantization step size Q, as the quantizer increases the error grows
   slightly less than linearly because the unquantized coefficients
   become very small compared to Q. As a starting point for determining
   the thresholds, Daala uses a power model of the form
   T_0=level*alpha_1*Q^beta, with beta=0.842, and where alpha_1 depends
   on the input scaling. The level is a threshold adjustment coded for
   each superblock (64x64). In the AOM codec, a global threshold is
   selected by the encoder instead of using a function of the quantizer,
   so T_0=level*global_level.
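
   The two starting-point models can be written down directly. In this
   sketch the function names are hypothetical and alpha_1 is left as a
   caller-supplied parameter, since its value depends on the input
   scaling and is not given here.

```python
def t0_daala(level, Q, alpha_1, beta=0.842):
    """Power model T_0 = level * alpha_1 * Q^beta."""
    return level * alpha_1 * Q ** beta


def t0_aom(level, global_level):
    """Encoder-selected global threshold: T_0 = level * global_level."""
    return level * global_level
```

   Because beta is below 1, doubling Q multiplies the Daala threshold
   by 2^0.842, roughly 1.79, rather than by 2, which reflects the
   slightly sub-linear growth of the error described above.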

   Another factor that affects the optimal filtering threshold is the
   presence of strong directional edges/patterns. These can be
   estimated from the s_d parameters computed in the direction search as
   delta=s_(d_opt)-s_(d_ortho), where d_ortho=d_opt+4 (mod 8). We
   compute the direction filtering threshold for each block as

      \[
        T = T_0 \cdot \max\Bigl( \frac{1}{2},\;
              \min\bigl( 3,\; \alpha_2 \, \delta^{1/6} \bigr) \Bigr) ,
      \]

   where alpha_2 also depends on the input scaling. For the second
   filter, we use a more conservative threshold that depends on the
   amount of change caused by the directional filter:

      \[
        T_2(i,j) = \min\Bigl( T,\; \frac{T}{3}
                   + \bigl| y(i,j) - x(i,j) \bigr| \Bigr) .
      \]

   As a special case, when the pixels corresponding to the 8x8 block
   being filtered are all skipped, then T=T_2=0, so no deringing is
   performed.
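
   The two threshold formulas of this section translate into a few
   lines of Python. This is a sketch with hypothetical names: s is
   assumed to be the list of the eight s_d values from the direction
   search, alpha_2 is supplied by the caller, and all_skipped stands
   for the special case in which every pixel of the block is skipped.

```python
def block_threshold(T0, s, alpha_2, all_skipped=False):
    """First-stage threshold T for one 8x8 block."""
    if all_skipped:
        return 0.0
    d_opt = max(range(8), key=s.__getitem__)
    d_ortho = (d_opt + 4) % 8
    delta = s[d_opt] - s[d_ortho]  # never negative, since d_opt maximizes s_d
    return T0 * max(0.5, min(3.0, alpha_2 * delta ** (1 / 6)))


def second_stage_threshold(T, y_ij, x_ij):
    """Position-dependent threshold T_2(i,j) for the second filter."""
    return min(T, T / 3 + abs(y_ij - x_ij))
```

   The clamp keeps T between T_0/2 and 3*T_0. A pixel that the first
   filter left unchanged gets T_2 = T/3, and T_2 rises toward T as the
   first filter changes the pixel more.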

6. Superblock Filtering

   The filtering is applied one superblock at a time, in a way that
   depends on the level. In Daala, the level can take one of 6 values:
   0, 0.5, 0.7, 1.0, 1.4, 2.0, where a level of zero disables the
   deringing filter for the current superblock. The level is the only
   information coded in the bitstream by the deringing filter. On
   keyframes, it is entropy-coded based on the neighbor values. On
   inter-predicted frames, the level is only coded for superblocks that
   are not skipped and is entropy-coded based on a single adapted
   probability distribution (no context from the neighbors).
   Superblocks where no level is coded have deringing disabled.
   Similarly, any skipped block within a superblock has deringing
   disabled, even if it is signaled as enabled for the superblock.

   The level of the deringing filter in AOM is handled similarly, except
   that only four levels are currently available and there is no entropy
   coding yet.

   The deringing process sometimes reads pixels that lie outside of the
   superblock being processed. When these pixels belong to another
   superblock, the filtering always uses the unfiltered pixel values --
   even for the second stage filter -- so that no dependency is added
   between the superblocks. This makes it possible -- in theory -- to
   filter all superblocks in parallel. When the pixels used for a
   filter lie outside of the viewable image, we set f(d,T)=0.
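
   The pixel-fetch rule for the second stage can be illustrated with a
   small helper. Everything in this sketch is hypothetical (the
   Superblock type, the function name, and the image layout as lists of
   rows); it only shows which value a tap sees: first-stage output
   inside the current superblock, unfiltered input inside another
   superblock, and nothing at all outside the image.

```python
from typing import NamedTuple, Optional


class Superblock(NamedTuple):
    top: int         # row of the top-left pixel
    left: int        # column of the top-left pixel
    size: int = 64   # superblocks are 64x64


def tap_value(x, y, sb, i, j) -> Optional[float]:
    """Value a second-stage tap at (i, j) sees while filtering sb.

    x is the unfiltered image and y holds the first-stage output.
    None means the tap is outside the image, in which case the caller
    uses f(d,T)=0 for that tap.
    """
    if not (0 <= i < len(x) and 0 <= j < len(x[0])):
        return None
    inside = (sb.top <= i < sb.top + sb.size
              and sb.left <= j < sb.left + sb.size)
    return y[i][j] if inside else x[i][j]
```

   Because taps in other superblocks always read from x, the result for
   one superblock does not depend on whether its neighbors have been
   filtered yet, which is what allows them to be processed in any order.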

7. Results

   The deringing filter described here has been implemented for the
   Daala [Daala-website] codec. It is available from the Daala Git
   repository [Daala-Git]. We tested the deringing filter on the Are We
   Compressed Yet [AWCY] ntt-short1 set over the 0.025 bit/pixel to 0.1
   bit/pixel range, corresponding to a 1080p30 bitrate of 1.5 Mbit/s to
   6 Mbit/s. The Bjontegaard-delta [I-D.daede-netvc-testing] rate
   reduction over that range was 6.5% for PSNR, 4.7% for PSNR-HVS, 5.6%
   for SSIM and -6.0% (regression) for FAST-SSIM. Visual inspection
   confirmed that the quality is indeed improved, despite the regression
   in the FAST-SSIM result.
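
   The correspondence between bits per pixel and bitrate quoted above
   is simple arithmetic, shown here for reference. The sketch assumes
   that 1080p30 means 1920x1080 pixels per frame at 30 frames per
   second; the frame dimensions are an assumption, and the function
   name is hypothetical.

```python
def bitrate_mbps(bits_per_pixel, width=1920, height=1080, fps=30):
    """Bitrate in Mbit/s for a given number of bits per pixel."""
    return bits_per_pixel * width * height * fps / 1e6
```

   Under that assumption, 0.025 bit/pixel comes to about 1.56 Mbit/s
   and 0.1 bit/pixel to about 6.22 Mbit/s, in line with the rounded
   figures of 1.5 Mbit/s and 6 Mbit/s.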

   In AOM for the ntt-short1 set, the medium bitrate (0.02 to 0.06
   bit/pixel) Bjontegaard-delta improvement is 2.5% for PSNR, 1.5% for
   PSNR-HVS, 1.5% for SSIM, and -3.8% (regression) on FAST-SSIM. The
   high bitrate (0.06 to 0.2 bit/pixel) Bjontegaard-delta improvement is
   2.0% for PSNR, 0.8% for PSNR-HVS, 1.3% for SSIM, and -3.1%
   (regression) on FAST-SSIM.

   The smaller improvement for AOM compared to Daala may be due to the
   newly integrated code not being mature, but also to the fact that
   some features in Daala tend to cause more ringing. These features
   include lapped transforms, quantization matrices, perceptual vector
   quantization, overlapped block motion compensation (OBMC), and
   activity masking.

8. Conclusion

   We have demonstrated an effective algorithm to remove ringing
   artefacts from coded images and videos. The proposed filter takes
   into account the directionality of the patterns it is filtering to
   reduce the risk of blurring.

9. IANA Considerations

   This document makes no request of IANA.

10. Security Considerations

   This draft has no security considerations.

11. Informative References

   [AWCY]     "Are We Compressed Yet?", Xiph.Org Foundation.

   [Daala-Git]
              "Daala Git repository", Xiph.Org Foundation.

   [Daala-website]
              "Daala website", Xiph.Org Foundation.

   [Deringing-Note]
              Valin, JM., "The Daala Directional Deringing Filter",
              arXiv:1602.05975 [cs.MM].

   [I-D.daede-netvc-testing]
              Daede, T. and J. Jack, "Video Codec Testing and Quality
              Measurement", draft-daede-netvc-testing-02 (work in
              progress), October 2015.

Author's Address

   Jean-Marc Valin
   Mozilla
   331 E. Evelyn Avenue
   Mountain View, CA 94041
   USA

   Email: jmvalin@jmvalin.ca