idnits 2.17.1 

draft-davies-netvc-qmtx-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 7) being 59 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There is 1 instance of too long lines in the document, the longest one
     being 4 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 206 has weird spacing: '...rameter  q is ...'

  -- The document date (March 16, 2016) is 2962 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'IRFVC' is defined on line 280, but no explicit
     reference was found in the text


     Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                                          T. Davies
2	Internet-Draft                                                     Cisco
3	Intended status: Standards Track                          March 16, 2016
4	Expires: September 17, 2016

6	               Quantisation matrices for Thor video coding
7	                      draft-davies-netvc-qmtx-00

9	Abstract

11	   This draft describes a family of default quantisation matrices
12	   that may be used to improve perceptual quality when encoding with
13	   Thor. Similar quantisation matrix designs may be used in most
14	   block-based video and image codecs.

16	Status of This Memo

18	   This Internet-Draft is submitted in full conformance with the
19	   provisions of BCP 78 and BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF).  Note that other groups may also distribute
23	   working documents as Internet-Drafts.  The list of current Internet-
24	   Drafts is at http://datatracker.ietf.org/drafts/current/.

26	   Internet-Drafts are draft documents valid for a maximum of six months
27	   and may be updated, replaced, or obsoleted by other documents at any
28	   time.  It is inappropriate to use Internet-Drafts as reference
29	   material or to cite them other than as "work in progress."

31	   This Internet-Draft will expire on September 17, 2016.

33	Copyright Notice

35	   Copyright (c) 2016 IETF Trust and the persons identified as the
36	   document authors.  All rights reserved.

38	   This document is subject to BCP 78 and the IETF Trust's Legal
39	   Provisions Relating to IETF Documents
40	   (http://trustee.ietf.org/license-info) in effect on the date of
41	   publication of this document.  Please review these documents
42	   carefully, as they describe your rights and restrictions with respect
43	   to this document.  Code Components extracted from this document must
44	   include Simplified BSD License text as described in Section 4.e of
45	   the Trust Legal Provisions and are provided without warranty as
46	   described in the Simplified BSD License.

48	Table of Contents

50	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2

52	   2.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . .   2
53	     2.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   2

55	   3.  Quantisation matrix design  . . . . . . . . . . . . . . . . .   3
56	     3.1.  The function of quantisation matrices . . . . . . . . . .   3
57	     3.2.  Quantisation matrices in AVC and HEVC . . . . . . . . . .   4
58	     3.3.  Quantisation matrices in Thor . . . . . . . . . . . . . .   4
59	     3.4.  Implementation  . . . . . . . . . . . . . . . . . . . . .   5

61	   4.  Compression performance . . . . . . . . . . . . . . . . . . .   6

63	   5.  Informative References  . . . . . . . . . . . . . . . . . . .   7

65	   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   7

67	1.  Introduction

69	   This document describes a family of default quantisation matrices
70	   that may be used to improve perceptual quality when encoding with
71	   Thor. The quantisation matrices are designed to be near-flat at high
72	   quantisation levels and more strongly profiled at low quantisation
73	   levels, to avoid ringing artefacts and better shape quantisation
74	   error across a whole sequence with varying quantisation levels.

76	2.  Definitions

78	2.1.  Terminology

80	   This document uses the following terms.

82	      QP: quantisation parameter

84	      QM: quantisation matrix

86	      CSF: contrast sensitivity function

88	      BDR: Bjontegaard Delta-Rate

90	3.  Quantisation matrix design

92	3.1.  The function of quantisation matrices

94	   Quantisation matrices work by shaping the residual error after
95	   quantisation in the spatial frequency domain, usually the DCT domain.
96	   This is done by varying the quantisation factor applied across
97	   spatial frequencies in the transform block. Typically a high
98	   quantisation factor is applied at high spatial frequencies and a
99	   low one at low spatial frequencies.

101	   The aim is roughly to match a Contrast Sensitivity Function for the
102	   human visual system. This provides a curve of sensitivity to detail
103	   (and therefore coding errors) with spatial frequency. Given known
104	   resolutions and assumed viewing distances, a weighting function
105	   can be simply defined for all the coefficients in a transform block.
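
   As an illustration only, the sketch below derives relative weights
   for an N x N DCT block from a CSF, a picture height and a viewing
   distance. The Mannos-Sakrison CSF model, the treatment of the CSF
   as flat below its peak, and all function and parameter names are
   assumptions made for this example; none of them is taken from Thor.

```python
import math


def csf_mannos_sakrison(f_cpd):
    """Mannos-Sakrison CSF model (assumed here); f_cpd in cycles/degree."""
    return 2.6 * (0.0192 + 0.114 * f_cpd) * math.exp(-((0.114 * f_cpd) ** 1.1))


def csf_weights(block_size, picture_height, viewing_distance_heights):
    """Relative quantisation weights for a block_size x block_size DCT.

    picture_height is in pixels; viewing_distance_heights is the viewing
    distance expressed in picture heights (both hypothetical parameters).
    """
    # Pixels subtended by one degree of visual angle (small-angle approx.).
    pixels_per_degree = (picture_height * viewing_distance_heights
                         * math.pi / 180.0)

    # Locate the CSF peak on a fine grid; below the peak the sensitivity
    # is clamped to the peak value so low frequencies are never penalised.
    grid = [k * 0.01 for k in range(1, 6001)]
    f_peak = max(grid, key=csf_mannos_sakrison)
    s_peak = csf_mannos_sakrison(f_peak)

    weights = []
    for i in range(block_size):
        row = []
        for j in range(block_size):
            # Basis function k of an N-point DCT has k/(2N) cycles/pixel.
            f = math.hypot(i, j) / (2.0 * block_size) * pixels_per_degree
            s = s_peak if f <= f_peak else csf_mannos_sakrison(f)
            # Weight is inverse sensitivity: 1.0 where the eye is most
            # sensitive, larger where errors are less visible.
            row.append(s_peak / s)
        weights.append(row)
    return weights
```

   For example, csf_weights(8, 1080, 3.0) gives a weight of 1.0 at DC
   and progressively larger weights towards the highest frequencies.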

107	   This simple approach is complicated, however, by a number of factors.
108	   The first is that the CSF is in reality not a simple function
109	   of spatial frequency, but depends on factors such as brightness,
110	   which are imperfectly corrected for by television gammas. There is
111	   little that can be done about that in the quantisation matrices
112	   themselves, but adjusting QP itself may help.

114	   The second factor is that CSFs are determined experimentally based
115	   on models of Just Noticeable Difference (JND) and reflect less well
116	   the impact of distortions far above this threshold. Adjustments
117	   at high levels of quantisation are needed to account for this.

119	   Finally, applying quantisation matrices to video is affected by the
120	   fact that most frames are predicted and the QM is applied to the
121	   residual after prediction. This means that the quantisation error
122	   for a block consists of the quantisation error in the reference
123	   block, plus any additional error introduced in the current block.
124	   These errors will add if they are uncorrelated, but they may well
125	   be correlated at high QP.
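
   The way the two error contributions combine can be illustrated with
   the standard identity for the variance of a sum. The sketch below is
   not part of Thor; the function name and the use of a single
   correlation coefficient rho are assumptions made for the example.

```python
import math


def combined_error_variance(var_reference, var_current, rho):
    """Variance of the sum of two error terms with correlation rho.

    Var(a + b) = Var(a) + Var(b) + 2 * rho * sqrt(Var(a) * Var(b)).
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return (var_reference + var_current
            + 2.0 * rho * math.sqrt(var_reference * var_current))
```

   With rho = 0 the error energies simply add; with positive rho the
   combined error is larger than the sum of the two energies.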

127	   Despite these difficulties, QMs are widely used and known to work
128	   well, and are available in video coding standards such as
129	   H.264/AVC and H.265/HEVC [AVC,HEVC].

131	3.2.  Quantisation matrices in AVC and HEVC

133	   Quantisation matrices are available in a number of different codecs.
134	   The design in AVC and HEVC is to provide default matrices together
135	   with the ability to signal bespoke matrices [AVC,HEVC]. These
136	   matrices must cover all the different transform block sizes,
137	   components (Y, Cb, Cr) and intra and inter frame or block types,
138	   with fall-backs defined if bespoke matrices are not provided.
139	   Default inter block matrices are flatter than
140	   intra matrices, presumably because of the noise-addition effect
141	   described in Section 3.1: if they had the same profile as for intra
142	   then the overall profile of the combined prediction + residual could
143	   be over-shaped.

145	3.3.  Quantisation matrices in Thor

147	   Thor provides a set of matrices for each component of 4:2:0-sampled
148	   video, for each block size and each quantisation parameter. The
149	   principles behind the design are as follows:

151	   1) QP dependence. Matrices become flatter as quantisation levels
152	   increase.

154	   2) Energy preservation for intra. The inverse quantisation matrices
155	   for intra blocks are normalised to approximately preserve energy
156	   of the residual.

158	   3) DC preservation for inter. The inverse quantisation matrices for
159	   inter blocks are normalised to preserve the DC level.

161	   4) Matrices are also flatter for inter blocks than for intra blocks.

163	   5) Quantisation matrix strength is globally adjustable.

165	   The QP dependence takes account of a number of factors. Firstly, it
166	   reflects that inter blocks typically have higher QPs than the blocks
167	   used to predict them. This means that flattening the matrices at
168	   higher QP naturally prevents over-shaping the quantisation error.

170	   Secondly, the high-QP flattening process also reflects the fact
171	   that errors at this level are very visible even at high spatial
172	   frequencies. Strong error-shaping at these QP levels leads to very
173	   visible additional ringing.

175	   SSIM-based metrics [SSIM,MSSSIM,FASTSSIM] indicate that preserving
176	   image variances and therefore residual energies is perceptually
177	   important. This is feasible for intra, where residuals are
178	   substantial, but for inter it is also important to preserve DC
179	   levels, since getting these wrong can produce very visible artefacts.
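
   The two normalisations can be sketched as follows. This is one
   possible reading, not the Thor procedure: it assumes that unity gain
   is represented by 64 (inferred from the extra shift of 6 in the
   weighted dequantisation formulae of Section 3.4), that "preserving
   energy" means a mean squared weight of unity, and that "preserving
   DC" means a unity weight at position (0,0). All names are
   hypothetical.

```python
import math

UNITY = 64  # assumed fixed-point representation of a weight of 1.0


def normalise_energy(weights):
    """Scale weights so that their mean square is about UNITY**2.

    Assumed interpretation of 'energy preservation' for intra blocks.
    """
    count = sum(len(row) for row in weights)
    rms = math.sqrt(sum(w * w for row in weights for w in row) / count)
    return [[round(UNITY * w / rms) for w in row] for row in weights]


def normalise_dc(weights):
    """Scale weights so that the DC entry equals UNITY.

    Assumed interpretation of 'DC preservation' for inter blocks.
    """
    dc = weights[0][0]
    return [[round(UNITY * w / dc) for w in row] for row in weights]
```

   Under these assumptions an energy-normalised matrix has a DC weight
   below 64, so low frequencies are quantised more finely than in the
   unweighted case, whereas a DC-normalised matrix leaves the DC
   coefficient exactly as it would be without weighting.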

181	   Intra frames tend to have lower QP than inter frames, and this means
182	   that QP dependence absorbs most of the requirement for inter
183	   matrices to be flatter than intra matrices. However inter matrices
184	   are still a little flatter, to take account of the different
185	   characteristics of intra and inter blocks within the same frame.

187	   In determining the quantisation matrix, there are 12 possible sets
188	   available, giving a new set of matrices for each change of
189	   approximately 4 in QP. Thor also supports a global adjustment or
190	   strength parameter, which offsets the QP used in the look-up table
191	   (LUT) that maps QP to quantisation matrix set. This is a value from
192	   -32 to 31. A value of -32 will reduce the QP used by 32, increasing
193	   the strength of the quantisation matrices dramatically. Conversely,
194	   a value of 31 will eliminate quantisation matrices for all but the
195	   smallest QPs.
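
   A minimal sketch of this selection is given below. The actual LUT
   used by Thor is not reproduced in this document, so the uniform step
   of exactly 4, the clamping at zero, and the use of None to mean "no
   weighting" are assumptions, as are all names.

```python
NUM_SETS = 12   # number of matrix sets stated in the text
QP_STEP = 4     # assumed: the text says "approximately 4"


def select_qm_set(qp, strength=0):
    """Return the matrix set index for a QP, or None for flat weighting.

    Lower indices are taken to be the more strongly profiled sets.
    """
    if not -32 <= strength <= 31:
        raise ValueError("strength must lie in [-32, 31]")
    # The strength parameter offsets the QP used for the look-up.
    adjusted_qp = max(qp + strength, 0)
    index = adjusted_qp // QP_STEP
    # Beyond the last set the matrices are treated as flat.
    return index if index < NUM_SETS else None
```

   With these assumptions, a strength of -32 maps QP 40 to the set
   normally used at QP 8, and a strength of 31 leaves only QPs below 17
   with any weighting at all.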

197	   The ability to signal strength and the provision of a range of
198	   QP-dependent matrices are intended to remove the need to signal
199	   bespoke matrices at all.

201	3.4.  Implementation

203	  Quantisation matrices are applied as multiplicative factors in
204	  forward or inverse quantisation processes. In Thor the basic
205	  unweighted dequantisation process for a coefficient c with
206	  quantisation parameter q is based on two values: scale[q], which
207	  depends only on q%6, and shift[q], which depends only on q/6,
208	  the block size and the signal dynamic range.  scale[q] takes care of
209	  quantisation step sizes which fall between powers of 2 and shift[q]
210	  takes care of the basic power of 2 part of the quantisation step.

212	  The formula for unweighted dequantisation is then:

214	  c -> (c*scale[q] + (1<<(shift[q]-1))) >> shift[q]                  (1)

216	  for positive shift[q], otherwise

218	  c -> (c*scale[q])<<(-shift[q])                                     (2)

220	  To apply a matrix M to a coefficient c[i,j] at position
221	  (i,j) within a block, the formulae (1) and (2) change to:

223	  c[i,j]->(c[i,j]*M[i,j]*scale[q]+(1<<(shift[q]+5)))>>(shift[q]+6)   (3)

225	  if shift[q]+6 > 0, otherwise

227	  c[i,j]->(c[i,j]*M[i,j]*scale[q])<<(-shift[q]-6)                    (4)

231	   Exactly complementary formulae can be derived for the forward
232	   quantisation process.
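
   Formulae (1) to (4) can be sketched as below. The scale and shift
   values are passed in directly because the scale[q] and shift[q]
   tables are not given in this document; the function names are
   hypothetical. Python's >> on negative integers rounds towards minus
   infinity, which is assumed here to match the intended behaviour.

```python
def dequant_unweighted(c, scale, shift):
    """Unweighted dequantisation of coefficient c, formulae (1) and (2)."""
    if shift > 0:
        return (c * scale + (1 << (shift - 1))) >> shift      # (1)
    return (c * scale) << (-shift)                             # (2)


def dequant_weighted(c, m, scale, shift):
    """Dequantisation with matrix entry m, formulae (3) and (4).

    The extra shift of 6 means that m = 64 acts as a weight of 1.0.
    """
    if shift + 6 > 0:
        rounding = 1 << (shift + 5)
        return (c * m * scale + rounding) >> (shift + 6)       # (3)
    return (c * m * scale) << (-shift - 6)                     # (4)
```

   A matrix entry of 64 reproduces the unweighted result exactly, so a
   matrix filled with 64 is equivalent to using no matrix.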

234	4.  Compression performance

236	   Although QMs are largely a visual tool, their effectiveness can be
237	   inferred from changes in the PSNRHVS [PSNRHVS] and FASTSSIM metrics.
238	   FASTSSIM tends to over-estimate gains a little, as it has a bias
239	   towards low-pass filtering. Overall BDR results for the Low-Delay B
240	   (LDB) and High-Delay B GOP 16 (HDB16) configurations are as follows
241	   (QPs 22, 27, 32, 37):

243	   Config |    PSNR   |  PSNRHVS  | FASTSSIM  |
244	   --------------------------------------------
245	   LDB    |   +1.1%   |   -3.3%   |   -9.0%   |
246	   --------------------------------------------
247	   HDB16  |   +2.2%   |   -2.6%   |  -11.6%   |
248	   --------------------------------------------

250	   These were computed on the same test sequences as in [IRFVC].

252	   FASTSSIM and PSNRHVS gains are typically larger, and PSNR losses
253	   smaller, for higher resolution material.

255	5.  Informative References

257	[AVC]        ITU-T Recommendation H.264, "Advanced video coding for
258	             generic audiovisual services", March 2010.

260	[HEVC]       ITU-T Recommendation H.265, "High efficiency video
261	             coding", April 2013.

263	[FASTSSIM]   Chen, M. and A. Bovik, "Fast structural similarity index
264	             algorithm", 2010, <http://live.ece.utexas.edu/publications
265	             /2011/chen_rtip_2011.pdf>.

267	[MSSSIM]     Wang, Z., Simoncelli, E., and A. Bovik, "Multi-Scale
268	             Structural Similarity for Image Quality Assessment", n.d.,
269	             <http://www.cns.nyu.edu/~zwang/files/papers/msssim.pdf>.

271	[PSNRHVS]    Egiazarian, K., Astola, J., Ponomarenko, N., Lukin, V.,
272	             Battisti, F., and M. Carli, "A New Full-Reference Quality
273	             Metrics Based on HVS", 2002.

275	[SSIM]       Wang, Z., Bovik, A., Sheikh, H., and E. Simoncelli, "Image
276	             Quality Assessment: From Error Visibility to Structural
277	             Similarity", 2004,
278	             <http://www.cns.nyu.edu/pub/eero/wang03-reprint.pdf>.

280	[IRFVC]      Davies, T., "Interpolated reference frames for video
281	             coding", Work in Progress,
282	             <https://www.ietf.org/id/draft-davies-netvc-irfvc-00.txt>.

284	Author's Address

286	   Thomas Davies
287	   Cisco
288	   Feltham
289	   UK

291	   Email: thdavies@cisco.com