Date: | Monday, November 2nd, 15:20 - 16:50; Tuesday, November 3rd, 15:20 - 16:50 |
Location: | Yokohama, Japan |
Chairs: | Adam Roach, Mo Zanaty |
Minutes: | Jonathan Lennox, Nathan Egge, David Benham |
Presenter: | Chairs |
Slides: | https://www.ietf.org/proceedings/94/slides/slides-94-netvc-0.pdf |
The chairs called the working group's attention to the newly formed CELLAR WG, which will be publishing specifications for lossless codecs, based on FLAC and FFV1, as well as a container format based on Matroska.
Presenter: | Jose Alvarez |
Slides: | https://www.ietf.org/proceedings/94/slides/slides-94-netvc-1.pdf |
Drafts: | draft-filippov-netvc-requirements |
Mo Zanaty: for traditional video conferences (with mixes), there aren’t many Intras; but with newer use cases, e.g. PERC, Intra becomes more important. E.g. active speaker switching, not just random entry. Test conditions should reflect this.
Tim Terriberry: Can you propose a number to the list?
Mo: I’ll come up with a number.
Jonathan Lennox: Screencasting can also mean window sharing, which means arbitrary image sizes, including odd sizes.
Peter Thatcher: Desktop sizes go up to 5K now.
Maire Reavy: I'm concerned even 100ms of coding delay will be too high for real-time use. There can be slower modes (for non-real-time use cases) but 30ms (or less) for real-time needs to be the upper bound.
Mo Zanaty: we were thinking about a proper profile for these requirements, first version vs. later. May be time to do that now. On the delay, we should have a mode with zero structural delay. On bit depth, trend towards having higher internal precision apart from input or output formats. Might be good to differentiate these things.
Mo: To be clear, RGB means RGB 4:4:4
Martin Dürst: Useful to be able to play videos faster or slower, is that included in temporal scalability?
Jose: it could be.
Martin: Your list only includes 15, 30, etc. If you want 1.2x, whatever.
Jose: What is the use case?
Martin: If students are reviewing a lecture, can go through faster or slower depending on their needs.
Jose: Are these predictable factors, or infinitely variable?
Martin: Ones I know are 1.2, 1.5, etc.
Thomas Davies (Jabber Room): what does support of HDR mean in practice? If you can do 10/12 bit, then with a mapping function and suitable metadata then you can do some form of HDR.
Jose: Requirement is support for high dynamic range. More bits can do that, but it’s not necessarily linear. We’re not suggesting any particular implementation.
Chairs call for sense of the room regarding whether this draft should be the basis for fulfilling the requirements milestone. Of those in the room who had read the document (approximately 10), all favored adoption. No objections to adoption were noted.
Sense of room is to adopt; to be confirmed on-list
Mo: Does this have everything Jack had in his requirements document?
Tim: This had everything he wanted to see.
Presenter: | Thomas Daede |
Slides: | https://www.ietf.org/proceedings/94/slides/slides-94-netvc-2.pdf |
Drafts: | draft-daede-netvc-testing |
Thomas Davies: What about frame rate? 60 Hz does not require 2x the bit rate of 30 Hz.
Thomas Daede: Currently that's not taken into account; not sure how we want to scale it.
Mo Zanaty: Would it be useful to put the resolutions on the bitrate ranges too?
Thomas Daede: Screencast has all sorts of weird video sizes, I didn’t want to list them all, but when we get our explicit test cases we should list them.
Thomas Davies: square root is okay for framerate.
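Thomas Davies's suggestion above (linear bitrate scaling overstates the cost of higher frame rates; a square-root law is adequate) can be sketched as follows. The function name and the example numbers are illustrative, not from the draft:

```python
import math

def scaled_bitrate(base_bitrate_kbps, base_fps, target_fps):
    """Scale a target bitrate by the square root of the frame-rate
    ratio, so 60 Hz needs about sqrt(2) ~ 1.41x the 30 Hz bitrate
    rather than the 2x a linear rule would give."""
    return base_bitrate_kbps * math.sqrt(target_fps / base_fps)

# 1000 kbps at 30 fps scales to ~1414 kbps at 60 fps
print(round(scaled_bitrate(1000, 30, 60)))
```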
Mo: Unconstrained low latency is unconstrained in every way except structural delay?
Thomas Daede: Yes.
Thomas Davies: Main comment. The draft does not state what the testing is for and what the best test methodology might be for each purpose. Can you clarify the purposes you have in mind? What you might do for final testing is different for what you might do for testing a tool for inclusion, for example. This draft seems aimed at final testing rather than the process of developing the codec. I'm concerned about sucking away a lot of effort into developing fancy 2-pass encoding methods and rate controls that don't improve the fundamental technologies at all, when we are doing tests just for deciding on tools and judging current progress. Really want to constrain adaptivity to make fair comparisons between different codecs, also.
Thomas Daede: If you tell codecs they can’t use certain features, they have to be very careful not to use them. Not sure you can quite make it equal. Newer metrics respond correctly to a lot of these.
Mo: It seems like we have two types of testing, beauty contest vs. constant iteration working in the group. May be useful to have a split in the doc for those two cases.
Thomas: Yes, and a lot of concerns about fairness between codecs don’t apply if you’re testing a tool with a codec.
Mo: We already know you can get outlandish gains with outlandish rate controls. We don’t want that to mar tool selection or candidate selection.
Mo: I thought we were going to weight planes.
Thomas: Yes, but not clear how to weight. Works for PSNR not necessarily others.
Mo: But a weight of 0 for chroma is wrong.
Tim Terriberry: You don’t have to weight, just report all three scores.
Thomas: Doesn’t work great for other than PSNR.
Tim: Some are okay.
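The exchange above distinguishes reporting the three plane scores separately (Tim's suggestion) from collapsing them into one weighted number. A minimal sketch for PSNR, where a weighted combination is at least well defined; the 4:1:1 weights are illustrative only, not an agreed value:

```python
import numpy as np

def plane_psnr(ref, rec, peak=255.0):
    """PSNR of a single plane (luma or chroma), in dB."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def report_psnr(ref_planes, rec_planes, weights=(4, 1, 1)):
    """Report Y/Cb/Cr PSNR separately plus a weighted average.
    The weights here are a placeholder; a chroma weight of 0 would
    be wrong (per Mo), and weighting schemes that make sense for
    PSNR do not necessarily carry over to other metrics."""
    scores = [plane_psnr(r, d) for r, d in zip(ref_planes, rec_planes)]
    w = np.array(weights, dtype=np.float64)
    return scores, float(np.dot(w, scores) / w.sum())
```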
Chairs call for sense of the room regarding whether this draft should be the basis for fulfilling the testing milestone. Of those in the room who had read the document (approximately 10), all favored adoption. No objections to adoption were noted.
Sense of room is to adopt; to be confirmed on-list
Presenter: | Jean-Marc Valin |
Slides: | https://www.ietf.org/proceedings/94/slides/slides-94-netvc-3.pdf |
Drafts: | http://jmvalin.ca/notes/dir_dering.pdf |
Mo Zanaty: Do you have a threshold or aggressiveness for direction determination?
Jean-Marc: The threshold I described depends on the bitrate, runs parallel to the edge.
Mo: So they’re not dependent on the content.
Jean-Marc: No, they’re also dependent on the content. Variance of the block influences the threshold.
Mo: Does direction determination have aggressiveness based on the content, confidence in the determination?
Jean-Marc: No, because conditional replacement filter avoids blurring the image.
Tim Terriberry: We always pick a direction, there’s no non-directional mode.
Steve Botzko: I’m confused what you mean by across? Orthogonal to direction?
Jean-Marc: Not quite orthogonal.
Steve: So it's always horizontal or vertical?
Jean-Marc: Yes, whichever is closer to orthogonal.
Steinar Midtskogen: These bitrates are quite high, do you have numbers for lower bitrates?
Jean-Marc: Even at lower bitrates it’s a clear improvement. At some point it falls apart, just blocks of different colors, you lose directionality.
Steve: So this is part of the prediction mode?
Jean-Marc: It’s run in a loop.
Steve: Any point in using it as a post-filter?
Jean-Marc: I don't think so.
Steinar Midtskogen: Is this off by an order of magnitude?
Jean-Marc: Yes, should be 0.025
Thomas Davies: have you tried running the filter on a subset of frames (e.g. HQ frames in a frame hierarchy)? would reduce complexity. e.g. do it on every 4th frame, to reduce complexity.
Jean-Marc: No, I haven’t; may be worth trying. Current complexity is about 5% of CPU use.
Mo: What was your test set?
Jean-Marc: NTT-short on arewecompressedyet.
Mo: The visual example you showed earlier is part of the test set?
Jean-Marc: No, that’s a separate test set of still images. The curves are on video.
Mo: Would be interesting to see on a class of images.
Jean-Marc: I went over the test set, the filter improves all of them.
Tim Terriberry: Responding to Thomas Davies, running every fourth frame would have some difficulties, would have to keep track of which areas were skipped. Could certainly do it with something like hierarchical P-frames, just run on I and P. Always have the option to disable at the encoder. Disabling on every block would be cheap, because of entropy coding.
Tim: Your B-Frames are skipping lots of superblocks anyway, so we’re already almost doing that.
Jean-Marc: If you’re doing B-Frames.
Mo: Would be good if you and Steinar could combine this with Thor’s low-pass filter, come up with something that’s the best common tool.
Jean-Marc: Definitely more experiments to do there. Thor’s constrained low-pass filter is constrained to changing by only one; works well for high bitrates, less for lower.
Steinar: Result probably depends on what kind of interpolation filter the codec has, so what works in Daala might not work in Thor.
Jean-Marc: Also that Daala has no Intra-prediction, Thor does. Yes, I wouldn’t expect result to look the same for the two codecs.
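The conditional replacement filter discussed above can be sketched in one dimension: a neighboring tap contributes only when its difference from the center pixel is below a threshold, so strong edges pass through unfiltered. The tap weights, offsets, and normalization below are illustrative, not the values in the dir_dering.pdf write-up; Thor's constrained low-pass filter differs in limiting the output change to ±1 per pixel:

```python
def crf_1d(line, threshold, taps=(1, 2, 2, 1), offsets=(-2, -1, 1, 2)):
    """1-D conditional replacement filter sketch: a neighbor contributes
    its difference from the center pixel only when that difference is
    below the threshold, smoothing ringing without blurring across
    edges. (Thor's CLPF instead clamps the change to +/-1.)"""
    total = sum(taps) + 1  # +1 for the center pixel's own weight
    out = []
    for i, c in enumerate(line):
        acc = 0
        for w, o in zip(taps, offsets):
            j = min(max(i + o, 0), len(line) - 1)  # clamp at the borders
            d = line[j] - c
            if abs(d) < threshold:  # the "conditional replacement" test
                acc += w * d
        out.append(c + acc / total)
    return out
```

With a hard edge (e.g. `[0, 0, 100, 100]`) every cross-edge difference exceeds the threshold, so the edge is left untouched, which is the blur-avoidance property Jean-Marc describes.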
Presenter: | Timothy Terriberry |
Slides: | https://www.ietf.org/proceedings/94/slides/slides-94-netvc-4.pdf |
Drafts: | draft-terriberry-codingtools, draft-valin-netvc-pvq, draft-egge-netvc-tdlt, draft-terriberry-netvc-obmc, https://git.xiph.org/?p=daala.git |
See slides for summary of changes and resulting improvements. No discussion recorded in minutes.
Presenter: | Steinar Midtskogen |
Slides: | https://www.ietf.org/proceedings/94/slides/slides-94-netvc-6.pdf |
Drafts: | draft-fuldseth-netvc-thor, draft-davies-netvc-irfvc, draft-midtskogen-netvc-clpf |
See slides for summary of changes and resulting improvements.
Mo Zanaty: Asked for clarification about the SIMD optimizations made. Also asked whether a general-purpose GPU abstraction layer had been considered.
Steinar: Roughly speaking, no.
Chairs requested that Daala and Thor developers agree on common methodology for depicting improvements between the two codecs.
Presenter: | Nathan Egge |
Slides: | https://www.ietf.org/proceedings/94/slides/slides-94-netvc-5.pdf |
Drafts: | draft-egge-netvc-cfl |
See slides for summary of technique and impacts.
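For readers without the slides, the core chroma-from-luma idea is to predict each chroma plane as an affine function of the co-located reconstructed luma. The sketch below fits such a model per block by least squares; it illustrates the general concept only, as the actual draft-egge-netvc-cfl technique operates in the frequency domain in combination with PVQ:

```python
def cfl_params(luma, chroma):
    """Least-squares alpha/beta for a chroma-from-luma predictor
    C ~ alpha * L + beta over one block. Illustrative only; not the
    exact scheme in draft-egge-netvc-cfl."""
    n = len(luma)
    mean_l = sum(luma) / n
    mean_c = sum(chroma) / n
    cov = sum((l - mean_l) * (c - mean_c) for l, c in zip(luma, chroma))
    var = sum((l - mean_l) ** 2 for l in luma)
    alpha = cov / var if var else 0.0
    beta = mean_c - alpha * mean_l
    return alpha, beta

def cfl_predict(luma, alpha, beta):
    """Predict a chroma block from luma using the fitted parameters."""
    return [alpha * l + beta for l in luma]
```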
Mo Zanaty: Have these been tested with 4:4:4 and RGB content?
Nathan: No.
Chairs re-iterated that we need harmonized approach for displaying performance improvement information.