NetVC (Internet Video Codec) Meeting Minutes
IETF 100 Singapore
2017-Nov-15 09:30-11:00 SGT (UTC+8)

Chairs:
Mo Zanaty (mzanaty@cisco.com)
Matthew Miller standing in for Natasha Rooney (nrooney@gsma.com)

Area Director: Adam Roach (adam@nostrum.com)

Jabber Scribe: Matthew Miller
Note Taker: Nathan Egge

Agenda:
Administrivia, Status (chairs)
Test and Evaluation Criteria (Thomas Daede)
Thor/AV1 Update and Comparisons (Steinar Midtskogen)
Daala Transforms (Tim Terriberry)
Chroma Prediction from Luma in AV1 (Luc Trudeau)

--- Milestones & Status (Mo Zanaty) ---

(Slide 5)
Update the requirements document by the end of the month.
The testing document is a living document and will not be finalized.
Still need to merge the two candidate codecs (Daala and Thor).
Milestone dates will be updated to 2018.

--- Testing (Thomas Daede) ---

(Slide 3)
Change in command line parameters to remove lag-in-frames.
Turned off some of the slowest modes.

(Slide 4)
Mo: The testing methodology still specifies that testing should be at the maximum compression, not a limited set of options.
Thomas: Yes, the draft does allow for other specific testing conditions, e.g., codec to codec, codec to self, etc., including lower complexity.
Mo: Does this belong in the testing draft?
Thomas: This is specific to individual tools, not suitable for codec to codec comparisons. We could simply *not* specify this in the draft, just say that you could use other speed settings if they are justified.
Jonathan Lennox: It seems with any codec, your code could be incredibly slow. You can always do an exhaustive search. There should be some threshold of "this is absurd" and has no practical use.
Mo: I'll just note that "this is absurd" is what is used for other codec development.
Thomas: I'd like to specify some normalized way of doing this since people are going to be presenting results with cpu-levels that are not zero.
Thomas: I'd like to specify something but maybe it will not be so explicit.
If there are no objections, I'll put this together for the next testing draft.
Mo: Hearing no objections.

--- Thor/AV1 Updates (Steinar Midtskogen) ---

(Slide 9)
Mo: Question on the gains for chroma: they look like 3-4x the gains overall. Any clue why?
Steinar: I don't have any clues, but what I would like to investigate is whether there are any bugs in the encoder. It might not be normative, but it could be a bug that CDEF is able to compensate for.
Mo: Before you finish on CDEF, I wanted to raise one issue. The requirements document says we will support 4:2:2. CDEF is a roadblock because the direction search does not work for 4:2:2.
Steinar: So the way it works is that in 4:2:2, we will still be able to do the filter for luma, but for chroma we disable the direction search and filter.
Mo: Do you disable the direction search, or do we disable the filter?
Steinar: I believe we disable the filter.
Tim Terriberry: The directional search is only ever done on luma, so the chroma uses the direction that luma found to orient its filters. Since there is no correlation between the direction for chroma and luma when you have these rectangular blocks, we just disable the filter. What you could do is just assume a fixed direction for chroma and filter along that, and you get CLPF.
Steinar: I am not sure how important it is to have good performance for 4:2:2.
Mo: As I said, we may need to make changes to the requirements document if it is not supported.
Steinar: Well, 4:2:2 is supported but it might not have the best compression gains.

(Slide 13)
Mo: Can we get a quick thumb in the air about what the complexity is in absolute terms (not relative to earlier AV1)? How about as compared to VP9?
Steinar: [100x slower than VP9]
Mo: The specification says we are supposed to be able to run 10-bit 4k.
Mo: Those numbers were for the encoder.
Steinar: Decoder speed is roughly 1/4.
Mo: So 4x as complex as VP9?
Steinar: I think it is closer to 16x, but I think the reason for that is there are still some SIMD optimizations lacking, so I think 4x is an accurate number.

--- Daala Transforms (Tim Terriberry) ---

(Slide 21)
Mo: Is that compared to the same approximation, or is that the full DCT?
Tim: That is a full expanded value matrix multiply.

(Slide 25)
Mo: So you don't have a fixed shift between the row and column stages?
Tim: We don't have a fixed shift at all.
Mo: What happens if you have more than a 64-pixel block?
Tim: We go to larger SIMD at that point.

(Slide 28)
Mo: So this is the implementation from Daala that is not perfectly invertible [referring to the shift down by 4 for reference frames]. What about AV1?
Tim: AV1 does the same thing.
Mo: So if you have full precision references you would get this back.
Tim: Yes.
Mo: Where I am going with this is that you could use full precision references and not have a quantizer; you could use a real transform instead of the 4x4 in H.264.
Tim: Well, you need to be careful about how you do this. We tried something similar in VP9 and it was worse.
Mo: [You'd need to selectively choose what size you use, but you should get gains.]

(Slide 37)
Mo: One more question, kind of broad. These transforms look like they compare very favorably to VP9 and AV1. Have you looked at the Thor or HEVC transforms?
Tim: We have not done direct comparisons in terms of coding performance. In terms of complexity, my understanding is the Thor transforms are giant matrix multiplies. You can get away with that for small transforms, but as you get much larger I think these will be significantly faster.

--- Chroma Prediction from Luma in AV1 (Luc Trudeau) ---

[no questions, issues or actions]