[rtcweb] Comments on H.264 and VP8 performance comparisons

Bo Burman <bo.burman@ericsson.com> Mon, 14 October 2013 21:12 UTC

From: Bo Burman <bo.burman@ericsson.com>
To: "rtcweb@ietf.org" <rtcweb@ietf.org>
Date: Mon, 14 Oct 2013 21:12:08 +0000
Message-ID: <BBE9739C2C302046BD34B42713A1E2A22DF9F8D2@ESESSMB105.ericsson.se>
Subject: [rtcweb] Comments on H.264 and VP8 performance comparisons

Hi all,

We would like to counter Google's suggestion that our test has only "demonstrated that it is possible to reduce VP8's performance" (updated draft on VP8 http://datatracker.ietf.org/doc/draft-alvestrand-rtcweb-vp8/). 

In fact, what we did in our test was mostly undoing some very peculiar x264 settings made by Google in their test from April 3. By instead using the x264 settings Google themselves proposed in their earlier test (from March 12), and removing threading, the difference went down from 41% to 16%. (This is without touching the VP8 parameters.)

The last change we made was to remove the rate control from the comparison, something that is standard practice in the world of video standardization. This involved changing both the x264 and VP8 parameters. After that, the difference went down to -1%.

In summary, the following steps were taken in our comparison:

1) Downloading the latest software: 41% became 36%
2) Removing threading: 26%
3) Removing bit padding: 18%
4) Removing other differences between Google's March 12 and April 3rd tests: 16%
5) Removing rate controller: -1%

Contrary to Google's note on our test, the purpose was not to "reduce the VP8 performance" but rather to present a technically correct codec comparison. A detailed description of what we found follows below:

---

When we first saw Google's test, which was posted on April 3rd, we were surprised to find that their results differed so much from our own. Whereas we got a negligible difference between VP8 and H.264 constrained baseline, Google reported that H.264 constrained baseline needed 41% more bits than VP8 for the same quality. This made us look deeper into Google's test to see how this could be explained.

The first thing we did was to download the latest versions of all software packages. By using the newest version of x264, the difference went down from 41% to 36%.

The second thing that caught our attention was that Google's x264 test used auto threading. When the parameter "--threads 1" is omitted, the x264 encoder defaults to "--threads auto", which means that a large number of threads will be used to compress the image (see for instance http://mewiki.project357.com/wiki/X264_Settings). This decreases the compression time at the cost of quality. The VP8 codec, on the other hand, defaults to a single thread with no quality degradation. After we changed the x264 configuration to use the proper threading value ("--threads 1"), the difference in bit rate went down from 36% to 26%, and even with a single thread the x264 codec was twice as fast as VP8.

At this time we proceeded to only test the first 10 seconds of each sequence in order to get reasonable running times. This actually increased the difference again to 29%.

Our attention now turned to the "--nal-hrd cbr" parameter in Google's x264 April 3rd parameter list. As described at http://mewiki.project357.com/wiki/X264_Settings#nal-hrd, this setting will pack the bitstream with padding bits in order to exactly reach a particular bit rate. This is useful in some circumstances, such as Blu-ray or ISDN video telephony, where the bit rate has to be exactly, say, 64000 bits per second, but it is undesirable in a codec comparison since it only adds bits without increasing quality. The VP8 encoder does not do such bit padding and was thus at an advantage. Removing the "--nal-hrd cbr" parameter for x264 avoided the unnecessary bit padding, and the difference now went down from 29% to 18%.
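
To illustrate why padding hurts in a rate comparison, here is a minimal Python sketch of a deliberately simplified model: every one-second interval is padded up to the target bit rate. The function name and the per-second payload numbers are made up for illustration and are not taken from either test; a real HRD-conformant encoder decides on filler data from its buffer state rather than per second.

```python
def cbr_padding(payload_bits_per_second, target_bps):
    """Simplified model: pad each one-second interval up to target_bps.

    Returns (payload_total, filler_total) in bits. Intervals that already
    reach the target receive no filler.
    """
    filler = [max(0, target_bps - bits) for bits in payload_bits_per_second]
    return sum(payload_bits_per_second), sum(filler)


# Hypothetical payload sizes (bits produced per second before padding).
payload_total, filler_total = cbr_padding([60000, 52000, 64000, 58000], 64000)

sent_total = payload_total + filler_total
overhead_percent = 100.0 * filler_total / payload_total

print(f"payload: {payload_total} bits, filler: {filler_total} bits")
print(f"sent: {sent_total} bits, padding overhead: {overhead_percent:.1f}%")
```

In this model the filler bits count toward the measured bit rate but carry no picture information, so the padded encoder looks worse at equal quality.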

At this point in time we tried removing the remaining variables that differed between the x264 parameter set of Google's first test (from March 12) and the x264 parameter set of Google's second test (from April 3rd), and this resulted in a further decrease from 18% to 16%.

Finally we removed the rate controller from the test and instead used fixed QP. As we have argued previously, having a rate controller in the loop just adds noise to the test and also risks measuring the performance of the rate controller rather than the codec. Using fixed QP (no rate control) is therefore established practice in the video codec community. As an example, the Moving Picture Experts Group (MPEG) recommends using fixed QP for video comparisons, see for instance (http://mpeg.chiariglione.org/standards/exploration/internet-video-coding/call-proposals-internet-video-coding-technology). The reworked test without rate control (which we published on June 22) then gave a result of -1%. This means H.264 constrained baseline (in the x264 implementation) outperformed VP8. We also found H.264 constrained high, again using the x264 implementation, to be 24% better than VP8.

Google's comparison on April 3rd also included a speed test. In this test, x264 is set to use only one thread (the parameter "--threads 1" is set). This means that x264 cannot enjoy the faster speed of a parallel implementation. We do not quite understand this choice of parameters from Google: when threading should be avoided (in the case of measuring quality), it is used, whereas when it would be helpful (in the case of measuring speed), it is avoided. In both cases this is unfavorable to x264. This does not seem entirely fair.

Note that the biggest differences in performance were not due to the changes made to the VP8 settings but to those of x264: the threading and the bit padding alone accounted for 21 of the 41 percentage points. Thus, we do not think that our test is about trying to "reduce the VP8 performance". Instead, the major changes have been to avoid what we believe are unfortunate parameter choices for x264 on Google's part, and to create a technically correct test.
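
As a quick check of the arithmetic, the figures quoted in this message can be tabulated with a short Python sketch. The step labels are shorthand chosen for this example; the numbers are only those stated above (the bit-rate overhead of x264 relative to VP8, in percent, before and after each step).

```python
# (label, overhead before step in %, overhead after step in %)
steps = [
    ("latest software", 41, 36),
    ("single thread (--threads 1)", 36, 26),
    ("first 10 seconds only", 26, 29),
    ("no bit padding (--nal-hrd cbr removed)", 29, 18),
    ("other March 12 / April 3rd differences", 18, 16),
    ("fixed QP, no rate control", 16, -1),
]

# Reduction in percentage points contributed by each step
# (negative means the difference grew).
deltas = {label: before - after for label, before, after in steps}

threading_and_padding = (
    deltas["single thread (--threads 1)"]
    + deltas["no bit padding (--nal-hrd cbr removed)"]
)
total_reduction = sum(deltas.values())

for label, points in deltas.items():
    print(f"{label}: {points:+d} points")
print(f"threading + padding: {threading_and_padding} points")
print(f"total reduction: {total_reduction} points")
```

The threading and padding steps sum to 10 + 11 = 21 points, matching the figure given above, and all steps together account for the move from 41% to -1%.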

Best Regards,

Bo Burman