[This is a resend of a message sent on Monday that never appeared to show
up - apologies if this is a duplicate]
I have run some experiments on existing WebSockets applications to see what
could be gained if compression were enabled.
To do this, I captured traces with tcpdump, then wrote a small libpcap-based
program to strip out individual TCP streams to separate files. I then wrote
another program which read these files, removed the WebSocket framing, and
tried compressing them in various ways using zlib and analyzed the results.
The applications tested were:
- A hacked version of Google Wave that used WebSockets for communication
with the server, with Chrome 5 as the browser. The messages in this case
are JSON consisting of ASCII text. (Note that this was a relatively small
trace; I am getting a larger one to make sure the results hold.)
- GWT Quake, which is an HTML5 app that gets up to 60fps running in the
browser using WebSockets for communication with the server. The messages
are largely binary, with individual bytes being encoded as UTF8 characters
(so each byte in the range 0x80-0xFF becomes a two-byte UTF8 character), and
the binary data is largely IEEE floating point values. The sample taken was
from two players, one using Chrome 5 on Windows and the other using WebKit
nightly on Mac playing a multiplayer game.
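
To illustrate the byte-to-UTF8 expansion described for GWT Quake, here is a minimal Python sketch. It assumes each byte is mapped to the code point of the same value (which is what Latin-1 decoding does); the helper names are hypothetical, and the application's actual encoder is not shown in this message.

```python
import struct

def bytes_to_utf8_text(payload: bytes) -> bytes:
    """Map each byte to the code point of the same value, then UTF-8 encode.

    Assumption: a one-to-one byte -> code point mapping (Latin-1).
    """
    return payload.decode("latin-1").encode("utf-8")

def utf8_text_to_bytes(wire: bytes) -> bytes:
    """Inverse of bytes_to_utf8_text."""
    return wire.decode("utf-8").encode("latin-1")

# A little-endian IEEE float32: -1.5 is the bytes 00 00 C0 BF.
sample = struct.pack("<f", -1.5)
wire = bytes_to_utf8_text(sample)

# The two bytes >= 0x80 each become two bytes, so 4 payload bytes become 6.
print(len(sample), len(wire))
```

Under this assumption, binary data whose bytes are evenly spread over 0x00-0xFF grows by about 50% before any compression is applied.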
The compression methods tried were:
- GZIP - using zlib with a gzip header (i.e., deflateInit2(zstr,
Z_DEFAULT_COMPRESSION, Z_DEFLATED, 15 | 16, 8, Z_DEFAULT_STRATEGY)) - this
should closely model the gzip encoding used for HTTP responses.
- DEFLATE - similar, but using deflateInit(zstr, Z_DEFAULT_COMPRESSION) so
only the zlib header is included
- DEFLATE/STREAM - the above methods compress each frame separately, so
the compression dictionary has to be rebuilt for each message. However,
this method maintains compression state across messages in the same stream
(using deflate(Z_SYNC_FLUSH) to finish an individual frame), so later
messages can exploit redundancy from previous messages. This does have the
downside of maintaining state for the duration of the connection, but there
is already significant state due to keeping a TCP connection up.
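
The three methods above can be sketched with Python's zlib module, which wraps the same library. This is only an approximation of the setup described, not the analysis program itself; the function and class names and the sample messages are made up for illustration.

```python
import zlib

def gzip_frame(data: bytes) -> bytes:
    """GZIP: a fresh compressor per frame, gzip header (wbits = 15 | 16)."""
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                         15 | 16, 8, zlib.Z_DEFAULT_STRATEGY)
    return c.compress(data) + c.flush()

def deflate_frame(data: bytes) -> bytes:
    """DEFLATE: a fresh compressor per frame, zlib header only."""
    return zlib.compress(data)

class StreamCompressor:
    """DEFLATE/STREAM: one compressor for the whole connection."""

    def __init__(self) -> None:
        self._c = zlib.compressobj()

    def frame(self, data: bytes) -> bytes:
        # Z_SYNC_FLUSH ends the frame on a byte boundary but keeps the
        # dictionary, so later frames can refer back to earlier ones.
        return self._c.compress(data) + self._c.flush(zlib.Z_SYNC_FLUSH)

# Hypothetical small, similar messages (not taken from the traces).
messages = [b'{"op":"move","x":%d,"y":%d}' % (i, 2 * i) for i in range(20)]

per_frame_total = sum(len(deflate_frame(m)) for m in messages)
stream = StreamCompressor()
stream_frames = [stream.frame(m) for m in messages]
stream_total = sum(len(f) for f in stream_frames)
print(per_frame_total, stream_total)
```

The receiver needs a matching long-lived decompressor (zlib.decompressobj() here) that is fed every frame in order, which is the per-connection state mentioned above.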
For each stream, I collected the following information:
- total payload bytes transferred (note that I did not count TCP/IP
overhead or WebSockets framing overhead)
- packet sizes at the 25th, 50th, 75th, and 90th percentiles
- percent reduction
- in each case, I assumed that if the compression resulted in an increase
in the frame size, it would be sent uncompressed instead (but I include the
count of such frames)
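
As a sketch of how the figures listed above could be computed for one stream, the following Python applies the "send uncompressed if compression makes the frame larger" rule. The nearest-rank percentile method and the names percentile, reduction_pct and summarize are assumptions for illustration; the actual analysis tool may differ.

```python
import math
import zlib

def percentile(sizes, pct):
    """Nearest-rank percentile (assumed method, not necessarily the tool's)."""
    ordered = sorted(sizes)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def reduction_pct(original_total, sent_total):
    """Percent reduction in payload bytes."""
    return 100.0 * (original_total - sent_total) / original_total

def summarize(frames, compress):
    """Per-stream totals, percentiles, reduction and count of grown frames."""
    sent_sizes = []
    grew = 0
    for frame in frames:
        packed = compress(frame)
        if len(packed) > len(frame):
            grew += 1                      # counted, but sent uncompressed
            sent_sizes.append(len(frame))
        else:
            sent_sizes.append(len(packed))
    original_total = sum(len(f) for f in frames)
    sent_total = sum(sent_sizes)
    return {
        "total": sent_total,
        "percentiles": [percentile(sent_sizes, p) for p in (25, 50, 75, 90)],
        "reduction_pct": reduction_pct(original_total, sent_total),
        "grew": grew,
    }

# Hypothetical frames: one very compressible, one too small to benefit.
stats = summarize([b"a" * 200, b"xy"], zlib.compress)
print(stats)
```

For example, reduction_pct(9189, 6193) gives about 32.60, which matches the Wave client->server GZIP figure reported below.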
Google Wave
client->server:
31 frames totalling 9189 bytes, percentiles: 248/249/396/397 bytes
- GZIP: 6193 total bytes, percentiles: 183/184/225/228 bytes, 32.60% reduction
- DEFLATE: 5937 total bytes, percentiles: 175/176/217/220 bytes, 35.39%
reduction, none under 63 bytes
- DEFLATE/STREAM: 1274 total bytes, percentiles: 26/28/36/76 bytes,
86.14% reduction, 88.8% under 63 bytes
server->client:
42 frames totalling 8962 bytes, percentiles: 90/91/111/657 bytes
- GZIP: 5485 total bytes, percentiles: 87/88/109/282 bytes, 38.80%
reduction
- DEFLATE: 5149 total bytes, percentiles: 79/80/101/274 bytes, 42.55%
reduction
- DEFLATE/STREAM: 1427 total bytes, percentiles: 25/26/28/37 bytes, 84.08%
reduction
Conclusion:
- Contrary to expectations, the wave protocol has sufficient redundancy
to get savings from compression - even basic gzip compression provides
significant benefits for this trace of the wave protocol.
- Sharing compression state across messages results in a 5-6x reduction in
frame sizes, which would be very important in mobile environments.
GWT Quake
client 1 -> server:
3837 frames totalling 212653 bytes, percentiles: 50/54/60/66 bytes
- GZIP: total 210242 bytes, percentiles: 50/54/60/66 bytes, 1.13%
reduction, 3833 frames grew larger
- DEFLATE: total 210184 bytes, percentiles: 50/54/60/66 bytes, 1.16%
reduction, 3817 frames grew larger
- DEFLATE/STREAM: total 94902 bytes, percentiles: 21/24/28/31 bytes,
55.37% reduction
server -> client 1:
2150 frames totalling 608658 bytes, percentiles: 120/163/476/478 bytes
- GZIP: total 405572 bytes, percentiles: 112/142/281/282 bytes, 33.37%
reduction
- DEFLATE: total 388897 bytes, percentiles: 104/134/273/274 bytes, 36.11%
reduction
- DEFLATE/STREAM: total 96093 bytes, percentiles: 23/44/58/68 bytes,
84.21% reduction
client 2 -> server:
1996 frames totalling 103091 bytes, percentiles: 44/50/58/64 bytes
- GZIP: total 101887 bytes, percentiles: 44/50/57/64 bytes, 1.17%
reduction, 1982 frames grew larger
- DEFLATE: total 101647 bytes, percentiles: 44/50/57/64 bytes, 1.40%
reduction, 1938 frames grew larger
- DEFLATE/STREAM: total 49229 bytes, percentiles: 21/24/28/32 bytes, 52.25%
reduction, 99.9% under 63 bytes
server -> client 2:
1880 frames totalling 423606 bytes, percentiles: 110/165/333/380 bytes
- GZIP: total 303437 bytes, percentiles: 102/142/217/255 bytes, 28.37%
reduction, 250 frames grew larger
- DEFLATE: total 289864 bytes, percentiles: 94/134/209/247 bytes, 31.57%
reduction, 77 frames grew larger
- DEFLATE/STREAM: total 73246 bytes, percentiles: 18/27/53/67 bytes,
82.71% reduction
Conclusion:
- It is absolutely critical that uncompressed frames be allowed even when
compression has been negotiated -- otherwise, the total bytes transferred
would be much higher (either through loss of compression where useful, or
by sending compressed data that is larger than the uncompressed data).
- The client->server stream only compresses if state is maintained across
frames, which also gives an order of magnitude size reduction on the
server->client stream.
- Even small packets benefit -- with persistent compression state,
traffic that is 90% under 64 bytes still gets 2:1 compression.
Implications for Protocol Changes for Compression:
- At least some apps will have different characteristics between the
traffic in each direction. For example, in GWT Quake, the client->server
traffic would pay a roughly 13% size penalty if every frame had to be
compressed, while the server->client traffic would pay a 33%+ size penalty
if the connection wasn't compressed at all.
- The simple approach is to make compression optional for each frame and
use it only if it reduces the size
- This decision could be based on heuristics or by simply compressing
and comparing the size, though the latter is likely to be inefficient for
mobile devices.
- A more complicated approach would be to allow asymmetric compression
algorithms, though it isn't clear how the browser/server could take
advantage of it without exposing some API for the application to describe
the likely traffic.
- Maintaining compression state across frames gives a large benefit, and
likely overcomes the need to allow optional compression (though there are
probably still pathological cases where compression results in a size
increase). However, this comes at the cost of additional state, so striking
the proper balance on mobile devices that are constrained on both memory
and network bandwidth may be difficult.
Next Steps
- I will get longer Wave traces and verify the measurements made here
still hold, especially during startup
- If anyone else has packet traces of actual WebSocket traffic they can
share (pcap format is fine, but please don't include any sensitive data)
please email them to me and I can include them in the analysis. If that is
a problem I could also send the source for the analysis tools which should
build on any Unix-based system.
- I want to get some functional compression test going so I can measure
actual latency gains on real apps running over real networks, which means
defining some plausible framing format. Probably the most straightforward
would be to define frame type 0x80 as compressed UTF8 text, whose
decompressed bytes are decoded and passed to the app just like a 0x00
frame. This looks easy to add to Jetty7 (although the lack of Z_SYNC_FLUSH
in Java's Deflater makes it harder), but getting a functional browser
implementation might be a lot of work.
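
One possible shape for the 0x80 frame proposed above is sketched below in Python. Everything about the wire layout here is an assumption for illustration: a 0x00 frame is taken to be 0x00, UTF8 text, 0xFF, and a 0x80 frame is taken to be 0x80, a length in 7-bit groups (high bit set on all but the last byte), then that many bytes of zlib data. The sketch compresses each frame independently to stay stateless and falls back to a 0x00 frame when compression does not help.

```python
import zlib

def encode_length(n: int) -> bytes:
    """Length as 7-bit groups, most significant first (assumed layout)."""
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))

def encode_frame(text: str) -> bytes:
    """Emit a 0x80 compressed frame only when it is smaller than a 0x00 frame."""
    raw = text.encode("utf-8")
    plain = b"\x00" + raw + b"\xff"
    packed = zlib.compress(raw)
    compressed = b"\x80" + encode_length(len(packed)) + packed
    return compressed if len(compressed) < len(plain) else plain

def decode_frame(frame: bytes) -> str:
    """Return the text for either frame type, as the app would see it."""
    if frame[0] == 0x00:
        if frame[-1] != 0xFF:
            raise ValueError("unterminated text frame")
        return frame[1:-1].decode("utf-8")
    if frame[0] == 0x80:
        length, i = 0, 1
        while True:
            byte = frame[i]
            i += 1
            length = (length << 7) | (byte & 0x7F)
            if not byte & 0x80:
                break
        return zlib.decompress(frame[i:i + length]).decode("utf-8")
    raise ValueError("unknown frame type")

short_frame = encode_frame("hi")            # too small to benefit
long_text = '{"x":1}' * 50                  # hypothetical repetitive message
long_frame = encode_frame(long_text)
print(hex(short_frame[0]), hex(long_frame[0]), len(long_text), len(long_frame))
```

A variant that shares compression state across frames would replace zlib.compress with a long-lived compressor flushed with Z_SYNC_FLUSH, and would then have to keep both ends' state in step.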
--
John A. Tamplin
Software Engineer (GWT), Google