Re: [Asrg] About that e-postage draft [POSTAGE]
John Leslie wrote, On 2/19/09 8:53 PM:
> Bill Cole <asrg3 at billmail.scconsult.com> wrote:
>> John Leslie wrote, On 2/18/09 5:59 PM:
>>> John Levine <johnl at taugh.com> wrote:
>>> [...]
>>>> None of this is new, this was all worked out a decade ago. But it's
>>>> all still way too slow to keep up with an interesting amount of e-mail.
>>> Have you perhaps not noticed Moore's Law at work these last ten years?
>> CPU power is not likely to be the critical bottleneck for a system that
>> has to do large numbers of transactions across the Internet.
> Actually, CPU power _can_ be a bottleneck, though seldom the critical
> one.
Any component *can* be a bottleneck. Maybe my point would be clearer put
this way: throwing more processing power at a high-volume OLTP system that
has to talk to clients across the Internet stops having returns long before
it requires the deployment of 21st-century CPU technologies.
> Moore's law also applies to RAM. It has become somewhat practical to
> address a terabyte of RAM. This eliminates storage latency, for all
> practical purposes, which usually _is_ the critical bottleneck.
Storage latency can certainly be a problem and I'm convinced that it is the
easiest bottleneck to create or eliminate by seemingly trivial details of
logical design and implementation. That makes it a common criticality in
deployed systems, where it is often cheaper/faster/easier to fix than design
and/or coding, but it isn't rational for the hardware aspects of storage to
be critical in a system that has to chat with clients halfway around the
world. As Mouse noted, RTTs in the hundreds of milliseconds are assured by
physics for as long as the Internet operates with wires, fiber, and photons
(rather than with spin-coupled antiparticle pairs or wormholes). When many
transactions will have one or more 100ms periods of dead time and
essentially all will have 10ms waits, there's no fundamental reason for hard
storage latency to be critical.
>> Network latency is particularly resistant to Moore's law or anything
>> like it.
> Agreed, but network _bandwidth_ has benefited substantially from
> Moore's law.
Which doesn't help much for a transactional system that ultimately has to
serialize a lot of small transactions. The standard tactical responses to
the fact that bandwidth has grown without proportional drop in latency have
been aimed at utilizing bandwidth, i.e. transferring a lot of data. Email
postage validation wouldn't seem to me to benefit from such tactics as
parallelization of sliced-up tasks or deep acknowledgement queues, but maybe
I'm just not seeing some insight.
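
To put rough numbers on that, here is a minimal sketch of the arithmetic.
It assumes a hypothetical 200-byte request/response exchange and a 100ms
RTT, with each exchange waiting for the previous one to finish; the
function name and all figures are illustrative, not measurements.

```python
def serial_time_s(n: int, payload_bytes: int, rtt_s: float,
                  bandwidth_bps: float) -> float:
    """Elapsed time for n small exchanges performed strictly one after another."""
    # Time to push the payload onto the wire at the given bandwidth.
    transfer_s = payload_bytes * 8 / bandwidth_bps
    # Every exchange pays one full round trip plus its transfer time.
    return n * (rtt_s + transfer_s)


N = 1000             # number of serialized exchanges (assumed)
PAYLOAD_BYTES = 200  # size of one exchange (assumed)
RTT_S = 0.1          # 100ms round trip

slow = serial_time_s(N, PAYLOAD_BYTES, RTT_S, 10e6)  # 10 Mbit/s link
fast = serial_time_s(N, PAYLOAD_BYTES, RTT_S, 1e9)   # 1 Gbit/s link

print(f"10 Mbit/s: {slow:.2f} s   1 Gbit/s: {fast:.2f} s")
print(f"speedup from 100x the bandwidth: {slow / fast:.4f}x")
```

Under those assumptions both links take about 100 seconds: a hundredfold
increase in bandwidth buys well under one percent, because nearly all of
the elapsed time is round-trip wait.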
> The bottom line is, redeeming a million tokens per second is practical
> with processing delay not much greater than network latency. (This was
> not true ten years ago...)
I think I'd quibble with details on that, but they really are not all that
important.
This is a problem that should be subject to a simplified thought experiment.
If you make extremely unrealistic positive assumptions about processing,
storage, and bandwidth but recognize that many RTTs between redemption
clients and servers will be above 100ms, most above 10ms, and essentially
all above 1ms, it is very hard to design a logical system (never mind a
collection of hardware) that will handle a million redemption requests per
second in a worthwhile manner (i.e. not repudiate valid tokens or validate
bogus or redeemed ones by design) when the request stream is being
engineered to break the system by parties with tens of thousands of hijacked
machines at their disposal.
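
One way to size the thought experiment is Little's law (requests in
flight = arrival rate x time each request spends in the system). The
sketch below assumes, optimistically, that a request occupies the system
for exactly one RTT and nothing more; the function name is made up for
illustration.

```python
def in_flight(rate_per_s: int, rtt_ms: int) -> int:
    """Little's law: average number of requests outstanding at once.

    Assumes each request stays in the system for exactly one RTT,
    which is a lower bound on real behaviour.
    """
    return rate_per_s * rtt_ms // 1000


RATE_PER_S = 1_000_000  # the million redemptions per second under discussion

for rtt_ms in (1, 10, 100):
    print(f"RTT {rtt_ms:3d} ms -> {in_flight(RATE_PER_S, rtt_ms):7,d} requests in flight")
```

Even with free processing and storage, the servers would be holding
somewhere between a thousand and a hundred thousand unfinished
redemptions at any instant, each of which has to be checked against the
others before a token can safely be marked as spent.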
I'm pretty sure that I'm not the best systems analyst/designer on this list.
I certainly hope I'm not the best one to have thought about e-postage. I'd
be happy to learn from a master how it is in fact possible to make an
ideally simplified minimal system like this work as a starting point for how
to assemble a more complex system that has more elements of reality in it. I
think (but may be wrong!) that it isn't possible to design a system that
will be theoretically capable of correctly handling a million redemption
requests per second of which ~90% are the result of someone working to break
the system.
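
For concreteness, this is roughly what I mean by "correctly handling" a
request, as a single-machine sketch. It is not the design from the draft:
the Redeemer class, the HMAC-tagged token format, and the in-memory spent
set are all assumptions made up for illustration.

```python
import hashlib
import hmac
import secrets


class Redeemer:
    """Hypothetical single-node redeemer: each valid token is accepted once."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._spent = set()  # serial numbers that have already been redeemed

    def _tag(self, serial: bytes) -> bytes:
        # Tokens are assumed to be (serial, HMAC-SHA256(key, serial)) pairs.
        return hmac.new(self._key, serial, hashlib.sha256).digest()

    def issue(self) -> tuple:
        serial = secrets.token_bytes(16)
        return serial, self._tag(serial)

    def redeem(self, serial: bytes, tag: bytes) -> bool:
        if not hmac.compare_digest(tag, self._tag(serial)):
            return False  # bogus token: never validate
        if serial in self._spent:
            return False  # already redeemed: never validate twice
        self._spent.add(serial)
        return True       # valid and unspent: never repudiate


redeemer = Redeemer(secrets.token_bytes(32))
serial, tag = redeemer.issue()
results = [
    redeemer.redeem(serial, tag),        # first presentation of a valid token
    redeemer.redeem(serial, tag),        # replay of the same token
    redeemer.redeem(serial, bytes(32)),  # forged tag
]
print(results)
```

The sketch sidesteps the hard part: the spent set is one structure on one
machine. Keeping that check-and-mark step consistent across servers that
are themselves many milliseconds apart, while most of the request stream
is replays and forgeries, is the part I don't know how to do.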