Problems with Proxies in HTTP
mnot@mnot.net
http://www.mnot.net/
General
Internet-Draft
This document discusses the use and configuration of proxies in HTTP, pointing
out problems in the currently deployed Web infrastructure along the way. It
then offers a few principles to base further discussion upon, and lists some
potential avenues for further exploration.
HTTP/1.1 was designed to accommodate proxies. It allows them (and
other components) to cache content expansively, and allows for proxies to break
“semantic transparency” by changing message content, within broad constraints.
As the Web has matured, more networks have taken advantage of this by deploying
proxies for a variety of reasons, in a number of different ways. This document
begins with a survey of the different ways that proxies are used, and shows how
they are interposed into communication.
Some uses of proxies cause problems (or the perception of them) for origin
servers and end users. While some uses are obviously undesirable from the
perspective of end users and/or origin servers, other effects of their
deployment are more subtle; these are examined later in this document.
These tensions between the interests of the stakeholders in every HTTP
connection – the end users, the origin servers and the networks they use –
have led to decreased trust in proxies, then to increasing deployment of
encryption, then to workarounds for encryption, and so forth.
Left unchecked, this escalation can erode the value of the Web itself.
Therefore, this document proposes straw-man principles upon which to base
further discussion.
Finally, it proposes some areas of technical investigation that may
yield solutions (or at least mitigations) for some of these problems.
Note that this document is explicitly about “proxies” in the sense that HTTP
defines them. Intermediaries that are interposed by the server (e.g., gateways
and so-called “Reverse Proxies”, as used in Content Delivery Networks) are out
of scope.
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”,
“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this
document are to be interpreted as described in RFC 2119.
HTTP proxies are interposed between user agents and origin servers for a
variety of purposes; some of them are with the full knowledge and consent of
end users, to their benefit, and some are solely for the purposes of the
network operator – sometimes even against the interests of the end users.
This section attempts to identify the different motivations networks have for
deploying proxies.
Some networks do not have direct Internet connectivity for Web browsing. These
networks can deploy proxies that do have Internet connectivity and then
configure clients to use them.
Such gatewaying between networks was one of the first uses for proxies.
An extremely common use of proxies is to interpose an HTTP cache, in order to
save bandwidth, improve end-user perceived latency, increase reliability, or
some combination of these purposes.
HTTP defines a detailed model for caching (see “Hypertext Transfer Protocol
(HTTP/1.1): Caching”); however, some lesser-known aspects of the
caching model can cause operational issues. For example, it allows caches to go
into an “offline” mode where most content can be served stale.
Also, proxy caches sometimes fail to honor the HTTP caching model, reusing
content when they should not. This can cause interoperability issues,
with the end user seeing overly “stale” content, or applications not operating
correctly.
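As a rough illustration of the check a cache is expected to make before reusing a stored response, the following Python sketch considers only the max-age, no-cache and no-store directives together with the Age header field. The function names are hypothetical, and the actual caching model (Expires, s-maxage, heuristic freshness, revalidation) is considerably more involved than this.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_fresh(headers, resident_seconds):
    """Return True if a stored response may be reused without revalidation.

    headers: dict mapping lower-cased response header names to values.
    resident_seconds: how long the response has been held in this cache.
    Simplification (assumption of this sketch): only max-age, no-cache and
    no-store are considered.
    """
    cc = parse_cache_control(headers.get("cache-control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return False
    max_age = cc.get("max-age")
    if max_age is None or not max_age.isdigit():
        return False  # no explicit lifetime: be conservative in this sketch
    # Age reported by upstream caches plus the time spent in this cache.
    current_age = int(headers.get("age", "0")) + resident_seconds
    return current_age < int(max_age)
```

A cache that skips a check of this kind, or applies its own longer lifetime, is one way the overly “stale” content described above can arise.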
Some proxies are deployed to aid in network policy enforcement; for example, to
control access to the network, requiring a login (as allowed explicitly by
HTTP’s proxy authentication mechanism), bandwidth shaping of HTTP access,
quotas, etc. This includes so-called “Captive Portals” used for network login.
Some uses of proxies for policy enforcement cause problems; e.g., when a proxy
uses URL rewriting to send a user a message (e.g., a “blocked” page), it can
make it appear as if the origin server is sending that message – especially
when the user agent isn’t a browser (e.g., a software update process).
Some networks attempt to filter HTTP messages (both request and response) based
upon network-specific criteria. For example, they might wish to stop users from
downloading content that contains malware, or that violates site policies on
appropriate content, or that violates local law.
Intermediary proxies as a mechanism for enforcing content restrictions are
often easy to circumvent. For example, a device might become infected by using
a different network, or a VPN. Nevertheless, they are commonly used for this
purpose.
Some content policy enforcement is also done locally to the user agent; for
example, several Operating Systems have machine-local proxies built in that
scan content.
Content filtering is often seen as controversial, depending on the context in
which it is used and how it is performed.
Some networks modify HTTP messages (both request and response) as they pass
through proxies. This might include the message body, headers, request-target,
method or status code.
Motivation for content modification varies. For example, some mobile networks
interpose proxies that modify content in an attempt to save bandwidth, improve
perceived performance, or transcode content to formats that limited-resource
devices can more easily consume.
Modifications also include adding metadata in headers for accounting purposes,
or removing metadata such as Accept-Encoding to make virus scanning easier.
In other cases, the modifications are more substantial. This could include
inserting advertisements, or changing the
layout of content in an attempt to make it easier to use.
Content modification is very controversial, with opinions often depending on
the context in which it is used and how it is performed. Many feel that,
without the explicit
consent of either the end user or the origin server, a proxy that modifies
content violates their relationship, thereby degrading trust in the Web overall.
However, it should be noted that HTTP/1.1 explicitly allows
“non-transparent” proxies that modify content in certain ways. Such proxies are
required to honor the “no-transform” directive, giving both user agents and
origin servers a mechanism to “opt out” of modifications; however, it is not
technically enforced.
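The opt-out check itself is simple, which is part of why its lack of enforcement matters. A minimal Python sketch of what a well-behaved transforming proxy might do follows; the function names are hypothetical, and headers are assumed to be available as dicts keyed by lower-cased field name.

```python
def has_no_transform(headers):
    """Return True if Cache-Control carries the no-transform directive."""
    value = headers.get("cache-control", "")
    directives = [d.strip().lower() for d in value.split(",")]
    return "no-transform" in directives


def may_transform(request_headers, response_headers):
    """Leave the message alone if either the user agent or the origin opted out."""
    return not (has_no_transform(request_headers)
                or has_no_transform(response_headers))
```

Nothing in this check is visible to the user agent or the origin server; they have no way to tell whether the proxy ran it.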
“Guidelines for Web Content Transformation Proxies 1.0” is a product of the W3C Mobile Web Best
Practices Working Group that attempts to set guidelines for content
modification proxies. Again, it is a policy document, without technical
enforcement measures.
How a proxy is interposed into a network flow often has a great effect on
perceptions of its operation by end users and origin servers. This section
catalogues the ways that this happens, and potential problems with each.
The original way to interpose a proxy was to manually configure it into the
user agent. For example, most browsers still have the ability to have a proxy
hostname and port configured for HTTP; many Operating Systems have system-wide
proxy settings.
Unfortunately, manual configuration suffers from several problems:
Users often lack the expertise to manually configure proxies.
When the user changes networks, they must manually change proxy settings, a
laborious task. This makes manual configuration impractical in a modern,
mobile-driven world.
Not all HTTP stacks support manual proxy configuration. Therefore, a proxy
administrator cannot rely upon this method.
The limitations of manual configuration were recognized long ago. The solution
that evolved was a format called “proxy.pac” that allowed the
proxy configuration to be automated, once the user agent had loaded it.
Proxy.pac is a JavaScript format; before each request is made, it is dispatched
to a function in the file that returns a string that denotes whether a proxy is
to be used, and if so, which one to use.
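To make the returned string concrete: in common implementations it is a semicolon-separated list of choices such as "PROXY proxy.example.com:8080; DIRECT", tried in order. That convention is an assumption here, since it is not spelled out in this document. The following Python sketch parses such a string; the function name and the host name are hypothetical.

```python
def parse_pac_result(result):
    """Parse a proxy.pac-style result string into an ordered list of choices.

    Each entry is ("DIRECT", None) or (kind, "host:port"), in the order the
    user agent would try them.
    """
    choices = []
    for entry in result.split(";"):
        fields = entry.split()
        if not fields:
            continue  # tolerate empty entries such as a trailing ";"
        kind = fields[0].upper()
        if kind == "DIRECT":
            choices.append(("DIRECT", None))
        elif len(fields) == 2:
            choices.append((kind, fields[1]))
        else:
            raise ValueError("malformed proxy.pac entry: %r" % entry)
    return choices
```

For example, parse_pac_result("PROXY proxy.example.com:8080; DIRECT") returns [("PROXY", "proxy.example.com:8080"), ("DIRECT", None)]. What the string does not say is how long to wait before moving from one choice to the next.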
Discovery of the appropriate proxy.pac file for a given network can be made
using a DHCP extension, WPAD (the Web Proxy Auto-Discovery Protocol). WPAD started as a simple protocol; it conveys
a URL that locates the proxy.pac file for the network.
Unfortunately, the proxy.pac/WPAD combination has several operational issues
that limit its deployment:
The proxy.pac format does not define timeouts or failover behavior
precisely, leading to wide divergence between implementations. This makes
supporting multiple user agents reliably difficult for the network.
WPAD is not widely implemented by user agents; some only implement proxy.pac.
In those user agents where it is implemented, WPAD is often not the default,
meaning that users need to configure its use.
Neither proxy.pac nor WPAD has been standardized, leading to implementation
divergence and resulting interoperability problems.
There are DNS-based variants of WPAD, adding to the confusion.
DHCP options generally require tight integration with the operating system to
pass the results to HTTP-based applications. While this level of integration
is found between operating systems and their provided applications, the interface may or
may not be available to third parties.
WPAD can be spoofed, allowing attackers to interpose a proxy and intercept
traffic.
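As an illustration of the DNS-based variants and of why spoofing is a concern, the Python sketch below lists the URLs a client might probe. It assumes the commonly described behaviour of prepending a "wpad" label to successively shorter parent domains and fetching "/wpad.dat"; that behaviour, the function name and the min_labels cut-off are assumptions of this sketch, not something this document specifies.

```python
def wpad_dns_candidates(client_fqdn, min_labels=2):
    """List candidate proxy.pac URLs for a DNS-based WPAD lookup.

    client_fqdn: the client's fully-qualified host name.
    min_labels: stop once the parent domain has fewer labels than this
                (hypothetical safeguard against probing a bare TLD).
    """
    labels = client_fqdn.strip(".").split(".")
    urls = []
    # Drop the host label first, then walk up through the parent domains.
    for i in range(1, len(labels)):
        domain_labels = labels[i:]
        if len(domain_labels) < min_labels:
            break
        urls.append("http://wpad.%s/wpad.dat" % ".".join(domain_labels))
    return urls
```

Whoever controls any name on that list can supply the proxy configuration, so the cut-off point matters a great deal.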
The problems with manual configuration and proxy.pac/WPAD have led to the wide
deployment of a third style of interposition: interception proxies.
Interception occurs when lower-layer protocols are configured to route HTTP
traffic to a host other than the origin server for the URI in question. It
requires no client configuration (hence its popularity over other methods). See
“Internet Web Replication and Caching Taxonomy” for an example of an
interception-related protocol.
Interception is also strongly motivated when it is necessary to assure that the
proxy is always used, e.g., to enforce policy.
Interception is problematic, however, because it is often done without the
consent of either the end user or the origin server. This means that a response
that appears to be coming from the origin server is actually coming from the
intercepting proxy. This makes it difficult to support features like
proxy authentication, as the unexpected status code breaks many clients (e.g.,
non-interactive applications like software installers).
In addition, as adoption of multi-path TCP (MPTCP) increases, the
ability of intercepting proxies to offer a consistent service degrades.
More recently, it’s become more common for a proxy to be interposed as a side
effect of another choice by the user.
For example, the user might decide to add virus scanning – either as installed
software, or a service that they configure from their provider – that is
interposed as a proxy. Indeed, almost all desktop virus scanners and content
filters operate in this fashion.
This approach has the merits of both being easy and obtaining explicit user
consent. However, in some cases, the end user might not understand the
consequences of use of the proxy, especially upon security and interoperability.
Deployment of proxies has an effect on the HTTP protocol itself. Because a
proxy implements both a server and a client, any limitations or bugs in its
implementation impact the protocol’s use.
For example, HTTP has a defined mechanism for upgrading the protocol of a
connection, to aid in the deployment of new versions of HTTP (such as HTTP/2.0)
or a completely different protocol (e.g., WebSocket).
However, operational experience has shown that a significant number of proxy
implementations do not correctly implement it, leading to dangerous situations
where two ends of an HTTP connection think different protocols are being spoken.
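One defensive measure on the client side is to refuse to switch protocols unless the response is unambiguous. The Python sketch below shows such a check under simplifying assumptions: headers are a dict keyed by lower-cased field name, and the function name is hypothetical.

```python
def upgrade_succeeded(status, headers, requested_protocol):
    """Check that a response really switches to the protocol the client asked for."""
    if status != 101:
        return False
    # Connection is hop-by-hop; a proxy that mishandles it may drop the token.
    connection_tokens = [t.strip().lower()
                         for t in headers.get("connection", "").split(",")]
    if "upgrade" not in connection_tokens:
        return False
    upgraded_to = [t.strip().lower()
                   for t in headers.get("upgrade", "").split(",")]
    return requested_protocol.lower() in upgraded_to
```

A check like this only protects the client's view of the connection; it cannot tell whether an intermediary further along still believes it is relaying HTTP.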
Another example is the Expect/100-continue mechanism in HTTP/1.1, which is often
incorrectly implemented. Likewise, differences in support for trailers limit
protocol extensions.
It has become more common for Web sites to use TLS in an attempt to
avoid many of the problems above. Many have advocated use of TLS more broadly;
for example, see the EFF’s HTTPS Everywhere program, and
SPDY’s default use of TLS.
However, doing so engenders a few problems.
Firstly, TLS as used on the Web is not a perfectly secure protocol, and using
it to protect all traffic gives proxies a strong incentive to work around it,
e.g., by deploying a certificate authority directly into browsers, or buying a
sub-root certificate.
Secondly, it removes the opportunity for the proxy to inform the user agent of
relevant information; for example, conditions of access, access denials, login
interfaces, and so on. User Agents currently do not display any feedback from
the proxy, even in the CONNECT response (e.g., a 4xx or 5xx error), limiting its
ability to inform users of what’s going on.
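To show what is being discarded, the following Python sketch parses a raw CONNECT response as a client might read it from the proxy connection. The function names are hypothetical and the parsing is deliberately minimal (no header folding, no chunked bodies).

```python
def parse_connect_response(raw):
    """Split a raw CONNECT response into (status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, _, rest = lines[0].partition(" ")
    code, _, reason = rest.partition(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


def tunnel_refused(status):
    """Anything other than a 2xx reply means the proxy did not open the tunnel."""
    return not 200 <= status < 300
```

Given a reply such as "HTTP/1.1 403 Forbidden" with a short explanatory body, the status, reason phrase and body are all available to the client at this point; they are simply not shown to the user.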
Finally, it removes the opportunity for services provided by a proxy that the
end user may wish to opt into. For example, consider when a remote village
shares a proxy server to cache content, thereby helping to overcome the
limitations of their Internet connection. TLS-protected HTTP traffic cannot be
cached by intermediaries, removing much of the benefit of the Web to what is
arguably one of its most important target audiences.
It is now becoming more common for a proxy to man-in-the-middle TLS connections
(see “SSL/TLS Interception Proxies and Transitive Trust” for an overview), to gain access to the application message
flows. This represents a serious degradation in the trust infrastructure of the
Web.
Worse is the situation where proxies present a certificate that triggers a
warning, inuring users to certificate warnings that they must then ignore in
order to receive service.
Every HTTP connection has at least three major stakeholders: the user
(through their agent), the origin server (possibly using gateways such as a
CDN) and the networks between them.
Currently, the capabilities of these stakeholders are defined by how the Web is
deployed. Most notably, networks sometimes change content. If they change it
too much, origin servers will start using encryption. Changing the way that
HTTP operates therefore has the potential to re-balance the capabilities of the
various stakeholders.
This section proposes several straw-man principles for consideration as the
basis of those changes. Their sole purpose here is to provoke discussion.
As illustrated above, there are many legitimate uses for proxies, and they are
a necessary part of the architecture of the Web. While not all uses of proxies
are legitimate – especially when they’re interposed without the knowledge or
consent of the end user and the origin – undesirable intermediaries (i.e.,
those that break the reasonable expectations of other stakeholders) are a small
portion of those deployed.
Any solution needs to give all stakeholders – end users, networks and origin
servers – a strong incentive towards security.
This has subtle implications. If networks are disempowered disproportionately,
they might react by blocking secure connections, discouraging origin servers
(who often have even stronger profit incentives) from deploying encryption,
which would result in a net loss of security.
Security at the expense of long-term interoperability is not a good trade.
For example, if networks decide to only allow secure connections to well-known,
large origin servers, it creates a “walled garden” that favours big sites at
the expense of less well-known ones.
Likewise, if a jurisdiction cannot use standard-conformant browsers to impose
its legal requirements upon network users, it might decide to create a
separate Web based upon competing technology.
When a proxy is interposed, the user needs to be informed about it, so they
have the opportunity to change their configuration (e.g., attempt to introduce
encryption), or not use the network at all.
Proxies also need to be strongly authenticated; i.e., users need to be able to
verify who the proxy is.
When a proxy is interposed, the user needs to be able to tunnel any request
through it without its content (or that of the response) being exposed to the
proxy.
This includes both “https://” and “http://” URIs.
A proxy can refuse to forward any request. This includes a request to a
specific URI, or from a specific user, and includes refusing to allow tunnels
as described above.
The “no”, however, needs to be explicit, and explicitly from the proxy.
Any changes to the message body, request URI, method, status code, or
representation header fields of an HTTP message need to be detectable by the
origin server or user agent, as appropriate, if they desire it.
This allows a proxy to be trusted, but its integrity to be verified.
It must be possible to configure a proxy extremely easily; the adoption of
interception over proxy.pac/WPAD illustrates this very clearly.
There are many situations where a proxy needs to communicate with the end user;
for example, to gather network authentication credentials, communicate network
policy, report that access to content has been denied, and so on.
Currently, HTTP has poor facilities for doing so. The proxy authentication
mechanism is extremely limited, and while there are a few status codes that are
defined as being from a proxy rather than the origin, they do not cover all
necessary situations.
The Warning header field was designed as a very limited form of communication
between proxies and end users, but it has not been widely adopted, nor exposed
by User Agents.
Importantly, proxies also need a limited communication channel when TLS is in
use, for similar purposes.
Equally important, the communication needs to clearly come from the proxy,
rather than the origin, and be strongly authenticated.
While some users are sophisticated in their understanding of Web security, they
are in a vanishingly small minority. The concepts and implications of many
decisions regarding security are subtle, and require an understanding of how
the Web works; describing these trade-offs in a modal dialogue box that gets in
the way of the content the user wants has been proven not to work.
Similarly, while some Web publishers are sophisticated regarding security, the
vast majority are not (as can be proven by the prevalence of cross-site
scripting attacks).
Therefore, any changes cannot rely upon perfect understanding by these parties,
or even any great effort upon their part. This implies that user interface will
be one of the biggest challenges faced, both in the browser and for any changes
server-side.
Notably, the most widely understood indicator of security today is the “lock
icon” that shows when a connection is protected by TLS. Any erosion of the
commonly-understood semantics of that indicator, as well as “https://” URIs, is
likely to be extremely controversial, because it changes the already-understood
security properties of the Web.
Another useful emerging convention is that of “Incognito” or “private” mode,
where the end user has requested enhanced privacy and security. This might be
used to introduce higher requirements for the interposition of intermediaries,
or even to prohibit their use without full encryption.
HTTP is used in a wide variety of environments. As such there can be no
assumption that a user is sitting on the other end to interpret information or
answer questions from proxies.
Getting consent from users, as well as informing them, can take a variety of
forms. For example, if we require that users consent to using a proxy, that
consent could be obtained through a modal dialog in the browser, or through a
written agreement between an employer and their employee.
Likewise, a browser vendor may choose not to implement some optional portions
of the specification, based upon how they want to position their product with
their audience.
It’s very tempting for a committee to proclaim that proxies MUST do this and
SHOULD NOT do that, but the reality is that proxies, like any other actor
in a networked system, will do what they can, not what they’re told to do, if
they have an incentive to do it.
Therefore, it’s not enough to say that (for example), “proxies have to honor
no-transform” as HTTP/1.1 does. Instead, the protocol needs to be designed in
a way so that either transformations aren’t possible, or if they are, they
can be detected (with appropriate handling by User Agents defined).
Any improvements to the proxy ecosystem MUST be incrementally deployable, so
that existing clients can continue to function.
Finally, this section lists some areas of potential future investigation,
bearing the principles suggested above in mind.
The IETF has long fought against interception proxies, as they are
indistinguishable from Man-In-The-Middle attacks. Nevertheless, they persist as
the preferred method for interposing proxies in many networks.
Unless another mechanism can be found or defined that offers equally attractive
properties to network operators, we ought to consider that they’ll continue to
be deployed, and work to find ways to make their operation both more verifiable
and unnecessary (or at least legitimate).
Many of the flaws in proxy.pac and WPAD can be fixed by careful specification
and standardization, with active participation by both implementers and those
that deploy them.
HTTP’s use of TLS currently offers no way for an interception proxy
to communicate with the user agent on its own behalf. This might be necessary
for network authentication, notification of filtering by hostname, etc.
The challenge in defining such a mechanism is avoiding the opening of new
attack vectors; if unauthenticated content can be served as if it were from the
origin server, or the user can be encouraged to “click through” a dialog, it
has severe security implications. As such, the user experience would need to be
carefully considered.
HTTP currently defines two status codes that are explicitly generated by a
proxy:
504 Gateway Timeout - when a proxy (or gateway) times out going
forward
511 Network Authentication Required - when authentication
information is necessary to access the network
It might be interesting to discuss whether a separate user experience can be
formed around proxy-specific status codes, along with the definition of new
ones as necessary.
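As a starting point for that discussion, the Python sketch below shows how a user agent might route a response to a distinct, proxy-specific presentation based on the two status codes listed above. The names and the returned labels are hypothetical; no existing user agent API is implied.

```python
# The two status codes this document lists as explicitly proxy-generated.
PROXY_STATUS_CODES = {
    504: "Gateway Timeout",
    511: "Network Authentication Required",
}


def choose_presentation(status):
    """Pick a (hypothetical) UI treatment for a response status code."""
    if status in PROXY_STATUS_CODES:
        # Shown as a message from the network, not from the site in the URL bar.
        return "network-notice"
    return "origin-content"
```

Any new proxy-specific status codes would simply extend the table; the harder question is whether the status code can be trusted to have come from the proxy at all.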
While TLS can be used end-to-end for “https://” URIs, support for connecting to
a proxy itself using TLS (e.g., for “http://” URIs) is spotty. Using a proxy
without strong proof of its identity introduces security issues, and if a proxy
can legitimately insert itself into communication, its identity needs to be
verifiable.
To allow users to tunnel any request through proxies without revealing its
contents, it must be possible to use TLS for HTTP URIs.
Proxies can then choose whether to allow such tunneled traffic, and if not, the
user can choose whether to trust the proxy.
Currently, it is possible to exploit the mismatched incentives and other flaws
in the CA system to cause a browser to trust a proxy as authoritative for a
“https://” URI without full user knowledge. This needs to be remedied; otherwise, proxies will continue to man-in-the-middle TLS.
Signatures for HTTP content – both requests and responses – have been
discussed on and off for some time.
Of particular interest here, signed responses would allow a user-agent to
verify that the origin’s content has not been modified in transit, whilst still
allowing it to be cached by intermediaries.
Likewise, if header values can be signed, the caching policy (as expressed by
Cache-Control, Date, Last-Modified, Age, etc.) can be signed, meaning it can be
verified as being adhered to.
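The following Python sketch illustrates the idea of covering both the body and selected header fields with one verifiable value. It uses an HMAC with a shared key purely as a stand-in available in the standard library; a deployable mechanism would need public-key signatures and a key distribution story, neither of which is shown. The function names, the set of covered header fields and the canonical form are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical choice of header fields covered by the signature.
SIGNED_HEADERS = ("cache-control", "date", "last-modified")


def _signing_input(headers, body):
    """Build a canonical byte string from the covered headers and the body."""
    lines = ["%s:%s" % (name, headers.get(name, "").strip())
             for name in SIGNED_HEADERS]
    return "\n".join(lines).encode("utf-8") + b"\n\n" + body


def sign_response(key, headers, body):
    """Origin side: compute the value to attach to the response."""
    return hmac.new(key, _signing_input(headers, body), hashlib.sha256).hexdigest()


def verify_response(key, headers, body, signature):
    """User agent side: detect changes to the body or the covered headers."""
    expected = sign_response(key, headers, body)
    return hmac.compare_digest(expected, signature)
```

With this shape, a cache can store and serve the response unchanged and verification still succeeds, whereas a proxy that rewrites the body or lengthens max-age causes verify_response to return False.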
Note that, if properly designed, a signature mechanism could work over TLS,
separating the trust relationship between the UA and the origin server and that
of the UA and its proxy (with appropriate consent).
There are significant challenges in designing a robust, widely-deployable HTTP
signature mechanism. One of the largest is an issue of user interface - what
ought the UA do when encountering a bad signature?
Plenty of them, I suspect.
This document benefits from conversations and feedback from many people,
including Amos Jeffries, Willy Tarreau, Patrick McManus, Roberto Peon, Guy
Podjarny, Eliot Lear, Brad Hill and Martin Nilsson.
Key words for use in RFCs to Indicate Requirement Levels. RFC 2119.
Hypertext Transfer Protocol -- HTTP/1.1. RFC 2616.
HTTP Over TLS. RFC 2818.
Internet Web Replication and Caching Taxonomy. RFC 3040.
The Transport Layer Security (TLS) Protocol Version 1.2. RFC 5246.
The WebSocket Protocol. RFC 6455.
Additional HTTP Status Codes. RFC 6585.
TCP Extensions for Multipath Operation with Multiple Addresses. RFC 6824.
Guidelines for Web Content Transformation Proxies 1.0.
SPDY Protocol.
Hypertext Transfer Protocol (HTTP/1.1): Caching.
Proxy Auto-Config.
Web Proxy Auto-Discovery Protocol.
HTTPS Everywhere.
SSL/TLS Interception Proxies and Transitive Trust.