Privacy Pass                                                M. McFadden
Internet Draft                            internet policy advisors, llc
Intended status: Informational                          January 13, 2022
Expires: July 13, 2022

            Privacy Pass: Centralization Problem Statement
            draft-mcfadden-pp-centralization-problem-02.txt

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This Internet-Draft will expire on July 13, 2022.

Copyright Notice

Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Abstract

This document discusses the problems associated with strict upper
bounds on the number of Privacy Pass servers in the proposed Privacy
Pass ecosystem. It documents a proposed problem statement.

Table of Contents

1. Introduction
2. Potential Privacy Concerns
3. Centralization in Privacy Pass - Problem Statement
   3.1. Architectural Problems
   3.2. Engineering Problems
   3.3. Practical Problems
4. Problem Statement and Potential for Mitigations
   4.1. Problem Statement
   4.2. Potential Mitigations
   4.3. Redemption Contexts as a Mitigation
5. Security Considerations
6. IANA Considerations
7. References
   7.1. Normative References
   7.2. Informative References
8. Acknowledgments

1. Introduction

The Privacy Pass protocol provides a set of cross-domain
authorization tokens that protect the client's anonymity in message
exchanges with a server. This allows clients to communicate an
attestation of a previously authenticated server action without
having to reauthenticate manually. The tokens preserve anonymity in
the sense that the act of revealing them cannot be linked back to the
session in which they were originally issued.

The protocol itself is defined in [ID.davidson-pp-protocol-01] and
the architectural framework is described in
[ID.davidson-pp-architecture-01].

The architecture document defers the issue of server centralization.
This document discusses the problems related to server centralization
in Privacy Pass, the impact of centralization on the protocol's
privacy goals, and some potential mitigations for the problem.

An important feature of the Privacy Pass architecture is the concept
of the anonymity set of each individual client. The Privacy Pass
ecosystem has a set of servers that issue tokens to clients; the
tokens can later be redeemed at the application layer for
authentication.

Trust is an important component in Privacy Pass. The servers have to
publish their public keys and details of the ciphersuite they are
using. It is necessary to publish these in a globally consistent,
tamper-proof data structure. Clients that use the same registry of
server information need to coordinate in some way to validate that
they have the same view of the registry and its data.
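The coordination requirement in the previous paragraph can be
illustrated with a small sketch. The example below is not taken from
any Privacy Pass specification; it simply shows one hypothetical way
that two clients could confirm they share the same view of a server
registry by comparing a digest of its contents. The registry
encoding, field names, and key values are assumptions made for
illustration only.

   # Hypothetical illustration only: compare two clients' views of a
   # Privacy Pass server registry by hashing a canonical serialization.
   import hashlib
   import json

   def registry_digest(registry):
       """Return a SHA-256 digest over a canonical encoding of the registry."""
       canonical = json.dumps(sorted(registry, key=lambda e: e["server"]),
                              sort_keys=True, separators=(",", ":"))
       return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

   # Each (illustrative) entry pairs a server with its published key and
   # ciphersuite, as described in the Introduction.
   client_a_view = [
       {"server": "issuer1.example", "public_key": "KEY1", "ciphersuite": "P-384"},
       {"server": "issuer2.example", "public_key": "KEY2", "ciphersuite": "P-384"},
   ]
   client_b_view = [dict(entry) for entry in client_a_view]  # second client's copy

   # Matching digests mean the two clients hold the same registry view;
   # a mismatch suggests one of them received a different (possibly
   # targeted) set of keys.
   print(registry_digest(client_a_view) == registry_digest(client_b_view))

In a real deployment the registry would need to be distributed through
a tamper-proof, globally consistent mechanism; a digest comparison of
this kind only detects divergence, it does not prevent it.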
Four server running modes are discussed in
[ID.davidson-pp-architecture-01]. Common to all four is a discussion
of the need to set an upper limit on the number of servers that are
allowed. The motivation for limiting the number of servers is that
there is a correlation between larger numbers of servers and dilution
of privacy.

2. Potential Privacy Concerns

When a client redeems a token in Privacy Pass, there is very little
information in the token itself other than the key that was used to
sign the token. A key feature of the protocol is that any client can
only remain private relative to the entire space of users using the
protocol.

In three of the four server running modes, a Privacy Pass verifier is
able to trigger redemption for any of the available servers. The
greater the number of servers, the greater the loss in anonymity.

The architecture document, [ID.davidson-pp-architecture-01], provides
an example where, if there are 32 servers, the verifier learns 32
bits of information about the client. In certain circumstances,
having that much information about the client can lead to the client
being uniquely identified and the goals of Privacy Pass being
thwarted. As a result, the architecture document supplies the
following mitigation:

"In cases where clients can hold tokens for all servers at any given
time, a strict bound SHOULD be applied to the active number of
servers in the ecosystem. [ID.davidson-pp-architecture-01]."

Restricting the number of redemption tokens held at the client has
also been considered. However, establishing control over the client,
and over the number of tokens it holds, is far more difficult than
restricting the number of active servers.
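The arithmetic behind the 32-bit example can be made concrete. The
short sketch below is illustrative and not part of any Privacy Pass
document: it treats the verifier's knowledge of which issuers a
client can redeem tokens for as an N-bit pattern and compares the
number of distinguishable patterns with a population size. The
population figure is an assumption chosen only to show the scale of
the effect.

   # Illustrative arithmetic only: how quickly issuer-holding patterns
   # partition a user population.

   def distinguishable_patterns(num_issuers):
       """Each issuer contributes one bit (token held or not), so a
       verifier that can probe every issuer sees up to 2^N patterns."""
       return 2 ** num_issuers

   population = 4_000_000_000  # assumed global client population (illustrative)

   for n in (4, 10, 32):
       patterns = distinguishable_patterns(n)
       avg_set = population / patterns
       print(f"{n:>2} issuers -> {patterns:,} patterns, "
             f"average anonymity set ~{avg_set:,.0f} clients")

   # With 32 issuers there are 2^32 (about 4.3 billion) patterns -- more
   # than the assumed population -- so a full redemption pattern can, in
   # the worst case, identify a client uniquely.

The averages assume patterns are uniformly distributed, which
flatters the real situation; in practice some patterns are far rarer
and correspondingly more identifying.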
3. Centralization in Privacy Pass - Problem Statement

For Privacy Pass to succeed, clients must be able to acquire tokens
that they can later redeem with greater privacy and anonymity. This
document does not discuss the goals of privacy or anonymity. Instead,
it identifies a problem, related to the upper bound on the number of
servers, that affects the Privacy Pass ecosystem.

For the purposes of this draft, "server centralization" is the strict
limit, or upper bound, on the number of servers available from which
a client can acquire a token for later redemption.

The architecture draft specifies four as this upper bound.

The problem statement for Privacy Pass can be summarized: an upper
bound on available Privacy Pass servers creates architectural,
engineering and practical problems for the deployment of the
protocol. Any successful deployment of Privacy Pass must find
mitigations for these problems.

3.1. Architectural Problems

Centralization is a problem space that has been exhaustively explored
by others, not least within the IETF itself. The now expired IAB
draft, [I-D.arkko-arch-infrastructure-centralisation-00], discussed
six separate issues related to centralization, and several of them
appear to apply to Privacy Pass.

Having a very limited number of servers available creates an
architectural strain on avoiding single points of failure. While the
Privacy Pass architecture document does specify up to four servers,
this is a very small number for, potentially, billions of possible
users. This also assumes that the protocol is only used in
"human-to-server" applications and not in situations where the client
is not a human but some other device, acting either on behalf of a
human or autonomously. Strict limitations on the number of servers
pose the question of how the Privacy Pass architecture can scale in
the presence of a large user base.

The Privacy Pass architecture, by limiting the number of servers,
also concentrates information and potentially limits the ability of
other providers to compete in offering token-issuing services.
Concentrating information in a small number of servers also creates a
problem: it presents opportunities, for example through machine
learning, to collect and process data about the clients requesting
tokens.

A side effect of limiting the number of servers is that a significant
amount of information ends up under the control of a small number of
entities. A client may trust a Privacy Pass server and send it
information about itself in order to request tokens. However, the
protocol itself can make no guarantee about the data handling
practices of the server operator. Circumstances outside the control
of the protocol may create pressure to misuse the data concentrated
at the small number of servers.

3.2. Engineering Problems

Even if a very limited number of servers can be provided while still
supporting the goals of the protocol, there is clearly a global
scaling problem that needs to be solved. Each server must publish a
globally consistent and protected view of its public key and the
cryptosystem in use. Without access to that view, the system appears
to have no defined failure mode.

With a small number of servers, the ecosystem would likely be
dominated by a few providers. With a dominant position in the market,
these Privacy Pass server operators would have a significant impact
on default connectivity parameters in operating systems and browsers.
As a result, a change to the way the access mechanism works for a
variety of applications would have broad impacts on a wide variety of
users. The relationship between engineering choices and their effect
on a broad community of users has a recent example in DNS over HTTPS.

3.3. Practical Problems

Limits on the number of server operators also result in practical
problems outside the protocol. If a small number of server operators
appear in the Privacy Pass ecosystem, and a large number of clients
enter into trust relationships with those operators, what happens
when those operators are acquired by other organizations that have
different data handling and privacy policies than the original
operator?

With the requirement for a small number of operators, the
architecture also does not consider the possibility that an
organization or government could require the use of Privacy Pass with
a particular set of servers. Such a requirement could potentially
turn the goals of Privacy Pass against itself.

4. Problem Statement and Potential for Mitigations

4.1. Problem Statement

An upper bound on available Privacy Pass servers creates
architectural, engineering and practical problems for the deployment
of the protocol. Any successful deployment of Privacy Pass must find
mitigations for these problems.

4.2. Potential Mitigations

The motivation for having an upper bound on available Privacy Pass
servers is to limit the amount of information that could be gathered,
because a client could be forced to redeem tokens for any issuing
key. A larger number of keys means a greater amount of information
exposed.

One alternative to limiting the number of servers is to constrain
clients so that they only possess redemption tokens for a small
number of servers. This potential mitigation does not address how the
tokens might be cached, although it does suggest how the limitation
might be implemented. However, there is much engineering experience
to suggest that making such a limitation work across a very large
number of clients is a much greater engineering and deployment
problem than placing the restriction at the server.

If restricting the number of servers is essential to Privacy Pass -
and mitigations at either the server or the client are difficult to
achieve - it is hard to see where mitigations for the problem
statement will emerge.

4.3. Redemption Contexts as a Mitigation

Contexts are groupings of resources that share anonymity and privacy
properties. The current architecture document has a single, global
context for redemption. It is this feature that causes the problem
outlined in Section 4.1 above: with N issuers in the global
ecosystem, there are 2^N possible anonymity sets. Adding metadata
bits increases the number of anonymity sets further.

With a global redemption context, the total number of issuers must be
kept below ten in order to maintain anonymity sets of at least 5,000.
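The figure of fewer than ten issuers follows from the same 2^N
partitioning arithmetic once a population size is assumed. The sketch
below is illustrative only: the population of five million clients is
an assumption chosen to reproduce the order of magnitude in the text,
not a number taken from the architecture document.

   # Illustrative only: solve for the largest issuer count N such that an
   # assumed client population still yields anonymity sets of at least a
   # target size, given 2^N possible redemption patterns.

   def max_issuers(population, min_anonymity_set):
       """Largest N with population / 2**N >= min_anonymity_set."""
       n = 0
       while population / (2 ** (n + 1)) >= min_anonymity_set:
           n += 1
       return n

   population = 5_000_000   # assumed client population (not from the draft)
   target_set = 5_000       # minimum anonymity set size discussed in the text

   n = max_issuers(population, target_set)
   print(f"At most {n} issuers keep average anonymity sets >= {target_set}")
   # 5,000,000 / 2^9 is about 9,766, but 5,000,000 / 2^10 is about 4,883,
   # so N must stay below ten -- matching the bound quoted above. Metadata
   # bits tighten the bound further, since each bit doubles the partition
   # count.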
One possible mitigation is to limit redemptions to a specific, shared
context. Such an approach could limit the information available - and
the potential for leakage - to that context. This type of solution
would rely, in part, on strong security and privacy boundaries
between contexts: redemptions in one context would reveal nothing
about activity in another only if there is genuinely no leakage of
information between those contexts.

While this potential mitigation is not reflected in the Privacy Pass
architecture, it is unclear whether it should be part of the protocol
design or left to the application layer to implement. If it is left
to the application layer, there is potential for the anonymity sets
to be very small and not meet the privacy goals of the protocol.
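As an illustration of what a per-context restriction might look like,
the sketch below models each redemption context as carrying its own
small allow-list of issuers, so that a verifier inside one context
can only distinguish clients by the issuers valid in that context.
This is purely hypothetical; neither the context structure nor the
allow-list mechanism appears in the Privacy Pass documents.

   # Hypothetical sketch of per-context issuer allow-lists. Nothing here
   # is defined by Privacy Pass; it only illustrates how contexts bound
   # the information a verifier can learn at redemption time.

   CONTEXTS = {
       # context name -> issuers whose tokens are redeemable in that context
       "news-sites": {"issuer1.example", "issuer2.example"},
       "video-cdn":  {"issuer3.example"},
   }

   def redeemable(context, issuer):
       """A verifier in `context` only accepts (and so only learns about)
       tokens from issuers allowed in that context."""
       return issuer in CONTEXTS.get(context, set())

   def context_partitions(context):
       """Upper bound on anonymity-set partitions visible in one context:
       2^k for the k allowed issuers, instead of 2^N for all N issuers."""
       return 2 ** len(CONTEXTS.get(context, set()))

   print(redeemable("news-sites", "issuer3.example"))  # False: out of context
   print(context_partitions("news-sites"))             # 4 partitions, not 2^N

The benefit depends entirely on the boundaries holding: if a verifier
can correlate redemptions across contexts, the partitions multiply
and the global 2^N problem returns.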
What is not clear is how the consolidation considerations are
affected by the development of a "symmetric mode" for Privacy Pass.
The symmetric mode provides optional metadata but does not enable
public verifiability. The goal of this change is to remain consistent
with work in the W3C. The mode will use the POPRF algorithm, which
does not change the architectural characteristics considered in this
document.

In addition, there is related work at the IETF on another form of
anonymous token, called Private Access Tokens. The centralization
considerations for Private Access Tokens are beyond the scope of this
draft.

5. Security Considerations

This entire document concerns security considerations for Privacy
Pass. In particular, it addresses the specific problem associated
with centralization of Privacy Pass servers.

6. IANA Considerations

This memo contains no instructions or requests for IANA. The authors
continue to appreciate the efforts of IANA staff in support of the
IETF.

7. References

7.1. Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
    Levels", BCP 14, RFC 2119, March 1997.

7.2. Informative References

[2] Celi, S., Davidson, A., and A. Faz-Hernandez, "Privacy Pass
    Protocol Specification", Work in Progress, Internet-Draft,
    draft-ietf-privacypass-protocol-00, 5 January 2021.

[3] Valdez, S., "Privacy Pass HTTP API", Work in Progress,
    Internet-Draft, draft-ietf-privacypass-http-api-00,
    5 January 2021.

8. Acknowledgments

This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

Mark McFadden
Internet policy advisors, ltd
Chepstow, Wales, United Kingdom

Email: mark@internetpolicyadvisors.com