REPUTE Working Group                                       N. Borenstein
Internet-Draft                                                  Mimecast
Intended status: Standards Track                            M. Kucherawy
Expires: February 28, 2014
                                                        A. Sullivan, Ed.
                                                               Dyn, Inc.
                                                         August 27, 2013


                    A Model for Reputation Reporting
                       draft-ietf-repute-model-08

Abstract

   This document describes a general architecture for a reputation-based
   service and a model for requesting reputation-related data over the
   Internet, where "reputation" refers to predictions or expectations
   about an entity or an identifier such as a domain name. The document
   roughly follows the recommendations of RFC 4101 for describing a
   protocol model.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 28, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Overview
   3.  High-Level Architecture
   4.  Terminology and Definitions
     4.1.  Response Set
     4.2.  Reputon
   5.  Information Represented in a Response Set
   6.  Information Flow in the Reputation Query Protocol
   7.  IANA Considerations
   8.  Privacy Considerations
     8.1.  Data In Transit
     8.2.  Aggregation
     8.3.  Collection Of Data
   9.  Security Considerations
     9.1.  Biased Reputation Agents
     9.2.  Malformed Messages
     9.3.  Further Discussion
   10. Informative References
   Appendix A.  Public Discussion
   Authors' Addresses

1.  Introduction

   Historically, many Internet protocols have operated between
   unauthenticated entities. For example, an email message's author
   field (From) [MAIL] can contain any display name or address and is
   not verified by the recipient or other agents along the delivery
   path. Similarly, a sending email server using [SMTP] trusts that the
   [DNS] has led it to the intended receiving server. Both kinds of
   trust are easily betrayed, opening the operation to subversion of
   some kind, which leads to spam, phishing, and other attacks.

   In recent years, explicit identity authentication mechanisms have
   begun to see wider deployment.
   For example, the [DKIM] protocol permits associating a validated
   identifier with a message. This association is cryptographically
   strong, and is an improvement over the prior state of affairs, but it
   does not distinguish between identifiers of good actors and bad.
   Even when it is possible to validate the domain name in an author
   field (e.g., "trustworthy.example.com" in
   "john.doe@trustworthy.example.com"), there is no basis for knowing
   whether it is associated with a good actor worthy of trust. As a
   practical matter, both bad actors and good adopt basic authentication
   mechanisms like DKIM. In fact, bad actors tend to adopt them even
   more rapidly than the good actors do, in the hope that some receivers
   will confuse identity authentication with identity assessment. The
   former merely means that the name is being used by its owner or their
   agent, while the latter makes a statement about the quality of the
   owner.

   With the advent of these authentication protocols, it is possible to
   satisfy the requirement for a mechanism by which mutually trusted
   parties can exchange assessment information about other actors. For
   these purposes, we may usefully define "reputation" as "the
   estimation in which an identifiable actor is held, especially by the
   community or the Internet public generally". We may call an
   aggregation of individual assessments "reputation input."

   While the need for reputation services has been perhaps especially
   clear in the email world, where abuses are commonplace, other
   Internet services are coming under attack and may have a similar
   need. For instance, a reputation mechanism could be useful in rating
   the security of web sites, the quality of service of an Internet
   Service Provider (ISP), or an Application Service Provider (ASP).
   More generally, there are many different opportunities for use of
   reputation services, such as customer satisfaction at e-commerce
   sites, and even things unrelated to Internet protocols, such as
   plumbers, hotels, or books. Just as human beings traditionally rely
   on the recommendations of trusted parties in the physical world, so
   too they can be expected to make use of such reputation services in a
   variety of applications on the Internet.

   A full trust architecture encompasses a range of actors and
   activities, to enable an end-to-end service for creating, exchanging,
   and consuming trust-related information. One component of that is a
   query mechanism, to permit retrieval of a reputation. Not all such
   reputation services will need to convey the same information. Some
   need only produce a basic rating, while others need to provide
   underlying detail. This is akin to the difference between check
   approval and a credit report.

   An overall reckoning of goodness versus badness can be defined
   generically, but specific applications are likely to want to describe
   reputations for multiple attributes: an e-commerce site might be
   rated on price, speed of delivery, customer service, etc., and might
   receive very different ratings on each. Therefore, the model defines
   a generic query mechanism and basic format for reputation retrieval,
   but allows extensions for each application.

   Omitted from this model is the means by which a reputation-reporting
   agent goes about collecting such data and the method for creating an
   evaluation. The mechanism defined here merely enables asking a
   question and getting an answer; the remainder of an overall service
   provided by such a reputation agent is specific to the implementation
   of that service and is out of scope here.

2.  Overview

   The basic premise of this reputation system involves a client that is
   seeking to evaluate content based on an identifier associated with
   the content, and a reputation service provider that collects data,
   aggregates it, and makes scores based on the collected data available
   for consumption. Typically, client and service operators enter into
   some kind of agreement, during which they exchange parameters such as
   the location at which the reputation service can be reached, the
   nature of the reputation data being offered, possibly some client
   authentication details, and the like.

   Upon receipt of some content the client operator wishes to evaluate
   (an Internet message, for example), the client extracts from the
   content one or more identifiers of interest to be evaluated.
   Examples of this include the domain name found in the From: field of
   a message, or the domain name extracted from a valid DomainKeys
   Identified Mail (DKIM) signature.

   Next, the goal is to ask the reputation service provider what the
   reputation of the extracted identifier is. The query will contain
   the identifier to be evaluated and possibly some context-specific
   information (to establish the context of the query, e.g., an email
   message) or client-specific information. The client typically folds
   the data in the response into whatever local evaluation logic it
   applies to decide what disposition the content deserves.

3.  High-Level Architecture

   A reputation mechanism functions as a component of an overall
   service.
   A current example is that of an email system that uses DomainKeys
   Identified Mail (DKIM; see [DKIM]) to affix a stable identifier to a
   message and then uses that as a basis for evaluation:

     +-------------+                           +------------+
     |   Author    |                           | Recipient  |
     +-------------+                           +------------+
            |                                         ^
            V                                         |
     +-------------+                           +------------+
     |     MSA     |                           |    MDA     |
     +-------------+                           +------------+
            |                                         ^
            |                                         |
            |                                  +------------+
            |                                  |  Handling  |
            |                                  |   Filter   |
            |                                  +------------+
            |                                         ^
            |                                         |
            |             +------------+       +------------+
            |             | Reputation |<=====>| Identifier |
            |             |  Service   |       |  Assessor  |
            |             +------------+       +------------+
            |                                         ^
            V                                         |
   +----------------------------------------------------------+
   | +------------+   Responsible Identifier   +------------+ |
   | | Identifier |. . . . . . . . . . . . . .>| Identifier | |
   | |   Signer   |                            |  Verifier  | |
   | +------------+        DKIM Service        +------------+ |
   +----------------------------------------------------------+
            |                                         ^
            V                                         |
     +-------------+       /~~~~~~~~~~\        +------+-----+
     |     MTA     |----->( other MTAs )------>|    MTA     |
     +-------------+       \~~~~~~~~~~/        +------------+

            Figure 1: Actors in a Trust Sequence Using DKIM

   (See [EMAIL-ARCH] for a general description of the Internet messaging
   architecture.) In this figure, the solid lines indicate the flow of
   a message; the dotted line indicates transfer of validated
   identifiers within the message content; and the double line shows the
   query and response of the reputation information.

   Here, the DKIM Service provides one or more stable identifiers that
   are the basis for the reputation query. On receipt of a message from
   an MTA, the DKIM Service provides a (possibly empty) set of validated
   identifiers -- domain names, in this case -- which are the subjects
   of reputation queries made by the Identity Assessor.
   The Identity Assessor queries a Reputation Service to determine the
   reputation of the provided identifiers, and delivers the identifiers
   and their reputations to the Handling Filter. The Handling Filter
   makes a decision about whether and how to deliver the message to the
   recipient based on these and other inputs about the message, possibly
   including evaluation mechanisms in addition to DKIM.

   This document outlines the reputation query and response mechanism.
   It provides the following definitions:

   o  Vocabulary for the current work and work of this type;

   o  The types and content of queries that can be supported;

   o  The extensible range of response information that can be provided;

   o  A query/response protocol;

   o  Query/response transport conventions.

   It provides an extremely simple query/response model that can be
   carried over a variety of transports, including the Domain Name
   System. (Although not typically thought of as a 'transport', the DNS
   provides generic capabilities and can be thought of as a mechanism
   for transporting queries and responses that have nothing to do with
   Internet addresses, such as is done with a DNS BlockList [DNSBL].)
   Each specification for Repute transport is independent of any other
   specification. A diagram of the basic query service is found in
   Figure 2.

   + . . . . . . . . . . . . . . . . . . . . . . . . . . . . +
   .                   Reputation Service                    .
   .                                         +------------+  .
   .                                         | Reputation |  .
   .                                         |  Database  |  .
   .                                         +------------+  .
   .                                               |         .
   .                                               V         .
   . +-----------+           Query            +----------+   .
   . |           |. . . . . . . . . . . . . .>|          |   .
   . |  Client   |                            |  Server  |   .
   . |           |< . . . . . . . . . . . . . |          |   .
   . +-----+-----+          Response          +----------+   .
   .       ^                                       ^         .
   + . . . | . . . . . . . . . . . . . . . . . . . | . . . . +
           V                                       |
     +-----------+                +-----------+    |
     | Transport |<-------------->| Transport |<---+
     +-----------+      DNS       +-----------+
                        TCP
                        UDP
                        ...

               Figure 2: Basic Reputation Query Service

   The precise syntaxes of both the query and response are
   application-specific. An application of the model defines the
   parameters available to queries of that type, and also defines the
   data returned in response to any query.

4.  Terminology and Definitions

   This section defines terms used in the rest of the document.

4.1.  Response Set

   A "Response Set" comprises those data that are returned in response
   to a reputation query about a particular entity. The types of data
   are specific to an application; the data returned in the evaluation
   of email senders would differ from the reputation data returned about
   a movie or a baseball player.

   Response Sets have symbolic names, and these have to be registered
   with IANA, in the Reputation Applications Registry, to prevent name
   collisions. IANA registries are created in a separate document.
   Each definition of a Response Set also needs to define its registry
   entry.

4.1.1.  Assertions and Ratings

   One of the key properties of a Response Set is called an Assertion.
   Assertions are claims made about the subject of a reputation query.
   For example, one might assert that a particular restaurant serves
   good food. In the context of this model, the assertion would be
   "serves good food".

   Assertions are coupled with a numeric value called a Rating, which is
   an indication of how much the party generating the Response Set
   agrees with the assertion being made. For example, with the above
   Assertion, a rating of 1.0 indicates strong agreement, while a rating
   of 0.0 indicates no support for the assertion.

4.2.  Reputon

   A "reputon" is an object that comprises the basic response to a
   reputation query.
   It contains the response set relevant to the subject of the query.
   Its specific encoding is left to documents that implement this model.

5.  Information Represented in a Response Set

   The basic information to be represented in the protocol is fairly
   simple, and includes the following:

   o  the identity of the entity providing the reputation information;

   o  the identity of the entity being rated;

   o  the application context for the query (e.g., email address
      evaluation);

   o  the overall rating score for that entity;

   o  the level of confidence in the accuracy of that rating; and

   o  the number of data points underlying that score.

   Beyond this, arbitrary amounts of additional information might be
   provided for specific uses of the service. The entire collection is
   the Response Set for that application. The query/response protocol
   defines a syntax for representing such Response Sets, but each
   application defines its own Response Set. Thus, the basic information
   also includes the name of the application for which the reputation
   data is being expressed.

   Each application requires its own specification of the Response Set.
   For example, a specification might be needed for a reputation
   Response Set for an "email-sending-domain"; the Response Set might
   include information on how often spam was received from that domain.
   Additional documents define a [MIME] type for reputation data, and
   protocols for exchanging such data.

6.  Information Flow in the Reputation Query Protocol

   The basic Response Set could be wrapped into a new MIME media type
   [MIME] or a DNS RR, and transported accordingly. It also could be
   the integral payload of a purpose-built protocol. For a basic
   request/response scenario, one entity (the Client) will ask a second
   entity (the Server) for reputation data about a third entity (the
   Target), and the second entity will respond with that data.
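
   The request/response scenario described above can be illustrated
   with a short, non-normative Python sketch. It is an illustration
   under stated assumptions only: the class and method names (Reputon
   as a class, ReputationServer, record, query), the "sends-spam"
   assertion, the server name "rep.example.net", and all numeric values
   are hypothetical and are not defined by this model, which leaves
   encoding and transport to other documents. The fields mirror the
   basic information listed in Section 5.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Reputon:
    """Hypothetical in-memory form of the basic Response Set fields."""
    rater: str         # identity of the entity providing the reputation
    rated: str         # identity of the entity being rated (the Target)
    application: str   # application context, e.g. "email-sending-domain"
    assertion: str     # claim being rated (example value is made up)
    rating: float      # 0.0 = no support, 1.0 = strong agreement
    confidence: float  # confidence in the accuracy of the rating
    sample_size: int   # number of data points underlying the score


class ReputationServer:
    """Toy Server backed by a dictionary instead of a real database."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Keyed by (application, target).
        self._db: Dict[Tuple[str, str], Tuple[str, float, float, int]] = {}

    def record(self, application: str, target: str, assertion: str,
               rating: float, confidence: float, sample_size: int) -> None:
        if not 0.0 <= rating <= 1.0:
            raise ValueError("rating must lie between 0.0 and 1.0")
        self._db[(application, target)] = (
            assertion, rating, confidence, sample_size)

    def query(self, application: str, target: str) -> Optional[Reputon]:
        """Answer a Client's query about a Target.

        Returning None for an unknown Target is an assumption of this
        sketch; the model does not say how "no data" is signalled.
        """
        entry = self._db.get((application, target))
        if entry is None:
            return None
        assertion, rating, confidence, sample_size = entry
        return Reputon(self.name, target, application, assertion,
                       rating, confidence, sample_size)


# The Client asks the Server about two Targets.
server = ReputationServer("rep.example.net")
server.record("email-sending-domain", "trustworthy.example.com",
              "sends-spam", 0.02, 0.9, 1500)
answer = server.query("email-sending-domain", "trustworthy.example.com")
unknown = server.query("email-sending-domain", "unknown.example.org")
```

   In this sketch, a Client would fold answer.rating and
   answer.confidence into its own local evaluation logic; how much
   weight to give them remains a local policy decision.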

   One application might benefit from an extremely lightweight
   mechanism supporting constrained queries and responses, while others
   might need to support larger and more complex responses.

7.  IANA Considerations

   This document presents no actions for IANA.

   [RFC Editor: Please remove this section prior to publication.]

8.  Privacy Considerations

8.1.  Data In Transit

   Some kinds of reputation data are sensitive, and should not be shared
   publicly. For cases that have such sensitivity, it is imperative to
   protect the information from unauthorized access and viewing. The
   model described here neither suggests nor precludes any particular
   transport mechanism for the data. However, for the purpose of
   illustration, a reputation service that operates over HTTP might
   employ any of the well-known mechanisms for solving these problems,
   which include OpenPGP [OPENPGP], Transport Layer Security [TLS], and
   S/MIME [SMIME].

8.2.  Aggregation

   The data that are collected as input to a reputation calculation are
   in essence a statement by one party about the actions or output of
   another. What one party says about another is often meant to be kept
   in confidence. Accordingly, steps often need to be taken to secure
   the submission of these input data to a reputation service provider.

   Moreover, although the aggregated reputation is the product provided
   by this service, its inadvertent exposure can have undesirable
   effects. Just as the collection of data about a subject needs due
   consideration to privacy and security, so too does the output and
   storage of whatever aggregation the service provider applies.

8.3.  Collection Of Data

   The basic notion of collection and storage of reputation data is
   obviously a privacy issue in that the opinions of one party about
   another are likely to be sensitive.
   Inadvertent or unauthorized exposure of those data can lead to
   personal or commercial damage.

9.  Security Considerations

   This document introduces an overall protocol model, but no
   implementation details. As such, the security considerations
   presented here are very high-level. The detailed analyses of the
   various specific components of the protocol can be found in the
   documents that instantiate this model.

9.1.  Biased Reputation Agents

   As with [VBR], an agent seeking to make use of a reputation reporting
   service is placing some trust that the service presents an unbiased
   "opinion" of the object about which reputation is being returned.
   The result of trusting the data is, presumably, to guide action taken
   by the reputation client. It follows, then, that bias in the
   reputation service can adversely affect the client. Clients
   therefore need to be aware of this possibility and the effect it
   might have. For example, a biased system returning a reputation
   about a DNS domain found in email messages could result in the
   admission of spam, phishing or malware through a mail gateway (by
   rating the domain name more favourably than warranted) or could
   result in the needless rejection or delay of mail (by rating the
   domain more unfavourably than warranted). As a possible mitigation
   strategy, clients might seek to interact only with reputation
   services that offer some disclosure of the computation methods for
   the results they return. Such disclosure and evaluation are beyond
   the scope of the present document.

   Similarly, a client placing trust in the results returned by such a
   service might suffer if the service itself is compromised, returning
   biased results under the control of an attacker without the knowledge
   of the agency providing the reputation service.
   This might result from an attack on the data being returned at the
   source, or from a man-in-the-middle attack. Protocols, therefore,
   need to be designed so as to be as resilient against such attacks as
   possible.

9.2.  Malformed Messages

   Both clients and servers of reputation systems need to be resistant
   to attacks that involve malformed messages, deliberate or otherwise.
   Malformations can be used to confound clients and servers alike in
   terms of identifying the party or parties responsible for the content
   under evaluation. This can result in delivery of undesirable or even
   dangerous content.

9.3.  Further Discussion

   Numerous other topics related to use and management of reputation
   systems can be found in [I-D.REPUTE-CONSIDERATIONS].

10.  Informative References

   [DKIM]     Crocker, D., Ed., Hansen, T., Ed., and M. Kucherawy, Ed.,
              "DomainKeys Identified Mail (DKIM) Signatures", RFC 6376,
              September 2011.

   [DNS]      Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [DNSBL]    Levine, J., "DNS Blacklists and Whitelists", RFC 5782,
              February 2010.

   [EMAIL-ARCH]
              Crocker, D., "Internet Mail Architecture", RFC 5598,
              July 2009.

   [I-D.REPUTE-CONSIDERATIONS]
              Kucherawy, M., "Operational Considerations Regarding
              Reputation Services", draft-ietf-repute-considerations
              (work in progress), November 2012.

   [MAIL]     Resnick, P., "Internet Message Format", RFC 5322,
              October 2008.

   [MIME]     Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part One: Format of Internet Message
              Bodies", RFC 2045, November 1996.

   [OPENPGP]  Callas, J., Donnerhacke, L., Finney, H., Shaw, D., and R.
              Thayer, "OpenPGP Message Format", RFC 4880, November 2007.

   [SMIME]    Ramsdell, B. and S. Turner, "Secure/Multipurpose Internet
              Mail Extensions (S/MIME) Version 3.2: Message
              Specification", RFC 5751, January 2010.

   [SMTP]     Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
              October 2008.

   [TLS]      Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

   [VBR]      Hoffman, P., Levine, J., and A. Hathcock, "Vouch By
              Reference", RFC 5518, April 2009.

Appendix A.  Public Discussion

   Public discussion of this suite of documents takes place on the
   domainrep@ietf.org mailing list. See
   https://www.ietf.org/mailman/listinfo/domainrep.

Authors' Addresses

   Nathaniel Borenstein
   Mimecast
   203 Crescent St., Suite 303
   Waltham, MA 02453
   USA

   Phone: +1 781 996 5340
   Email: nsb@guppylake.com


   Murray S. Kucherawy
   270 Upland Drive
   San Francisco, CA 94127
   USA

   Email: superuser@gmail.com


   Andrew Sullivan (editor)
   Dyn, Inc.
   150 Dow St.
   Manchester, NH 03101
   USA

   Email: asullivan@dyn.com