RIPE IoT Working Group                                     M. Richardson
Internet-Draft                                  Sandelman Software Works
Intended status: Best Current Practice                         J. Latour
Expires: 6 May 2021                                            CIRA Labs
                                                         2 November 2020


        A standard process to quarantine and restore IoT Devices
                 draft-richardson-shg-un-quarantine-03

Abstract

   The Manufacturer Usage Description (MUD) is a tool to describe the
   limited access that a single-function device, such as an Internet of
   Things device, might need.  Enforcing the access control lists that
   it describes protects the device from attacks from the Internet, and
   protects the Internet from compromised devices.

   This document details a process which occurs when a device is
   detected to have violated the stated policy.  The goal of these steps
   is to ensure that the device is correctly removed from operation,
   fixed, and if possible, restored to safe operation.  This document
   does not define any new protocols, but provides context in which a
   number of existing protocols are to be used together.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 6 May 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  An overview of the stages of activity
   2.  Detailed description of states
     2.1.  New device
     2.2.  Nominal
       2.2.1.  Use of Captive Portal API
     2.3.  Suspicious
     2.4.  Suspect
     2.5.  Device of Interest
     2.6.  Quarantined
     2.7.  Disabled
     2.8.  Returning to Service
     2.9.  Owned by malicious entity ("p0wned")
   3.  Detailed description of transitions
     3.1.  Initial Enrollment
     3.2.  Re-enrollment
       3.2.1.  factory-default re-enrollment
       3.2.2.  simple re-enrollment
       3.2.3.  other kinds?
     3.3.  Initial suspicion
     3.4.  Confirmed suspicion
     3.5.  Device identified as attack target
     3.6.  Suspension of connectivity
     3.7.  Re-Installation of valid firmware
   4.  An example process
   5.  Human Rights Considerations
   6.  Privacy Considerations
   7.  Security Considerations
   8.  IANA Considerations
     8.1.  Captive Portal API JSON keys
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   [RFC8520] describes the format of the Manufacturer Usage Description
   (MUD) files.  MUD files provide a set of network Access Control Lists
   (ACL, pronounced [ak-uhl]) that describe the expected traffic from a
   device, such as an Internet of Things (IoT) device.

   MUD files are used in a number of projects, including the CIRALabs'
   [SecureHomeGateway] (SHG) project.  In this project a home gateway
   ("router") is enhanced to be able to use MUD files to describe the
   traffic expected from all connected devices.  If a device does not
   have a MUD format description, then the project can provide a broad
   set of traffic expectations based upon categorization of the device
   by the home owner.

   This document is about the process to be followed when a device is
   observed to be violating the ACLs applied to it.  While this document
   will identify network protocols (and gaps where no protocol exists)
   as appropriate, its goal is more about the human processes that need
   to be driven by available data.  Specifically: who gets called, and
   in what order?  Who makes each call, and how are they identified?

   In addition, this document asks what kind of data needs to be shared
   among the parties, and what the privacy and human rights implications
   of sharing the required data are.

   Finally, the security considerations section of this document is
   where concerns about prevention of so-called "SWAT"ing ([swatting])
   belong: an attempt might be made to take a location or network
   offline through phony reports.

1.1.  Terminology

   This document is not a protocol specification, but rather a Best
   Current Practice in the area of human operations.  While this is
   sometimes called a "Standard Operating Procedure" (SOP), this
   document should not be considered the actual SOP for an organization,
   but rather be referenced by it.  Each organization (ISPs,
   Manufacturers, Cyber-security response entities, Law Enforcement)
   will need to define how they interact with the protocols outlined in
   this document.

   Although this document is a BCP, it uses the terminology of
   [RFC2119]: the key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
   "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
   "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119.
   In the context of this human protocol, they do not describe network
   protocol interoperability requirements, but rather constraints upon
   how the humans need to operate in order to avoid unsafe situations.

   The following terms are used in this document:

   *  owner's network: the network belonging to the owner of the device.
      In residential situations, this is typically the home owner.  In
      commercial environments, this may be the owner of the building, or
      the commercial tenant in the building.

   *  tenant: one or more people who occupy a space in which a network
      of devices exists which do not belong directly to them.

1.2.  An overview of the stages of activity

   This section provides a brief overview of the states that a device
   may be in.  The following section provides a detailed description of
   each state.  This document is primarily about how a device
   transitions from one state to another, which is covered in Section 3.

     .--------.         .---------.<---------.------------.
     |  new   |-------->| nominal |          | suspicious |
     | device |\ .----->|         | -------->|            |
     '--------' \|      '---------'          '------------'
                 \           |                     |
                 |\          |                     |
                 | \         |                     |
                 |  \        v                     v
                 |   \ .------------.        .------------.
   .------------.|    v|  p0wned    |        | device-of  |
   | returning  ||     |            |        | interest   |
   | to service |      '------------'        '------------'
   '------------'            |                     |
         ^                   |                     |
         |                   v                     v
   .------------.      .------------.       .-----------.
   | upgrading  |      | quarantine |       |  suspect  |
   |            |<-----|            |<------|           |
   '------------'      '------------'       '-----------'

                  Figure 1: Device Connectivity States

   new device:  a device that has just been "connected" to the network

   nominal:  a device which is operating correctly

   suspicious:  a device which has once gone out of its MUD profile

   suspect:  a device which has repeatedly gone out of its MUD profile

   device-of-interest:  a device that is part of a class of devices
      which is considered suspect

   quarantined:  a device which has been isolated into a network
      "segment"; it may still be operating locally.

   disabled:  a device which has been disconnected from the network, and
      has also had mains power removed.  The device is believed to be
      off.

   upgrading:  a device which is active for the purpose of having new
      firmware installed

   returning-to-service:  a device which has new firmware, and is going
      through a re-enrollment process.  It may still lack critical
      configuration, and may not yet be able to perform critical
      functions.

   p0wned:  a device which is known to have malicious routines running,
      but is still connected to the network.  It may continue to provide
      the services the device was designed to provide, in addition to
      performing functions controlled by an unauthorized entity.

2.  Detailed description of states

   A device is considered to be in one of the above states.  The device
   is not considered to be aware of its state; rather, this is a
   characteristic that the network assigns to the device.

2.1.  New device

   A newly installed device will have no initial network connectivity.
   It will be awaiting some kind of enrollment or onboarding process.
   Examples of enrollment processes include
   [I-D.ietf-anima-bootstrapping-keyinfra], [dpp], processes defined by
   The Thread Group and Apple Homekit, as well as a great number of
   custom and proprietary methods.

   In many cases the device may provide limited network connectivity to
   itself (such as by running as an Access Point itself), and can be
   reached by attackers even before it has been onboarded.  The owner of
   the device may in fact be unaware that the device is "smart", and it
   may be possible for a device to become compromised without ever
   having joined a network.  As an example, a smart clothing washer may
   have been installed and may function perfectly well without any
   smart features, yet may, in its default configuration, be vulnerable
   to any attacker that is within WiFi distance.
   This case is particularly difficult: having never joined a network,
   the device will not emit signals on the owner's network from which an
   attack could be detected.  Also, having never been connected, the
   device is more likely to have old firmware.

   A key concern that causes many users to decline to upgrade their
   devices is the fear that the device will lose its customizations.  A
   new device is one that has no such customizations; users should be
   more willing to upgrade it at this point.

2.2.  Nominal

   The device is operating normally and is not suspected to be corrupted
   or under attack.

2.2.1.  Use of Captive Portal API

   In preparation for possible quarantine, the DHCP and RA options
   defined in [RFC7710] and referenced by
   [I-D.ietf-capport-architecture] (section 2.2.1) SHOULD be recorded if
   present for later use.

   An additional captive portal API key, "quarantine", indicates when it
   has the value true that the device is not connected to the Internet
   for security reasons.  The existing key "captive"
   ([I-D.ietf-capport-api] section 4.2) SHOULD also be checked, as the
   device MAY be subject to a captive portal.

   Based upon policy, it is appropriate for a MUD controller to put a
   new device into a captive portal state until such time as inclusion
   into the operational part of the network has been approved by a human
   operator.  The state should be "captive", but not "quarantined".

2.3.  Suspicious

   The device and/or the Internet has attempted a connection which is
   forbidden by the MUD file.  This activity is notable, but
   particularly in the case where a MUD file was generated by a third
   party (such as by a period of observation), it may signal that the
   MUD file is inaccurate rather than that the device is compromised.
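
   As an illustration only, the check that leads to the suspicious
   state might be sketched as follows.  The Flow type, the "from-device"
   and "to-device" labels, and the wording of the notes are hypothetical
   simplifications made for this sketch; they are not the [RFC8520] data
   model, and a real MUD controller would evaluate the full ACLs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flow:
    """A simplified flow description (hypothetical, not the RFC 8520 model)."""
    direction: str   # "from-device" or "to-device"
    remote: str      # remote IP address, as text
    protocol: str    # "tcp" or "udp"
    port: int        # destination port of the connection attempt


def describe_violation(flow: Flow, allowed: frozenset,
                       mud_from_manufacturer: bool) -> Optional[str]:
    """Return None if the flow is permitted, else a note for the operator."""
    if flow in allowed:
        return None
    if flow.direction == "to-device":
        # Forbidden inbound traffic: the device may be the target of a scan.
        return "blocked inbound connection; device may be being scanned"
    if not mud_from_manufacturer:
        # A profile built by observation may simply be incomplete.
        return "blocked outbound connection; MUD file may be inaccurate"
    return "blocked outbound connection; device may be compromised"


# Addresses are taken from the documentation ranges.
allowed = frozenset({Flow("from-device", "192.0.2.10", "tcp", 443)})
note = describe_violation(
    Flow("from-device", "198.51.100.7", "tcp", 23), allowed, False)
```

   The sketch keeps the two explanations apart (an inaccurate MUD file
   versus a compromised device) because the human follow-up differs in
   each case.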

   In the case of forbidden connections that originate from the Internet
   to the device, this may indicate that the device is being scanned,
   but that the security features of the router are resisting the
   attack.

   It is unclear how a device is returned from the suspicious state to
   nominal.  A reasonable process might be that it is returned after a
   period of time in which no new unwanted activity occurs.  A clear
   indication that it should return to nominal is if a new MUD file is
   applied to the device.

2.4.  Suspect

   The device is repeatedly attempting to connect to core infrastructure
   which it has no legitimate reason to connect to.  Examples of this
   would include connecting to many IP addresses sequentially or at a
   high rate, or connecting to well-known ports not intended for end
   devices (for instance TCP port 22, 23, 25).  There might still be a
   reasonable explanation for this behaviour, including that the
   "inside" IP address has been reassigned to a different device (such
   as a desktop computer).

   [RFC7011] (IPFIX) is a candidate protocol for a MUD controller to
   inform an ISP about the traffic patterns of the device.

   [RFC7970] (IODEF) is a candidate protocol by which the ISP or other
   security service provider might exchange information about the
   incident.  It is unclear if [RFC7970] should be extended to the CPE
   device or not.

2.5.  Device of Interest

   A device becomes interesting in one of two situations: an internal
   signal that the device has become suspected, or external indications
   that there are active threats against the device.  A device in this
   state SHOULD go into quarantine upon the next observed attack.
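
   The states described so far can be sketched as a small state machine.
   This is one possible reading of the prose above and not a normative
   transition table: the event names are hypothetical, a MUD violation
   is assumed to count as an "observed attack", and Figure 1 can be read
   as drawing a somewhat different path between the same states.

```python
from enum import Enum


class State(Enum):
    NOMINAL = "nominal"
    SUSPICIOUS = "suspicious"
    SUSPECT = "suspect"
    DEVICE_OF_INTEREST = "device-of-interest"
    QUARANTINED = "quarantined"


class Event(Enum):
    VIOLATION = "violation"              # traffic outside the MUD profile
    EXTERNAL_THREAT = "external-threat"  # outside indication of active threats
    NEW_MUD_FILE = "new-mud-file"        # an updated MUD file was applied


# Assumed transitions; any (state, event) pair not listed is a no-op.
TRANSITIONS = {
    (State.NOMINAL, Event.VIOLATION): State.SUSPICIOUS,
    (State.SUSPICIOUS, Event.VIOLATION): State.SUSPECT,
    (State.SUSPICIOUS, Event.NEW_MUD_FILE): State.NOMINAL,
    (State.NOMINAL, Event.EXTERNAL_THREAT): State.DEVICE_OF_INTEREST,
    (State.SUSPICIOUS, Event.EXTERNAL_THREAT): State.DEVICE_OF_INTEREST,
    (State.DEVICE_OF_INTEREST, Event.VIOLATION): State.QUARANTINED,
}


def next_state(state: State, event: Event) -> State:
    """Return the state the network assigns after an event."""
    return TRANSITIONS.get((state, event), state)


state = State.NOMINAL
for event in (Event.EXTERNAL_THREAT, Event.VIOLATION):
    state = next_state(state, event)
# state is now State.QUARANTINED
```

   Keeping the table as data makes it easy for an organization to adjust
   the policy (for instance, how many violations are tolerated) without
   touching the logic.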

   If it can be observed that there are DNS spoofing attempts against
   the device manufacturer's firmware repository, or its command/control
   channel (for devices which have cloud connections), then it would be
   reasonable to become interested in the device: an attack may be
   coming.

   A device under interest would continue to be able to perform its
   normal functions.  For instance, a furnace would continue to heat the
   house, would continue to report its statistics to its manufacturer/
   service-entity, and would continue to respond to thermostat changes.

2.6.  Quarantined

   A device in quarantine gets no Internet access.

   Devices in quarantine MAY use the API defined by
   [I-D.ietf-capport-architecture] to determine if the device has been
   quarantined.  Devices which can display this information visually
   SHOULD do so, such as on a status LCD display, or by a unique color
   scheme for status LEDs.

   A device in quarantine MAY do DNS requests to the local recursive DNS
   resolvers for the IP address of its firmware repository.  This
   address would be present in the device's MUD file using the extension
   defined in [I-D.richardson-shg-mud-quarantined-access].  Access to
   the firmware repository is important to permit the device to apply
   new firmware and/or reset itself to factory default.

   A device in quarantine that performs other functions might continue
   to perform those functions.  For instance, a fridge would remain
   cold, but it would not respond to thermostat changes, or communicate
   with a grocery store.

2.7.  Disabled

   A device that is disabled gets no network connectivity at all,
   including no local network connectivity.

   A device that is directly mains powered would be disconnected by a
   human.  A device that is powered by Power-over-Ethernet could be
   disconnected by administratively turning power off on that port.
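
   Returning briefly to the quarantine check of Section 2.6: a device
   that has fetched the Captive Portal API document might interpret it
   as in the following sketch.  The "captive" key comes from
   [I-D.ietf-capport-api] and "quarantine" is the key proposed by this
   document; the function name, the returned labels, and the choice to
   let "quarantine" take precedence over "captive" are assumptions made
   for illustration.

```python
import json


def connectivity_state(api_document: str) -> str:
    """Classify a Captive Portal API JSON document (illustrative only)."""
    doc = json.loads(api_document)
    if doc.get("quarantine") is True:
        # Proposed key: Internet access revoked for security reasons.
        return "quarantined"
    if doc.get("captive") is True:
        # Existing key: the device is behind a captive portal.
        return "captive"
    return "open"


# A new device awaiting approval: "captive", but not "quarantined".
awaiting_approval = connectivity_state('{"captive": true}')
isolated = connectivity_state('{"captive": true, "quarantine": true}')
```

   A device with a status display could map the "quarantined" result to
   the visual indication recommended in Section 2.6.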

   A device that is battery powered or scavenges power would remain on
   as long as it had power.

2.8.  Returning to Service

   A device that is attempting to return to service has installed some
   "fix" for the issue that led it to be quarantined.  It could also be
   the case that the device did not need to do anything, that the
   quarantine was a false positive, and that a new MUD file is loaded
   with the additionally accepted patterns.

   A device returning to service MAY have erased all its network
   settings, and will have to go through some form of network enrollment
   again.

2.9.  Owned by malicious entity ("p0wned")

   The device is known to be controlled by a malicious entity.  It may
   be impossible to quarantine the device if it performs some critical
   function and the imposition of quarantine would prevent that.

3.  Detailed description of transitions

   This section deals with the transitions between states.  These
   transitions occur as a result of network and/or human signaling.  The
   occurrence of these transitions will in most cases cause a signal to
   be sent.

3.1.  Initial Enrollment

   The process of enrollment is out of scope for this document.

3.2.  Re-enrollment

   The process of re-enrollment is out of scope for this document.  This
   document does specify when this re-enrollment can take place, and how
   a human can indicate to a device and to the network infrastructure
   that re-enrollment can take place.

   Re-enrollment can occur in a number of different ways.

3.2.1.  factory-default re-enrollment

   A device can re-enroll in a factory-default state.  This means that
   all settings are lost and any private keys that might have been
   visible to malicious code/coders who may have had access to the
   device are regenerated.

   Devices that store private keys in Trusted Platform Modules (TPM), or
   in Trusted Execution Environments (see [I-D.ietf-teep-architecture])
   could reasonably assume that private keys may be retained.  From an
   802.1AR perspective, the IDevID may be assumed to be intact, but the
   integrity of the LDevID may be suspect.

   As the device is in a factory-default state it will have no user/
   owner-specific configuration, and any authorization lists will need
   to be re-established.

3.2.2.  simple re-enrollment

   The device does not return to a factory-default state, and has
   existing network, owner credentials and configuration intact.  A
   network onboarding will need to be repeated to establish new per-
   device network keys.

   An audit of the device authorizations SHOULD be done, as an attacker
   may have inserted additional authorizations in order to return.

3.2.3.  other kinds?

   Are there states in between these two extremes?

3.3.  Initial suspicion

   The transition from nominal to initial suspicion occurs when the MUD
   firewall detects (and blocks) network traffic not described in the
   device MUD.  There are a number of non-critical reasons why this
   could occur.

   The most likely situation is that the MUD describes access rules
   using DNS names, while the firewall is implemented in terms of IP
   addresses.  The name to IP mapping may well have changed, and the
   firewall has not yet caught up to the new mapping.

3.4.  Confirmed suspicion

   TBD

3.5.  Device identified as attack target

   TBD

3.6.  Suspension of connectivity

   TBD

3.7.  Re-Installation of valid firmware

   TBD

4.  An example process

   Some examples of a device going through this process will appear
   here.

5.  Human Rights Considerations

   TBD

6.  Privacy Considerations

   TBD

7.  Security Considerations

   TBD

8.  IANA Considerations

8.1.  Captive Portal API JSON keys

   A new JSON key for [I-D.ietf-capport-api]'s "Captive Portal API Keys"
   is to be registered with the following values:

   key: "quarantine"
   type: "boolean"
   description: [THISDOCUMENT] specifies that the quarantine key should be
                marked true if the device has had its Internet access
                revoked due to violation of an RFC 8520 (MUD) profile.

9.  Acknowledgements

10.  References

10.1.  Normative References

   [I-D.ietf-capport-api]
              Pauly, T. and D. Thakore, "Captive Portal API", Work in
              Progress, Internet-Draft, draft-ietf-capport-api-08, 18
              June 2020.

   [I-D.ietf-capport-architecture]
              Larose, K., Dolson, D., and H. Liu, "Captive Portal
              Architecture", Work in Progress, Internet-Draft, draft-
              ietf-capport-architecture-10, 23 September 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, DOI 10.17487/RFC7011, September 2013.

   [RFC7710]  Kumari, W., Gudmundsson, O., Ebersman, P., and S. Sheng,
              "Captive-Portal Identification Using DHCP or Router
              Advertisements (RAs)", RFC 7710, DOI 10.17487/RFC7710,
              December 2015.

   [RFC7970]  Danyliw, R., "The Incident Object Description Exchange
              Format Version 2", RFC 7970, DOI 10.17487/RFC7970,
              November 2016.

   [RFC8520]  Lear, E., Droms, R., and D. Romascanu, "Manufacturer Usage
              Description Specification", RFC 8520,
              DOI 10.17487/RFC8520, March 2019.

10.2.  Informative References

   [dpp]      "Device Provisioning Protocol Specification", n.d.

   [I-D.ietf-anima-bootstrapping-keyinfra]
              Pritikin, M., Richardson, M., Eckert, T., Behringer, M.,
              and K.
              Watsen, "Bootstrapping Remote Secure Key
              Infrastructures (BRSKI)", Work in Progress, Internet-
              Draft, draft-ietf-anima-bootstrapping-keyinfra-44, 21
              September 2020.

   [I-D.ietf-teep-architecture]
              Pei, M., Tschofenig, H., Thaler, D., and D. Wheeler,
              "Trusted Execution Environment Provisioning (TEEP)
              Architecture", Work in Progress, Internet-Draft, draft-
              ietf-teep-architecture-12, 13 July 2020.

   [I-D.richardson-shg-mud-quarantined-access]
              Richardson, M. and M. Ranganathan, "Manufacturer Usuage
              Description for quarantined access to firmware", Work in
              Progress, Internet-Draft, draft-richardson-shg-mud-
              quarantined-access-01, 8 July 2019.

   [looneytunes]
              "List of Looney Tunes Cartoons", n.d.

   [SecureHomeGateway]
              "CIRALabs Secure Home Gateway", n.d.

   [swatting] "Cambridge English Dictionary: swatting", January 2019.

Authors' Addresses

   Michael Richardson
   Sandelman Software Works

   Email: mcr+ietf@sandelman.ca


   Jacques Latour
   CIRA Labs

   Email: Jacques.Latour@cira.ca