Network Working Group                                      M. Richardson
Internet-Draft                                  Sandelman Software Works
Intended status: Best Current Practice                         J. Latour
Expires: 23 August 2020                                        CIRA Labs
                                                        20 February 2020


        A standard process to quarantine and restore IoT Devices
                 draft-richardson-shg-un-quarantine-02

Abstract

   The Manufacturer Usage Description (MUD) is a tool to describe the
   limited access that a single function device such as an Internet of
   Things device might need.  The enforcement of the access control
   lists described protects the device from attacks from the Internet,
   and protects the Internet from compromised devices.

   This document details the process which occurs when a device is
   detected to have violated the stated policy.  The goal of these steps
   is to ensure that the device is correctly removed from operation,
   fixed, and if possible, restored to safe operation.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 23 August 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  An overview of the stages of activity
   2.  Detailed description of states
     2.1.  New device
     2.2.  Nominal
       2.2.1.  Use of Captive Portal API
     2.3.  Suspicious
     2.4.  Suspect
     2.5.  Device of Interest
     2.6.  Quarantined
     2.7.  Disabled
     2.8.  Returning to Service
     2.9.  Owned by malicious entity ("p0wned")
   3.  Detailed description of transitions
     3.1.  Initial Enrollment
     3.2.  Re-enrollment
       3.2.1.  factory-default re-enrollment
       3.2.2.  simple re-enrollment
       3.2.3.  other kinds?
     3.3.  Initial suspicion
     3.4.  Confirmed suspicion
     3.5.  Device identified as attack target
     3.6.  Suspension of connectivity
     3.7.  Re-Installation of valid firmware
   4.  An example process
   5.  Human Rights Considerations
   6.  Privacy Considerations
   7.  Security Considerations
   8.  IANA Considerations
     8.1.  Captive Portal API JSON keys
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   [RFC8520] describes the format of the Manufacturer Usage Description
   (MUD) files.  MUD files provide a set of network Access Control Lists
   (ACL, pronounced [ak-uhl]) that describes the expected traffic from a
   device, such as an Internet of Things (IoT) device.

   MUD files are used in a number of projects, including the CIRALabs'
   [SecureHomeGateway] (SHG) project.
   In this project a home gateway ("router") is enhanced to be able to
   use MUD files to describe the traffic expected from all connected
   devices.  If a device does not have a MUD format description, then
   the project can provide a broad set of traffic expectations based
   upon categorization of the device by the home owner.

   This document is about the process to be followed when a device is
   observed to be violating the ACLs applied to it.  While this document
   will identify network protocols (and gaps where no protocol exists)
   as appropriate, the goal of this document is more about the human
   process.  Specifically: who gets called, and in what order?  Who
   makes each call, and how are they identified?

   In addition, this document considers what kind of data needs to be
   shared among the parties, and what the privacy and human rights
   implications of sharing the required data are.

   Finally, the security considerations section of this document is
   intended to address some concerns about prevention of so-called
   "SWAT"ing ([swatting]), where an attempt might be made to take a
   location or network offline through phony reports.

1.1.  Terminology

   This document is not a protocol specification, but rather a Best
   Current Practice in the area of human operations.  While this is
   sometimes called a "Standard Operating Procedure" (SOP), this
   document should not be considered the actual SOP for an organization,
   but should rather be referenced by it.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, [RFC2119].
   In the context of this human protocol, they do not describe network
   protocol interoperability requirements, but rather constraints upon
   how the humans need to operate in order to avoid unsafe situations.

   The following terms are used in this document:

   *  owner's network: the network belonging to the owner of the device.
      In residential situations, this is typically the home owner.  In
      commercial environments, this may be the owner of the building, or
      the commercial tenant in the building.

   *  tenant: one or more people who occupy a space in which a network
      of devices exists which do not belong directly to them.

1.2.  An overview of the stages of activity

   This section provides a brief overview of the states that a device
   may be in.  The following section provides a detailed description of
   each state.  This document is primarily about how a device
   transitions from one state to another, which is covered in Section 3.

     .--------.         .---------.<---------.------------.
     |  new   |-------->| nominal |          | suspicious |
     | device |\ .----->|         | -------->|            |
     '--------' \|      '---------'          '------------'
                 \           |                     |
                 |\          |                     |
                 | \         |                     |
                 |  \        v                     v
                 |   \ .------------.        .------------.
   .------------.|    v| p0wned     |        | device-of  |
   | returning  ||     |            |        | interest   |
   | to service |      '------------'        '------------'
   '------------'            |                     |
         ^                   |                     |
         |                   v                     v
   .------------.      .------------.       .-----------.
   | upgrading  |      | quarantine |       | suspect   |
   |            |<-----|            |<------|           |
   '------------'      '------------'       '-----------'

                 Figure 1: Device Connectivity States

   *  new device: a device that has just been "connected" to the
      network.

   *  nominal: a device which is operating correctly.

   *  suspicious: a device which has once gone out of its MUD profile.

   *  suspect: a device which has repeatedly gone out of its MUD
      profile.

   *  device-of-interest: a device that is part of a class of devices
      which is considered suspect.

   *  quarantined: a device which has been isolated into a network
      "segment"; it may still be operating locally.

   *  disabled: a device which has been disconnected from the network,
      and has also had mains power removed.  The device is believed to
      be off.

   *  upgrading: a device which is active for the purpose of having new
      firmware installed.

   *  returning-to-service: a device which has new firmware, and is
      going through a re-enrollment process.  It may still lack critical
      configuration, and may not yet be able to perform critical
      functions.

   *  p0wned: a device which is known to have malicious routines
      running, but is still connected to the network.  It may continue
      to provide the services the device was designed to provide, in
      addition to performing functions controlled by an unauthorized
      entity.

2.  Detailed description of states

   A device is considered to be in one of the above states.  The device
   is not considered to be aware of its state; rather, this is a
   characteristic that the network assigns to the device.

2.1.  New device

   A device newly installed will have no initial network connectivity.
   It will be awaiting some kind of enrollment or onboarding process.
   Examples of enrollment processes include
   [I-D.ietf-anima-bootstrapping-keyinfra], [dpp], processes defined by
   The Thread Group and Apple HomeKit, as well as a great number of
   custom and proprietary methods.

   In many cases the device may provide limited network connectivity to
   itself (such as by running as an Access Point itself), and can be
   reached by attackers even before it has been onboarded.  The owner of
   the device may in fact be unaware that the device is "smart", and it
   may be possible for a device to become compromised without ever
   having joined a network.  As an example, a smart clothing washer may
   have been installed and may function perfectly well without any
   smart features, but may be, in its default configuration,
   vulnerable to any attacker that is within WiFi distance.
   This case is particularly difficult, as having never joined a
   network, the device will not emit signals on the owner's network that
   can be detected to notice that the device has been attacked.  Also,
   having never been connected, the device is more likely to have old
   firmware.

2.2.  Nominal

   The device is operating normally and is not suspected to be corrupted
   or under attack.

2.2.1.  Use of Captive Portal API

   In preparation for possible quarantine, the DHCP and RA options
   defined in [RFC7710] and referenced by
   [I-D.ietf-capport-architecture] (section 2.2.1) SHOULD be recorded if
   present for later use.

   An additional captive portal API key, "quarantine", indicates when it
   has the value true that the device is not connected to the Internet
   for security reasons.  The existing key "captive"
   ([I-D.ietf-capport-api] section 4.2) SHOULD also be checked, as the
   device MAY be subject to a captive portal.

   Based upon policy, it is appropriate for a MUD controller to put a
   new device into a captive portal state until such time as inclusion
   into the operational part of the network has been approved by a human
   operator.  The state should be "captive", but not "quarantined".

2.3.  Suspicious

   The device and/or the Internet has attempted a connection which is
   forbidden by the MUD file.  This activity is notable, but
   particularly in the case where a MUD file was generated by a third
   party (such as by a period of observation), it may signal that the
   MUD file is inaccurate rather than that the device is compromised.

   In the case of connections that originate from the Internet to the
   device which are forbidden, this may indicate that the device is
   being scanned, but that the security features of the router are
   resisting the attack.

   It is unclear how a device is returned from suspicious state to
   nominal.
   A reasonable process might be that the device is returned to nominal
   after a period of time in which no new unwanted activity occurs.  A
   clear indication that it should return to nominal is if a new MUD
   file is applied to the device.

2.4.  Suspect

   The device is repeatedly attempting to connect to core infrastructure
   which it has no reasonable cause to connect to.  Examples of this
   would include connecting to many IP addresses sequentially or at a
   high rate, or connecting to well-known ports not intended for end
   devices (for instance TCP ports 22, 23, 25).  There might still be a
   reasonable explanation for this behaviour, including that the
   "inside" IP address has been reassigned to a different device (such
   as a desktop computer).

   [RFC7011] is a candidate protocol for a MUD controller to inform an
   ISP about the traffic patterns of the device.

   [RFC7970] is a candidate protocol by which the ISP or other security
   service provider might exchange information about the incident.  It
   is unclear if [RFC7970] should be extended to the CPE device or not.

2.5.  Device of Interest

   A device becomes interesting in one of two situations: an internal
   signal that the device has become suspected, or external indications
   that there are active threats against the device.  A device in this
   state SHOULD go into quarantine upon the next observed attack.

   If it can be observed that there are DNS spoofing attempts against
   the device manufacturer's firmware repository, or its command/control
   channel (for devices which have cloud connections), then it would be
   reasonable to become interested in the device: an attack may be
   coming.

   A device under interest would continue to be able to perform its
   normal functions.
   For instance, a furnace would continue to heat the house, and would
   continue to report its statistics to its manufacturer/service-entity,
   and would continue to respond to thermostat changes.

2.6.  Quarantined

   A device in quarantine gets no Internet access.

   Devices in quarantine MAY use the API defined by
   [I-D.ietf-capport-architecture] to determine if the device has been
   quarantined.  Devices which can display this information visually
   SHOULD do so, such as on a status LCD display, or by a unique color
   scheme for status LEDs.

   A device in quarantine MAY do DNS requests to the local recursive DNS
   resolvers for the IP address of its firmware repository.  This
   address would be present in the device's MUD file using the mechanism
   described in [I-D.richardson-shg-mud-quarantined-access].  Access to
   the firmware repository is important to permit the device to apply
   new firmware and/or reset itself to factory default.

   A device in quarantine that performs other functions might continue
   to perform those functions.  For instance, a fridge would remain
   cold, but it would not respond to thermostat changes, or communicate
   with a grocery store.

2.7.  Disabled

   A device that is disabled gets no network connectivity at all,
   including no local network connectivity.

   A device that is directly mains powered would be disconnected by a
   human.  A device that is powered by Power-over-Ethernet could be
   disconnected by administratively turning power off on that port.

   A device that is battery powered or scavenges power would remain on
   as long as it had power.

2.8.  Returning to Service

   A device that is attempting to return to service has installed some
   "fix" for the issue that led it to be quarantined.  It could also be
   the case that the device did not need to do anything, and that the
   quarantine was a false positive, and a new MUD file is loaded with
   the additionally accepted patterns.
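
   As an illustration only, the states listed in Section 1.2 and the
   path a quarantined device takes back to nominal can be sketched as a
   small state machine.  The transition table below is one reading of
   the arrows in Figure 1 and is an assumption rather than a normative
   statement; the names State, ALLOWED, transition and return_path are
   hypothetical and do not come from this document.  The "disabled"
   state is listed but given no transitions, as Figure 1 does not draw
   it.

```python
from enum import Enum


class State(Enum):
    """Device states named in Section 1.2."""
    NEW_DEVICE = "new device"
    NOMINAL = "nominal"
    SUSPICIOUS = "suspicious"
    DEVICE_OF_INTEREST = "device-of-interest"
    SUSPECT = "suspect"
    QUARANTINED = "quarantined"
    DISABLED = "disabled"
    UPGRADING = "upgrading"
    RETURNING_TO_SERVICE = "returning-to-service"
    P0WNED = "p0wned"


# Assumption: one possible reading of the arrows in Figure 1.
# DISABLED has no entry because the figure does not show it.
ALLOWED = {
    State.NEW_DEVICE: {State.NOMINAL, State.P0WNED},
    State.NOMINAL: {State.SUSPICIOUS, State.P0WNED},
    State.SUSPICIOUS: {State.NOMINAL, State.DEVICE_OF_INTEREST},
    State.DEVICE_OF_INTEREST: {State.SUSPECT},
    State.SUSPECT: {State.QUARANTINED},
    State.P0WNED: {State.QUARANTINED},
    State.QUARANTINED: {State.UPGRADING},
    State.UPGRADING: {State.RETURNING_TO_SERVICE},
    State.RETURNING_TO_SERVICE: {State.NOMINAL},
}


def transition(current: State, target: State) -> State:
    """Return target if the table allows the move, else raise ValueError."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"{current.value} -> {target.value} is not allowed")
    return target


def return_path() -> list:
    """Walk a quarantined device back to nominal, one checked step at a time."""
    state = State.QUARANTINED
    path = [state]
    for nxt in (State.UPGRADING, State.RETURNING_TO_SERVICE, State.NOMINAL):
        state = transition(state, nxt)
        path.append(state)
    return path
```

   In this sketch the state is held by the network side (for example a
   MUD controller), which matches the statement in Section 2 that the
   device itself is not considered to be aware of its state.  A direct
   move from quarantined to nominal is rejected, so a device has to pass
   through upgrading and returning-to-service first.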

   A device returning to service MAY have erased all its network
   settings, and will have to go through some form of network enrollment
   again.

2.9.  Owned by malicious entity ("p0wned")

   A device which is known to be controlled by a malicious entity.  It
   may be impossible to quarantine the device if it performs some
   critical function and the imposition of quarantine would prevent
   that.

3.  Detailed description of transitions

   This section deals with the transitions between states.  These
   transitions occur as a result of network and/or human signaling.  The
   occurrence of these transitions will in most cases cause a signal to
   be sent.

3.1.  Initial Enrollment

   The process of enrollment is out of scope for this document.

3.2.  Re-enrollment

   The process of re-enrollment is out of scope for this document.  This
   document does specify when this re-enrollment can take place, and how
   a human can indicate to a device and to the network infrastructure
   that re-enrollment can take place.

   Re-enrollment can occur in a number of different ways.

3.2.1.  factory-default re-enrollment

   A device can re-enroll in a factory-default state.  This means that
   all settings are lost and any private keys that might have been
   visible to malicious code/coders who may have had access to the
   device are regenerated.

   Devices that store private keys in Trusted Platform Modules (TPM), or
   in Trusted Execution Environments (see [I-D.ietf-teep-architecture])
   could reasonably assume that private keys may be retained.  From an
   802.1AR perspective, the IDevID may be assumed to be intact, but the
   integrity of the LDevID may be suspect.

   As the device is in a factory-default state it will have no
   user/owner-specific configuration, and any authorization lists will
   need to be re-established.

3.2.2.  simple re-enrollment

   The device does not return to a factory-default state, and has
   existing network, owner credentials and configuration intact.  A
   network onboarding will need to be repeated to establish new
   per-device network keys.

   An audit of the device authorizations SHOULD be done, as an attacker
   may have inserted additional authorizations in order to return.

3.2.3.  other kinds?

   Are there states in between these two extremes?

3.3.  Initial suspicion

   The transition from nominal to initial suspicion occurs when the MUD
   firewall detects (and blocks) network traffic not described in the
   device's MUD file.  There are a number of non-critical reasons why
   this could occur.

   The most likely situation is that the MUD file describes access rules
   using DNS names, while the firewall is implemented in terms of IP
   addresses.  The name to IP mapping may well have changed, and the
   firewall has not yet caught up to the new mapping.

3.4.  Confirmed suspicion

   TBD

3.5.  Device identified as attack target

   TBD

3.6.  Suspension of connectivity

   TBD

3.7.  Re-Installation of valid firmware

   TBD

4.  An example process

   This section will contain some examples of a device going through
   the process.

5.  Human Rights Considerations

   TBD

6.  Privacy Considerations

   TBD

7.  Security Considerations

   TBD

8.  IANA Considerations

8.1.  Captive Portal API JSON keys

   A new JSON key for [I-D.ietf-capport-api]'s "Captive Portal API Keys"
   is to be registered with the following values:

   key:  "quarantine"
   type:  "boolean"
   description:  [THISDOCUMENT] specifies that the quarantine key
      should be marked true if the device has had its Internet access
      revoked due to violation of an RFC 8520 (MUD) profile.

9.  Acknowledgements

10.  References

10.1.  Normative References

   [I-D.ietf-capport-api]
              Pauly, T. and D.
              Thakore, "Captive Portal API", Work in Progress,
              Internet-Draft, draft-ietf-capport-api-05, 4 February
              2020.

   [I-D.ietf-capport-architecture]
              Larose, K., Dolson, D., and H. Liu, "CAPPORT
              Architecture", Work in Progress, Internet-Draft,
              draft-ietf-capport-architecture-06, 15 February 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, DOI 10.17487/RFC7011, September 2013.

   [RFC7710]  Kumari, W., Gudmundsson, O., Ebersman, P., and S. Sheng,
              "Captive-Portal Identification Using DHCP or Router
              Advertisements (RAs)", RFC 7710, DOI 10.17487/RFC7710,
              December 2015.

   [RFC7970]  Danyliw, R., "The Incident Object Description Exchange
              Format Version 2", RFC 7970, DOI 10.17487/RFC7970,
              November 2016.

   [RFC8520]  Lear, E., Droms, R., and D. Romascanu, "Manufacturer Usage
              Description Specification", RFC 8520,
              DOI 10.17487/RFC8520, March 2019.

10.2.  Informative References

   [dpp]      "Device Provisioning Protocol Specification", n.d.

   [I-D.ietf-anima-bootstrapping-keyinfra]
              Pritikin, M., Richardson, M., Eckert, T., Behringer, M.,
              and K. Watsen, "Bootstrapping Remote Secure Key
              Infrastructures (BRSKI)", Work in Progress,
              Internet-Draft,
              draft-ietf-anima-bootstrapping-keyinfra-35, 5 February
              2020.

   [I-D.ietf-teep-architecture]
              Pei, M., Tschofenig, H., Thaler, D., and D. Wheeler,
              "Trusted Execution Environment Provisioning (TEEP)
              Architecture", Work in Progress, Internet-Draft,
              draft-ietf-teep-architecture-06, 8 February 2020.

   [I-D.richardson-shg-mud-quarantined-access]
              Richardson, M. and M.
              Ranganathan, "Manufacturer Usuage Description for
              quarantined access to firmware", Work in Progress,
              Internet-Draft,
              draft-richardson-shg-mud-quarantined-access-01, 8 July
              2019.

   [looneytunes]
              "List of Looney Tunes Cartoons", n.d.

   [SecureHomeGateway]
              "CIRALabs Secure Home Gateway", n.d.

   [swatting] "Cambridge English Dictionary: swatting", January 2019.

Authors' Addresses

   Michael Richardson
   Sandelman Software Works

   Email: mcr+ietf@sandelman.ca


   Jacques Latour
   CIRA Labs

   Email: Jacques.Latour@cira.ca