Internet Draft                                          Barbara Fraser
Network Working Group                                          SEI/CMU
Expires in six months                                    December 1995

                        Site Security Handbook
                   <draft-ietf-ssh-handbook-01.txt>

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Table of Contents

1.   Introduction
1.1  Purpose of this Work
1.2  Audience
1.3  Definitions
1.4  Related Work
2.   Security Policies
2.1  What is a Computer Security Policy and Why Have One?
2.2  What Makes a Good Computer Security Policy?
3.   Architecture
3.1  Objectives
3.2  Network and Service Configuration
3.4  Firewalls
4.   Security Procedures
4.1  Authentication
4.2  Authorization
4.3  Access
4.4  Modems
4.5  Cryptography
4.6  Auditing
4.7  Backups
5.   Security Incident Handling
5.1  Preparing and Planning for Incident Handling
5.2  Notification and Points of Contact
5.3  Identifying an Incident
5.4  Handling an Incident
5.5  Aftermath of an Incident
5.6  Responsibilities
6.   Maintenance and Evaluation
6.1  Risk Assessments
6.2  Notification of Problems and Events
Appendix A1  Tools and Locations
Appendix A2  Mailing Lists and Other Resources
References
Annotated Bibliography

1. INTRODUCTION

This document provides guidance to system and network administrators
on how to address security issues within the Internet community. It
builds on the foundation provided in RFC 1244 and is the collective
work of a number of contributing authors. Those authors include:
Jules P. Aronson, Nevil Brownlee, Joao Nuno Ferreira, Erik Gutmann,
Klaus-Peter Kossakowski, Edward P. Lewis, Gary Malkin, Philip J.
Nesser, and Michael S. Ramsey.

A special thank you goes to Joyce Reynolds, ISI, and Paul Holbrook,
CICnet, for their vision, leadership, and effort in the creation of
the first version of this handbook. It is the working group's sincere
hope that this version will be as helpful to the community as the
earlier one was.

1.1 Purpose of this Work

This handbook is a guide to setting computer security policies and
procedures for sites that have systems on the Internet. This guide
lists issues and factors that a site must consider when setting its
own policies. It makes some recommendations and provides discussions
of relevant areas.

This guide is only a framework for setting security policies and
procedures. In order to have an effective set of policies and
procedures, a site will have to make many decisions, gain agreement,
and then communicate and implement the policies.

1.2 Audience

The audience for this document is system administrators and decision
makers (who are more traditionally called "administrators" or "middle
management") at sites. This document is not directed at programmers
or those trying to create secure programs or systems.
The focus of this document is on the policies and procedures that
need to be in place to support any technical security features that a
site may be implementing.

The primary audience for this work is sites that are members of the
Internet community. However, this document should be useful to any
site that allows communication with other sites. As a general guide
to security policies, this document may also be useful to sites with
isolated systems.

1.3 Definitions

For the purposes of this guide, a "site" is any organization that
owns computers or network-related resources. These resources may
include host computers that users use, routers, terminal servers,
PCs, or other devices that have access to the Internet. A site may
be an end user of Internet services or a service provider such as a
regional network. However, most of the focus of this guide is on
those end users of Internet services.

We assume that the site has the ability to set policies and
procedures for itself with the concurrence and support of those who
actually own the resources.

The "Internet" is the set of networks and machines that use the
TCP/IP protocol suite, are connected through gateways, and share
common name and address spaces [1].

The term "system administrator" is used to cover all those who are
responsible for the day-to-day operation of resources. This may be a
number of individuals or an organization.

The term "decision maker" refers to those people at a site who set or
approve policy. These are often (but not always) the people who own
the resources.

1.4 Related Work

The IETF Guidelines for Security Incident Response Working Group
(GRIP) is developing a document for security incident response teams.
That document provides additional guidance to those organizations
planning to develop their own incident response team (IRT), including
a template that is useful to IRTs when describing their policies and
services.

2. SECURITY POLICIES

2.1 What is a Computer Security Policy and Why Have One?

The security-related decisions you make, or fail to make, as network
administrator largely determine how secure or insecure your network
is, how much functionality your network offers, and how easy your
network is to use. However, you cannot make good decisions about
security without first determining what your security goals are.
Until you determine what your security goals are, you cannot make
effective use of any collection of security tools because you simply
will not know what to check for and what restrictions to impose.

For example, your goals will probably be very different from the
goals of a product vendor. Vendors are trying to make configuration
and operation of their products as simple as possible, which implies
that the default configurations will often be as open (i.e.,
insecure) as possible. While this does make it easier to install new
products, it also leaves access to those systems, and other systems
through them, open to any user who wanders by.

Your goals will be largely determined by the following key tradeoffs:

(1) services offered vs. security provided -
    Each service offered to users carries its own security risks.
    For some services the risk outweighs the benefit of the service
    and the administrator may choose to eliminate the service rather
    than try to secure it.

(2) ease of use vs. security -
    The easiest system to use would allow access to any user and
    require no passwords; that is, there would be no security.
    Requiring passwords makes the system a little less convenient,
    but more secure.
    Requiring device-generated one-time passwords makes the system
    even more difficult to use, but much more secure.

(3) cost of security vs. risk of loss -
    There are many different costs to security: monetary (i.e., the
    cost of purchasing security hardware and software like firewalls
    and one-time password generators), performance (i.e., encryption
    and decryption take time), and ease of use (as mentioned above).
    There are also many levels of risk: loss of privacy (i.e., the
    reading of information by unauthorized individuals), loss of
    data (i.e., the corruption or erasure of information), and the
    loss of service (e.g., the filling of data storage space, usage
    of computational resources, and denial of network access). Each
    type of cost must be weighed against each type of loss.

Your goals should be communicated to all users, operations staff, and
managers through a set of security rules, called a "computer security
policy."

2.1.1 Definition of a Computer Security Policy

A computer security policy is a formal statement of the rules by
which people who are given access to an organization's technology and
information assets must abide.

2.1.2 Purposes of a Computer Security Policy

The main purpose of a computer security policy is to inform users,
staff, and managers of their obligatory requirements for protecting
technology and information assets. The policy should specify the
mechanisms through which these requirements can be met. Another
purpose is to provide a baseline from which to acquire, configure, and
audit computer systems and networks for compliance with the policy.
Therefore, an attempt to use a set of security tools in the absence of
at least an implied security policy is meaningless.

2.1.3 Who Should be Involved When Forming Policy?

In order for a security policy to be appropriate and effective, it
needs to have the acceptance and support of all levels of employees
within the organization. The following is a list of individuals who
should be involved in the creation and review of security policy
documents:

(1) site security administrator
(2) legal counsel
(3) computing center personnel
(4) network administrators of large user groups within the
    organization (e.g., business divisions, computer science
    department within a university, etc.)
(5) local or national security incident response team
(6) representatives of the user groups affected by the security
    policy

The list above is representative of many organizations, but is not
necessarily comprehensive. The idea is to bring in representation
from key stakeholders, management who have budget and policy
authority, technical staff who know what can and cannot be supported,
and legal counsel who know the legal ramifications of various policy
choices. Involving this group is important if resulting policy
statements are to reach the broadest possible acceptance.

2.2 What Makes a Good Computer Security Policy?

The characteristics of a good security policy are:

(1) It must be implementable through system administration
    procedures, publishing of acceptable use guidelines, or other
    appropriate methods.

(2) It must be enforceable with security tools, where appropriate,
    and with sanctions, where actual prevention is not technically
    feasible.

(3) It must clearly define the areas of responsibility for the
    users, staff, and administrators.

The components of a good security policy include:

(1) Computer Technology Purchasing Guidelines which specify required,
    or preferred, security features. These should supplement existing
    purchasing policies and guidelines.

(2) A Privacy Policy which defines reasonable expectations of privacy
    regarding such issues as monitoring of electronic mail, logging of
    keystrokes, and access to users' files.

(3) An Access Policy which defines access rights and privileges to
    protect assets from loss or disclosure by specifying acceptable
    use guidelines for users, operations staff, and management. It
    should provide guidelines for external connections, data
    communications, connecting devices to a network, and adding new
    software to systems. It should also specify any required
    notification messages (e.g., connect messages should provide
    warnings about authorized usage and line monitoring, and not
    simply say "Welcome").

(4) An Accountability Policy which defines the responsibilities of
    users, operations staff, and management. It should specify an
    audit capability, and provide incident handling guidelines (i.e.,
    what to do and who to contact if a possible intrusion is
    detected).

(5) An Authentication Policy which establishes trust through an
    effective password policy, and by setting guidelines for remote
    location authentication and the use of authentication devices
    (e.g., one-time passwords and the devices that generate them).

(6) An Availability statement which sets users' expectations for the
    availability of resources. It should address redundancy and
    recovery issues, as well as specify operating hours and
    maintenance down-time periods. It should also include contact
    information for reporting system and network failures.

(7) A Violations Reporting Policy that indicates which types of
    violations (e.g., privacy and security, internal and external)
    must be reported and to whom the reports are made. A
    non-threatening atmosphere and the possibility of anonymous
    reporting will result in a greater probability that a violation
    will be reported if it is detected.

(8) Supporting Information which provides users, staff, and management
    with contact information for each type of policy violation;
    guidelines on how to handle outside queries about a security
    incident or information which may be considered confidential or
    proprietary; and cross-references to security procedures and
    related information, such as company policies and regulatory
    requirements (federal, state, and local).

There may be regulatory requirements that affect some aspects of your
security policy (e.g., line monitoring). The creators of the
security policy should consider seeking legal assistance in the
creation of the policy. At a minimum, the policy should be reviewed
by legal counsel.

Once your computer security policy has been established it should be
clearly communicated to users, staff, and management. Having all
personnel sign a statement indicating that they have read,
understood, and agreed to abide by the policy is an important part of
the process. Finally, your policy should be reviewed on a regular
basis to see if it is successfully supporting your security needs.

3. ARCHITECTURE

3.1 Objectives

3.1.1 Completely Defined Security Plans

Defining a comprehensive security plan should be done by all sites.
This plan should be at a higher level than the specific policies
discussed in section 2. It should be crafted as a framework of broad
guidelines into which specific policies will fit.

It is important to have this framework in place so that individual
policies can be consistent with the overall site security
architecture. For example, having a strong policy with regard to
Internet access, but having weak restrictions on modem usage is
inconsistent with an overall philosophy of strong security
restrictions on external access.

A security policy should contain, at a minimum: a list of services
which are currently, or will foreseeably be, provided; who will have
access to those services; how access will be provided; who will
administer those services; etc. It is also imperative to define any
limitations on which portions of an organization can provide
services.

Another aspect of the plan should concern incident handling. Chapter
5 provides an in-depth discussion of responses to incidents, but it
is important to define classes of incidents and define responses to
each class of incident. For sites with firewalls, how many attempts
to foil the firewall will trigger a response? Are there levels of
escalation in both attacks and responses? For sites without
firewalls, does a single attempt to connect constitute an incident?
How about a systematic scan of machines?

For sites connected to the Internet, the rampant media glorification
of Internet-related security incidents can overshadow a (potentially)
more serious internal security problem. Likewise, companies that have
never been on the Internet before may have strong, well-defined
internal policies but fail to adequately address external connection
policy.

3.1.2 Separation of Services

There are many services which a site may wish to provide for its
users, some of which may be external. There are a variety of
security reasons to attempt to isolate services onto dedicated
machines. There are also performance reasons in most cases, but a
detailed discussion is beyond the scope of this document.

The services which a site may provide will, in most cases, have
different levels of access needs and models of trust.
Services which are essential to the security or smooth operation of a
site would be better off being placed on a dedicated machine with very
limited access (see Section 3.1.3, "deny all" model), rather than on a
machine which provides a service (or services) that has traditionally
been less secure, or that requires greater accessibility by users who
may accidentally compromise security.

It is also important to distinguish between machines which operate
within different models of trust, say all the machines inside of a
firewall, and any machines on an exposed network.

Some of the services which should be examined for potential
separation are outlined below. More specific information will be
presented in section 3.3 (????). It is important to understand that
security is only as strong as the weakest link in the chain. Several
of the most publicized penetrations in recent years have been through
the electronic mail systems of machines. The intruders were not
trying to steal electronic mail, but they used the vulnerability in
that system to gain access to other systems.

If possible, each service should be running on a different machine
whose only duty is to provide a specific service. This helps isolate
intruders and limit potential harm.

3.1.2.1 Name Servers (DNS and NIS(+))

The Internet uses the Domain Name System (DNS) to perform name to
address resolution. The Network Information Service (NIS) and NIS+
are not used on the global Internet but are subject to the same risks
internally as a DNS server. Name-to-address resolution is critical
to the secure operation of any network. An attacker who can
successfully control or impersonate a DNS server can redirect traffic
and so quickly subvert numerous security safeguards.
Routine traffic can be diverted to a compromised system where it can
be monitored, or users can trivially be tricked into offering
authentication secrets. An organization would do well to arrange for
well known and protected sites to act as secondary name servers, and
also to protect its own DNS masters from denial of service attacks
using filtering routers.

3.1.2.2 Password/Key Servers (NIS(+) and KDC)

Machines used for password or key servers protect some of the most
valuable information to a potential intruder, the encrypted passwords
and/or keys, which can be used to access the site. Since many
current schemes of encryption are subject to either dictionary or
brute force attacks, every effort must be made to secure these
machines.

3.1.2.3 Authentication/Proxy Servers (SOCKS, FWTK)

The use of proxy servers as a security technique will only continue
to expand. A proxy server provides a number of security
enhancements. It allows sites to concentrate services through a
specific machine for monitoring, hiding of internal structure, etc.
However, this funnelling effect also creates an attractive target for
a potential intruder.

3.1.2.4 Email

Electronic mail systems have traditionally been one source for
intruder break-ins. Most email servers typically accept input from
any source. Most email systems consist of two parts, a
receiving/sending agent and a processing agent. Since email is
delivered to all users and is usually private, the processing agent
typically requires system privileges to deliver the mail. Most email
implementations perform both portions of the service, which means the
receiving agent has a direct link to system privileges. There are
several packages available which allow a separation of the two
pieces.
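
The separation of the receiving agent from the processing agent can be
illustrated with a minimal Python sketch. It is not taken from any of
the packages mentioned above; the function names, the spool directory
layout, and the size limit are all hypothetical. The idea shown is
that the receiver only writes untrusted input to a spool directory,
while a separate deliverer, which in a real deployment would be the
only component run with elevated privileges, moves spooled messages
into a mailbox.

```python
import os
import tempfile
from pathlib import Path

MAX_MESSAGE_BYTES = 1_000_000  # hypothetical limit, chosen for illustration


def receive(spool_dir: Path, raw_message: bytes) -> Path:
    """Receiving agent: handles untrusted input, needs no system privileges.

    It checks the message size and writes the message to the spool only.
    """
    if len(raw_message) > MAX_MESSAGE_BYTES:
        raise ValueError("message too large")
    fd, name = tempfile.mkstemp(dir=spool_dir, suffix=".msg")
    with os.fdopen(fd, "wb") as handle:
        handle.write(raw_message)
    return Path(name)


def deliver(spool_dir: Path, mailbox: Path) -> int:
    """Processing agent: the only component that touches the mailbox.

    Returns the number of spooled messages appended to the mailbox.
    """
    delivered = 0
    for item in sorted(spool_dir.glob("*.msg")):
        with mailbox.open("ab") as box:
            box.write(item.read_bytes() + b"\n")
        item.unlink()  # remove the spool file once it has been delivered
        delivered += 1
    return delivered
```

In this arrangement a flaw in the code that parses network input does
not by itself expose the privileges needed for delivery, because the
two pieces share only the spool directory.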

3.1.2.5 World Wide Web (WWW)

The WWW is growing in popularity exponentially because of its ease of
use and the powerful abilities to concentrate information services.
Most WWW servers take some directions and actions from the persons
accessing their services. The most common example is taking a
request from a remote user and passing the provided information to a
program running on the server to process the request. Those programs
can be subverted easily if they are not written with security in
mind.

3.1.2.6 FTP

FTP allows users to receive and send electronic files in a point to
point manner. Improperly configured FTP servers can allow intruders
to copy or replace files at will from throughout a system. Access to
encrypted passwords, proprietary data, or the introduction of trojan
horses are just a few of the possibilities.

3.1.3 Deny all/ Allow all

There are two diametrically opposed underlying philosophies which
can be adopted in defining a security plan. Both alternatives are
legitimate models to adopt, depending on the site and its needs for
security.

The first option is to turn off all services and then selectively
enable services on a case by case basis, be it at the machine or
network level, as they are needed. This model, which will hereafter
be referred to as the "deny all" model, is generally more secure.
Successfully implementing a "deny all" configuration requires more
work and usually a better understanding of services, since a service
may require several pieces operating together to function correctly.
Only allowing known services allows a better analysis of a particular
service/protocol, and the design of a security mechanism suited to
the security level of the site.
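
As a rough illustration of the "deny all" model described above, the
following Python sketch treats every service as disabled unless it
appears on an explicit allow list. The list contents and the function
name are assumptions made for this example, not recommendations for
any particular site.

```python
# Hypothetical allow list of (protocol, port) pairs; everything else is denied.
ALLOWED_SERVICES = frozenset({("tcp", 25), ("tcp", 53), ("udp", 53)})


def is_permitted(protocol: str, port: int, allowed=ALLOWED_SERVICES) -> bool:
    """Return True only for services that were explicitly enabled.

    Unknown or unlisted services fall through to the default answer, deny.
    """
    return (protocol.lower(), port) in allowed
```

Enabling a new service under this model means adding an entry after
the service has been analyzed, rather than removing an entry after a
problem has been found.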

The other model, which will hereafter be referred to as the "allow
all" model, is much easier to implement, but is in general less
secure than the "deny all" model. Simply turn on all services,
usually the default at the host level, and allow all protocols to
travel across network boundaries, usually the default at the router
level. As security holes become apparent, they are patched at either
the host or network level.

Each of these models can be applied to different portions of the
site, depending on functionality requirements, administrative
control, site policy, etc. For example, the policy may be to use the
"allow all" model when setting up workstations for general use, but
adopt a "deny all" model when setting up information servers, like an
email hub. Likewise, an "allow all" policy may be adopted for
traffic between LANs internal to the site, but a "deny all" policy
can be adopted between the site and the Internet.

Be careful when mixing philosophies as in the examples above. Many
sites adopt the M & M theory of a hard "crunchy" shell and a soft
"squishy" middle. They are willing to pay the cost of security for
their external traffic and require strong security measures, but are
unwilling or unable to provide similar protections internally. This
works fine as long as the outer defenses are never breached and the
internal users can be trusted. Once the outer shell (firewall) is
breached, subverting the internal network is trivial.

3.1.4 Identify Real Needs for Services

There is a large variety of services which may be provided, both
internally and on the Internet at large. Managing security is in
many ways managing access to services internal to the site and
managing how internal users access information at remote sites.

Services tend to rush like waves over the Internet.
Over the years many sites have established anonymous FTP servers,
gopher servers, WAIS servers, WWW servers, etc. as they became
popular, even though they were not particularly needed at all sites.
Evaluate all new services that are established with a skeptical
attitude to determine if they are actually needed or just the current
fad sweeping the Internet.

Bear in mind that security complexity can grow exponentially with the
number of services provided. Filtering routers need to be modified
to support the new protocols. Some protocols are inherently
difficult to filter safely (e.g., RPC and UDP services), thus
providing more openings to the internal network. Services provided
on the same machine can interact in catastrophic ways (e.g., allowing
anonymous FTP on the same machine as the WWW server may allow an
intruder to place a file in the anonymous FTP area and cause the HTTP
server to execute it).

3.2 Network and Service Configuration

3.2.1 Protecting the Infrastructure

Many network administrators go to great lengths to protect the hosts
on their networks. Few administrators make any effort to protect the
networks themselves. There is some rationale to this. For example,
it is far easier to protect a host than a network. Also, intruders
are likely to be after data on the hosts; damaging the network would
not serve their purposes. That said, there are still reasons to
protect the networks. For example, an intruder might divert network
traffic through an outside host in order to examine the data (i.e.,
to search for passwords). Also, infrastructure includes more than
the networks and the routers which interconnect them. Infrastructure
also includes network management (e.g., SNMP), services (e.g., DNS,
NFS, NTP, WWW), and security (i.e., user authentication and access
restrictions).

The infrastructure also needs protection against human error.
When an administrator misconfigures a host, that host may offer degraded service. This only affects users who require that host, and unless that host is a primary server, the number of affected users will therefore be limited. However, if a router is misconfigured, all users who require the network will be affected. Obviously, this is a far larger number of users than those depending on any one host.

3.2.2 Protecting the Network

There are several problems to which networks are vulnerable. The classic is a "denial of service" attack. In this case, the network is brought to a state in which it can no longer carry legitimate users' data. There are two common ways this can be done: by attacking the routers and by flooding the network with extraneous traffic. An attack on the router is designed to cause it to stop forwarding packets, or to forward them improperly. The former case may be due to a misconfiguration, the injection of a spurious routing update, or a "flood attack" (i.e., the router is bombarded with unroutable packets, causing its performance to degrade). A flood attack on a network is similar to a flood attack on a router, except that the flood packets are usually broadcast. An ideal flood attack would be the injection of a single packet which exploits some known flaw in the network nodes and causes them to retransmit the packet, or generate error packets, each of which is picked up and repeated by another host. A well chosen attack packet can even generate an exponential explosion of transmissions.

Another classic problem is "spoofing." In this case, spurious routing updates are sent to one or more routers causing them to misroute packets. This differs from a denial of service attack only in the purpose behind the spurious route.
In denial of service, the object is to make the router unusable, a state which will be quickly detected by network users. In spoofing, the spurious route will cause packets to be routed to a host from which an intruder may monitor the data in the packets. These packets are then re-routed to their correct destinations. However, the intruder may or may not have altered the contents of the packets.

The solution to most of these problems is to protect the routing update packets sent by the routing protocols in use (e.g., RIP-2, OSPF). There are three levels of protection: clear-text password, cryptographic checksum, and encryption. Passwords offer only minimal protection against intruders who do not have direct access to the physical networks. Passwords also offer some protection against misconfigured routers (i.e., routers which, out of the box, attempt to route packets). The advantage of passwords is that they have a very low overhead, in both bandwidth and CPU consumption. Checksums protect against the injection of spurious packets, even if the intruder has direct access to the physical network. Combined with a sequence number, or other unique identifier, a checksum can also protect against "replay" attacks, wherein an old (but valid at the time) routing update is retransmitted by either an intruder or a misbehaving router. The most security is provided by complete encryption of sequenced, or uniquely identified, routing updates. This prevents an intruder from determining the topology of the network. The disadvantage to encryption is the overhead involved in processing the updates.

RIP-2 (RFC 1723) and OSPF (RFC 1583) both support clear-text passwords in their base design specifications. In addition, there are extensions to each base protocol to support MD5-based cryptographic checksums.
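
The checksum-plus-sequence-number idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it is not the RIP-2 or OSPF packet format, HMAC-SHA256 stands in for the keyed MD5 checksum those extensions use, and the names SHARED_KEY, sign_update and UpdateVerifier are hypothetical.

```python
import hashlib
import hmac
import struct
from typing import Optional

# Hypothetical secret configured on both routers (an assumption for this sketch).
SHARED_KEY = b"example-routing-secret"
DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes


def sign_update(seq: int, payload: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Return seq || payload || digest; the digest covers both seq and payload."""
    header = struct.pack("!I", seq)
    digest = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + digest


class UpdateVerifier:
    """Accepts an update only if its digest is valid and its sequence number is new."""

    def __init__(self, key: bytes = SHARED_KEY) -> None:
        self.key = key
        self.last_seq = -1

    def verify(self, packet: bytes) -> Optional[bytes]:
        """Return the payload if the packet is authentic and fresh, else None."""
        if len(packet) < 4 + DIGEST_LEN:
            return None
        header = packet[:4]
        payload = packet[4:-DIGEST_LEN]
        digest = packet[-DIGEST_LEN:]
        expected = hmac.new(self.key, header + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(digest, expected):
            return None  # spurious or altered update
        (seq,) = struct.unpack("!I", header)
        if seq <= self.last_seq:
            return None  # replay of an old (but once valid) update
        self.last_seq = seq
        return payload
```

Note that the update itself still travels in the clear; a checksum of this kind protects integrity and freshness, not the confidentiality of the topology.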
Unfortunately, there is no adequate protection against a flooding attack, or a misbehaving host or router which is flooding the network. Fortunately, this type of attack is obvious when it occurs and can be terminated relatively simply.

3.2.3 Protecting the Services

There are many types of services; each has its own security requirements. These requirements will vary based on the intended use of the service. For example, a service which should only be usable within a site (e.g., NFS) requires little protection. That is, protecting the server from external access is sufficient to protect the service. However, a WWW server, which provides a home page intended for viewing by users anywhere on the Internet, requires built-in protection. That is, the service/protocol/server must provide whatever security may be required to prevent unauthorized access and modification of the Web database.

Internal services (i.e., services meant to be used only by users within a site) and external services (i.e., services deliberately available to users outside a site) will, in general, have protection requirements which differ as previously described. It is therefore wise to isolate the internal services to one set of server machines and the external services to another set of server machines. That is, internal and external servers should not be co-located. In fact, many sites go so far as to have one set of subnets (or even different networks) which are accessible from the outside and another set which may be accessed only within the site. Of course, there is usually a firewall which connects these partitions. Great care must be taken to ensure that such a firewall is operating properly.

One form of external service deserves some special consideration, and that is anonymous, or guest, access. This may be either anonymous FTP or guest (unauthenticated) login.
It is extremely important to ensure that anonymous FTP servers and guest login userids are carefully isolated from any hosts and file systems from which outside users should be kept. Another area to which special attention must be paid concerns anonymous, writable access. A site may be legally responsible for the content of publicly available information, so careful monitoring of the information deposited by anonymous users is advised.

3.2.4 Protecting the Protection

It is amazing how often a site will overlook the most obvious weakness in its security by leaving the security server itself open to attack. Based on considerations previously discussed, it should be clear that the security server should not be accessible from off-site; should offer minimum access, except for the authentication function, to users on-site; and should not be co-located with any other servers. Further, all access to the node, including access to the service itself, should be logged to provide a "paper trail" in the event of a security breach.

3.3 Firewalls

4. SECURITY PROCEDURES

4.1 Authentication

For many years, the prescribed method for authenticating users has been through the use of standard, reusable passwords. Originally, these passwords were used by users at terminals to authenticate themselves to a central computer. At the time, there were no networks (internally or externally) and hence the risk of disclosure of the cleartext password was minimal. Today, systems are connected together through local networks, and those local networks are further connected together, and to the Internet. Users are logging in from all over the globe, and their reusable passwords are often transmitted across those same networks in cleartext, ripe for anyone in between to capture.
And indeed, the CERT Coordination Center and other response teams are seeing a tremendous number of incidents involving packet sniffers which are capturing the cleartext passwords. To address this threat, we are including sections on better technologies like one-time passwords and Kerberos.

With the advent of newer technologies like one-time passwords (e.g., S/Key), PGP, and token-based authentication devices, people are using password-like strings as secret tokens and PINs. We are including a discussion on these since they are the foundation upon which stronger authentication techniques are based. If these secret tokens and PINs are not properly selected and protected, the authentication will be easily subverted.

4.1.1 One-Time Passwords

As mentioned above, given today's networked environments, it is recommended that sites concerned about the security and integrity of their systems and networks consider moving away from standard, reusable passwords. There have been many incidents involving Trojan network programs (e.g., telnet and rlogin) and network packet sniffing programs. These programs capture cleartext hostname, account name, password triplets. Intruders can use the captured information for subsequent access to those hosts and accounts. This is possible because 1) the password is used over and over (hence the term "reusable"), and 2) the password passes across the network in clear text.

Several authentication techniques have been developed that address this problem. Among these techniques are challenge-response technologies that provide passwords that are only used once (commonly called one-time passwords). This document provides a list of sources for products that provide this capability. The decision to use a product is the responsibility of each organization, and each organization should perform its own evaluation and selection.
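
To make the "used only once" property concrete, here is a minimal sketch of a hash-chain one-time password scheme in the spirit of S/Key. It is an illustration only, not the S/Key algorithm or encoding: SHA-256 is an assumed stand-in hash, and hash_n and OtpServer are hypothetical names.

```python
import hashlib


def hash_n(seed: bytes, n: int) -> bytes:
    """Apply SHA-256 to seed n times and return the result."""
    value = seed
    for _ in range(n):
        value = hashlib.sha256(value).digest()
    return value


class OtpServer:
    """Stores only the last accepted chain value, never the user's secret."""

    def __init__(self, initial: bytes) -> None:
        # At enrollment the server is given hash_n(secret, N).
        self.stored = initial

    def login(self, otp: bytes) -> bool:
        """Accept otp only if hashing it once yields the stored value."""
        if hashlib.sha256(otp).digest() == self.stored:
            # The accepted value becomes the new reference, so a captured
            # password is useless for any later login.
            self.stored = otp
            return True
        return False
```

With a chain of length 100, the user's first password is hash_n(secret, 99), the second is hash_n(secret, 98), and so on. A sniffer that captures one of them would have to invert the hash to obtain the next.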
4.1.2 Kerberos

4.1.3 Choosing and Protecting Secret Tokens and PINs

4.1.4 Password Assurance

While the need to eliminate the use of standard, reusable passwords cannot be overstated, it is recognized that some organizations may have to transition to the use of better technology. Given that situation, we have included the following advice to help with the selection and maintenance of traditional passwords. But remember, none of these measures provides protection against disclosure due to sniffer programs.

(1) The importance of robust passwords - In many (if not most) cases of system penetration, the intruder needs to gain access to an account on the system, and one way that goal is typically accomplished is through guessing the password of a legitimate user. This is often accomplished by running an automated password cracking program, which utilizes a very large dictionary, against the system's password file. The only way to guard against passwords being disclosed in this manner is through the careful selection of passwords which cannot be easily guessed (i.e., combinations of numbers, letters, and punctuation characters).

(2) Changing default passwords - Many operating systems and application programs are installed with default accounts and passwords. These must be changed immediately to something that cannot be guessed or cracked.

(3) Restricting access to the password file - In particular, a site wants to protect the encrypted password portion of the file so that would-be intruders don't have them available for cracking. One effective technique is to use shadow passwords where the password field of the standard file contains a dummy or false password. The file containing the legitimate passwords is protected elsewhere on the system.
(4) Password aging - When and how to expire passwords is still a subject of controversy among the security community. It is generally accepted that a password should not be maintained once an account is no longer in use, but it is hotly debated whether a user should be forced to change a good password that's in active use. The arguments for changing passwords relate to the prevention of the continued use of penetrated accounts. However, the opposition claims that frequent password changes lead to users writing down their passwords in visible areas (such as pasting them to a terminal), or to users selecting very simple passwords that are easy to guess. It should also be stated that an intruder will probably use a captured or guessed password sooner rather than later, in which case password aging provides little if any protection.

    While there is no definitive answer to this dilemma, a password policy should directly address the issue and provide guidelines for how often a user should change the password. It is recommended that passwords be changed whenever root is penetrated, there is a critical change in personnel (especially if it is the system administrator!), or when an account has been compromised. In particular, if the root password is compromised, all passwords on the system should be changed. In addition, an annual change in their password is usually not difficult for most users, and you should consider requiring it.

4.2 Authorization

Authorization refers to the process of granting privileges to processes and ultimately users. This differs from authentication in that authentication is what occurs to identify a user. Once identified (reliably), the privileges, rights, property, and permissible actions of the user are determined by authorization.
[Should "objects" and "entities" be defined here?]

Explicitly listing the authorized activities of each user (and user process) with respect to all resources (objects) is impossible in a reasonable system. In a real system certain techniques are used to simplify the process of granting and checking authorization(s).

One approach, popularized in UNIX systems, is to assign to each object three classes of user - the super user, the owner, and the group. Super user, or root, is an entity that has access to all portions (and objects) of the computer. The owner of an object is the "user" who either created the object or was given it by the super user. A group is a collection of users that share privileges over a collection of objects. Groups ease authorization management by simplifying the process of changing the authorization of users and by changing the authority of a group to manage an object.

Another approach is to attach to an object a list which explicitly contains the identity of all permitted users (or groups). This is an Access Control List. The advantage of these lists is that they are easily maintained (one central list per object).

4.3 Access

4.4 Modems

4.4.1 Modem lines must be managed

Although modems provide convenient access to a site for its users, they can also provide an effective detour around the site's firewalls. For this reason it is essential to maintain proper control of modems.

Don't allow users to install a modem line without proper authorization. This includes temporary installations, e.g. plugging a modem into a facsimile or telephone line overnight.

Maintain a register of all your modem lines. Conduct regular site checks for unauthorized modems; keep your register up to date.

4.4.2 Dial-in users must be authenticated

A username and password check should be completed before a user can access anything on your network.
Normal password security considerations (such as choosing passwords which don't appear in dictionaries and changing them from time to time) are particularly important.

Remember that telephone lines can be tapped, and that it is quite easy to intercept messages to cellular 'phones. Modern high-speed modems use more sophisticated modulation techniques, which makes them somewhat more difficult to monitor, but it is prudent to assume that hackers know how to eavesdrop on your lines. For this reason you should use one-shot passwords (e.g. skey) or hardware authentication devices (e.g. SecurID) if this is at all possible.

It is helpful to have a single dial-in point, e.g. a single large modem pool, so that all users are authenticated in the same way.

Users will occasionally mis-type a password. Set a short delay - say two seconds - after the first and second failed logins, and force a disconnect after the third. This will slow down automated password attacks. Don't tell the user whether the username, the password or both were incorrect.

4.4.3 All logins (successful and unsuccessful) should be logged

Don't keep correct passwords in the log, but consider keeping incorrect passwords to aid in detecting password attacks. However, bear in mind that most incorrect passwords are correct passwords with one character mistyped, and may suggest the real password. If you can't keep this information secure, don't log it at all.

If Calling Line Identification is available, take advantage of it by recording the calling number for each login attempt. Be sensitive to the privacy issues raised by Calling Line Identification. Also be aware that Calling Line Identification is not to be trusted; use the data for informational purposes only, not for authentication.
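
The retry and logging advice in 4.4.2 and 4.4.3 can be sketched as a small policy function. This is a hedged illustration, not the behaviour of any particular access server: dial_in_session, MAX_ATTEMPTS and RETRY_DELAY_SECONDS are hypothetical names, the credential check is supplied by the caller, and the sketch takes the conservative option of never logging any password, correct or incorrect.

```python
import time
from typing import Callable, List, Tuple

MAX_ATTEMPTS = 3            # force a disconnect after the third failure
RETRY_DELAY_SECONDS = 2.0   # short delay after the first and second failures
FAILURE_MESSAGE = "Login incorrect"  # same text whether name or password was wrong


def dial_in_session(
    attempts: List[Tuple[str, str]],
    check: Callable[[str, str], bool],
    log: List[str],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Process the (username, password) pairs typed during one call.

    Returns True on a successful login and False if the call should be
    disconnected. Every attempt is logged, but passwords never are.
    """
    for count, (user, password) in enumerate(attempts[:MAX_ATTEMPTS], start=1):
        if check(user, password):
            log.append(f"login ok user={user}")
            return True
        log.append(f"login failed user={user}")
        if count < MAX_ATTEMPTS:
            sleep(RETRY_DELAY_SECONDS)  # slows down automated guessing
    return False
```

A caller would show FAILURE_MESSAGE after each failed attempt and hang up the line when the function returns False.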
4.4.5 Minimize the amount of information given in your opening banner

In particular, don't announce the type of host hardware or operating system - this encourages specialist hackers.

Display a short banner, but don't offer an 'inviting' name (e.g. University of XYZ, Student Records System). Instead, give your site name, a short warning that all sessions are monitored, and a username/password prompt. Get your site's lawyers to check your banner to make sure it states your legal position correctly.

For high-security applications, consider using a 'blind' password, i.e. give no response to an incoming call until the user has typed in (without any echoing) a password. This effectively simulates a dead modem.

4.4.6 Call-back Capability

Some dial-in servers offer call-back facilities, i.e. the user dials in and is authenticated, then the system disconnects the call and calls back on a specified number. You will probably have to pay the charges for such calls.

This feature should be used with caution; it can easily be bypassed. As a minimum, make sure that the return call is never made from the same modem as the incoming one. Overall, although call-back can improve modem security, you should not depend on it alone.

4.4.7 Dial-out authentication

Dial-out users should also be authenticated, particularly since your site will have to pay their telephone charges.

Never allow dial-out from an unauthenticated dial-in call, and consider whether you will allow it from an authenticated one. The goal here is to prevent callers using your modem pool as part of a chain of logins. This can be hard to detect, particularly if a hacker sets up a path through several hosts on your site.

As a minimum, don't allow the same modems and phone lines to be used for both dial-in and dial-out.
This can be implemented easily if you run separate dial-in and dial-out modem pools.

4.4.8 Make your modem programming as 'bullet-proof' as possible

Be sure modems can't be reprogrammed while they're in service. As a minimum, make sure that three plus signs won't put your dial-in modems into command mode!

Program your modems to reset to your standard configuration at the start of each new call. Failing this, make them reset at the end of each call. This precaution will protect you against accidental reprogramming of your modems.

Check that your modems terminate calls cleanly. When a user logs out from an access server, verify that the server hangs up the 'phone line properly. It is equally important that the server forces logouts from whatever sessions were active if the user hangs up unexpectedly.

4.5 Cryptography

4.6 Auditing

This section covers the procedures for collecting data generated by network activity that may be useful in analyzing the security of a network and/or useful in responding to a security incident. This section also covers the handling, preservation, and utilization of the data.

(This will be reworked as I develop the remainder.)

4.6.1 What to collect

Audit data should include any attempt to achieve a different security level by any person, process, or other entity in the network. The most obvious example in this area is a log of attempts to log in to a host computer.

Audit data should also include data pertaining to any "public" or anonymous access and retrieval of data, at least to the granularity of the "remote" host.

And on and on...

4.6.2 Collection Process

The collection process should be enacted by the host or resource being accessed.
Depending on the importance of the data and the need to have it local in instances in which services are being denied, data could be kept local to the resource until needed or be transmitted to storage after each event.

Reporting data can be done by writing to a file, writing to a line printer, writing over a network, or writing over a non-network port, such as a console port. Each of these has its place.

File system logging is the least resource intensive of the four candidates (for a given audit log). It is also the least reliable. If a resource has been compromised, the file system is the first to go. If the network in front of the resource has become unusable, the data is inaccessible, unless a direct console port is available.

Line printer logging is useful in systems where permanent and immediate logs are required. A real time system is an example of this, where the exact point of a failure or attack must be recorded. A laser printer, or other device which buffers data between the auditing system and storage device, may suffer from lost data if buffers contain the needed data at a critical instant.

Reporting over the network allows a remote host to store data in a more permanent and possibly more reliable manner. However, this consumes bandwidth (at the minimum), exposes the audit data in an easy package to an interloper, and could be lost during network denial.

Reporting over a console port ensures the delivery of the data follows the hardware design. The limitations are that the console port requires physical security and using console ports on machines more than a short distance away (e.g., across a campus) may require a phone line in addition to the network connection. In some instances, this is one more resource that may be constrained.
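
As a rough sketch of sending the same audit record to more than one of the channels discussed above, the following uses Python's standard logging module to write each event both to a local file and to a stream standing in for a console port. The logger name "audit-sketch" and the function make_audit_logger are hypothetical, and the printer and network channels are left out for brevity.

```python
import logging
import sys


def make_audit_logger(path: str) -> logging.Logger:
    """Return a logger that writes every audit record to a file and to stderr."""
    logger = logging.getLogger("audit-sketch")
    logger.setLevel(logging.INFO)
    logger.propagate = False   # keep audit records out of the root logger
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    formatter = logging.Formatter("%(asctime)s %(message)s")
    handlers = (
        logging.FileHandler(path),          # local file: cheap, least reliable
        logging.StreamHandler(sys.stderr),  # stand-in for a console port
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Writing to two independent channels means that losing one of them, for example a compromised file system, does not by itself erase the record.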
4.6.3 Collection Load

Collecting running data may result in a quick accumulation of bytes. Storage of this must be considered in advance. There are a few ways to limit the required storage space. Data may be compressed using one of many methods. Another approach is to only archive summaries of activity (possibly losing some detail in the process). Data may be archived for just a fixed period of time, then it is permanently removed.

The issue of archiving security data differs from archiving network management and application data. Network management data (statistics) can be reduced by altering the reporting period. Security data does not have that option (for the most part). Security data also does not have the permanence of application data.

4.6.4 Handling the Data/Preservation

Security data should be protected at least as much as any other data is protected because much can be inferred from it. Security data may give away enough secrets to allow a "masquerader" to impersonate an authorized administrator.

Data may also become key to the investigation, apprehension, or prosecution of the perpetrator of an incident. Because of this, the data needs to be protected and clearly documented. For this reason it is advisable to seek the advice of local legal counsel or law enforcement when deciding how security data is to be treated. This should happen before an incident occurs.

If a data handling plan is not cleared prior to an incident, this does not mean that it is useless. It means two things. You may not have recourse in the aftermath of an event. You may also be liable for penalties resulting from your treatment of the data.

4.6.5 Audit Data Precautions

In certain instances, audit data may contain personal information.
Searching through the data, even for just a routine check of the network's security, could present an invasion of privacy or make the auditing entity privy to information it should not be allowed to have. Note that this is not automatically true - not all data is "sensitive" and "sensitive" differs by locale.

Another danger presented by auditing data is that it may reveal a pattern of incidents. If an organization knows about the incidents but permits them to continue and this results in damage to another organization ("downstream"), legal action could result. An organization may also be liable for not (making a "best effort" to) analyze this data for incidents.

4.7 Backups

5. SECURITY INCIDENT HANDLING

This section of the document will supply guidance to be applied before, during, and after a computer security incident is in progress on a machine, network, site, or multi-site environment. The operative philosophy in the event of a breach of computer security is to react according to a plan. This is true whether the breach is the result of an external intruder attack, unintentional damage, a student testing some new program to exploit a software vulnerability, or a disgruntled employee. Each of the possible types of events described above should be addressed by adequate contingency plans. Without a proactive approach to protecting the assets in case of an incident, the handling process cannot be as efficient as it would be with well prepared procedures, methods and policies in place.

Traditional computer security, while quite important in the overall site security plan, usually focuses heavily on protecting systems from attack, and perhaps monitoring systems to detect attacks. Little attention is usually paid to how to actually handle the attack when it occurs.
The result is that when an attack is in progress, many decisions are made in haste and can be damaging to tracking down the source of the incident, collecting evidence to be used in prosecution efforts, preparing for the recovery of the system, and protecting the valuable data contained on the system.

One of the most important but often overlooked benefits of efficient incident handling is an economic one. Having both technical and managerial personnel respond to an incident requires considerable resources, resources which could be utilized more profitably if an incident did not require their services. If these personnel are trained to handle an incident efficiently, less of their time is required to deal with that incident.

Because the network is worldwide, most incidents are not restricted to one site only. Operating system vulnerabilities apply (in some cases) to several million systems, and many vulnerabilities are exploited within the network itself. Therefore it is vital for all sites that all involved parties are informed as soon as possible.

Another benefit is related to public relations. News about computer security incidents tends to be damaging to an organization's stature among current or potential clients. Efficient incident handling minimizes the potential for negative exposure.

A final benefit of efficient incident handling is related to legal issues. It is possible that in the near future organizations may be sued because one of their nodes was used to launch a network attack. In a similar vein, people who develop patches or workarounds may be sued if the patches or workarounds are ineffective, resulting in damage to systems, or if the patches or workarounds themselves damage systems.
Knowing about operating system vulnerabilities and patterns of attacks and then taking appropriate measures is critical to circumventing possible legal problems.

This chapter is arranged such that a list of relevant topics may be drawn from the following outline, to provide a starting point for creating a policy for handling ongoing incidents. The main points to be included in a policy for handling incidents are:

(1) Preparing and planning (what are the goals and objectives in handling an incident).
(2) Notification (who should be contacted in case of an incident).
(3) Evaluation (how serious is the incident).
(4) Handling (what should be done when an incident occurs). This especially includes:
    - Notification (who should be notified about the incident).
    - Containment (how can the damage be limited).
    - Eradication (eliminate the reasons for the incident).
    - Recovery (how to reestablish service and systems).
    - Follow Up (what actions should be taken after the incident).
    - Legal/Investigative implications (what are the legal and prosecutorial implications of the incident).
    - Documentation Logs (what records should be kept from before, during, and after the incident).
(5) Aftermath (overall implications of past incidents).
(6) Responsibilities (for planning and handling an incident).

Each of these points is important in an overall plan for handling incidents. The remainder of this chapter will detail the issues involved in each of the relevant topics, and provide some guidance as to what should be included in a site policy for handling incidents.

Guidelines for End User involvement in dealing with compromised accounts and vulnerabilities are covered in the corresponding "End User Security Handbook" [RFC xxx].
Especially interesting for Site Administrators who act as Site Security Contact in assisting other users and administrators in dealing with incidents are the "Guidelines and Recommendations for Incident Processing" [RFC xxx].

5.1 Preparing and Planning for Incident Handling

Part of handling an incident is being prepared to respond before the incident occurs. This includes establishing a suitable level of protections as explained in the preceding chapters. This protection not only helps avoid incidents but also limits the damage which can occur if an incident becomes severe. Protection includes preparing incident handling guidelines as part of a contingency plan for your organization or site. Having written plans eliminates much of the ambiguity which occurs during an incident, and will lead to a more appropriate and thorough set of responses. As explained for the site specific contingency plan in section xxx, it is vitally important to test the proposed plan before an incident actually occurs through 'dry runs'. A team might even consider hiring a tiger team to act in parallel with the dry run.

Once a site has recovered from an incident, site policy and procedures should be reviewed to encompass changes to prevent similar incidents. If an incident stems from poor policy, then unless the policy is changed, one is doomed to repeat the past. Even without an incident, it would be prudent to review policies and procedures on a regular basis. Reviews are imperative due to today's changing computing environments. To improve this process a problem reporting procedure should be implemented to describe, in detail, the incident and the solutions to the incident. Each incident should be reviewed by the site security subgroup to allow understanding of the incident with possible suggestions to the site policy and procedures.

   Learning to respond efficiently to an incident is important for
   numerous reasons:

   (1) protect the assets that normal security measures are meant to
       protect, even in a worst-case event
   (2) protect your resources which could be utilized more profitably
       if an incident did not require their services
   (3) take care that (government) regulations are complied with
   (4) prevent use of your systems against other systems (which
       could incur legal liability)
   (5) minimize the potential for negative exposure

   As in any set of pre-planned procedures, attention must be paid to
   a set of goals for handling an incident. These goals will be
   prioritized differently depending on the site. The set of goals will
   be closely related to the goals for security in general. Therefore,
   the same guidelines as in section xxx for security in general might
   be applied here. A specific set of objectives can be identified for
   dealing with incidents:

   (1) Figure out how it happened.
   (2) Find out how to avoid further exploitations.
   (3) Avoid escalation and further incidents.
   (4) Recover from the incident.
   (5) Find out who did it.

   Depending on the nature of the incident, there might be a conflict
   between analyzing the original source of a problem and restoring
   systems and services. In this case overall goals (like assuring the
   integrity of (life) critical systems) might be the reason for not
   analyzing an incident. Of course this is an important management
   decision, but all involved parties must be aware that without an
   analysis the same incident may happen again.

   It is important to prioritize actions to be taken during an incident
   well in advance of the time an incident occurs. Sometimes an
   incident may be so complex that it is impossible to do everything at
   once to respond to it; priorities are essential.
   Although priorities will vary from institution to institution, the
   following suggested priorities may serve as a starting point for
   defining an organization's response:

   (1) Priority one -- protect human life and people's
       safety; human life always has precedence over all
       other considerations.

   (2) Priority two -- protect classified and/or sensitive
       data. Prevent exploitation of classified and/or
       sensitive systems, networks or sites. Inform affected
       classified and/or sensitive systems, networks or sites
       about penetrations that have already occurred.
       (Be aware of regulations by your site or by government.)

   (3) Priority three -- protect other data, including
       proprietary, scientific, managerial and other data,
       because loss of data is costly in terms of resources.
       Prevent exploitation of other systems, networks or
       sites and inform systems, networks or sites that have
       already been affected about successful penetrations.

   (4) Priority four -- prevent damage to systems (e.g., loss
       or alteration of system files, damage to disk drives,
       etc.); damage to systems can result in costly down
       time and recovery.

   (5) Priority five -- minimize disruption of computing
       resources; it is better in many cases to shut a system
       down or disconnect from a network than to risk damage
       to data or systems.

   An important implication for defining priorities is that once human
   life and national security considerations have been addressed, it is
   generally more important to save data than system software and
   hardware. Although it is undesirable to have any damage or loss
   during an incident, systems can be replaced; the loss or compromise
   of data (especially classified data), however, is usually not an
   acceptable outcome under any circumstances.

   Another important concern is the effect on others, beyond the systems
   and networks where the incident occurs.
   Within the limits imposed by government regulations it is always
   important to inform affected parties as soon as possible. Due to the
   legal implications of this topic, it should be included in the
   planned procedures to avoid further delays and uncertainty for the
   administrators.

   Any plan for responding to security incidents should be guided by
   local policies and regulations. Government and private sites that
   deal with classified material have specific rules that they must
   follow.

   The policies chosen by your site on how it reacts to incidents will
   shape your response. For example, it may make little sense to create
   mechanisms to monitor and trace intruders if your site does not plan
   to take action against the intruders if they are caught. Other
   organizations may have policies that affect your plans. Telephone
   companies often release information about telephone traces only to
   law enforcement agencies.

   Note also that if any legal action is planned, there are specific
   guidelines that must be followed to make sure that any information
   collected can be used as evidence.

5.2 Notification and Points of Contact

   It is important to establish contacts with various personnel before a
   real incident occurs. These contacts may be local personnel, system
   administrators elsewhere on the Internet, or investigative agencies.
   Working with these contacts appropriately will help to make your
   incident handling process more efficient.

   Communication may need to be established with various "Points of
   Contact." These may be technical or administrative in nature, and
   may include legal or investigative agencies, as well as Service
   Providers and vendors. It is important to decide how much
   information will be shared, especially with the wider community of
   users at a site, with the public (the press) and with other sites.

   Settling these issues is especially important for the local person
   responsible for handling the incident, since that is the person
   responsible for the actual notification of others. A list of
   contacts in each of these categories is an important time saver for
   this person during an incident. It can be quite difficult to find an
   appropriate person during an incident when many urgent events are
   ongoing. Including relevant telephone numbers (also electronic mail
   addresses and fax numbers) in the site security policy is strongly
   recommended. It is especially important to know how to contact
   individuals who will be directly involved in handling a security
   incident.

5.2.1 Local Managers and Personnel

   When an incident is under way, a major issue is deciding who is in
   charge of coordinating the activity of the multitude of players. A
   major mistake that can be made is to have a number of "points of
   contact" (POC) that are not pulling their efforts together. This
   will only add to the confusion of the event, and will probably lead
   to wasted or ineffective effort.

   The single point of contact may or may not be the person "in charge"
   of the incident. There are two distinct roles to fill when deciding
   who shall be the point of contact and the person in charge of the
   incident. The person in charge will make decisions as to the
   interpretation of policy applied to the event. The responsibility
   for the handling of the event falls onto this person. In contrast,
   the point of contact must coordinate the effort of all the parties
   involved with handling the event.

   The point of contact must be a person with the technical expertise to
   successfully coordinate the effort of the system managers and users
   involved in monitoring and reacting to the attack.
   Often the management structure of a site is such that the
   administrator of a set of resources is not a technically competent
   person with regard to handling the details of the operations of the
   computers, but is ultimately responsible for the use of these
   resources.

   Another important function of the POC is to maintain contact with law
   enforcement and other external agencies to assure that multi-agency
   involvement occurs. (In the U.S., the FBI, CIA, DoD, U.S. Army, or
   others might be concerned.)

   Finally, if legal action in the form of prosecution is involved, the
   POC may be asked to speak for the site in court. The alternative is
   to have multiple witnesses that will be hard to coordinate in a legal
   sense, and will weaken any case against the attackers. A single POC
   may also be the single person in charge of collecting evidence, which
   will keep the number of people accounting for evidence to a minimum.
   As a rule of thumb, the more people that touch a potential piece of
   evidence, the greater the possibility that it will be inadmissible in
   court.

   One of the most critical tasks for the POC is the coordination of all
   relevant processes. As responsibilities might be distributed over
   the whole site, which in fact can consist of multiple independent
   departments or groups, a well-coordinated effort is crucial for
   overall success. The situation gets even worse if multiple sites are
   involved. In many cases, no single POC at one involved site can
   coordinate the handling of an entire incident. Incident response
   teams are better suited to this task when multiple sites are
   involved.

   The incident handling process should provide some escalation
   mechanisms. The POC might change; the impact of the incident may
   force management to take the lead instead of leaving the
   responsibility with the technical administrator.
   Other reasons for changing the POC are the emergence of conflicts of
   interest, or changing priorities or responsibilities. Regardless of
   why the POC is changed, all involved parties must be informed.
   Arrangements should be made to allow the new POC to contact the old
   one, to ensure an adequate briefing of background information.

5.2.2 Law Enforcement and Investigative Agencies

   In the event of an incident it is important to establish contact with
   investigative agencies such as the FBI and Secret Service, or their
   equivalent in your country, as soon as possible. Local law
   enforcement and local security offices or campus police departments
   should also be informed when appropriate. A primary reason is that
   once a major attack is in progress, there is little time to call
   various personnel in these agencies to determine exactly who the
   correct point of contact is. Another reason is that it is important
   to cooperate with these agencies in a manner that will foster a good
   working relationship, and that will be in accordance with the working
   procedures of these agencies. Knowing the working procedures in
   advance and the expectations of your point of contact is a big step
   in this direction. For example, it is important to gather evidence
   that will be admissible in a court of law. If you don't know in
   advance how to gather admissible evidence, your efforts to collect
   evidence during an incident are likely to be of no value to the
   investigative agency with which you deal. A final reason for
   establishing contacts as soon as possible is that it is impossible to
   know the particular agency that will assume jurisdiction in any given
   incident. Making contacts and finding the proper channels early will
   make responding to an incident go considerably more smoothly.

   If your organization or site has a legal counsel, you need to notify
   this office soon after you learn that an incident is in progress. At
   a minimum, your legal counsel needs to be involved to protect the
   legal and financial interests of your site or organization. There
   are many legal and practical issues, a few of which are:

   (1) Whether your site or organization is willing to risk
       negative publicity or exposure to cooperate with legal
       prosecution efforts.

   (2) Downstream liability -- if you leave a compromised system
       as is so it can be monitored and another computer is damaged
       because the attack originated from your system, your site or
       organization may be liable for damages incurred.

   (3) Distribution of information -- if your site or organization
       distributes information about an attack in which another
       site or organization may be involved, or about a
       vulnerability in a product that may affect the ability to
       market that product, your site or organization may again be
       liable for any damages (including damage to reputation).

   (4) Liabilities due to monitoring -- your site or organization
       may be sued if users at your site or elsewhere discover
       that your site is monitoring account activity without
       informing users.

   Unfortunately, there are no clear precedents yet on the liabilities
   or responsibilities of organizations involved in a security incident
   or who might be involved in supporting an investigative effort.
   Investigators will often encourage organizations to help trace and
   monitor intruders -- indeed, most investigators cannot pursue
   computer intrusions without extensive support from the organizations
   involved. However, investigators cannot provide protection from
   liability claims, and these kinds of efforts may drag out for months
   and may take lots of effort.

   On the other side, an organization's legal counsel may advise extreme
   caution and suggest that tracing activities be halted and an intruder
   shut out of the system. This in itself may not provide protection
   from liability, and may prevent investigators from identifying
   anyone.

   The balance between supporting investigative activity and limiting
   liability is tricky; you'll need to consider the advice of your
   counsel and the damage the intruder is causing (if any) in making
   your decision about what to do during any particular incident.

   Your legal counsel should also be involved in any decision to contact
   investigative agencies when an incident occurs at your site. The
   decision to coordinate efforts with investigative agencies is most
   properly that of your site or organization. Involving your legal
   counsel will also foster the multi-level coordination between your
   site and the particular investigative agency involved, which in turn
   results in an efficient division of labor. Another result is that
   you are likely to obtain guidance that will help you avoid future
   legal mistakes.

   Finally, your legal counsel should evaluate your site's written
   procedures for responding to incidents. It is essential to obtain a
   "clean bill of health" from a legal perspective before you actually
   carry out these procedures.

   One of the most important considerations in dealing with
   investigative agencies is verifying that the person who calls asking
   for information is a legitimate representative from the agency in
   question. Unfortunately, many well intentioned people have
   unknowingly leaked sensitive information about incidents, allowed
   unauthorized people into their systems, etc., because a caller has
   masqueraded as a representative of a government agency (e.g., the
   FBI or Secret Service in the US).
   A similar consideration is using a secure means of communication.
   Because many network attackers can easily reroute electronic mail,
   avoid using electronic mail to communicate with other agencies (as
   well as others dealing with the incident at hand). Non-secured phone
   lines (e.g., the phones normally used in the business world) are
   also frequent targets for tapping by network intruders, so be
   careful!

   There is no established set of rules for responding to an incident
   when the local government becomes involved. Normally, except by
   court order, no agency can force you to monitor, to disconnect from
   the network, to avoid telephone contact with the suspected attackers,
   etc. As discussed before, you should consult your legal counsel on
   the matter, especially before taking an action that your
   organization has never taken. The particular agency involved may ask
   you to leave an attacked machine on and to monitor activity on this
   machine, for example. Complying with this request will ensure the
   continued cooperation of the agency. This is usually the best route
   towards finding the source of the network attacks and, ultimately,
   terminating these attacks. Additionally, you may need information or
   a favor from the agency involved in the incident. You are likely to
   get what you need only if you have been cooperative. It is
   particularly important to avoid unnecessary or unauthorized
   disclosure of information about the incident, including any
   information furnished by the agency involved. The trust between your
   site and the agency hinges upon your ability to avoid compromising
   the case the agency will build; keeping "tight lipped" is imperative.

   Sometimes your needs and the needs of an investigative agency will
   differ.
   Your site may want to get back to normal business by closing an
   attack route, but the investigative agency may want you to keep this
   route open. Similarly, your site may want to close a compromised
   system down to avoid the possibility of negative publicity, but
   again the investigative agency may want you to continue monitoring.
   When there is such a conflict, there may be a complex set of
   tradeoffs (e.g., interests of your site's management, amount of
   resources you can devote to the problem, jurisdictional boundaries,
   etc.). An important guiding principle is related to what might be
   called "Internet citizenship" [22, IAB89, 23 (xxx old references)]
   and its responsibilities. Your site can shut a system down, and this
   will relieve you of the stress, resource demands, and danger of
   negative exposure. The attacker, however, is likely to simply move
   on to another system, temporarily leaving others blind to the
   attacker's intention and actions until another path of attack can be
   detected. Provided that there is no damage to your systems and
   others, the most responsible course of action is to cooperate with
   the participating agency by leaving your compromised system on. This
   will allow monitoring (and, ultimately, the possibility of
   terminating the source of the threat to systems just like yours). On
   the other hand, if there is damage to computers illegally accessed
   through your system, the choice is complicated: shutting down the
   intruder may prevent further damage to systems, but might make it
   impossible to track down the intruder. If there has been damage, the
   decision about whether it is important to leave systems up to catch
   the intruder should involve all the organizations affected.
   Further complicating the issue of network responsibility is the
   consideration that if you do not cooperate with an agency, you will
   be less likely to receive help from that agency in the future.

5.2.3 Computer Security Incident Handling Teams

   There now exist a number of Computer Security Incident Handling
   teams (CSIH teams) such as the CERT Coordination Center and the CIAC
   or other teams around the globe. Teams exist for many major
   government agencies and large corporations. If such a team is
   available, notifying it should be of primary importance during the
   early stages of an incident. These teams are responsible for
   coordinating computer security incidents over a range of sites and
   larger entities. Even if the incident is believed to be contained
   within a single site, it is possible that the information available
   through a response team could help in closing out the incident.

   If it is determined that the breach occurred due to a flaw in the
   systems' hardware or software, the vendor (or supplier) and a
   Computer Security Incident Handling team should be notified as soon
   as possible. This is especially important because many other
   systems may be vulnerable, too.

   In setting up a site policy for incident handling, it may be
   desirable to create a subgroup, much like those teams that already
   exist, that will be responsible for handling computer security
   incidents for the site (or organization). If such a team is created,
   it is essential that communication lines be opened between this team
   and other teams. Once an incident is under way, it is difficult to
   open a trusted dialogue with other teams if none has existed
   before. (See [RFC xxx] for more information about the considerations
   for creating your own incident handling team.)

5.2.4 Affected and Involved Sites

   If an incident has an impact on other sites, it is good practice to
   inform them. It may be obvious from the beginning that the incident
   is not limited to the local site, or it may emerge only after further
   analysis.

   Each site might choose to contact other sites directly, or it can
   pass the information to an appropriate incident response team to
   which the involved site belongs. As it is often very difficult to
   find the responsible POC at remote sites, the involvement of an
   incident response team will facilitate contact by making use of
   already established channels.

   The legal and liability issues arising from a security incident may
   differ from site to site. It is important to define a policy for the
   sharing and logging of information about other sites before an
   incident occurs. This policy should be crafted in consultation with
   legal counsel.

   Information about specific people is especially sensitive, and is
   subject to privacy laws. To avoid problems in this area, irrelevant
   information should be deleted and a statement of how to handle the
   remaining information should be included. A clear statement of how
   this information is to be used is essential. (No one who informs a
   site of a security incident would like to read in the newspaper that
   they also had a problem.) Incident response teams are valuable in
   this respect. As they pass information to responsible POCs, they are
   able to protect the anonymity of the original source. But be aware
   that in many cases the analysis of logs and information at other
   sites will reveal addresses of your site.

   None of the problems discussed should be seen as a reason not to
   involve other sites. In fact, the experiences of existing teams
   reveal that most sites informed about security problems are not even
   aware that their site had been compromised.
   Without timely information other sites are often unable to take
   measures against intruders.

5.2.5 Public Relations - Press Releases

   One of the most important issues to consider is when to release
   information to the general public through the press, who should
   release it, and how much to release. There are many issues to
   consider when deciding this. First and foremost, if a public
   relations office exists for the site, it is important to use this
   office as liaison to the press. The public relations office is
   trained in the type and wording of information released, and will
   help to assure that the image of the site is protected during and
   after the incident (if possible). A public relations office has the
   advantage that you can communicate candidly with it, and it provides
   a buffer between the constant press attention and the need of the
   POC to maintain control over the incident.

   If a public relations office is not available, the information
   released to the press must be carefully considered. If the
   information is sensitive, it may be advantageous to provide only
   minimal or overview information to the press. It is quite possible
   that any information provided to the press will be quickly reviewed
   by the perpetrator of the incident. In contrast to this
   consideration, it was discussed above that misleading the press can
   often backfire and cause more damage than releasing sensitive
   information.

   While it is difficult to determine in advance what level of detail to
   provide to the press, some guidelines to keep in mind are:

   (1) Keep the technical level of detail low. Detailed
       information about the incident may provide enough
       information for copy-cat events or even damage the
       site's ability to prosecute once the event is over.
   (2) Keep the speculation out of press statements.
       Speculation about who is causing the incident or about
       the motives is very likely to be in error and may cause
       an inflamed view of the incident.
   (3) Work with law enforcement professionals to assure that
       evidence is protected. If prosecution is involved,
       assure that the evidence collected is not divulged to
       the press.
   (4) Try not to be forced into a press interview before you are
       prepared. The popular press is famous for the "2am"
       interview, where the hope is to catch the interviewee off
       guard and obtain information otherwise not available.
   (5) Do not allow the press attention to detract from the
       handling of the event. Always remember that the successful
       closure of an incident is of primary importance.

5.3 Identifying an Incident

5.3.1 Is it real?

   This stage involves determining whether a problem really exists. Of
   course many, if not most, signs often associated with virus
   infections, system intrusions, malicious users, etc., are simply
   anomalies such as hardware failures or suspicious system/user
   behavior. To assist in identifying whether there really is an
   incident, it is usually helpful to obtain and use any detection
   software which may be available. For example, widely available
   software packages can greatly assist someone who thinks there may be
   a virus in a personal computer. Audit information is also extremely
   useful, especially in determining whether there is a network attack.
   It is extremely important to obtain a system snapshot as soon as one
   suspects that something is wrong. Many incidents cause a dynamic
   chain of events to occur, and an initial system snapshot may do more
   good in identifying the problem and any source of attack than most
   other actions which can be taken at this stage. Finally, it is
   important to start a log book.
   Recording system events, telephone conversations, time stamps, etc.,
   can lead to a more rapid and systematic identification of the
   problem, and is the basis for subsequent stages of incident
   handling.

   There are certain indications or "symptoms" of an incident which
   deserve special attention:

   (1) System crashes.
   (2) New user accounts (e.g., the account RUMPLESTILTSKIN
       has inexplicably been created), or high activity on
       an account that has had virtually no activity for
       months.
   (3) New files (usually with novel or strange file names,
       such as data.xx or k).
   (4) Accounting discrepancies (e.g., in a UNIX system you
       might notice that the accounting file called
       /usr/admin/lastlog has shrunk, something that should
       make you very suspicious that there may be an
       intruder).
   (5) Changes in file lengths or dates (e.g., a user should
       be suspicious if he/she observes that the .EXE files in
       an MS DOS computer have inexplicably grown
       by over 1800 bytes).
   (6) Attempts to write to system (e.g., a system manager
       notices that a privileged user in a VMS system is
       attempting to alter RIGHTSLIST.DAT).
   (7) Data modification or deletion (e.g., files start to
       disappear).
   (8) Denial of service (e.g., a system manager and all
       other users become locked out of a UNIX system, which
       has been changed to single user mode).
   (9) Unexplained, poor system performance (e.g., system
       response time becomes unusually slow).
   (10) Anomalies (e.g., "GOTCHA" is displayed on a display
       terminal or there are frequent unexplained "beeps").
   (11) Suspicious probes (e.g., there are numerous
       unsuccessful login attempts from another node).
   (12) Suspicious browsing (e.g., someone becomes a root user
       on a UNIX system and accesses file after file in one
       user's account, then another's).

   By no means is this list comprehensive.
   We have just listed a number of common indicators. Furthermore,
   none of these indications is absolute "proof" that an incident is
   occurring, nor are all of these indications normally observed when
   an incident occurs. If you observe any of these indications,
   however, it is important to suspect that an incident might be
   occurring, and act accordingly. There is no formula for determining
   with 100 percent accuracy that an incident is occurring. It is best
   at this point to collaborate with other technical and computer
   security personnel to make a decision as a group about whether an
   incident is occurring.

5.3.2 Types and Scope of Incidents

   Along with the identification of the incident is the evaluation of
   the scope and impact of the problem. It is important to correctly
   identify the boundaries of the incident in order to effectively deal
   with it. In addition, the impact of an incident will determine its
   priority in allocating resources to deal with the event. Without an
   indication of the scope and impact of the event, it is difficult to
   determine a correct response.

   In order to identify the scope and impact, a set of criteria should
   be defined which is appropriate to the site and to the type of
   connections available. Some of the issues are:

   (1) Is this a multi-site incident?
   (2) Are many computers at your site affected by this
       incident?
   (3) Is sensitive information involved?
   (4) What is the entry point of the incident (network,
       phone line, local terminal, etc.)?
   (5) Is the press involved?
   (6) What is the potential damage of the incident?
   (7) What is the estimated time to close out the incident?
   (8) What resources could be required to handle the incident?
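
   One way to keep the scope criteria above in view during an incident
   is to record the answers in a small structure that shows which
   questions are still open. The Python sketch below is illustrative
   only: the class name ScopeAssessment and all of its field names are
   assumptions made for this example, and it deliberately computes no
   severity score, since each site must define its own criteria.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ScopeAssessment:
    """Answers to the eight scope questions; None means not yet answered.

    All names here are hypothetical and chosen only for illustration.
    """
    multi_site: Optional[bool] = None              # (1)
    many_hosts_affected: Optional[bool] = None     # (2)
    sensitive_information: Optional[bool] = None   # (3)
    entry_point: Optional[str] = None              # (4) e.g. "network"
    press_involved: Optional[bool] = None          # (5)
    potential_damage: Optional[str] = None         # (6) free text
    estimated_time_to_close: Optional[str] = None  # (7) free text
    resources_required: Optional[str] = None       # (8) free text

    def unanswered(self) -> List[str]:
        """Return the names of the questions that are still open."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


if __name__ == "__main__":
    assessment = ScopeAssessment(multi_site=True, entry_point="network")
    for name in assessment.unanswered():
        print("still open:", name)
```

   Keeping unanswered questions explicit, rather than defaulting them
   to "no", avoids understating the scope while the facts are still
   being gathered.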

5.3.3 Assessing the Damage and Extent

   The analysis of the damage and extent of the incident can be quite
   time consuming, but should yield some insight into the nature of the
   incident, and aid investigation and prosecution.

   As soon as the breach has occurred, the entire system and all its
   components should be considered suspect. System software is the most
   probable target. Preparation is key to being able to detect all
   changes to a possibly tainted system. This includes checksumming all
   tapes from the vendor using a checksum algorithm which (hopefully)
   is resistant to tampering [10]. (See sections xxx.) Assuming
   original vendor distribution tapes are available, an analysis of all
   system files should commence, and any irregularities should be noted
   and referred to all parties involved in handling the incident. It
   can be very difficult, in some cases, to decide which backup tapes
   show a correct system status; consider that the incident may have
   continued for months or years before discovery, and that the suspect
   may be an employee of the site, or otherwise have intimate knowledge
   of or access to the systems. In all cases, the pre-incident
   preparation will determine what recovery is possible. If the system
   supports centralized logging (most do), go back over the logs and
   look for abnormalities. If process accounting and connect time
   accounting are enabled, look for patterns of system usage. To a
   lesser extent, disk usage may shed light on the incident.
   Accounting can provide much helpful information in an analysis of an
   incident and subsequent prosecution.

   Whether you can address all aspects of a specific incident depends
   strongly on the success of this analysis. This also affects the
   efficiency of the incident handling process.
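
   The comparison of system files against a known-good baseline,
   described earlier in this section, can be sketched in Python as
   follows. This is a minimal illustration under stated assumptions,
   not a complete integrity checker: SHA-256 is chosen here only as an
   example of a tamper-resistant checksum (the text above does not name
   an algorithm), the function names are hypothetical, and the baseline
   manifest is assumed to have been built from trusted media and stored
   where an intruder cannot alter it.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map every regular file under root (by relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(baseline, current):
    """Return (added, removed, changed) relative paths, each list sorted."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changed = sorted(
        name
        for name in set(baseline) & set(current)
        if baseline[name] != current[name]
    )
    return added, removed, changed
```

   In use, build_manifest would be run once against the trusted
   distribution and again against the suspect system, and every path
   reported by compare_manifests would be noted as an irregularity.
   The suspect system's own tools may be compromised, so the comparison
   is more trustworthy when run from separate, known-good media.
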
   Review the lessons learned from the analysis and always update the
   policy and procedures to reflect changes necessitated by the
   incident.

5.4 Handling an Incident

   Certain steps are necessary during the handling of an incident. In
   all security-related activities, the most important point to be made
   is: One should have policies in place. Without defined goals,
   activities undertaken will remain without focus. One of the most
   fundamental objectives is to restore control of the affected systems
   and to limit the impact and damage. In the worst case scenario,
   shutting down the system or disconnecting the system from the network
   may be the only practical solution.

   As the activities involved are complex, try to get as much help as
   necessary. If you try to solve the problem alone, real damage might
   occur due to delays or missing information. Most system
   administrators take the discovery of an intruder as a personal
   challenge. By proceeding this way, other objectives as outlined in
   the local policies may not always be considered. Trying to catch
   intruders may be a very low priority, compared to system integrity,
   for example. One other pitfall is the premise that by monitoring the
   activities of a hacker, other sites can be warned about problems they
   have been exposed to. By allowing the intruder to (ab)use the local
   system, a site may be incurring legal liability or indirect
   responsibility for the damage caused to other sites.

   For all the reasons outlined above, it is necessary to establish
   procedures for the incident handling to allow the technical
   administrator to do the "right" thing, as identified by management
   and legal counsel.

5.4.1 Types of notification, Exchange of information

   When you have confirmed that an incident is occurring, the
   appropriate personnel must be notified.
   How this notification is achieved is very important in keeping the
   event under control from both a technical and an emotional
   standpoint. The circumstances should be described in as much detail
   as possible, in order to aid prompt acknowledgment and understanding
   of the problem. Great care should be taken in deciding which groups
   receive detailed technical information during the notification. For
   example, it is helpful to pass this kind of information to an
   incident handling team. They can assist you by providing helpful
   hints for eradicating the vulnerabilities involved in an incident.
   On the other hand, putting the critical knowledge into the public
   domain (e.g., netnews, mailing lists) may potentially put a great
   number of systems at risk of intrusion. It is wrong to assume that
   all administrators read a particular newsgroup, have access to
   operating system source code, or can even understand the techniques
   well enough to take adequate steps.

   First of all, any notification to either local or off-site personnel
   must be explicit. This requires that any statement (be it an
   electronic mail message, phone call, or fax) provides information
   about the incident that is clear, concise, and fully qualified. When
   you are notifying others who will help you handle an event, a
   "smoke screen" will only divide the effort and create confusion. If
   a division of labor is suggested, it is helpful to provide
   information to each section about what is being accomplished in other
   efforts. This will not only reduce duplication of effort, but also
   allow people working on parts of the problem to know where to obtain
   other information that would help them resolve a part of the
   incident.

   Another important consideration when communicating about the incident
   is to be factual.
   Attempting to hide aspects of the incident by providing false or
   incomplete information may not only prevent a successful resolution
   to the incident, but may even worsen the situation. This is
   especially true when the press is involved. When an incident severe
   enough to gain press attention is ongoing, it is likely that any
   false information you provide will not be substantiated by other
   sources. This will reflect badly on the site and may create enough
   ill-will between the site and the press to damage the site's public
   relations.

   The choice of language used when notifying people about the incident
   can have a profound effect on the way that information is received.
   When you use emotional or inflammatory terms, you raise the
   expectations of damage and negative outcomes of the incident. It is
   important to remain calm in both written and spoken notifications.
   Another aspect of the choice of language is that not all people
   speak the same language. As a result, misunderstandings and delays
   may arise, especially if it is a multi-national incident.

   Other international concerns include differing legal implications of
   a security incident and cultural differences. Cultural differences
   do not exist only between countries like Germany and Australia. They
   even exist within countries between different social groups or user
   groups. An administrator of a university system might be very
   relaxed about attempts to connect to the system via telnet, but the
   administrator of a military system will follow his procedures and
   consider the same action as a possible attack.

   Another issue associated with the choice of language is the
   notification of non-technical or off-site personnel. It is important
   to accurately describe the incident without undue alarm or confusing
   messages.
   While it is more difficult to describe the incident to a
   non-technical audience, it is often more important. A non-technical
   description may be required for upper-level management, the press, or
   law enforcement liaisons. The importance of these notifications
   cannot be overestimated and may make the difference between handling
   the incident properly and escalating to some higher level of damage.

   If an IRT becomes involved, it might be necessary to fill out a
   template for the information exchange. Although this may seem to be
   an additional burden and adds a certain delay, it helps the team to
   act on this minimum set of information. The response team may be
   able to respond to aspects of the incident which the local
   administrator is unaware of.

   If information is given out to someone else, the following minimum
   information should be provided:

      (1) timezone of logs, ... in GMT or local time
      (2) information about the remote system (including host names,
          IP addresses and user ids)
      (3) all log entries relevant for the remote site

   If local information (i.e., local user ids) is included in the log
   entries, it might be necessary to sanitize the entries beforehand
   to avoid privacy issues. In general, all information which might
   assist a remote site in resolving an incident should be given out,
   unless local policies prohibit this.

5.4.2 Protection of evidence and activity logs

   When you respond to an incident, document all details related to the
   incident. This will provide valuable information to yourself and
   others as you try to unravel the course of events. Documenting all
   details will ultimately save you time. If you don't document every
   relevant phone call, for example, you are likely to forget a good
   portion of information you obtain, requiring you to contact the
   source of information once again.
   This wastes your time and others', something you can ill afford. At
   the same time, recording details will provide evidence for
   prosecution efforts, provided the case moves in this direction.
   Documenting an incident also will help you perform a final assessment
   of damage (something your management as well as law enforcement
   officers will want to know), and will provide the basis for a
   follow-up analysis in which you can engage in a valuable "lessons
   learned" exercise. Additionally, it will help during later phases of
   the handling process, especially during eradication and recovery.

   During the initial stages of an incident, it is often infeasible to
   determine whether prosecution is viable, so you should document as if
   you are gathering evidence for a court case. At a minimum, you
   should record:

      (1) All system events (audit records).
      (2) All actions you take (time tagged).
      (3) All phone conversations (including the person with whom
          you talked, the date and time, and the content of the
          conversation).

   The most straightforward way to maintain documentation is keeping a
   log book. This allows you to go to a centralized, chronological
   source of information when you need it, instead of requiring you to
   page through individual sheets of paper. Much of this information is
   potential evidence in a court of law. Thus, when you initially
   suspect that an incident will result in prosecution or when an
   investigative agency becomes involved, you need to regularly (e.g.,
   every day) turn in photocopied, signed copies of your logbook (as
   well as media you use to record system events) to a document
   custodian who can store these copied pages in a secure place (e.g., a
   safe). When you submit information for storage, you should in return
   receive a signed, dated receipt from the document custodian.
   Failure to observe these procedures can result in invalidation of any
   evidence you obtain in a court of law.

5.4.3 Containment

   The purpose of containment is to limit the extent of an attack. For
   example, it is important to limit the spread of a worm attack on a
   network as quickly as possible. An essential part of containment is
   decision making (i.e., determining whether to shut a system down, to
   disconnect from a network, to monitor system or network activity, to
   set traps, to disable functions such as remote file transfer on a
   UNIX system, etc.). Sometimes this decision is trivial: shut the
   system down if the system is life-critical, classified or sensitive,
   or if proprietary information is at risk! In other cases, it is
   worthwhile to risk having some damage to the system if keeping the
   system up might enable you to identify an intruder.

   This stage should involve carrying out predetermined procedures.
   Your organization or site should, for example, define acceptable
   risks in dealing with an incident, and should prescribe specific
   actions and strategies accordingly. This is especially important
   when a quick decision is necessary and there is no opportunity to
   contact all involved parties and discuss the decision. In most cases
   the person in charge will not have the power to make a difficult
   management decision (such as losing the results of a costly
   experiment). Finally, notification of cognizant authorities should
   occur during this stage.

   In some cases, it is prudent to remove all access or functionality as
   soon as possible, and then restore normal operation in limited
   stages. Bear in mind that removing all access while an incident is
   in progress will obviously notify all users, including the alleged
   problem users, that the administrators are aware of a problem; this
   may have a deleterious effect on an investigation.

5.4.4 Eradication

   Once an incident has been detected, it is important to first think
   about containing the incident. Once the incident has been contained,
   it is time to eradicate the cause. But before eradicating the cause,
   great care should be taken to collect all necessary information
   about the compromised system and the cause of the incident, because
   this information is likely to disappear during the eradication.

   Software may be available to help you in the eradication process. For
   example, eradication software is available to eliminate most viruses
   which infect small systems. If any bogus files have been created,
   archive them before deleting them from the system, in case they are
   later needed in court. In the case of virus infections, it is
   important to clean and reformat any disks containing infected files.
   Finally, ensure that all backups are clean. Many systems infected
   with viruses become periodically reinfected simply because people do
   not systematically eradicate the virus from backups. After
   eradication, a new backup should also be taken.

   Removing all vulnerabilities once an incident has occurred is
   difficult. The key to removing vulnerabilities is knowledge and
   understanding of the breach.

   It may be necessary to go back to the original distributed tapes and
   recustomize the system. To facilitate this worst case scenario, a
   record of the original system setup and each customization change
   should be kept current with each change to the system. In the case
   of a network-based attack, it is important to install patches for any
   operating system vulnerability which was exploited.

   As discussed in section $$.4.2, a security log can be most valuable
   during this phase of removing vulnerabilities.
   There are two considerations here. The first is to keep logs of the
   procedures that have been used to make the system secure again. This
   should include command procedures (e.g., shell scripts) that can be
   run on a periodic basis to recheck the security. Second, keep logs
   of important system events. These can be referenced when trying to
   determine the extent of the damage of a given incident.

   If a particular vulnerability is isolated as having been exploited,
   the next step is to find a mechanism to protect your system. The
   security mailing lists and bulletins are a good place to search for
   this information, and you can get advice from incident response
   teams.

5.4.5 Recovery

   Once the cause of an incident has been eradicated, the recovery phase
   defines the next stage of action. The goal of recovery is to return
   the system to normal. In general, bringing up services in the order
   of demand to allow a minimum of user inconvenience is the best
   practice. Understand that the proper recovery procedures for the
   system are extremely important and should be specific to the site.

5.4.6 Follow-Up

   Once you believe that a system has been restored to a "safe" state,
   it is still possible that holes and even traps could be lurking in
   the system. One of the most important stages of responding to
   incidents is also the most often omitted: the follow-up stage. In
   the follow-up stage, the system should be monitored for items that
   may have been missed during the cleanup stage. It would be prudent
   to utilize some of the tools mentioned in section xxx (e.g., xxx) as
   a start. Remember, these tools don't replace continual system
   monitoring and good systems administration procedures.
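
   The periodic recheck and follow-up monitoring described above could
   be sketched as a file checksum comparison against a baseline taken
   from a known-good system. This is only an illustrative sketch, not a
   replacement for the tools referenced in this document. The function
   names are hypothetical, and the choice of SHA-256 is an assumption
   made here; the handbook itself only asks for a checksum algorithm
   that is resistant to tampering.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()  # assumption: SHA-256 as the checksum
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every regular file under root (by relative path) to its digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(
    baseline: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """Report files added, removed, or changed since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            name
            for name in baseline.keys() & current.keys()
            if baseline[name] != current[name]
        ),
    }
```

   A baseline manifest is only useful if an intruder cannot alter it, so
   it should be kept off the monitored system (for example, on
   write-protected media). Any entry reported as added, removed, or
   changed is a lead to investigate, not proof of tampering.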

   The follow-up stage is important for another reason, too, because it
   helps those involved in handling the incident develop a set of
   "lessons learned" (see section $$.5) to improve future performance in
   such situations. This stage also provides information which
   justifies an organization's computer security effort to management,
   and yields information which may be essential in legal proceedings.

   The most important element of the follow-up stage is performing a
   postmortem analysis. Exactly what happened, and at what times? How
   well did the staff involved with the incident perform? What kind of
   information did the staff need quickly, and how could they have
   gotten that information as soon as possible? What would the staff do
   differently next time? A follow-up report is valuable because it
   provides a reference to be used in case of other similar incidents.
   Creating a formal chronology of events (including time stamps) is
   also important for legal reasons. Similarly, it is important to
   obtain, as quickly as possible, a monetary estimate of the amount of
   damage the incident caused in terms of any loss of software and
   files, hardware damage, and manpower costs to restore altered files,
   reconfigure affected systems, and so forth. This estimate may become
   the basis for subsequent prosecution activity.

5.5 Aftermath of an Incident

   In the wake of an incident, several actions should take place. These
   actions can be summarized as follows:

      (1) An inventory should be taken of the systems' assets,
          i.e., a careful examination should determine how the
          system was affected by the incident,

      (2) The lessons learned as a result of the incident
          should be included in a revised security plan to
          prevent the incident from re-occurring,

      (3) A new risk analysis should be developed in light of the
          incident,

      (4) An investigation and prosecution of the individuals
          who caused the incident should commence, if it is
          deemed desirable.

   All four steps should provide feedback to the site security policy
   committee, leading to prompt re-evaluation and amendment of the
   current policy.

   If an incident is based on poor policy, then unless the policy is
   changed, one is doomed to repeat the past. Once a site has
   recovered from an incident, site policy and procedures should be
   reviewed to encompass changes to prevent similar incidents. Even
   without an incident, it would be prudent to review policies and
   procedures on a regular basis. Reviews are imperative due to today's
   changing computing environments.

   After an incident, it is prudent to write a report describing the
   incident, method of discovery, correction procedure, monitoring
   procedure, and a summary of lessons learned. This will aid in the
   clear understanding of the problem. Remember, it is difficult to
   learn from an incident if you don't understand the source.

   The whole purpose of this "post mortem" process is to improve all
   security measures to protect the site against future attacks. In the
   light of an incident, one should gather practical knowledge from the
   experience. This should improve one's ability to detect the
   occurrence of similar problems in the future. A concrete goal is
   developing proactive methods, for example, early warning by probes.
   Another important facet of the aftermath is more related to end user
   and administrator awareness and education.
   By reviewing the actual incident handling effort, the process can be
   improved and extended to reflect new lessons learned. All this will
   help the site in the handling of future incidents even if a
   completely different kind of attack occurs.

5.6 Responsibilities

   It is one thing to protect one's own network, but quite another to
   assume that one should protect other networks. During the handling
   of an incident, certain system vulnerabilities of one's own systems
   and the systems of others become apparent. It is quite easy and may
   even be tempting to pursue the intruders in order to track them.
   Keep in mind that at a certain point it is possible to 'cross the
   line,' and with the best intentions, become no better than the
   intruder.

   The best rule when it comes to propriety is to not use any facility
   of remote sites which is not public. This clearly excludes any entry
   onto a system (such as a remote shell or login session) which is not
   expressly permitted. This may be very tempting; after a breach of
   security is detected, a system administrator may have the means to
   'follow it up,' to ascertain what damage is being done to the remote
   site. Don't do it. Instead, attempt to reach the POC of the affected
   site.

6. MAINTENANCE and EVALUATION

6.1 Risk assessments

6.2 Notification of problems/events

APPENDICES

A1 Tools and Locations

   This section provides a brief overview of publicly available security
   technology which can be downloaded from the Internet. Many of the
   items described below will undoubtedly be surpassed or made obsolete
   before this document is published. This section is divided into two
   major subsections, applications and tools. The applications heading
   will include all end user programs (clients) and their supporting
   system infrastructure (servers).
   The tools heading will deal with the tools that a general user will
   never see or need to use, but which may be part of or used by
   applications, or used by system and network administrators to
   troubleshoot security problems or guard against intruders.

   The emphasis will be on UNIX applications and tools, but other
   platforms, particularly PCs and Macintoshes, will be mentioned where
   information is available.

   Most of the tools and applications described below can be found in
   one of the following archive sites:

      (1) CERT Coordination Center
          ftp://info.cert.org:/pub/tools
      (2) DFN-CERT
          ftp://ftp.cert.dfn.de/pub/tools/
      (3) Computer Operations, Audit, and Security Tools (COAST)
          coast.cs.purdue.edu:/pub/tools

   Any references to CERT or COAST will refer to these two locations.
   These two sites act as repositories for most tools; exceptions will
   be noted in the text. *** It is important to note that many sites,
   including CERT and COAST, are mirrored throughout the Internet. Be
   careful to use a "well known" mirror site to retrieve software and to
   use whatever verification tools are possible (checksums, MD5
   checksums, etc.) to validate that software. A clever cracker might
   advertise security software with designed flaws in order to gain
   access to data or machines. ***

   Applications

   The sad truth is that there are very few security conscious
   applications currently available. The real reason is the need for a
   security infrastructure which must first be put into place for most
   applications to operate securely. There is considerable effort
   currently taking place to put this infrastructure in place so that
   applications can take advantage of secure communications.
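
   The earlier advice to validate software retrieved from an archive or
   mirror site can be illustrated with a short sketch that compares a
   downloaded file against a digest published by the software's
   maintainers. The function name and the default of SHA-256 are
   assumptions made for this example; the text above names checksums
   and MD5 checksums, and the algorithm argument accepts any name known
   to Python's hashlib.

```python
import hashlib
import hmac


def matches_published_digest(
    path: str, published_hex: str, algorithm: str = "sha256"
) -> bool:
    """Return True if the file at path hashes to the published hex digest.

    The digest is assumed to have been obtained from a source other than
    the mirror that served the file itself.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    # Normalize whitespace and case before comparing the hex strings.
    return hmac.compare_digest(
        digest.hexdigest(), published_hex.strip().lower()
    )
```

   A matching digest only shows that the file is the one its publisher
   described. It offers little protection if the digest was fetched
   from the same, possibly compromised, site as the software.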

   Unix based applications

      PGP
      MD5
      S/KEY
      TROJAN.PL
      PEM
      KERBEROS
      Drawbridge
      Tripwire
      logdaemon
      TCP-Wrapper
      rpcbind/portmapper replacement
      cops
      tiger
      ISS
      SATAN
      smrsh
      swatch
      identd (not really a security tool)
      DES (non-US versions)
      lsof
      sfingerd
      passwd-replacements (npasswd / ANLpasswd / passwd+ / ...)

A2 Mailing lists and other resources

   It would be impossible to list all of the mailing lists and other
   resources dealing with site security. However, these are some
   "jump-points" from which the reader can begin. All of these
   references are for the "INTERNET" constituency. More specific
   (vendor and geographical) resources can be found through these
   references.

   Mailing Lists

   (1) CERT Advisory
       Send mail to: cert-advisory-request@cert.org
       Message Body: subscribe cert FIRSTNAME LASTNAME

       A CERT advisory provides information on how to obtain a patch or
       details of a workaround for a known computer security problem.
       The CERT Coordination Center works with vendors to produce a
       workaround or a patch for a problem, and does not publish
       vulnerability information until a workaround or a patch is
       available. A CERT advisory may also be a warning to our
       constituency about ongoing attacks (e.g.,
       "CA-91:18.Active.Internet.tftp.Attacks").

       CERT advisories are also published on the USENET newsgroup:

          comp.security.announce

       CERT advisory archives are available via anonymous FTP from
       info.cert.org in the /pub/cert_advisories directory.

   (2) CERT Tools Mailing List
       Send mail to: cert-tools-request@cert.sei.cmu.edu
       Message Body: subscribe cert-tools FIRSTNAME LASTNAME

       The purpose of this moderated mailing list is to
       encourage the exchange of information on security
       tools and techniques. The list should not be used
       for security problem reports.

   (3) VIRUS-L List
       Send mail to: listserv%lehiibm1.bitnet@mitvma.mit.edu
       Message Body: subscribe virus-L FIRSTNAME LASTNAME

       VIRUS-L is a moderated mailing list with a focus
       on computer virus issues. For more information,
       including a copy of the posting guidelines, see
       the file "virus-l.README", available by anonymous
       FTP from cs.ucr.edu.

   (4) Academic Firewalls
       Send mail to: majordomo@greatcircle.com
       Message Body: subscribe firewalls user@host

       The Firewalls mailing list is a discussion forum for
       firewall administrators and implementors.

   USENET newsgroups

   (1) comp.security.announce
       The comp.security.announce newsgroup is moderated
       and is used solely for the distribution of CERT
       advisories.

   (2) comp.security.misc
       The comp.security.misc newsgroup is a forum for the
       discussion of computer security, especially as it
       relates to the UNIX(r) Operating System.

   (3) alt.security
       The alt.security newsgroup is also a forum for the
       discussion of computer security, as well as other
       issues such as car locks and alarm systems.

   (4) comp.virus
       The comp.virus newsgroup is a moderated newsgroup
       with a focus on computer virus issues. For more
       information, including a copy of the posting
       guidelines, see the file "virus-l.README",
       available via anonymous FTP on info.cert.org
       in the /pub/virus-l directory.

   (5) comp.risks
       The comp.risks newsgroup is a moderated forum on
       the risks to the public in computers and related
       systems.

   World-Wide Web Pages

   (1) http://www.first.org/

       Computer Security Resource Clearinghouse. The main focus is on
       crisis response information; information on computer
       security-related threats, vulnerabilities, and solutions.
       At the same time, the Clearinghouse strives to be a general
       index to computer security information on a broad variety of
       subjects, including general risks, privacy, legal issues,
       viruses, assurance, policy, and training.

   (2) http://www.telstra.com.au/info/security.html

       This Reference Index contains a list of links to information
       sources on Network and Computer Security. There is no implied
       fitness to the Tools, Techniques and Documents contained within
       this archive. Many if not all of these items work well, but we
       do not guarantee that this will be so. This information is for
       the education and legitimate use of computer security techniques
       only.

   (3) http://www.alw.nih.gov/Security/security.html

       This page features general information about computer security.
       Information is organized by source and each section is organized
       by topic. Recent modifications are noted in the What's New page.

Editor Information

   Barbara Y. Fraser
   Software Engineering Institute
   Carnegie Mellon University
   5000 Forbes Avenue
   Pittsburgh, PA 15213

   Phone: (412) 268-5010
   Fax: (412) 268-6989
   email: byf@cert.org