idnits 2.17.1 draft-jevans-phishing-xml-00.txt: -(1112): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding -(1275): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1.a on line 17. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1342. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1319. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1326. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1332. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667. 
By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
== There are 3 instances of lines with non-ascii characters in the document.
== No 'Intended status' indicated for this document; assuming Proposed Standard

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** There are 16 instances of too long lines in the document, the longest one being 38 characters in excess of 72.
** The abstract seems to contain references ([RFC2119]), which it shouldn't. Please replace those with straight textual mentions of the documents in question.

Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.)
-- The document date (December 14, 2004) is 7066 days in the past. Is this intentional?
-- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines.
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'IDMEF' is defined on line 580, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'IDMEF' -- Possible downref: Non-RFC (?) normative reference: ref. 'INCH' Summary: 7 errors (**), 0 flaws (~~), 4 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 INCH D. Jevans 2 Internet-Draft The Anti-Phishing Working Group 3 Expires: June 14, 2005 P. Cain, Ed. 4 The Cooper-Cain Group 5 December 14, 2004 7 Extension to IODEF-Document Class for Phishing Reports 8 draft-jevans-phishing-xml-00 10 Status of this Memo 12 This document is an Internet-Draft and is subject to all provisions 13 of section 3 of RFC 3667. By submitting this Internet-Draft, each 14 author represents that any applicable patent or other IPR claims of 15 which he or she is aware have been or will be disclosed, and any of 16 which he or she become aware will be disclosed, in accordance with 17 RFC 3668. 19 Internet-Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its areas, and its working groups. Note that 21 other groups may also distribute working documents as 22 Internet-Drafts. 24 Internet-Drafts are draft documents valid for a maximum of six months 25 and may be updated, replaced, or obsoleted by other documents at any 26 time. It is inappropriate to use Internet-Drafts as reference 27 material or to cite them other than as "work in progress." 29 The list of current Internet-Drafts can be accessed at 30 http://www.ietf.org/ietf/1id-abstracts.txt. 
32 The list of Internet-Draft Shadow Directories can be accessed at
33 http://www.ietf.org/shadow.html.

35 This Internet-Draft will expire on June 14, 2005.

37 Copyright Notice

39 Copyright (C) The Internet Society (2004).

41 Abstract

43 Phishing, a broadly-launched social engineering attack in which an
44 electronic identity is misrepresented in an attempt to trick
45 individuals into revealing credentials, is expanding on the Internet.
46 Corporations, Service Providers, consumer agencies, and financial
47 institutions have started to collect and correlate phishing attack
48 information to better plan mitigation activities and to assist in
49 prosecution. Early on it became obvious that a common format for the
50 data reported or exchanged between these parties was necessary.

52 This document defines a data format for reporting phishing attacks
53 and sharing data between repositories of phishing attacks. The
54 format is an outgrowth of the Anti-Phishing Working Group (APWG)
55 activities in data sharing and is based upon the Incident Handling
56 Working Group's (INCH) XML-based format for sharing incident data.
57 Although we use the term "phishing attack", the data format is
58 flexible enough to support information gleaned from activities
59 throughout the entire phishing life cycle. The attack format is also
60 extensible enough to be used for other related reporting such as DNS
61 spoofs (e.g., localhost file takeover on PCs) and keyloggers typically
62 related to the phishing attack. The format supports very simple
63 reporting as well as optional fields for detailed reports, and
64 supports single phish reports as well as consolidated reports of
65 multiple phish reports.

67 RFC 2119 Keywords

69 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
70 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
71 document are to be interpreted as described in [RFC2119].

73 Table of Contents

75 1. Introduction
76    1.1 INCH Dependencies
77 2. Phishing Activity Reporting via an IODEF-Document Incident
78 3. PhishingReport Class Definition
79    3.1 Version parameter
80    3.2 PhishType parameter
81    3.3 PhishedBrandName element
82    3.4 DataCollectionType element
83    3.5 DataCollectionSite class
84    3.6 OriginatingSensor class
85    3.7 TakeDownInfo class
86    3.8 ArchivedData element
87    3.9 RelatedSites element
88    3.10 CorrelationData element
89    3.11 Comments element
90 4. Definition of PhishRecord class
91    4.1 PhishRecord class
92 5. IODEF Required Elements
93 6. Guidance on Usage
94 7. Sample Phishing Report
95 8. Security Considerations
96 9. IANA Considerations
97 10. Contributors
98 11. Normative References
99 Authors' Addresses
100 A. Phishing Data DTD
101 B. Example of a Complete Phishing Activity Report
102 C. Mapping from the APWG work into this Document
103    C.1 Overall Format
104    C.2 Header Format
105    C.3 Individual Report Format
106 D. Still To Do in This Document
107 Intellectual Property and Copyright Statements

109 1. Introduction

111 The accumulation and correlation of information is very important
112 when dealing with security incidents. In phishing attacks the source
113 of the attack may be forged and it's quite possible that the targeted
114 organization may not even be aware of the ongoing attack. Parties
115 aware of the attack may wish to notify the target, or an
116 organization's internal monitoring systems may detect the attack and
117 wish to take mitigation steps. Unfortunately, there is no recognized
118 standard way of expressing the detection of a phishing attack nor an
119 accepted way to exchange the required information. For an
120 organization that employs multiple anti-phishing technologies,
121 correlating data from multiple vendors or products is close to
122 impossible as the data is reported in multiple, mostly incompatible,
123 formats.

125 This document defines a data format that can be used to capture
126 relevant information from a phishing attack so that it can be shared,
127 correlated, or used to populate a database. Additionally, the use of
128 products that export information in this format will allow an
129 organization to correlate and analyze phishing information across the
130 organization, in effect sharing information with itself. Although targeted at
131 both the accumulation of phishing attack information from a single
132 institution and a means of sharing attack information between
133 cooperating parties, the actual information sharing process and
134 related political challenges are not covered in this document.
136 Instead of defining a report format and language from scratch, the
137 phishing activity information is encoded as an extension to the
138 INCH incident exchange format [INCH]. The use of an existing
139 and operational format allows for quicker vendor adoption and reuse
140 of existing tools in organizations. To reduce duplication and to be
141 compatible with modifications to the base IODEF definitions, this
142 document only identifies additional structures; the reader is
143 expected to have a copy of the IODEF documents handy while reading
144 this one.

146 In general, an incident report contains detailed incident-specific
147 data which populates an EventData structure. That EventData
148 structure is then incorporated, either singly or in aggregation,
149 with additional summary and contact data, into an Incident structure.
150 The populated Incident structure is what is reported as a Phishing
151 Activity Report.

153 Unsavory phishing activity may include multiple email messages,
154 attacks, or events, scattered over various times, locations, and
155 methodologies. As each of these activities may generate multiple
156 reports to an incident team, the Phishing Activity Report is composed
157 of multiple XML Incident classes. Each Incident class is used to
158 report one or more individual phishing reports and may include
159 multiple RecordData elements.

161 This document defines new attributes for the EventData and
162 RecordItem IODEF XML classes, then identifies attributes that are
163 required in a compliant Phishing Activity Report. The Appendices contain
164 sample Phishing Activity Reports and the complete Document Type
165 Definition.

167 1.1 INCH Dependencies

169 As discussions started to define a format for this information, it
170 became apparent that the output needed to do two things: include the
171 relevant data, and be supported by large numbers of vendors and products.
172 Instead of reinventing a basic reporting format, we selected the
173 IETF Incident Handling Working Group's (INCH) already-defined
174 XML-based attack data exchange models and formats.

176 The IODEF Extensions defined in this document comply with Section 4,
177 "Extending the IODEF Format", in [INCH].

179 2. Phishing Activity Reporting via an IODEF-Document Incident

181 A Phishing Activity Report is an instance of an IODEF-Document XML
182 Incident class [INCH] with added EventData and AdditionalData
183 classes. Some required information and many optional items are
184 populated into the new structure to form a Phishing Activity Report.
185 To facilitate completeness, the report originator should fill out as
186 many of the optional Incident fields as possible, but SHALL stay
187 consistent with the IODEF-Document structure.

189 This document defines the new Incident classes for the
190 AdditionalData, EventData, and RecordItem IODEF XML classes; then
191 identifies attributes that are required in a compliant Phishing
192 Activity Report. The Appendices contain sample Phishing Activity
193 Reports and the complete XML Document Type Definition and schema.

195 The Incident class is summarized below and provides a standardized
196 representation for commonly exchanged incident data and associates a
197 CSIRT-assigned unique identifier with the described activity.
199 +-------------------+
200 | Incident          |
201 +-------------------+
202 | ENUM purpose      |<>----------[ IncidentID ]
203 | ENUM restriction  |<>--{0..1}--[ AlternativeID ]
204 |                   |<>--{0..1}--[ RelatedActivity ]
205 |                   |<>--{0..*}--[ Description ]
206 |                   |<>--{1..*}--[ Assessment ]
207 |                   |<>--{0..*}--[ Method ]
208 |                   |<>--{0..1}--[ DetectTime ]
209 |                   |<>--{0..1}--[ StartTime ]
210 |                   |<>--{0..1}--[ EndTime ]
211 |                   |<>----------[ ReportTime ]
212 |                   |<>--{1..*}--[ Contact ]
213 |                   |<>--{0..*}--[ Expectation ]
214 |                   |<>--{0..1}--[ History ]
215 |                   |<>--{0..*}--[ EventData ] --> RecordData --> RecordItem --> PhishRecord added
216 |                   |<>--{0..*}--[ AdditionalData ] --> PhishingReport added
217 +-------------------+

219 Figure 1. The INCH XML Incident class (modified)

221 A Phishing Activity Report is composed of one Incident class,
222 containing one or more EventData attributes. This document defines a
223 PhishingReport class for the Incident.EventData.AdditionalData,
224 comprising phishing-related information that does not map to
225 existing Incident or EventData attributes. The following section
226 defines the new extensions specific to the Incident class EventData
227 and AdditionalData classes.

229 3. PhishingReport Class Definition

231 A PhishingReport consists of an Extension to the Incident
232 AdditionalData class, and is structured as follows.
234 +---------------------------+
235 | EventData.AdditionalData  |
236 +---------------------------+
237 | ENUM type (9 = xml)       |<>---------[ PhishingReport ]
238 | STRING meaning (xml)      |
239 +---------------------------+

241 +------------------------+
242 | PhishingReport         |
243 +------------------------+
244 | ENUM Version           |<>--(0..*)--[ PhishParameter ]
245 | ENUM PhishType         |----(0..*)--[ PhishedBrandName ]
246 |                        |<>--(0..*)--[ DataCollectionType ]
247 |                        |<>--(0..*)--[ DataCollectionSite ]
248 |                        |<>----------[ OriginatingSensor ]
249 |                        |<>--(0..*)--[ TakeDownInfo ]
250 |                        |<>--(0..*)--[ ArchivedData ]
251 |                        |<>--(0..*)--[ RelatedSites ]
252 |                        |<>--(0..*)--[ CorrelationData ]
253 |                        |----(0..1)--[ Comments ]
254 +------------------------+

256 Figure 2. The PhishingReport Extensions to the INCH XML
257 Incident.AdditionalData class

259 3.1 Version parameter

261 STRING. The version shall be the value 1.0 to be compliant with this
262 document.

264 3.2 PhishType parameter

266 ENUM. The PhishType attribute contains one of the following numbers
267 representing these types:
268    1. Email, and the PhishParameter is the email subject line of
269       the phishing email. This is a standard email phish, usually sent
270       by spam.
271    2. Fraudsite, no PhishParameter. This identifies a known
272       fraudulent site that does not necessarily send spam lures.
273    3. DNSspoof, with the malware name as the PhishParameter. This is
274       used for a spoofed DNS (e.g., malware changes the local hosts file
275       so that visits to www.example.com go to another IP address).

277    4. Keylogger, with the malware name as the PhishParameter.
278    5. OLE, no parameter. This identifies background OLE
279       information.
280    6. IM, no parameter.
281    7. CVE, with the CVE number as the PhishParameter.
282    8. SiteArchive, with the data archived from the phishing server
283       placed in the ArchivedData element.
284    9. Unknown.

286 When a PhishParameter is required, it is one of the following values:

288 SubjectLine element
289 STRING.
This is the subject line of the email lure.

291 MalwareName element
292 STRING. This is the name of the malware that installed the
293 keylogger or DNSspoofer.

295 CVENumber element
296 STRING. This is the CVE identifier of the exploit used to phish.

298 3.3 PhishedBrandName element

300 STRING. This is the identifier of the recognized brand name or
301 company name used to launch the phishing activity.

303 3.4 DataCollectionType element

305 ENUM. This is the method of data collection, as determined by
306 analysing the victim computer, lure, or malware.

308    1. Web. The user is redirected to a website that collects the
309       data.
310    2. Email Form. A form is embedded in the email lure.
311    3. Keylogger. Some form of keylogger was offered.
312    4. Automation. Other forms of automation such as background OLE
313       automation.
314    5. Unspecified.

316 3.5 DataCollectionSite class

318 This is the collection site where phished data is sent.

320 +-----------------------+
321 | DataCollectionSite    |
322 +-----------------------+
323 | ENUM Type             |<>--(0..*)---[ SiteData ]
324 +-----------------------+
325 Type parameter
326    1. Web, no parameter. Data is collected on a website.
327    2. Email, with comma-separated email address(es) as parameter(s).
328       Data is sent to one or more email addresses.
329    3. IP Address, with protocol and comma-separated IP address(es)
330       as parameter(s). Data is sent to one or more IP addresses using
331       the identified protocol.
332    4. Unknown.

334 SiteData is one of the following, depending on the type.
335    STRING Site URL.
336    STRING Email Site
337    ADDRESS site IP

339 3.6 OriginatingSensor class

341 +--------------------+
342 | OriginatingSensor  |
343 +--------------------+
344 | ENUM Type          |<>---(0..*)---[ OriginatingSensorName ]
345 |                    |<>---(0..*)---[ OriginatingSensorIPAddress ]
346 |                    |<>------------[ OriginatingSensorFirstSeen ]
347 +--------------------+

349 The OriginatingSensor requires a type value and identification of the
350 entity that generated this report.

352 Type parameter is an ENUM from the following:
353    1. Web Server/Service.
354    2. Web Gateway (Proxy or Firewall).
355    3. Mail Gateway.
356    4. Browser Element.
357    5. ISP sensor.
358    6. Human.
359    7. Other.

361 OriginatingSensorName element
362 STRING. This is the DNS name of the entity that generated this
363 report.

365 OriginatingSensorIPAddress element
366 ADDRESS. This is the IP address of the entity that generated this
367 report.

369 OriginatingSensorFirstSeen element
370 DATETIME. This is the date and time that this sensor first saw
371 this phishing activity.

373 3.7 TakeDownInfo class

375 +-------------------+
376 | TakeDownInfo      |
377 +-------------------+
378 |                   |<>---(0..1)--[ TakeDownDate ]
379 |                   |<>---(0..1)--[ TakeDownAgency ]
380 |                   |<>---(0..1)--[ TakeDownComments ]
381 +-------------------+

383 This class identifies information regarding the disablement of the
384 phish collector site. A PhishingReport may have multiple
385 TakeDownInfo classes.

387 TakeDownDate element
388 DATETIME. This is the date and time that takedown occurred.

390 TakeDownAgency element
391 STRING. This is a free-form string identifying the agency that
392 performed the takedown.

394 TakeDownComments element
395 STRING. A free-form field for any additional details of
396 this takedown effort.
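
The class definitions in Sections 3.1, 3.2, 3.6, and 3.7 can be
illustrated with a short Python sketch that assembles a PhishingReport
element. This is only a sketch under stated assumptions: the DTD is not
reproduced in this copy, so treating Version, PhishType, and the sensor
Type as XML attributes (and the remaining items as child elements) is
inferred from the class diagrams rather than confirmed. The function
name build_phishing_report and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# PhishType codes as listed in Section 3.2.
PHISH_TYPES = {
    1: "Email", 2: "Fraudsite", 3: "DNSspoof", 4: "Keylogger",
    5: "OLE", 6: "IM", 7: "CVE", 8: "SiteArchive", 9: "Unknown",
}

# OriginatingSensor Type codes as listed in Section 3.6.
SENSOR_TYPES = {
    1: "Web Server/Service", 2: "Web Gateway (Proxy or Firewall)",
    3: "Mail Gateway", 4: "Browser Element", 5: "ISP sensor",
    6: "Human", 7: "Other",
}


def build_phishing_report(phish_type, sensor_type, first_seen,
                          subject_line=None, takedown=None):
    """Build a PhishingReport element (hypothetical helper).

    Assumption: Version, PhishType, and the sensor Type are serialized
    as attributes; the draft's DTD may define them differently.
    """
    if phish_type not in PHISH_TYPES:
        raise ValueError(f"unknown PhishType: {phish_type}")
    if sensor_type not in SENSOR_TYPES:
        raise ValueError(f"unknown OriginatingSensor Type: {sensor_type}")

    # Section 3.1: the version is 1.0 for this document.
    report = ET.Element("PhishingReport",
                        {"Version": "1.0", "PhishType": str(phish_type)})

    # Section 3.2: an Email phish carries the subject line as its parameter.
    if subject_line is not None:
        parameter = ET.SubElement(report, "PhishParameter")
        ET.SubElement(parameter, "SubjectLine").text = subject_line

    # Section 3.6: the first-seen time is the one mandatory sensor element.
    sensor = ET.SubElement(report, "OriginatingSensor",
                           {"Type": str(sensor_type)})
    ET.SubElement(sensor, "OriginatingSensorFirstSeen").text = first_seen

    # Section 3.7: every TakeDownInfo element is optional (0..1).
    if takedown:
        info = ET.SubElement(report, "TakeDownInfo")
        for tag in ("TakeDownDate", "TakeDownAgency", "TakeDownComments"):
            if tag in takedown:
                ET.SubElement(info, tag).text = takedown[tag]
    return report


if __name__ == "__main__":
    # Placeholder values; the timestamp format is an assumption.
    example = build_phishing_report(
        1, 3, "2004-12-15T23:53:01",
        subject_line="Please verify your account",
        takedown={"TakeDownAgency": "Example ISP"},
    )
    print(ET.tostring(example, encoding="unicode"))
```

Rejecting unknown enumeration codes before any XML is built keeps a
malformed report from being emitted at all; a receiver might apply the
same two tables when parsing.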
398 3.8 ArchivedData element

400 +-------------------+
401 | ArchivedData      |
402 +-------------------+
403 | ENUM Type         |<>---(0..1)--[ ArchivedDataURL ]
404 |                   |<>---(0..1)--[ ArchivedDataComments ]
405 +-------------------+

407 The element is used to classify and point to a gzip archive file of a
408 data collection site, basecamp, or other site where the phisher
409 developed their code. This element will be populated when, for
410 example, an ISP takes down a phisher's web site and has copied the
411 site data into an archive file. There are three types of archives
412 currently supported, as specified in the Type field.

414 Type parameter
415    1. Data Collection Site.
416    2. Basecamp Site.
417    3. Sender Site.

419 ArchivedDataURL
420 URL. This is the URL where the gzip archive file is located.
421 [As archives are quite large, a Phishing Report just points to
422 where the archive is, and does not include it in the report.]

424 ArchivedDataComments
425 STRING. This field is a free-form area for comments on
426 the archive and/or URL.

428 3.9 RelatedSites element

430 URL. These are non-phish web sites that are related to this incident
431 (e.g., victim site).

433 3.10 CorrelationData element

435 STRING. Any information that correlates this incident to other
436 incidents can be entered here.

438 3.11 Comments element

440 STRING. Comments specific to this phishing activity that do not
441 fit in any other field.

443 4. Definition of PhishRecord class

445 Extensions are also made to the Incident EventData Additional Data
446 class, to support descriptive information received in phish emails.

448 4.1 PhishRecord class

450 Data specific to this phishing activity is represented within a new
451 extension to the RecordItem class of the RecordData class of an
452 EventData class in an Incident class.
453 +-----------------------+
454 | RecordItem            |
455 +-----------------------+
456 | ANY (PhishRecord)     |
457 | ENUM Type (xml)       |
458 +-----------------------+

460 +-----------------------+
461 | PhishRecord           |
462 +-----------------------+
463 |                       |<>--(0..*)---[ EmailOriginatorIP ]
464 |                       |<>--(0..*)---[ EmailHeader ]
465 |                       |<>--(0..*)---[ EmailBody ]
466 |                       |<>--(0..*)---[ EmailComments ]
467 +-----------------------+

469 EmailOriginatorIP element
470 ADDRESS. This is the IP Address of the site originating the
471 phish email.

473 EmailHeader element
474 STRING. The headers of the phish email are included in this
475 class.

477 EmailBody element
478 STRING. The body of the phish email is included here.

480 EmailComments element
481 STRING. Data not placed elsewhere about this email may be added
482 in this field.

484 5. IODEF Required Elements

486 A Phishing Report requires certain identifying information, which is
487 contained within the standard IODEF Incident data structure. The
488 following table identifies attributes used in a Phishing Activity
489 Report and their obligation. Note that the Required column notes
490 attributes required by the base IODEF Incident class. Attributes
491 identified as required SHALL be populated in conforming phishing
492 activity reports.

494 The following table is a visual description of the required fields.
496 +--------------+
497 | Incident     |
498 +--------------+
499 | ENUM         |---[ IncidentID ]
500 |              |---[ Assessment ]|---[ Confidence ]
501 |              |---[ ReportTime ]
502 |              |---[ Contact ]|---[ Role ]
503 |              |              |---[ Type ]|---[ Name ]
504 |              |---[ AdditionalData ]|---[ PhishReport ]|---[ Version ]
505 |              |                                        |---[ PhishType ]
506 |              |                                        |---[ OriginatingSensor ]|---[ OriginatingSensorFirstSeen ]
507 |              |
508 +--------------+

510 The following MUST be populated in a compliant Phishing Activity
511 Report:
512    IncidentID
513    Assessment -> Confidence
514    ReportTime
515    Contact -> Role
516    Contact -> Type
517    Contact -> Name
518    AdditionalData -> PhishReport -> Version
519    AdditionalData -> PhishReport -> PhishType
520    AdditionalData -> PhishReport -> OriginatingSensor ->
521    OriginatingSensorFirstSeen

523 6. Guidance on Usage

525 It may be apparent that the mandatory attributes for a phishing
526 activity report make for a quite sparse report. As incident
527 forensics and data analysis require detailed information, the
528 originator of a PhishingReport should include any tidbit of
529 information gleaned from the attack. Information that is considered
530 more sensitive than public can be marked with a higher sensitivity
531 using the restriction parameter of each data class.

533 7. Sample Phishing Report

535 A sample (and useful) phishing activity report, that is, one that has
536 only the required data items populated, is as follows:
537 [Ed.: To be supplied]

539 8. Security Considerations

541 This document specifies the format of security incident data. As
542 such, the security of transactions containing the incident report
543 will vary from organization to organization. We do not want to
544 burden the information exchange with unnecessary encryption
545 requirements, as the transport service for the data exchange may
546 provide adequate protections, or even encryption. The use of
547 encryption is expected to be settled by agreement between originator
548 and recipient.
550 The critical security concern is that phishing activity reports may
551 be falsified or the report may become corrupt during transit.
552 Applying a digital signature on each report will counteract both of
553 these concerns, but again the signature may be overkill for most
554 activity report users. For this reason, phishing activity reports
555 SHOULD be digitally signed with the optional IODEF XML signature,
556 although we expect that each receiving entity will determine the need
557 for this signature independently. Generators of phishing activity
558 reports SHOULD digitally sign each report.

560 Originators of phishing activity reports SHOULD digitally sign their
561 report with the XML signature as described in [INCH].

563 Recipients of phishing reports SHALL be prepared to accept digitally
564 signed XML reports and SHOULD support receiving encrypted
565 reports.

567 9. IANA Considerations

569 This document has no actions for IANA.

571 10. Contributors

573 This document has received significant assistance from two groups
574 addressing the phishing problem: members of the Anti-Phishing Working
575 Group and participants in the Financial Services Technology
576 Consortium's Counter-Phishing project.

578 11. Normative References

580 [IDMEF]   Curry, D. and H. Debar, "The Intrusion Detection Message
581           Exchange Format", July 2004.

583 [INCH]    Meijer, J., Danyliw and Demchenko, "The Incident Object
584           Description Exchange Format Data Model and XML
585           Implementation", November 2004.

587 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
588           Requirement Levels", BCP 14, RFC 2119, March 1997.

590 Authors' Addresses

592 David Jevans
593 The Anti-Phishing Working Group
594 38 Rice Street, Suites 2-0/2-2
595 Cambridge, MA, 02140
596 USA

598 EMail: dave.jevans@antiphishing.org

600 Patrick Cain (editor)
601 The Cooper-Cain Group
602 P.O. Box 400992
603 Cambridge, MA
604 USA

606 EMail: pcain@coopercain.com

608 Appendix A.
Phishing Data DTD

856 Appendix B. Example of a Complete Phishing Activity Report

863 pat_001
865 EST
867 2004-12-31:01:42
877 207.148.245.213
879 Empty header
880 empty body
881 This was fake email to keep the
882 size of the sample report small.
893 2004-12-15:23:53:01

901 Appendix C. Mapping from the APWG work into this Document

903 Note: This appendix is Informational and will be removed in the next
904 version of the document.

906 As this document incorporates some previous work done by the APWG,
907 this section identifies where the APWG-required data items map into
908 the INCH data structures. The following figure summarizes the APWG
909 nomenclature as expressed as Incident classes/fields.

911 +------------------+---------------------+--------------------------+
912 | APWG identifier  | member              | IODEF class              |
913 +------------------+---------------------+--------------------------+
914 | phishingreport   | uniqueid            | Incident.IncidentID      |
915 |                  |                     |                          |
916 | Header           |                     | Incident                 |
917 |                  |                     |                          |
918 |                  | format version      | EventData.AdditionalData |
919 |                  |                     | PhishingReport.Version   |
920 |                  |                     |                          |
921 |                  | datecreated         | Incident.ReportTime      |
922 |                  |                     |                          |
923 |                  | reporterorg         | IncidentID.UID           |
924 |                  |                     |                          |
925 |                  | reportername        | Incident.Contact         |
926 |                  |                     |                          |
927 |                  | reporteremail       | Incident.Contact.Email   |
928 |                  |                     |                          |
929 |                  | reportersignature   | (still under flux in     |
930 |                  |                     | Incident)                |
931 |                  |                     |                          |
932 |                  | comments            | Incident.Description or  |
933 |                  |                     | Incident.EventData.Descr |
934 |                  |                     | iption                   |
935 |                  |                     |                          |
936 |                  | aggregateflag       | Multiple EventData       |
937 |                  |                     | structures               |
938 |                  |                     |                          |
939 | phish            |                     | EventData.               |
940 |                  |                     |                          |
941 |                  | datedetected        | DetectTime               |
942 |                  |                     |                          |
943 |                  | phishtype           | EventData.AdditionalData |
944 |                  |                     | PhishingReport.Eventtype |
945 |                  |                     |                          |
946 |                  | datacollectiontype  | EventData.AdditionalData |
947 |                  |                     | PhishingReport.DataColle |
948 |                  |                     | ctionType                |
949 |                  |                     |                          |
950 |                  | datacollectionsite  | EventData.AdditionalData |
951 |                  |                     | PhishingReport.DataColle |
952 |                  |                     | ctionSite                |
953 |                  |                     |                          |
954 |                  | originatingsensorty | EventData.AdditionalData |
955 |                  | pe                  | PhishingReport.Originato |
956 |                  |                     | rSensor.Type             |
957 |                  |                     |                          |
958 |                  | originatingsensorna | EventData.AdditionalData |
959 |                  | me                  | PhishingReport.Originati |
960 |                  |                     | ngSensor.SensorName      |
961 |                  |                     |                          |
962 |                  | originatingsensorIP | EventData.AdditionalData |
963 |                  | address             | PhishingReport.Originati |
964 |                  |                     | ngSensor.IPaddress       |
965 |                  |                     |                          |
966 |                  | forensics           | EventData.Record         |
967 |                  |                     |                          |
968 |                  | emailsite-url       | PhishReport.DataCollecto |
969 |                  |                     | rSite.SiteData           |
970 |                  |                     |                          |
971 |                  | site-url            | PhishReport.DataCollecto |
972 |                  |                     | rSite.SiteData           |
973 |                  |                     |                          |
974 |                  | emailsubject        | PhishReport.PhishParamet |
975 |                  |                     | er                       |
976 |                  |                     |                          |
977 |                  | takedowndate        | PhishingReport.TakedownI |
978 |                  |                     | nfo.Date                 |
979 |                  |                     |                          |
980 |                  | takedownagency      | EventData.AdditionalData |
981 |                  |                     | PhishingReport.TakedownI |
982 |                  |                     | nfo.Agency               |
983 |                  |                     |                          |
984 |                  | site-ip             | PhishReport.PhishParamet |
985 |                  |                     | er and                   |
986 |                  |                     | PhishReport.DataCollecto |
987 |                  |                     | rSite.SiteData           |
988 |                  |                     |                          |
989 |                  | emailheaders        | PhishRecord.EmailHeaders |
990 |                  |                     |                          |
991 |                  | emailbody           | PhishRecord.EmailBody    |
992 |                  |                     |                          |
993 |                  | brand               | PhishReport.PhishedBrand |
994 |                  |                     | Name                     |
995 |                  |                     |                          |
996 |                  | senderip            |                          |
997 |                  |                     |                          |
998 |                  | otherlink           | PhishingReport.RelatedSi |
999 |                  |                     | tes                      |
1000 |                  |                     |                          |
1001 |                  | correlations        | PhishingReport.Correlati |
1002 |                  |                     | onData                   |
1003 |                  |                     |                          |
1004 |                  | comments            | EventData.               |
1005 +------------------+---------------------+--------------------------+

1007 C.1 Overall Format

1009 Each phishing report is encapsulated in the phishingreport element

1011 header followed by one or more reports
1012 One or more phishing reports are included.

1014 C.2 Header Format

1016 Each report must have a header. Each header MUST have:

1018 <formatversion>version number</formatversion> The
1019 version of the XML reporting format. E.g., 1.0

1021 32-bit UNIX time This is the
1022 date that the phish report was created.

1024 organization who created the phish report
1025 Name or other id of who created the phish report (not the name of the
1026 person who submitted it). Do we need a unique database of
1027 reporternames? E.g., TMWD - Tumbleweed Communications Corp.

1029 Each header MAY have:

1031 name of the person who created the
1032 report Name of an individual.

1034 email address of the person who created the
1035 report email address of the individual who created
1036 the report.

1038 digitalsignature An XML
1039 digital signature of the canonicalized file, everything between the
1040 . We need the hash of the document,
1041 the certs or URL to them, and the signature. What format should that
1042 be? XML-DSIG? This verifies the authenticity of the report.

1044 >ext Any comments that the reporter chooses to
1045 add to the individual phish report.

1047 01 or nn This is a flag for whether
1048 this XML doc represents a single phish attack event; or it is an
1049 aggregated document that represents nn discrete events.

1051 C.3 Individual Report Format

1053 Each individual report is encapsulated between phish.

1055 report fields Each individual
1056 report is encapsulated between the with the
1057 uniqueids.

1059 Each report MUST have:

1061 32-bit UNIX time This is the date that
1062 the phish was reported by a consumer or detected by a trap or other
1063 means.

1065 phishtype optional_parameter

1067 One of the following.

   +----------------------+----------------------+---------------------+
   | String               | Parameter            | Description         |
   +----------------------+----------------------+---------------------+
   | Email                |                      | a standard email    |
   |                      |                      | phish, usually sent |
   |                      |                      | by spam             |
   |                      |                      |                     |
   | Fraudsite            |                      | a known fraudulent  |
   |                      |                      | site that does not  |
   |                      |                      | necessarily send    |
   |                      |                      | spam lures          |
   |                      |                      |                     |
   | DNSspoof             | malwarename          | spoofed DNS (e.g.,  |
   |                      |                      | malware changes     |
   |                      |                      | localhost file so   |
   |                      |                      | visits to           |
   |                      |                      | www.example.com go  |
   |                      |                      | to an incorrect IP  |
   |                      |                      | address)            |
   |                      |                      |                     |
   | Keylogger            | malwarename          | a keylogger site    |
   |                      |                      |                     |
   | OLE                  |                      | Background OLE      |
   |                      |                      | Automation          |
   |                      |                      |                     |
   | IM                   |                      | Instant             |
   |                      |                      | message/NNTP/etc.   |
   |                      |                      |                     |
   | CVE                  | CVEnumber            | CVE number?? For    |
   |                      |                      | malware exploits.   |
   |                      |                      | Or is this the      |
   |                      |                      | keylogger           |
   |                      |                      | malwarename?        |
   +----------------------+----------------------+---------------------+

   The optional parameter is a string, without whitespace, that may be
   used to name the malware that installed the keylogger or the
   DNSspoofer.

   Each report MAY have:

   <datacollectiontype>type</datacollectiontype>

   The method of data collection.  This is derived from the victim's
   computer (e.g., by analyzing the email lure or malware sent to them).
   One of the following:

   +---------------------------------+---------------------------------+
   | String                          | Description                     |
   +---------------------------------+---------------------------------+
   | Web                             | User is redirected to a website |
   |                                 | that collects the data          |
   |                                 |                                 |
   | EmailForm                       | A form is embedded in the email |
   |                                 | lure                            |
   |                                 |                                 |
   | Keylogger                       | Some form of keystroke logger   |
   |                                 |                                 |
   | Automation                      | Other form of automation such   |
   |                                 | as a background OLE automation  |
   +---------------------------------+---------------------------------+

   NOTE: This is somewhat redundant with phishtype, especially for a
   keylogger.

   <datacollectionsite>type optional_parameters</datacollectionsite>

   Where the data is sent.  This can be found by seizing a capture site
   and analyzing the code on the server.  One of the following:

   +----------------------+----------------------+---------------------+
   | String               | Parameter            | Description         |
   +----------------------+----------------------+---------------------+
   | Web                  |                      | Data is collected   |
   |                      |                      | on a website. The   |
   |                      |                      | emailsite-url and   |
   |                      |                      | site-url fields are |
   |                      |                      | used to specify the |
   |                      |                      | location of the     |
   |                      |                      | site.               |
   |                      |                      |                     |
   | Email                | addr, addr           | Data is sent to one |
   |                      |                      | or more email       |
   |                      |                      | addresses, comma    |
   |                      |                      | separated.          |
   |                      |                      |                     |
   | IP                   | ip, ip               | Data is sent to one |
   |                      |                      | or more IP          |
   |                      |                      | addresses, comma    |
   |                      |                      | separated (how to   |
   |                      |                      | specify protocol,   |
   |                      |                      | e.g., IRC?).        |
   +----------------------+----------------------+---------------------+

   <originatingsensortype>type</originatingsensortype>

   The type of technology that generated this XML document.
   One of the following:

   +---------------------------------+---------------------------------+
   | String                          | Description                     |
   +---------------------------------+---------------------------------+
   | Web Server/Service              | This XML doc was generated by a |
   |                                 | web server or service           |
   |                                 |                                 |
   | Web gateway                     | This XML doc was generated by a |
   |                                 | web gateway                     |
   |                                 |                                 |
   | Mail gateway                    | This XML doc was generated by a |
   |                                 | mail gateway                    |
   |                                 |                                 |
   | Browser element                 | This XML doc was generated by a |
   |                                 | web browser element (i.e.,      |
   |                                 | plugin)                         |
   |                                 |                                 |
   | ISP sensor                      | An ISP sensor generated this    |
   |                                 | XML document                    |
   |                                 |                                 |
   | Human                           | A human generated this XML      |
   |                                 | doc/report (e.g., discovered    |
   |                                 | phishing base camp)             |
   |                                 |                                 |
   | Other technology                | This XML doc was generated by   |
   |                                 | some other technology           |
   +---------------------------------+---------------------------------+

   <originatingsensorname>name</originatingsensorname>

   The DNS name of the entity that generated this XML document.

   <originatingsensorIPaddress>name</originatingsensorIPaddress>

   The IP address of the entity that generated this XML document.

   <forensics></forensics>

   Forensic information as strings of any length, useful for law
   enforcement.  This could include watermarks in images, comments in
   HTML fields, or poisoned user data.

   <emailsite-url>URL</emailsite-url>

   This is the base URI of the phishing site that is included in the
   email lure.  Email spam filters can use it to detect and filter out
   phishing emails by posting it to SURBL.  It can also be used in a
   Web browser to access the phishing site.

   If the site is an SSL site, then the URL specifies https://URL

   <site-url>URL</site-url>

   This is the URI of the phishing data collection site that the browser
   actually goes to in order to post data.  This may differ from the
   emailsite-url, because the URL included in the email might redirect
   users to the actual data collection site, which is the site-url.
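
   As an illustration only, the following Python sketch compares an
   emailsite-url with a site-url.  The function name compare_phish_urls
   and the keys of the returned dictionary are hypothetical and not
   part of the APWG format.  It assumes both values are absolute URLs
   and treats a differing host name as a hint that the lure redirects
   to a separate collection site.

```python
from urllib.parse import urlsplit


def compare_phish_urls(emailsite_url, site_url):
    """Summarize how an emailsite-url relates to a site-url.

    Hypothetical helper: the name and result keys are illustrative only.
    """
    lure = urlsplit(emailsite_url)
    collector = urlsplit(site_url)
    return {
        # urlsplit() lowercases the host name, so the comparison is
        # case-insensitive.
        "different_host": lure.hostname != collector.hostname,
        # "If the site is an SSL site, then the URL specifies https://".
        "lure_is_ssl": lure.scheme == "https",
        "collector_is_ssl": collector.scheme == "https",
    }


result = compare_phish_urls(
    "http://www.example.com/login",
    "https://collect.example.net/post",
)
```

   A consumer of the report could use such a summary to decide whether
   both URLs need to be handled, since the two fields serve different
   audiences.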

   The emailsite-url is useful for spam filters; the site-url is useful
   for takedown, law enforcement, or web proxy filters to prevent users
   from visiting the collection site.

   If the site is an SSL site, then the URL specifies https://URL

   <emailsubject>subject</emailsubject>

   The subject of the email phish lure.

   <takedowndate>UNIX 32-bit time</takedowndate>

   If the site has been taken down, this is the date and time when that
   was effected.  Which site?  Redirector or data collection site?
   Multiples with designator?

   <takedownagency>string</takedownagency>

   Who took the site down.  If more than one party took it down, you may
   list multiple parties as freeform text here, or have multiple
   takedownagency fields.

   <site-ip>IP address (port number optional if 80)</site-ip>

   The IP address of the server hosting the phishing site in standard IP
   address format A.B.C.D:portnum.  If no port number is specified,
   port 80 is assumed.

   These IP addresses could be used by ISPs and web filters to block
   access to servers.  However, this is dangerous if the sites are
   running on hacked servers or at ISPs that also host legitimate
   sites.  It can be very useful for filtering out access to servers
   that have hijacked DNS, for example by modifying localhosts files
   (e.g., 11.1.2004).

   <emailbody>body</emailbody>

   The body of the email.  I think we need the uniqueid strings.

   What about when the body is an image only?  Ex. GDI exploit to
   install keylogger or single image with hyperlink.

   <emailheaders>headers</emailheaders>

   The headers of the email.

   Do we need to create xml records for each entry in a decomposed
   header?  No, only the open relays and the apparent source and
   possibly a few others.

   <brand>brand name</brand>

   The name of the company whose brand is being used to launch the
   phishing attack.

   <senderip>IP address (optional port number)</senderip>

   The IP address of the mail server or relay that delivered the
   phishing email.  This can be used for RBLs.
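
   Both site-ip and senderip carry an IP address with an optional port
   number.  As an illustration only, the following Python sketch parses
   a dotted-quad address with an optional ":portnum" suffix.  The
   function name parse_ip_port and its default_port argument are
   hypothetical and not part of the APWG format.  The default of 80
   follows the site-ip rule; this document states no default port for
   senderip, so a caller would have to choose one.

```python
import ipaddress


def parse_ip_port(value, default_port=80):
    """Split 'A.B.C.D[:portnum]' into (IPv4Address, port).

    Hypothetical helper: assumes IPv4 only, as in the format above.
    Raises ValueError for a malformed address or port.
    """
    host, sep, port_text = value.strip().partition(":")
    address = ipaddress.IPv4Address(host)  # AddressValueError is a ValueError
    port = int(port_text) if sep else default_port
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return address, port


site = parse_ip_port("207.148.245.213")          # port defaults to 80
alt_site = parse_ip_port("207.148.245.213:8080")  # explicit port is kept
```

   Returning a parsed address object rather than a string lets block
   lists and RBL lookups compare addresses without worrying about
   formatting differences.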

   A single attack may have multiple senderips if the mail was sent
   from multiple relays.

   <otherlink>URL</otherlink>

   Links to non-phish sites that may be relevant (victim site, other
   sites).

   <correlations>strings</correlations>

   Any correlations to known phishing kits or groups.  Freeform text.

   <comments>text</comments>

   Any freeform text comments that the reporter chooses to add to the
   individual phish report, e.g., "images sourced from victim
   online-banking site" or "background popup populated with victim
   privacy statement".

Appendix D.  Still To Do in This Document

   This appendix will be removed when it is empty.

   The following tasks still need to be completed within this
   document:

   1.  Make a test report that verifies every possible option.
   2.  Finish and insert the schema.
   3.  Add more detail on what the specific elements mean.

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.