
1	INCH                                                           D. Jevans
2	Internet-Draft                           The Anti-Phishing Working Group
3	Expires: June 14, 2005                                      P. Cain, Ed.
4	                                                   The Cooper-Cain Group
5	                                                       December 14, 2004

7	         Extension to IODEF-Document Class for Phishing Reports
8	                      draft-jevans-phishing-xml-00

10	Status of this Memo

12	   This document is an Internet-Draft and is subject to all provisions
13	   of section 3 of RFC 3667.  By submitting this Internet-Draft, each
14	   author represents that any applicable patent or other IPR claims of
15	   which he or she is aware have been or will be disclosed, and any of
16	   which he or she becomes aware will be disclosed, in accordance with
17	   RFC 3668.

19	   Internet-Drafts are working documents of the Internet Engineering
20	   Task Force (IETF), its areas, and its working groups.  Note that
21	   other groups may also distribute working documents as
22	   Internet-Drafts.

24	   Internet-Drafts are draft documents valid for a maximum of six months
25	   and may be updated, replaced, or obsoleted by other documents at any
26	   time.  It is inappropriate to use Internet-Drafts as reference
27	   material or to cite them other than as "work in progress."

29	   The list of current Internet-Drafts can be accessed at
30	   http://www.ietf.org/ietf/1id-abstracts.txt.

32	   The list of Internet-Draft Shadow Directories can be accessed at
33	   http://www.ietf.org/shadow.html.

35	   This Internet-Draft will expire on June 14, 2005.

37	Copyright Notice

39	   Copyright (C) The Internet Society (2004).

41	Abstract

43	   Phishing, a broadly-launched social engineering attack in which an
44	   electronic identity is misrepresented in an attempt to trick
45	   individuals into revealing credentials, is expanding on the Internet.
46	   Corporations, Service Providers, consumer agencies, and financial
47	   institutions have started to collect and correlate phishing attack
48	   information to better plan out mitigation activities and to assist in
49	   prosecution.  Early on it became obvious that a common format for the
50	   data reported or exchanged between these parties was necessary.

52	   This document defines a data format for reporting phishing attacks
53	   and sharing data between repositories of phishing attacks.  The
54	   format is an outgrowth of the Anti-Phishing Working Group (APWG)
55	   activities in data sharing and is based upon the Incident Handling
56	   Working Group's (INCH) XML-based format for sharing incident data.
57	   Although we use the term "phishing attack", the data format is
58	   flexible enough to support information gleaned from activities
59	   throughout the entire phishing life cycle.  The attack format is also
60	   extensible enough to be used for other related reporting such as DNS
61	   spoofs (e.g., localhost file takeover on PCs) and the keyloggers
62	   typically related to a phishing attack.  The format supports very
63	   simple reports as well as optional fields for detailed reports, and
64	   it supports both single phish reports and consolidated reports of
65	   multiple phish reports.

67	RFC 2119 Keywords

69	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
70	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
71	   document are to be interpreted as described in [RFC2119].

73	Table of Contents

75	   1.   Introduction
76	     1.1  INCH Dependencies
77	   2.   Phishing Activity Reporting via an IODEF-Document Incident
78	   3.   PhishingReport Class Definition
79	     3.1  Version parameter
80	     3.2  PhishType parameter
81	     3.3  PhishedBrandName element
82	     3.4  DataCollectionType element
83	     3.5  DataCollectionSite class
84	     3.6  OriginatingSensor class
85	     3.7  TakeDownInfo class
86	     3.8  ArchivedData element
87	     3.9  RelatedSites element
88	     3.10   CorrelationData element
89	     3.11   Comments element
90	   4.   Definition of PhishRecord class
91	     4.1  PhishRecord class
92	   5.   IODEF Required Elements
93	   6.   Guidance on Usage
94	   7.   Sample Phishing Report
95	   8.   Security Considerations
96	   9.   IANA Considerations
97	   10.  Contributors
98	   11.  Normative References
99	        Authors' Addresses
100	   A.   Phishing Data DTD
101	   B.   Example of a Complete Phishing Activity Report
102	   C.   Mapping from the APWG work into this Document
103	     C.1  Overall Format
104	     C.2  Header Format
105	     C.3  Individual Report Format
106	   D.   Still To Do in This Document
107	        Intellectual Property and Copyright Statements

109	1.  Introduction

111	   The accumulation and correlation of information is very important
112	   when dealing with security incidents.  In phishing attacks the source
113	   of the attack may be forged and it's quite possible that the targeted
114	   organization may not even be aware of the ongoing attack.  Parties
115	   aware of the attack may wish to notify the target, or an
116	   organization's internal monitoring systems may detect the attack and
117	   wish to take mitigation steps.  Unfortunately, there is no recognized
118	   standard way of expressing the detection of a phishing attack nor an
119	   acceptable way to exchange the required information.  For an
120	   organization that employs multiple anti-phishing technologies,
121	   correlating data from multiple vendors or products is close to
122	   impossible as the data is reported in multiple, mostly incompatible,
123	   formats.

125	   This document defines a data format for capturing relevant
126	   information from a phishing attack so that it can be shared,
127	   correlated, or loaded into a database.  Products that export
128	   information in this format also allow an organization to correlate
129	   and analyze phishing information internally, in effect sharing
130	   information with itself.  Although the format targets both the
131	   accumulation of phishing attack information within a single
132	   institution and the sharing of attack information between
133	   cooperating parties, the information sharing process itself and its
134	   related political challenges are not covered in this document.

136	   Instead of defining a report format and language from scratch, the
137	   phishing activity information is encoded as an extension to the
138	   INCH incident exchange format [INCH].  The use of an existing,
139	   operational format allows for quicker vendor adoption and reuse
140	   of existing tools in organizations.  To reduce duplication and to be
141	   compatible with modifications to the base IODEF definitions, this
142	   document only identifies additional structures; the reader is
143	   expected to have a copy of the IODEF documents handy while reading
144	   this one.

146	   In general, an incident report contains detailed incident-specific
147	   data which populates an EventData Structure.  That EventData
148	   structure is then incorporated, either singularly or in aggregation,
149	   with additional summary and contact data, into an Incident structure.
150	   The populated Incident structure is what is reported as a Phishing
151	   Activity Report.

153	   Unsavory phishing activity may include multiple email messages,
154	   attacks, or events, scattered over various times, locations, and
155	   methodologies.  As each of these activities may generate multiple
156	   reports to an incident team, the Phishing Activity Report is composed
157	   of multiple XML Incident classes.  Each Incident class is used to
158	   report one or more individual phishing reports and may include
159	   multiple RecordData elements.

161	   This document defines new attributes for the EventData and Record
162	   Item IODEF XML classes, then identifies attributes that are required
163	   in a compliant Phishing Activity Report.  The Appendices contain
164	   sample Phishing Activity Reports and the complete Document Type
165	   Definition.

167	1.1  INCH Dependencies

169	   As discussions started to define a format for this information, it
170	   became apparent that the output needed to do two things: include the
171	   relevant data and be supported by large numbers of vendors and
172	   products.  Instead of reinventing a basic reporting format, we
173	   selected the IETF Incident Handling Working Group's (INCH)
174	   already-defined XML-based attack data exchange models and formats.

176	   The IODEF extensions defined in this document comply with Section 4,
177	   "Extending the IODEF Format", in [INCH].

179	2.  Phishing Activity Reporting via an IODEF-Document Incident

181	   A Phishing Activity Report is an instance of an IODEF-Document XML
182	   Incident class [INCH] with added EventData and AdditionalData
183	   classes.  A small set of required items, along with many optional
184	   ones, is populated into the new structure to form a Phishing Activity
185	   Report.  For completeness, the report originator should fill out as
186	   many of the optional Incident fields as possible, but SHALL stay
187	   consistent with the IODEF-Document structure.

189	   This document defines the new Incident classes for the
190	   AdditionalData, EventData, and Record Item IODEF XML classes; then
191	   identifies attributes that are required in a compliant Phishing
192	   Activity Report.  The Appendices contain sample Phishing Activity
193	   Reports and the complete XML Document Type Definition and schema.

195	   The Incident class is summarized below and provides a standardized
196	   representation for commonly exchanged incident data and associates a
197	   CSIRT assigned unique identifier with the described activity.

199	      +-------------------+
200	      | Incident          |
201	      +-------------------+
202	      | ENUM purpose      |<>----------[ IncidentID      ]
203	      | ENUM restriction  |<>--{0..1}--[ AlternativeID   ]
204	      |                   |<>--{0..1}--[ RelatedActivity ]
205	      |                   |<>--{0..*}--[ Description     ]
206	      |                   |<>--{1..*}--[ Assessment      ]
207	      |                   |<>--{0..*}--[ Method          ]
208	      |                   |<>--{0..1}--[ DetectTime      ]
209	      |                   |<>--{0..1}--[ StartTime       ]
210	      |                   |<>--{0..1}--[ EndTime         ]
211	      |                   |<>----------[ ReportTime      ]
212	      |                   |<>--{1..*}--[ Contact         ]
213	      |                   |<>--{0..*}--[ Expectation     ]
214	      |                   |<>--{0..1}--[ History         ]
215	      |                   |<>--{0..*}--[ EventData       ] --> RecordData --> RecordItem --> PhishRecord added
216	      |                   |<>--{0..*}--[ AdditionalData  ] --> PhishingReport added
217	      +-------------------+

219	   Figure 1.  The INCH XML Incident class (modified)

221	   A Phishing Activity Report is composed of one Incident class,
222	   containing one or more EventData attributes.  This document defines a
223	   PhishingReport class for the Incident.EventData.AdditionalData
224	   comprising phishing-related information that does not map to
225	   existing Incident or EventData attributes.  The following section
226	   defines the new extensions specific to the Incident class EventData
227	   and AdditionalData classes.

229	3.  PhishingReport Class Definition

231	   A PhishingReport consists of an Extension to the Incident
232	   AdditionalData class, and is structured as follows.

234	   +---------------------------+
235	   |  EventData.AdditionalData |
236	   +---------------------------+
237	   | ENUM type (9 = xml)       |<>---------[ PhishingReport     ]
238	   | STRING meaning (xml)      |
239	   +---------------------------+

241	   +------------------------+
242	   | PhishingReport         |
243	   +------------------------+
244	   | ENUM Version           |<>--(0..*)--[ PhishParameter      ]
245	   | ENUM PhishType         |----(0..*)--[ PhishedBrandName    ]
246	   |                        |<>--(0..*)--[ DataCollectionType  ]
247	   |                        |<>--(0..*)--[ DataCollectionSite  ]
248	   |                        |<>----------[ OriginatingSensor   ]
249	   |                        |<>--(0..*)--[ TakeDownInfo        ]
250	   |                        |<>--(0..*)--[ ArchivedData        ]
251	   |                        |<>--(0..*)--[ RelatedSites        ]
252	   |                        |<>--(0..*)--[ CorrelationData     ]
253	   |                        |----(0..1)--[ Comments            ]
254	   +------------------------+

256	   Figure 2.  The PhishingReport Extensions to the INCH XML
257	   Incident.AdditionalData class

259	3.1  Version parameter

261	   STRING.  The version shall be the value 1.0 to be compliant with this
262	   document.

264	3.2  PhishType parameter

266	   ENUM.  The PhishType attribute contains one of the following numbers
267	   representing these types:
268	       1.  Email, and the PhishParameter is the email subject line of
269	       the phishing email.  This is a standard email phish, usually sent
270	       by spam.
271	       2.  Fraudsite, no PhishParameter.  This identifies a known
272	       fraudulent site that does not necessarily send spam lures.
273	       3.  DNSspoof, with the malware name as the PhishParameter.  This
274	       is used for a spoofed DNS (e.g., malware changes localhost file
275	       so visits to www.example.com go to another IP address).
277	       4.  Keylogger, with the malware name as the PhishParameter.
278	       5.  OLE, no parameter.  This identifies background OLE
279	       information.
280	       6.  IM, no parameter.
281	       7.  CVE, with the CVENumber as the PhishParameter.
282	       8.  SiteArchive, with the data archived from the phishing server
283	       placed in the ArchivedData class.
284	       9.  Unknown.

286	   When a PhishParameter is required, it is one of the following values:

288	   SubjectLine element
289	       STRING.  This is the subject line of the email lure.

291	   MalwareName element
292	       STRING.  This is the name of the malware that installed the
293	       keylogger or DNSspoofer.

295	   CVENumber element
296	       STRING.  This is the CVE identifier of the exploit used to phish.
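
   The pairing of PhishType values with their PhishParameter element
   can be written down as a small lookup.  This is only a sketch of the
   list above, not part of the format: the Python names are
   illustrative, and treating SiteArchive and Unknown as taking no
   PhishParameter is an assumption, since the text does not say so
   explicitly.

```python
from enum import IntEnum
from typing import Optional


class PhishType(IntEnum):
    """PhishType codes as numbered in Section 3.2."""
    EMAIL = 1
    FRAUDSITE = 2
    DNSSPOOF = 3
    KEYLOGGER = 4
    OLE = 5
    IM = 6
    CVE = 7
    SITEARCHIVE = 8
    UNKNOWN = 9


# PhishParameter element expected for each type, per the list above.
# Types absent from this mapping are treated as taking no parameter
# (an assumption for SITEARCHIVE and UNKNOWN).
EXPECTED_PARAMETER = {
    PhishType.EMAIL: "SubjectLine",
    PhishType.DNSSPOOF: "MalwareName",
    PhishType.KEYLOGGER: "MalwareName",
    PhishType.CVE: "CVENumber",
}


def check_parameter(phish_type: PhishType, element_name: Optional[str]) -> bool:
    """Return True when element_name is the parameter the type calls for.

    Pass None as element_name when the report carries no PhishParameter.
    """
    return EXPECTED_PARAMETER.get(phish_type) == element_name
```

   For example, check_parameter(PhishType.KEYLOGGER, "MalwareName")
   holds, while a CVE report that carries a MalwareName does not pass.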

298	3.3  PhishedBrandName element

300	   STRING.  This is the identifier of the recognized brandname or
301	   company name used to launch the phishing activity.

303	3.4  DataCollectionType element

305	   ENUM.  This is the method of data collection, as determined by
306	   analysing the victim computer, lure, or malware.

308	       1.  Web.  The user is redirected to a website that collects the
309	       data.
310	       2.  Email Form.  A form is embedded in the email lure.
311	       3.  Keylogger.  Some form of keylogger was offered.
312	       4.  Automation.  Other forms of automation such as background OLE
313	       automation.
314	       5.  Unspecified.

316	3.5  DataCollectionSite class

318	   This is the collection site where phished data is sent.

320	   +-----------------------+
321	   | DataCollectionSite    |
322	   +-----------------------+
323	   | ENUM Type             |<>--(0..*)---[ SiteData ]
324	   +-----------------------+
325	   Type parameter
326	       1.  Web, no parameter.  Data is collected on a website.
327	       2.  Email, with email site(s), comma separated, as parameter(s).
328	       Data is sent to one or more email addresses.
329	       3.  IP Address, with protocol and comma separated IP address(es)
330	       as parameter(s).  Data is sent to one or more IP addresses using
331	       the identified protocol.
332	       4.  Unknown.

334	   SiteData is one of the following, depending on the type.
335	       STRING Site URL.
336	       STRING Email Site.
337	       ADDRESS Site IP.
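
   A receiver might sanity-check that SiteData has the shape its Type
   implies.  The following is a loose sketch under stated assumptions:
   the function and constant names are hypothetical, the checks are
   deliberately shallow, and the protocol that accompanies type 3 is
   not modeled because its encoding is not given above.

```python
import ipaddress
from urllib.parse import urlparse

# Type codes from the DataCollectionSite Type parameter list.
WEB, EMAIL, IP_ADDRESS, UNKNOWN = 1, 2, 3, 4


def site_data_ok(site_type: int, site_data: str) -> bool:
    """Loosely check that site_data has the shape its type implies."""
    if site_type == WEB:
        parsed = urlparse(site_data)
        return bool(parsed.scheme and parsed.netloc)
    if site_type == EMAIL:
        # Comma-separated list; each entry needs exactly one '@' with
        # text on both sides.  No further address validation is done.
        parts = [p.strip() for p in site_data.split(",")]
        return all(p.count("@") == 1 and all(p.split("@")) for p in parts)
    if site_type == IP_ADDRESS:
        # Comma-separated IPv4 or IPv6 addresses.
        try:
            for part in site_data.split(","):
                ipaddress.ip_address(part.strip())
        except ValueError:
            return False
        return True
    # Unknown sites carry no constraint; any other code is rejected.
    return site_type == UNKNOWN
```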

339	3.6  OriginatingSensor class

341	   +--------------------+
342	   | OriginatingSensor  |
343	   +--------------------+
344	   | ENUM Type          |<>---(0..*)---[ OriginatingSensorName      ]
345	   |                    |<>---(0..*)---[ OriginatingSensorIPAddress ]
346	   |                    |<>------------[ OriginatingSensorFirstSeen ]
347	   +--------------------+

349	   The OriginatingSensor requires a type value and identification of the
350	   entity that generated this report.

352	   Type parameter is an ENUM from the following:
353	       1.  Web Server/Service.
354	       2.  Web Gateway (Proxy or Firewall).
355	       3.  Mail Gateway.
356	       4.  Browser Element.
357	       5.  ISP sensor.
358	       6.  Human.
359	       7.  Other.

361	   OriginatingSensorName element
362	       STRING.  This is the DNS name of the entity that generated this
363	       report.

365	   OriginatingSensorIPAddress element
366	       ADDRESS.  This is the IP address of the entity that generated this
367	       report.

369	   OriginatingSensorFirstSeen element
370	       DATETIME.  This is the date and time that this sensor first saw
371	       this phishing activity.
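
   As an illustration, an OriginatingSensor could be assembled with a
   general-purpose XML library.  This sketch makes several assumptions
   that the text above does not settle: that Type is serialized as an
   XML attribute, that the element names match the class diagram
   exactly, and that DATETIME values are written in ISO 8601 form.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_originating_sensor(sensor_type, first_seen, name=None, ip=None):
    """Build an OriginatingSensor element.

    Name and IP address are optional (0..*) in the class diagram;
    OriginatingSensorFirstSeen is always written.
    """
    sensor = ET.Element("OriginatingSensor", {"Type": str(sensor_type)})
    if name is not None:
        ET.SubElement(sensor, "OriginatingSensorName").text = name
    if ip is not None:
        ET.SubElement(sensor, "OriginatingSensorIPAddress").text = ip
    first = ET.SubElement(sensor, "OriginatingSensorFirstSeen")
    first.text = first_seen.isoformat()
    return sensor


sensor = build_originating_sensor(
    3,  # 3 = Mail Gateway in the Type list above
    datetime(2004, 12, 15, 23, 53, 1, tzinfo=timezone.utc),
    name="mx.example.net",
    ip="192.0.2.25",
)
xml_text = ET.tostring(sensor, encoding="unicode")
```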

373	3.7  TakeDownInfo class

375	   +-------------------+
376	   | TakeDownInfo      |
377	   +-------------------+
378	   |                   |<>---(0..1)--[ TakeDownDate        ]
379	   |                   |<>---(0..1)--[ TakeDownAgency      ]
380	   |                   |<>---(0..1)--[ TakeDownComments    ]
381	   +-------------------+

383	   This class identifies information regarding the disablement of the
384	   phish collector site.  A PhishingReport may have multiple
385	   TakeDownInfo classes.

387	   TakeDownDate element
388	       DATETIME.  This is the date and time that takedown occurred.

390	   TakeDownAgency element
391	       STRING.  This is a freeform string identifying the agency that
392	       performed the takedown.

394	   TakeDownComments element
395	       STRING.  A free form field to add any exciting details of this
396	       takedown effort.

398	3.8  ArchivedData element

400	   +-------------------+
401	   | ArchivedData      |
402	   +-------------------+
403	   | ENUM Type         |<>---(0..1)--[ ArchivedDataURL       ]
404	   |                   |<>---(0..1)--[ ArchivedDataComments  ]
405	   +-------------------+

407	   This element is used to type and reference a gzip archive file of a
408	   data collection site, basecamp, or other site where the phisher
409	   developed their code.  This element will be populated when, for
410	   example, an ISP takes down a phisher's web site and has copied the
411	   site data into an archive file.  There are three types of archives
412	   currently supported, as specified in the type field.

414	   Type parameter
415	       1.  Data Collection Site.
416	       2.  Basecamp Site.
417	       3.  Sender Site.

419	   ArchivedDataURL
420	       URL.  This is the URL where the gzip archive file is located.
421	       [As archives are quite large, a Phishing Report just points out
422	       where the archive is, and does not include it in the report.]

424	   ArchivedDataComments
425	       STRING.  This field is a free form area where one can comment on
426	       the archive and/or URL, if they so please.

428	3.9  RelatedSites element

430	   URL.  These are non-phish web sites that are related to this incident
431	   (e.g., victim site, etc).

433	3.10  CorrelationData element

435	   STRING.  Any information that correlates this incident to other
436	   incidents can be entered here.

438	3.11  Comments element

440	   STRING.  Comments specific to this phishing activity that do not
441	   fit in any other field.

443	4.  Definition of PhishRecord class

445	   Extensions are also made to the Incident EventData Additional Data
446	   class, to support descriptive information received in phish emails.

448	4.1  PhishRecord class

450	   Data specific to this phishing activity is represented within a new
451	   extension to the RecordItem class of the RecordData class of an
452	   EventData class in an Incident class.
453	   +-----------------------+
454	   | RecordItem            |
455	   +-----------------------+
456	   | ANY (PhishRecord)     |
457	   | ENUM Type (xml)       |
458	   +-----------------------+

460	   +-----------------------+
461	   | PhishRecord           |
462	   +-----------------------+
463	   |                       |<>--(0..*)---[ EmailOriginatorIP  ]
464	   |                       |<>--(0..*)---[ EmailHeader        ]
465	   |                       |<>--(0..*)---[ EmailBody          ]
466	   |                       |<>--(0..*)---[ EmailComments      ]
467	   +-----------------------+

469	   EmailOriginatorIP element
470	       ADDRESS.  This is the IP Address of the site originating the
471	       phish email.

473	   EmailHeader element
474	       STRING.  The headers of the phish email are included in this
475	       class.

477	   EmailBody element
478	       STRING.  The body of the phish email is included here.

480	   EmailComments element
481	       STRING.  Data not placed elsewhere about this email may be added
482	       in this field.
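
   To show how a received lure might be mapped onto these elements,
   the sketch below splits a raw message into headers and body with a
   standard mail parser.  The function name is hypothetical, the
   element names are taken from the diagram above, and the code
   assumes a single-part message; a multipart lure would need its
   parts flattened first.

```python
import xml.etree.ElementTree as ET
from email import message_from_string


def phish_record_from_email(raw_email, originator_ip, comments=None):
    """Populate a PhishRecord element from one raw email message."""
    msg = message_from_string(raw_email)
    record = ET.Element("PhishRecord")
    ET.SubElement(record, "EmailOriginatorIP").text = originator_ip
    # Header fields are rejoined one per line, in their original order.
    headers = "\n".join(f"{key}: {value}" for key, value in msg.items())
    ET.SubElement(record, "EmailHeader").text = headers
    # For a single-part message the payload is the body text.
    ET.SubElement(record, "EmailBody").text = msg.get_payload()
    if comments:
        ET.SubElement(record, "EmailComments").text = comments
    return record


raw = (
    "From: support@bank.example\n"
    "Subject: Verify your account\n"
    "\n"
    "Click here.\n"
)
record = phish_record_from_email(raw, "192.0.2.10", comments="test lure")
```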

484	5.  IODEF Required Elements

486	   A Phishing Report requires certain identifying information, which is
487	   contained within the standard IODEF Incident data structure.  The
488	   following table identifies attributes used in a Phishing Activity
489	   Report and their obligation.  Note that the Required column notes
490	   attributes required by the base IODEF Incident class.  Attributes
491	   identified as required SHALL be populated in conforming phishing
492	   activity reports.

494	   The following table is a visual description of the required fields.

496	   +--------------+
497	   | Incident     |
498	   +--------------+
499	   | ENUM |---[ IncidentID     ]
500	   |      |---[ Assessment     ]|---[ Confidence  ]
501	   |      |---[ ReportTime     ]
502	   |      |---[ Contact        ]|---[ Role        ]
503	   |      |                     |---[ Type        ]|---[ Name              ]
504	   |      |---[ AdditionalData ]|---[ PhishReport ]|---[ Version           ]
505	   |      |                                        |---[ PhishType         ]
506	   |      |                                        |---[ OriginatingSensor ]|---[ OriginatingSensorFirstSeen ]
507	   |      |
508	   +------+

510	   The following MUST be populated in a compliant Phishing Activity
511	   Report:
512	      IncidentID
513	      Assessment -> Confidence
514	      ReportTime
515	      Contact -> Role
516	      Contact -> Type
517	      Contact -> Name
518	      AdditionalData -> PhishReport -> Version
519	      AdditionalData -> PhishReport -> PhishType
520	      AdditionalData -> PhishReport -> OriginatingSensor ->
521	      OriginatingSensorFirstSeen
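
   The required items listed above lend themselves to a mechanical
   check.  The sketch below is one possible approach and not a
   normative validator: the function names are hypothetical, the item
   names are copied from the list as written, and because this section
   does not say whether an item such as Role or Version is serialized
   as a child element or as an XML attribute, the last step of each
   path is accepted in either form.

```python
import xml.etree.ElementTree as ET

# Required items, copied from the list above as paths under Incident.
REQUIRED = [
    ("IncidentID",),
    ("Assessment", "Confidence"),
    ("ReportTime",),
    ("Contact", "Role"),
    ("Contact", "Type"),
    ("Contact", "Name"),
    ("AdditionalData", "PhishReport", "Version"),
    ("AdditionalData", "PhishReport", "PhishType"),
    ("AdditionalData", "PhishReport", "OriginatingSensor",
     "OriginatingSensorFirstSeen"),
]


def _present(node, path):
    """True if path resolves under node.

    The final step may be a child element or an attribute of node.
    """
    head, rest = path[0], path[1:]
    if not rest:
        return node.find(head) is not None or head in node.attrib
    return any(_present(child, rest) for child in node.findall(head))


def missing_required(incident):
    """Return the required items absent from an Incident element."""
    return [" -> ".join(p) for p in REQUIRED if not _present(incident, p)]


incident = ET.fromstring(
    "<Incident><IncidentID>pat_001</IncidentID>"
    "<ReportTime>2004-12-31T01:42:00</ReportTime></Incident>"
)
problems = missing_required(incident)
```

   Here problems lists every required item except IncidentID and
   ReportTime, which are the only two present in the sample input.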

523	6.  Guidance on Usage

525	   It may be apparent that the mandatory attributes for a phishing
526	   activity report make for a quite sparse report.  As incident
527	   forensics and data analysis require detailed information, the
528	   originator of a PhishingReport should include any tidbit of
529	   information gleaned from the attack.  Information that is considered
530	   more sensitive than public can be marked to a higher sensitivity
531	   using the restriction parameter of each data class.

533	7.  Sample Phishing Report

535	   A sample (and useful) phishing activity report, that is, one that has
536	   only the required data items populated, is as follows:
537	       [ ed.  To be supplied]

539	8.  Security Considerations

541	   This document specifies the format of security incident data.  As
542	   such, the security of transactions containing the incident report
543	   will vary from organization to organization.  We do not want to
544	   burden the information exchange with unnecessary encryption
545	   requirements, as the transport service for the data exchange may
546	   provide adequate protections, or even encryption.  The use of
547	   encryption is expected to be settled by agreement between the
548	   originator and the recipient.

550	   The critical security concern is that phishing activity reports may
551	   be falsified or the report may become corrupt during transit.
552	   Applying a digital signature on each report will counteract both of
553	   these concerns, but again the signature may be overkill for most
554	   activity report users.  For this reason, phishing activity reports
555	   SHOULD be digitally signed with the optional IODEF XML signature,
556	   although we expect that each receiving entity will determine the need
557	   for this signature independently.  Generators of phishing activity
558	   reports SHOULD digitally sign each report.

560	   Originators of phishing activity reports SHOULD digitally sign their
561	   report with the XML signature as described in [INCH].

563	   Recipients of phishing reports SHALL be prepared to accept XML
564	   digitally signed reports and SHOULD support receiving encrypted
565	   reports.

567	9.  IANA Considerations

569	   This document has no actions for IANA.

571	10.  Contributors

573	   This document has received significant assistance from two groups
574	   addressing the phishing problem: members of the Anti-Phishing Working
575	   Group and participants in the Financial Services Technology
576	   Consortium's Counter-Phishing project.

578	11.  Normative References

580	   [IDMEF]    Curry, D. and H. Debar, "The Intrusion Detection Message
581	              Exchange Format", July 2004.

583	   [INCH]     Meijer, J., Danyliw, R., and Y. Demchenko, "The Incident Object
584	              Description Exchange Format Data Model and XML
585	              Implementation", November 2004.

587	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
588	              Requirement Levels", BCP 14, RFC 2119, March 1997.

590	Authors' Addresses

592	   David Jevans
593	   The Anti-Phishing Working Group
594	   38 Rice Street, Suites 2-0/2-2
595	   Cambridge, MA, 02140
596	   USA

598	   EMail: dave.jevans@antiphishing.org

600	   Patrick Cain (editor)
601	   The Cooper-Cain Group
602	   P.O. Box 400992
603	   Cambridge, MA
604	   USA

606	   EMail: pcain@coopercain.com

608	Appendix A.  Phishing Data DTD

856	Appendix B.  Example of a Complete Phishing Activity Report

858	   
859	   

861	   
862	   
863	   pat_001
864	     
865	     EST
866	     
867	     2004-12-31:01:42
868	     
869	       
870	     
871	     
872	       
873	         
874	           
875	             
876	               
877	                 207.148.245.213
878	               
879	               Empty header 
880	               empty body
881	                This was fake email to keep the
882	                 size of the sample report small.
883	               
884	             
885	           
886	         
887	       
888	     
889	     
890	       
891	         
892	           
893	             2004-12-15:23:53:01
894	           
895	         
896	       
897	     
898	   
899	   

901	Appendix C.  Mapping from the APWG work into this Document

903	   Note: This appendix is Informational and will be removed in the next
904	   version of the document.

906	   Because this document incorporates earlier work done by the APWG,
907	   this appendix identifies where the APWG-required data items map into
908	   the INCH data structures.  The following table lists each APWG
909	   identifier and member with the corresponding IODEF class or field.

911	   +------------------+---------------------+--------------------------+
912	   | APWG identifier  | member              | IODEF class              |
913	   +------------------+---------------------+--------------------------+
914	   | phishingreport   | uniqueid            | Incident.IncidentID      |
915	   |                  |                     |                          |
916	   | Header           |                     | Incident                 |
917	   |                  |                     |                          |
918	   |                  | format version      | EventData.AdditionalData |
919	   |                  |                     | PhishingReport.Version   |
920	   |                  |                     |                          |
921	   |                  | datecreated         | Incident.ReportTime      |
922	   |                  |                     |                          |
923	   |                  | reporterorg         | IncidentID.UID           |
924	   |                  |                     |                          |
925	   |                  | reportername        | Incident.Contact         |
926	   |                  |                     |                          |
927	   |                  | reporteremail       | Incident.Contact.Email   |
928	   |                  |                     |                          |
929	   |                  | reportersignature   | (still in flux in        |
930	   |                  |                     | Incident)                |
931	   |                  |                     |                          |
932	   |                  | comments            | Incident.Description or  |
933	   |                  |                     | Incident.EventData.Descr |
934	   |                  |                     | iption                   |
935	   |                  |                     |                          |
936	   |                  | aggregateflag       | Multiple EventData       |
937	   |                  |                     | structures               |
938	   |                  |                     |                          |
939	   | phish            |                     | EventData.               |
940	   |                  |                     |                          |
941	   |                  | datedetected        | DetectTime               |
942	   |                  |                     |                          |
943	   |                  | phishtype           | EventData.AdditionalData |
944	   |                  |                     | PhishingReport.Eventtype |
945	   |                  |                     |                          |
946	   |                  | datacollectiontype  | EventData.AdditionalData |
947	   |                  |                     | PhishingReport.DataColle |
948	   |                  |                     | ctionType                |
949	   |                  |                     |                          |
950	   |                  | datacollectionsite  | EventData.AdditionalData |
951	   |                  |                     | PhishingReport.DataColle |
952	   |                  |                     | ctionSite                |
953	   |                  |                     |                          |
954	   |                  | originatingsensorty | EventData.AdditionalData |
955	   |                  | pe                  | PhishingReport.Originato |
956	   |                  |                     | rSensor.Type             |
957	   |                  |                     |                          |
958	   |                  | originatingsensorna | EventData.AdditionalData |
959	   |                  | me                  | PhishingReport.Originati |
960	   |                  |                     | ngSensor.SensorName      |
961	   |                  |                     |                          |
962	   |                  | originatingsensorIP | EventData.AdditionalData |
963	   |                  | address             | PhishingReport.Originati |
964	   |                  |                     | ngSensor.IPaddress       |
965	   |                  |                     |                          |
966	   |                  | forensics           | EventData.Record         |
967	   |                  |                     |                          |
968	   |                  | emailsite-url       | PhishReport.DataCollecto |
969	   |                  |                     | rSite.SiteData           |
970	   |                  |                     |                          |
971	   |                  | site-url            | PhishReport.DataCollecto |
972	   |                  |                     | rSite.SiteData           |
973	   |                  |                     |                          |
974	   |                  | emailsubject        | PhishReport.PhishParamet |
975	   |                  |                     | er                       |
976	   |                  |                     |                          |
977	   |                  | takedowndate        | PhishingReport.TakedownI |
978	   |                  |                     | nfo.Date                 |
979	   |                  |                     |                          |
980	   |                  | takedownagency      | EventData.AdditionalData |
981	   |                  |                     | PhishingReport.TakedownI |
982	   |                  |                     | nfo.Agency               |
983	   |                  |                     |                          |
984	   |                  | site-ip             | PhishReport.PhishParamet |
985	   |                  |                     | er and                   |
986	   |                  |                     | PhishReport.DataCollecto |
987	   |                  |                     | rSite.SiteData           |
988	   |                  |                     |                          |
989	   |                  | emailheaders        | PhishRecord.EmailHeaders |
990	   |                  |                     |                          |
991	   |                  | emailbody           | PhishRecord.EmailBody    |
992	   |                  |                     |                          |
993	   |                  | brand               | PhishReport.PhishedBrand |
994	   |                  |                     | Name                     |
995	   |                  |                     |                          |
996	   |                  | senderip            |                          |
997	   |                  |                     |                          |
998	   |                  | otherlink           | PhishingReport.RelatedSi |
999	   |                  |                     | tes                      |
1000	   |                  |                     |                          |
1001	   |                  | correlations        | PhishingReport.Correlati |
1002	   |                  |                     | onData                   |
1003	   |                  |                     |                          |
1004	   |                  | comments            | EventData.               |
1005	   +------------------+---------------------+--------------------------+

1007	C.1  Overall Format

1009	   Each phishing report is encapsulated in the phishingreport element:

1011	   <phishingreport> header followed by one or more reports
1012	   </phishingreport>  One or more phishing reports are included.
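
   The overall layout can be sketched in Python as follows.  This is an
   illustration under stated assumptions, not a normative encoding: the
   child element names "header" and "phish" are taken from the mapping
   table in this appendix, and the helper name and sample field values
   are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_report(header_fields: dict, phishes: list) -> ET.Element:
    """Build a phishingreport element: one header, then one or more phish."""
    if not phishes:
        raise ValueError("a phishingreport needs at least one phish report")
    root = ET.Element("phishingreport")
    header = ET.SubElement(root, "header")  # assumed element name
    for name, value in header_fields.items():
        ET.SubElement(header, name).text = value
    for fields in phishes:
        phish = ET.SubElement(root, "phish")  # assumed element name
        for name, value in fields.items():
            ET.SubElement(phish, name).text = value
    return root


# Hypothetical sample values, for illustration only.
report = build_report(
    {"formatversion": "1.0", "datecreated": "1103068800", "reporterorg": "TMWD"},
    [{"datedetected": "1103068000", "phishtype": "Email"}],
)
xml_text = ET.tostring(report, encoding="unicode")
print(xml_text)
```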

1014	C.2  Header Format

1016	   Each report must have a header.  Each header MUST have:

1018	   <formatversion> version number </formatversion>  The version of the
1019	   XML reporting format, e.g., 1.0.

1021	   <datecreated> 32-bit UNIX time </datecreated>  This is the date
1022	   that the phish report was created.

1024	   <reporterorg> organization that created the phish report
1025	   </reporterorg>  Name or other id of who created the report (not the
1026	   person who submitted it), e.g., TMWD - Tumbleweed Communications
1027	   Corp.  Do we need a unique database of reporternames?

1029	   Each header MAY have:

1031	   <reportername> name of the person who created the report
1032	   </reportername>  Name of an individual.

1034	   <reporteremail> email address of the person who created the report
1035	   </reporteremail>  Email address of the individual who created the
1036	   report.

1038	   <reportersignature> digitalsignature </reportersignature>  An XML
1039	   digital signature of the canonicalized file, everything between the
1040	   <phishingreport> and </phishingreport> tags.  We need the document
1041	   hash, the certs or a URL to them, and the signature.  What format
1042	   should that be?  XML-DSIG?  This verifies the report's authenticity.

1044	   <comments> text </comments>  Any comments that the reporter chooses
1045	   to add to the individual phish report.

1047	   <aggregateflag> 01 or nn </aggregateflag>  This flag indicates
1048	   whether this XML doc represents a single phish attack event or is an
1049	   aggregated document that represents nn discrete events.
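
   A receiver could check the mandatory header fields along the
   following lines.  This is a sketch only: the element names are
   assumed from the member names in the mapping table of this appendix,
   and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Fields this section lists under "Each header MUST have".
REQUIRED_HEADER_FIELDS = ("formatversion", "datecreated", "reporterorg")


def missing_header_fields(header: ET.Element) -> list:
    """Return the required fields that are absent or empty in the header."""
    return [
        name
        for name in REQUIRED_HEADER_FIELDS
        if not (header.findtext(name) or "").strip()
    ]


# Hypothetical header that omits datecreated.
sample = ET.fromstring(
    "<header><formatversion>1.0</formatversion>"
    "<reporterorg>TMWD</reporterorg></header>"
)
print(missing_header_fields(sample))
```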

1051	C.3  Individual Report Format

1053	   Each individual report is encapsulated between phish tags.

1055	   <phish> report fields </phish>  Each individual report is
1056	   encapsulated between the <phish> tags with the uniqueids.

1059	   Each report MUST have:

1061	   <datedetected> 32-bit UNIX time </datedetected>  This is the date
1062	   that the phish was reported by a consumer or detected by a trap or
1063	   other means.
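
   The 32-bit UNIX time used for the date fields can be decoded as
   sketched below.  The sketch assumes the value is a non-negative
   decimal count of seconds since 1970-01-01 UTC that fits in a signed
   32-bit integer; the document itself does not state the signedness,
   and the function name is hypothetical.

```python
from datetime import datetime, timezone

MAX_INT32 = 2**31 - 1  # assumption: signed 32-bit time_t


def parse_unix32(text: str) -> datetime:
    """Convert a decimal 32-bit UNIX time string to an aware UTC datetime."""
    seconds = int(text.strip())
    if not 0 <= seconds <= MAX_INT32:
        raise ValueError(f"not a 32-bit UNIX time: {text!r}")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(parse_unix32("1103068800").isoformat())
```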

1065	   <phishtype> phishtype optional_parameter </phishtype>

1067	   One of the following:

1069	   +-----------+-------------+-----------------------------------------+
1070	   | String    | Parameter   | Description                             |
1071	   +-----------+-------------+-----------------------------------------+
1072	   | Email     |             | a standard email phish, usually sent by |
1073	   |           |             | spam                                    |
1074	   |           |             |                                         |
1075	   | Fraudsite |             | a known fraudulent site that does not   |
1076	   |           |             | necessarily send spam lures             |
1077	   |           |             |                                         |
1078	   | DNSspoof  | malwarename | spoofed DNS (e.g., malware changes the  |
1079	   |           |             | local hosts file so that visits to      |
1080	   |           |             | www.example.com go to an incorrect IP   |
1081	   |           |             | address)                                |
1082	   |           |             |                                         |
1083	   | Keylogger | malwarename | a keylogger site                        |
1084	   |           |             |                                         |
1085	   | OLE       |             | Background OLE Automation               |
1086	   |           |             |                                         |
1087	   | IM        |             | Instant message/NNTP/etc.               |
1088	   |           |             |                                         |
1089	   | CVE       | CVEnumber   | CVE number?? For malware exploits. Or   |
1090	   |           |             | is this the keylogger malwarename?      |
1091	   +-----------+-------------+-----------------------------------------+

1104	   The optional parameter is a string, without whitespace, that may be
1105	   used to name the malware that installed the keylogger or the
1106	   DNSspoofer.
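
   A phishtype value could be split into its type string and optional
   parameter as sketched below.  The sketch assumes the table above
   defines the type strings Email, Fraudsite, DNSspoof, Keylogger, OLE,
   IM and CVE; the function name and the sample malware name are
   hypothetical.

```python
# Assumed set of type strings, read from the phishtype table.
PHISH_TYPES = {"Email", "Fraudsite", "DNSspoof", "Keylogger", "OLE", "IM", "CVE"}


def parse_phishtype(text: str):
    """Split 'type [parameter]' into (type, parameter or None).

    The parameter must not contain whitespace, so at most two
    whitespace-separated tokens are accepted.
    """
    parts = text.split()
    if not parts or len(parts) > 2:
        raise ValueError(f"expected 'type [parameter]': {text!r}")
    kind = parts[0]
    if kind not in PHISH_TYPES:
        raise ValueError(f"unknown phishtype: {kind!r}")
    parameter = parts[1] if len(parts) == 2 else None
    return kind, parameter


print(parse_phishtype("Keylogger examplemalware"))
print(parse_phishtype("Email"))
```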

1108	   Each report MAY have:

1110	   <datacollectiontype> type </datacollectiontype>

1112	   The method of data collection.  This is derived from the victim's
1113	   computer (e.g., by analyzing the email lure or malware sent to them).
1114	   One of the following:

1116	   +---------------------------------+---------------------------------+
1117	   | String                          | Description                     |
1118	   +---------------------------------+---------------------------------+
1119	   | Web                             | User is redirected to a website |
1120	   |                                 | that collects the data          |
1121	   |                                 |                                 |
1122	   | EmailForm                       | A form is embedded in the email |
1123	   |                                 | lure                            |
1124	   |                                 |                                 |
1125	   | Keylogger                       | Some form of keystroke logger   |
1126	   |                                 |                                 |
1127	   | Automation                      | Other form of automation such   |
1128	   |                                 | as a background OLE automation  |
1129	   +---------------------------------+---------------------------------+

1131	   NOTE: This is somewhat redundant with phishtype, especially for a
1132	   keylogger.

1134	   <datacollectionsite> type optional_parameters </datacollectionsite>

1136	   Where the data is sent.  This can be found by seizing a capture site
1137	   and analyzing the code on the server.  One of the following:

1139	   +----------------------+----------------------+---------------------+
1140	   | String               | Parameter            | Description         |
1141	   +----------------------+----------------------+---------------------+
1142	   | Web                  |                      | Data is collected   |
1143	   |                      |                      | on a website.       |
1144	   |                      |                      | Emailsite-url and   |
1145	   |                      |                      | site-url fields are |
1146	   |                      |                      | used to specify the |
1147	   |                      |                      | location of the     |
1148	   |                      |                      | site.               |
1149	   |                      |                      |                     |
1150	   | Email                | addr, addr           | Data is sent to one |
1151	   |                      |                      | or more email       |
1152	   |                      |                      | addresses, comma    |
1153	   |                      |                      | separated.          |
1154	   |                      |                      |                     |
1155	   | IP                   | ip, ip               | Data is sent to one |
1156	   |                      |                      | or more IP          |
1157	   |                      |                      | addresses, comma    |
1158	   |                      |                      | separated. (How to  |
1159	   |                      |                      | specify a protocol, |
1160	   |                      |                      | e.g., IRC?)         |
1161	   +----------------------+----------------------+---------------------+
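
   The comma-separated parameter list in the table above could be
   parsed as sketched here.  The function name and sample addresses are
   hypothetical, and the sketch does not validate the individual email
   or IP addresses.

```python
SITE_TYPES = ("Web", "Email", "IP")


def parse_datacollectionsite(text: str):
    """Split 'type addr, addr' into (type, [addr, ...])."""
    kind, _, rest = text.strip().partition(" ")
    if kind not in SITE_TYPES:
        raise ValueError(f"unknown data collection site type: {kind!r}")
    # Parameters are comma separated; surrounding blanks are ignored.
    targets = [item.strip() for item in rest.split(",") if item.strip()]
    return kind, targets


print(parse_datacollectionsite("Email a@example.com, b@example.net"))
print(parse_datacollectionsite("Web"))
```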

1163	   <originatingsensortype> type </originatingsensortype>

1165	   The type of technology that generated this XML document.  One of the
1166	   following:

1168	   +---------------------------------+---------------------------------+
1169	   | String                          | Description                     |
1170	   +---------------------------------+---------------------------------+
1171	   | Web Server/Service              | This XML doc was generated by a |
1172	   |                                 | web server or service           |
1173	   |                                 |                                 |
1174	   | Web gateway                     | This XML doc was generated by a |
1175	   |                                 | web gateway                     |
1176	   |                                 |                                 |
1177	   | Mail gateway                    | This XML doc was generated by a |
1178	   |                                 | mail gateway                    |
1179	   |                                 |                                 |
1180	   | Browser element                 | This XML doc was generated by a |
1181	   |                                 | web browser element (i.e.       |
1182	   |                                 | plugin)                         |
1183	   |                                 |                                 |
1184	   | ISP sensor                      | An ISP sensor generated this    |
1185	   |                                 | XML document                    |
1186	   |                                 |                                 |
1187	   | Human                           | A human generated this XML      |
1188	   |                                 | doc/report (e.g., discovered a  |
1189	   |                                 | phishing base camp)             |
1190	   |                                 |                                 |
1191	   | Other technology                | This XML doc was generated by   |
1192	   |                                 | some other technology           |
1193	   +---------------------------------+---------------------------------+

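1195	   <originatingsensorname> name </originatingsensorname>

1197	   The DNS name of the entity that generated this XML document.

1199	   <originatingsensorIPaddress> IP address </originatingsensorIPaddress>

1201	   The IP address of the entity that generated this XML document.

1203	   <forensics> strings </forensics>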

1205	   Any length of strings of forensic information.  Useful for law
1206	   enforcement.  This could be watermarks in images, comments in HTML
1207	   fields, poisoned user data.

1209	   <emailsite-url> URL </emailsite-url>

1211	   This is the base URI of the phishing site that is included in the
1212	   email lure.  Email spam filters can use it to detect and filter out
1213	   phishing emails, for example by posting it to SURBL.  It can also be
1214	   used in a Web browser to access the phishing site.

1216	   If the site is an SSL site, then the URL specifies https://URL

1218	   <site-url> URL </site-url>

1220	   This is the URI of the phishing data collection site that the browser
1221	   actually goes to in order to post data.  This may differ from the
1222	   emailsite-url, because the URL included in the email might redirect
1223	   users to the actual data collection site, which is the site-url.  The
1224	   emailsite-url is useful for spam filters, the site-url is useful for
1225	   takedown, law enforcement, or web proxy filters to prevent users from
1226	   visiting the collection site.

1228	   If the site is an SSL site, then the URL specifies https://URL
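
   The relationship between the two URL fields can be illustrated with
   a short Python sketch.  The function names and sample URLs are
   hypothetical; comparing host names is only one possible way to tell
   whether the lure redirects to a separate collection site.

```python
from urllib.parse import urlsplit


def uses_ssl(url: str) -> bool:
    """True when the URL names an SSL site, i.e. uses the https scheme."""
    return urlsplit(url).scheme.lower() == "https"


def lure_redirects(emailsite_url: str, site_url: str) -> bool:
    """True when the lure URL and the collection URL name different hosts."""
    return urlsplit(emailsite_url).hostname != urlsplit(site_url).hostname


lure = "http://lure.example.com/login"
collector = "https://collect.example.net/post"
print(uses_ssl(collector), lure_redirects(lure, collector))
```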

1230	   <emailsubject> subject </emailsubject>

1232	   The subject of the email phish lure.

1234	   <takedowndate> UNIX 32-bit time </takedowndate>

1235	   If the site has been taken down, this is the date and time when that
1236	   was effected.  Which site?  Redirector or data collection site?
1237	   Multiples with designator?

1239	   <takedownagency> string </takedownagency>

1241	   Who took the site down.  If more than one party took it down, you may
1242	   list multiple parties as freeform text here, or have multiple
1243	   takedownagency fields.

1245	   <site-ip> IP address (port number optional; default 80) </site-ip>

1247	   The IP address of the server hosting the phishing site, in standard
1248	   IP address format A.B.C.D:portnum.  If no port number is specified,
1249	   port 80 is assumed.
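
   The address and optional port described above could be parsed as
   sketched below, with port 80 as the default.  The sketch handles
   IPv4 dotted-quad addresses only; the function name and the sample
   addresses are hypothetical.

```python
import ipaddress

DEFAULT_PORT = 80  # assumed when no port number is given


def parse_site_ip(text: str):
    """Split 'address[:port]' into (IPv4Address, port)."""
    host, sep, port_text = text.strip().partition(":")
    # IPv4Address raises AddressValueError (a ValueError) on bad input.
    address = ipaddress.IPv4Address(host)
    port = int(port_text) if sep else DEFAULT_PORT
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return address, port


print(parse_site_ip("192.0.2.10"))
print(parse_site_ip("192.0.2.10:8080"))
```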

1251	   These IP addresses could be used by ISPs and web filters to block
1252	   access to servers.  However, this is dangerous if the sites are
1253	   running on hacked servers or at ISPs that also host legitimate
1254	   sites.  It can be very useful for filtering out access to servers
1255	   that have hijacked DNS, for example by modifying local hosts files
1256	   (e.g., 11.1.2004).

1258	   <emailbody> body </emailbody>

1260	   The body of the email.  I think we need the uniqueid strings.

1262	   What about when the body is an image only?  E.g., a GDI exploit to
1263	   install a keylogger, or a single image with a hyperlink.

1265	   <emailheaders> headers </emailheaders>

1267	   The headers of the email.

1269	   Do we need to create XML records for each entry in a decomposed
1270	   header?  No, only the open relays, the apparent source, and possibly
1271	   a few others.

1273	   <brand> brand name </brand>

1275	   The name of the company whose brand is being used to launch the
1276	   phishing attack.

1278	   <senderip> IP address (optional port number) </senderip>

1280	   The IP address of the mail server or relay that delivered the
1281	   phishing email.  This can be used for RBLs.  A single attack may have
1282	   multiple senderips if the mail was sent from multiple relays.

1284	   <otherlink> URL </otherlink>

1286	   Links to non-phish sites that may be relevant (victim site, other
1287	   sites).

1289	   <correlations> strings </correlations>

1291	   Any correlations to known phishing kits or groups.  Freeform text.

1293	   <comments> text </comments>

1295	   Any freeform text comments that the reporter chooses to add to the
1296	   individual phish report, e.g., "images sourced from victim
1297	   online-banking site" or "background popup populated with victim
1298	   privacy statement".

1300	Appendix D.  Still To Do in This Document

1302	   This appendix will be removed when it is empty.

1304	   The following tasks still need to be completed within this
1305	   document.
1306	      1.  Make a test report that verifies every possible option.
1307	      2.  Finish and insert the schema.
1308	      3.  Add more detail on what the specific elements mean.

1310	Intellectual Property Statement

1312	   The IETF takes no position regarding the validity or scope of any
1313	   Intellectual Property Rights or other rights that might be claimed to
1314	   pertain to the implementation or use of the technology described in
1315	   this document or the extent to which any license under such rights
1316	   might or might not be available; nor does it represent that it has
1317	   made any independent effort to identify any such rights.  Information
1318	   on the procedures with respect to rights in RFC documents can be
1319	   found in BCP 78 and BCP 79.

1321	   Copies of IPR disclosures made to the IETF Secretariat and any
1322	   assurances of licenses to be made available, or the result of an
1323	   attempt made to obtain a general license or permission for the use of
1324	   such proprietary rights by implementers or users of this
1325	   specification can be obtained from the IETF on-line IPR repository at
1326	   http://www.ietf.org/ipr.

1328	   The IETF invites any interested party to bring to its attention any
1329	   copyrights, patents or patent applications, or other proprietary
1330	   rights that may cover technology that may be required to implement
1331	   this standard.  Please address the information to the IETF at
1332	   ietf-ipr@ietf.org.

1334	Disclaimer of Validity

1336	   This document and the information contained herein are provided on an
1337	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1338	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1339	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1340	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1341	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1342	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1344	Copyright Statement

1346	   Copyright (C) The Internet Society (2004).  This document is subject
1347	   to the rights, licenses and restrictions contained in BCP 78, and
1348	   except as set forth therein, the authors retain all their rights.

1350	Acknowledgment

1352	   Funding for the RFC Editor function is currently provided by the
1353	   Internet Society.