2	Network Working Group                                      G. Golovinsky
3	Internet-Draft                                               S. Johnston
4	Intended status: Experimental
5	Expires: April 12, 2013                                          D. Birk
6	                                           Ruhr University Bochum; Horst
7	                                                Goertz Institute for IT
8	                                                                Security
9	                                                         October 9, 2012

11	        Syslog Extension for Cloud Using Syslog Structured Data
12	             draft-golovinsky-cloud-services-log-format-03

14	Abstract

16	   This document provides an open and extensible log format to be used
17	   by any cloud entity or cloud application to log and trace activities
18	   that occur in the cloud.  The logs and traces can be utilized for
19	   billing, charging, and debugging purposes.  In addition, these logs
20	   and traces are equally applicable for cloud infrastructure (IaaS),
21	   platform (PaaS), and application (SaaS) services.  CloudLog
22	   differs from traditional logging in content, but not in nature,
23	   because it takes into account the transient nature of identities
24	   and resources in the cloud.

26	Status of this Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at http://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on April 12, 2013.

43	Copyright Notice

45	   Copyright (c) 2012 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Table of Contents

60	   1.  Introduction
61	   2.  Conventions Used in This Document
62	   3.  Problem Statement
63	     3.1.  Scope of the application
64	     3.2.  The Traditional Logging and its Applications
65	     3.3.  Challenges with the cloud deployment
66	       3.3.1.  SaaS Use Case
67	       3.3.2.  PaaS Use Case
68	       3.3.3.  IaaS Use Case
69	   4.  Cloud Log Structured Data Definitions
70	     4.1.  SD-ELEMENT context
71	       4.1.1.  SD-PARAM aid - Mandatory
72	       4.1.2.  SD-PARAM provider - Optional
73	       4.1.3.  SD-PARAM rid - Optional
74	       4.1.4.  SD-PARAM eid - Optional
75	     4.2.  SD-ELEMENT transit
76	       4.2.1.  SD-PARAM client - Mandatory
77	       4.2.2.  SD-PARAM gw - Optional
78	   5.  Log Format Samples
79	     5.1.  Log Sample of Simple Non-Authenticated Request
80	     5.2.  Successful Authenticated User Request
81	     5.3.  Log Sample of Successful Request on Behalf of Another
82	           Identity
83	   6.  Security Considerations
84	   7.  IANA Considerations
85	     7.1.  SD-IDs
86	   8.  Normative References
87	   Authors' Addresses

89	1.  Introduction

91	   This document describes a standard for syslog structured data
92	   elements in messages generated by services that may be running on
93	   different physical or virtual machines when those services are
94	   processing information generated by a single request.  The purpose
95	   is to provide an audit trail that allows correlation of such
96	   messages.  In addition, this document defines a number of parameters
97	   that MUST or SHOULD be included in these structured data elements so
98	   these messages can be used to identify users of such services, when
99	   the real and/or effective identities of users are known.

101	2.  Conventions Used in This Document

103	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
104	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
105	   document are to be interpreted as described in RFC 2119 [RFC2119].

107	3.  Problem Statement

109	3.1.  Scope of the application

111	   The three service models defined by NIST (IaaS, PaaS, and SaaS)
112	   differ in the way cloud services are offered to customers.  Hence,
113	   besides the general logging concepts that apply to all three service
114	   models alike, individual logging measures that reflect the specific
115	   circumstances of each service model have to be taken into
116	   account.

118	3.2.  The Traditional Logging and its Applications

120	   Practically all hardware and software entities deployed on the
121	   network log their activities.  Network elements such as routers,
122	   servers, firewalls and switches log information about their
123	   activities using mostly Syslog (except for Windows).  Applications
124	   running on the network also log activities, but often using
125	   proprietary mechanisms.  While logging mechanisms are inconsistent
126	   between different entities - Syslog, Windows events, proprietary
127	   files - they generally carry enough information to identify the type
128	   of activity, the time of occurrence, the physical entity involved in
129	   the event, and often the user(s) that participated in the event.
130	   Availability of this information is crucial for accomplishing
131	   multiple business objectives ranging from assuring security and
132	   performing forensics to adhering to compliance regulations (SOX, PCI,
133	   etc.).  The existence of logs and information in them is necessary,
134	   but not sufficient for achieving security, compliance and other
135	   business objectives.  Collecting, processing, searching and even
136	   simply interpreting information in logs is an exceptionally
137	   labor-intensive and time-consuming process that often cannot be
138	   done on any meaningful scale without appropriate tools in place.
139	   Log management tools used to solve the problem of scale and
140	   interpretation depend heavily on the fact that the format of logs is
141	   largely well defined and understood.

143	3.3.  Challenges with the cloud deployment

145	   In cloud deployments the situation with the availability of logs and
146	   the reliability of information in them is drastically different.  By
147	   definition, cloud resources are shared.  A piece of hardware may now
148	   run multiple virtual instances, which can be brought up and down
149	   within a very short period of time, and at any given moment the
150	   hardware can be shared not just by different users but by different
151	   users from different companies.  Even if Linux or Windows VMs
152	   continue to log their activity, the information in these logs is
153	   very likely to be irrelevant, since the logs cannot really be tied
154	   to a physical entity.  Moreover, even if one managed to map logs to
155	   a physical entity, there is no guarantee that the same VM image
156	   will be running on the same hardware in its next incarnation.
157	   There is also no clear way to determine how many users share the
158	   hardware and what their identities and roles are.  Tracing
159	   environmental changes is a practically impossible task unless there
160	   is traceability between physical and virtual entities.  As a result,
161	   achieving such business objectives as adhering to compliance
162	   regulations or performing regular security auditing is very
163	   difficult, if not impossible.

165	   Generally, logging mechanisms for cloud environments do not differ
166	   from traditional logging mechanisms in the way they work.  However,
167	   the circumstances of the cloud environment call for additional
168	   measures.  Customers mostly rely on the cloud service provider (CSP)
169	   when logging data is required.  In SaaS scenarios, customers have
170	   almost no chance to add logging features to the application.  The
171	   situation changes slightly for PaaS, and in IaaS it still poses a
172	   considerable problem.  Hence, logging standards should be applied by
173	   the CSP in order to improve this situation for the customers.  The
174	   following use cases underline the need for an additional standard and
175	   the differentiation between the various cloud services.

177	3.3.1.  SaaS Use Case

179	   In SaaS scenarios, the CSP holds full control over the application
180	   itself and the offered services.  The customer mainly uses a client
181	   device for communicating with a specific API offered by the CSP.  In
182	   most cases, the user agent on the client is a web browser
183	   communicating with a web application located on the server
184	   infrastructure of the CSP.  Unfortunately, the customer does not
185	   have the ability to manage or control the underlying cloud
186	   infrastructure, network components, servers, operating systems etc.
187	   Hence, the CSP has to provide additional logging mechanisms to
188	   improve this situation.  In the case of a web-based email service,
189	   the customer has almost no chance to figure out whether their
190	   account has been compromised or accessed from an unknown IP address.
191	   Even though some providers list the most recent IP addresses that
192	   accessed the application, this does not solve the problem of NAT or
193	   proxies.  Furthermore, if the customer's account has been
194	   compromised, the customer cannot determine which emails have been
195	   edited or accessed by the adversary.  Additional fine-grained logging
196	   mechanisms could improve this situation for the customer and could
197	   even make forensic investigations possible in case of an account
198	   compromise.

200	3.3.2.  PaaS Use Case

202	   The logging situation in PaaS scenarios changes slightly compared to
203	   SaaS.  The CSP decides which system-specific logging information is
204	   provided to the customers; however, the application deployed by the
205	   customer can contain hard-coded logging features.  This unfortunately
206	   requires the underlying OS environment to support them.  For
207	   instance, the application could contain mechanisms which transfer
208	   encrypted and signed logging data to third-party logging servers in
209	   real time.  CSPs claim that the transfer of data between the PaaS
210	   instance and the corresponding database backend is encrypted.  This
211	   can hardly be confirmed by the customer.  Hence, customers should not
212	   rely on such promises but apply their own logging mechanisms as far
213	   as possible.  This logging information could be complemented by
214	   information provided by the CSP which cannot be extracted directly
215	   by the customer application.

217	3.3.3.  IaaS Use Case

219	   In IaaS cloud environments the situation with the availability of
220	   logs and the reliability of information in them is somewhat better.
221	   The customers can prepare their VM for logging purposes and control
222	   the single instance.  Therefore, crucial application-specific
223	   logging information can be collected by the customers themselves,
224	   with the caveat that the CSP could, in theory, maliciously or
225	   unintentionally modify this logging information.  Unfortunately, by
226	   definition, cloud resources are shared.  This means the customer
227	   could share the same physical host with a potential adversary.
228	   Hence, it is of great importance whether the customer shares the
229	   physical host with any other tenant or is the only virtual instance.
230	   This information cannot be obtained by the customer without the help
231	   of the CSP.  This situation is further complicated by the flexibility
232	   of the cloud.  Within a short span of time, virtual instances can be
233	   transferred to other physical hosts without the knowledge of the
234	   customer.  These transfers cannot be detected and logged by the
235	   customer without the assistance of the CSP.  IaaS cloud environments
236	   should provide the ability to detect and log the binding of the
237	   virtual instance to specific hardware.  For an exhaustive forensic
238	   analysis of an incident, this information is of great
239	   importance.  Moreover, network components containing important
240	   information about the network in which the instance is deployed
241	   cannot be accessed by the customer without the help of the CSP.  As a
242	   result, achieving such business objectives as adhering to compliance
243	   regulations or performing regular security auditing is very
244	   difficult, if not impossible.

246	4.  Cloud Log Structured Data Definitions

248	   1.  RUI - real user identity, the identity of the user that has
249	       authenticated to the entity.

251	   2.  EUI - effective or impersonated user identity, the identity of
252	       the user that the real user identity is acting for.  For example,
253	       an administrator account could have the ability to impersonate
254	       another user account.

256	   3.  Provider - the domain, service, application, or other entity
257	       providing the user identities.

259	   Structured data elements, defined in RFC 5424 [RFC5424], provide a
260	   mechanism for adding data to syslog messages.  Since additional data
261	   is necessary to trace user identities and their activities in the
262	   cloud, we use the mechanism of structured data elements to provide
263	   this additional information in the syslog messages.

265	4.1.  SD-ELEMENT context

267	   The SD-ELEMENT identified by the SD-ID "context" defines the context
268	   of the external request that causes the activity to take place.
269	   The syslog message that is generated as a result of this activity
270	   should be identified by this "context".

272	4.1.1.  SD-PARAM aid - Mandatory

274	   The parameter "aid" represents the audit identifier, which uniquely
275	   identifies an external request for activity.  The value is a
276	   UTF-8-STRING representation of the UUID generated by the entity when
277	   the request is received.

279	   This parameter MUST be present within the SD-ELEMENT "context".
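
   As an illustration only, an entity could produce the "aid" value
   when a request arrives roughly as sketched below in Python.  The
   function name new_audit_id is hypothetical.  The use of a random
   (version 4) UUID and of upper-case hexadecimal digits are
   assumptions; this document does not mandate a UUID version or a
   letter case.

```python
import uuid


def new_audit_id() -> str:
    """Return a new audit identifier ("aid") as a UUID string."""
    # Assumption: a random (version 4) UUID; no UUID version is mandated.
    # Assumption: upper-case hex digits, mirroring the samples in Section 5.
    return str(uuid.uuid4()).upper()


if __name__ == "__main__":
    print(new_audit_id())
```

   The value would be generated once per external request and then
   reused in every message produced while serving that request.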

281	4.1.2.  SD-PARAM provider - Optional

283	   The parameter "provider" represents the provider of the identity for
284	   the Real User Identity ('rid') and the Effective User Identity
285	   ('eid').  User identities do not always exist or are not always
286	   available.  When they are, either "rid" or "eid" MUST be present in
287	   the syslog messages.

289	   The parameter "provider" is not required, but SHOULD be present
290	   within the SD-ELEMENT "context" when either the 'rid' or 'eid'
291	   identifiers are present.

293	4.1.3.  SD-PARAM rid - Optional

295	   The parameter "rid" represents the real user identity.

297	   This parameter SHOULD be present within the SD-ELEMENT "context" when
298	   the real user identity is available.

300	4.1.4.  SD-PARAM eid - Optional

302	   The parameter "eid" represents the effective user identity.  This
303	   parameter SHOULD be present within the SD-ELEMENT "context" when user
304	   impersonation has happened and the effective user identity is
305	   available.
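
   The rules for the "context" parameters can be sketched in Python as
   follows.  This is a minimal illustration, not a reference
   implementation: the function names are hypothetical, the parameter
   order simply mirrors the samples in Section 5, and the escaping of
   the characters '"', '\' and ']' follows the PARAM-VALUE rules of
   RFC 5424 [RFC5424].

```python
from typing import Optional


def escape_param_value(value: str) -> str:
    """Escape backslash, double quote and ']' with a backslash."""
    # The backslash is handled first so that the backslashes added for
    # the other two characters are not escaped a second time.
    for char in ('\\', '"', ']'):
        value = value.replace(char, '\\' + char)
    return value


def format_context(aid: str, provider: Optional[str] = None,
                   rid: Optional[str] = None,
                   eid: Optional[str] = None) -> str:
    """Build the "context" SD-ELEMENT; only "aid" is mandatory."""
    if not aid:
        raise ValueError('"aid" MUST be present in "context"')
    # Assumption: parameter order follows the samples in Section 5;
    # no order is prescribed.
    params = [("aid", aid), ("eid", eid), ("provider", provider), ("rid", rid)]
    body = "".join(
        f' {name}="{escape_param_value(value)}"'
        for name, value in params
        if value is not None
    )
    return f"[context{body}]"


if __name__ == "__main__":
    print(format_context("149683FC-8DF5-1004-E1A8-00000A000152",
                         provider="example.com", rid="1:123"))
```

   Optional parameters that are not known are simply omitted, so the
   same helper covers the unauthenticated, authenticated, and
   impersonation cases.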

312	4.2.  SD-ELEMENT transit

314	   The SD-ELEMENT identified by the SD-ID "transit" defines logical
315	   gateway entities which were traversed while request for activity was
316	   routed to the final destination entity that would satisfy the
317	   request.

319	4.2.1.  SD-PARAM client - Mandatory

321	   The parameter "client" represents the IP address or Fully Qualified
322	   Domain Name (FQDN) of the client entity on behalf of which the
323	   request is being made.  This is different from the "ip" parameter of
324	   the SD-ID "origin" in RFC 5424 [RFC5424], which carries the IP
325	   address of the entity producing the log message itself.  IPv4 or
326	   IPv6 addresses MUST be represented as UTF-8-STRING.
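
   Because the "client" value may be either an IP address or an FQDN, a
   consumer of these messages may need to tell the two apart.  A
   minimal Python sketch, using the standard ipaddress module, is shown
   below.  The function name classify_client is hypothetical, and
   treating every value that is not an IP literal as an FQDN is an
   assumption; no DNS syntax validation is attempted.

```python
import ipaddress


def classify_client(value: str) -> str:
    """Return "ipv4", "ipv6", or "fqdn" for a "client" value."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        # Assumption: anything that is not an IP literal is treated as
        # an FQDN; the name itself is not validated.
        return "fqdn"
    return "ipv4" if address.version == 4 else "ipv6"


if __name__ == "__main__":
    for sample in ("172.16.1.82", "2001:db8::1", "client.example.com"):
        print(sample, classify_client(sample))
```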

330	4.2.2.  SD-PARAM gw - Optional

332	   The parameter "gw" represents a gateway entity through which the
333	   request for activity passes before arriving at the final destination
334	   entity actually responsible for processing the request.  The value
335	   of the parameter consists of the UTF-8-STRING representation of the
336	   UUID identifying the gateway entity, a colon character (i.e.
337	   ':'), and finally the UTF-8-STRING representation of the IP address
338	   or FQDN of the gateway through which the request has been routed.

340	   This parameter MAY appear more than once within the SD-ELEMENT
341	   "transit", as a request may pass through multiple gateway entities.
342	   Each occurrence represents a different gateway through which the
343	   request passed.
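
   The "gw" value layout (UUID, colon, address) and its repetition
   inside "transit" can be sketched in Python as follows.  All function
   names are hypothetical and the gateway UUID and addresses in the
   usage example are made-up illustration values.  The sketch splits at
   the first colon only, since a UUID string contains no colon while an
   IPv6 address does.

```python
import uuid
from typing import List, Tuple


def format_gw(gateway_id: uuid.UUID, address: str) -> str:
    """Return a "gw" value: the gateway UUID, a colon, then its address."""
    # Assumption: upper-case UUID text, mirroring the "aid" samples.
    return f"{str(gateway_id).upper()}:{address}"


def parse_gw(value: str) -> Tuple[uuid.UUID, str]:
    """Split a "gw" value at the first colon only.

    Everything after the first colon belongs to the address, so IPv6
    addresses survive the split intact.
    """
    gateway_id, address = value.split(":", 1)
    return uuid.UUID(gateway_id), address


def format_transit(client: str, gateways: List[str]) -> str:
    """Build a "transit" SD-ELEMENT with one "gw" per traversed gateway."""
    # Assumption: the values contain no double quote, backslash or ']',
    # which RFC 5424 would require to be escaped.
    params = [f' client="{client}"'] + [f' gw="{gw}"' for gw in gateways]
    return "[transit" + "".join(params) + "]"


if __name__ == "__main__":
    gateway = uuid.UUID("9be817eb-8acc-1004-d9df-00000a00065e")
    hops = [format_gw(gateway, "2001:db8::1"),
            format_gw(gateway, "gw2.example.com")]
    print(format_transit("56.2.222.83", hops))
```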

345	5.  Log Format Samples

347	5.1.  Log Sample of Simple Non-Authenticated Request

349	   Here is an example of a log produced as a result of a simple non-
350	   authenticated request to a web service.  Only the mandatory
351	   parameters "aid" and "client" are present.

353	   Jul 7 09:01:40 [context aid="9BE817EB-8ACC-1004-D9DF-
354	   00000A00065E"][transit client="56.2.222.83"] Initializing request to
355	   /example_api/index

357	   Jul 7 09:01:40 [context aid="9BE817EB-8ACC-1004-D9DF-
358	   00000A00065E"][transit client="56.2.222.83"] "64.39.0.40" - "1023"
359	   ""GET /example_api/index HTTP/1.1"" 200 2543 -- performed in 600 ms

361	5.2.  Successful Authenticated User Request

363	   Here is an example of a simple request including user authentication.
364	   Note that the 'provider' and 'rid' SD-PARAMs are added to the message
365	   after the user has authenticated to the service, and that those
366	   parameters are included in each subsequent message.

368	   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-
369	   00000A000152"][transit client="172.16.1.82"] Initializing request to
370	   /api/example:instance/1

372	   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
373	   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
374	   User authentication successful for 1:123

376	   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
377	   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
378	   "172.16.1.82" - "-" ""GET /api/example:instance/1 HTTP/1.1"" 200 119
379	   -- performed in 2 ms

381	5.3.  Log Sample of Successful Request on Behalf of Another Identity

383	   Here is a request made by an authenticated user on behalf of another
384	   identity.  Note that the parameter "eid" is added after the user
385	   authentication takes place and the effective user identity is
386	   validated.  This parameter is included in each subsequent message.

388	   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-
389	   00000A000152"][transit client="172.16.1.82"] Initializing request to
390	   /api/example:instance/1

392	   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
393	   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
394	   User authentication successful for 1:123

396	   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
397	   eid="2:456" provider="example.com" rid="1:123"][transit
398	   client="172.16.1.82"] User impersonation successful for 1:123 to
399	   2:456

401	   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
402	   eid="2:456" provider="example.com" rid="1:123"][transit
403	   client="172.16.1.82"] "172.16.1.82" - "-" ""GET /api/
404	   example:instance/1 HTTP/1.1"" 200 119 -- performed in 2 ms
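
   To illustrate how a log consumer might use these samples, the Python
   sketch below extracts the structured data elements from a message
   line and groups lines by their "aid" value.  It is a simplified
   illustration under stated assumptions: the function names are
   hypothetical, each message is assumed to occupy a single unwrapped
   line, and the regular expressions cover only the bracketed element
   syntax seen in the samples rather than the full RFC 5424 grammar.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# One SD-ELEMENT: '[' SD-ID, then zero or more ' name="value"' pairs, ']'.
SD_ELEMENT = re.compile(
    r'\[(?P<id>[^\s\]="]+)'
    r'(?P<params>(?: [^\s\]="]+="(?:[^"\\\]]|\\.)*")*)\]'
)
SD_PARAM = re.compile(r' (?P<name>[^\s\]="]+)="(?P<value>(?:[^"\\\]]|\\.)*)"')


def parse_structured_data(line: str) -> Dict[str, Dict[str, List[str]]]:
    """Map each SD-ID in the line to its parameters (values kept as lists)."""
    elements: Dict[str, Dict[str, List[str]]] = {}
    for element in SD_ELEMENT.finditer(line):
        params: Dict[str, List[str]] = defaultdict(list)
        for param in SD_PARAM.finditer(element.group("params")):
            # Undo the backslash escaping of '"', '\' and ']'.
            value = re.sub(r'\\(["\\\]])', r'\1', param.group("value"))
            params[param.group("name")].append(value)
        elements[element.group("id")] = dict(params)
    return elements


def group_by_aid(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect the lines that share the same "aid" in "context"."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        context = parse_structured_data(line).get("context", {})
        for aid in context.get("aid", []):
            grouped[aid].append(line)
    return dict(grouped)


if __name__ == "__main__":
    sample = ('Aug 16 13:34:18 [context '
              'aid="149683FC-8DF5-1004-E1A8-00000A000152" '
              'provider="example.com" rid="1:123"]'
              '[transit client="172.16.1.82"] '
              'User authentication successful for 1:123')
    print(parse_structured_data(sample))
```

   Parameter values are kept as lists because a parameter such as "gw"
   may occur more than once within one element.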

406	6.  Security Considerations

408	   In addition to the general syslog security considerations discussed
409	   in RFC 5424 [RFC5424], the information contained in these messages
410	   may reveal how services interact, user identities, and other
411	   information about network or service inventory.

413	   Users should not have access to these messages if they would not have
414	   access to this information through other authenticated means.

416	7.  IANA Considerations

417	7.1.  SD-IDs

419	   IANA is requested to register the syslog structured data element
420	   SD-IDs and PARAM-NAMEs shown below:

422	                   +---------+------------+-----------+
423	                   | SD-ID   | PARAM-NAME | Status    |
424	                   +---------+------------+-----------+
425	                   | context |            | OPTIONAL  |
426	                   |         | aid        | MANDATORY |
427	                   |         | eid        | OPTIONAL  |
428	                   |         | provider   | OPTIONAL  |
429	                   |         | rid        | OPTIONAL  |
430	                   | transit |            | OPTIONAL  |
431	                   |         | client     | MANDATORY |
432	                   |         | gw         | OPTIONAL  |
433	                   +---------+------------+-----------+

435	                                  Table 1
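
   The table can also be read as a small validation rule set.  The
   Python sketch below encodes it as a dictionary and reports missing
   mandatory parameters and unknown parameter names.  The names
   REGISTRY and check_elements are hypothetical, and the input layout
   (SD-ID mapped to parameter names, each with a list of values) is an
   assumption made for this illustration.

```python
from typing import Dict, List

# SD-ID -> {PARAM-NAME: True if mandatory}, transcribed from Table 1.
REGISTRY: Dict[str, Dict[str, bool]] = {
    "context": {"aid": True, "eid": False, "provider": False, "rid": False},
    "transit": {"client": True, "gw": False},
}


def check_elements(elements: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Return a list of problems found; an empty list means none."""
    problems: List[str] = []
    for sd_id, params in elements.items():
        known = REGISTRY.get(sd_id)
        if known is None:
            continue  # SD-IDs outside this document are not checked here.
        for name, mandatory in known.items():
            if mandatory and name not in params:
                problems.append(
                    f'{sd_id}: missing mandatory parameter "{name}"')
        for name in params:
            if name not in known:
                problems.append(f'{sd_id}: unknown parameter "{name}"')
    return problems


if __name__ == "__main__":
    print(check_elements({"context": {"rid": ["1:123"]}}))
```

   Both SD-IDs are optional, so an element that is absent from the
   message is not reported; only elements that are present are checked
   for their mandatory parameters.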

437	8.  Normative References

439	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
440	              Requirement Levels", BCP 14, RFC 2119, March 1997.

442	   [RFC5424]  Gerhards, R., "The Syslog Protocol", RFC 5424,
443	              March 2009.

444	Authors' Addresses

446	   Gene Golovinsky
447	   Redwood City, CA  94065
448	   US

450	   Phone: (650)8016259
451	   Email: ggolovinsky@qualys.com
452	   URI:   NA

454	   Sam Johnston

456	   Phone:
457	   Email: samj@samj.net

458	   Dominik Birk
459	   Ruhr University Bochum; Horst Goertz Institute for IT Security
460	   Bochum,   44780
461	   Germany

463	   Phone: +49(0)234-32-26740
464	   Email: dominik.birk@rub.de
465	   URI: