Network Working Group                                      G. Golovinsky
Internet-Draft                                               S. Johnston
Intended status: Experimental
Expires: April 12, 2013                                          D. Birk
                                           Ruhr University Bochum; Horst
                                                 Goertz Institute for IT
                                                                Security
                                                         October 9, 2012


        Syslog Extension for Cloud Using Syslog Structured Data
             draft-golovinsky-cloud-services-log-format-03

Abstract

   This document provides an open and extensible log format to be used
   by any cloud entity or cloud application to log and trace activities
   that occur in the cloud.  The logs and traces can be utilized for
   billing, charging, and debugging purposes.
   In addition, these logs and traces are equally applicable to cloud
   infrastructure (IaaS), platform (PaaS), and application (SaaS)
   services.  CloudLog differs from traditional logging in content but
   not in nature, because it takes into account the transient nature of
   identities and resources in the cloud.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 12, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions Used in This Document
   3.  Problem Statement
     3.1.  Scope of the application
     3.2.  The Traditional Logging and its Applications
     3.3.  Challenges with the cloud deployment
       3.3.1.  SaaS Use Case
       3.3.2.  PaaS Use Case
       3.3.3.  IaaS Use Case
   4.  Cloud Log Structured Data Definitions
     4.1.  SD-ELEMENT context
       4.1.1.  SD-PARAM aid - Mandatory
       4.1.2.  SD-PARAM provider - Optional
       4.1.3.  SD-PARAM rid - Optional
       4.1.4.  SD-PARAM eid - Optional
     4.2.  SD-ELEMENT transit
       4.2.1.  SD-PARAM client - Mandatory
       4.2.2.  SD-PARAM gw - Optional
   5.  Log Format Samples
     5.1.  Log Sample of Simple Non-Authenticated Request
     5.2.  Successful Authenticated User Request
     5.3.  Log Sample of Successful Request on Behalf of Another
           Identity
   6.  Security Considerations
   7.  IANA Considerations
     7.1.  SD-IDs
   8.  Normative References
   Authors' Addresses

1.  Introduction

   This document describes a standard for syslog structured data
   elements in messages generated by services that may be running on
   different physical or virtual machines when those services are
   processing information generated by a single request.
   The purpose is to provide an audit trail that allows correlation of
   such messages.  In addition, this document defines a number of
   parameters that MUST or SHOULD be included in these structured data
   elements so that these messages can be used to identify users of such
   services when the real and/or effective identities of users are
   known.

2.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Problem Statement

3.1.  Scope of the application

   The three service models defined by NIST (IaaS, PaaS, and SaaS)
   differ in the way individual cloud services are offered to
   customers.  Hence, in addition to general logging concepts that apply
   to all three service models alike, logging measures specific to each
   service model and its circumstances have to be taken into account.

3.2.  The Traditional Logging and its Applications

   Practically all hardware and software entities deployed on the
   network log their activities.  Network elements such as routers,
   servers, firewalls, and switches log information about their
   activities, mostly using Syslog (Windows systems being the
   exception).  Applications running on the network also log
   activities, but often using proprietary mechanisms.  While logging
   mechanisms are inconsistent between different entities (Syslog,
   Windows events, proprietary files), they generally carry enough
   information to identify the type of activity, the time of occurrence,
   the physical entity involved in the event, and often the user(s) that
   participated in the event.  Availability of this information is
   crucial for accomplishing multiple business objectives, ranging from
   assuring security and performing forensics to adhering to compliance
   regulations (SOX, PCI, etc.).
   The existence of logs and the information in them is necessary, but
   not sufficient, for achieving security, compliance, and other
   business objectives.  Collecting, processing, searching, and even
   simply interpreting the information in logs is exceptionally
   labor-intensive and time-consuming, and often cannot be done on any
   meaningful scale without appropriate tools in place.  Log management
   tools used to solve the problem of scale and interpretation depend
   heavily on the format of logs being largely well defined and
   understood.

3.3.  Challenges with the cloud deployment

   In cloud deployments the situation regarding the availability of logs
   and the reliability of the information in them is drastically
   different.  By definition, cloud resources are shared.  A single
   piece of hardware now runs multiple virtual instances.  These
   instances can be brought up and down within a very short period of
   time, and at any given moment the hardware can be shared not just by
   different users but by users from different companies.  Even if Linux
   or Windows VMs continue to log their activity, the information in
   these logs is very likely to be irrelevant, since the logs cannot be
   tied to a physical entity.  Moreover, even if one managed to map logs
   to a physical entity, there is no guarantee that the same VM image
   will run on the same hardware in its next incarnation.  There is also
   no clear way to determine how many users share the hardware and what
   their identities and roles are.  Tracing environmental changes is
   practically impossible unless there is traceability between physical
   and virtual entities.  As a result, achieving business objectives
   such as adhering to compliance regulations or performing regular
   security audits is very difficult, if not impossible.

   Generally, logging mechanisms for cloud environments do not differ
   from traditional logging mechanisms in the way they work.  However,
   the environmental circumstances of the cloud call for additional
   measures.  Customers mostly rely on the cloud service provider (CSP)
   when logging data is required.  In SaaS scenarios, customers have
   almost no opportunity to equip the application with additional
   logging features.  The situation changes slightly for PaaS, and it
   also poses a tremendous problem in IaaS.  Hence, logging standards
   should be applied by the CSP in order to improve this situation for
   customers.  The following use cases underline the need for an
   additional standard and for differentiating between the various cloud
   services.

3.3.1.  SaaS Use Case

   In SaaS scenarios, the CSP has full control over the application
   itself and the offered services.  The customer mainly uses a client
   device to communicate with a specific API offered by the CSP.  In
   most cases, the user agent on the client is a web browser
   communicating with a web application located on the server
   infrastructure of the CSP.  Unfortunately, the customer has no
   ability to manage or control the underlying cloud infrastructure,
   network components, servers, operating systems, etc.  Hence, the CSP
   has to provide additional logging mechanisms to improve this
   situation.  In the case of a web-based email service, the customer
   has almost no way to determine whether the account has been
   compromised or accessed from an unknown IP address.  Even though some
   providers display the most recent IP addresses that accessed the
   application, this does not solve the problem of NAT or proxies.
   Furthermore, if the customer's account has been compromised, the
   customer cannot determine which emails have been edited or accessed
   by the adversary.
   Additional, fine-grained logging mechanisms could improve this
   situation for the customer, and could even make forensic
   investigations possible in the case of an account compromise.

3.3.2.  PaaS Use Case

   The logging situation in PaaS scenarios differs slightly from SaaS.
   The CSP decides which system-specific logging information is provided
   to customers; however, the application deployed by the customer can
   contain hard-coded logging features.  Unfortunately, this requires
   support from the underlying OS environment.  For instance, the
   application could contain mechanisms that transfer encrypted and
   signed logging data to third-party logging servers in real time.
   CSPs claim that the transfer of data between the PaaS instance and
   the corresponding database backend is encrypted.  This can hardly be
   confirmed by the customer.  Hence, customers should not rely on such
   promises but apply their own logging mechanisms as far as possible.
   This logging information could be supplemented with information from
   the CSP that the customer application cannot extract directly.

3.3.3.  IaaS Use Case

   In IaaS cloud environments the situation regarding the availability
   of logs and the reliability of the information in them is somewhat
   better.  Customers can prepare their VM for logging purposes and
   control the single instance.  Therefore, crucial application-specific
   logging information can be collected by the customer, with the caveat
   that the CSP could, at least in theory, maliciously or
   unintentionally modify this logging information.  Unfortunately, by
   definition, cloud resources are shared.  This means that the customer
   could share a physical host with a potential adversary.  Hence, it is
   of great importance whether the customer shares the physical host
   with other tenants or runs the only virtual instance on it.
   This information cannot be obtained by the customer without the help
   of the CSP.  The situation is further complicated by the flexibility
   of the cloud.  Within a short period of time, virtual instances are
   transferred to other physical hosts without the knowledge of the
   customer.  These transfers cannot be detected and logged by the
   customer without the assistance of the CSP.  IaaS cloud environments
   should provide the ability to detect and log the binding of a virtual
   instance to specific hardware.  For an exhaustive forensic analysis
   of an incident, this information is of great importance.  Moreover,
   network components containing important information about the
   network in which the instance is deployed cannot be accessed by the
   customer without the help of the CSP.  As a result, achieving
   business objectives such as adhering to compliance regulations or
   performing regular security audits is very difficult, if not
   impossible.

4.  Cloud Log Structured Data Definitions

   1.  RUI - real user identity, the identity of the user that has
       authenticated to the entity.

   2.  EUI - effective or impersonated user identity, the identity of
       the user that the real user identity is acting for.  For example,
       an administrator account could have the ability to impersonate
       another user account.

   3.  Provider - the domain, service, application, or other entity
       providing the user identities.

   Structured data elements, defined in RFC 5424 [RFC5424], provide a
   mechanism for adding data to syslog messages.  Since additional data
   is necessary to trace user identities and their activities in the
   cloud, this document uses structured data elements to carry this
   additional information in syslog messages.

4.1.  SD-ELEMENT context

   The SD-ELEMENT identified by the SD-ID "context" defines the context
   of the external request that causes the activity to take place.  The
   syslog message that is generated as a result of this activity should
   be identified by this "context".

4.1.1.  SD-PARAM aid - Mandatory

   The parameter "aid" represents the audit identifier, which uniquely
   identifies an external request for activity.  The value is a
   UTF-8-STRING representation of the UUID generated by the entity when
   the request is received.

   This parameter MUST be present within the SD-ELEMENT "context".

4.1.2.  SD-PARAM provider - Optional

   The parameter "provider" represents the provider of the identity for
   the real user identity ("rid") and the effective user identity
   ("eid").  User identities do not always exist or are not always
   available.  When they are, either "rid" or "eid" MUST be present in
   the syslog messages.

   The parameter "provider" is not required, but SHOULD be present
   within the SD-ELEMENT "context" when either the "rid" or the "eid"
   identifier is present.

4.1.3.  SD-PARAM rid - Optional

   The parameter "rid" represents the real user identity.

   This parameter SHOULD be present within the SD-ELEMENT "context" when
   the real user identity is available.

4.1.4.  SD-PARAM eid - Optional

   The parameter "eid" represents the effective user identity.

   This parameter SHOULD be present within the SD-ELEMENT "context" when
   user impersonation has happened and the effective user identity is
   available.

4.2.  SD-ELEMENT transit

   The SD-ELEMENT identified by the SD-ID "transit" defines the logical
   gateway entities that were traversed while the request for activity
   was routed to the final destination entity that satisfies the
   request.

4.2.1.  SD-PARAM client - Mandatory

   The parameter "client" represents the IP address or Fully Qualified
   Domain Name (FQDN) of the client entity on behalf of which the
   request is being made.  This is different from the SD-PARAM "ip" of
   the SD-ID "origin" in RFC 5424 [RFC5424], which denotes the IP
   address of the entity that originated the log message itself.  IPv4
   or IPv6 addresses MUST be represented as UTF-8-STRING.

4.2.2.  SD-PARAM gw - Optional

   The parameter "gw" represents a gateway entity through which the
   request for activity passes before arriving at the final destination
   entity that is actually responsible for processing the request.  The
   value of the parameter consists of the UTF-8-STRING representation of
   the UUID identifying the gateway entity, a colon character (i.e.,
   ':'), and finally the UTF-8-STRING representation of the IP address
   or FQDN of the gateway through which the request has been routed.

   This parameter MAY appear more than once within the SD-ELEMENT
   "transit", as a request may pass through multiple gateway entities.
   Each occurrence represents a different gateway through which the
   request passed.

5.  Log Format Samples

5.1.  Log Sample of Simple Non-Authenticated Request

   Here is an example of a log produced as a result of a simple
   non-authenticated request to a web service.  Only the mandatory
   parameters "aid" and "client" are present.

   Jul 7 09:01:40 [context
   aid="9BE817EB-8ACC-1004-D9DF-00000A00065E"][transit
   client="56.2.222.83"] Initializing request to /example_api/index

   Jul 7 09:01:40 [context
   aid="9BE817EB-8ACC-1004-D9DF-00000A00065E"][transit
   client="56.2.222.83"] "64.39.0.40" - "1023"
   ""GET /example_api/index HTTP/1.1"" 200 2543 -- performed in 600 ms

5.2.  Successful Authenticated User Request

   Here is an example of a simple request including user authentication.
   Note that the "provider" and "rid" SD-PARAMs are added to the message
   after the user has authenticated to the service, and that those
   parameters are included in each subsequent message.

   Aug 16 13:34:18 [context
   aid="149683FC-8DF5-1004-E1A8-00000A000152"][transit
   client="172.16.1.82"] Initializing request to
   /api/example:instance/1

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
   User authentication successful for 1:123

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
   "172.16.1.82" - "-" ""GET /api/example:instance/1 HTTP/1.1"" 200 119
   -- performed in 2 ms

5.3.  Log Sample of Successful Request on Behalf of Another Identity

   Here is a request made by an authenticated user on behalf of another
   identity.  Note that the parameter "eid" is added after the user
   authentication takes place and the effective user identity is
   validated.  This parameter is included in each subsequent message.

   Aug 16 13:34:18 [context
   aid="149683FC-8DF5-1004-E1A8-00000A000152"][transit
   client="172.16.1.82"] Initializing request to
   /api/example:instance/1

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   provider="example.com" rid="1:123"][transit client="172.16.1.82"]
   User authentication successful for 1:123

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   eid="2:456" provider="example.com" rid="1:123"][transit
   client="172.16.1.82"] User impersonation successful for 1:123 to
   2:456

   Aug 16 13:34:18 [context aid="149683FC-8DF5-1004-E1A8-00000A000152"
   eid="2:456" provider="example.com" rid="1:123"][transit
   client="172.16.1.82"] "172.16.1.82" - "-"
   ""GET /api/example:instance/1 HTTP/1.1"" 200 119 -- performed in 2 ms

6.  Security Considerations

   In addition to the general syslog security considerations discussed
   in RFC 5424 [RFC5424], the information contained in these messages
   may reveal how services interact, user identities, and other details
   about network or service inventory.

   Users should not have access to these messages if they would not have
   access to this information through other authenticated means.

7.  IANA Considerations

7.1.  SD-IDs

   IANA is requested to register the syslog structured data element
   SD-IDs and PARAM-NAMEs shown below:

   +---------+------------+-------------+
   | SD-ID   | PARAM-NAME | Requirement |
   +---------+------------+-------------+
   | context |            | OPTIONAL    |
   |         | aid        | MANDATORY   |
   |         | eid        | OPTIONAL    |
   |         | provider   | OPTIONAL    |
   |         | rid        | OPTIONAL    |
   | transit |            | OPTIONAL    |
   |         | client     | MANDATORY   |
   |         | gw         | OPTIONAL    |
   +---------+------------+-------------+

                                 Table 1

8.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", RFC 2119.

   [RFC5424]  Gerhards, R., "The Syslog Protocol", RFC 5424.

Authors' Addresses

   Gene Golovinsky
   Redwood City, CA  94065
   US

   Phone: (650)8016259
   Email: ggolovinsky@qualys.com
   URI:   NA


   Sam Johnston

   Email: samj@samj.net


   Dominik Birk
   Ruhr University Bochum; Horst Goertz Institute for IT Security
   Bochum,  44780
   Germany

   Phone: +49(0)234-32-26740
   Email: dominik.birk@rub.de