ISMS Interim Meeting Minutes

February 13/14 2006, MIT, Cambridge, MA

Juergen Schoenwaelder (Scribe and Editor)

Attendees:

 * David Harrington, Huawei [DH]
 * Sam Hartman, MIT [SH]
 * Jeffrey Hutzelman, CMU [JH]
 * David Nelson, Enterasys [DN]
 * Juergen Quittek, NEC [JQ]
 * David Perkins, SNMPinfo [DP]
 * Joe Salowey, Cisco [JA]
 * Juergen Schoenwaelder, International University Bremen [JS]
 * Margaret Wasserman, ThingMagic [MW]
 * Bert Wijnen, Lucent [BW]
 * Wes Hardaker, Sparta [WH] (via phone)

1. Summary of day #1:

1.1. Kickoff presentations:

a) DH explained the core blocks of the SNMP architecture and how the
   ASIs glue them together. During the discussion, it became clear
   that in some cases the ASIs do not convey all the information
   actually needed to make implementations work.

b) JA explained the layers of the SSH architecture. He pointed out
   that some newer key-exchange mechanisms such as GSS-API are
   different from the widely used password and public-key
   authentication mechanisms in the sense that the claimed user
   identity might be different from the authenticated identity.

c) DP walked through the logic which determines where to send
   notifications and with which security parameters. The good news
   is that the procedure on the left-hand side of his page does not
   need any changes to support SSHSM. All that is needed is to
   define a step 5d) which explains how to deliver a notification
   via SSHSM given the transport domain/address of the target, the
   security model/name/level triple, the context engineID/name pair,
   and the notification itself.

d) DN explained the philosophy behind RADIUS and how RADIUS is
   typically integrated into applications. He pointed out that
   RADIUS is a "fascist" protocol - an application passes a question
   to a RADIUS server and the server provides a positive/negative
   decision which is to be followed (RADIUS is not a negotiation
   protocol).
   Hints are used in Access-Request messages to tell the server
   which particular question is asked, and the server passes
   attributes back in an Access-Accept message to be used by the
   requester.

e) RADIUS, as currently defined and deployed, is about service
   provisioning, based on authenticated identity. It does not have
   the notion of a set of permissions, ACLs, etc., that are used
   during a login session to access various resources, as one might
   expect in a host OS. Of course, such a notion could be added, if
   required.

1.2. Requirements:

a) SH stated that an SSHSM solution must support multiple
   key-exchange mechanisms to pass IESG review.

b) There are legitimate reasons to out-source only authorization to
   RADIUS while authentication is handled by other mechanisms. A
   solution should not require that RADIUS authorization is always
   bound to RADIUS authentication.

c) There are two types of notification originating programs - those
   bundled into an agent that try to do all steps on DP's page and
   those that short-cut the whole left column and jump straight into
   the right column using information provided in an
   implementation-specific manner. It is important to support both.

1.3. Directions:

a) It is believed that the notion of an authoritative engineID is
   not needed within SSHSM. The authoritative engineID seems to be a
   USM-specific detail, even though it is passed into a security
   model as part of the generateRequestMsg() and
   generateResponseMsg() ASIs and out of a security model as part of
   the processIncomingMsg() ASI. It seems the reason the
   authoritative engineID is passed out of the security subsystem is
   that, depending on the operation, the engineID needs to be
   compared to snmpEngineID (in RFC 3412 section 7.2 13a). The MPM
   has knowledge of the Confirmed/Unconfirmed class of the
   operation, but the security model does not.

b) It was agreed that RADIUS integration is not an integral part of
   SSHSM. SSHSM does not require RADIUS support as mandatory to
   implement.

c) It would help SSHSM tremendously if RADIUS supported an
   authorize-only mechanism so that authentication exchanges and
   authorization exchanges can be separated.

d) The RADIUS security name to group mapping authorization may be
   invoked by VACM (unfortunately the SNMP architecture has put the
   name to group mapping into the ACM rather than abstracting it out
   of the ACMs) and cause the respective VACM tables to be
   populated. In this case, there would be a timer which controls
   the lifetime of the authorization decision. The alternative would
   be to bind the authorization decision to the lifetime of a
   session (but this would then work only for session-based security
   models).

e) It remains unclear whether it is desirable to be able to retrieve
   the contents of the vacmAccessTable via RADIUS or whether the
   security name to group name mapping is sufficient. Documentation
   of the best current practices of using VACM would be highly
   valuable and may help clarify whether regular updates of the
   vacmAccessTable are indeed needed to meet operational needs. The
   approach to be taken is to focus on security name to group name
   mapping as the first step.

f) Authorization happens in three cases: (1) invocation of the SSHSM
   subsystem, (2) SSH identity to security name mapping, (3)
   security name to group name mapping.

   [ED: It seems that SSH does not have a notion of an abstract
   security name which is key-exchange mechanism agnostic, which
   might mean that ISMS has to define for each mechanism how it maps
   to the SNMP security name and be extensible in this regard.]

g) While the engineID is the primary way to identify SNMP engines in
   the architecture, applications and operators prefer to identify
   SNMP engines by their transport endpoints and rely on engineID
   discovery to make SNMPv3 work. SSHSM therefore must support
   engineID discovery.
   The snmpUnknownPDUHandlers report which is part of the SNMPv3
   message processing does not help since it does not report the
   "correct" engineID in the response and it may not work in proxy
   situations (but note that USM discovery also relies on the
   assumption that the contextEngineID == securityEngineID).

2. Summary of day #2:

2.1. Terminology and Understanding

After reviewing the results of the first meeting, we once again
worked out common understandings of terminology. The engineID again
caused confusion and we ended up explaining it using the following
model:

     engine                    engine                    engine
        |                         |                         |
        o     <- engineID ->      o     <- engineID ->      o
        |                         |                         |
    transport                 transport                 transport
        |                         |                         |
        o <- transport address -> o <- transport address -> o
        |                         |                         |
        +-------------------------+-------------------------+

The security engineID is hop-by-hop and used to bind security
information (clock sync, key localization) while the context
engineID is end-to-end and used to identify the communicating peers
across a proxy chain.

DH explains that the engine ID has three usages:

(1) snmpEngineID = "this" pointer - an ID for itself, so it can
    determine whether a passed engineID means "me"
(2) securityEngineID = authoritative engine
(3) contextEngineID = data source

Sam talks about SSH and explains that SSH does not have ASIs and
allows the use of any additional information during the processing.
Basically three things happen in SSH:

1) connect and authenticate the remote system using the keyex method
   (the only input is effectively the hostname and the output is
   whether we are connected to the right system)
2) the client chooses a user name he would like to act as and runs
   the user authentication process, which is again very method
   specific; the output is whether you are permitted as a user or
   not; internally there may be identities that may or may not be
   related to the user name (a string)
3) can i be that [...?]
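
The first two SSH phases described above can be sketched roughly as
follows. This is only an illustration of the inputs and outputs
named in the minutes, not an SSH implementation. The names
SshSession, open_session, host_is_authentic and user_is_permitted
are hypothetical, and the two callables stand in for whatever
method-specific logic a real SSH stack applies. The third phase is
elided in the minutes and is therefore not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SshSession:
    """Outcome of the first two phases (hypothetical type)."""
    hostname: str   # how the client identifies the server
    user_name: str  # how the server identifies the client


def open_session(
    hostname: str,
    user_name: str,
    host_is_authentic: Callable[[str], bool],
    user_is_permitted: Callable[[str], bool],
) -> Optional[SshSession]:
    """Run phase 1 (keyex) and phase 2 (user authentication).

    Both callables are assumptions standing in for method-specific
    checks. Returns None if either phase fails.
    """
    # Phase 1: the only input is effectively the hostname; the output
    # is whether we are connected to the right system.
    if not host_is_authentic(hostname):
        return None
    # Phase 2: the client chooses a user name; the output is whether
    # it is permitted as that user or not.
    if not user_is_permitted(user_name):
        return None
    # Phase 3 is elided in the minutes and deliberately left out.
    return SshSession(hostname, user_name)
```

A caller would supply its own checks, for example a lookup in a set
of known hostnames and a set of permitted user names; a session
object is returned only if both checks succeed.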

Generic identities:

(1) it is reasonable to think that clients identify servers by
    hostnames (IP addresses are less likely; as a matter of fact,
    most clients do not include ports but they could very well)
(2) servers identify clients by a user name

2.2. Walk through the ASIs for a CG <-> CR communication

a) The notion of an SSH hostname may be different from IP addresses,
   DNS names, ... - so we need to define an SshTDomain and an
   SshTAddress which carries SSH hostnames.

b) An SNMP engine must not allow the use of the SSHSM over a
   non-SshTDomain transport.

c) There is a mapping between the SNMP security name and the SSH
   user name via the local data store and it might be the identity
   mapping. So far, we assume that the mapping happens in the TM
   portion of the SSHSM.

d) The transport on the CR identifies the getpeername() endpoint of
   the CG, not the listening port of a CG. If the connection breaks
   during processing, the message is dropped since there is no way
   to establish a new session from the CR.

e) Problem: In the security model, we do not have access to the
   transport address and thus we cannot identify the SSH session to
   look up or the SSH user name. The proposed solution is to add an
   argument to the ASIs, the tmStateReference. (See also the TMSM
   document.)

f) The SSHSM document currently does not explain how access to the
   SSHSM-TM happens via the tmStateReference. This needs to be
   explained.

g) So far, we assume that the mapping of the SSH user name happens
   in the TM portion of the SSHSM as part of session establishment.

h) There is a need to identify an SSH session. In particular, it is
   necessary that a response does not eventually go into a newly
   established session which happens to have the same transport
   address. In other words, we have to ensure that the response
   really goes to the same session the request came in on. If the
   correct session has gone, the response message must be dropped.
   (Note that sessions may also go away if the periodic SSH rekeying
   fails.)

i) Once back in the TM portion of the SSHSM, we use the session info
   provided by the tmStateReference to send the response (actually
   ignoring the transport address).

j) The command generator will map back the SSH user name to the
   security name originally provided by the command generator.

k) What is the lifetime of the tmStateReference? The session
   lifetime? Very likely.

l) There was quite some discussion about how the securityName
   provided when a CG calls sendPDU() will be cached and how it is
   pulled out of the cache later while processing the response. The
   issue is whether an engine that waits for a response (i.e., it
   hosts a CG) can process an unexpectedly received request by
   applying the cached securityName of the outstanding request. This
   clearly would jeopardize security. The specs need to be clear
   about how to prevent this.

2.3. NO -> NR communication (NO establishes SSH session)

a) A first attempt to untwist things:

      | SSH Client        | SSH Server
   ---+-------------------+--------------------
   CG | secName->userName \ (not allowed?)
   ---+-----------------.  `-------------------
   CR | (?)               \ userName->secName
   ---+-------------------+--------------------
   NO |                   |
   ---+-------------------+--------------------
   NR |                   |
   ---+-------------------+--------------------

b) It is very possible to have a CR and a CG use the same system.
   How does the SNMP architecture figure out whether a
   stateReference applies to a given received message? [Homework to
   the SNMP geeks to figure this out.]

c) WH reminds us that we should not be too strict about the
   architecture (which is just _an_ architecture) since most
   implementations are not following the architecture anyway and
   implementors will just make things work.

d) WH prototyped SNMP over SSH.
   WH believes the authoritative engineID is a USM-only concept (and
   the ASIs export it for no good reason; probably they should
   instead have told the security subsystem what kind of
   communication happens so the security subsystem could have
   determined who is authoritative).

e) DP believes that operationally you do not want to put any
   credentials on the notification originator, at least if the
   credentials are not localized.

f) WH says that agents today do not have knowledge of hostkeys and
   leap-of-faith does not work for agents. In other words, agents
   must be properly configured with the relevant host keys.

g) Is it not possible to give NOs the public keys they need for them
   to work? Not all SSH key-exchange mechanisms use host keys.

h) DP does not like this because we put credentials on the network
   devices. Passwords really are worrying, public keys may be less
   worrying. Note that the key pairs can and should be agent
   specific and thus a stolen device is not as worrying. Still, if
   keys change, there is the usual key update problem.

i) DP designs a table (SshUserTable):

   | hostName | userName | secName | pubKey | privKey ..
   +----------+----------+---------+--------+------------
   |          |          |         |        |
   +----------+----------+---------+--------+------------

   With that, a new step 5d) in his notification writeup will have
   the information to establish a session. Do we need a separate
   table for the hostkeys?

j) SH suggests having a table which maps an SSH transport address to
   a hostkey.

k) OK, we have a solution where the NO establishes an SSH
   connection. Now let's see how we can work out how to send a
   notification over an already existing SSH connection.

l) BW asks whether the configuration needed does not put us back to
   the beginning - do operators configure this? DNSSEC might be an
   option but DNSSEC is not widely deployed as of today. What needs
   to be configured: (a) add known host keys on agents, (b) put user
   keys on the agents.

m) Simplification: There will be an identity mapping between
   security name and user name. The key pair will be the key that is
   used by the SSH host. With this approach, the only thing that
   needs to be configured is the known host key.

n) WH says to make sure that the MIB tables allow for more
   flexibility. The original plan was to allow different key pairs
   everywhere; sharing the hostkey is just an optimization.

o) WH asks whether it makes sense to introduce different security
   models for different user authentication mechanisms. The answer
   was no.

p) It was questioned whether it is possible to define a generic SM
   portion of the TMSM which works with other TMSMs. This remains
   unclear, but it is not unlikely that such a generic SM portion of
   arbitrary TMSMs may not work once the second TMSM pops up, due to
   false assumptions during the design. (This scepticism is to some
   extent driven by the insight that the SNMPv3 ASIs also proved
   "interesting" during the whole day.)

q) The MIB module should provide a button to generate key pairs so
   that one can download a new public key from the MIB rather than
   having to upload a private key. It needs to be worked out how the
   changeover to an updated key pair can work, that is, when the old
   key expires (basically the next re-keying or session end).

2.4. NO -> NR communication (over established CG -> CR session)

a) Started off with an interesting discussion about audit logs and
   the need in such use cases to keep an authenticated security
   name. This led to a discussion about which role the securityName
   plays in notifications and how access control is applied in this
   case. The key issue with the usage of an established session is
   that the source of the notification is not authenticated, unless
   one introduces security names such as "audit-from-front-door" and
   "audit-from-vault-door".

b) When sending a message, SNMP authorizes depending on to whom you
   are sending.
   SH does not agree with this approach, but can't explain the
   issues (two important issues, security and correctness). WH
   remarks that many notification receivers today do not distinguish
   how they handle received notifications with different security
   levels.

c) JH proposed a set of rules and modifications to describe which
   connection I pick and to show that it is correctly authenticated.
   If I am sending a message, the identity I pick is one that I
   authenticated. If I pick a mapping, ensure that the protocol
   mapping on the other side is good.

d) The notification case with NO-established sessions (previous
   section) is more difficult because the message flow is similar
   but the NO likes to ensure that it is sending to a specific
   principal, which does not work with a hostkey which only proves
   the identity of the notification receiver server. A solution may
   require that after SSH startup we ask the NR server to prove that
   it can act as a particular principal.

e) In other words, sending notifications via an existing CG-CR
   session actually gets the authentication of the receiver right as
   seen from the existing SNMPv3 architecture and the role of access
   control on notifications. This was a surprising insight.

f) More work is needed to work out how the two scenarios can be made
   to work in order to provide SNMP authorization semantics in the
   case of notifications.

2.5. Wrap Up

a) The meeting was closed at around 17:30. Sam did a great job in
   hosting the meeting.

b) We made significant progress in understanding the problem space
   and in finding a common language between security experts and
   SNMP experts. Additional followup work is needed to work out the
   remaining details of the notification handling.

3. Action Items

a) [Someone] to write down the RADIUS requirements in order to hand
   them over to the RADEXT WG.

b) [SNMP] to check whether the securityEngineID is used outside of
   the security model and if so for what purpose.
   Answer: The securityEngineID is used in RFC 3412 section 7.2 13a)
   in the MPM. It is also referenced in RFC 3584.

c) [DP] to work out the MIB design and to report what is wrong with
   the current draft.

d) [DH] to revise the IDs and to submit new versions before the
   cutoff deadline.

e) [ALL] name volunteers for the RADIUS integration document (to
   take care of a) above). DN volunteered to review the document
   and/or contribute text.

f) [DN] to take care of the RADIUS aspects of authorize-only.

g) [JQ/JS] to send a formal request to RADEXT.

h) [JQ/JS] to revise the milestones, at least the Dec 06 one.

i) [JH] to propose a set of rules and modifications to describe
   which connection I pick and to show that it is correctly
   authenticated.

j) [JQ/JS] to organize an editing meeting at the next IETF
   (Sunday?).