Session Initiation Protocol (SIP) Recording Metadata

Cisco Systems, Inc.
Cessna Business Park, Kadabeesanahalli Village, Varthur Hobli, Sarjapur-Marathahalli Outer Ring Road
Bangalore, Karnataka 560103
India
rmohanr@cisco.com

Nokia Siemens Networks
Bangalore, Karnataka
India
partha@parthasarathi.co.in

Huawei
Hudson, MA
USA
pkyzivat@alum.mit.edu
Transport
SIPREC
Session recording is a critical requirement in many communications environments such as call centers and financial trading. In some of these environments, all calls must be recorded for regulatory, compliance, and consumer protection reasons. Recording of a session is typically performed by sending a copy of a media stream to a recording device. This document describes the metadata model as viewed by the Session Recording Server (SRS) and the recording metadata format.
Session recording is a critical requirement in many communications environments such as call centers and financial trading. In some of these environments, all calls must be recorded for regulatory, compliance, and consumer protection reasons. Recording of a session is typically performed by sending a copy of a media stream to a recording device. This document focuses on the recording metadata, which describes the communication session. The document describes a metadata model as viewed by the Session Recording Server (SRS) and the recording metadata format, the requirements for which are described in and the architecture for which is described in .
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in . This
document only uses these key words when referencing normative
statements in existing RFCs.
Metadata Model: An abstract representation of metadata using a Unified Modelling Language (UML) class diagram.
Metadata classes: Each block in the model represents a class. A class is a construct that is used as a blueprint to create instances (called objects) of itself. The description of each class also lists its attributes in a second compartment below the class name.
Attributes: The properties listed in each of the classes, shown in the second compartment below the class name. Each instance of a class conveys values for these attributes, which add to the recording's metadata.
Linkages: The relationships between the classes in the model. A linkage represents a logical connection between classes (or objects) in class diagrams or object diagrams. The linkages used in the metadata model of this document are associations.
Metadata is the information that describes recorded media and the Communication Session (CS) to which they relate. The diagram below shows a model for metadata as viewed by the Session Recording Server (SRS).
The metadata model is a class diagram in Unified Modelling Language (UML). The model describes the structure of metadata in general by showing the classes, their attributes, and the relationships among the classes. Each block in the model above represents a class. The linkages between the classes represent the relationships, which can be associations or compositions.
The metadata is conveyed from the Session Recording Client (SRC) to the SRS.
The model allows the capture of a snapshot of a recording's metadata at a given instant in time. Metadata changes to reflect changes in what is being recorded. For example, if a participant joins a conference, the SRC sends the SRS a snapshot of metadata containing that participant's information (with attributes like the name/AoR pair and associate-time).
Some of the metadata is not required to be conveyed explicitly from the SRC to the SRS if it can be obtained contextually by the SRS (e.g., from SIP or SDP signalling).
This section gives an overview of the recording metadata format. Some data from the metadata model is assumed to be made available to the SRS through the Session Description Protocol (SDP), and therefore this data is not represented in the XML document format specified in this document. SDP attributes describe the different media formats, such as audio and video. Other metadata attributes, such as participant details, are represented in a new recording-specific XML document, namely application/rs-metadata+xml. The SDP label attribute provides an identifier by which a metadata XML document can refer to a specific media description in the SDP sent from the SRC to the SRS.
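As an illustration of how the SDP label attribute can tie the two bodies together, the following Python sketch collects the a=label value of each media description in an SDP body. The function name, the sample SDP, and its addresses and ports are assumptions made for this example; they do not come from this document.

```python
def labels_by_media(sdp):
    """Map each a=label value to the m= line it belongs to.

    Hypothetical helper: an SRS could use such a map to resolve a label
    referenced by the metadata XML to a media description in the RS SDP.
    """
    mapping = {}
    current_media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # A new media description starts; later attributes belong to it.
            current_media = line[2:]
        elif line.startswith("a=label:") and current_media is not None:
            mapping[line[len("a=label:"):]] = current_media
    return mapping


# Illustrative SDP only (documentation address range, made-up ports).
EXAMPLE_SDP = """v=0
o=SRC 2890844526 2890844526 IN IP4 198.51.100.1
s=-
c=IN IP4 198.51.100.1
t=0 0
m=audio 12240 RTP/AVP 0
a=sendonly
a=label:1
m=video 22456 RTP/AVP 98
a=rtpmap:98 H264/90000
a=sendonly
a=label:2
"""

LABELS = labels_by_media(EXAMPLE_SDP)
```

Session-level attributes that appear before the first m= line are ignored here, since a label only has meaning within a media description.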
The XML document format can be used to represent either the complete metadata or a partial update to the metadata. The latter includes only elements that have changed compared to the previously reported metadata.
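One possible way for a receiver to treat the two kinds of document is sketched below in Python. The dictionary layout (element id mapped to element data) and the function name are assumptions for illustration only; this document does not prescribe how an SRS stores metadata.

```python
def apply_snapshot(state, data_mode, elements):
    """Return the metadata state after applying one snapshot.

    state and elements are dicts keyed by element id (hypothetical layout).
    A complete snapshot replaces everything previously reported; a partial
    update only overrides the elements it carries.
    """
    if data_mode == "complete":
        return dict(elements)
    if data_mode == "partial":
        merged = dict(state)
        merged.update(elements)
        return merged
    raise ValueError("unknown data mode: %r" % (data_mode,))


# Usage: a complete snapshot followed by a partial update for one participant.
initial = apply_snapshot({}, "complete", {
    "participant-1": {"name": "Alice"},
    "participant-2": {"name": "Bob"},
})
updated = apply_snapshot(initial, "partial", {
    "participant-1": {"name": "Alice",
                      "disassociate-time": "2010-12-16T23:41:07Z"},
})
```

The sketch returns a new dictionary instead of mutating the previous state, so earlier snapshots remain available for comparison.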
A recording metadata document is an XML document. The recording element MUST be present in every recording metadata XML document. It acts as the container for all other elements in this XML document.
A recording object is an XML document. It MUST have the XML declaration, and it SHOULD contain an encoding declaration in the XML declaration, e.g., <?xml version="1.0" encoding="UTF-8"?>. If the charset parameter of the MIME content type declaration is present and differs from the encoding declaration, the charset parameter takes precedence.
Every application conforming to this specification MUST accept the UTF-8 character encoding to ensure minimal interoperability.
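The precedence rule between the MIME charset parameter and the XML encoding declaration can be sketched as follows. This is a minimal illustration, not a full XML encoding detector: the function name is hypothetical, byte order marks are ignored, and falling back to UTF-8 when neither source names an encoding is an assumption of this sketch.

```python
import re

# Matches an encoding declaration inside a leading XML declaration.
_ENCODING_DECL = re.compile(
    rb"""^<\?xml[^>]*encoding=["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def effective_encoding(body, charset_param=None, default="UTF-8"):
    """Pick the character encoding for a metadata body (bytes).

    The charset parameter of the content type wins when present;
    otherwise the XML encoding declaration is used; otherwise the default.
    """
    if charset_param:
        return charset_param
    match = _ENCODING_DECL.match(body)
    if match:
        return match.group(1).decode("ascii")
    return default


DECLARED = b'<?xml version="1.0" encoding="ISO-8859-1"?><recording/>'
```

For example, effective_encoding(DECLARED) yields the declared encoding, while effective_encoding(DECLARED, charset_param="UTF-8") yields the charset parameter.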
Syntax and semantic errors in a recording XML document have to be reported to the originator using an application-specific mechanism.
The namespace URI for elements defined by this specification is a Uniform Resource Name (URN), using the namespace identifier 'ietf' defined by and extended by .
The URN is as follows: urn:ietf:params:xml:ns:recording
The recording element MUST contain an xmlns namespace attribute with the value urn:ietf:params:xml:ns:recording. One recording element MUST be present in every recording metadata XML document.
The dataMode element indicates whether the XML document is a complete document or a partial update. The default value is complete.
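The following Python sketch builds the smallest document implied by the text above: a recording root element in the urn:ietf:params:xml:ns:recording namespace that contains a dataMode element. It uses the standard xml.etree.ElementTree module; the function name is hypothetical, and placing dataMode as a direct child of recording is an assumption based on the statement that recording contains all other elements.

```python
import xml.etree.ElementTree as ET

RECORDING_NS = "urn:ietf:params:xml:ns:recording"


def build_recording(data_mode="complete"):
    """Serialize a minimal recording metadata document as UTF-8 bytes."""
    # Emit the namespace as the default namespace (xmlns="...").
    ET.register_namespace("", RECORDING_NS)
    root = ET.Element("{%s}recording" % RECORDING_NS)
    mode = ET.SubElement(root, "{%s}dataMode" % RECORDING_NS)
    mode.text = data_mode
    # xml_declaration=True adds the XML declaration with its encoding.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


DOCUMENT = build_recording()
```

A real document would add group, session, participant, and stream elements under the same root.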
This section describes each class of the metadata model and the attributes of each class. It also describes how the classes are linked and the XML element for each of them.

Each instance of a Recording Session class (namely the Recording Session object) represents a SIP session created between an SRC and an SRS for the purpose of recording a Communication Session.

A Recording Session class has the following attributes:
Start/End Time - Represents the start/end time of a Recording Session object.

Each instance of Recording Session has:
Zero or more instances of Communication Session Group (CSG). There may be zero because the CSG is an optional metadata object. Allowing zero instances also accommodates persistent recording, where there may be none.
Zero or more instances of Communication Session objects.

The Recording Session object is represented by the recording XML element. That element in turn relies on the SIP/SDP session with which the XML document is associated to provide some of the attributes of the Recording Session. The start and end time values are derivable from the Date header (if present in the SIP message) in the RS. If the Date header is not present, the start/end time is derivable from the time at which the SRS receives the SIP message that sets up or disconnects the RS.
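The fallback rule for the Recording Session start/end time can be sketched in Python as shown below. The function and parameter names are hypothetical. The sketch assumes the SIP Date header carries an RFC 1123 style date (for example "Thu, 16 Dec 2010 23:41:07 GMT"), which the standard email.utils parser accepts, and it treats an unparsable header the same as a missing one.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def rs_event_time(date_header, received_at):
    """Derive an RS start or end time.

    date_header: value of the SIP Date header, or None if absent.
    received_at: aware datetime at which the SRS received the SIP message.
    """
    if date_header:
        try:
            parsed = parsedate_to_datetime(date_header)
        except (TypeError, ValueError):
            # Older and newer Python versions raise different exceptions.
            parsed = None
        if parsed is not None:
            if parsed.tzinfo is None:
                # Assumption: treat a zone-less date as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed.astimezone(timezone.utc)
    return received_at


RECEIVED = datetime(2010, 12, 16, 23, 41, 9, tzinfo=timezone.utc)
FROM_HEADER = rs_event_time("Thu, 16 Dec 2010 23:41:07 GMT", RECEIVED)
FROM_RECEIPT = rs_event_time(None, RECEIVED)
```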
One instance of a Communication Session Group class (namely the Communication Session Group object) provides association or linking of Communication Sessions.
A CS Group has the following attributes:
Unique-ID - Groups different CSs that are related. The SRC (or SRS) is responsible for ensuring the uniqueness of the Unique-ID in case multiple SRCs interact with the same SRS. The mechanism by which the SRC groups CSs is outside the scope of SIPREC.
Associate-time - Associate-time for a CS-Group shall be calculated by the SRC as the time when a grouping is formed. The rules that determine how the SRC groups different Communication Session objects are outside the scope of SIPREC.
Disassociate-time - Disassociate-time for a CS-Group shall be calculated by the SRC as the time when the grouping ends.

The linkages between the Communication Session Group class and other classes are associations. A Communication Session Group is associated with RS and CS in the following manner:
There are one or more Recording Session objects per Communication Session Group. Each Communication Session Group object has to be associated with one or more RSs (each RS can potentially be set up by a different SRC).
There are one or more Communication Sessions per CS Group (e.g., consult transfer).

The group element is an optional element that provides information about the communication session group.
Each Communication Session Group (CSG) object is represented using one group element. Each group element has a unique Base 64 URN UUID attribute that uniquely identifies the CSG.
A Communication Session class and its objects in the metadata model represent a Communication Session and its properties as seen by the SRC.

A Communication Session class has the following attributes:
Termination Reason - The reason why a CS was terminated. The communication session MAY contain a Call Termination Reason. This MAY be derived from the SIP Reason header of the CS.
CS Identifier - Uniquely identifies a CS.
Start-time - Optional attribute that represents the start time of the CS as seen by the SRC.
Stop-time - Optional attribute that represents the stop time of the CS as seen by the SRC.

This document does not specify attributes relating to what should happen to a recording of a CS after it has been delivered to the SRS, e.g., how long to retain the recording or what access controls to apply. The SRS is assumed to behave in accordance with policy. The ability for the SRC to influence this policy is outside the scope of this document. However, in implementations where the SRC has enough information, this could be sent as Extension Data attached to the CS.
A Communication Session is linked to the CS-Group, Participant, Media Stream, and Recording Session classes using the association relationship.

Association between CS and Participant allows:
A CS to have zero or more participants.
A Participant to be associated with zero or more CSs. This includes participants who are not directly part of any CS. An example of such a case is participants in a premixed media stream. The SRC may have knowledge of such Participants, yet not have any signaling relationship with them. This might arise if one participant in a CS is a conference focus. To summarize, even if the SRC does not have direct signaling relationships with all participants in a CS, it should nevertheless create a Participant object for each participant that it knows about.
The model also allows participants in a CS that are not participants in the media. An example is the identity of a 3pcc controller that has initiated a CS to two or more participants of the CS. Another example is the identity of a conference focus. Of course a focus is probably in the media, but since it may only be there as a mixer, it may not report itself as a participant in any of the media streams.

Association between CS and Media Stream allows:
A CS to have zero or more Streams.
A stream to be associated with at most one CS. A stream in a persistent RS is not required to be associated with any CS before the CS is created, and hence the zero association is allowed.

Association between CS and RS allows:
Each instance of RS to have zero or more instances of Communication Session objects.
Each CS to be associated with one or more RSs (each RS can potentially be set up by a different SRC).
The session element provides information about the communication session.
Each Communication Session (CS) object is represented by one session element. Each session element has a unique Base 64 URN UUID attribute that uniquely identifies the CS.
The reason element MAY be included to represent the Termination Reason attribute. The group-ref element MAY be present to indicate the group to which the session belongs.
A CSRS Association class and its objects hold the attributes of a CS object that belong to the association of a session with an RS. The CSRS Association class has the following attributes:
Associate-time - Calculated by the SRC as the time it sees a CS become associated with an RS.
Disassociate-time - Calculated by the SRC as the time it sees a CS disassociate from an RS.
A given CS can have multiple associate/disassociate times within a given RS.
The CSRS Association class is linked to the CS and RS classes. There are no cardinalities for this linkage. sessionrecordingassoc is the XML element that represents a CSRS association object. The session URN UUID is used to uniquely identify this element and link it with the specific session.
A Participant class and its objects hold information about a device that is part of a CS and/or contributes/consumes media stream(s) belonging to a CS.

Participant has attributes like:
AoR / Name pair list - A list of Name/AoR tuples. An AoR MAY be a SIP/SIPS/TEL URI. Name represents the participant name (SIP display name) or DN number (in case it is known). There are cases where a participant can have more than one AoR (e.g., the P-Asserted-Identity header, which can carry both SIP and TEL URIs).

This document does not specify other attributes relating to a participant, e.g., Participant Role or Participant Type. An SRC that has information about these attributes can indicate them as part of extension data to Participant from the SRC to the SRS.

The Participant class is linked to the MS and CS classes using the association relationship. The association between Participant and Media Stream allows:
A Participant to receive zero or more media streams.
A Participant to send zero or more media streams (the same participant may provide multiple streams, e.g., audio and video).
A media stream to be received by zero or more participants. It is possible, though perhaps unlikely, that a stream is generated but sent only to the SRC and SRS, not to any participant, e.g., in conferencing where all participants are on hold and the SRC is collocated with the focus. A media stream may also be received by multiple participants (e.g., whisper calls, side conversations).
A media stream to be sent by one or more participants (pre-mixed streams).

Example of a case where a participant receives zero or more streams: a Supervisor may have a side conversation with an Agent while the Agent converses with a customer.
A participant element represents a Participant object.
A participant MUST have a NameID complex element, which contains the AoR as an attribute and the name as an element. The AoR is a SIP/SIPS URI FQDN or IP address that represents the user. name is an optional element that represents the display name.
Each participant element has a unique ID (Base 64 URN UUID) attribute that uniquely identifies the participant, and a session Base 64 URN UUID that associates the participant with a specific session element. The Base 64 URN UUID of a participant MUST be used in the scope of the CSG, and no new Base 64 URN UUID has to be created for the same element (participant, stream) between different CSs in the same CSG. If a Base 64 URN UUID is to be used permanently, how it maps to the original AoR has to be decided carefully by implementers and is the implementer's choice.
A ParticipantCS Association class and its objects hold the attributes of a participant object that belong to the association of a participant with a session.

The ParticipantCS Association class has the following attributes:
Associate-time - Calculated by the SRC as the time it sees a participant become associated with a CS.
Disassociate-time - Calculated by the SRC as the time it sees a participant disassociate from a CS. A given participant can have multiple associate/disassociate times within a given communication session.
Capabilities - Participant capabilities as defined in . This optional attribute includes the capabilities of a participant in a CS. Each participant shall have zero or more capabilities. A participant may use different capabilities depending on the role it plays at a particular instance. In other words, if a participant moves across different CSs (due to transfer, etc.) or is simultaneously present in different CSs, its role may be different and hence so may the capabilities used.

The "send" or "recv" element in each participant associates SDP m-lines with the participant. The send element indicates that the participant is sending the media stream with the mentioned media description. The recv element indicates that the participant is receiving the stream; by default all participants receive the stream. The recv element is relevant in whisper call scenarios, in which only some of the participants in the session receive the stream.

The ParticipantCS Association class is linked to the Participant and CS classes. There are no cardinalities for this linkage. The participantsessionassoc XML element represents a ParticipantCS association object. The participant and session IDs are used to uniquely identify this element.
NOTE: RFC 4235 encoding shall be used to represent the capabilities attribute in XML.

A Media Stream class (and its objects) has the properties of media as seen by the SRC and sent to the SRS. Different snapshots of a media stream object may be sent whenever there is a change in media (e.g., a direction change such as pause/resume, a codec change, and/or a participant change).
A Media Stream class has the following attributes:
Media Stream Reference - In implementations this can reference an m-line.
Content - The content of an MS element will be described in terms of value from the registry.
The metadata model should include media streams that are not being delivered to the SRS. Examples include cases where the SRC offered certain media types but the SRS chose to accept only a subset of them, or where an SRC does not even offer a certain media type because of its restrictions on recording.
A Media Stream is linked to the Participant and CS classes using the association relationship. The details of the association with Participant are described in the Participant class section. The details of the association with CS are described in the CS section.
The stream element represents a Media Stream object. It indicates the SDP media lines associated with the session and participants.
This element indicates SDP m-line properties such as the label attribute. The label attribute links the element to an m-line in the SDP body through the label attribute of that m-line.
Each stream element has a unique Base 64 URN UUID attribute that uniquely identifies the stream, and a session Base 64 URN UUID that associates the stream with a specific session element.
The content attribute, if an SRC wishes to send it, is conveyed in the RS SDP.

A ParticipantStream Association class and its objects hold the attributes that belong to the association of a Participant with a Stream.
A ParticipantStream Association class has the following attributes:
Associate-Time - The time a Participant started contributing to a Media Stream.
Disassociate-Time - The time a Participant stopped contributing to a Media Stream.
The ParticipantStream Association class is linked to the Participant and Stream classes. There are no cardinalities for this linkage. The ParticipantStreamAssoc XML element represents a participant-to-stream association object. The participant element is used to uniquely identify this element and relate it to the stream using the stream's unique URN ID.
associate-time/disassociate-time contains a string indicating the date and time of the status change of this tuple. The value of this element MUST follow the IMPP datetime format . Timestamps that contain 'T' or 'Z' MUST use the capitalized forms. At any given time, either associate-time or disassociate-time MAY be present in a group, session, or participant element, but not both.
As a security measure, the timestamp element SHOULD be included in all tuples unless the exact time of the status change cannot be determined.
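A minimal Python sketch of the timestamp rules follows. It renders a time in the capitalized 'T' and 'Z' form used by the example values in this document (such as 2010-12-16T23:41:07Z) and checks that an element carries at most one of the two time fields. The function names and the dictionary representation of an element are assumptions for illustration.

```python
from datetime import datetime, timezone


def format_timestamp(moment):
    """Render an aware datetime as a UTC timestamp with capital 'T' and 'Z'."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def has_valid_time_tuple(element):
    """True unless both associate-time and disassociate-time are present.

    element is a dict of child element names to values (hypothetical layout).
    """
    return not ("associate-time" in element
                and "disassociate-time" in element)


STAMP = format_timestamp(datetime(2010, 12, 16, 23, 41, 7, tzinfo=timezone.utc))
```

Converting to UTC before formatting keeps the trailing 'Z' truthful even when the input datetime carries a different offset.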
A unique id is generated in two steps:
1. A UUID is created using .
2. The UUID is encoded using base64 as defined in .
The above mentioned unique-id mechanism SHOULD be used for each metadata element.
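The two steps can be sketched in Python with the standard uuid and base64 modules. The choice of a random (version 4) UUID and the function names are assumptions of this sketch; the text above does not state which UUID variant an implementation uses.

```python
import base64
import uuid


def new_unique_id():
    """Create a UUID and return its 16 raw bytes encoded as base64 text."""
    # Assumption: a random (version 4) UUID.
    return base64.b64encode(uuid.uuid4().bytes).decode("ascii")


def decode_unique_id(unique_id):
    """Recover the UUID object from its base64 form."""
    return uuid.UUID(bytes=base64.b64decode(unique_id))


EXAMPLE_ID = new_unique_id()
```

Encoding the 16 raw bytes yields a 24-character string ending in "==", the same shape as identifiers such as 7+OTCyoxTmqmqyA/1weDAg== that appear in the example values of this document.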
The following example provides all the tuples involved in Recording Metadata XML body.
The following example provides partial update in Recording Metadata XML body for the above example. The example has a snapshot that carries the disassociate-time for a participant from a session.
This section defines the XML schema for the recording metadata document.
The metadata information sent from SRC to SRS MAY reveal sensitive information about different participants in a session. For this reason, it is RECOMMENDED that an SRC use a strong means for authentication and metadata information protection and that it apply comprehensive authorization rules when using the metadata format defined in this document. The following sections will discuss each of these aspects in more detail.

It is RECOMMENDED that an SRC authenticate the SRS using the normal SIP authentication mechanisms, such as Digest as defined in Section 22 of .

The mechanism used for conveying the metadata information MUST ensure integrity and SHOULD ensure confidentiality of the information. In order to achieve these, an end-to-end SIP encryption mechanism, such as S/MIME described in , SHOULD be used.

If a strong end-to-end security means (such as above) is not available, it is RECOMMENDED that an SRC use mutual hop-by-hop Transport Layer Security (TLS) authentication and encryption mechanisms described in "SIPS URI Scheme" and "Interdomain Requests" of .
This specification registers a new XML namespace, and a new XML schema.
URI: urn:ietf:params:xml:ns:recording
Registrant Contact: IETF SIPREC working group, Ram mohan R(rmohanr@cisco.com)
XML: the XML schema to be registered is contained in Section 6.
Its first line is ]]> and its last line is ]]>
We wish to thank John Elwell (Siemens-Enterprise), Henry Lum (Alcatel-Lucent), Leon Portman (Nice), De Villers, Andrew Hutton (Siemens-Enterprise), Deepanshu Gautam (Huawei), Charles Eckel (Cisco), Muthu Arul (Cisco), Michael Benenson (Cisco), Hadriel Kaplan (ACME), Brian Rosen (Neustar), Scott Orton (Broadsoft), and Ofir Roth for their valuable comments and inputs.
We wish to thank Joe Hildebrand (Cisco) and Peter Saint-Andre (Cisco) for the valuable XML-related guidance, and Martin Thompson for validating the XML schema and providing comments on the same.

Note: This Appendix has to be moved to callflow document after the discussion in the mailing alias

This section describes the metadata model object instances for different use cases of SIPREC. For the sake of simplicity, because the media streams sent by each participant are received by every other participant in these use cases, this is NOT shown in the object instance diagrams below. Also, not all attributes of each object are shown in these instance diagrams.

Basic call between two Participants A and B. In this use case each participant sends one Media Stream. For the sake of simplicity "receives" lines are not shown in this instance diagram. Media Streams sent by each participant are received by all other participants of that CS.

Basic call between two Participants A and B, with Participant A or B doing a Hold/Resume. In this use case each participant sends one Media Stream. After Hold/Resume the properties of the media can change. For the sake of simplicity "receives" lines are not shown in this instance diagram. Media Streams sent by each participant are received by all other participants of that CS.

Basic call between two Participants A and B, with Participant A transferring (consult transfer) to Participant C. In this use case each participant sends one Media Stream. After transfer the properties of Participant A's media can change. For the sake of simplicity "receives" lines are not shown in this instance diagram. Media Streams sent by each participant are received by all other participants of that CS.

Depending on who acts as the SRC and the information that an SRC has, there can be several ways to model conference use cases.
This section has instance diagrams for the following cases:
A CS where one of the participants (which is also the SRC) is a user in a conference
A CS where one of the participants is the focus (which is also the SRC)
A CS where one of the participants is a user and the SRC is a different entity like a B2BUA
A CS where one of the participants is the focus and the SRC is a different entity like a B2BUA
NOTE: There MAY be other ways to model the same use cases depending on what information the SRC has.

This is the use case where there is a CS with one of the participants (who is also the SRC) as a user in a conference. For the sake of simplicity the receive lines for each of the participants are not shown. In this example we have two participants A and B who are part of a Communication Session (CS). One of the participants, B, is part of a conference and also acts as SRC. There can be two cases here: B can be a participant of the conference or B can be a focus. In this instance diagram Participant B is a user in a conference. The SRC (Participant B) subscribes to the conference event package to get the details of the other participants. Participant B (SRC) sends these details through the metadata to the SRS. In this instance diagram the Media Stream (mixed stream) sent from Participant B has media streams contributed by conference participants (D, E, F and G). For the sake of simplicity the "receives" line is not shown here. In this example the media stream sent by each participant (A or B) of the CS is received by all other participants (A or B).

This is the use case where there is a CS where one of the participants is the focus (which is also the SRC). In this example we have two participants A and B who are part of a Communication Session (CS). One of the participants (C) is the focus of a conference and also acts as SRC. The SRC (Participant C), being the focus of the conference, has access to the details of the other participants. The SRC (Participant C) sends these details through the metadata to the SRS.
In this instance diagram the Media Stream (mixed stream) sent by C has media streams contributed by conference participants (A, B, D and E). Participants A, B, D and E send Media Streams A1, B1, D1 and E1 respectively. The media stream sent by Participant C (Focus) is received by all other participants of the CS. For the sake of simplicity the "receives" line is not shown linked to all other participants.
NOTE: The SRC (Participant C) can send a mixed stream or separate streams to the SRS.

A CS where one of the participants is a user and the SRC is a different entity like a B2BUA. In this case the SRC may not know that one of the users is part of a conference. Hence the instance diagram will not have information about the conference participants.

A CS where one of the participants is the focus and the SRC is a different entity like a B2BUA. In this case the participant that is the focus sends "isfocus" in a SIP message to the SRC. The SRC subscribes to the conference event package on seeing this "isfocus". The SRC learns the details of the other participants of the conference from the conference package and sends them in metadata to the SRS. The instance diagram for this use case is the same as Case 1.

Note: This Appendix has to be moved to callflow document after the discussion in the mailing alias

This section describes the metadata model XML instances for different use cases of SIPREC. For the sake of simplicity the complete SIP messages are NOT shown here.

Basic call between two Participants A (Alice) and B (Bob) who are part of one session. In this use case each participant sends two Media Streams. Media Streams sent by each participant are received by all other participants of that CS in this use case. Below is the initial snapshot sent by the SRC that has complete metadata. For the sake of completeness even snippets of SDP are shown. For the sake of simplicity these use cases assume the RS stream is unmixed.
complete 2010-12-16T23:41:07Z sip:alice@cisco.com FOO!bar 7+OTCyoxTmqmqyA/1weDAg==
2010-12-16T23:41:07Z FOO!bar Alice FOO!bar 2010-12-16T23:41:07Z i1Pz3to5hGk8fuXl+PbwCw== UAAMm5GRQKSCMVvLyl4rFw== 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag== Bob FOO!bar 2010-12-16T23:41:07Z 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag== UAAMm5GRQKSCMVvLyl4rFw== i1Pz3to5hGk8fuXl+PbwCw==
Basic call between two Participants A and B. This is the continuation of the above use case. One of the participants (say A) goes on hold and then resumes as part of the same session. The metadata snapshot looks as below.
During hold
partial 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag== 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag==
During resume
The snapshot will look much the same as Use-case 1.

Basic call between two Participants A and B is connected as in Use-case 1. Transfer is initiated by one of the participants or by another entity (3PCC case). The SRC sends a snapshot of the participant changes to the SRS. In this instance participant A (Alice) drops out during the transfer and Participant C (Paul) joins the session. There can be two cases here: the same session continues after transfer, or a new session (e.g., REFER-based transfer) is created.

Transfer with same session retained (e.g., RE-INVITE-based transfer). Participant A drops out and C is added to the same session. There is no change to the session/group element. C will be a new stream element, which maps to the RS SDP using the same labels in this instance.
partial 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag== 60JAJm9UTvik0Ltlih/Gzw== AcR5FUd3Edi8cACQJy/3JQ== Paul FOO!bar 2010-12-16T23:41:07Z 60JAJm9UTvik0Ltlih/Gzw== AcR5FUd3Edi8cACQJy/3JQ== 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag==
Transfer with new session (e.g., REFER-based transfer). In this case the new session is part of the same grouping (done by the SRC).
The SRC may send an optional snapshot indicating stop for the old session.
Partial 7+OTCyoxTmqmqyA/1weDAg==
2010-12-16T23:41:07Z FOO!bar 2010-12-16T23:41:07Z 2010-12-16T23:41:07Z
The SRC sends a snapshot to indicate the participant change and new session information after transfer. In this example the same RS is used.
partial 7+OTCyoxTmqmqyA/1weDAg==
2010-12-16T23:41:07Z FOO!bar FOO!bar 2010-12-16T23:32:03Z 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag== 60JAJm9UTvik0Ltlih/Gzw== AcR5FUd3Edi8cACQJy/3JQ== FOO!bar 2010-12-16T23:41:07Z 60JAJm9UTvik0Ltlih/Gzw== AcR5FUd3Edi8cACQJy/3JQ== 8zc6e0lYTlWIINA6GR+3ag== EiXGlc+4TruqqoDaNE76ag==
This example shows a snapshot of metadata sent by an SRC at CS disconnect, where the participants of the CS are Alice and Bob.
Partial 7+OTCyoxTmqmqyA/1weDAg==
2010-12-16T23:41:07Z FOO!bar 2010-12-16T23:41:07Z 2010-12-16T23:41:07Z