< draft-allan-spring-mpls-multicast-framework-00.txt   draft-allan-spring-mpls-multicast-framework-01.txt >
SPRING Working Group Dave Allan, Jeff Tantsura SPRING Working Group Dave Allan
Internet Draft Ericsson Internet Draft Ericsson
Intended status: Standards Track Intended status: Standards Track Jeff Tantsura
Expires: August 2016 Expires: December 2016
February 2016 June 2016
A Framework for Computed Multicast applied to MPLS based Segment A Framework for Computed Multicast applied to MPLS based Segment
Routing Routing
draft-allan-spring-mpls-multicast-framework-00 draft-allan-spring-mpls-multicast-framework-01
Abstract Abstract
This document describes a multicast solution for Segment Routing with This document describes a multicast solution for Segment Routing with
MPLS data plane. It is consistent with the Segment Routing MPLS data plane. It is consistent with the Segment Routing
architecture in that an IGP is augmented to distribute information in architecture in that an IGP is augmented to distribute information in
addition to the link state. In this solution it is multicast group addition to the link state. In this solution it is multicast group
membership information sufficient to synchronize state in a given membership information sufficient to synchronize state in a given
network domain. Computation is employed to determine the topology of network domain. Computation is employed to determine the topology of
any loosely specified multicast distribution tree. any loosely specified multicast distribution tree.
skipping to change at page 1, line 45 skipping to change at page 1, line 45
documents at any time. It is inappropriate to use Internet- documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as "work Drafts as reference material or to cite them other than as "work
in progress". in progress".
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 2016. This Internet-Draft will expire on December 2016.
Copyright and License Notice Copyright and License Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with carefully, as they describe your rights and restrictions with
skipping to change at page 2, line 34 skipping to change at page 2, line 34
3. Solution Overview..............................................4 3. Solution Overview..............................................4
3.1. Mapping source specific trees onto the segment routing 3.1. Mapping source specific trees onto the segment routing
architecture......................................................5 architecture......................................................5
3.2. Role of the Routing System...................................5 3.2. Role of the Routing System...................................5
3.3. MDT Construction Requirements................................6 3.3. MDT Construction Requirements................................6
3.4. Pruning - theory of operation................................6 3.4. Pruning - theory of operation................................6
4. Elements of Procedure..........................................7 4. Elements of Procedure..........................................7
4.1. Triggers for Computation.....................................7 4.1. Triggers for Computation.....................................7
4.2. FIB Determination............................................7 4.2. FIB Determination............................................7
4.2.1. Information in the IGP.....................................7 4.2.1. Information in the IGP.....................................7
4.2.2. Computation of individual segments.........................7 4.2.2. Computation of individual segments.........................8
4.3. FIB Generation..............................................10 4.3. FIB Generation..............................................10
4.4. FIB installation............................................10 4.4. FIB installation............................................11
5. Related work..................................................11 5. Related work..................................................11
5.1. IGP Extensions..............................................11 5.1. IGP Extensions..............................................11
5.2. BGP Extensions..............................................11 5.2. BGP Extensions..............................................11
6. Observations..................................................11 6. Observations..................................................12
7. Acknowledgements..............................................12 7. Acknowledgements..............................................12
8. Security Considerations.......................................12 8. Security Considerations.......................................12
9. IANA Considerations...........................................12 9. IANA Considerations...........................................12
10. References...................................................12 10. References...................................................12
10.1. Normative References.......................................12 10.1. Normative References.......................................12
10.2. Informative References.....................................12 10.2. Informative References.....................................12
11. Authors' Addresses...........................................13 11. Authors' Addresses...........................................13
1. Introduction 1. Introduction
skipping to change at page 3, line 23 skipping to change at page 3, line 23
nodes determined to have a role. Therefore state only need be nodes determined to have a role. Therefore state only need be
installed in nodes that have one of these three roles to fully installed in nodes that have one of these three roles to fully
instantiate an MDT. instantiate an MDT.
Although this approach is computationally intensive, a significant Although this approach is computationally intensive, a significant
amount of computation can be avoided when the computing agent amount of computation can be avoided when the computing agent
determines that the node it is computing for has no role in a given determines that the node it is computing for has no role in a given
MDT. This permits a computed approach to multicast convergence to be MDT. This permits a computed approach to multicast convergence to be
computationally tractable. computationally tractable.
1.1. Authors 1.1. Authors
David Allan, Jeff Tantsura Dave Allan, Jeff Tantsura
1.2. Requirements Language 1.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [RFC2119]. document are to be interpreted as described in RFC2119 [RFC2119].
2. Conventions used in this document 2. Conventions used in this document
2.1. Terminology 2.1. Terminology
skipping to change at page 4, line 13 skipping to change at page 4, line 13
or more leaves for a given multicast distribution tree or more leaves for a given multicast distribution tree
Multicast convergence - is when all computation and state Multicast convergence - is when all computation and state
installation to ensure the FIB reflects the multicast information in installation to ensure the FIB reflects the multicast information in
the IGP is complete. the IGP is complete.
MDT - multicast distribution tree. Is a tree composed of one or more MDT - multicast distribution tree. Is a tree composed of one or more
multicast segments. multicast segments.
Multicast segment - is a portion of the multicast tree where only the Multicast segment - is a portion of the multicast tree where only the
root and the leaves have been specified, and computation based upon root and the leaves have been specified, and computation based upon
the current state of the IGP database will be employed to determine the current state of the IGP database is employed to determine and
and install the required state to implement the segment. A multicast install the required state to implement the segment. For MPLS a
segment is identified by a multicast SID. multicast segment is implemented as a p2mp LSP. A multicast segment
is identified by a multicast SID.
Multicast SID - Is the data plane identifier that is used to
implement a multicast segment. As per a unicast MPLS segment, the
rightmost 20 bits of a multicast SID are encoded as a label. It is
drawn from an SRGB that is global to the SR domain.
Pinned path - Is a unique shortest path extending from a leaf Pinned path - Is a unique shortest path extending from a leaf
upstream towards the root for a given multicast segment. It is upstream towards the root for a given multicast segment. It is
therefore a component of the multicast segment that has been therefore a component of the multicast segment that has been
determined to be required. It will not necessarily extend from the leaf all the way to determined to be required. It will not necessarily extend from the leaf all the way to
the root during intermediate computation steps. A pinned path can the root during intermediate computation steps. A pinned path can
result from pruning operations. result from pruning operations.
Role - refers specifically to a node that is either a root, a leaf, a Role - refers specifically to a node that is either a root, a leaf, a
replication node, or a pinned waypoint for a given MDT. replication node, or a pinned waypoint for a given MDT.
skipping to change at page 5, line 8 skipping to change at page 5, line 13
information in the IGP database. information in the IGP database.
Explicitly routed MDTs are expressed as a tree of concatenated Explicitly routed MDTs are expressed as a tree of concatenated
multicast segments where both the leaves of each segment and the multicast segments where both the leaves of each segment and the
waypoints coupling a given segment to the upstream and/or downstream waypoints coupling a given segment to the upstream and/or downstream
segment(s) are specified in information flooded in the IGP by the segment(s) are specified in information flooded in the IGP by the
overall root of the MDT. The segments themselves will be computed as overall root of the MDT. The segments themselves will be computed as
per a loosely specified MDT. per a loosely specified MDT.
A PE acting as an overall root for a given tree is expected to be A PE acting as an overall root for a given tree is expected to be
configured by management as to where to source multicast traffic configured by the operator as to where to source multicast traffic
from, be it an attachment circuit, interworking function for client from, be it an attachment circuit, interworking function for client
technology or other. Similarly a leaf for a given tree is expected to technology or other. Similarly a leaf for a given tree is expected to
be configured by management as to the disposition of received be configured by the operator as to the disposition of received
multicast traffic. multicast traffic.
A computed segment is guaranteed to be loop free in a stable system. A computed segment is guaranteed to be loop free in a stable system.
A concatenation of segments to construct an MDT will similarly be A concatenation of segments to construct an MDT will similarly be
loop free as any collision of segments can be disambiguated in the loop free as any collision of segments can be disambiguated in the
data plane via the SIDs. data plane via the SIDs.
This architecture significantly reduces the amount of state that This architecture significantly reduces the amount of state that
needs to be installed in the data plane to support multicast. This needs to be installed in the data plane to support multicast. This
also means that the impact of many failures in the network on also means that the impact of many failures in the network on
multicast traffic distribution will be recovered by unicast local multicast traffic distribution will be recovered by unicast local
repair or unicast convergence with subsequent multicast convergence repair or unicast convergence with subsequent multicast convergence
acting in the role of network re-optimization (as opposed to acting in the role of network re-optimization (as opposed to
restoration). restoration).
3.1. Mapping source specific trees onto the segment routing architecture 3.1. Mapping source specific trees onto the segment routing architecture
A computed source specific tree for a given multicast group A computed source specific tree for a given multicast group
corresponds to one or more multicast segments in the SR architecture, corresponds to one or more multicast segments in the SR architecture.
each of which is assigned a SID, typically by management Each multicast segment is assigned a SID, typically by management
configuration of the node that will be the overall root for the configuration of the node that will be the overall root for the
source specific tree, which then uses the IGP to advertise this source specific tree. The root node then uses the IGP to advertise
information to the root's peers. this information to all nodes in the IGP area/domain.
A multicast group is implemented as the set of source specific trees A multicast group is implemented as the set of source specific trees
from all nodes that have registered transmit interest to all nodes from all nodes that have registered transmit interest to all nodes
that have registered receive interest in a multicast group. that have registered receive interest in a multicast group.
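The relationship between registrations and source specific trees can be illustrated with a minimal sketch. The `Registration` type, its field names and the `source_specific_trees` function are hypothetical and are not defined by this memo; the sketch also assumes that a root is not treated as a leaf of its own tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """Hypothetical in-memory form of one multicast registration."""
    node: str       # node that originated the registration
    group: str      # multicast group identifier
    transmit: bool  # node has registered transmit interest
    receive: bool   # node has registered receive interest


def source_specific_trees(registrations, group):
    """Return {root: set of leaves} for a single multicast group.

    One source specific tree is produced per node with transmit
    interest; its leaves are all nodes with receive interest.
    """
    members = [r for r in registrations if r.group == group]
    receivers = {r.node for r in members if r.receive}
    trees = {}
    for r in members:
        if r.transmit:
            # Assumption: a root is not a leaf of its own tree.
            trees[r.node] = receivers - {r.node}
    return trees
```

For a group in which A both sends and receives, B only receives and C only sends, the sketch yields two trees: one rooted at A with leaf B, and one rooted at C with leaves A and B.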
3.2. Role of the Routing System 3.2. Role of the Routing System
The role of the IGP is to communicate topology information, multicast The role of the IGP is to communicate topology information, multicast
registrations, unicast to SID bindings, multicast to SID bindings and capability and associated algorithm, multicast registrations, unicast
waypoints in multi-segment MDTs. No changes to topology or unicast to to SID bindings, multicast to SID bindings and waypoints in multi-
SID bindings advertisement are proposed by this memo. segment MDTs. No changes to topology or unicast to SID binding
advertisements are proposed by this memo.
The multicast registrations/bindings will be in the form of source, The multicast registrations/bindings will be in the form of source,
group, transmit/receive interest and the SID to use for the source group, transmit/receive interest and the SID to use for the source
specific multicast tree. Registrations are originated by any node specific multicast tree. Registrations are originated by any node
that has send or receive interest in a given multicast group. Nodes that has send or receive interest in a given multicast group. Nodes
will use the combination of topology and multicast registrations to will use the combination of topology and multicast registrations to
determine the nodes that have a role in each source specific tree and determine the nodes that have a role in each source specific tree and
the SID information to then derive the required FIB state. the SID information to then derive the required FIB state.
The definition of the required IGP TLVs is out of scope of this memo
and will be done in relevant IGP drafts.
3.3. MDT Construction Requirements 3.3. MDT Construction Requirements
A multicast segment in an MDT is constructed such that between any A multicast segment in an MDT is constructed such that between any
pair of nodes that have a role in the segment and are connected by a pair of nodes that have a role in the segment and are connected by a
unicast tunnel, there is not another node on the shortest path unicast tunnel, there is not another node on the shortest path
between the two with a role in that segment. This ensures that copies between the two with a role in that segment. This ensures that copies
of a packet forwarded by a multicast segment will traverse a link of a packet forwarded by a multicast segment will traverse a link
only once in a stable system. only once in a stable system.
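The construction requirement above can be checked mechanically. The following is a sketch only: the graph encoding, the function names and the use of Dijkstra's algorithm are assumptions for illustration, as is the conservative reading that a node with a role must not lie on any equal cost shortest path between the two tunnel endpoints. Link costs are assumed to be non-negative integers.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra over graph = {node: {neighbour: cost}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def role_violations(graph, role_nodes, tunnels):
    """Return (u, v, w) triples where role node w sits on a shortest
    path of the unicast tunnel between role nodes u and v."""
    violations = []
    for u, v in tunnels:
        from_u = shortest_distances(graph, u)
        if v not in from_u:
            continue  # v unreachable from u; nothing to check
        for w in sorted(role_nodes):
            if w in (u, v) or w not in from_u:
                continue
            from_w = shortest_distances(graph, w)
            # w is on a shortest u-v path if the detour via w is free.
            if v in from_w and from_u[w] + from_w[v] == from_u[v]:
                violations.append((u, v, w))
    return violations
```

An empty result indicates that the set of tunnels satisfies the requirement for the given set of nodes with a role.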
Note that this can be satisfied by a minimum cost shortest path tree, Note that this can be satisfied by a minimum cost shortest path tree,
skipping to change at page 7, line 36 skipping to change at page 7, line 36
Group membership information for a multicast segment is obtained from Group membership information for a multicast segment is obtained from
the IGP. This is true for single segment MDTs as well as multi- the IGP. This is true for single segment MDTs as well as multi-
segment MDTs. Included in the multi-segment MDT specification are the segment MDTs. Included in the multi-segment MDT specification are the
waypoint nodes in the MDT and the upstream and downstream SIDs. The waypoint nodes in the MDT and the upstream and downstream SIDs. The
specified node is expected to cross connect the SIDs to join the specified node is expected to cross connect the SIDs to join the
segments together acting in the role of leaf for the upstream segment segments together acting in the role of leaf for the upstream segment
and root for the downstream segment. and root for the downstream segment.
When a waypoint in an MDT descriptor does not exist in the IGP, the When a waypoint in an MDT descriptor does not exist in the IGP, the
assumption is that the node has failed. The response of the other assumption is that the node identified by the waypoint SID has
nodes in the system in FIB determination is to add the leaves of the failed. The response of the other nodes in the system in FIB
downstream segment to the upstream segment. determination is to add the leaves of the downstream segment to the
upstream segment.
As an example, consider a node "x" and another node
"y". At some point in time, "x" advertises a tree that identifies "y"
as a waypoint that cross connects upstream SID "a" to downstream SID
"b". At some later point node "y" fails. The other nodes in the
network will compute segment "a" as if it included all leaves and
waypoints in segment "b". All a priori state installed for segment "b"
would be removed as the failure of "y" has required "b" to be
subsumed by "a".
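The handling of a failed waypoint can be sketched as follows. The data model (a map of SID to leaf set, a list of cross connects, and a set of nodes currently present in the IGP) and the function name are hypothetical. The sketch further assumes that the failed waypoint itself is dropped from the upstream leaf set, that chained waypoint failures are handled by applying the same rule recursively, and that the cross connects do not form a cycle.

```python
def effective_leaves(sid, leaves, cross_connects, live_nodes):
    """Leaf set to compute for segment `sid`.

    leaves:         {sid: set of leaf nodes advertised for that segment}
    cross_connects: list of (waypoint_node, upstream_sid, downstream_sid)
    live_nodes:     nodes currently present in the IGP database
    """
    # Assumption: leaves that are absent from the IGP are not computed for.
    result = {n for n in leaves.get(sid, set()) if n in live_nodes}
    for node, upstream, downstream in cross_connects:
        if upstream == sid and node not in live_nodes:
            # Waypoint has failed: the downstream segment is subsumed.
            result |= effective_leaves(
                downstream, leaves, cross_connects, live_nodes
            )
    return result
```

With waypoint "y" cross connecting SID "a" to SID "b", the leaf set computed for "a" is unchanged while "y" is present in the IGP, and grows to include the leaves of "b" once "y" is absent.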
4.2.2. Computation of individual segments 4.2.2. Computation of individual segments
FIB generation for a multicast segment is the result of computation, FIB generation for a multicast segment is the result of computation,
ultimately as applied to all source specific trees in the network. ultimately as applied to all source specific trees in the network.
All computing nodes implement a common algorithm for tree generation, All computing nodes implement a common algorithm for tree generation,
as all MUST agree on the solution. as all MUST agree on the solution.
One algorithm is as follows: One algorithm is as follows:
skipping to change at page 8, line 28 skipping to change at page 8, line 41
Root---------A----------B Root---------A----------B
B is a leaf. A is not but is in a potential shortest path from root B is a leaf. A is not but is in a potential shortest path from root
to B. However A will have no role in the MDT that serves B as it to B. However A will have no role in the MDT that serves B as it
provides simple transit therefore is replaced with a direct provides simple transit therefore is replaced with a direct
connection between the root and B. connection between the root and B.
Root--------------------B Root--------------------B
Note that such pruning also needs to avoid the creation of Note that such pruning also needs to avoid the creation of
duplicate links. For example: duplicate parallel links. For example:
/----------A----------\ /----------A----------\
Root B Root B
\----------C----------/ \----------C----------/
Where A and C have no role, they can be replaced with a single link Where A and C have no role and the cost root-A-B = cost root-C-B,
from Root to B. they can be replaced with a single link from Root to B.
3) Simplify via the elimination of fewer hop paths 3) Simplify via the elimination of fewer hop paths
When for a given set of leaves, a node has multiple downstream When for a given set of leaves, a node has multiple downstream
links that converge on a common downstream point, and that set of links that converge on a common downstream point, and that set of
leaves is only a subset of the leaves reachable on one or more of leaves is only a subset of the leaves reachable on one or more of
the links, any link that only serves that subset of leaves can be the links, any link that only serves that subset of leaves can be
pruned. pruned.
For example: For example:
skipping to change at page 9, line 4 skipping to change at page 9, line 18
links that converge on a common downstream point, and that set of links that converge on a common downstream point, and that set of
leaves is only a subset of the leaves reachable on one or more of leaves is only a subset of the leaves reachable on one or more of
the links, any link that only serves that subset of leaves can be the links, any link that only serves that subset of leaves can be
pruned. pruned.
For example: For example:
--A---------------------------B --A---------------------------B
\ / \ /
-----------C----------- -----------C-----------
\ \
----D ----D
Link AB is cost 2, link AC and CB are cost 1 (cost of link CD does
not affect the example).
B and D are leaves of a root upstream of A. From A, link AB can B and D are leaves of a root upstream of A. From A, link AB can
reach leaf B. Path AC can reach leaf B and D. In this case path A-B reach leaf B. Path AC can reach leaf B and D. In this case path A-B
can be pruned from consideration. The set of leaves reachable via can be pruned from consideration. The set of leaves reachable via
link A-B is a subset of that reachable by A-C, and the paths from A link A-B is a subset of that reachable by A-C, and the paths from A
that serve that subset converge at B. that serve that subset converge at B.
4) Prune via the elimination of upstream links where the nearest 4) Prune via the elimination of upstream links where the nearest
reachable leaf is further than the closest leaf or pinned path, reachable leaf is further than the closest leaf or pinned path,
and that path does not have a candidate replication point closer and that path does not have a candidate replication point closer
than the closest leaf or pinned path, as the resulting tree will than the closest leaf or pinned path, as the resulting tree will
skipping to change at page 11, line 20 skipping to change at page 11, line 36
the given segment. the given segment.
b. Installation of state for waypoints in multi-segment MDTs. b. Installation of state for waypoints in multi-segment MDTs.
2) After T1: Update state for nodes that both had and have a role in 2) After T1: Update state for nodes that both had and have a role in
a given multicast segment. a given multicast segment.
3) After T2: Removal of state for nodes that transition from having a 3) After T2: Removal of state for nodes that transition from having a
role to not having a role for a given multicast segment. role to not having a role for a given multicast segment.
T1 and T2 will be network wide configurable values. T1 and T2 are network wide configurable values.
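The ordering of FIB changes for a given multicast segment can be sketched as a classification of role transitions. The function and the phase names are hypothetical. Steps 2 and 3 follow the text above; the treatment of a node that newly acquires a role as belonging to the first step is an assumption, since that step is only partially visible here.

```python
def installation_phase(had_role, has_role):
    """Classify one node's FIB change for one multicast segment.

    had_role: node had a role before the triggering change
    has_role: node has a role after recomputation
    """
    if had_role and has_role:
        return "update_after_T1"   # step 2: update existing state
    if had_role and not has_role:
        return "remove_after_T2"   # step 3: remove stale state
    if has_role:
        # Assumption: new state is installed in the first step.
        return "install_first"
    return "no_action"             # no role before or after
```

Separating the phases by network wide timers means that a node does not need to signal its neighbours; each node applies its own changes in the same order.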
5. Related work 5. Related work
5.1. IGP Extensions 5.1. IGP Extensions
RFC 6329 provides a useful example of some of the type of IGP changes The required IGP changes are documented in [MCAST-ISIS] and [MCAST-
that will be required. There are two aspects in RFC 6329 that are OSPF].
worth emulating:
- The advertisement of multicast registrations
- The negotiation of the algorithm to be used for MDT computation
The required changes for both IS-IS and OSPF will be documented in
separate WG targeted I-Ds.
5.2. BGP Extensions 5.2. BGP Extensions
This memo will require the specification of a new PMSI Tunnel This memo will require the specification of a new PMSI Tunnel
Attribute (SPRING P2MP tunnel, tentatively 0x09) in order to Attribute (SPRING P2MP tunnel, tentatively 0x09) in order to
integrate into the multicast framework documented in RFC 6514. integrate into the multicast framework documented in RFC 6514.
6. Observations 6. Observations
This technique is not confined to segment routing, and with the This technique is not confined to segment routing, and with the
skipping to change at page 12, line 16 skipping to change at page 12, line 24
speakers and converge independently so is written in a form that speakers and converge independently so is written in a form that
assumes a node, computing node and IGP speaker are one and the same. assumes a node, computing node and IGP speaker are one and the same.
It should be observed that the relative frugality of data plane state It should be observed that the relative frugality of data plane state
would suggest that separation of computation from nodes in the data would suggest that separation of computation from nodes in the data
plane combined with management or "software defined networking" based plane combined with management or "software defined networking" based
population of the multicast FIB entries may also be useful modes of population of the multicast FIB entries may also be useful modes of
network operation. network operation.
7. Acknowledgements 7. Acknowledgements
Thanks to Uma Chunduri for his detailed review and suggestions.
8. Security Considerations 8. Security Considerations
For a future version of this document. For a future version of this document.
9. IANA Considerations 9. IANA Considerations
For a future version of this document. This document requires the allocation of a PMSI tunnel type to
identify a SPRING P2MP tunnel type from the P-Multicast Service
Interface Tunnel (PMSI Tunnel) Tunnel Types registry.
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
10.2. Informative References 10.2. Informative References
[RFC6329] Ashwood-Smith et.al., "IS-IS Extensions Supporting IEEE [MCAST-ISIS] Allan et.al., "IS-IS extensions for Computed Multicast
802.1aq Shortest Path Bridging", IETF RFC 6329, April 2012 applied to MPLS based Segment Routing", IETF work in progress,
draft-allan-isis-spring-multicast-00, July 2016
[MCAST-OSPF] Allan et.al., "OSPF extensions for Computed Multicast
applied to MPLS based Segment Routing", IETF work in progress,
draft-allan-ospf-spring-multicast-00, July 2016
[RFC6514] Aggarwal et.al., "BGP Encodings and Procedures for Multicast [RFC6514] Aggarwal et.al., "BGP Encodings and Procedures for Multicast
in MPLS/BGP IP VPNs", IETF RFC 6514, February 2012 in MPLS/BGP IP VPNs", IETF RFC 6514, February 2012
[RFC7385] Andersson & Swallow "IANA Registry for P-Multicast Service [RFC7385] Andersson & Swallow "IANA Registry for P-Multicast Service
Interface (PMSI) Tunnel Type Code Points", IETF RFC 7385, Interface (PMSI) Tunnel Type Code Points", IETF RFC 7385,
October 2014 October 2014
11. Authors' Addresses 11. Authors' Addresses
Dave Allan (editor) Dave Allan (editor)
Ericsson Ericsson
300 Holger Way 300 Holger Way
San Jose, CA 95134 San Jose, CA 95134
USA USA
Email: david.i.allan@ericsson.com Email: david.i.allan@ericsson.com
Jeff Tantsura Jeff Tantsura
Ericsson Email: jefftant.ietf@gmail.com
200 Holger Way
San Jose, CA 95134
Email: jeff.tantsura@ericsson.com
 End of changes. 27 change blocks. 
45 lines changed or deleted 64 lines changed or added
