PIM flooding mechanism and source discoveryCisco Systems, Inc.De kleetlaan 6aDiegem1831Belgiumice@cisco.comCisco Systems, Inc.Tasman DriveSan JoseCA 95134USAstig@cisco.comAegis BMD Program Office17211 Avenue D, Suite 160DahlgrenVA 22448-5148USAmichael.brig@mda.mil
Routing
Multicast
PIM Sparse-Mode uses a Rendezvous Point (RP) and shared trees to forward
multicast packets to Last Hop Routers (LHR). After the first packet is
received by the LHR, the source of the multicast stream is learned and the
Shortest Path Tree (SPT) can be joined. This draft proposes a solution to
support PIM Sparse Mode (SM) without the need for PIM registers, RPs or
shared trees. Multicast source information is flooded throughout the multicast
domain using a new generic PIM flooding mechanism. This mechanism is
defined in this document, and is modeled after the PIM Bootstrap Router
protocol. By removing the need for RPs and shared trees, the PIM-SM procedures
are simplified, improving router operations, management and making the
protocol more robust.
PIM Sparse-Mode uses a Rendezvous Point (RP) and shared trees to forward
multicast packets to Last Hop Routers (LHR). After the first packet is
received by the LHR, the source of the multicast stream is learned and the
Shortest Path Tree (SPT) can be joined. This draft proposes a solution to
support PIM Sparse Mode (SM) without the need for PIM registers, RPs or
shared trees. Multicast source information is flooded throughout the multicast
domain using a new generic PIM flooding mechanism. This mechanism is
defined in this document, and is modeled after the Bootstrap Router
protocol . By removing the need for RPs and shared
trees, the PIM-SM procedures are simplified, improving router operations,
management and making the protocol more robust.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in RFC 2119 .
Rendezvous Point.
Bootstrap Router.
Reverse Path Forwarding.
Shortest Path Tree.
First Hop Router, directly connected to the Source.
Last Hop Router, directly connected to the receiver.
Multicast source to group mapping.
A PIM message containing SG Mappings.
The Bootstrap Router protocol (BSR) is a commonly used
protocol for distributing dynamic Group to RP mappings in PIM. It is
responsible for flooding information about such mappings throughout a PIM
domain, so that all routers in the domain can have the same information.
BSR as defined, is only able to distribute Group to RP mappings. We are
defining a more generic mechanism that can flood any kind of information
throughout a PIM domain. It is not necessarily a domain though, it depends
on administrative boundaries being configured. The forwarding rules are
identical to BSR, except that there is no BSR election. The protocol includes
an originator address which is used for RPF checking to restrict the flooding,
just like BSR. Just like BSR it is also sent hop by hop. Note that there
is no built in election mechanism as in BSR, so there can be multiple
originators. It is still possible to add such an election mechanism if this
protocol is used in scenarios where this is desirable. We include a type
field, which can allow boundaries to be defined, and election to take place,
independently per type. We call this protocol the PIM Flooding Protocol (PFP).
Reserved, Checksum Described in .
PIM Message Type. Value (pending IANA) for a PFP message.
When set, this bit means that the PFP message is not to be forwarded.
The address of the router that originated the message. This can be any
address assigned to this router, but MUST be routable in the domain to allow
successful forwarding (just like BSR address). The format for this address
is given in the Encoded-Unicast address in .
There may be different sub protocols or different uses for this generic
protocol. The PFP Type specifies which sub protocol it is used for.
Some sub protocols may require each router to do some processing of the
contents and not simply forwarding. This bit controls how a router should
treat an unknown PFP Type. When set, a router MUST NOT forward the message
when the PFP Type is unknown. When clear, a router MUST forward the message
when possible. If the PFP Type is known, then the specification of that type
will specify how to handle
the message, including whether it should be forwarded.
A message contains one or more TLVs, in this case n TLVs. The Type
specifies what kind of information is in the Value. Note that the
Type space is shared between all PFP. Not all
types make sense for all protocol types though.
The length of the the value field.
The value associated with the type and of the specified length.
We want to provide information about active multicast sources throughout a
PIM domain by making use of the generic flooding mechanism defined in the
previous section. We request PFP Type 0 to be assigned for
this purpose. We call a message with PFP Type 0 an SG Message.
We also define a PFP TLV which we request to
be type 0.
How this TLV is used with PFP Type 0 is defined in the next section.
Other PFP Types may specify the use of this TLV for other purposes.
For PFP Type 0 the U-bit MUST NOT be set. This means that routers not
supporting PFP Type 0 would still forward the message.
This TLV has type 0.
The length of the value.
The group we are announcing sources for. The format
for this address is given in the Encoded-Group format in
.
How many unicast encoded sources address encodings follow.
The Holdtime (in seconds) for the corresponding source(s).
The source address for the corresponding group.
The format for these addresses is given in the Encoded-Unicast address in .
An SG Mesage, that is a PFP message of Type 0, may contain one or more
Group Source Holdtime TLVs. This is used to flood information about
active multicast sources. Each FHR that is directly connected to an active
multicast source originates SG BSR messages. How a multicast router discovers
the source of the multicast packet and when it considers itself the FHR
follows the same procedures as the registering process described in
. After it is decided that a register needs to be sent,
the SG is not registered via the PIM SM register procedures, but the SG mapping
is included in an SG message. Note, only the SG mapping is distributed in the
message, not the entire packet as would have been done with a PIM register.
The router originating the SG messages includes one of its own addresses in
the originator field. Note that this address must be routeable due to RPF
checking. The SG messages are periodically sent for as long as the
multicast source is active, similar to how PIM registers are periodically sent.
The default announcement period is 60 seconds, which means that as long as
the source is active, it is included in an SG message originated every 60
seconds. The holdtime for the source is by default 210 seconds. Other values
can be configured, but the holdtime must be larger than the announcement
period.
A router that receives an SG message should parse the message and store the
SG mappings with a holdtimer started with the advertised holdtime for that
group. If there are directly connected receivers for that group this router
should send PIM (S,G) joins for all the SG mappings advertised in the message.
The SG mappings are kept alive for as long as the holdtimer for the source
is running. Once the holdtimer expires a PIM (S,G) prune must be sent to
remove itself from the tree.
The PIM register procedure is designed to deliver Multicast packets to the RP in the absence of a native SPT tree from the RP to the source. The register packets received on the RP are decapsulated and forwarded down the shared tree to the LHRs. As soon as an SPT tree is built, multicast packets would flow natively over the SPT to the RP or LHR and the register process would stop. The PIM register process bridges the gap between how long it takes to build the SPT tree to the FHR. If the packets would not be unicast encapsulated to the RP they would be dropped by the FHR until the SPT is setup. This functionality is important for applications where the initial packet(s) must be received for the application to work correctly. Another reason would be for bursty sources. If the application sends out a multicast packet every 4 minutes (or longer), the SPT is torn down (typically after 3:30 minutes of inactivity) before the next packet is forwarded down the tree. This will cause no multicast packet to ever be forwarded. A well behaved application should really be able to deal with packet loss since IP is a best effort based packet delivery system. But in reality this is not always the case.
With the procedures proposed in this draft the packet(s) received by the FHR will be dropped until the LHR has learned about the source and the SPT tree is built. That means for bursty sources or applications sensitive for the delivery of the first packet this proposal would not be very applicable. This proposal is mostly useful for applications that don't have strong dependency on the initial packet(s) and have a fairly constant data rate, like video distribution for example. For applications with strong dependency on the initial packet(s) we recommend using PIM Bidir or SSM . The protocol operations are much simpler compared to PIM SM, it will cause less churn in the network and both guarantee best effort delivery for the initial packet(s).
Another solution to address the problems described above is documented in . This proposal allows for a host to tell the FHR its willingness to act as Source for a certain Group before sending the data packets. LHRs have time to join the SPT tree before the host starts sending which would avoid packet loss. The SG mappings announced by can be advertised directly in SG messages, allowing a very nice integration of both proposals.
The life time of the SPT is not driven by the liveliness of Multicast data packets (which is the case with PIM SM), but by the announcements driven via . This will also prevent packet loss due to bursty sources.
In a PIM SM deployment where the network becomes partitioned, due to link or node failure, it is possible that the RP becomes unreachable to a certain part of the network. New sources that become active in that partition will not be able to register to the RP and receivers within that partition are not able to receive the traffic. Ideally you would want to have a candidate RP in each partition, but you never know in advance which routers will form a partitioned network. In order to be fully resilient, each router in the network may end up being a candidate RP. This would increase the operational complexity of the network.
The solution described in this document does not suffer from that problem. If a network becomes partitioned and new sources become active, the receivers in that partitioned will receive the SG Mappings and join the source tree. Each partition works independently of the other partition(s) and will continue to have access to sources within that partition. As soon as the network heals, the SG Mappings are re-flooded into the other partition(s) and other receives can join to the newly learned sources.
The security considerations are no different from what is documented in .
This document requires the assignment of a new PIM Protocol type for the
PIM Flooding Protocol (PFP). IANA also needs to create a registry for PFP
Types with type 0 allocated to "Source-Group Message". IANA also needs to
create a registry for PFP TLVs, with type 0 allocated to the
"Source Group Holdtime" TLV. The allocation procedures are yet to be
determined.
The authors would like to thank Arjen Boers for contributing to the initial idea and Yiqun Cai for his comments on the draft.