< draft-adubey-bfd-service-redundancy-00.txt   draft-adubey-bfd-service-redundancy-01.txt >
INTERNET-DRAFT Sami Boutros INTERNET-DRAFT Sami Boutros
Intended Status: Standard Track Ankur Dubey Intended Status: Standard Track Ankur Dubey
VMware VMware
Reshad Rahman Reshad Rahman
Cisco Cisco
Expires: November 15, 2017 May 14, 2017 Expires: May 31, 2018 November 27, 2017
Service Redundancy using BFD Service Redundancy using BFD
draft-adubey-bfd-service-redundancy-00 draft-adubey-bfd-service-redundancy-01
Abstract Abstract
In a data center, when multiple routing/service nodes provide single
active redundancy for a set of L2, L3 and/or L4-L7 services, both
non-revertive and revertive fail over modes are required for the
services. This draft describes a method to achieve the non-revertive
and revertive fail over modes for services using Bidirectional
Forwarding Detection (BFD).
skipping to change at page 2, line 21 skipping to change at page 2, line 21
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents
< draft-adubey-bfd-service-redundancy-00.txt
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Solution Overview . . . . . . . . . . . . . . . . . . . . . . . 4
3 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 5
4 Security Considerations . . . . . . . . . . . . . . . . . . . . 5
5 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 5
6 References . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 5
draft-adubey-bfd-service-redundancy-01.txt >
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Solution Overview . . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Node failover . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Per service failover for non-revertive services . . . . . . 5
3 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 6
4 Security Considerations . . . . . . . . . . . . . . . . . . . . 6
5 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 6
6 References . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 6
1 Introduction 1 Introduction
This document describes how a group of service/routing nodes in a
data center, providing single active redundancy for multiple L2/L3
and/or L4/L7 services, can use the BFD protocol to support
non-revertive as well as revertive fail over modes.
Typically, BFD is used between the group of service nodes to verify Typically, BFD is used between the group of service nodes to verify
the connectivity as well as the aliveness of the service nodes. The the connectivity as well as the aliveness of the service nodes. The
skipping to change at page 4, line 21 skipping to change at page 4, line 21
// | \ // | \
// | \ // | \
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+
|Node1 |-BFD-|Node2 |-BFD-|Node3 | |Node1 |-BFD-|Node2 |-BFD-|Node3 |
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+
|--------------BFD--------| |--------------BFD--------|
Figure 1: Figure 1:
Figure 1 shows 3 routing nodes using BFD to implement the single Figure 1 shows 3 routing nodes using BFD to implement the single
active redundancy for revertive and non-revertive services. active redundancy for revertive and non-revertive services. More than
3 routing nodes can be used.
Multiple L2/L3 and/or L4/L7 services are offered in a data center by Multiple L2/L3 and/or L4/L7 services are offered in a data center by
a set of routing/service nodes providing single active redundancy. a set of routing/service nodes providing single active redundancy.
The provisioning of the services can be done using a centralized The provisioning of the services can be done using a centralized
control plane implemented in a controller or using a distributed control plane implemented in a controller or using a distributed
dynamic control plane. dynamic control plane.
< draft-adubey-bfd-service-redundancy-00.txt

Every L2/L3 and/or L4/L7 service is identified by a unique ID known
across the routing/service nodes providing the services.

A bitmap will be used to represent the services, where each service
is represented by one bit in the bitmap. All the service nodes MUST
have the same mapping of the bit position to the service unique ID.
The bitmap position and the unique service ID could be maintained by
a network controller. The bitmap will be used in the payload of the
BFD packets sent by the service node to indicate which services the
node maintains an active status for.

Service nodes providing single active redundancy will communicate
this bitmap using BFD, carried in the BFD control packet payload.
When a backup service node takes over a service with a non-revertive
fail over mode after primary node failure, the backup node, once the
BFD session comes up with the recovered primary node, will set the
bit associated with this service in the bitmap payload carried in the
BFD control packet sent to the primary node. Furthermore, the backup
node will use a new Diag code in the BFD control packet to inform the
primary node that it out-lived it and took over the set of
non-preemptive services encoded in the bitmap of the BFD control
packet payload.

The new Diag code and the bitmap will be sent in the BFD control
packets, after the BFD session comes up, for at least twice the
detection multiplier count. Only the bits associated with
non-revertive services will be set in the bitmap by a service node
acting as a backup for those services after a primary node failure
recovery. The primary node, upon receiving the BFD control packet
with the bit set for the corresponding non-revertive service, MUST
NOT attempt to activate the service, but should remain in standby
state for the service until the backup node that took over fails.

Revertive services are assumed to revert back to the primary node
after the primary node recovers. Once the BFD session comes up
between the primary and backup node, the backup node should stop
forwarding for any revertive services. A node MUST start forwarding
all revertive services for which it is configured as a primary once
the BFD session comes up with the corresponding backup nodes. A node
MUST stop forwarding for revertive services for which it is a backup
once the BFD session comes up with the corresponding primary.

draft-adubey-bfd-service-redundancy-01.txt >

2.1 Node failover

An implementation MAY choose to support only node failover and not
per service failover. A node can be primary or backup for a given
service. On a primary node failure, all non-revertive and revertive
services will become active on the backup node.

In Figure 1, let's assume that Node1 is the primary node for a set A
of non-revertive services with Node2 as backup, and for another set B
of non-revertive services with Node3 as backup. Node1 is also primary
for a set C of revertive services with Node2 as backup, and for
another set D of revertive services with Node3 as backup.

If Node1 fails, Node2 and Node3 will set a new diag code in the BFD
control packet. This diag code will inform Node1 that Node2 and Node3
didn't fail, and that Node1 MUST NOT activate the non-revertive sets
of services A and B respectively when it comes back up. The new diag
code will be sent in the BFD control packets, after the BFD session
comes up, for at least twice the detection multiplier count.

Therefore Node1, upon receiving the BFD control packet with the new
diag code, MUST NOT attempt to activate the non-revertive services,
but remains in standby state for the non-revertive services until the
node (Node2 or Node3) that took over fails.

Revertive services are assumed to revert back to the primary node,
Node1, after the node recovers. Once the BFD session comes up between
the primary and backup nodes, the backup node should stop forwarding
for any revertive services. A node MUST start forwarding all
revertive services for which it is configured as a primary once the
BFD session comes up with the corresponding backup nodes. A node MUST
stop forwarding for revertive services for which it is a backup once
the BFD session comes up with the corresponding primary.
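The node failover rules of the -01 text can be illustrated with a
minimal sketch. It is not part of the draft: the Service record, the
services_to_activate function and the outlived_diag_seen flag are
hypothetical names, and the sketch assumes that a primary which does
not see the new diag code simply activates its non-revertive services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    primary: str
    backup: str
    revertive: bool


def services_to_activate(node, peer, services, outlived_diag_seen):
    """Services `node` forwards once its BFD session with `peer` is up.

    `outlived_diag_seen` is True when `peer` sent the new diag code
    (hypothetical flag; the real code point 0xNN is not yet assigned).
    """
    active = []
    for svc in services:
        if svc.primary != node or svc.backup != peer:
            continue
        # Revertive services always return to the primary. Non-revertive
        # services stay on the backup if it signalled that it out-lived us.
        if svc.revertive or not outlived_diag_seen:
            active.append(svc.name)
    return active


# Service sets A-D from the Figure 1 example.
SERVICES = [
    Service("A", "Node1", "Node2", revertive=False),
    Service("B", "Node1", "Node3", revertive=False),
    Service("C", "Node1", "Node2", revertive=True),
    Service("D", "Node1", "Node3", revertive=True),
]

# Node1 recovers and Node2 reports that it out-lived Node1.
print(services_to_activate("Node1", "Node2", SERVICES, True))
```

In this sketch Node1 takes back only set C from Node2, and set A
remains active on Node2 until Node2 itself fails.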
2.2 Per service failover for non-revertive services
An implementation MAY choose to support per service failover for
non-revertive services. For example, in Figure 1, some non-revertive
services could be active on Node1 while other non-revertive services
could be active on Node2 or Node3, for better load balancing of
service traffic. In this mode, every L2/L3 and/or L4/L7
non-revertive service will be identified by a unique ID known across
the routing/service nodes providing the services.
A bitmap will be used to represent the non-revertive services, where
each non-revertive service is represented by one bit in the bitmap.
All the service nodes MUST have the same mapping of the bit position
to the non-revertive service unique ID. The bitmap position and the
unique service ID could be maintained by a network controller.
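As an illustration of the shared mapping, the following sketch encodes
and decodes such a bitmap. The draft does not define the payload
layout, so the fixed width, the big-endian byte order, the service IDs
and the function names are all assumptions made for this example.

```python
def encode_bitmap(active_ids, bit_position, width_bytes):
    """Set one bit per active non-revertive service.

    `bit_position` maps service unique ID -> bit index and must be
    identical on every node (e.g. distributed by a controller).
    Width and byte order are assumptions, not defined by the draft.
    """
    bits = 0
    for service_id in active_ids:
        bits |= 1 << bit_position[service_id]
    return bits.to_bytes(width_bytes, "big")


def decode_bitmap(payload, bit_position):
    """Return the set of service IDs whose bit is set in `payload`."""
    bits = int.from_bytes(payload, "big")
    return {sid for sid, pos in bit_position.items() if (bits >> pos) & 1}


# Hypothetical mapping shared by all nodes.
BIT_POSITION = {"svc-a": 0, "svc-b": 1, "svc-c": 9}

payload = encode_bitmap({"svc-a", "svc-c"}, BIT_POSITION, width_bytes=2)
print(payload.hex(), sorted(decode_bitmap(payload, BIT_POSITION)))
```

Because every node uses the same bit positions, a receiver can recover
the sender's active services from the payload alone.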
A node that is assigned as backup for a given non-revertive service
will take over as active in either of the following cases: 1) The
node assigned as primary for this service failed. 2) This specific
service failed on the primary node for this service.
In case 1, the BFD session will go down since it is a node failure.
In case 2, the BFD session between the nodes will remain up. In
either scenario, the node assigned as backup will become active for
the non-revertive service. In case 1, the backup node will set the
new diag code in the BFD control packets once the BFD session is
re-established. In case 2, the diag code will be set in the next BFD
control packets sent after the node takes over as active for a given
non-revertive service. If there is at least one non-revertive service
for which this node is not active and at least one non-revertive
service for which it is active, the node will also send the bitmap in
the BFD control packet payload. The bits identifying the active
non-revertive services will be set in this bitmap. In both cases, the
new diag code and the optional bitmap payload will be sent in the BFD
control packets for at least twice the detection multiplier count.
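The sender-side rule for when to attach the bitmap can be sketched as
follows. This is only an illustration under assumptions: the function
name, the 4-byte big-endian payload, and the choice not to set the diag
code when the node is active for no non-revertive service are not
stated in the draft.

```python
def outlived_fields(non_revertive_ids, active_ids, bit_position, width_bytes=4):
    """Return (set_diag, payload) for the next BFD control packets.

    payload is None when no bitmap needs to be carried.
    """
    all_ids = set(non_revertive_ids)
    active = set(active_ids) & all_ids
    if not active:
        # Assumption: nothing was taken over, so no diag code is set.
        return (False, None)
    if active == all_ids:
        # Active for every non-revertive service: diag code alone suffices.
        return (True, None)
    # Active for some but not all: carry the bitmap of active services.
    bits = 0
    for service_id in active:
        bits |= 1 << bit_position[service_id]
    return (True, bits.to_bytes(width_bytes, "big"))


# Hypothetical shared mapping of service ID -> bit index.
POSITIONS = {"s1": 0, "s2": 1, "s3": 2}

print(outlived_fields(["s1", "s2", "s3"], ["s2"], POSITIONS))
```

The bitmap is only needed in the mixed case, which is why the payload
is described as optional.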
Therefore, if a node receives a BFD control packet with the new diag
code set but no payload, it MUST NOT activate any of the
non-revertive services for which it is primary. If instead a payload
is present in the BFD control packet that has the new diag code set,
the receiving node MUST NOT activate the non-revertive services
indicated by the set bits in the bitmap.
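The receiver-side interpretation can be sketched in the same way. The
function and parameter names are hypothetical, and the big-endian
payload layout is an assumption, since the draft does not specify the
encoding.

```python
def blocked_services(primary_for, diag_set, payload, bit_position):
    """Non-revertive services the receiving node must not activate.

    primary_for : IDs of non-revertive services this node is primary for
    diag_set    : True if the received packet carries the new diag code
    payload     : bitmap bytes, or None when the packet has no payload
    """
    if not diag_set:
        return set()
    if payload is None:
        # Diag code without payload: the peer is active for all of them.
        return set(primary_for)
    bits = int.from_bytes(payload, "big")
    return {sid for sid in primary_for if (bits >> bit_position[sid]) & 1}


# Hypothetical shared mapping of service ID -> bit index.
BITS = {"a": 0, "b": 1, "c": 2}

print(sorted(blocked_services({"a", "b"}, True, b"\x02", BITS)))
```

Any service not returned here may be activated by the receiving node
according to its normal primary role.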
Per service failover is not applicable to revertive services. They
will behave the same way as described in Section 2.1.
3 Acknowledgements 3 Acknowledgements
4 Security Considerations 4 Security Considerations
This document does not introduce any additional security constraints. This document does not introduce any additional security constraints.
5 IANA Considerations 5 IANA Considerations
IANA is requested to assign a new diag code from the "BFD Diagnostic IANA is requested to assign a new diag code from the "BFD Diagnostic
Codes" Codes"
< draft-adubey-bfd-service-redundancy-00.txt
Value BFD Diagnostic Code Name
----- ------------------------------------------------------------
0xNN  Out-lived and BitMap payload set with non-revertive services
draft-adubey-bfd-service-redundancy-01.txt >
Value BFD Diagnostic Code Name
----- ------------------------------------------------
0xNN  Out-lived and optional BitMap BFD control packet
      payload for non-revertive services.
6 References 6 References
[RFC5880] D. Katz, D. Ward "Bidirectional Forwarding Detection [RFC5880] D. Katz, D. Ward "Bidirectional Forwarding Detection
(BFD)". (BFD)".
Authors' Addresses Authors' Addresses
Sami Boutros Sami Boutros
VMware VMware
Email: sboutros@vmware.com Email: sboutros@vmware.com
Ankur Dubey Ankur Dubey
VMware VMware
Email: adubey@vmware.com Email: adubey@vmware.com
Reshad Rahman Reshad Rahman
Cisco Cisco
Email: rrahman@cisco.com Email: rrahman@cisco.com
 End of changes. 13 change blocks. 
43 lines changed or deleted 87 lines changed or added
