| rfc9871v2.txt | rfc9871.txt | |||
|---|---|---|---|---|
| skipping to change at line 27 | skipping to change at line 27 | |||
| This document describes the routing framework and BGP extensions to | This document describes the routing framework and BGP extensions to | |||
| enable intent-aware routing using the BGP CAR solution. The solution | enable intent-aware routing using the BGP CAR solution. The solution | |||
| defines two new BGP SAFIs (BGP CAR SAFI and BGP VPN CAR SAFI) for | defines two new BGP SAFIs (BGP CAR SAFI and BGP VPN CAR SAFI) for | |||
| IPv4 and IPv6. It also defines an extensible Network Layer | IPv4 and IPv6. It also defines an extensible Network Layer | |||
| Reachability Information (NLRI) model for both SAFIs that allows | Reachability Information (NLRI) model for both SAFIs that allows | |||
| multiple NLRI types to be defined for different use cases. Each type | multiple NLRI types to be defined for different use cases. Each type | |||
| of NLRI contains key and TLV-based non-key fields for efficient | of NLRI contains key and TLV-based non-key fields for efficient | |||
| encoding of different per-prefix information. This specification | encoding of different per-prefix information. This specification | |||
| defines two NLRI types: Color-Aware Route NLRI and IP Prefix NLRI. | defines two NLRI types: Color-Aware Route NLRI and IP Prefix NLRI. | |||
| It defines non-key TLV types for the MPLS label stack, SR-MPLS Label | It defines non-key TLV types for the MPLS label stack, SR-MPLS label | |||
| Index and Segment Routing over IPv6 (SRv6) Segment Identifiers | index, and Segment Routing over IPv6 (SRv6) Segment Identifiers | |||
| (SIDs). This solution also defines a new Local Color Mapping (LCM) | (SIDs). This solution also defines a new Local Color Mapping (LCM) | |||
| Extended Community. | Extended Community. | |||
| Status of This Memo | Status of This Memo | |||
| This document is not an Internet Standards Track specification; it is | This document is not an Internet Standards Track specification; it is | |||
| published for examination, experimental implementation, and | published for examination, experimental implementation, and | |||
| evaluation. | evaluation. | |||
| This document defines an Experimental Protocol for the Internet | This document defines an Experimental Protocol for the Internet | |||
| skipping to change at line 305 | skipping to change at line 305 | |||
| (Section 8 of [RFC9256]), in this document (service route) | (Section 8 of [RFC9256]), in this document (service route) | |||
| steering is used to describe the mapping of the traffic for a | steering is used to describe the mapping of the traffic for a | |||
| service route onto a BGP CAR path. In contrast, the term | service route onto a BGP CAR path. In contrast, the term | |||
| resolution is preserved for the mapping of an inter-domain BGP CAR | resolution is preserved for the mapping of an inter-domain BGP CAR | |||
| route on an intra-domain color-aware path. | route on an intra-domain color-aware path. | |||
| Service steering: | Service steering: | |||
| Service route maps traffic to a BGP CAR path (or other color- | Service route maps traffic to a BGP CAR path (or other color- | |||
| aware path, e.g., SR Policy). If a color-aware path is not | aware path, e.g., SR Policy). If a color-aware path is not | |||
| available, local policy may map to a color-unaware routing/TE | available, local policy may map to a color-unaware routing/TE | |||
| path (e.g., BGP LU, RSVP-TE, IGP/LDP). The service steering | path (e.g., BGP-LU, RSVP-TE, IGP/LDP). The service steering | |||
| concept is agnostic to the transport technology used. | concept is agnostic to the transport technology used. | |||
| Section 3 describes the specific service steering mechanisms | Section 3 describes the specific service steering mechanisms | |||
| leveraged for MPLS, SR-MPLS, and SRv6. | leveraged for MPLS, SR-MPLS, and SRv6. | |||
| Intra-domain resolution: | Intra-domain resolution: | |||
| BGP CAR route maps to an intra-domain color-aware path (e.g., | BGP CAR route maps to an intra-domain color-aware path (e.g., | |||
| SR Policy, IGP Flexible Algorithm, BGP CAR) or a color-unaware | SR Policy, IGP Flexible Algorithm, BGP CAR) or a color-unaware | |||
| routing/TE path (e.g., RSVP-TE, IGP/LDP, BGP-LU). | routing/TE path (e.g., RSVP-TE, IGP/LDP, BGP-LU). | |||
| Transport network: | Transport network: | |||
| skipping to change at line 454 | skipping to change at line 454 | |||
| - W/w is steered on a color-aware path provided by SR Policy | - W/w is steered on a color-aware path provided by SR Policy | |||
| * Seamless interworking of BGP CAR and SR Policy | * Seamless interworking of BGP CAR and SR Policy | |||
| - V/v is steered on a BGP CAR path that is itself resolved within | - V/v is steered on a BGP CAR path that is itself resolved within | |||
| domain 2 onto an SR Policy bound to the color of V/v | domain 2 onto an SR Policy bound to the color of V/v | |||
| Other properties: | Other properties: | |||
| * MPLS data-plane: with 300k PEs and 5 colors, the BGP CAR solution | * MPLS data plane: with 300k PEs and 5 colors, the BGP CAR solution | |||
| ensures that no single node needs to support a data-plane scaling | ensures that no single node needs to support a data plane scaling | |||
| in the order of Remote PE * C (Section 5). This would otherwise | in the order of Remote PE * C (Section 5). This would otherwise | |||
| exceed the MPLS data-plane. | exceed the MPLS data plane. | |||
| * Control-plane: a node should not install a (E, C) path if it's not | * Control plane: a node should not install a (E, C) path if it's not | |||
| participating in that color-aware path. | participating in that color-aware path. | |||
| * Incongruent color-intent mapping: the solution supports the | * Incongruent color-intent mapping: the solution supports the | |||
| signaling of a BGP CAR route across different color domains | signaling of a BGP CAR route across different color domains | |||
| (Section 2.8). | (Section 2.8). | |||
| The key benefits of this model are: | The key benefits of this model are: | |||
| * Leverage of the BGP Color-EC [RFC9012] to color service routes | * Leverage of the BGP Color-EC [RFC9012] to color service routes | |||
| * The definition of the automated service steering: a C-colored | * The definition of the automated service steering: a C-colored | |||
| service route V/v from E2 is steered onto a color-aware path (E2, | service route V/v from E2 is steered onto a color-aware path (E2, | |||
| C) | C) | |||
| * The definition of the data model of a BGP CAR path: (E, C) | * The definition of the data model of a BGP CAR path: (E, C) | |||
| - Natural extension of BGP IP/LU data model (E) | - Natural extension of BGP-IP/BGP-LU data model (E) | |||
| - Consistent with SR Policy data model | - Consistent with SR Policy data model | |||
| * The definition of the recursive resolution of a BGP CAR route: a | * The definition of the recursive resolution of a BGP CAR route: a | |||
| BGP CAR (E2, C) route via N is resolved onto the color-aware path | BGP CAR (E2, C) route via N is resolved onto the color-aware path | |||
| (N, C), which may itself be provided by BGP CAR or via another | (N, C), which may itself be provided by BGP CAR or via another | |||
| color-aware routing solution (e.g., SR Policy, IGP Flexible | color-aware routing solution (e.g., SR Policy, IGP Flexible | |||
| Algorithm) | Algorithm) | |||
| * Explicit definitions for multiple transport encapsulations (e.g., | * Explicit definitions for multiple transport encapsulations (e.g., | |||
| skipping to change at line 578 | skipping to change at line 578 | |||
| 2.3. BGP CAR Route Origination | 2.3. BGP CAR Route Origination | |||
| A BGP CAR route may be originated locally (e.g., loopback) or through | A BGP CAR route may be originated locally (e.g., loopback) or through | |||
| redistribution of an (E, C) color-aware path provided by another | redistribution of an (E, C) color-aware path provided by another | |||
| routing solution (e.g., SR Policy, IGP Flexible Algorithm, RSVP-TE, | routing solution (e.g., SR Policy, IGP Flexible Algorithm, RSVP-TE, | |||
| BGP-LU [RFC8277]). | BGP-LU [RFC8277]). | |||
| 2.4. BGP CAR Route Validation | 2.4. BGP CAR Route Validation | |||
| A BGP CAR path (E, C) via next hop N with encapsulation T is valid if | A BGP CAR path (E, C) via next hop N with encapsulation T is valid if | |||
| color-aware path (N, C) exists with encapsulation T available in | color-aware path (N, C) exists with encapsulation T available in data | |||
| data-plane. | plane. | |||
| A local policy may customize the validation process: | A local policy may customize the validation process: | |||
| * The color constraint in the first check may be relaxed. If N is | * The color constraint in the first check may be relaxed. If N is | |||
| reachable via alternate color(s) or in the default routing table, | reachable via alternate color(s) or in the default routing table, | |||
| the route may be considered valid. | the route may be considered valid. | |||
| * The data-plane availability constraint of T may be relaxed to use | * The data plane availability constraint of T may be relaxed to use | |||
| an alternate encapsulation. | an alternate encapsulation. | |||
| * A performance-measurement verification may be added to ensure that | * A performance-measurement verification may be added to ensure that | |||
| the intent associated with C is met (e.g., delay < bound). | the intent associated with C is met (e.g., delay < bound). | |||
| A path that is not valid MUST NOT be considered for BGP best path | A path that is not valid MUST NOT be considered for BGP best path | |||
| selection. | selection. | |||
| 2.5. BGP CAR Route Resolution | 2.5. BGP CAR Route Resolution | |||
| skipping to change at line 631 | skipping to change at line 631 | |||
| domain, an egress node selects and advertises an SRv6 SID from its | domain, an egress node selects and advertises an SRv6 SID from its | |||
| locator for intent C1, with a BGP CAR route. In such a case, the | locator for intent C1, with a BGP CAR route. In such a case, the | |||
| ingress node resolves the received SRv6 SID over an IPv6 route for | ingress node resolves the received SRv6 SID over an IPv6 route for | |||
| the intent-aware locator of the egress node for C1 or a summary | the intent-aware locator of the egress node for C1 or a summary | |||
| route that covers the locator. This summary route may be provided | route that covers the locator. This summary route may be provided | |||
| by SRv6 Flexible Algorithm or BGP CAR IP Prefix route itself | by SRv6 Flexible Algorithm or BGP CAR IP Prefix route itself | |||
| (e.g., Appendix C.2). | (e.g., Appendix C.2). | |||
| * Local policy may map the CAR route to mechanisms that are unaware | * Local policy may map the CAR route to mechanisms that are unaware | |||
| of color or that provide best-effort, such as RSVP-TE, IGP/LDP, | of color or that provide best-effort, such as RSVP-TE, IGP/LDP, | |||
| BGP LU/IP (e.g., Appendix A.3.2) for brownfield scenarios. | BGP-LU/BGP-IP (e.g., Appendix A.3.2) for brownfield scenarios. | |||
| Route resolution via a different color C2 can be automated by | Route resolution via a different color C2 can be automated by | |||
| attaching BGP Color-EC C2 to CAR route (E2, C1), leveraging automated | attaching BGP Color-EC C2 to CAR route (E2, C1), leveraging automated | |||
| steering as described in Section 8.4 of "Segment Routing Policy | steering as described in Section 8.4 of "Segment Routing Policy | |||
| Architecture" [RFC9256] for BGP CAR routes. This mechanism is | Architecture" [RFC9256] for BGP CAR routes. This mechanism is | |||
| illustrated in Appendix B.2. This mechanism SHOULD be supported. | illustrated in Appendix B.2. This mechanism SHOULD be supported. | |||
| For CAR route resolution, if Color-EC color is present with the | For CAR route resolution, if Color-EC color is present with the | |||
| route, it takes precedence over the route's intent color. The | route, it takes precedence over the route's intent color. The | |||
| route’s intent color is the LCM-EC color if present (see | route’s intent color is the LCM-EC color if present (see | |||
| skipping to change at line 704 | skipping to change at line 704 | |||
| AIGP updates. | AIGP updates. | |||
| Additional AIGP extensions may be defined to signal state for | Additional AIGP extensions may be defined to signal state for | |||
| specific use cases such as Maximum SID Depth (MSD) along the BGP CAR | specific use cases such as Maximum SID Depth (MSD) along the BGP CAR | |||
| route advertisement and minimum MTU along the BGP CAR route | route advertisement and minimum MTU along the BGP CAR route | |||
| advertisement. This is out of scope for this document. | advertisement. This is out of scope for this document. | |||
| 2.7. Inherent Multipath Capability | 2.7. Inherent Multipath Capability | |||
| The (E, C) route definition inherently provides availability of | The (E, C) route definition inherently provides availability of | |||
| redundant paths at every BGP hop identical to BGP-LU or BGP IP. For | redundant paths at every BGP hop identical to BGP-LU or BGP-IP. For | |||
| instance, BGP CAR routes originated by two or more egress ABRs in a | instance, BGP CAR routes originated by two or more egress ABRs in a | |||
| domain are advertised as multiple paths to ingress ABRs in the | domain are advertised as multiple paths to ingress ABRs in the | |||
| domain, where they become equal-cost or primary-backup paths. A | domain, where they become equal-cost or primary-backup paths. A | |||
| failure of an egress ABR is detected and handled by ingress ABRs | failure of an egress ABR is detected and handled by ingress ABRs | |||
| locally within the domain for faster convergence, without any | locally within the domain for faster convergence, without any | |||
| necessity to propagate the event to upstream nodes for traffic | necessity to propagate the event to upstream nodes for traffic | |||
| restoration. | restoration. | |||
| BGP ADD-PATH [RFC7911] SHOULD be enabled for BGP CAR to signal | BGP ADD-PATH [RFC7911] SHOULD be enabled for BGP CAR to signal | |||
| multiple next hops through a transport RR. | multiple next hops through a TRR. | |||
| 2.8. BGP CAR Signaling Through Different Color Domains | 2.8. BGP CAR Signaling Through Different Color Domains | |||
| [Color Domain 1 A]-----[B Color Domain 2 E2] | [Color Domain 1 A]-----[B Color Domain 2 E2] | |||
| [C1=low delay ] [C2=low delay ] | [C1=low delay ] [C2=low delay ] | |||
| Let us assume a BGP CAR route (E2, C2) is signaled from B to A, two | Let us assume a BGP CAR route (E2, C2) is signaled from B to A, two | |||
| border routers of Domain 2 and Domain 1, respectively. Let us assume | BRs of Domain 2 and Domain 1, respectively. Let us assume that these | |||
| that these two domains do not share the same color-to-intent mapping | two domains do not share the same color-to-intent mapping (i.e., they | |||
| (i.e., they belong to different color domains). Low delay in Domain | belong to different color domains). Low delay in Domain 2 is color | |||
| 2 is color C2, while it is C1 in Domain 1 (C1 <> C2). | C2, while it is C1 in Domain 1 (C1 <> C2). | |||
| It is not expected to be a typical scenario to have an underlay | It is not expected to be a typical scenario to have an underlay | |||
| transport path (e.g., an MPLS LSP) extend across different color | transport path (e.g., an MPLS LSP) extend across different color | |||
| domains. However, the BGP CAR solution seamlessly supports this rare | domains. However, the BGP CAR solution seamlessly supports this rare | |||
| scenario while maintaining the separation and independence of the | scenario while maintaining the separation and independence of the | |||
| administrative authority in different color domains. | administrative authority in different color domains. | |||
| The solution works as described below: | The solution works as described below: | |||
| * Within Domain 2, the BGP CAR route is (E2, C2) via E2. | * Within Domain 2, the BGP CAR route is (E2, C2) via E2. | |||
| skipping to change at line 786 | skipping to change at line 786 | |||
| resolution and steering. | resolution and steering. | |||
| * In the rare case of color incongruence, the local color encoded in | * In the rare case of color incongruence, the local color encoded in | |||
| LCM-EC takes precedence. | LCM-EC takes precedence. | |||
| Operational considerations are in Section 11. Further illustrations | Operational considerations are in Section 11. Further illustrations | |||
| are provided in Appendix B. | are provided in Appendix B. | |||
| 2.9. Format and Encoding | 2.9. Format and Encoding | |||
| BGP CAR leverages BGP multi-protocol extensions [RFC4760] and uses | BGP CAR leverages BGP multiprotocol extensions [RFC4760] and uses the | |||
| the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route updates | MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route updates within | |||
| within SAFI value 83 along with AFI 1 for IPv4 prefixes and AFI 2 for | SAFI value 83 along with AFI 1 for IPv4 prefixes and AFI 2 for IPv6 | |||
| IPv6 prefixes. | prefixes. | |||
| BGP speakers MUST use the BGP Capabilities Advertisement to ensure | BGP speakers MUST use the BGP Capabilities Advertisement to ensure | |||
| support for processing of BGP CAR updates. This is done as specified | support for processing of BGP CAR updates. This is done as specified | |||
| in [RFC4760], by using capability code 1 (multi-protocol BGP), with | in [RFC4760], by using capability code 1 (multiprotocol BGP), with | |||
| AFI 1 and 2 (as required) and SAFI 83. | AFI 1 and 2 (as required) and SAFI 83. | |||
| The Next Hop network address field in the MP_REACH_NLRI may either be | The Next Hop network address field in the MP_REACH_NLRI may either be | |||
| an IPv4 address or an IPv6 address, independent of AFI. If the next | an IPv4 address or an IPv6 address, independent of AFI. If the next | |||
| hop length is 4, then the next hop is an IPv4 address. The next hop | hop length is 4, then the next hop is an IPv4 address. The next hop | |||
| length may be 16 or 32 for an IPv6 next hop address, set as per | length may be 16 or 32 for an IPv6 next hop address, set as per | |||
| Section 3 of [RFC2545]. Processing of the Next Hop field is governed | Section 3 of [RFC2545]. Processing of the Next Hop field is governed | |||
| by standard BGP procedures as described in Section 3 of [RFC4760]. | by standard BGP procedures as described in Section 3 of [RFC4760]. | |||
| The sub-sections below specify the generic encoding of the BGP CAR | The sub-sections below specify the generic encoding of the BGP CAR | |||
| skipping to change at line 1334 | skipping to change at line 1334 | |||
| relied upon to extract the key and perform 'treat-as-withdraw' for | relied upon to extract the key and perform 'treat-as-withdraw' for | |||
| malformed information. | malformed information. | |||
| A sender MUST ensure that the NLRI and key lengths are the number of | A sender MUST ensure that the NLRI and key lengths are the number of | |||
| actual bytes encoded in the NLRI and key fields, respectively, | actual bytes encoded in the NLRI and key fields, respectively, | |||
| regardless of content being encoded. | regardless of content being encoded. | |||
| Given the NLRI length and Key length MUST be valid, failures in the | Given the NLRI length and Key length MUST be valid, failures in the | |||
| following checks result in 'AFI/SAFI disable' or 'session reset': | following checks result in 'AFI/SAFI disable' or 'session reset': | |||
| * The minimum NLRI length MUST be at least 2, as key length and NLRI | * The minimum NLRI Length MUST be at least 2, as Key Length and NLRI | |||
| type are required fields. | Type are required fields. | |||
| * The Key Length MUST be at least 2 less than NLRI Length. | * The Key Length MUST be at least 2 less than NLRI Length. | |||
| NLRI type-specific error handling: | NLRI type-specific error handling: | |||
| * By default, a speaker SHOULD discard an unrecognized or | * By default, a speaker SHOULD discard an unrecognized or | |||
| unsupported NLRI type and move to the next NLRI. | unsupported NLRI type and move to the next NLRI. | |||
| * Key length and key errors of a known NLRI type SHOULD result in | * Key length and key errors of a known NLRI type SHOULD result in | |||
| the discard of NLRI similar to an unrecognized NLRI type. (This | the discard of NLRI similar to an unrecognized NLRI type. (This | |||
| skipping to change at line 1453 | skipping to change at line 1453 | |||
| routes from upstream routers or Route Reflectors (RRs) to limit the | routes from upstream routers or Route Reflectors (RRs) to limit the | |||
| routes that it needs to learn. On-demand subscription and automated | routes that it needs to learn. On-demand subscription and automated | |||
| filtering procedures for individual CAR routes are outside the scope | filtering procedures for individual CAR routes are outside the scope | |||
| of this document. | of this document. | |||
| 5. Scaling | 5. Scaling | |||
| This section analyzes the key scale requirement of [INTENT-AWARE], | This section analyzes the key scale requirement of [INTENT-AWARE], | |||
| specifically: | specifically: | |||
| * No intermediate node data-plane should need to scale to (Colors * | * No intermediate node data plane should need to scale to (Colors * | |||
| PEs). | PEs). | |||
| * No node should learn and install a BGP CAR route to (E, C) if it | * No node should learn and install a BGP CAR route to (E, C) if it | |||
| does not install a colored service route to E. | does not install a colored service route to E. | |||
| While the requirements and design principles generally apply to any | While the requirements and design principles generally apply to any | |||
| transport, the logical analysis based on the network design in this | transport, the logical analysis based on the network design in this | |||
| section focuses on MPLS/SR-MPLS transport since the scaling | section focuses on MPLS/SR-MPLS transport since the scaling | |||
| constraints are specifically relevant to these technologies. BGP CAR | constraints are specifically relevant to these technologies. BGP CAR | |||
| SAFI is used here, but the considerations can apply to [RFC8277] or | SAFI is used here, but the considerations can apply to [RFC8277] or | |||
| skipping to change at line 1522 | skipping to change at line 1522 | |||
| * Each domain has Flex-Algo 128. Prefix-SID for a node is Segment | * Each domain has Flex-Algo 128. Prefix-SID for a node is Segment | |||
| Routing Global Block (SRGB) 168000 plus node number. | Routing Global Block (SRGB) 168000 plus node number. | |||
| * A BGP CAR route (E2, C1) is advertised by egress BRM node 451. | * A BGP CAR route (E2, C1) is advertised by egress BRM node 451. | |||
| The route is sourced locally from redistribution from IGP Flex- | The route is sourced locally from redistribution from IGP Flex- | |||
| Algo 128. | Algo 128. | |||
| * Not shown for simplicity, node 452 will also advertise (E2, C1). | * Not shown for simplicity, node 452 will also advertise (E2, C1). | |||
| * When a transport RR is used within the domain or across domains, | * When a TRR is used within the domain or across domains, ADD-PATH | |||
| ADD-PATH is enabled to advertise paths from both egress BRs to its | is enabled to advertise paths from both egress BRs to its clients. | |||
| clients. | ||||
| * Egress PE E2 advertises a VPN route RD:V/v with BGP Color-EC C1 | * Egress PE E2 advertises a VPN route RD:V/v with BGP Color-EC C1 | |||
| that propagates via service RRs to ingress PE E1. | that propagates via service RRs to ingress PE E1. | |||
| * E1 steers V/v prefix via color-aware path (E2, C1) and VPN label | * E1 steers V/v prefix via color-aware path (E2, C1) and VPN label | |||
| 30030. | 30030. | |||
| 5.2. Deployment Model | 5.2. Deployment Model | |||
| 5.2.1. Flat | 5.2.1. Flat | |||
| skipping to change at line 1636 | skipping to change at line 1635 | |||
| * Each BGP hop allocates local label and programs swap entry in | * Each BGP hop allocates local label and programs swap entry in | |||
| forwarding for (451, C1). | forwarding for (451, C1). | |||
| * 121 resolves received BGP CAR route (451, C1) via 231 (label | * 121 resolves received BGP CAR route (451, C1) via 231 (label | |||
| 168451) on color-aware path (231, C1). | 168451) on color-aware path (231, C1). | |||
| - Color-aware path (231, C1) is Flex-Algo 128 path to 231 (label | - Color-aware path (231, C1) is Flex-Algo 128 path to 231 (label | |||
| 168231). | 168231). | |||
| * 451 advertises BGP CAR route (E2, C1) via 451 to transport RR | * 451 advertises BGP CAR route (E2, C1) via 451 to TRR T-RR2, which | |||
| T-RR2, which reflects it to transport RR T-RR1, which reflects it | reflects it to TRR T-RR1, which reflects it to 121. | |||
| to 121. | ||||
| * 121 receives BGP CAR route (E2, C1) via 451 with label 168002. | * 121 receives BGP CAR route (E2, C1) via 451 with label 168002. | |||
| - Let's assume 121 selects that path. | - Let's assume 121 selects that path. | |||
| * 121 resolves BGP CAR route (E2, C1) via 451 on color-aware path | * 121 resolves BGP CAR route (E2, C1) via 451 on color-aware path | |||
| (451, C1). | (451, C1). | |||
| - Color-aware path (451, C1) is BGP CAR path to 451 (label | - Color-aware path (451, C1) is BGP CAR path to 451 (label | |||
| 168451). | 168451). | |||
| skipping to change at line 1720 | skipping to change at line 1718 | |||
| Figure 5: Hierarchical BGP Transport CAR, Next-Hop-Unchanged | Figure 5: Hierarchical BGP Transport CAR, Next-Hop-Unchanged | |||
| (NHU) at iBR | (NHU) at iBR | |||
| * Nodes 341, 231, and 121 receive and resolve BGP CAR route (451, | * Nodes 341, 231, and 121 receive and resolve BGP CAR route (451, | |||
| C1) the same as in the previous model. | C1) the same as in the previous model. | |||
| * Node 121 allocates local label and programs swap entry in | * Node 121 allocates local label and programs swap entry in | |||
| forwarding for (451, C1). | forwarding for (451, C1). | |||
| * 451 advertises BGP CAR route (E2, C1) to transport RR T-RR2, which | * 451 advertises BGP CAR route (E2, C1) to TRR T-RR2, which reflects | |||
| reflects it to transport RR T-RR1, which reflects it to 121. | it to TRR T-RR1, which reflects it to 121. | |||
| * Node 121 advertises (E2, C1) to E1 with next hop as 451 (i.e., | * Node 121 advertises (E2, C1) to E1 with next hop as 451 (i.e., | |||
| next-hop-unchanged). | next-hop-unchanged). | |||
| * 121 also advertises (451, C1) to E1 with next-hop-self (121) and | * 121 also advertises (451, C1) to E1 with next-hop-self (121) and | |||
| label 168451. | label 168451. | |||
| * E1 resolves BGP CAR route (451, C1) via 121 on color-aware path | * E1 resolves BGP CAR route (451, C1) via 121 on color-aware path | |||
| (121, C1). | (121, C1). | |||
| skipping to change at line 1764 | skipping to change at line 1762 | |||
| * Nodes 121, 231, and 341 perform swap operation on 168451 bound to | * Nodes 121, 231, and 341 perform swap operation on 168451 bound to | |||
| (451, C1). | (451, C1). | |||
| * 451 performs swap operation on 168002 bound to color-aware path | * 451 performs swap operation on 168002 bound to color-aware path | |||
| (E2, C1). | (E2, C1). | |||
| 5.3. Scale Analysis | 5.3. Scale Analysis | |||
| The following two tables summarize the logically analyzed scaling of | The following two tables summarize the logically analyzed scaling of | |||
| the control-plane and data-plane for the previous three models: | the control plane and data plane for the previous three models: | |||
| +=======+=====================+=====================+=============+ | +=======+=====================+=====================+=============+ | |||
| \| \| E1 \| 121 \| 231 \| | \| \| E1 \| 121 \| 231 \| | |||
| +=======+=====================+=====================+=============+ | +=======+=====================+=====================+=============+ | |||
| \| FLAT \| (E2,C) via (121,C) \| (E2,C) via (231,C) \| (E2,C) via \| | \| FLAT \| (E2,C) via (121,C) \| (E2,C) via (231,C) \| (E2,C) via \| | |||
| \| \| \| \| (341,C) \| | \| \| \| \| (341,C) \| | |||
| +=======+---------------------+---------------------+-------------+ | +=======+---------------------+---------------------+-------------+ | |||
| \| H.NHS \| (E2,C) via (121,C) \| (E2,C) via (451,C) \| (451,C) via \| | \| H.NHS \| (E2,C) via (121,C) \| (E2,C) via (451,C) \| (451,C) via \| | |||
| \| \| \| (451,C) via (231,C) \| (341,C) \| | \| \| \| (451,C) via (231,C) \| (341,C) \| | |||
| +=======+---------------------+---------------------+-------------+ | +=======+---------------------+---------------------+-------------+ | |||
| skipping to change at line 1806 | skipping to change at line 1804 | |||
| +=======+------------+------------------+------------------+ | +=======+------------+------------------+------------------+ | |||
| Table 2 | Table 2 | |||
| * The flat model is the simplest design, with a single BGP transport | * The flat model is the simplest design, with a single BGP transport | |||
| level. It results in the minimum label/SID stack at each BGP hop. | level. It results in the minimum label/SID stack at each BGP hop. | |||
| However, it significantly increases the scale impact on the core | However, it significantly increases the scale impact on the core | |||
| BRs (e.g., 341), whose FIB capacity and even MPLS label space may | BRs (e.g., 341), whose FIB capacity and even MPLS label space may | |||
| be exceeded. | be exceeded. | |||
| - 341's data-plane scales with (E2, C) where there may be 300k Es | - 341's data plane scales with (E2, C) where there may be 300k Es | |||
| and 5 Cs, hence 1.5M entries > 1M MPLS data-plane. | and 5 Cs, hence 1.5M entries > 1M MPLS data plane. | |||
| * The hierarchical models avoid the need for core BRs to learn | * The hierarchical models avoid the need for core BRs to learn | |||
| routes and install label forwarding entries for (E, C) routes. | routes and install label forwarding entries for (E, C) routes. | |||
| - Whether next hop is set to self or left unchanged at 121, 341's | - Whether next hop is set to self or left unchanged at 121, 341's | |||
| data-plane scales with (451, C) where there may be thousands of | data plane scales with (451, C) where there may be thousands of | |||
| 451s and 5 Cs. Therefore, this scaling is well under the 1 | 451s and 5 Cs. Therefore, this scaling is well under the 1 | |||
| million MPLS labels data-plane limit. | million MPLS labels data plane limit. | |||
| - They also aid faster convergence by allowing the PE routes to | - They also aid faster convergence by allowing the PE routes to | |||
| be distributed via out-of-band RRs that can be scaled | be distributed via out-of-band RRs that can be scaled | |||
| independent of the transport BRs. | independent of the transport BRs. | |||
| * The next-hop-self option at ingress BRM (e.g., 121) hides the | * The next-hop-self option at ingress BRM (e.g., 121) hides the | |||
| hierarchical design from the ingress PE, keeping its outgoing | hierarchical design from the ingress PE, keeping its outgoing | |||
| label programming as simple as the flat model. However, the | label programming as simple as the flat model. However, the | |||
| ingress BRM requires an additional BGP transport level recursion, | ingress BRM requires an additional BGP transport level recursion, | |||
| which coupled with load-balancing adds data-plane complexity. It | which coupled with load-balancing adds data plane complexity. It | |||
| needs to support a swap and push operation. It also needs to | needs to support a swap and push operation. It also needs to | |||
| install label forwarding entries for the egress PEs that are of | install label forwarding entries for the egress PEs that are of | |||
| interest to its local ingress PEs. | interest to its local ingress PEs. | |||
| * With the next-hop-unchanged option at ingress BRM (e.g., 121), | * With the next-hop-unchanged option at ingress BRM (e.g., 121), | |||
| only an ingress PE needs to learn and install output label entries | only an ingress PE needs to learn and install output label entries | |||
| for egress (E, C) routes. The ingress BRM only installs label | for egress (E, C) routes. The ingress BRM only installs label | |||
| forwarding entries for the egress ABR (e.g., 451). However, the | forwarding entries for the egress ABR (e.g., 451). However, the | |||
| ingress PE needs an additional BGP transport level recursion and | ingress PE needs an additional BGP transport level recursion and | |||
| pushes a BGP VPN label and two BGP transport labels. It may also | pushes a BGP VPN label and two BGP transport labels. It may also | |||
| need to handle load-balancing for the egress ABRs. This is the | need to handle load-balancing for the egress ABRs. This is the | |||
| most complex data-plane option for the ingress PE. | most complex data plane option for the ingress PE. | |||
| 5.4. Anycast SID | 5.4. Anycast SID | |||
| This section describes how Anycast SID complements and improves the | This section describes how Anycast SID complements and improves the | |||
| scaling designs above. | scaling designs above. | |||
| 5.4.1. Anycast SID for Transit Inter-Domain Nodes | 5.4.1. Anycast SID for Transit Inter-Domain Nodes | |||
| * Redundant BRs (e.g., two egress BRMs, 451 and 452) advertise BGP | * Redundant BRs (e.g., two egress BRMs, 451 and 452) advertise BGP | |||
| CAR routes for a local PE (e.g., E2) with the same SID (based on | CAR routes for a local PE (e.g., E2) with the same SID (based on | |||
| skipping to change at line 2194 ¶ | skipping to change at line 2192 ¶ | |||
| with existing operational usage, the CAR IP Prefix route is allowed | with existing operational usage, the CAR IP Prefix route is allowed | |||
| to be without color for best-effort. In this case, the routes will | to be without color for best-effort. In this case, the routes will | |||
| not carry an LCM-EC. Resolution is described in Section 2.5. | not carry an LCM-EC. Resolution is described in Section 2.5. | |||
| As described in Section 7.3, infrastructure prefixes are intended to | As described in Section 7.3, infrastructure prefixes are intended to | |||
| be carried in CAR SAFI instead of SAFIs that also carry service | be carried in CAR SAFI instead of SAFIs that also carry service | |||
| routes such as BGP-IP (SAFI 1, [RFC4271]) and BGP-LU (SAFI 4, | routes such as BGP-IP (SAFI 1, [RFC4271]) and BGP-LU (SAFI 4, | |||
| [RFC4798]). However, if such infrastructure routes are also | [RFC4798]). However, if such infrastructure routes are also | |||
| distributed in these SAFIs, a router may receive both BGP CAR SAFI | distributed in these SAFIs, a router may receive both BGP CAR SAFI | |||
| paths and IP/LU SAFI paths. By default, the CAR SAFI transport path | paths and IP/LU SAFI paths. By default, the CAR SAFI transport path | |||
| is preferred over the BGP IP or BGP-LU SAFI path. | is preferred over the BGP-IP or BGP-LU SAFI path. | |||
| A BGP transport CAR speaker that supports packet forwarding lookup | A BGP transport CAR speaker that supports packet forwarding lookup | |||
| based on the IPv6 prefix route (such as a BR) will set itself as next | based on the IPv6 prefix route (such as a BR) will set itself as next | |||
| hop while advertising the route to peers. It will also install the | hop while advertising the route to peers. It will also install the | |||
| IPv6 route into forwarding with the received next hop and/or | IPv6 route into forwarding with the received next hop and/or | |||
| encapsulation. If such a transit router does not support this route | encapsulation. If such a transit router does not support this route | |||
| type, it will not install this route and will not set itself as next | type, it will not install this route and will not set itself as next | |||
| hop; hence, it will not propagate the route any further. | hop; hence, it will not propagate the route any further. | |||
| 9. VPN CAR | 9. VPN CAR | |||
| skipping to change at line 2289 ¶ | skipping to change at line 2287 ¶ | |||
| CAR routes distributed in VPN CAR SAFI are infrastructure routes | CAR routes distributed in VPN CAR SAFI are infrastructure routes | |||
| advertised by CEs in different customer VRFs on a PE. Example use | advertised by CEs in different customer VRFs on a PE. Example use | |||
| cases are intent-aware L3VPN Carriers' Carriers (Section 9 of | cases are intent-aware L3VPN Carriers' Carriers (Section 9 of | |||
| [RFC4364]) and SRv6 over a provider network. The VPN RD | [RFC4364]) and SRv6 over a provider network. The VPN RD | |||
| distinguishes CAR routes of different customers being advertised by | distinguishes CAR routes of different customers being advertised by | |||
| the PE. | the PE. | |||
| 9.1. Format and Encoding | 9.1. Format and Encoding | |||
| BGP VPN CAR SAFI leverages BGP multi-protocol extensions [RFC4760] | BGP VPN CAR SAFI leverages BGP multiprotocol extensions [RFC4760] and | |||
| and uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route | uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route | |||
| updates within SAFI value 84 along with AFI 1 for IPv4 VPN CAR | updates within SAFI value 84 along with AFI 1 for IPv4 VPN CAR | |||
| prefixes and AFI 2 for IPv6 VPN CAR prefixes. | prefixes and AFI 2 for IPv6 VPN CAR prefixes. | |||
| BGP speakers MUST use the BGP Capabilities Advertisement to ensure | BGP speakers MUST use the BGP Capabilities Advertisement to ensure | |||
| support for processing of BGP VPN CAR updates. This is done as | support for processing of BGP VPN CAR updates. This is done as | |||
| specified in [RFC4760], by using capability code 1 (multi-protocol | specified in [RFC4760], by using capability code 1 (multiprotocol | |||
| BGP), with AFI 1 and 2 (as required) and SAFI 84. | BGP), with AFI 1 and 2 (as required) and SAFI 84. | |||
| The Next Hop network address field in the MP_REACH_NLRI may contain | The Next Hop network address field in the MP_REACH_NLRI may contain | |||
| either a VPN-IPv4 or a VPN-IPv6 address with 8-octet RD set to zero, | either a VPN-IPv4 or a VPN-IPv6 address with 8-octet RD set to zero, | |||
| independent of AFI. If the next hop length is 12, then the next hop | independent of AFI. If the next hop length is 12, then the next hop | |||
| is a VPN-IPv4 address with an RD of 0 constructed as per [RFC4364]. | is a VPN-IPv4 address with an RD of 0 constructed as per [RFC4364]. | |||
| If the next hop length is 24 or 48, then the next hop is a VPN-IPv6 | If the next hop length is 24 or 48, then the next hop is a VPN-IPv6 | |||
| address constructed as per Section 3.2.1.1 of [RFC4659]. | address constructed as per Section 3.2.1.1 of [RFC4659]. | |||
| 9.1.1. VPN CAR (E, C) NLRI Type | 9.1.1. VPN CAR (E, C) NLRI Type | |||
| skipping to change at line 2819 ¶ | skipping to change at line 2817 ¶ | |||
| * The following description applies to the reference topology above: | * The following description applies to the reference topology above: | |||
| - IGP Flex-Algo 128 is running in each domain, and mapped to | - IGP Flex-Algo 128 is running in each domain, and mapped to | |||
| color C1. | color C1. | |||
| - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
| EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | |||
| route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
| - BGP CAR route (E2, C1) with next hop, label index, and label as | - BGP CAR route (E2, C1) with next hop, label index, and label as | |||
| shown above is advertised through border routers in each | shown above is advertised through BRs in each domain. When an | |||
| domain. When an RR is used in the domain, ADD-PATH is enabled | RR is used in the domain, ADD-PATH is enabled to advertise | |||
| to advertise multiple available paths. | multiple available paths. | |||
| - On each BGP hop, the (E2, C1) route's next hop is resolved over | - On each BGP hop, the (E2, C1) route's next hop is resolved over | |||
| IGP Flex-Algo 128 of the domain. The AIGP attribute influences | IGP Flex-Algo 128 of the domain. The AIGP attribute influences | |||
| the BGP CAR route best path decision as per [RFC7311]. The BGP | the BGP CAR route best path decision as per [RFC7311]. The BGP | |||
| CAR label swap entry is installed that goes over Flex-Algo 128 | CAR label swap entry is installed that goes over Flex-Algo 128 | |||
| LSP to next hop providing intent in each IGP domain. The AIGP | LSP to next hop providing intent in each IGP domain. The AIGP | |||
| metric should be updated to reflect Flex-Algo 128 metric to | metric should be updated to reflect Flex-Algo 128 metric to | |||
| next hop. | next hop. | |||
| - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | |||
| skipping to change at line 2905 ¶ | skipping to change at line 2903 ¶ | |||
| o SR Policy (C1, 231) segments <S2, 231>, and | o SR Policy (C1, 231) segments <S2, 231>, and | |||
| o SR Policy (C1, E2) segments <S3, E2>. | o SR Policy (C1, E2) segments <S3, E2>. | |||
| - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
| EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | |||
| route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
| - BGP CAR route (E2, C1) with next hop, label index, and label as | - BGP CAR route (E2, C1) with next hop, label index, and label as | |||
| shown above is advertised through border routers in each | shown above is advertised through BRs in each domain. When an | |||
| domain. When an RR is used in the domain, ADD-PATH is enabled | RR is used in the domain, ADD-PATH is enabled to advertise | |||
| to advertise multiple available paths. | multiple available paths. | |||
| - On each BGP hop, the CAR route (E2, C1) next hop is resolved | - On each BGP hop, the CAR route (E2, C1) next hop is resolved | |||
| over an SR Policy (C1, next hop). The BGP CAR label swap entry | over an SR Policy (C1, next hop). The BGP CAR label swap entry | |||
| is installed that goes over SR Policy segment list. | is installed that goes over SR Policy segment list. | |||
| - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | |||
| route RD:V/v into (E2, C1). | route RD:V/v into (E2, C1). | |||
| * Important: | * Important: | |||
| skipping to change at line 2978 ¶ | skipping to change at line 2976 ¶ | |||
| * The following description applies to the reference topology above: | * The following description applies to the reference topology above: | |||
| - IGP Flex-Algo 128 is only enabled in core (e.g., WAN network), | - IGP Flex-Algo 128 is only enabled in core (e.g., WAN network), | |||
| mapped to C1. Access network domain only has Base Algo 0. | mapped to C1. Access network domain only has Base Algo 0. | |||
| - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
| EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | |||
| route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
| - BGP CAR route (E2, C1) with next hop, label index, and label as | - BGP CAR route (E2, C1) with next hop, label index, and label as | |||
| shown above is advertised through border routers in each | shown above is advertised through BRs in each domain. When an | |||
| domain. When an RR is used in the domain, ADD-PATH is enabled | RR is used in the domain, ADD-PATH is enabled to advertise | |||
| to advertise multiple available paths. | multiple available paths. | |||
| - Local policy on 231 and 232 maps intent C1 to resolve CAR route | - Local policy on 231 and 232 maps intent C1 to resolve CAR route | |||
| next hop over IGP Base Algo 0 in right access domain. The BGP | next hop over IGP Base Algo 0 in right access domain. The BGP | |||
| CAR label swap entry is installed that goes over Base Algo 0 | CAR label swap entry is installed that goes over Base Algo 0 | |||
| LSP to next hop. AIGP metric is updated to reflect Base Algo 0 | LSP to next hop. AIGP metric is updated to reflect Base Algo 0 | |||
| metric to next hop with an additional penalty (+1000). | metric to next hop with an additional penalty (+1000). | |||
| - On 121 and 122, CAR route (E2, C1) next hop learnt from Core | - On 121 and 122, CAR route (E2, C1) next hop learnt from Core | |||
| domain is resolved over IGP Flex-Algo 128. The BGP CAR label | domain is resolved over IGP Flex-Algo 128. The BGP CAR label | |||
| swap entry is installed that goes over Flex-Algo 128 LSP to | swap entry is installed that goes over Flex-Algo 128 LSP to | |||
| skipping to change at line 3058 ¶ | skipping to change at line 3056 ¶ | |||
| - RSVP-TE MPLS tunnel mesh is configured only in core (e.g., WAN | - RSVP-TE MPLS tunnel mesh is configured only in core (e.g., WAN | |||
| network). Access only has IS-IS/LDP. (The figure does not | network). Access only has IS-IS/LDP. (The figure does not | |||
| show all TE tunnels.) | show all TE tunnels.) | |||
| - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
| EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | |||
| route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
| - BGP CAR route (E2, C1) with next hops and labels as shown above | - BGP CAR route (E2, C1) with next hops and labels as shown above | |||
| is advertised through border routers in each domain. When an | is advertised through BRs in each domain. When an RR is used | |||
| RR is used in the domain, ADD-PATH is enabled to advertise | in the domain, ADD-PATH is enabled to advertise multiple | |||
| multiple available paths. | available paths. | |||
| - Local policy on 231 and 232 maps intent C1 to resolve CAR route | - Local policy on 231 and 232 maps intent C1 to resolve CAR route | |||
| next hop over best-effort LDP LSP in access domain 1. The BGP | next hop over best-effort LDP LSP in access domain 1. The BGP | |||
| CAR label swap entry is installed that goes over LDP LSP to | CAR label swap entry is installed that goes over LDP LSP to | |||
| next hop. AIGP metric is updated to reflect best-effort metric | next hop. AIGP metric is updated to reflect best-effort metric | |||
| to next hop with an additional penalty (+1000). | to next hop with an additional penalty (+1000). | |||
| - Local policy on 121 and 122 maps intent C1 to resolve CAR route | - Local policy on 121 and 122 maps intent C1 to resolve CAR route | |||
| next hop in Core domain over RSVP-TE tunnels. The BGP CAR | next hop in Core domain over RSVP-TE tunnels. The BGP CAR | |||
| label swap entry is installed that goes over a TE tunnel to | label swap entry is installed that goes over a TE tunnel to | |||
| skipping to change at line 3092 ¶ | skipping to change at line 3090 ¶ | |||
| - Dynamic BGP CAR label carries intent from PEs, which is | - Dynamic BGP CAR label carries intent from PEs, which is | |||
| realized in Core domain by resolution via RSVP-TE tunnel. | realized in Core domain by resolution via RSVP-TE tunnel. | |||
| A.4. Transit Network Domains That Do Not Support CAR | A.4. Transit Network Domains That Do Not Support CAR | |||
| * In a brownfield deployment, color-aware paths between two PEs may | * In a brownfield deployment, color-aware paths between two PEs may | |||
| need to go through a transit domain that does not support CAR. | need to go through a transit domain that does not support CAR. | |||
| Examples of such a brownfield network include an MPLS LDP network | Examples of such a brownfield network include an MPLS LDP network | |||
| with IGP best-effort, or a multi-domain network based on BGP-LU. | with IGP best-effort, or a multi-domain network based on BGP-LU. | |||
| An MPLS LDP network with best-effort IGP can adopt the above | An MPLS LDP network with best-effort IGP can adopt the above | |||
| scheme in Appendix A.3. Below is the example scenario for BGP LU. | scheme in Appendix A.3. Below is the example scenario for BGP-LU. | |||
| * Reference topology: | * Reference topology: | |||
| E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2 | E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2 | |||
| Ci <----LU----> Ci | Ci <----LU----> Ci | |||
| Figure 10: BGP CAR Not Supported in Transit Domain | Figure 10: BGP CAR Not Supported in Transit Domain | |||
| - Network between BR2 and BR3 comprises multiple BGP-LU hops | - Network between BR2 and BR3 comprises multiple BGP-LU hops | |||
| (over IGP-LDP domains). | (over IGP-LDP domains). | |||
| skipping to change at line 3215 ¶ | skipping to change at line 3213 ¶ | |||
| different domains. | different domains. | |||
| A.6. Per-Flow Steering over CAR Routes | A.6. Per-Flow Steering over CAR Routes | |||
| This section provides an example of ingress PE per-flow steering as | This section provides an example of ingress PE per-flow steering as | |||
| defined in Section 8.6 of [RFC9256] onto BGP CAR routes. | defined in Section 8.6 of [RFC9256] onto BGP CAR routes. | |||
| The following description applies to the reference topology in | The following description applies to the reference topology in | |||
| Figure 6: | Figure 6: | |||
| * Ingress PE E1 learns best-effort BGP LU route E2. | * Ingress PE E1 learns best-effort BGP-LU route E2. | |||
| * Ingress PE E1 learns CAR route (E2, C1), where C1 is mapped to "low | * Ingress PE E1 learns CAR route (E2, C1), where C1 is mapped to "low | |||
| delay". | delay". | |||
| * Ingress PE E1 learns CAR route (E2, C2), where C2 is mapped to "low | * Ingress PE E1 learns CAR route (E2, C2), where C2 is mapped to "low | |||
| delay and avoid resource R". | delay and avoid resource R". | |||
| * Ingress PE E1 is configured to instantiate an array of paths to E2 | * Ingress PE E1 is configured to instantiate an array of paths to E2 | |||
| where entry 0 is the BGP LU path to next hop, color C1 is the | where entry 0 is the BGP-LU path to next hop, color C1 is the | |||
| first entry, and color C2 is the second entry. The index into the | first entry, and color C2 is the second entry. The index into the | |||
| array is called a Forwarding Class (FC). The index can have | array is called a Forwarding Class (FC). The index can have | |||
| values 0 to 7, especially when derived from the MPLS TC bits | values 0 to 7, especially when derived from the MPLS TC bits | |||
| [RFC5462]. | [RFC5462]. | |||
| * E1 is configured to match flows in its ingress interfaces (upon | * E1 is configured to match flows in its ingress interfaces (upon | |||
| any field such as Ethernet destination/source/VLAN/TOS or IP | any field such as Ethernet destination/source/VLAN/TOS or IP | |||
| destination/source/DSCP or transport ports, etc.) and color them | destination/source/DSCP or transport ports, etc.) and color them | |||
| with an internal per-packet FC variable (0, 1, or 2 in this | with an internal per-packet FC variable (0, 1, or 2 in this | |||
| example). | example). | |||
| skipping to change at line 3640 ¶ | skipping to change at line 3638 ¶ | |||
| - Similarly, Prefix B:C12::/32 summarizes Flex-Algo 128 block in | - Similarly, Prefix B:C12::/32 summarizes Flex-Algo 128 block in | |||
| AS2. | AS2. | |||
| - Per Flex-Algo external subnets for eBGP next hops IP1 and IP2 | - Per Flex-Algo external subnets for eBGP next hops IP1 and IP2 | |||
| are distributed in IS-IS within AS2. | are distributed in IS-IS within AS2. | |||
| * BGP CAR prefix route B:C11::/32 with LCM C1 is originated by AS1 | * BGP CAR prefix route B:C11::/32 with LCM C1 is originated by AS1 | |||
| BRs 231 and 232 on eBGP sessions to AS2 BRs 121 and 122. | BRs 231 and 232 on eBGP sessions to AS2 BRs 121 and 122. | |||
| * ASBR 121 and 122 propagate the route in AS2 to all the P, ABRs, | * ASBR 121 and 122 propagate the route in AS2 to all the P, ABRs, | |||
| and PEs through transport RR. | and PEs through TRR. | |||
| * Every router in AS2 resolves BGP CAR prefix B:C11::/32 next hops | * Every router in AS2 resolves BGP CAR prefix B:C11::/32 next hops | |||
| IP1 and IP2 in IS-ISv6 Flex-Algo 128 and programs B:C11::/32 | IP1 and IP2 in IS-ISv6 Flex-Algo 128 and programs B:C11::/32 | |||
| prefix in global IPv6 forwarding table. | prefix in global IPv6 forwarding table. | |||
| * AIGP attribute influences BGP CAR route best path decision. | * AIGP attribute influences BGP CAR route best path decision. | |||
| * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | |||
| B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | |||
| color C1 intent. | color C1 intent. | |||
| skipping to change at line 3746 ¶ | skipping to change at line 3744 ¶ | |||
| domain for the given intent. Node locators in the egress | domain for the given intent. Node locators in the egress | |||
| domain are sub-allocated from the block. | domain are sub-allocated from the block. | |||
| - Prefix B:C12::/32 summarizes Flex-Algo 128 block in transit | - Prefix B:C12::/32 summarizes Flex-Algo 128 block in transit | |||
| domain. | domain. | |||
| - Prefix B:C13::/32 summarizes Flex-Algo 128 block in ingress | - Prefix B:C13::/32 summarizes Flex-Algo 128 block in ingress | |||
| domain. | domain. | |||
| * BGP CAR route B:C11::/32 is originated by ABRs 231 and 232 with | * BGP CAR route B:C11::/32 is originated by ABRs 231 and 232 with | |||
| LCM C1. Along the propagation path, border routers set next-hop- | LCM C1. Along the propagation path, BRs set next-hop-self and | |||
| self and appropriately update the intra-domain encapsulation | appropriately update the intra-domain encapsulation information | |||
| information for the C1 intent. For example, 231 and 121 signal | for the C1 intent. For example, 231 and 121 signal SRv6 SID of | |||
| SRv6 SID of End behavior [RFC8986] allocated from their respective | End behavior [RFC8986] allocated from their respective locators | |||
| locators for the C1 intent. (Note: IGP Flexible Algorithm is | for the C1 intent. (Note: IGP Flexible Algorithm is shown for | |||
| shown for intra-domain path, but SR Policy may also provide the | intra-domain path, but SR Policy may also provide the path as | |||
| path as shown in Appendix C.3.) | shown in Appendix C.3.) | |||
| * AIGP attribute influences BGP CAR route best path decision. | * AIGP attribute influences BGP CAR route best path decision. | |||
| * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | |||
| B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | |||
| color C1 intent. | color C1 intent. | |||
| * Ingress PE E1 learns CAR route B:C11::/32 and VPN route RD:V/v | * Ingress PE E1 learns CAR route B:C11::/32 and VPN route RD:V/v | |||
| with SRv6 SID B:C11:2:DT4::. | with SRv6 SID B:C11:2:DT4::. | |||
| skipping to change at line 3774 ¶ | skipping to change at line 3772 ¶ | |||
| steered along IPv6 routed path provided by BGP CAR IP Prefix route | steered along IPv6 routed path provided by BGP CAR IP Prefix route | |||
| to locator B:C11::/32. | to locator B:C11::/32. | |||
| Important: | Important: | |||
| * Uses longest prefix match of SRv6 Service SID to BGP CAR prefix. | * Uses longest prefix match of SRv6 Service SID to BGP CAR prefix. | |||
| There is no mapping of labels/SIDs; there is simple IP-based | There is no mapping of labels/SIDs; there is simple IP-based | |||
| forwarding instead. | forwarding instead. | |||
| * Originating domain PE locators of the given intent can be | * Originating domain PE locators of the given intent can be | |||
| summarized on transit BGP hops eliminating per PE state on border | summarized on transit BGP hops eliminating per PE state on BRs. | |||
| routers. | ||||
| Packet forwarding: | Packet forwarding: | |||
| @E1: IPv4 VRF V/v => H.Encaps.red <B:C13:121:END::, B:C11:2:DT4::> | @E1: IPv4 VRF V/v => H.Encaps.red <B:C13:121:END::, B:C11:2:DT4::> | |||
| @121: My SID table: B:C13:121:END:: => Update DA with B:C11:2:DT4:: | @121: My SID table: B:C13:121:END:: => Update DA with B:C11:2:DT4:: | |||
| @121: IPv6 Table: B:C11::/32 => H.Encaps.red <B:C12:231:END::> | @121: IPv6 Table: B:C11::/32 => H.Encaps.red <B:C12:231:END::> | |||
| @231: My SID table: B:C12:231:END:: => Remove IPv6 header; | @231: My SID table: B:C12:231:END:: => Remove IPv6 header; | |||
| Inner DA B:C11:2:DT4:: | Inner DA B:C11:2:DT4:: | |||
| @231: IPv6 Table B:C11:2::/48 => Forward via IS-ISv6 Flex-Algo | @231: IPv6 Table B:C11:2::/48 => Forward via IS-ISv6 Flex-Algo | |||
| path to E2 | path to E2 | |||
| skipping to change at line 3834 ¶ | skipping to change at line 3831 ¶ | |||
| route-based design (Section 7.1.2). The example is iBGP, but the | route-based design (Section 7.1.2). The example is iBGP, but the | |||
| design also applies to eBGP (multi-AS). | design also applies to eBGP (multi-AS). | |||
| * SR Policy (E2, C2) provides given intent in egress domain. | * SR Policy (E2, C2) provides given intent in egress domain. | |||
| - SR Policy (E2, C2) with segments <B:01:z:END::, B:01:2:END::>, | - SR Policy (E2, C2) with segments <B:01:z:END::, B:01:2:END::>, | |||
| where z is the node id in egress domain. | where z is the node id in egress domain. | |||
| * Egress ABRs 231 and 232 redistribute SR Policy into BGP CAR Type-1 | * Egress ABRs 231 and 232 redistribute SR Policy into BGP CAR Type-1 | |||
| NLRI (E2, C2) to other domains, with SRv6 SID of End.B6 behavior. | NLRI (E2, C2) to other domains, with SRv6 SID of End.B6 behavior. | |||
| This route is propagated to ingress PEs through Transport RR (TRR) | This route is propagated to ingress PEs through TRR or inline with | |||
| or inline with next-hop-unchanged. | next-hop-unchanged. | |||
| * The ABRs also advertise BGP CAR prefix route (B:C21::/32) | * The ABRs also advertise BGP CAR prefix route (B:C21::/32) | |||
| summarizing locator part of SRv6 SIDs for SR policies of given | summarizing locator part of SRv6 SIDs for SR policies of given | |||
| intent to different PEs in egress domain. BGP CAR prefix route | intent to different PEs in egress domain. BGP CAR prefix route | |||
| propagates through border routers. At each BGP hop, BGP CAR | propagates through BRs. At each BGP hop, BGP CAR prefix next-hop | |||
| prefix next-hop resolution triggers intra-domain transit SR Policy | resolution triggers intra-domain transit SR Policy (C2, CAR next | |||
| (C2, CAR next hop). For example: | hop). For example: | |||
| - SR Policy (231, C2) with segments <B:02:y:END::, | - SR Policy (231, C2) with segments <B:02:y:END::, | |||
| B:02:231:END::>, and | B:02:231:END::>, and | |||
| - SR Policy (121, C2) with segments <B:03:x:END::, | - SR Policy (121, C2) with segments <B:03:x:END::, | |||
| B:03:121:END::>, | B:03:121:END::>, | |||
| - where x and y are node ids within the respective domains. | - where x and y are node ids within the respective domains. | |||
| * Egress PE E2 advertises a VPN route RD:V/v with Color-EC C2. | * Egress PE E2 advertises a VPN route RD:V/v with Color-EC C2. | |||
| skipping to change at line 3985 ¶ | skipping to change at line 3982 ¶ | |||
| endpoints). | endpoints). | |||
| CASE A: BGP data exchanged for MPLS (non-SR): | CASE A: BGP data exchanged for MPLS (non-SR): | |||
| Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
| CAR SAFI signals label in non-key TLV part of NLRI | CAR SAFI signals label in non-key TLV part of NLRI | |||
| Each NLRI size for AFI 1 = 12(key) + 5(label) = 17 bytes | Each NLRI size for AFI 1 = 12(key) + 5(label) = 17 bytes | |||
| Ideal packing: | Ideal packing: | |||
| Number of NLRIs in 4k update size = 223 ((4k-200)/17) | Number of NLRIs in 4k update size = 223 ((4k-200)/17) | |||
| Number of update messages of 4k size = 1.5 million/223 = 6726 | Number of update messages of 4k size = 1.5 million/223 = 6726 | |||
| Total BGP data on wire = 6726 * 4k = ~27.5MB | Total BGP data on wire = 6726 * 4k = ~27.5 MB | |||
| Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
| Size of update message = (17 * 5) + 200 = 285 | Size of update message = (17 * 5) + 200 = 285 | |||
| Total BGP data on wire = 285 * 300k = ~86MB | Total BGP data on wire = 285 * 300k = ~86 MB | |||
| No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
| Size of update message = 17 + 200 = 217 | Size of update message = 17 + 200 = 217 | |||
| Total BGP data on wire = 217 * 1.5 million = ~325MB | Total BGP data on wire = 217 * 1.5 million = ~325 MB | |||
| SAFI 128 using encoding specified in RFC 8277 with label in NLRI | SAFI 128 using encoding specified in RFC 8277 with label in NLRI | |||
| Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | |||
| Ideal packing: | Ideal packing: | |||
| Number of NLRIs in 4k update size = 237 ((4k-200)/16) | Number of NLRIs in 4k update size = 237 ((4k-200)/16) | |||
| Number of update messages of 4k size = 1.5 million/237 = ~6330 | Number of update messages of 4k size = 1.5 million/237 = ~6330 | |||
| Total BGP data on wire = 6330 * 4k = ~25.9MB | Total BGP data on wire = 6330 * 4k = ~25.9 MB | |||
| Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
| Size of update message = (16 * 5) + 200 = 280 | Size of update message = (16 * 5) + 200 = 280 | |||
| Total BGP data on wire = 280 * 300k = ~84MB | Total BGP data on wire = 280 * 300k = ~84 MB | |||
| No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
| Size of update message = 16 + 200 = 216 | Size of update message = 16 + 200 = 216 | |||
| Total BGP data on wire = 216 * 1.5 million = ~324MB | Total BGP data on wire = 216 * 1.5 million = ~324 MB | |||
| CASE B: BGP data exchanged for SR-MPLS label index: | CASE B: BGP data exchanged for SR-MPLS label index: | |||
| Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
| CAR SAFI signals label index in non-key TLV part of NLRI | CAR SAFI signals label index in non-key TLV part of NLRI | |||
| Each NLRI size for AFI 1 | Each NLRI size for AFI 1 | |||
| = 12(key) + 5(label) + 9(Index) = 26 bytes | = 12(key) + 5(label) + 9(Index) = 26 bytes | |||
| Ideal packing: | Ideal packing: | |||
| Number of NLRIs in 4k update size = 146 (4k-200/26) | Number of NLRIs in 4k update size = 146 (4k-200/26) | |||
| Number of update messages of 4k size = 1.5 million/146 = 10274 | Number of update messages of 4k size = 1.5 million/146 = 10274 | |||
| Total BGP data on wire = 10274 * 4k = ~42MB | Total BGP data on wire = 10274 * 4k = ~42 MB | |||
| Practical packing (5 routes in update message) | Practical packing (5 routes in update message) | |||
| Size of update message = (26 * 5) + 200 = 330 | Size of update message = (26 * 5) + 200 = 330 | |||
| Total BGP data on wire = 330 * 300k = ~99MB | Total BGP data on wire = 330 * 300k = ~99 MB | |||
| No-packing case (1 route per update message) | No-packing case (1 route per update message) | |||
| Size of update message = 26 + 200 = 226 | Size of update message = 26 + 200 = 226 | |||
| Total BGP data on wire = 226 * 1.5 million = ~339MB | Total BGP data on wire = 226 * 1.5 million = ~339 MB | |||
| SAFI 128 using encoding specified in RFC 8277 with label in NLRI | SAFI 128 using encoding specified in RFC 8277 with label in NLRI | |||
| Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | |||
| Ideal packing: | Ideal packing: | |||
| Not supported as label index is encoded in Prefix-SID | Not supported as label index is encoded in Prefix-SID | |||
| attribute | attribute | |||
| Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
| Not supported as label index is encoded in Prefix-SID | Not supported as label index is encoded in Prefix-SID | |||
| attribute | attribute | |||
| No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
| Size of update message = 16 + 210 = 226 | Size of update message = 16 + 210 = 226 | |||
| Total BGP data on wire = 226 * 1.5 million = ~339MB | Total BGP data on wire = 226 * 1.5 million = ~339 MB | |||
| CASE C: BGP data exchanged with 128 bit single SRv6 SID: | CASE C: BGP data exchanged with 128 bit single SRv6 SID: | |||
| Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
| CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | |||
| Each NLRI size for AFI 1 = 12(key) + 18(SRv6 SID) = 30 bytes | Each NLRI size for AFI 1 = 12(key) + 18(SRv6 SID) = 30 bytes | |||
| Ideal packing: | Ideal packing: | |||
| Number of NLRIs in 4k update size = 126 (4k-200/30) | Number of NLRIs in 4k update size = 126 (4k-200/30) | |||
| Number of update messages of 4k size = 1.5 million/126 = ~12k | Number of update messages of 4k size = 1.5 million/126 = ~12k | |||
| Total BGP data on wire = 12k * 4k = ~49MB | Total BGP data on wire = 12k * 4k = ~49 MB | |||
| Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
| Size of update message | Size of update message | |||
| = (30 * 5) + 236 (including Prefix-SID) = 386 | = (30 * 5) + 236 (including Prefix-SID) = 386 | |||
| Total BGP data on wire = 386 * 300k = ~115MB | Total BGP data on wire = 386 * 300k = ~115 MB | |||
| No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
| Size of update message = 12 + 236 (SID in Prefix-SID) = 252 | Size of update message = 12 + 236 (SID in Prefix-SID) = 252 | |||
| Total BGP data on wire = 252 * 1.5 million = ~378MB | Total BGP data on wire = 252 * 1.5 million = ~378 MB | |||
| SAFI 128 using encoding specified in RFC 8277 with label in NLRI | SAFI 128 using encoding specified in RFC 8277 with label in NLRI | |||
| (No transposition) | (No transposition) | |||
| Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | |||
| Ideal packing: | Ideal packing: | |||
| Not supported as SRv6 SID is encoded in Prefix-SID | Not supported as SRv6 SID is encoded in Prefix-SID | |||
| attribute | attribute | |||
| Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
| Not supported as SRv6 SID is encoded in Prefix-SID | Not supported as SRv6 SID is encoded in Prefix-SID | |||
| attribute | attribute | |||
| No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
| Size of update message = 16 + 236 = 252 | Size of update message = 16 + 236 = 252 | |||
| Total BGP data on wire = 252 * 1.5 million = ~378MB | Total BGP data on wire = 252 * 1.5 million = ~378 MB | |||
| BGP data exchanged with transposition of 4 bytes from SRv6 SID into | BGP data exchanged with transposition of 4 bytes from SRv6 SID into | |||
| SRv6 SID TLV: | SRv6 SID TLV: | |||
| Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
| CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | |||
| Each NLRI size for AFI 1 = 12(key) + 6(SRv6 SID) = 18 bytes | Each NLRI size for AFI 1 = 12(key) + 6(SRv6 SID) = 18 bytes | |||
| Ideal packing: | Ideal packing: | |||
| Number of NLRIs in 4k update size = 211 (4k-200/18) | Number of NLRIs in 4k update size = 211 (4k-200/18) | |||
| Number of update messages of 4k size = 1.5 million/211 = ~7110 | Number of update messages of 4k size = 1.5 million/211 = ~7110 | |||
| Total BGP data on wire = 7110 * 4k = ~29MB | Total BGP data on wire = 7110 * 4k = ~29 MB | |||
| Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
| Size of update message | Size of update message | |||
| = (18 * 5) + 236 (including Prefix-SID) = 326 | = (18 * 5) + 236 (including Prefix-SID) = 326 | |||
| Total BGP data on wire = 326 * 300k = ~98MB | Total BGP data on wire = 326 * 300k = ~98 MB | |||
| No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
| Size of update message | Size of update message | |||
| = 12 + 236 (SID in Prefix-SID attribute) = 252 | = 12 + 236 (SID in Prefix-SID attribute) = 252 | |||
| Total BGP data on wire = 252 * 1.5 million = ~378MB | Total BGP data on wire = 252 * 1.5 million = ~378 MB | |||
| Acknowledgements | Acknowledgements | |||
| The authors would like to acknowledge the invaluable contributions of | The authors would like to acknowledge the invaluable contributions of | |||
| many collaborators towards the BGP CAR solution and this document in | many collaborators towards the BGP CAR solution and this document in | |||
| providing input about use cases, participating in brainstorming and | providing input about use cases, participating in brainstorming and | |||
| mailing list discussions and in reviews of the solution and draft | mailing list discussions and in reviews of the solution and draft | |||
| revisions. In addition to the contributors listed in the | revisions. In addition to the contributors listed in the | |||
| Contributors section, the authors would like to thank Robert Raszuk, | Contributors section, the authors would like to thank Robert Raszuk, | |||
| Bin Wen, Chaitanya Yadlapalli, Satoru Matsushima, Moses Nagarajah, | Bin Wen, Chaitanya Yadlapalli, Satoru Matsushima, Moses Nagarajah, | |||
| End of changes. 57 change blocks. | ||||
| 92 lines changed or deleted | 89 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. | ||||
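
The packing arithmetic in CASE A through CASE C follows one pattern, which can be reproduced with a minimal Python sketch. The function names and parameter names below are hypothetical and are not taken from the RFC. The sketch assumes that "4k" means 4000 bytes when counting NLRIs per update, since that reading reproduces the counts 237, 146, 126, and 211 shown in the diff, and it takes the 1.5 million routes and 200 bytes of shared attributes from the text.

```python
import math

TOTAL_ROUTES = 1_500_000   # route count used throughout the comparison
SHARED_ATTRS = 200         # bytes of shared attributes per update message


def nlris_per_full_update(nlri_size, update_size=4000, shared=SHARED_ATTRS):
    """NLRIs that fit in one maximally packed update (ideal packing).

    Assumption: "4k" is read as 4000 bytes here.
    """
    return (update_size - shared) // nlri_size


def update_count(routes_per_update, total_routes=TOTAL_ROUTES):
    """Number of update messages needed to carry all routes."""
    return math.ceil(total_routes / routes_per_update)


def total_bytes(nlri_size, routes_per_update, shared=SHARED_ATTRS,
                total_routes=TOTAL_ROUTES):
    """Total bytes on the wire when each update carries a fixed number of routes."""
    per_update = nlri_size * routes_per_update + shared
    return update_count(routes_per_update, total_routes) * per_update


# CASE B, CAR SAFI: NLRI = 12 (key) + 5 (label) + 9 (index) = 26 bytes
car_nlri = 12 + 5 + 9
ideal_nlris = nlris_per_full_update(car_nlri)    # NLRIs per full update
ideal_updates = update_count(ideal_nlris)        # updates under ideal packing
practical = total_bytes(car_nlri, 5)             # 5 routes per update
no_packing = total_bytes(car_nlri, 1)            # 1 route per update

# CASE B, SAFI 128: 16-byte NLRI, 210 bytes of attributes per update
safi128_no_packing = total_bytes(16, 1, shared=210)
```

Under these assumptions, the practical-packing and no-packing totals for the two encodings can be compared directly by changing only `nlri_size` and `shared`.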