{ "blogs-2017-07-31-the-future-of-highly-available-networks": { "title": "The Future of Highly Available Networks", "content": " On This Page Introduction Defining Availability Approaches to Availability Network Level Device Level The Complexity Problem New Heuristics Solve for Network Availability First Minimize Impact with Scale-Out Architectures Manage Scale With Automation Understand Your Failures Automate Upgrades Simplify Upgrades Design Simple Failures Drain Instead of Switchover Return to A Known State Protect Single Points of Failure Conclusion References And Further Reading IntroductionNobody will dispute the importance of availability in today’s service provider networks. What is less obvious is how you achieve it. A network is a complex, dynamic system that must continually adapt to changing conditions. Some changes are normal, necessary and planned (e.g. software and hardware upgrades, configuration changes), while others are unplanned and unpredictable (e.g. software or hardware faults, human error). This whitepaper discusses different approaches to availability in both cases and lays out current best practices which can be distilled into a few simple themes# Solve for the Network First Reduce Complexity Automate Operations Defining AvailabilityBefore diving into the mechanics of availability, it’s worth considering what a highly available network means to you. Tolerance for failure is driven by your SLAs. Not every service requires the same kind of availability as high frequency trading or systems supporting hospitals. So take some time to understand the availability requirements for your services across your network. Many people assume that higher availability is always better. This may result in over-engineering the network and introducing unneeded complexity and cost.Approaches to AvailabilityWhen planning for availability, network architects often consider two levels of availability strategies# device level and network level. Getting the right balance between these levels is key to service availability in your network.Network LevelThe idea of building a reliable network from unreliable components is as old as the Internet itself. Originally designed for military survivability, the Internet assumes that nodes and links will fail. Under such conditions, you can still deliver network availability through resilient protocols and well-designed architectures.On the protocol side, availability is improved by reducing both the Mean Time To Detection (MTTD) and the Mean Time To Repair (MTTR) of protocol failures. For reducing MTTD, Bidirectional Forwarding Detection (BFD) and/or Ethernet OAM (802.3ah) are your best friends. BFD operates at Layer 3 and EOAM at Layer 2, but both provide fast-failure detection that allows network protocols to begin convergence.Reducing MTTR involves multiple approaches, starting with optimizing protocol convergence after the failure has been detected. Incremental SPF (iSPF) has long been used in IGPs to reduce the time it takes to recompute the best path. Convergence can also be improved by installing a precomputed backup path in the routing table using BGP Protocol Independent Convergence (PIC) and Loop Free Alternative Fast ReRoute (LFA FRR).MPLS Traffic Engineering (TE) can also help reduce MTTR by giving you the ability to use links and nodes that are not necessarily in the shortest path. By increasing the pool of available resources, TE helps ensure that the loss of one link or node results in the loss of only a small amount of total capacity. 
When links or nodes fail, MPLS TE Fast Reroute (MPLS TE FRR) can locally repair LSPs while the headend re-establishes the end-to-end LSP with an impressive 50 millisecond failover time. Still, it’s also worth remembering that FRR might be overkill for some services# not all SLAs actually require 50 millisecond failover.For these convergence optimizations and fast reroute technologies to work, the underlying architecture must support them. Multiple paths are essential to delivering fault tolerance. Well-designed architectures use redundant uplinks and multi-homing to avoid single points of failure. In such architectures, fast failure detection and fast convergence optimizations together provide a good balance of minimizing packet loss while re-converging the control plane at a pace that doesn’t destabilize the network.One of the main drawbacks with network-level availability mechanisms is that you either have to overprovision your network or accept that availability may be degraded during a failure. After all, if a link or node fails and your backup paths don’t have enough capacity, you will drop traffic. Again, it’s worth considering your SLAs. Some temporary degradation may be acceptable and you could use QoS to ensure that lower priority traffic is dropped first, allowing you to provision just enough redundant capacity for high priority traffic in failure conditions. In any event, be sure to weigh the higher capex costs of an overbuilt network against the lower operating costs of simpler hardware, software and network designs. Many operators have found that the opex savings ultimately far outweighs the capex cost of network-level availability.Device LevelNetwork-level availability mechanisms are great for coping with unreliable components but, over the years, we’ve also put a lot of work into making those individual networking devices more reliable, too. In the industry parlance, device-level availability often focuses on increasing the Mean Time Between Failure (MTBF) of the router and its components. Following the paradigm of traditional hardware fault tolerance, device-level HA duplicates major components in the system (RPs, power supplies, fan trays) for 1+1 redundancy. The duplicate components may load share in an Active-Active configuration (e.g. redundant power shelves in the CRS) or run in an Active-Standby configuration (e.g. Active-Standby RPs).As reassuring as backup power supplies and fans may be, straight-forward hardware redundancy doesn’t cut it for complex components with significant software elements. Take the route processor (RP), a complex bundle of hardware and software that is responsible for running the control plane and programming the data plane. In the event of a failover, a standby RP cannot assume the duties of the active RP without either 1) re-building the state (re-establishing neighbor relationships, re-building routing tables, etc) or 2) having an exact, real-time copy of the active state. The first takes time and the second has proven difficult to achieve in practice.One way to buy time is to separate the control plane and data plane by allowing the data plane to continue to forward traffic using the existing FIB even when the control plane on the RP is unavailable. At Cisco, we call this Non-Stop Forwarding (NSF). Modifications to higher-level protocols (BGP, ISIS, OSPF, LDP) allow a router to alert neighbors that a restart is in progress (“Graceful Restart”). 
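For reference, the router-side knobs for this are small. A minimal IOS-XR sketch, with the IGP instance name and AS number chosen purely for illustration, that enables NSF for IS-IS and the graceful restart capability for BGP might look like:

router isis CORE
 ! Signal IETF-style restart capability to IS-IS neighbors
 nsf ietf
!
router bgp 64512
 ! Advertise the graceful-restart capability so peers retain forwarding
 ! state while the control plane restarts or fails over
 bgp graceful-restart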
The NSF-aware neighbors will continue to maintain neighbor relationships and forwarding entries while the RP reboots and/or the standby RP transitions to active. Stale FIB entries may cause sub-optimal routing or even black holes, but the effect is temporary and usually tolerable. NSF may also impose additional CPU and memory requirements which increase the complexity and cost of the device. Nevertheless, over years of industry hardening, NSF has matured into an effective technique for minimizing network downtime.Instead of buying time with NSF and Graceful Restart, IOS XR also supports Non-Stop-Routing (NSR). With NSR, all the protocol state required to maintain peering state is precisely synchronized across the active and standby RPs. When the active fails over, the standby can immediately take over the peering sessions. Because the failure is handled internally, it is hidden from the outside world. In practice, NSR is a very complex, resource-intensive operation that doesn’t always result in the perfect state synchronization that is required. And precisely because NSR “hides” the failure of the RP from neighbors, troubleshooting can be very difficult if something goes wrong.Building on NSF and NSR, Cisco tackled the specific problem of planned outages by developing an upgrade process called In Service Software Upgrades (ISSU). ISSU is a complex process that coordinates the standard RP failover with various other mechanisms to ensure that the line card stops forwarding for only as much time as it takes to re-program the hardware. Under ideal conditions, the outage is less than 10 seconds. However, in real networks, conditions are almost never ideal. Like NSR, ISSU has proven difficult to achieve in practice for the entire industry. Even when an in-service upgrade is possible, the operational overhead of understanding and troubleshooting ISSU’s many stages and caveats often outweighs the value of keeping the node in service during the upgrade.The Complexity ProblemIn Normal Accidents, Charles Perrow introduced the now widely-accept idea that systems with interactive complexity and tight coupling are at higher risk for accidents and outages. It simply isn’t possible for engineers to imagine, anticipate, plan for and prevent every possible interaction in the system. Moreover, system designers have learned the hard way that, in practice, using redundancy to compensate for local failures often has the effect of increasing complexity which, in turn, causes the outage you were trying to avoid! Given that a router with two-way active-standby redundancy is a complex, tightly coupled system, it is perhaps inevitable that successfully and reliably executing NSR and ISSU at service provider speeds and scale has proven challenging industry-wide.New HeuristicsAs daunting as availability can be, the solutions are relatively straightforward. You don’t need a lot of new features and functionality, but you may need to rethink your operations and architecture.Solve for Network Availability FirstGiven the unavoidable cost and complexity of device-level availability, it makes sense to focus your availability strategy on network-level availability mechanisms. Fast detection, convergence and re-route in a redundant, multi-path topology will get you the most bang for the buck.Minimize Impact with Scale-Out ArchitecturesIf you’re looking at your next-generation network architecture, it’s worth considering emerging design patterns that can significantly improve availability. 
Traditional, hierarchical network designs typically include an aggregation layer, where a small number of large devices with a large number of ports aggregate traffic from southbound layers in preparation for transit northbound. These aggregation devices are commonly deployed in a 1+1 redundancy topology.From an availability perspective, the weak point is those aggregation boxes. If one of those boxes goes down, you’ll lose half your network capacity. That’s a large blast radius. Network-level availability mechanisms like fast-reroute and QoS may mitigate the impact to high priority services, but unless you have vast amounts of excess capacity, your network will run in a degraded state. Hence, those devices are prime candidates for dual RPs with NSF and NSR. But we’ve already seen that those strategies can introduce complexity and, consequently, reduce availability.Faced with new traffic patterns and scale requirements, pioneers in massively-scaled data center design developed a new design pattern that continues to find new applications outside the data center [1]. The spine-and-leaf topology replaces the large boxes in the aggregation layer with many smaller leafs and spines that can be scaled horizontally. Because the traffic is spread across more, smaller boxes in a spine-leaf topology, the loss of any single device has a much smaller blast radius. Cisco’s NCS 5500 product line is well aligned with this type of Clos-based fabric design as it moves to the core and beyond.Manage Scale With AutomationThe sheer numbers in fabric-based network architectures can be intimidating. In the past, we’ve often assumed that complexity (and therefore, failure) increases with the number of devices. But as the design of large-scale data centers has shown, you can easily manage vast numbers of devices if you have sufficiently hardened automation. On the IOS-XR side, we are committed to Model-Driven Programmability to enable complete automation, making the network fully programmable through tools like Ansible and Cisco’s Network Service Orchestrator (NSO).Understand Your FailuresKnowing what’s failed in the past is essential to avoiding that failure in the future. The world’s largest web service providers routinely perform forensic analyses of past failures in order to improve design and operations [2]. It is also possible to automate the remediation of well-understood failures [3].Automate UpgradesMore than one forensic analysis has shown that 60 – 90% of failures in the network are caused by having a human being in the loop: fat-fingering the configuration, killing the wrong process, applying the wrong software patch. Maintenance operations are responsible for twice the number of failures as bugs. Upgrades in particular are a magnet for these kinds of failure, as they are often complex, manual and multi-stage.When manual intervention is the cause of the problem, automation provides a way forward. By automating and vigorously validating upgrade procedures, you can significantly improve device availability while reducing operational overhead. This is the motivation for tools like the Cisco Software Manager, which has been proven to reduce errors and improve availability.Simplify UpgradesWe can’t just stop at automating the upgrade process. Automating complex processes removes the human element, but as long as the complexity remains, so does the risk of “normal accidents.” After all, in the worst case, automation just provides you a way to do stupid things faster!
To get the most out of automation, the upgrade process itself needs to be simplified. The current state of the art for software installs and upgrades is the standard Linux model of package management. Starting with IOS XR 6.0, all IOS XR packages use the RPM Package Manager (RPM) format, the first step in our upgrade simplification journey.Design Simple FailuresThe truth is that device failure is normal at scale. Even if an individual device has the vaunted 5 9s availability, if you have 10,000 of those devices, you will have downtime every day of the year. Instead of trying to avoid failures at all costs, embrace them! Just make sure you embrace the right kind of failure. Experience has taught us that simple, obvious failures with predictable consequences are easier to manage than complex, subtle failures with poorly understood consequences even when the simple failure has a larger “blast radius.”Take the case of a stateful switchover from the active to the standby RP. There aren’t many network operators who haven’t been scarred by a planned or unplanned switchover that did not go as expected. For whatever reason, the standby RP ends up in a state that does not match the active RP’s state when it went down. Because device-level availability techniques like NSR try to “hide” the failure from the outside world, it can be very difficult to diagnose and troubleshoot the underlying issue. In many cases, the partial failure of the switchover is worse than just having the router go away and come back. A rebooting router is a well-understood event that can be handled by the control plane protocols. On the other hand, a misbehaving RP in an unknown state is not well understood and may lead to much worse behavior than a temporary degradation.The RP example illustrates an emerging design principle: instead of trying to handle a large set of potential partial failures (e.g. different kinds of failed switchovers), group them into a common failure (router reset) that can be handled in a predictable way by the network. The end result is a network that is simpler, more predictable and more reliable.Following this principle, many service providers have settled on simpler switchover techniques like NSF over the more complex stateful switchover of NSR. Going a step further, some providers have deployed single-RP systems in multi-homed roles, forgoing the upside of switchover altogether in favor of a single, well-understood failure. Of course, single-RP systems cost less, but this is absolutely not about capex. Some customers will actually buy the redundant RP and leave it, powered off, in the chassis so they don’t have to roll a truck in the event of a truly catastrophic RP failure. The cost of the extra RP is trivial compared to the operational cost of detecting and fixing bad failovers.Drain Instead of SwitchoverIf you have a single RP system, you won’t be doing a switchover for maintenance operations. Operators who have pioneered this kind of deployment have developed a “drain-and-maintain” strategy. In this workflow, devices targeted for traffic-impacting maintenance will be drained of traffic in a controlled manner by shutting down links, lowering the preference of the link, or assigning an infinite cost to the link so routing neighbors will not select it. Once the traffic has been redirected away from the router, the maintenance operation (such as software upgrade) can proceed. When the maintenance is complete, links can be brought back into service.
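One common way to express the drain step in IOS-XR is to set the IS-IS overload bit so that neighbors route around the node. A minimal sketch with an illustrative instance name follows; costing out individual links or applying BGP policy are equally valid variants.

router isis CORE
 address-family ipv4 unicast
  ! Advertise the overload bit so neighbors stop using this node for transit
  set-overload-bit
!
! When maintenance is complete, remove it to undrain the node:
! router isis CORE
!  address-family ipv4 unicast
!   no set-overload-bit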
This works well for redundant, transport-only nodes like LSRs and P routers.For drain-and-maintain to be successful, you have to first validate that there is sufficient excess capacity to carry the drained traffic while the device is offline. Because of all the moving parts, automation is a key element of this strategy. You will want an orchestration system to validate the excess capacity, choreograph the drain, maintenance, and undrain, and validate the return to the desired steady-state.Return to A Known StateIf a router reboot is to be a “normal failure” in the network, the system needs to ensure that the router returns to a known state as quickly as possible. For this to work, the router’s “source of truth” (i.e. its configuration) needs to be stored off the box. If the source of truth is on the router, the truth will be irretrievably lost if the router is wiped out. With an off-box source of truth, you can iPXE-boot your router back to the known state with confidence.Protect Single Points of FailureRedundant, multi-homed topologies are the hallmarks of good network design, but it is not always possible to design out all the single points of failure. Some common examples include: Edge devices such as an LER or PE router may have single-attached customers. End-users are typically single-homed to an edge device that may contain significant amounts of non-duplicated user state (e.g. BNG or CMTS). Long-haul transport can be prohibitively expensive, increasing the likelihood of an architecture that leverages single devices with a limited number of links.The first question to ask yourself is whether there is any way to design redundancy into the network using new technologies. For example, IOS XR supports Network Function Virtualization (NFV), allowing it to be deployed as a virtual PE (vPE). Deployed in a redundant pair, vPEs can provide edge customers with better network availability over the same physical link.If you can’t eliminate it altogether, a single point of failure is also a good candidate for the spine-and-leaf fabric described above. If a 2 RU node in the fabric fails, the blast radius is much smaller than if a 20-slot chassis fails. Barring that, in-chassis hardware redundancy and switchover techniques like NSF may be considered.ConclusionDecades of experience have proven that network-level availability mechanisms provide simpler, more efficient, lower cost alternatives when the architecture supports them. Beyond that, the frontier for availability is all about operations. By automating workflows, particularly upgrades, you can eliminate the most common causes of failure in the first place. Investing in operations may seem like an odd strategy for availability but the truth is that simple, automated networks are the most available networks of all.References And Further Reading[1] Introducing Data Center Fabric[2] Evolve or Die - High-Availability Design Principles Drawn from Google’s Network Infrastructure[3] Making Facebook Self-HealingNormal Accidents: Living with High Risk Technologies (Updated).
Princeton University Press, 1999, Charles Perrow.Site Reliability Engineering, O’Reilly Media, April 2016, Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy.", "url": "/blogs/2017-07-31-the-future-of-highly-available-networks/", "author": "Shelly Cadora", "tags": "iosxr, HA, ISSU" } , "blogs-2017-08-01-internet-edge-peering-current-practice": { "title": "Internet Edge Peering - Current Practice", "content": " On This Page Introduction Peering Background Peering History IXP Fabric vs. Private Network Interconnect Localized Peering B2B IP Peering Peering Hardware Peering Network Design Traditional SP IXP Network Design Distributed SP Peering Fabric Regional Core Bypass / Express Peering Peering Network Resiliency Management, Control-Plane, and Security Peering Telemetry Flow Information Peer Statistics and Operational Data Model-Driven Telemetry BGP Monitoring Protocol TE Traffic Matrix Information BGP Policy Accounting Peering Engineering Ingress and Egress Peer Engineering Peer Selection via Analytics Segment Routing Egress Peer Engineering Flexible Routing Policies via RPL Peering Security BCP Implementation BGP Attribute and CoS Scrubbing Per-Peer Control Plane Policers BGP Prefix Security RPKI Origin Validation BGPSEC BGP Flowspec Summary IntroductionThe Internet was created to provide transparent data services across interconnected packet switched networks. The interconnection and exchange of Internet routing data between two networks is known as Peering. Peering is the glue holding together the Internet, without it the flow of data across the Internet would not be possible. Peering represents an important administrative, operational, and security boundary between networks. Peering is a subject of great interest in many areas, from analytics to politics. This paper will focus on the technical aspects of peering covering peering history and current peering architecture in the areas of network, security, and telemetry.Peering BackgroundPeering HistoryThe Internet became ubiquitous across the globe in the early 1990s largely due to the construction of Internet connection points known as IXPs or Internet Exchange Points. The IXP provided a place for networks to connect and exchange traffic. The initial IXPs constructed were slightly different than the exchange points of today, but by the middle of the 1990s they began to look markedly similar to today’s IXPs. The initial four major North American IXPs, known as NAPs or Network Access Points were operated by traditional telephone companies. As the Internet grew the steep rise in traffic necessitated the creation of more IXPs and private enterprises created a number of carrier-neutral IXPs across the globe. These facilities also provide physical cross-connect services to allow networks to peer directly with each other, bypassing the IXP fabric. Hundreds of IXPs exist today across the world, along with thousands of private interconnections between organizations.IXP Fabric vs. Private Network InterconnectThe easiest way to interconnect the large number of networks joining the Internet was for the IXP itself to create a peering “fabric” facilitating connectivity from one network to many other networks via a single physical connection to each member network. 
Multiplexing technology such as ATM VCs were used initially, replaced with Ethernet VLANs or flat Ethernet broadcast domains as Ethernet became the dominant medium.In the earlier days of the Internet, producers and consumers were more widely distributed as regional residential and business providers provided services to both along with major telephony carriers. As broadband connectivity has become ubiquitous across the world, service and content providers have evolved to meet user’s needs. In North America for example, Internet connectivity to residential consumers has been consolidated to a small number of providers. There are also relatively few mobile providers, an area seeing tremendous traffic growth. In addition, the content being consumed is produced by a much smaller number of content providers, most of whom have created their own peering presence. This has led the “eyeball” consumer networks to peer directly with content provider networks over PNI (Private Network Interconnection) instead of utilizing IXP fabrics. The reduction in pricing of Internet transit has also lead to more use of transit for low bandwidth peers, also negating the need for IXP fabric connections.Outside North America broadband consolidation has not occurred at the same rate and the decoupling of physical infrastructure from network service has led to many more network providers who interconnect with each other and content providers. IXP fabrics are still widely used across the world today in regional IXPs and large IXPs outside North America. In North America peering fabrics have largely been replaced by the use of Private Network Interconnection.Localized PeeringWhile the bulk of Internet peering still occurs in a relatively small number of IXPs in the US, there has been an effort in the last several years to interconnect closer to consumers. The main driver is the cost to carry traffic across long-haul networks from the traditional peering locations to the consumers they serve in larger markets across a large geographic distance. Building and maintaining equipment in a more distributed peering fabric has become more feasible vs increasing capacity on long-haul fiber networks. In addition, it improves the quality of experience for end users since content is served closer to the consumer.B2B IP PeeringAn often-overlooked form of peering is B2B peering between different networks. There are many examples of B2B peering, the most visible today are those connecting datacenter colocation providers to cloud service providers. Additionally, B2B peering is used for linear video content providers to send video to end providers, carry voice services over IP instead of PSTN, and interconnect various service owners to consumers.Peering HardwareThe main requirements when looking at peering network hardware are Physical, power, and cooling footprint 10G/100G interface density Software support for necessary peering features The Cisco Visual Networking Index has shown a 1270% rise in Internet traffic over the last 12 years and projects a threefold increase over the next five years. Peering routers require flexible high-density hardware supporting the required software feature sets needed for Internet peering. The Cisco NCS 5500 platform powered by IOS-XR satisfies today’s peering needs along with capacity for future traffic growth. Most peering locations are third party facilities where space and power can add considerable cost. 
Across the entire NCS 5500 series density is greater than 24x100GE per RU, with the NCS 5504, 5508, and 5516 chassis systems providing 36x100GE per RU, all at power consumption of approximately .3/Gbps. In addition to high density 100GE, every NCS 5500 QSFP+ port supports 4x10GE breakout options to connect to 10GE peers as well as downstream caching devices in the case of content provider edge peering.More information on the Network Convergence System 5500 series can be found at http#//www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.htmlPeering Network DesignTraditional SP IXP Network DesignTraditional SP peering is distributed to one or more geographic locations driven by the convenience of having peers in the same IXP location. There is a need to maintain those locations today as not everyone has the capability of interconnecting everywhere. Peering center network design has typically been done with two or more larger edge peering routers with either direct connections to an optical backbone or connections to another stage of backbone routers as seen in left hand side of Figure 1. In recent years, there has been a trend to more modular network design within the peering center using smaller fixed or chassis systems, allowing the use of best of breed hardware and flexibility to add capacity more granularly. The topology mimics the folded CLOS networks, scaling horizontally across peer connections and minimizing impact during failures. Also, datacenter space in traditional IXP facilities has become more limited in recent years, leading the carrier neutral facility providers to expand to multiple facilities within a metropolitan area. Placing smaller high-density systems to multiplex peer connections onto higher speed links is more cost effective than paying the MRC of inter-facility cross-connects per peer connection.Figure 1 - IXP Network Design EvolutionDistributed SP Peering FabricWhile traditional IXP facilities are still important, reducing the distance and network hops between where packets enter your network and exit to the consumer is a high priority for service providers. Each pass through an optical transponder or router interface adds additional cost to the transit path. The rise in video traffic over the next several years demands Internet peering at the edges of the network to serve wireline broadband subscribers along with high-bandwidth 5G mobile users. Content providers have invested heavily in their own networks as well as distributed caches serving content from any network location with Internet access. Third party colocation providers have begun building more regional locations supporting PNI between content distributors and the end subscribers on the SP network. This leads to a localized peering option for SPs and content providers, greatly reducing the distance and hops across the network. The right-hand diagram of Figure 1 is an example of how more distributed peering is changing the landscape of peering.An initial step may be to create a single localized peering facility within a region but as traffic demands increase or more resiliency is needed it can drive the addition of multiple facilities within a region.Reduced footprint high capacity routers are ideal for serving the needs of a distributed peering network. 
The NCS 5501, 5502, and 5504 are ideal, providing high density peer termination in a compact, efficient package while not sacrificing features or protocol scale.Figure 2 - Localized PeeringRegional Core Bypass / Express PeeringRegional SP networks serving residential subscribers are typically deployed in an aggregation/access hierarchy using logical Ethernet connections over a metro optical transport network. The aggregation nodes serve as an aggregation point for connections to regional sites along with acting as the ingress point for traffic coming from the SP backbone. If a distributed peering fabric is implemented, SPs can drive even greater efficiency by selecting specific high bandwidth regional sites for core bypass. This is simply connecting the regional hub routers directly to a localized peering facility, bypassing the regional core aggregation nodes which are simply acting as a pass through for the traffic. Due to the growth in Internet video traffic, this secondary express peering network in time will likely be higher capacity than the original SP converged network. The same express peering network can also be used to serve content originated by the SP, leaving the converged regional network to serve other higher priority traffic needs.Figure 3 - Express Peering FabricPeering Network ResiliencyPeering resiliency refers to the ability for the network to cope with the loss of a peer, peering router, or peering facility. Apart from B2B peering instances, almost all traffic coming over a peering link is considered best-effort low priority traffic. Even though most traffic is BE, it is recommended for SPs to dual-home to high bandwidth sources of inbound traffic within the same facility or region. Since most larger SPs peer with the same providers in diverse geographic locations, it is more cost-effective to maintain multiple connections in a facility or region than to have traffic fail-over to a path originating in another geographic region. The traffic may result in congestion on backbone links or building excess capacity on expensive long-haul optical networks. The same holds true for localized peering facilities, dual-homing a peer to multiple routers is more cost effective than relying on backup paths across longer distances.Management, Control-Plane, and SecurityRobust and feature-rich software is required in all three of these areas to build a successful peering network. IOS-XR has been a critical part of many peering networks across the world, and continued innovation in the areas of telemetry, programmability, and security is enhancing service provider peering edge networks.Peering TelemetryPeering exists to drive greater efficiency in the network by reducing network transit path length. The insight to determine when to peer with another network and if the peering is resulting in savings is paramount. The streaming data capabilities of IOS-XR using Netflow/IPFIX, Model-Driven Telemetry, and BGP Monitoring Protocol give unprecedented visibility into the peering edge of the network.Flow InformationNetflow was invented by Cisco due to requirements for traffic visibility and accounting. Netflow in its simplest form exports 5-tuple data for each flow traversing a Netflow-enabled interface. Netflow data was further enhanced with the inclusion of BGP information in the exported Netflow data, namely AS_PATH and destination prefix. 
It was now possible to see where traffic originated by ASN and derive the destination for the traffic per BGP prefix, aiding in peering analysis and planning. The latest iteration of Cisco Netflow is Netflow v9, with the next-generation IETF standardized version called IPFIX (IP Flow Information Export). Netflow continues to be an important source of information for discovering traffic patterns, detecting traffic anomalies, and for detecting security issues such as DDoS attacks. IPFIX has expanded on Netflow’s capabilities by introducing hundreds of entities. Netflow is a mandatory component for SPs at the peering edge and Cisco continues to lead in support and development of Netflow and IPFIX on all platforms.Peer Statistics and Operational DataThe most basic information needed on a per-peering connection basis is interface traffic statistics. Collected via SNMP or newer methods like Model-Driven Streaming Telemetry, having insight into both real-time traffic statistics and historical trends is a necessary component for operating and planning peering networks. In addition to statistics data, it’s important to monitor the state of the peer, such as current operational state and max prefix limits.Model-Driven TelemetryModel-driven streaming telemetry is a foundational element of a modern peering network architecture. Apart from insights gained from the higher frequency of data like interface statistics, MDT also supports operational data such as BGP session state and BGP prefix counts. In addition to periodic streaming data, event driven streaming telemetry can also enhance peering monitoring and automation capabilities. In the case of an RSVP-TE or SR-TE enabled peering router, traffic matrix data can also be streamed and, when combined with Netflow data, provides powerful insight into the traffic entering or leaving your network. Cisco IOS-XR, coupled with the capabilities of the NCS and ASR hardware platforms, delivers a wide range of streaming telemetry data. http://www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/telemetry/b-telemetry-cg-ncs5500-62x.html is a resource for configuring both policy and model driven telemetry on IOS-XR based platforms.BGP Monitoring ProtocolBGP Monitoring Protocol, standardized in RFC 7854, streams data regarding received BGP prefixes from peers both pre and post routing policy. Since all prefix changes from peers are streamed, you gain instant visibility into why outbound traffic shifts are occurring on the network. SNAS, the Streaming Network Analytics System, implements a BMP collector and a number of BGP monitoring applications focused on deriving greater meaning from BGP updates. Protocol analytics and security are two main focus areas. The SNAS project can be found at http://snas.io. Cisco was the first vendor to support BMP in IOS-XR, and it continues to be enhanced to support additional NLRI.TE Traffic Matrix InformationTE-enabled networks with full or partial mesh configurations can easily determine traffic between peering facilities and origin/destination facilities by looking at TE tunnel statistics. Netflow data can show similar statistics, but requires collection and aggregation, which is not a realtime operation. IOS-XR supports near realtime monitoring of TE tunnel statistics via streaming telemetry, enhancing the capability to catch anomalies and dynamically react to traffic changes. In addition to traditional RSVP-TE tunnel statistics, IOS-XR has been enhanced to gather persistent traffic matrix statistics for Segment Routing enabled networks.
The Demand Matrix feature measures traffic from external interfaces destined for Prefix SIDs, making it ideal for peering edge applications.BGP Policy AccountingUnique to IOS-XR is the BGP Policy Accounting feature. BGP PA allows providers to use criteria defined in RPL, such as matching a single origin ASN or a set of origin ASNs via regex, to create traffic counters when the route is installed in the FIB. This is done through the application of RIB to FIB table policies via the table-policy command. BGP PA can add to the data acquired via other means or be used to quickly isolate operational issues.Peering EngineeringIngress and Egress Peer EngineeringIngress peer engineering is the practice of influencing where traffic enters your network. Since you are peering with a network you do not own, the options for ingress peer engineering are limited. The use of selective advertisement, deaggregation of shorter prefixes into longer ones, and AS_PATH prepending are the standard options available that guarantee IPE. Other mechanisms like MED may be available if the peer has agreed to honor those attributes.Egress peer engineering is much easier since you are controlling the path on your own network. There are standard mechanisms to influence best path selection such as LOCAL_PREF and MED, and also more advanced options for per-peer selection such as SR EPE.Peer Selection via AnalyticsUsing the data provided by the peering edge devices and telemetry collectors, traffic analysis can reveal who to peer with and where to peer with them. Analyzing transit connections is key to determining good candidates for peering. Netflow exported data will contain the source ASN of traffic flows, and, when aggregated over a time period by source ASN, gives a clear picture of how much traffic you can expect via direct peering with the ASN. The aggregate traffic to a destination prefix can help determine where to peer, helping minimize the transit hops from source to destination.Segment Routing Egress Peer EngineeringEgress Peer Engineering, or EPE, using SR is a newer method of directing traffic to a specific peer based on a unique label; in the case of SR the peer is addressed via a specific peering SID. It allows a TE path decision from deeper in the network to not only specify an egress node but a specific egress peer or set of peers. Through the use of Anycast SIDs, traffic can be load balanced between several peer nodes as well, simplifying the process of balancing egress traffic.Flexible Routing Policies via RPLBGP routing policies are used to filter inbound or outbound advertised prefixes and apply modifications to BGP attributes. These mechanisms are used to steer traffic towards egress points, or steer traffic into networks via the application of MED and AS_PATH or prefix suppression.IOS-XR from its inception has supported flexible routing policy definitions via its Routing Policy Language. RPL supports advanced functionality such as hierarchical policies, global parameters, and passing parameters to policies. Replacing common policy components with variables passed as parameters when the policy is applied allows abstraction and eliminates duplication. More information about RPL, including many examples of its functionality, can be found at http://www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/routing/62x/b-routing-cg-ncs5500-62x/b-routing-cg-ncs5500-62x_chapter_0101.html Peering SecurityPeering by definition is at the edge of the network, where security is mandatory.
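Before turning to those practices, a small RPL sketch of the parameterized policies just described; the policy name, community-set, addresses and values are invented for the example, and the same template idea reappears in the attribute scrubbing guidance below. The single policy is reused across peers, with the local preference passed in at attach time:

community-set PEER-CONTENT
  64512:100
end-set
!
route-policy PEER-IN ($localpref)
  ! Remove communities received from the peer, tag the route with our own
  ! peer-class community, and apply the preference passed as a parameter
  delete community all
  set community PEER-CONTENT
  set local-preference $localpref
  pass
end-policy
!
router bgp 64512
 neighbor 192.0.2.1
  remote-as 64500
  address-family ipv4 unicast
   route-policy PEER-IN(90) in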
While not exclusive to peering, there are a number of best practices and software features that, when implemented, will protect your own network, as well as others, from malicious sources within your network.BCP ImplementationBest Current Practices are informational documents published by the IETF to give guidelines on operational practices. This document will not outline the contents of the recommended BCPs, but a few in particular are of interest to Internet peering. BCP38 explains the need to filter unused address space at the edges of the network, minimizing the chances of spoofed traffic from DDoS sources reaching their intended target. BCP38 is applicable for ingress traffic and especially egress traffic, as it stops spoofed traffic before it reaches outside your network. BCP84 proposes automated ways to verify ingress traffic is valid via the use of Unicast Reverse Path Forwarding (uRPF), a mechanism that drops traffic if the source address does not match a valid route. BCP194, BGP Operations and Security, covers a number of BGP operational practices, many of which are used in Internet peering. IOS-XR supports all of the mechanisms recommended in BCP38, BCP84, and BCP194, including software features such as GTSM (the Generalized TTL Security Mechanism), BGP dampening, and prefix limits.BGP Attribute and CoS ScrubbingScrubbing of data on ingress and egress of your network is an important security measure. Scrubbing falls into two categories: control-plane and dataplane. The control-plane for Internet peering is BGP, and there are a few BGP transitive attributes one should take care to normalize. Your internal BGP communities should be deleted from outbound BGP NLRI via egress policy. Most often you are setting communities on inbound prefixes; make sure you are replacing existing communities from the peer and not adding to them. Unless you have an agreement with the peer, normalize the MED attribute to zero or another standard value on all inbound prefixes.In the dataplane, it’s important to treat the peering edge as untrusted and clear any CoS markings on inbound packets, assuming a prior agreement hasn’t been reached with the peer to carry them across the network boundary. It’s an overlooked aspect which could lead to peer traffic being prioritized on your network, leading to unexpected network behavior.Per-Peer Control Plane PolicersBGP protocol packets are handled at the RP level, meaning each packet is handled by the router CPU with limited bandwidth and processing resources. In the case of a malicious or misconfigured peer, this could exhaust the processing power of the CPU, impacting other important tasks. Most vendors implement policers to prohibit impact to other processes, but at a per-protocol level, meaning that even with a policer in place, sessions to other BGP peers could be disrupted. IOS-XR’s powerful control plane policing feature implements separate dynamic policers for each peer, meaning no impact outside of that peer.BGP Prefix SecurityRPKI Origin ValidationPrefix hijacking has been prevalent throughout the last decade as the Internet became more integrated into our lives. This led to the creation of RPKI origin validation, a mechanism to validate that a prefix is being originated by its rightful owner by checking the originating ASN against a secure database. IOS-XR fully supports RPKI for origin validation.
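As a pointer to what that support looks like, a minimal IOS-XR sketch of an RPKI cache session and a policy acting on the validation state; the cache address, port, timer and policy name are illustrative:

router bgp 64512
 rpki server 203.0.113.10
  ! RTR session to the RPKI validator cache
  transport tcp port 323
  refresh-time 600
!
route-policy RPKI-VALIDATE
  ! Drop announcements whose origin AS conflicts with a signed ROA
  if validation-state is invalid then
    drop
  endif
  pass
end-policy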
Complete details of IOS-XR’s Flowspec implementation plus configuration examples can be found at http://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r6-2/routing/configuration/guide/b-routing-cg-asr9000-62x/b-routing-cg-asr9000-62x_chapter_011.html BGPSECRPKI origin validation works to validate the source of a prefix, but does not validate the entire path of the prefix. Origin validation also does not use cryptographic signatures to ensure the originator is who they say they are, so spoofing the origin ASN as well does not stop someone from hijacking a prefix. BGPSEC is an evolution where a BGP prefix is cryptographically signed with the key of its valid originator, and each BGP router receiving the path checks to ensure the prefix originated from the valid owner. BGPSEC standards are being worked on in the SIDR working group.BGP FlowspecBGP Flowspec was standardized in RFC 5575 and defines additional BGP NLRI to inject traffic manipulation policy information to be dynamically implemented by a receiving router. BGP acts as the control-plane for disseminating the policy information, while it is up to the BGP Flowspec receiver to implement the dataplane rules specified in the NLRI. At the Internet peering edge, DDoS protection has become extremely important, and so has automating the remediation of an incoming DDoS attack. IOS-XR on the NCS 5500, ASR9000, and ASR9900 implements most functionality defined in RFC 5575 and the RFCs and drafts extending BGP Flowspec’s capabilities. Full details of IOS-XR’s Flowspec implementation and configuration examples can be found at: https://community.cisco.com/t5/service-providers-blogs/bgp-flowspec-implementation-on-ncs5500-platforms/ba-p/3387443 SummaryCisco routers have powered the Internet for decades now, including playing a critical role in peering between Internet networks.
Through continued innovation, Cisco platforms such as the NCS 5500, powered by IOS-XR, enable providers to unlock efficiency in their peering edge today, driving capex and opex savings.Stay tuned for the next paper in the series covering Next-Generation peering, where we introduce Intelligent Peering through rich analytics, router programmability, and more flexible path selection.", "url": "/blogs/2017-08-01-internet-edge-peering-current-practice/", "author": "Phil Bedard", "tags": "iosxr, Peering, Design" } , "blogs-2017-09-21-peering-telemetry": { "title": "Peering Telemetry", "content": " On This Page Introduction Telemetry Data Types Metric Data Event Data Peering Metric Data Peering Metric Data Protocols Model-Driven Telemetry Sampled Netflow / IPFIX NETCONF SNMP Peering Metric Data Examples Peer Physical and Logical Interface Statistics BGP Operational State Peering Event Data Peering Event Data Protocols BGP Monitoring Protocol Syslog SNMP Traps Peering Related Events Peer Interface State Peer Session State Max-Prefix Threshold Events Global and Per-Peer RIB Changes Enabling Telemetry Model Driven Streaming Telemetry Netflow / IPFIX BMP SNMP and SNMP Traps Syslog Applications for Peering Telemetry Enhancing Peering Operations Monitor Peering Router Stability and Resources Monitor Peer Stability Using Flow Data for Enhanced Troubleshooting Applicable Metric Telemetry Applicable Event Telemetry Network Visibility Peer Traffic Anomaly Detection Applicable Metric Telemetry Applicable Event Telemetry Capacity Planning Peer Targeting Existing Peer Capacity Planning Security Attack Detection BGP Prefix Anomalies Applicable Metric Telemetry Applicable Event Telemetry Peering Traffic Engineering Ingress Peer Engineering Appendix A A.1 Interface Metric SNMP OIDs and YANG Paths Relevant YANG Models A.2 BGP Operational State YANG Paths Relevant YANG Models Global BGP Protocol State BGP Neighbor State Example Usage BGP RIB Data A.3 Device Resource YANG Paths IntroductionTelemetry is the process of measuring the state of the components in a system and transmitting it to a remote location for further processing and analysis. Telemetry is important in any system, whether it be a car, a water treatment facility, or an IP network. The loosely coupled, distributed nature of IP networks, where the state of a single node can affect an entire set of interconnected elements, makes telemetry especially important. Peering adds the additional dimension of a node that is not under your administrative control, requiring additional telemetry data and analysis to detect network anomalies and provide meaningful insight into network behavior. Analysis of peering telemetry data also enhances operations and network efficiency. In this paper, we’ll explore IP network telemetry data types, how they are used, and their configuration in IOS-XR.Telemetry Data TypesIt’s important to make a distinction between the two major telemetry types we’ll be collecting and analyzing for peering networks. Each data type is important for operations and planning, and the two are often combined to fulfill complex use cases.Metric DataMetric data refers to the measurement of quantitative data which normally changes over time. Metric telemetry data is usually found in the form of a counter, Boolean value, gauge, rate, or histogram. A counter is a monotonically increasing measurement; interface received bits is an example of a counter. A Boolean value is used to periodically transmit a state value.
These are typically used in the absence of an event data type covering the state change. Gauges are used to record instantaneous values in time, such as the number of prefixes received from a BGP peer. Rate data is the rate of change in a counter or gauge over a period of time. Rate data being received from a remote device requires some processing of data on the device itself since it must record historical values and average them over a time period. Interface bits per second is an example of a typical rate data type. Histograms are an additional more complex data type requiring the most processing on a device. Histograms store the frequency of occurrence over a period of time and typically use ranges to group similar values. Histograms are not widely used in IP networks, but may have more applicability in the future.Event DataEvent data is the recording of data at specific times triggered by a specific monitored state change. The state change could be a boolean value such as an interface being up or down, or an exception triggered by exceeding a limit on a metric like a BGP peer exceeding its received prefix limit. Event data always has a timestamp and then a series of other schema-less data points carrying additional information. Event data often contains metric data to supply context around the event. A prefix limit violation event may include a peer IP, peer ASN, and currently configured limit.Peering Metric DataPeering Metric Data ProtocolsModel-Driven TelemetryIOS-XR on the NCS5500 series supports MDT, or Model-Driven Telemetry over TCP or gRPC as an efficient method to stream statistics to a collector. The MDT fields are accessed by a specific telemetry sensor path defined in native IOS-XR, OpenConfig, or standard IETF YANG models. Multiple MDT groups can be configured to report data at different intervals, for instance to stream interface statistics at higher frequency than BGP protocol statistics. Much more information on Model-Driven Telemetry in IOS-XR can be found at http#//xrdocs.io/telemetry/.Sampled Netflow / IPFIXNetflow was invented by Cisco due to requirements for traffic visibility and accounting. Netflow in its simplest form exports 5-tuple data for each flow traversing a Netflow-enabled interface. Netflow data is further enhanced with the inclusion of BGP information in the exported Netflow data, namely AS_PATH and destination prefix. This inclusion makes it possible to see where traffic originated by ASN and derive the destination for the traffic per BGP prefix. The latest iteration of Cisco Netflow is Netflow v9, with the next-generation IETF standardized version called IPFIX (IP Flow Information Export). IPFIX has expanded on Netflow’s capabilities by introducing hundreds of entities.Netflow is traditionally partially processed telemetry data. The device itself keeps a running cache table of flow entries and counters associated with packets, bytes, and flow duration. At certain time intervals or event triggered, the flow entries are exported to a collector for further processing. The type 315 extension to IPFIX, supported on the NCS5500, does not process flow data on the device, but sends the raw sampled packet header to an external collector for all processing. Due to the high bandwidth, PPS rate, and large number of simultaneous flows on Internet routers, Netflow samples packets at a pre-configured rate for processing. 
Typical sampling values on peering routers are 1:4000 or 1:8000 packets.NETCONFDefined in RFC 6241, NETCONF can also be used to retrieve metric data from a device over SSH using the operational state paths associated with IETF, OpenConfig, and IOS-XR native YANG models. Almost all state data in IOS-XR adheres to a YANG model, allowing one to retrieve state data for protocols, resource utilization, route tables, etc. via simply constructed NETCONF RPC operations. See Appendix A for several examples of using NETCONF to gather useful peering metric data.SNMPSNMP, the traditional, well-supported protocol for pulling data from a device, is also supported via native IOS-XR or standard IETF MIBs.Peering Metric Data ExamplesPeer Physical and Logical Interface StatisticsThe most basic information needed on peering connections is interface statistics. Collected via SNMP or newer methods like Model-Driven Telemetry, having insight into both real-time traffic statistics and historical trends is a necessary component for operating and planning peering networks. A list of recommended interface counters, their related MDT sensor paths, and SNMP OIDs can be found in Appendix A.1.BGP Operational StateThere is a variety of BGP operational state data to be mined for information. Doing so can lead to enhanced peering operations. A wealth of information on global and per-peer BGP state is available via OpenConfig and native IOS-XR YANG models. This includes data such as per-AFI and per-neighbor prefix counts, update message counts, and associated configuration data. Using the OpenConfig BGP RIB models, you can retrieve the global best-path Loc-RIB and per-neighbor adj-RIB-in and adj-RIB-out pre- and post-policy, along with the reason why a given route was not selected as best-path. This gives additional operational insight through automation which would normally require one to log in to various routers, issue show commands, and parse the verbose output. A list of recommended BGP OpState YANG Paths can be found in Appendix A.2.Peering Event DataPeering event telemetry data is ideally sent when an event occurs on the device, as opposed to polling the state. The timestamped event is sent to a collection system which may simply log the event, or the event may trigger the collection of additional data or a remediation action.Peering Event Data ProtocolsBGP Monitoring ProtocolBMP, defined in RFC7854, is a protocol to monitor BGP events as well as BGP related data and statistics. BMP has two primary modes: Route Monitoring mode and Route Mirroring mode. The monitoring mode will initially transmit the adj-rib-in contents per-peer to a monitoring station, and continue to send updates as they occur on the monitored device. Setting the L flag in the RM header to 1 conveys that this is a post-policy route, while 0 indicates pre-policy. The mirroring mode simply reflects all received BGP messages to the monitoring host. IOS-XR supports sending pre- and post-policy routes to a station via the Route Monitoring mode. BMP can additionally send information on peer state change events, including why a peer went down in the case of a BGP event.
Adj-RIB-out will add the ability to monitor routes advertised to peers pre and post policy.SyslogSyslog has long been used as a method for reporting event data from both host servers and network devices, and allows a severity to be transmitted along with a verbose log message. Syslog requires more complex post-processing on the receiver end since all the data is encoded within the text message itself, but in the absence of a standardized schema for certain events can be a useful option. Syslog is not typically encrypted which can be a security concern.SNMP TrapsSNMP traps are event-driven SNMP messages sent by a device following a well-defined SNMP OID schema. SNMP trap receivers can easily decode the message type by the OID and apply the appropriate policy. SNMP trap policies must be defined on the device itself to filter out unwanted messages.Peering Related EventsMonitoring all event data on a router can often overwhelm collectors, so prescriptive monitoring is needed to only ingest applicable events. Applicable SNMP trap OIDs can be found in Appendix A.3.Peer Interface StateThe most basic form of peer monitoring is physical and logical interface state. Whenever an interface goes down, it’s a traffic impacting event needing further investigation. Interface state can be polled at periodic intervals.Peer Session StateMonitoring peer session state can be critical for detecting transient outages, traffic shifts, and performing root cause analysis on historical traffic impacting events. BGP peer sessions can transition from down to up in a short time period and are not always triggered by interface state changes.Max-Prefix Threshold EventsSetting a realistic max-prefix limit on peers is an important security mechanism. Most router operating systems support the ability to trigger an event based on a percentage threshold of this max prefix limit. This is an important event to monitor since reaching the limit generally results in a traffic-affecting session teardown event.Global and Per-Peer RIB ChangesBMP allows one to monitor incoming advertisements on a per-peer basis and record them for historical purposes. Having a record of all changes allows one to playback updates to determine the past impact of peer advertisement changes. BMP is the preferred mechanism to stream BGP updates as they happen, but NETCONF can also be used to retrieve BGP RIB data globally and per-peer on IOS-XR.Enabling TelemetryModel Driven Streaming TelemetryMDT is enabled on the node itself in three steps. 1) Grouping source data YANG paths, called sensors, into a sensor group. 2) Creating a destination group with the destination and data encoding method. 3) Creating a subscription grouping a sensor-group to a destination-group. This method of configuration is known as “dial-out” since the node itself initiates the streaming. Another method, called “dial-in” uses specific models to configure all of the above information from an external management application. The dial-in configuration is ephemeral, meaning it is not stored in the startup configuration. 
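A minimal dial-out sketch tying the three steps together follows; the collector address, port, interval and sensor paths are illustrative, and the exact paths for your release should be taken from the telemetry guide referenced below:

telemetry model-driven
 destination-group DG-COLLECTOR
  address-family ipv4 198.51.100.10 port 57500
   encoding self-describing-gpb
   protocol grpc no-tls
 !
 sensor-group SG-PEERING
  ! Interface counters and per-neighbor BGP state
  sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
  sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor
 !
 subscription SUB-PEERING
  sensor-group-id SG-PEERING sample-interval 30000
  destination-id DG-COLLECTOR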
Configuration of MDT for IOS-XR on the NCS5500 can be found here# https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/telemetry/b-telemetry-cg-ncs5500-62x/b-telemetry-cg-ncs5500-62x_chapter_011.htmlAlso, http#//xrdocs.io has a number of telemetry related blogs which go in depth on configuration and use cases for MDT.On the collection side, Pipeline is a Cisco open-source project which can be used to collect streaming data and output it to several popular time-series databases. Pipeline can be located at https#//github.com/cisco/bigmuddy-network-telemetry-pipeline and a tutorial on using Pipeline can be found at https#//xrdocs.github.io/telemetry/tutorials/2017-05-08-pipeline-with-grpcNetflow / IPFIXThe Netflow and IPFIX configuration guide for the NCS5500 can be found here#https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/netflow/62x/b-ncs5500-netflow-configuration-guide-62x/b-ncs5500-netflow-configuration-guide-62x_chapter_010.htmlTuning Netflow parameters is critical to extracting the most useful data from Netflow. It is recommended to contact your Cisco SE to work through optimizing Netflow for your platform. As a general guideline, for traffic and security analysis, a sampling rate of 1#8000 or 1#4000 is sufficient.There is a wide range of Netflow collection engines on the market today as well as cloud-based solutions. PMACCT, found at http#//www.pmacct.net, is a popular open-source Netflow and IPFIX collector.BMPBMP is easily configured in the following steps in IOS-XR. Configure a BMP destination host using the global “bmp server <1-8>” command with its associated parameters. The minimum configuration is bmp server <1-8> host <fqdn|ip> port <port>. BMP uses TCP as its transport protocol and has no standard port, so a port must be specified. Additionally, in order to send periodic BGP statistics, a statistics interval must be configured via the bmp server <1-8> stats-reporting-period <1-3600> command. Once a destination BMP host is configured, BMP is activated on a per-peer basis (or all peers via a shared peer-group configuration) using the “bmp-activate server <1-8>” command under the neighbor configuration within the BGP routing configuration. The following is an example configuration.!bmp server 1 host 192.168.2.51 port 8500 update-source GigabitEthernet0/0/0/0 stats-reporting-period 60!router bgp 65001 neighbor 192.168.1.1 remote-as 65002 bmp-activate server 1 !! Collecting BMP data is best done using the open source SNAS collector, formerly known as OpenBMP. SNAS can be found at http#//snas.io.SNMP and SNMP TrapsThe NCS5500 IOS-XR SNMP configuration guide can be found here# https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/sysman/62x/b-system-management-cg-ncs5500-62x/b-system-management-cg-ncs5500-62x_chapter_0110.html. It is recommended to use SNMPv3 for higher security. SNMP is supported by most traditional EMS/NMS systems.SyslogThe IOS-XR Syslog configuration guide can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/system-monitoring/62x/b-system-monitoring-cg-ncs5500-62x/b-system-monitoring-cg-ncs5500-62x_chapter_010.htmlApplications for Peering TelemetryEnhancing Peering OperationsSuccessfully operating a peering edge router requires knowledge of a variety of telemetry data. As the connecting device is not under your administrative control, having information on the state of the connection at all times is critical.
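For flow visibility specifically, the Netflow/IPFIX settings discussed above translate into a configuration along the lines of the following sketch, shown here with the 1 in 4000 random sampling guideline; the map names, exporter address, cache timers, and interface are illustrative placeholders rather than recommended values.
! Illustrative sampled Netflow configuration for a peering interface
sampler-map PEERING-SAMPLER
 random 1 out-of 4000
!
flow exporter-map PEERING-EXPORTER
 version v9
 !
 transport udp 2055
 destination 192.168.2.61
!
flow monitor-map PEERING-MONITOR
 record ipv4
 exporter PEERING-EXPORTER
 cache timeout active 60
 cache timeout inactive 15
!
interface HundredGigE0/0/0/24
 flow ipv4 monitor PEERING-MONITOR sampler PEERING-SAMPLER ingress
 flow ipv4 monitor PEERING-MONITOR sampler PEERING-SAMPLER egress
!
An equivalent IPv6 monitor can be applied alongside the IPv4 one where IPv6 peering is present.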
Mitigating issues like an ingress congestion event on a peering interface can be more difficult to troubleshoot and be in a degraded state longer due to multiple party involvement.Monitor Peering Router Stability and ResourcesGlobal BGP statistics can be used to determine peering router health. Global BGP state values such as overall per-AFI RIB size, established peer session count, overall BGP update counts, and state transitions are used to determine instability either causing network issues or leading to them. When coupled with known resource limits, these values can also be used to monitor devices for exhaustion in resources like FIB or RIB size, and peer session limits.Monitor Peer StabilityJust as we monitor the global BGP state, we can monitor the same statistics on a per-peer basis. Detecting instability on peering edge routers is beneficial since off-net instability can be replicated across your entire network. Tracking per-peer BGP state values such as overall BGP message count, update and withdrawal count, update queue depth, per-AFI adj-RIB-in pre/post policy, and active prefix count can help discover peer stability issues very quickly.Using Flow Data for Enhanced TroubleshootingIt is often difficult to troubleshoot exact network behavior across administrative boundaries. Flow data can assist by giving a view of traffic behavior at an application stream level. Tools like ping and traceroute do not give the same insight into service traffic behavior.Applicable Metric Telemetry Peer logical and physical interface statistics Peering router general health data (CPU, Memory, FIB resources, etc.) BGP protocol statistics Global BGP session count Global BGP RIB size Global BGP table version (update count) Per-peer session state Per-peer prefix counts Per-peer message update counts Per-peer message queue depth Current dampened paths Applicable Event Telemetry Peer logical and physical interface state Per-peer BGP session state Network VisibilityPeer Traffic Anomaly DetectionSimple interface stats are again the starting point for peering network visibility. Having accurate and timely logical and physical interface stats can quickly alert you to service-affecting anomalies for both ingress and egress traffic. Using the faster sampling frequency of MDT on IOS-XR decreases the time to catch events from what was typically 5+ minutes using SNMP to 30 seconds or less.Historical flow data from peers can be used to store a network baseline used with real-time information to determine anomalous behavior. Examples include DNS-driven content peers shifting traffic sources causing sub-optimal traffic on your network, or peer BGP routes being withdrawn from optimal peer sessions. Some shifts are not detected by interface statistics alone and require the flow-level traffic view to detect. 
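The global and per-peer BGP statistics called out above map directly onto the native BGP operational models listed in Appendix A.2, and can be streamed continuously by adding a sensor-group along the lines of the sketch below to the MDT subscription shown earlier. The group name and 60 second cadence are illustrative assumptions, not values prescribed by this design.
telemetry model-driven
 sensor-group SG-PEERING-BGP
  ! Per-neighbor session state, message statistics, and prefix counts
  sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor
  ! Per-session input and output message queue depths
  sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/default-vrf/sessions/session
  ! Global route and path counts from the RIB summary
  sensor-path Cisco-IOS-XR-ip-rib-ipv4-oper:rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count
 !
 subscription SUB-PEERING
  sensor-group-id SG-PEERING-BGP sample-interval 60000
 !
!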
Grouping by constraints such as SRC/DST ASN or BGP prefix can help quickly determine large changes in traffic across an entire network.Analyzing BGP updates on a global and per-peer basis via BMP data can also help detect traffic-affecting routing anomalies and be an important resource for root cause analysis of previous events.Applicable Metric Telemetry Peer logical and physical interface statistics BGP protocol statistics Global BGP RIB size Global BGP table version (update count) Per-peer session state Per-peer prefix counts Per-peer message update counts Per-peer message queue depth Netflow / IPFIX Applicable Event Telemetry Peer logical and physical interface state Per-peer BGP session state Capacity PlanningPeer TargetingThe most common use of Netflow data for peering is to determine who you need your network to peer with or where to add additional peer connections. The traffic exchange rate between sources or destinations on your network and remote networks can easily be derived with flow information and the associated BGP ASN data. Analyzing flow data from transit connections or larger service provider peer connections can help determine new organizations to peer with. Analyzing flow data from existing peers helps determine if traffic to/from your network is taking the optimal path. Sub-optimal paths can be remedied by adding additional peer connections.Telemetry data can not only tell you where to augment existing peering, or connect to a new provider in an existing location, but help determine where to add additional peering facilities. Aggregated flow data along with topology data help determine the cost of carrying traffic across an internal network as well as paid peering connections. Eliminating network hops and augmenting paid peering or transit with settlement free peering connections can offset the location and network build costs in a short amount of time.Existing Peer Capacity PlanningInterface bandwidth statistics are necessary for accurate capacity planning. The most basic metric to trigger capacity upgrades is exceeding a traffic utilization threshold. Capacity planning can also be aided by flow data. Knowing the growth rates of specific types of peer traffic will aid in future overall network traffic projections. Flow data can also be used to derive growth rates between specific source/destination pairs, helping better predict growth across a network and not just at the peer interface boundary.SecurityPeering by definition is at the edge of the network, where security is mandatory. Telemetry data is critical to security applications and aids in quickly identifying potential threats and triggering mitigation activity.Attack DetectionDDoS is a threat to all Internet-connected entities. Often times simple interface packet rates can be used to quickly identify that a DDoS attack is in progress. Flow data is then critical in determining the source, destination, volume, packet rate, and type of data associated to a DDoS attack. Coupled with a dynamic remediation system, traffic can be quickly blocked or diverted and alleviate congestion on downstream nodes.Not all attacks have a signature of high packet rates. The attacks can also originate from within your own network, such as open DNS resolvers participating in an Internet-wide attack on a remote destination. 
Flow data becomes important in these instances; monitoring per-protocol or anomalous behavior on both ingress and egress peering traffic can quickly help identify and mitigate attacks.BGP Prefix AnomaliesThe pre-policy RIB information from BMP can be used for traffic simulation as well as detecting security issues such as prefix hijacking without the prefixes being active in the provider table. In certain cases, analyzing incoming NLRI for malformed attributes or excessive AS_PATH lengths can help mitigate router security vulnerabilities as well.Having historical information can also help troubleshoot traffic issues where a provider may be changing advertisement locations due to instability on their network, causing reachability issues from sources on your network.Applicable Metric Telemetry Netflow / IPFIX Peer logical and physical interface statistics Applicable Event Telemetry Peer logical and physical interface state BGP adj-RIB-in information and updates Peering Traffic EngineeringPeer traffic engineering in this context refers to shifting either ingress or egress traffic between peer connections. Peer TE may be performed for a variety of reasons such as capacity optimization, performance, or maintenance activity. Without more granular flow data to determine traffic per prefix, where a prefix may be recursively split for precision, accurate traffic placement cannot be achieved. Peer TE may be a manual operation, driven by an offline planning tool, or handled by a real-time network component.Ingress Peer EngineeringIngress peer engineering generally involves the manipulation of outbound BGP NLRI, either through prefix withdrawal, prefix disaggregation, or adjustment of path attributes such as MED or AS_PATH. Targeting specific prefixes to manipulate requires telemetry data. IPE may be employed for capacity, performance, or operational reasons.
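Both the inbound protections and the egress manipulation described above are ultimately expressed as BGP routing policy on the peering router. The following is a minimal sketch rather than part of this design. It shows a hypothetical inbound policy that discards NLRI with excessive AS_PATH lengths or overly specific prefixes, and a hypothetical outbound policy that prepends the local ASN to a selected prefix set for ingress peer engineering; the policy names, prefix values, and prepend count are placeholders.
prefix-set IPE-SHIFT-PREFIXES
  203.0.113.0/24
end-set
!
route-policy PEER-IN-GUARD
  if as-path length ge 50 then
    drop
  endif
  if destination in (0.0.0.0/0 ge 25) then
    drop
  endif
  pass
end-policy
!
route-policy PEER-OUT-IPE
  if destination in IPE-SHIFT-PREFIXES then
    prepend as-path 65001 3
  endif
  pass
end-policy
!
router bgp 65001
 neighbor 192.168.1.1
  address-family ipv4 unicast
   route-policy PEER-IN-GUARD in
   route-policy PEER-OUT-IPE out
  !
 !
!
Which prefixes belong in the IPE set, and how aggressively to prepend or disaggregate, is exactly the decision the interface, RIB, and flow telemetry described in this document is meant to inform.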
Accurate capacity augmentation requires interface statistics, BGP prefix information, and Netflow data to plan for traffic shifts and verify the network behaves as predicted after implementation.Appendix AA.1 Interface Metric SNMP OIDs and YANG PathsRelevant YANG Modelsietf-interfacesopenconfig-interfacesopenconfig-if-ethernetoc-platformCisco-IOS-XR-infra-statsd-operCisco-IOS-XR-drivers-media-eth-oper     Logical Interface Admin State Enum SNMP OID IF-MIB#ifAdminStatus OC YANG oc-if#interfaces/interface/state/admin-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state MDT Native     Logical Interface Operational State Enum SNMP OID IF-MIB#ifOperStatus OC YANG oc-if#interfaces/interface/state/oper-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state MDT Native     Logical Last State Change (seconds) Counter SNMP OID IF-MIB#ifLastChange OC YANG oc-if#interfaces/interface/state/last-change Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/last-state-transition-time MDT Native     Logical Interface SNMP ifIndex Integer SNMP OID IF-MIB#ifIndex OC YANG oc-if#interfaces/interface/state/if-index Native YANG Cisco-IOS-XR-snmp-agent-oper#snmp/interface-indexes/if-index MDT Native     Logical Interface RX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCInOctets OC YANG oc-if#/interfaces/interface/state/counters/in-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-received MDT Native     Logical Interface TX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCOutOctets OC YANG oc-if#/interfaces/interface/state/counters/out-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-sent MDT Native     Logical Interface RX Errors Counter SNMP OID IF-MIB#ifInErrors OC YANG oc-if#/interfaces/interface/state/counters/in-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-errors MDT Native     Logical Interface TX Errors Counter SNMP OID IF-MIB#ifOutErrors OC YANG oc-if#/interfaces/interface/state/counters/out-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-errors MDT Native     Logical Interface Unicast Packets RX Counter SNMP OID IF-MIB#ifHCInUcastPkts OC YANG oc-if#/interfaces/interface/state/counters/in-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total MDT Native     Logical Interface Unicast Packets TX Counter SNMP OID IF-MIB#ifHCOutUcastPkts OC YANG oc-if#/interfaces/interface/state/counters/out-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total MDT Native     Logical Interface Input Drops Counter SNMP OID IF-MIB#ifIntDiscards OC YANG oc-if#/interfaces/interface/state/counters/in-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-drops MDT Native     Logical Interface Output Drops Counter SNMP OID IF-MIB#ifOutDiscards OC YANG oc-if#/interfaces/interface/state/counters/out-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-drops MDT Native     Ethernet Layer Stats – All Interfaces Counters SNMP OID NA OC YANG oc-if#interfaces/interface/oc-eth#ethernet/oc-eth#state Native YANG 
Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics MDT Native     Ethernet PHY State – All Interfaces Counters SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info MDT Native     Ethernet Input CRC Errors Counter SNMP OID NA OC YANG oc-if#interfaces/interface/oc-eth#ethernet/oc-eth#state/oc-eth#counters/oc-eth#in-crc-errors Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics/statistic/dropped-packets-with-crc-align-errors MDT Native The following transceiver paths retrieve the total power for the transceiver, there are specific per-lane power levels which can be retrieved from both native and OC models, please refer to the model YANG file for additional information     Ethernet Transceiver RX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-rx-power MDT Native     Ethernet Transceiver TX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-tx-power MDT Native A.2 BGP Operational State YANG PathsRelevant YANG Modelsopenconfig-bgp.YANGopenconfig-bgp-rib.YANGCisco-IOS-XR-ipv4-bgp-operCisco-IOS-XR-ipv6-bgp-operCisco-IOS-XR-ip-rib-ipv4-operCisco-IOS-XR-ip-rib-ipv6-operGlobal BGP Protocol StateIOS-XR native models do not store route information in the BGP Oper model, they are stored in the IPv4/IPv6 RIB models. These models contain RIB information based on protocol, with a numeric identifier for each protocol with the BGP ProtoID being 5. The protoid must be specified or the YANG path will return data for all configured routing protocols.     BGP Total Paths (all AFI/SAFI) Counter SNMP OID NA OC YANG oc-bgp#bgp/global/state/total-paths Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/num-active-paths MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG oc-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native BGP Neighbor StateExample UsageThe following OC NETCONF RPC returns the BGP session state for all configured peers. 
The neighbor-address key must be included as a container in all OC BGP state RPCs<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp xmlns=~http#//openconfig.net/YANG/bgp~> <neighbors> <neighbor> <neighbor-address/> <state> <session-state/> </state> </neighbor> </neighbors> </bgp> </filter> </get></rpc>\t<nc#rpc-reply message-id=~urn#uuid#24db986f-de34-4c97-9b2f-ac99ab2501e3~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp xmlns=~http#//openconfig.net/YANG/bgp~> <neighbors> <neighbor> <neighbor-address>172.16.0.2</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> </neighbors> </bgp> </nc#data></nc#rpc-reply>     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG oc-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors MDT Native     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG oc-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors MDT Native     Session State for all BGP neighbors Enum SNMP OID NA OC YANG oc-bgp#bgp/neighbors/neighbor/state/session-state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/connection-state MDT Native     Message counters for all BGP neighbors Counter SNMP OID NA OC YANG /oc-bgp#bgp/neighbors/neighbor/state/messages Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/message-statistics MDT Native     Current queue depth for all BGP neighbors Counter SNMP OID NA OC YANG /oc-bgp#bgp/neighbors/neighbor/state/queues Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-out Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-in MDT Native BGP RIB DataRIB data is retrieved per AFI/SAFI. To retrieve IPv6 unicast routes using OC models, replace “ipv4-unicast” with “ipv6-unicast”IOS-XR native models do not have a BGP specific RIB, but a protocol of ‘bgp’ can be specified in the field to filter out prefixes to those learned via BGP. The following OC YANG NETCONF RPC retrieves a list of best-path IPv4 prefixes without attributes from the loc-RIB<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/YANG/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <loc-rib> <routes> <route> <prefix/> <best-path>true</best-path> </route> </routes> </loc-rib> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc> The following native XR NETCONF RPC retrieves a list of BGP prefixes in its global (vrf default) IPv4 RIB, with only the prefix,prefix-length, source (route-path), and active attributes. 
Replacing <active/> with <active>true</active> would only return active prefixes, removing the prefix,prefix-length-xr, and active leaf attributes under route will return all attributes for each route<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <rib xmlns=~http#//cisco.com/ns/YANG/Cisco-IOS-XR-ip-rib-ipv4-oper~> <vrfs> <vrf> <afs> <af> <safs> <saf> <ip-rib-route-table-names> <ip-rib-route-table-name> <routes> <route> <route-path> <ipv4-rib-edm-path> <address/> </ipv4-rib-edm-path> </route-path> <prefix/> <prefix-length-xr/> <protocol-name>bgp</protocol-name> <active/> </route> </routes> </ip-rib-route-table-name> </ip-rib-route-table-names> </saf> </safs> </af> </afs> <vrf-name>default</vrf-name> </vrf> </vrfs> </rib> </filter> </get></rpc>     IPv4 RIB – Prefix Count Counter SNMP OID NA OC YANG oc-bgprib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/num-routes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes/route MDT Native     IPv4 RIB – IPv4 Prefixes w/o Attributes List SNMP OID NA OC YANG oc-bgprib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes/route/prefix Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes/route (see example RPC) MDT Native     IPv4 Local RIB – IPv4 Prefixes w/Attributes List SNMP OID NA OC YANG oc-bgprib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes/route (see example RPC) MDT Native The following per-neighbor RIB paths can be qualified with a specific neighbor address to retrieve RIB data for a specific peer. Below is an example of a NETCONF RPC to retrieve the number of post-policy routes from the 192.168.2.51 peer and the returned output. 
Native IOS-XR models do not support per-neighbor RIBs, but using the above example with the route-path address leaf set to the neighbor address will filter prefixes to a specific neighbor<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/YANG/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes/> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc><nc#rpc-reply message-id=~urn#uuid#7d9a0468-4d8d-4008-972b-8e703241a8e9~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp-rib xmlns=~http#//openconfig.net/YANG/rib/bgp~> <afi-safis> <afi-safi> <afi-safi-name xmlns#idx=~http#//openconfig.net/YANG/rib/bgp-types~>idx#IPV4_UNICAST</afi-safi-name> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes>3</num-routes> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </nc#data></nc#rpc-reply>     IPv4 Neighbor adj-rib-in pre-policy List SNMP OID NA OC YANG oc-bgprib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-re MDT NA     IPv4 Neighbor adj-rib-in post-policy List SNMP OID NA OC YANG oc-bgprib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-post MDT NA     IPv4 Neighbor adj-rib-out pre-policy List SNMP OID NA OC YANG oc-bgprib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre MDT NA     IPv4 Neighbor adj-rib-out post-policy List SNMP OID NA OC YANG oc-bgprib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre MDT NA A.3 Device Resource YANG PathsCisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper.YANGopenconfig-platform     Device Inventory List SNMP OID ENTITY-MIB#entityMIBObjects.entityPhysical.entPhysicalTable OC YANG oc-platform#components MDT NA     NCS5500 Dataplane Resources List SNMP OID NA OC YANG NA Native YANG Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data MDT Native ", "url": "/blogs/2017-09-21-peering-telemetry/", "author": "Phil Bedard", "tags": "iosxr, Peering, Design" } , "blogs-2018-02-25-internet-traffic-trends": { "title": "Internet Traffic Trends", "content": " On This Page Introduction Unicast Video Growth Broadcast Video History Video on Demand Video over IP Over the Top IP Video Producers and Consumers Content Providers Caching and Content Delivery Networks Eyeball Networks Efficient Unicast Video Delivery What is Network Efficiency? Role of Internet Peering Localized Peering Service Provider Unicast Delivery Headend Express Peering Fabrics IntroductionOver the last five years Internet traffic has been driven by one dominant source, unicast Internet video. The rise in Internet video has its roots in how user viewing habits have changed as unicast video delivery became more viable as an alternative to broadcast or multicast delivery. The capacity requirements of modern video delivery require rethinking networks to make the most efficient use of resources at all layers. This two-part blog series will first cover how video delivery has evolved to bring us where we are today. 
The second part will cover network architecture and technology to help improve network efficiency to deal with the rising tide of unicast video.Unicast Video GrowthBroadcast Video HistoryTelevision video delivery for many years followed the same path blazed by radio before it, broadcasting a single program over the air at a specific time to anyone within range of the signal. Cable networks were built in the 70s and 80s, with the promise of delivering a wider variety of content to subscribers not subject to the same impairments as over the air (OTA) broadcasts, while eliminating the use of antennas. A physical medium like coaxial cable exhibits similar properties as transmission through the air as electrical signals are replicated across branches in the medium. The original primitive cable networks were still analog end to end and built for broadcast delivery of all video to every user. Satellite video delivery worked in much the same way, simply broadcasting all signals and requiring the end device tune to the channel at a specific analog frequency and the user tune in to watch at a specific time of day. In the 1980s and 1990s, TV viewers could always cite the exact day and time of their favorite programs. While broadcast video has limitations on flexibility for users, it has the ultimate efficiency when it comes to network resources as the signal is broadcast once to all users once at the origin.Broadcast Video DeliveryVideo on DemandVideo on Demand, or VoD, originated in the late 1980s and rose to prominence in the 1990s as a way to deliver video content to users on their time, not tied to a broadcast schedule. Viewers could now select what they wanted to view and have it be delivered to them immediately. It took the reduction in cost of the base infrastructure components, mainly storage and compute, to make a service like VoD a reality. VoD was also seen as a way to further monetize the cable network and challenge the huge video rental business that existed during those times with Pay Per View (PPV) content.VoD delivery used two different methods for delivery back in its original form, push or pull. In the push method, content was delivered via broadcast to a single or all subscribers and stored locally on a device such as a set-top box, for viewing later. The pull method streamed the content to the subscriber device from a remote server on user demand. In the end, the pull method easily won out due to the much larger variety of content available and less costly end user device. In order to support a single user viewing content destined for only their device, it required dedicating analog spectrum for the channel. It was still broadcast to a number of users, but encrypted such that only the paying subscriber could view it. Even though it was still analog broadcast at the lowest level, the content was unicast to a specific subscriber by consuming resources for a single video stream destined to a single viewer. VoD fundamentally changed users viewing habits and also introduced the first unicast video delivery.Video over IPService providers who built out wireline networks using DSL and Ethernet technology, network infrastructure types not having a native analog video delivery method, looked at IP as the higher layer protocol to deliver video content to users. These networks were deployed to take advantage of multicast, a subset of the broadcast capability inherent in Ethernet, and standardized for IP in RFC 1112. 
IP multicast improves network efficiency by implementing frame replication in the network devices, combined with a set of control-plane protocols to create optimized distribution trees. In its simplest form, IP multicast replicates a broadcast network, sending all channels to all users (dense mode), and some providers used this method. However, to improve network efficiency it is now most common for end devices to use protocols like IGMP (v4) and MLD (v6) so optimized multicast trees are built. This type of multicast IP delivery is known as IPTV and is implemented in North America by networks such as AT&T UVerse and Google Fiber.Multicast Video Delivery Supporting VoD on these networks requires delivering video over IP. Similar mechanisms can be used as in analog networks, using a specific multicast address for the subscriber stream. However, instead of simulating a unicast stream using a more complex multicast process, streaming the content to a unicast IP address assigned to a device is much simpler and supports a wider range of devices, even across networks that do not support native multicast delivery. Today more and more content on wireline networks is delivered using unicast IP, even on traditional cable networks, due to its flexibility and the ability to serve content to a variety of end user devices from a single content source. The flexibility and ease of delivery using unicast IP has superseded the inefficiencies of delivering duplicate content over the same network resources.Unicast Video DeliveryOver the Top IP VideoThe unicast video content described above has typically been contained within a service provider network. As the Internet has grown and bandwidth to end users increased, video content from alternative sources emerged. Broadband Internet became more widely available in the mid 2000s and with it came user-generated video providers like YouTube along with traditional media rental companies like Netflix embracing streaming video for rental delivery. These Internet content providers deliver video “over the top” (OTT) of service provider networks since the origin and destination are applications controlled by the content provider. The growth of OTT Internet video has continued to climb rapidly over the last decade along with IP video in general. IP video accounted for 73% of all Internet traffic in 2016, and by 2021 will account for 82% of all Internet traffic. The most rapid increase is in over the top Internet video, shown in the graph below from the Cisco VNI.It is not only on-demand content driving OTT growth; streaming of traditional broadcast video like sports to mobile devices, tablets, smart TVs, and additional endpoints is increasing in popularity. The last few years have seen a number of new services delivering traditional linear (live) TV using OTT IP delivery. Over the top video is by nature unicast, as each stream is simply sent on demand when a user clicks “play.” Since there is little efficiency in sending a single stream to each user, it causes tremendous strain on network resources. It is however a trend that is unlikely to change, so new methods need to be employed to improve network efficiency and build networks to handle increasing video traffic demands.Producers and ConsumersContent ProvidersContent providers are those who originate video streams. A content source could also be a service provider providing video content to its own subscriber base, or an OTT Internet video source.
A content provider may not be the original origin of the content, but is simply a means to deliver the content to the end user.Caching and Content Delivery NetworksCaching is the process of storing content locally to serve to users instead of utilizing network resources to retrieve the content each time the content is accessed. Caching of Internet content became popular with the rise of the Internet in the late 1990s with open-source software such as Squid and commercial products like Cacheflow and Cisco WAAS. Called “transparent” caches, they intercept content without the source or destination having knowledge of the caching. The content in those days was mainly primitive static content, but with the high cost of bandwidth, it was still sometimes beneficial to cache content. Transparent caches have evolved into systems today targeted at OTT providers, and use more sophisticated techniques to cache video content from any source. However, with the rise of end to end encryption use, transparent caches are no longer a realistic option for serving content closer to users.Content Delivery Networks (CDN) have been around for many years now, with the first major CDN, Akamai, going live in 1999. The aim of a CDN is to place content closer to end users by distributing caching servers closer to those users. CDNs such as Akamai, Limelight, and Fastly host and deliver a variety of content on behalf of their customers. In addition to more generic CDNs, content owners have built their own CDNs to deliver their own content. Examples of dedicated CDNs include the Netflix OpenConnect network and Google Global Cache network. Most new streaming video providers utilize a distributed CDN to deliver content. Each CDN uses proprietary mechanisms for request routing and content delivery, so providers must use analysis to determine which ones are the most optimal for their network.Eyeball NetworksWireless and wireline service providers providing the last mile Internet connection to end users are commonly known as “Eyeball” networks, because all content must pass through those networks for end users to view it. Around the world, and especially in North America, consolidation of service providers has left relatively few Eyeball networks serving a large number of subscribers.Efficient Unicast Video DeliveryWhat is Network Efficiency?Network efficiency in this context refers to minimizing the cost and consumption of network resources such as physical fiber, wavelengths, and IP interfaces to deliver unicast video content to end users. The key to delivering video traffic efficiently is to create a network model reducing the distance, network hops, and network layer transitions between the content provider and content consumer while maintaining statistical multiplexing through aggregation where beneficial.Role of Internet PeeringInternet Peering is the exchange of traffic between two providers. Peering originated at third party carrier-neutral facilities known as Internet Exchange Points (IXP), with the exchange providing a public fabric to interconnect service providers. Due to consolidation and the dominant traffic type being video, the Internet has evolved from a model where most content flowed through a Tier-1 Internet provider via transit connections to one of direct traffic exchange between content providers and eyeball networks. The majority of Internet video traffic today is exchanged via private network interconnection (PNI) between content providers and eyeball networks.
The concept has been coined the “Flattening of the Internet” as the traditional hierarchical traffic flow between ISPs is eliminated. Traditional large IXPs still act as meet-me points for many providers, facilitating both public and private interconnection, but improving network efficiency demands traffic take shorter paths.Localized PeeringReducing the distance and network hops between where unicast video packets enter your network and exit to the consumer is a key priority for service providers in reducing network cost. Each pass through an optical transponder or router interface adds additional cost to the transit path, especially on long-haul paths from traditional large IXPs to subscriber regions. The aforementioned rise in video traffic demands peering move closer to the edges of the network to serve wireline broadband subscribers along with high-bandwidth 5G mobile users. Content providers have invested heavily in their own networks as well as distributed caches serving content from any network location with Internet access. Third party co-location providers have begun building more regional locations supporting PNI between content distributors and the end subscribers on the SP network. This leads to a localized peering option for SPs and content providers, greatly reducing the distance and hops across the network. As more traffic shifts to becoming locally delivered building additional regional or metro peering locations becomes important to ensure less reliance on longer paths during failures.Localized Peering Service Provider Unicast Delivery HeadendAs mentioned, service providers are seeing growth not only in OTT unicast video delivery, but also delivery for their own video services. Most service providers have deployed their own internal CDNs to provide unicast video content to their subscribers and migrate VoD off legacy analog systems onto an all-IP infrastructure. The same efficiency tools for dealing with off-net content from peers applies to on-net video services. There may be efficiencies gained in placing SP content servers in the same facilities as other content peers, aggregating all content traffic in a single location for efficient delivery to end users.Express Peering FabricsSee how Express Peering Fabrics can help drive efficiency into service providers networks in the next blog in this series.", "url": "/blogs/2018-02-25-internet-traffic-trends/", "author": "Phil Bedard", "tags": "iosxr, Peering, Design" }
, "blogs-2018-04-30-metro-fabric-hld": { "title": "Metro Fabric High Level Design", "content": " On This Page Value Proposition Summary Technical Overview Transport – Design Use Cases Intra-Domain Intra-Domain Routing and Forwarding Intra-Domain Forwarding - Fast Re-Route Inter-Domain Inter-Domain Forwarding Area Border Routers – Prefix-SID vs Anycast-SID Inter-Domain Forwarding - Label Stack Optimization Inter-Domain Forwarding - High Availability and Fast Re-Route Transport Programmability Transport Controller Path Computation Engine (PCE) Segment Routing Path Computation Element (SR-PCE) WAN Automation Engine (WAE) PCE Controller Summary – SR-PCE & WAE Path Computation Engine – Workflow Delegated Computation to SR-PCE WAE Instantiated LSP Delegated Computation to WAE Transport – Segment Routing IPv6 Data Plane (SRv6) Best-Effort Path Low-Latency Path SRv6 – Inter-Domain Forwarding SRv6 Conclusion Services – Design Overview Ethernet VPN (EVPN) Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or with Data Center End-To-End (Flat) – Services Hierarchical – Services Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHE Services – Router-Reflector (S-RR) Network Services Orchestrator (NSO) Transport and Services Integration The Compass Metro Fabric Design – Phase 1 Transport - Phase 1 Transport Programmability – Phase 1 Services – Phase 1 Transport and Services Integration – Phase 1 The Compass Metro Fabric Design - Summary Value PropositionService Providers are facing the challenge of providing next generation services that can quickly adapt to market needs. New paradigms such as the introduction of 5G, continuous video traffic growth, IoT proliferation, and the cloud services model require unprecedented flexibility, elasticity, and scale from the network. Increasing bandwidth demands and decreasing ARPU put pressure on reducing network cost. At the same time, services need to be deployed faster and more cost effectively to stay competitive.Metro Access and Aggregation solutions have evolved from native Ethernet/Layer 2 based designs to Unified MPLS to address the above challenges. The Unified MPLS architecture provides a single converged network infrastructure with a common operational model. It has great advantages in terms of network convergence, high scalability, high availability, and optimized forwarding.
However, that architectural model is still quite challenging to manage, especially on large-scale networks, because of the large number of distributed network protocols involved, which increases operational complexity.The Compass Metro Fabric (CMF) design introduces an SDN-ready architecture which evolves traditional Metro network design towards an SDN enabled, programmable network capable of delivering all services (Residential, Business, 5G Mobile Backhauling, Video, IoT) on the premise of simplicity, full programmability, and cloud integration, with guaranteed service level agreements (SLAs).The Compass Metro Fabric design brings tremendous value to Service Providers# Fast service deployment and rapid time to market through fully automated service provisioning and end-to-end network programmability Operational simplicity with fewer protocols to operate and manage Smooth migration towards an SDN-ready architecture thanks to backward-compatibility with existing network protocols and services Next generation services creation leveraging guaranteed SLAs Enhanced and optimized operations using telemetry/analytics in conjunction with automation tools The Compass Metro Fabric design is targeted at Service Provider customers who# Want to evolve their existing Unified MPLS Network Are looking for an SDN ready solution Need a simple, scalable design that can support future growth Want an industry-leading or future-proof technology and architecture SummaryThe Compass Metro Fabric design meets the criteria identified for compass designs# Simple# based on Segment Routing as unified forwarding plane and EVPN and L3VPN as a common BGP based control plane Programmable# it uses SR-PCE to program end-to-end paths across the network with guaranteed SLAs Automatable# service provisioning is fully automated using NSO and YANG models; analytics with model driven telemetry in conjunction with automation tools will be used in the future to enhance operations and network and services optimization Repeatable# it’s an evolution of the Unified MPLS architecture and based on standard protocols Technical OverviewThe Compass Metro Fabric design evolves from the successful Cisco Evolved Programmable Network (EPN) 5.0 architecture framework to bring greater programmability and automation.In the Compass Metro Fabric design, the transport and service are built on-demand when the customer service is requested.
The end-to-end inter-domain network path is programmed through controllers and selected based on the customer SLA, such as the need for a low latency path. The Compass Metro Fabric is made of the following main building blocks: IOS-XR as a common Operating System proven in Service Provider Networks; a Transport Layer based on Segment Routing as Unified Forwarding Plane; SDN – Segment Routing Path Computation Element (SR-PCE) as Cisco Path Computation Engine (PCE), coupled with Segment Routing to provide simple and scalable inter-domain transport connectivity, Traffic Engineering, and Path control; a Service Layer for Layer 2 (EVPN) and Layer 3 VPN services based on BGP as Unified Control Plane; Automation and Analytics – NSO for service provisioning, Netconf/YANG data models, Telemetry to enhance and simplify operations, and Zero Touch Provisioning and Deployment (ZTP/ZTD). By leveraging analytics collected through model driven telemetry on IOS-XR platforms, in conjunction with automation tools, Compass Metro Fabric provides Service Providers with enhancements in network and services operations experience. Transport – Design Use Cases Service Provider networks must adopt a very flexible design that satisfies any-to-any connectivity requirements, without compromising on stability and availability. Moreover, transport programmability is essential to bring SLA awareness into the network. The goal of the Compass Metro Fabric is to provide a flexible network blueprint that can be easily customized to meet customer specific requirements. To provide unlimited network scale, the Compass Metro Fabric is structured into multiple IGP Domains: Access, Aggregation, and Core. Refer to the network topology in Figure 1. Figure 1: Distributed Central Office The network diagram in Figure 2 shows how a Service Provider network can be simplified by decreasing the number of IGP domains. In this scenario the Core domain is extended over the Aggregation domain, thus increasing the number of nodes in the Core. Figure 2: Distributed Central Office with Core domain extension A similar approach is shown in Figure 3. In this scenario the Core domain remains unaltered and the Access domain is extended over the Aggregation domain, thus increasing the number of nodes in the Access domain. Figure 3: Distributed Central Office with Access domain extension The Compass Metro Fabric transport design supports all three network options, while remaining easily customizable. The first phase of the Compass Metro Fabric, discussed later in this document, will cover in depth the scenario described in Figure 3. Intra-Domain Intra-Domain Routing and Forwarding The Compass Metro Fabric is based on a fully programmable transport that satisfies the requirements described earlier. The foundation technology used in the transport design is Segment Routing (SR) with an MPLS-based Data Plane in Phase 1 and an IPv6-based Data Plane (SRv6) in the future. Segment Routing dramatically reduces the number of protocols needed in a Service Provider Network. Simple extensions to traditional IGP protocols like ISIS or OSPF provide full Intra-Domain Routing and Forwarding Information over a label switched infrastructure, along with High Availability (HA) and Fast Re-Route (FRR) capabilities. Segment Routing defines the following routing related concepts: Prefix-SID – A node identifier that must be unique for each node in an IGP Domain. The Prefix-SID is statically allocated by the network operator. Adjacency-SID – A node’s link identifier that must be unique for each link belonging to the same node.
Adjacency-SID is typicallydynamically allocated by the node, but can also be staticallyallocated. In the case of Segment Routing with a MPLS Data Plane, both Prefix-SIDand Adjacency-SID are represented by the MPLS label and both areadvertised by the IGP protocol. This IGP extension eliminates the needto use LDP or RSVP protocol to exchange MPLS labels.The Compass Metro Fabric design uses ISIS as the IGP protocol.Intra-Domain Forwarding - Fast Re-RouteSegment-Routing embeds a simple Fast Re-Route (FRR) mechanism known asTopology Independent Loop Free Alternate (TI-LFA).TI-LFA provides sub 50ms convergence for link and node protection.TI-LFA is completely Stateless and does not require any additionalsignaling mechanism as each node in the IGP Domain calculates a primaryand a backup path automatically and independently based on the IGPtopology. After the TI-LFA feature is enabled, no further care isexpected from the network operator to ensure fast network recovery fromfailures. This is in stark contrast with traditional MPLS-FRR, whichrequires RSVP and RSVP-TE and therefore adds complexity in the transportdesign.Please refer also to the Area Border Router Fast Re-Route covered inSection# “Inter-Domain Forwarding - High Availability and Fast Re-Route” for additional details.Inter-DomainInter-Domain ForwardingThe Compass Metro Fabric achieves network scale by IGP domainseparation. Each IGP domain is represented by separate IGP process onthe Area Border Routers (ABRs).Section# “Intra-Domain Routing and Forwarding” described basic Segment Routing concepts# Prefix-SID andAdjacency-SID. This section introduces the concept of Anycast SID.Segment Routing allows multiple nodes to share the same Prefix-SID,which is then called a “Anycast” Prefix-SID or Anycast-SID. Additionalsignaling protocols are not required, as the network operator simplyallocates the same Prefix SID (thus a Anycast-SID) to a pair of nodestypically acting as ABRs.Figure 4 shows two sets of ABRs# Aggregation ABRs – AG Provider Edge ABRs – PE Figure 4# IGP Domains - ABRs Anycast-SIDFigure 5 shows the End-To-End Stack of SIDs for packets traveling fromleft to right through thenetwork.Figure 5# Inter-Domain LSP – SRTE PolicyThe End-To-End Inter-Domain Label Switched Path (LSP) was computed viaSegment Routing Traffic Engineering (SRTE) Policies.On the Access router “A” the SRTE Policy imposes# Local Aggregation Area Border Routers Anycast-SID# Local-AGAnycast-SID Local Provider Edge Area Border Routers Anycast-SID# Local-PEAnycast SID Remote Provider Edge Area Border Routers Anycast-SID# Remote-PEAnycast-SID Remote Aggregation Area Border Routers Anycast-SID# Remote-AGAnycast-SID Remote/Destination Access Router# Destination-A Prefix-SID#Destination-A Prefix-SID The SRTE Policy is programmed on the Access device on-demand by anexternal Controller and does not require any state to be signaledthroughout the rest of the network. The SRTE Policy provides, by simpleSID stacking (SID-List), an elegant and robust way to programInter-Domain LSPs without requiring additional protocols such as BGP-LU(RFC3107).Please refer to Section# “Transport Programmability” for additional details.Area Border Routers – Prefix-SID vs Anycast-SIDSection# “Inter-Domain Forwarding” showed the use of Anycast-SID at the ABRs for theprovisioning of an Access to Access End-To-End LSP. When the LSP is setup between the Access Router and the AG/PE ABRs, there are two options# ABRs are represented by Anycast-SID; or Each ABR is represented by a unique Prefix-SID. 
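As a concrete illustration of these building blocks, the sketch below shows one way the IGP portion could look on an IOS-XR node: IS-IS with the Segment Routing extensions, a node Prefix-SID on Loopback0, an Anycast-SID shared by an ABR pair on Loopback1, and TI-LFA enabled on a core-facing link. All names, addresses, SID values and interfaces here are illustrative assumptions rather than the validated design configuration, and exact syntax varies by IOS-XR release.

```
router isis ACCESS
 is-type level-2-only
 net 49.0001.0000.0000.0001.00
 address-family ipv4 unicast
  metric-style wide
  segment-routing mpls
 !
 ! Node Prefix-SID: unique per node in the IGP domain
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid absolute 16001
  !
 !
 ! Anycast-SID: the same loopback address and SID are configured on
 ! both ABRs of the pair (n-flag-clear marks it as a non-node SID)
 interface Loopback1
  address-family ipv4 unicast
   prefix-sid absolute 16100 n-flag-clear
  !
 !
 interface TenGigE0/0/0/0
  point-to-point
  address-family ipv4 unicast
   ! TI-LFA computes the backup path automatically
   fast-reroute per-prefix
   fast-reroute per-prefix ti-lfa
  !
 !
!
```

With a configuration along these lines, a router steering traffic toward the ABR pair can simply push the shared Anycast-SID label, and whichever ABR is reachable absorbs the traffic, which is what the FRR and high-availability behavior described below relies on.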
Choosing between Anycast-SID or Prefix-SID depends on the requested service. Please refer to Section: “Services - Design”. Note that both options can be combined on the same network. Inter-Domain Forwarding - Label Stack Optimization Section: “Inter-Domain Forwarding” described how the SRTE Policy uses SID stacking (SID-List) to define the Inter-Domain End-To-End LSP. The SID-List has to be optimized to be able to support different HW capabilities on different service termination platforms, while retaining all the benefits of a clear, simple and robust design. Figure 6 shows the optimization in detail. Figure 6: Label Stack Optimization The Anycast-SIDs and the Anycast Loopback IP address of all PE ABRs in the network are redistributed into the Aggregation IGP Domain by the local PE ABRs. By doing this, all nodes in an Aggregation IGP Domain know, via IGP, the Anycast-SID of all PE ABRs in the network. Local AG ABRs then redistribute the Anycast-SIDs and Anycast Loopback IP address of all PE ABRs into the Access IGP Domain. By doing this, all nodes in an Access IGP Domain also know, via IGP, the Anycast-SID of all PE ABRs in the network. It is very important to note that this redistribution is asymmetric, thus it won’t cause any L3 routing loop in the network. Another important fact to consider is that there is only a limited number of PEs in a Service Provider Network, therefore the redistribution does not affect scalability in the Access IGP Domain. After Label Stack Optimization, the SRTE Policy on the Access router imposes: Remote Provider Edge Area Border Routers Anycast-SID: Remote-PE Anycast-SID; Remote Aggregation Area Border Routers Anycast-SID: Remote-AG Anycast-SID; Remote/Destination Access Router Prefix-SID: Destination-A Prefix-SID. Because of the Label Stack Optimization, the total number of SIDs required for the Inter-Domain LSP is reduced to 3 instead of the original 5. The Label Stack Optimization mechanism is very similar when an ABR is represented by a Prefix-SID instead of an Anycast-SID. The Prefix-SID and the unicast Loopback IP address are redistributed into the Aggregation IGP Domain by the local PE ABRs. By doing this, all nodes in the Aggregation IGP Domain know, via IGP, the Prefix-SID of all PE ABRs in the network. Local AG ABRs then redistribute the learned Prefix-SIDs and unicast Loopback IP address of all PE ABRs to the Access IGP Domain. By doing this, all nodes in an Access IGP Domain know, via IGP, the Prefix-SID of all PE ABRs in the network. Both Anycast-SID and Prefix-SID can be combined in the same network with or without Label Stack Optimization. Inter-Domain Forwarding - High Availability and Fast Re-Route AG/PE ABR redundancy enables high availability for Inter-Domain Forwarding. Figure 7: IGP Domains - ABRs Anycast-SID When Anycast-SID is used to represent AG or PE ABRs, no other mechanism is needed for Fast Re-Route (FRR). Each IGP Domain provides FRR independently by TI-LFA as described in Section: “Intra-Domain Forwarding - Fast Re-Route”. Figure 8 shows how FRR is achieved for an Inter-Domain LSP. Figure 8: Inter-Domain - FRR The access router on the left imposes the Anycast-SID of the ABRs and the Prefix-SID of the destination access router. For FRR, any router in IGP1, including the Access router, looks at the top label: “ABR Anycast-SID”. For this label, each device maintains a primary and backup path preprogrammed in the HW. In IGP2, the top label is “Destination-A”. For this label, each node in IGP2 has primary and backup paths preprogrammed in the HW.
The backup paths are computed by TI-LFA. As Inter-Domain forwarding is achieved via SRTE Policies, FRR is completely self-contained and does not require any additional protocol. Note that when traditional BGP-LU is used for Inter-Domain forwarding, BGP-PIC is also required for FRR. Inter-Domain LSPs provisioned by SRTE Policy are protected by FRR also in case of ABR failure (because of Anycast-SID). This is not possible with BGP-LU/BGP-PIC, since BGP-LU/BGP-PIC have to wait for the IGP to converge first. Transport Programmability Figure 9 and Figure 10 show the design of Router-Reflectors (RR), Segment Routing Path Computation Element (SR-PCE) and WAN Automation Engines (WAE). High-Availability is achieved by device redundancy in the Aggregation and Core networks. Figure 9: Transport Programmability – PCEP RRs collect network topology from ABRs through BGP Link State (BGP-LS). Each ABR has a BGP-LS session with the two Domain RRs. Aggregation Domain RRs collect network topology information from the Access and the Aggregation IGP Domain (Aggregation ABRs are part of the Access and the Aggregation IGP Domain). Core Domain RRs collect network topology information from the Core IGP Domain. Aggregation Domain RRs have BGP-LS sessions with Core RRs. Through the Core RRs, the Aggregation Domain RRs advertise local Aggregation and Access IGP topologies and receive the network topologies of the remote Access and Aggregation IGP Domains as well as the network topology of the Core IGP Domain. Hence, each RR maintains the overall network topology in BGP-LS. Redundant Domain SR-PCEs have BGP-LS sessions with the local Domain RRs through which they receive the overall network topology. Refer to Section: “Segment Routing Path Computation Element (SR-PCE)” for more details about SR-PCE. SR-PCE is then capable of computing the Inter-Domain LSP path on-demand and of instantiating it. The computed path (SID-List) is then advertised via the Path Computation Element Protocol (PCEP), as shown in Figure 9, or BGP-SRTE, as shown in Figure 10, to the Service End Points. In the case of PCEP, SR-PCEs and Service End Points communicate directly, while for BGP-SRTE, they communicate via RRs. Phase 1 uses PCEP only. The Service End Points program the SID-List via SRTE Policy. Service End Points can be co-located with the Access Routers for Flat Services or at the ABRs for Hierarchical Services. The SRTE Policy Data Plane in the case of a Service End Point co-located with the Access router was described in Figure 5. The WAN Automation Engine (WAE) provides bandwidth optimization. Figure 10: Transport Programmability – BGP-SRTE The proposed design is very scalable and can be easily extended to support even higher numbers of BGP-SRTE/PCEP sessions by adding additional RRs and SR-PCEs into the Access Domain. Figure 11 shows the Compass Metro Fabric physical topology with examples of product placement. Figure 11: Compass Metro Fabric – Physical Topology with transport programmability Note that the design of the Central Office is not covered by this document. Traffic Engineering (Tactical Steering) – SRTE Policy Operators want to fully monetize their network infrastructure by offering differentiated services. Traffic engineering is used to provide different paths (optimized based on diverse constraints, such as low-latency or disjoint paths) for different applications. The traditional RSVP-TE mechanism requires signaling along the path for tunnel setup or teardown, and all nodes in the path need to maintain state.
This approach doesn’t work well for cloud applications, whichhave hyper scale and elasticity requirements.Segment Routing provides a simple and scalable way of defining anend-to-end application-aware traffic engineering path computed onceagain through SRTE Policy.In the Compass Metro Fabric design, the Service End Point uses PCEP orBGP-SRTE (Phase 1 uses PCEP only) along with Segment Routing On-DemandNext-hop (SR-ODN) capability, to request from the controller a path thatsatisfies specific constraints (such as low latency). This is done byassociating an SLA tag/attribute to the path request. Upon receiving therequest, the SR-PCE controller calculates the path based on the requestedSLA, and uses PCEP or BGP-SRTE to dynamically program the ingress nodewith a specific SRTE Policy.The Compass Metro Fabric design also uses MPLS Performance Management tomonitor link delay/jitter/drop (RFC6374).Transport Controller Path Computation Engine (PCE)Segment Routing Path Computation Element (SR-PCE)Segment Routing Path Computation Element, or SR-PCE, is a Cisco Path Computation Engine(PCE) and it is implemented as a feature included as part of CiscoIOS-XR operating system. The function is typically deployed on a CiscoIOS-XR cloud appliance XRv9000, as it involves control plane operationsonly. The SR-PCE gains network topology awareness from BGP-LSadvertisements received from the underlying network. Such knowledge isleveraged by the embedded multi-domain computation engine to provideoptimal path to Path Computation Element Clients (PCCs) using the PathComputation Element Protocol (PCEP) or BGP-SRTE.The PCC is the device where the service originates and therefore itrequires end-to-end connectivity over the segment routing enabledmulti-domain network.The SR-PCE provides a path based on constraints such as# Shortest path (IGP metrics). Traffic-Engineering metrics. Disjoint path.
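To make the SR-PCE interaction more concrete, the fragment below sketches (with illustrative addresses and AS numbers, not the validated configuration) how an ABR might export its IGP topology into BGP-LS toward a domain RR, and how an access router might be set up as a PCC with PCEP sessions to a redundant pair of SR-PCEs.

```
! ABR: feed the IS-IS topology into BGP-LS toward the domain RR
router isis ACCESS
 distribute link-state instance-id 101
!
router bgp 100
 address-family link-state link-state
 !
 neighbor 10.0.0.10
  remote-as 100
  update-source Loopback0
  address-family link-state link-state
  !
 !
!
! SR-PCE node (e.g. an XRv9000 appliance): enable the PCE function
pce
 address ipv4 10.0.0.100
!
! Access router acting as PCC: PCEP sessions to both SR-PCEs
segment-routing
 traffic-eng
  pcc
   source-address ipv4 10.1.1.1
   pce address ipv4 10.0.0.100
   !
   pce address ipv4 10.0.0.101
   !
  !
 !
!
```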
Figure 12: XR Transport Controller – Components WAN Automation Engine (WAE) WAE Automation combines the smart data collection, modeling, and predictive analytics of Cisco WAE Planning with an extensible, API-driven configuration platform. The use of open APIs and standardized protocols provides a means for intelligent interaction between applications and the network. Applications have visibility into the global network and can make requests for specific service levels. Section: “PCE Controller Summary - SR-PCE & WAE” compares SR-PCE and WAE. PCE Controller Summary – SR-PCE & WAE Segment Routing Path Computation Element (SR-PCE): runs as a feature in an IOS-XR node; collects topology from BGP, ISIS, OSPF and BGP Link State; deploys tunnels via PCEP (SR/RSVP) and BGP SR-TE; computes Shortest, Disjoint, Low Latency, and Avoidance paths; North Bound interface with applications via REST API. WAN Automation Engine (WAE): runs as an SR-PCE application; collects topology via SR-PCE; collects BW utilization via Flexible NetFlow (FNF), Streaming Telemetry, SNMP; deploys tunnels via SR-PCE (preferred: stateful) or NSO (optional: stateless); computes Bandwidth Optimization and On-Demand BW. Path Computation Engine – Workflow There are three models available to program transport LSPs: Delegated Computation to SR-PCE, WAE Instantiated LSP, and Delegated Computation to WAE. All models assume SR-PCE has acquired the full network topology through BGP-LS. Figure 13: PCE Path Computation Delegated Computation to SR-PCE: 1. NSO provisions the service (alternatively, the service can be provisioned via CLI); 2. Access Router requests a path; 3. SR-PCE computes the path; 4. SR-PCE provides the path to the Access Router; 5. Access Router acknowledges; 6. (Optional) When WAE is deployed for LSP visibility, SR-PCE updates WAE with the newer LSP. WAE Instantiated LSP: 1. WAE computes the path; 2. WAE sends the computed path to SR-PCE; 3. SR-PCE provides the path to the Access Router; 4. Access Router confirms; 5. SR-PCE updates WAE with the newer LSP. Delegated Computation to WAE: 1. NSO provisions the service (the service can also be provisioned via CLI); 2. Access Router requests a path; 3. SR-PCE delegates computation to WAE; 4. WAE computes the path; 5. WAE sends the computed path to SR-PCE; 6. SR-PCE provides the path to the Access Router; 7. Access Router confirms; 8. SR-PCE updates WAE with the newer LSP. Transport – Segment Routing IPv6 Data Plane (SRv6) The Compass Metro Fabric design will use the Segment Routing IPv6 Data Plane (SRv6) in later phases. SRv6 brings another level of simplification with an IPv6 data plane and with the network programming concept. The network programming concept is the capability to encode a network program in the header of the packet. The program is expressed as a list of segments included in the SRv6 extension header, also called the SRH. Each segment is a 128-bit entity where the first bits identify a router in the network (we call them the locator part of the segment) and the remaining bits identify a function to be executed at that router. The SRv6 network programming IETF draft (draft-filsfils-spring-srv6-network-programming) provides the pseudo-code for various functions that allow the implementation of network services such as TI-LFA FRR, TE for latency or disjointness, VPN, NFV and many other applications. The following diagram, “SRv6 Network Topology”, shows customer IPv6 traffic with destination IPv6 address 6001::1 (the same can be done for IPv4 traffic) encapsulated in SRv6 on router 1, then forwarded via the SRv6 core to router 4 via router 2 or via router 3. Router 4 decapsulates SRv6 and performs an IPv6 destination address lookup in the customer VRF.
In our use case, VRF 100. All the links in the topology except the link between routers 3 and 4 are equal; the link between 3 and 4 has a high cost, but low latency. Figure 14: SRv6 Network Topology Best-Effort Path The following diagram, “SRv6 Best-Effort Path”, shows the best-effort path. Router 1 encapsulates the received IPv6 packets from the customer site of Enterprise100 in an outer IPv6 header. Router 1 sets the destination address of the outer header to the segment of Router 4 which implements the egress PE function related to the VPN of Enterprise100 at Router 4. In this example, this segment is 2001::4:E100. The bits “2001::4” locate the router where the function will be executed. In this example, we assume that Router 4 would advertise 2001::4/112 in the IGP (no new IETF extension is required for this). The bits “E100” identify a specific function to be executed at the router identified by the locator bits, in our example, Router 4. We assume here that Router 4 has instantiated the local function “E100” for the behavior “decap the packet, look up the inner destination address in the VRF of Enterprise100 and forward accordingly”. We also assume that Router 4 has signaled the availability of that function to Router 1 either via BGP or via an SDN controller. Figure 15: SRv6 Best-Effort Path Router 1 receives packets from Enterprise100, then encapsulates these packets in an outer header with a DA leading to Router 4, where the function “decap and lookup in VRF 100” is executed. By the process above, we have eliminated any encapsulation protocol such as LISP, VXLAN, L2TPv3, GRE etc. This is an important simplification. We have not used an SRH in this case because the network program can be encoded with one single segment, and hence we just need the outer IPv6 header. The example above shows IPv6 VPN over IPv6. The same is applicable for an IPv4 overlay (IPv4 VPN) and L2VPN constructs. More details can be found in draft-filsfils-spring-srv6-network-programming. Low-Latency Path The following diagram, “SRv6 Low-Latency Path”, shows the low-latency path. Router 1 encapsulates the customer traffic in an outer IPv6 header with an SRH. In this example, the network program needs two SIDs. The first SID is placed in the DA and implements the low-latency underlay service. The second SID is programmed in the SRH and implements the overlay service. The first SID is 2001::3:C34. 2001::3 is the locator part and leads the packet to Router 3. Once Router 3 gets this packet, it executes the function C34. The function C34 means “update the DA with the next segment and cross-connect the resulting packet to the neighboring Router 4.” We assume here that Router 3 has instantiated the local function “C34” for the behavior “update DA, cross-connect to neighbor 4”. We also assume that Router 3 has signaled the availability of that function to Router 1 via the IGP. With one single SID, the ingress PE encodes in the packet header the underlay SLA service for low-latency.
The packets will use the north path even though it is not the preferred one from an IGP viewpoint. This is achieved by using the explicit routing capability of SR. No state is created in the fabric to deliver this underlay SLA. The second SID implements the overlay service: this is 2001::4:E100, as we saw previously. Figure 16: SRv6 Low-Latency Path The following diagram, “SRv6 Cross-Connect Function”, shows the cross-connect function 2001::3:C34 instantiated by Router 3. This explicit routing function provides a nice and easy way to enforce the requested SRTE Policy without any stateful information, as the network program is in the packet header. Figure 17: SRv6 Cross-Connect Function “Cross-connect” means the function 2001::3:C34 is bound specifically to the red link between Routers 3 and 4, but before Router 3 can cross-connect the packets, the IPv6 header needs to be updated. The SRH can carry multiple segments stored in the segment list, so there is also an index, represented by the Segments Left value in the SRH, which points to the right segment in the segment list. In our example, we have just one segment in the SRH, which is the segment of Router 4. First, Router 3 decrements the Segments Left value by one. In the next step, Router 3 uses the Segments Left value as an index into the segment list to read the next segment. Router 3 sets the new destination address of the outer header to this segment, which is the segment of Router 4. This is exactly the same Router 4 segment as in the first use case, which implements the egress PE function related to the VPN of Enterprise100 at Router 4. In our example, the SRH doesn’t carry other segments, therefore Router 3 can pop the SRH header. We call this operation Penultimate Segment Pop. Finally, Router 3 cross-connects the packet to Router 4 via the red link. Router 4 decapsulates the packet, looks up the inner destination address in the VRF of Enterprise100 and forwards accordingly. The following diagram, “SRv6 Low-Latency Detail”, shows the end-to-end low-latency traffic data-path described in the previous paragraph. Figure 18: SRv6 Low-Latency Detail SRv6 – Inter-Domain Forwarding The next network diagram, “SRv6 Inter-Domain Forwarding”, shows how SRv6 benefits can be used in a Service Provider network for Inter-Domain forwarding. End-To-End forwarding reachability can be easily provided by a summary or default route in the Access IGP Domain pointing to the AG ABRs. Similarly, the Aggregation IGP Domain has a summary or default route pointing to the PE ABRs. The Core IGP domain also has summary routes of the Aggregation/Access IGP Domains pointing to the particular PE ABRs. Figure 19: SRv6 Inter-Domain Forwarding The next network diagram, “SRv6 Inter-Domain Forwarding with SRH”, shows how the SRH can also be used to get Inter-Domain forwarding with Traffic Engineering (Low-Latency Path). Figure 20: SRv6 Inter-Domain Forwarding with SRH Note that an SDN controller is used to fulfill the SLAs required by provisioned services, and the SRTE Policy is programmed into the network based on those SLAs. The basic Segment Routing elements are the same for the MPLS and the IPv6 (SRv6) data planes. SRv6 Conclusion SRv6 brings the following benefits to an SP network: Simplicity – removal of any encapsulation protocol such as GRE, L2TPv3, LISP; Stateless and Scalable – the network program is in the packet header, not as state in the fabric. Services – Design Overview The Compass Metro Fabric Design aims to enable simplification across all layers of a Service Provider network.
Thus, the Compass Metro Fabricservices layer focuses on a converged Control Plane based on BGP.BGP based Services include EVPNs and Traditional L3VPNs (VPNv4/VPNv6).EVPN is a technology initially designed for Ethernet multipoint servicesto provide advanced multi-homing capabilities. By using BGP fordistributing MAC address reachability information over the MPLS network,EVPN brought the same operational and scale characteristics of IP basedVPNs to L2VPNs. Today, beyond DCI and E-LAN applications, the EVPNsolution family provides a common foundation for all Ethernet servicetypes; including E-LINE, E-TREE, as well as data center routing andbridging scenarios. EVPN also provides options to combine L2 and L3services into the same instance.To simplify service deployment, provisioning of all services is fullyautomated using Cisco Network Services Orchestrator (NSO) using (YANG)models and NETCONF. Refer to Section# “Network Services Orchestrator (NSO)”.There are two types of services# End-To-End and Hierarchical. The nexttwo sections describe these two types of services in more detail.Ethernet VPN (EVPN)EVPNs solve two long standing limitations for Ethernet Services inService Provider Networks# Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or withData Center Multi-Homed & All-Active Ethernet AccessFigure 21 demonstrates the greatest limitation of traditional L2Multipoint solutions likeVPLS.Figure 21# EVPN All-Active AccessWhen VPLS runs in the core, loop avoidance requires that PE1/PE2 andPE3/PE4 only provide Single-Active redundancy toward their respectiveCEs. Traditionally, techniques such mLACP or Legacy L2 protocols likeMST, REP, G.8032, etc. were used to provide Single-Active accessredundancy.The same situation occurs with Hierarchical-VPLS (H-VPLS), where theaccess node is responsible for providing Single-Active H-VPLS access byactive and backup spoke pseudowire (PW).All-Active access redundancy models are not deployable as VPLStechnology lacks the capability of preventing L2 loops that derive fromthe forwarding mechanisms employed in the Core for certain categories oftraffic. Broadcast, Unknown-Unicast and Multicast (BUM) traffic sourcedfrom the CE is flooded throughout the VPLS Core and is received by allPEs, which in turn flood it to all attached CEs. 
In our example, PE1 would flood BUM traffic from CE1 to the Core, and PE2 would send it back toward CE1 upon receiving it. EVPN uses BGP-based Control Plane techniques to address this issue and enables Active-Active access redundancy models for either Ethernet or H-EVPN access. Figure 22 shows another issue related to BUM traffic addressed by EVPN. Figure 22: EVPN BUM Duplication In the previous example, we described how BUM is flooded by PEs over the VPLS Core, causing local L2 loops for traffic returning from the core. Another issue is related to BUM flooding over the VPLS Core on remote PEs. In our example, both PE3 and PE4 receive the BUM traffic and send it to their attached CEs, causing CE2 to receive duplicated BUM traffic. EVPN also addresses this second issue, since the BGP Control Plane allows just one PE to send BUM traffic to an All-Active EVPN access. Figure 23 describes the last important EVPN enhancement. Figure 23: EVPN MAC Flip-Flopping In the case of All-Active access, traffic is load-balanced (per-flow) over the access PEs (the CE uses LACP to bundle multiple physical ethernet ports and uses a hash algorithm to achieve per-flow load-balancing). Remote PEs, PE3 and PE4, receive the same flow from different neighbors. With a VPLS core, PE3 and PE4 would rewrite the MAC address table continuously, each time the same MAC address is seen from a different neighbor. EVPN solves this by means of “Aliasing”, which is also signaled via the BGP Control Plane. Service Provider Network - Integration with Central Office or with Data Center Another very important EVPN benefit is the simple integration with the Central Office (CO) or with the Data Center (DC). Note that Metro Central Office design is not covered by this document. The adoption of EVPNs provides huge benefits on how L2 Multipoint technologies can be deployed in the CO/DC. One such benefit is the converged Control Plane (BGP) and converged data plane (SR MPLS/SRv6) over the SP WAN and CO/DC network. Moreover, EVPNs can replace existing proprietary Ethernet Multi-Homed/All-Active solutions with a standard BGP-based Control Plane. End-To-End (Flat) – Services The End-To-End Services use cases are summarized in the table in Figure 24 and shown in the network diagram in Figure 25. Figure 24: End-To-End – Services table Figure 25: End-To-End – Services All service use cases are based on the BGP Control Plane. Refer also to Section: “Transport and Services Integration”. Hierarchical – Services Hierarchical Services Use Cases are summarized in the table of Figure 26 and shown in the network diagram of Figure 27. Figure 26: Hierarchical – Services table Figure 27: Hierarchical - Services Hierarchical services designs are critical for Service Providers looking to limit requirements on the access platforms and to deploy more centralized provisioning models that leverage very rich feature sets on a limited number of touch points. Hierarchical Services can also be required by Service Providers who want to integrate their SP-WAN with the Central Office/Data Center network using well-established designs based on Data Center Interconnect (DCI). Figure 27 shows hierarchical services deployed on PE routers, but the same design applies when services are deployed on AG or DCI routers. The Compass Metro Design offers scalable hierarchical services with simplified provisioning; a brief EVPN configuration sketch is included below for reference.
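A minimal sketch of the EVPN building blocks discussed above (an all-active Ethernet Segment on a multi-homed bundle plus the associated EVPN bridge domain) might look roughly as follows on an IOS-XR PE. The ESI, EVI, bridge-group and interface values are illustrative assumptions only, not the validated service configuration.

```
evpn
 evi 100
  advertise-mac
  !
 !
 ! Both PEs attached to the same CE configure the same ESI, which is
 ! what enables the multi-homing, aliasing and BUM handling described above
 interface Bundle-Ether1
  ethernet-segment
   identifier type 0 00.11.22.33.44.55.66.77.88
  !
 !
!
l2vpn
 bridge group METRO
  bridge-domain CUST-100
   interface Bundle-Ether1.100
   !
   evi 100
   !
  !
 !
!
```

In such a setup the CE simply runs one LACP bundle toward both PEs, and the BGP EVPN control plane, rather than MST/REP/G.8032 or mLACP, takes care of loop avoidance and designated forwarding, which is the simplification the preceding sections describe.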
The three most important use cases aredescribed in the following sections# Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service(H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) andPWHE Hierarchical L2 Multipoint Multi-Homed/All-ActiveFigure 28 shows a very elegant way to take advantage of the benefits ofSegment-Routing Anycast-SID and EVPN. This use case providesHierarchical L2 Multipoint Multi-Homed/All-Active (Single-Homed Ethernetaccess) service with traditional access routerintegration.Figure 28# Hierarchical – Services (Anycast-PW)Access Router A1 establishes a Single-Active static pseudowire(Anycast-Static-PW) to the Anycast IP address of PE1/PE2. PEs anycast IPaddress is represented by Anycast-SID.Access Router A1 doesn’t need to establish active/backup PWs as in atraditional H-VPLS design and doesn’t need any enhancement on top of theestablished spoke pseudowire design.PE1 and PE2 use BGP EVPN Control Plane to provide Multi-Homed/All-Activeaccess, protecting from L2 loop, and providing efficient per-flowload-balancing (with aliasing) toward the remote PEs (PE3/PE4).A3, PE3 and PE4 do the same, respectively.Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRBFigure 29 shows how EVPNs can completely replace the traditional H-VPLSsolution. This use case provides the greatest flexibility asHierarchical L2 Multi/Single-Home, All/Single-Active modes are availableat each layer of the servicehierarchy.Figure 29# Hierarchical – Services (H-EVPN)Optionally, Anycast-IRB can be used to enable Hierarchical L2/L3Multi/Single-Home, All/Single-Active service and to provide optimal L3routing.Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHEFigure 30 shows how the previous H-EVPN can be extended by takingadvantage of Pseudowire Headend (PWHE). PWHE with the combination ofMulti-Homed, Single-Active EVPN provides an Hierarchical L2/L3Multi-Homed/Single-Active (H-EVPN) solution that supports QoS.It completely replaces traditional H-VPLS based solutions. This use caseprovides Hierarchical L2 Multi/Single-Home, All/Single-Activeservice.Figure 30# Hierarchical – Services (H-EVPN and PWHE)Refer also to the section# “Transport and Services Integration”.Services – Router-Reflector (S-RR)Figure 31 shows the design of Services Router-Reflectors(S-RRs).Figure 31# Services – Router-ReflectorsThe Compass Metro Fabric Design focuses mainly on BGP-based services,therefore it is important to provide a robust and scalable ServicesRoute-Reflector (S-RR) design.For Redundancy reasons, there are at least 2 S-RRs in any given IGPDomain, although Access and Aggregation are supported by the same pairof S-RRs.Each node participating in BGP-based service termination has two BGPsessions with Domain Specific S-RRs and supports multipleaddress-Families# VPNv4, VPNv6, EVPN.Core Domain S-RRs cover the core Domain. Aggregation Domain S-RRs coverAccess and Aggregation Domains. Aggregation Domain S-RRs and Core S-RRshave BGP sessions among each other.The described solution is very scalable and can be easily extended toscale to higher numbers of BGP sessions by adding another pair of S-RRsin the Access Domain.Network Services Orchestrator (NSO)The NSO is a management and orchestration (MANO) solution for networkservices and Network Functions Virtualization (NFV). 
The NSO includes capabilities for describing, deploying, configuring, and managing network services and VNFs, as well as configuring the multi-vendor physical underlay network elements with the help of standard open APIs such as NETCONF/YANG or a vendor-specific CLI using Network Element Drivers (NED). In the Compass Metro Fabric design, the NSO is used for Services Management, Service Provisioning, and Service Orchestration. The NSO provides several options for service design, as shown in Figure 32: Service model with service template; Service model with mapping logic; Service model with mapping logic and service templates. Figure 32: NSO – Components A service model is a way of defining a service in a template format. Once the service is defined, the service model accepts user inputs for the actual provisioning of the service. For example, an E-Line service requires two endpoints and a unique virtual circuit ID to enable the service. The end devices, attachment circuit UNI interfaces, and a circuit ID are required parameters that should be provided by the user to bring up the E-Line service. The service model uses the YANG modeling language (RFC 6020) inside NSO to define a service. Once the service characteristics are defined based on the requirements, the next step is to build the mapping logic in NSO to extract the user inputs. The mapping logic can be implemented using Python or Java. The purpose of the mapping logic is to transform the service models to device models. It includes mechanisms for how service-related operations are reflected on the actual devices. This involves mapping a service operation to available operations on the devices. Finally, service templates need to be created in XML for each device type. In NSO, the service templates are required to translate the service logic into the final device configuration through a CLI NED. The NSO can also directly use the device YANG models using NETCONF for device configuration. These service templates enable NSO to operate in a multi-vendor environment. Transport and Services Integration Section: “Transport - Design” described how Segment Routing provides flexible End-To-End and Any-To-Any Highly-Available transport together with Fast Re-Route. A converged BGP Control Plane provides a scalable and flexible solution also at the services layer. Figure 33 shows a consolidated view of the Compass Metro Fabric network from a Control-Plane standpoint. Note that while network operators could use both PCEP and BGP-SRTE at the same time, it is not typical. Figure 33: Compass Metro Fabric – Control-Plane As mentioned, service provisioning is independent of the transport layer. However, transport is responsible for providing the path based on service requirements (SLA). The component that enables such integration is On-Demand Next Hop (ODN). ODN is the capability of requesting from a controller a path that satisfies specific constraints (such as low latency). This is achieved by associating an SLA tag/attribute to the path request.
Upon receiving the request, the SR-PCE controller calculatesthe path based on the requested SLA and use PCEP or BGP-SRTE todynamically program the Service End Point with a specific SRTE Policy.The Compass Metro Fabric design also use MPLS Performance Management tomonitor link delay/jitter/drop (RFC6374) to be able to create a LowLatency topology dynamically.Figure 34 shows a consolidated view of Compass Metro Fabric network froma Data Planestandpoint.Figure 34# Compass Metro Fabric – Data-PlaneThe Compass Metro Fabric Design – Phase 1Transport - Phase 1This section describes in detail Phase 1 of the Compass Metro Fabricdesign. This Phase focuses on transport programmability and BGP-basedservices adoption.Figure 35 and Figure 36 show the network topology and transport DataPlane details for Phase 1. Refer also to the Access domain extension usecase in Section# “Use Cases”.The network is split into Access and Core IGP domains. Each IGP domainis represented by separate IGP processes. The Compass Metro Fabricdesign uses ISIS IGP protocol for validation.Validation will be done on two types of access platforms, IOS-XR andIOS-XE, to proveinteroperability.Figure 35# Access Domain Extension – End-To-End TransportFor the End-To-End LSP shown in Figure 35, the Access Router imposes 3transport labels (SID-list) An additional label, the TI-LFA label, canbe also added for FRR (node and link protection). In the Core and in theremote Access IGP Domain, 2 additional TI-LFA labels can be used for FRR(node and link protection). In Phase 1 PE ABRs are represented byPrefix-SID. Refer also to Section# “Transport Programmability - Phase 1”.Figure 36# Access Domain Extension – Hierarchical TransportFigure 36 shows how the Access Router imposes a single transport labelto reach local PE ABRs, where the hierarchical service is terminated.Similarly, in the Core and in the remote Access IGP domain, thetransport LSP is contained within the same IGP domain (Intra-DomainLSP). Routers in each IGP domain can also impose two additional TI-LFAlabels for FRR (to provide node and link protection).In the Hierarchical transport use case, PE ABRs are represented byAnycast-SID or Prefix-SID. Depending on the type of service, Anycast-SIDor Prefix-SID is used for the transport LSP.Transport Programmability – Phase 1The Compass Metro Fabric employs a distributed and highly available SR-PCEdesign as described in Section# “Transport Programmability”. Transport programmability is basedon PCEP. Figure 37 shows the design when SR-PCE uses PCEP.Figure 37# SR-PCE – PCEPSR-PCE in the Access domain is responsible for Inter-Domain LSPs andprovides the SID-list. PE ABRs are represented by Prefix-SID.SR-PCE in the Core domain is responsible for On-Demand Nexthop (ODN) forhierarchical services. Refer to the table in Figure 41 to see whatservices use ODN. Refer to Section# “Transport Controller - Path Computation Engine (PCE)” to see more details about XRTransport Controller (SR-PCE). 
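As an illustration of how ODN ties a service SLA to the transport, the sketch below shows one way an on-demand color for a low-latency path and the matching BGP color community could be expressed on an IOS-XR node. The color value, set and policy names, and the use of the latency metric are illustrative assumptions and not the validated Phase 1 configuration.

```
! SR-TE: instantiate an on-demand policy for any service route
! received with color 100, delegating path computation via PCEP
segment-routing
 traffic-eng
  on-demand color 100
   dynamic
    pcep
    !
    metric
     type latency
    !
   !
  !
 !
!
! Mark the relevant service routes with color 100 at the egress PE
extcommunity-set opaque COLOR-LOW-LATENCY
 100
end-set
!
route-policy SET-LOW-LATENCY
 set extcommunity color COLOR-LOW-LATENCY
 pass
end-policy
```

The route-policy would then be attached to the appropriate BGP service address-family so that, when the ingress Service End Point receives a route carrying color 100, it requests an SRTE Policy toward that next hop satisfying the low-latency constraint.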
Note that Phase 1 uses the “DelegatedComputation to SR-PCE” mode described in Section# “Path Computation Engine - Workflow” without WAE as shownin Figure38.Figure 38# PCE Path Computation – Phase 1Delegated Computation to SR-PCE NSO provisions the service – Service can also be provisioned via CLI Access Router requests a path SR-PCE computes the path SR-PCE provides the path to Access Router Access Router confirms Services – Phase 1This section describes the Services used in the Compass Metro FabricPhase 1.The table in Figure 39 describes the End-To-End services, while thenetwork diagram in Figure 40 shows how services are deployed in thenetwork. Refer also to Section# “Services - Design” of this document.Figure 39# End-To-End Services tableFigure 40# End-To-End ServicesThe table in Figure 41 describes the hierarchical services, while thenetwork diagram in Figure 42 shows how services are deployed in thenetwork. Refer also to Section# “Services - Design” of this document.In addition, the table in Figure 41 shows where PE ABRs Anycast-SID isrequired and where ODN in the Core IGP domain is used.Figure 41# Hierarchical Services tableFigure 42# Hierarchical ServicesThe Compass Metro Fabric uses the hierarchical Services Route-Reflectors(S-RRs) design described in Section# “Services - Router-Reflector (S-RR)”. Figure 43 shows in detail the S-RRs design used for Phase 1.Figure 43# Services Route-Reflectors (S-RRs)Network Services Orchestrator (NSO) is used for service provisioning.Refer to Section# “Network Services Orchestrator (NSO)”.Transport and Services Integration – Phase 1Transport and Services integration is described in Section# “Transport and Services Integration” of this document. Figure 44 shows an example of End-To-End LSP and servicesintegration in Phase 1.Figure 44# Transport and Services Data-PlaneFigure 45 shows a consolidated view of the Transport and ServicesControl-Plane.Figure 45# Transport and Services Control-PlaneFigure 46 shows the physical topology of the testbed used for Phase 1validation.Figure 46# Testbed – Phase 1The Compass Metro Fabric Design - SummaryThe Compass Metro Fabric brings huge simplification at the Transport aswell as at the Services layers of a Service Provider network.Simplification is a key factor for real Software Defined Networking(SDN). Cisco continuously improves Service Provider network designs tosatisfy market needs for scalability and flexibility.From a very well established and robust Unified MPLS design, Cisco hasembarked on a journey toward transport simplification andprogrammability, which started with the Transport Control Planeunification in Evolved Programmable Network 5.0 (EPN5.0). The CiscoMetro Fabric provides another huge leap forward in simplification andprogrammability adding Services Control Plane unification andcentralized path computation.Figure 47# Compass Metro Fabric – EvolutionThe transport layer requires only IGP protocols with Segment Routingextensions for Intra and Inter Domain forwarding. Fast recovery for nodeand link failures leverages Fast Re-Route (FRR) by Topology IndependentLoop Free Alternate (TI-LFA), which is a built-in function of SegmentRouting. End to End LSPs are built using Traffic Engineering by SegmentRouting, which does not require additional signaling protocols. Insteadit solely relies on SDN controllers, thus increasing overall networkscalability. 
The controller layer is based on standard industryprotocols like BGP-LS, PCEP, BGP-SRTE, etc., for path computation andNETCONF/YANG for service provisioning, thus providing a on openstandards based solution.For all those reasons, the Cisco Metro Fabric design really brings anexciting evolution in Service Provider Networking.", "url": "/blogs/2018-04-30-metro-fabric-hld/", "author": "Jiri Chaloupka", "tags": "iosxr, Metro, Design" } , "blogs-2018-05-08-peering-fabric-hld": { "title": "Peering Fabric Design", "content": " On This Page Revision History Key Drivers Traffic Growth Network Simplification Network Efficiency High-Level Design Peering Strategy Topology and Peer Distribution Platforms Control-Plane Telemetry Automation Validated Design Peering Fabric Use Cases Traditional IXP Peering Migration to Peering Fabric Peering Fabric Extension Localized Metro Peering and Content Delivery Express Peering Fabric Datacenter Edge Peering Peer Traffic Engineering with Segment Routing Low-Level Design Integrated Peering Fabric Reference Diagram Distributed Peering Fabric Reference Diagram Peering Fabric Hardware Detail NCS-5501-SE NCS-55A1-36H-SE NCS-55A1-24H NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card Peer Termination Strategy Distributed Fabric Device Roles PFL – Peering Fabric Leaf PFS – Peering Fabric Spine Device Interconnection Capacity Scaling Peering Fabric Control Plane PFL to Peer PFL to PFS PFS to Core SR Peer Traffic Engineering Summary Nodal EPE Peer Interface EPE Abstract Peering Peering Fabric Telemetry Telemetry Diagram Model-Driven Telemetry BGP Monitoring Protocol Netflow / IPFIX Automation and Programmability Netconf YANG Model Support Cisco NSO Modules 3rd Party Hosted Applications XR Service Layer API Recommended Device and Protocol Configuration Overview Common Node Configuration Enable LLDP Globally PFS Nodes IGP Configuration Segment Routing Traffic Engineering BGP Global Configuration Model-Driven Telemetry Configuration PFL Nodes Peer QoS Policy Peer Infrastructure ACL Peer Interface Configuration IS-IS IGP Configuration BGP Add-Path Route Policy BGP Global Configuration EBGP Peer Configuration PFL to PFS IBGP Configuration Netflow/IPFIX Configuration Model-Driven Telemetry Configuration Abstract Peering Configuration PFS Configuration Security Infrastructure ACLs BCP Implementation BGP Attribute and CoS Scrubbing Per-Peer Control Plane Policers BGP Prefix Security RPKI Origin Validation BGPSEC BGP Flowspec Appendix Applicable YANG Models NETCONF YANG Paths BGP Operational State Global BGP Protocol State BGP Neighbor State Example Usage BGP RIB Data Example Usage Device Resource YANG Paths Validated Model-Driven Telemetry Sensor Paths Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform LLDP Monitoring Interface statistics and state The following sub-paths can be used but it is recommended to use the base openconfig-interfaces model Aggregate bundle information (use interface models for interface counters) BGP Peering information IS-IS IGP information It is not recommended to monitor complete RIB tables using MDT but can be used for troubleshooting QoS and ACL monitoring BGP RIB information It is not recommended to monitor these paths using MDT with large tables Routing policy Information Revision History Version Date Comments 1.0 05/08/2018 Initial Peering Fabric Publication Key DriversTraffic GrowthInternet traffic has seen a compounded annual growth rate of 30% orhigher over the last five years, as more 
devices are connected and more content is consumed, fueled by the demand for video. Traffic will continue to grow as more content sources are added and Internet connection speeds increase. Service and content providers must design their peering networks to scale for a future of more connected devices with traffic sources and destinations spanning the globe. Efficient peering is required to deliver traffic to consumers. Network Simplification Simple networks are easier to build and easier to operate. As networks scale to handle traffic growth, the level of network complexity must remain flat. A prescriptive design using standard discrete components makes it easier for providers to scale from networks handling a small amount of traffic to 10s of Tbps without complete network forklifts. Fabrics with reduced control-plane elements and feature sets enhance stability and availability. Dedicating nodes to specific functions of the network also helps isolate the rest of the network from malicious behavior, defects, or instability. Network Efficiency Network efficiency refers not only to maximizing network resources but also to optimizing the environmental impact of the deployed network. Much of Internet peering today is done in 3rd party facilities where space, power, and cooling are at a premium. High-density, lower environmental footprint devices are critical to handling more traffic without exceeding the capabilities of a facility. In cases where multiple facilities must be connected, a simple and efficient way to extend networks must exist. High-Level Design The Peering design incorporates high-density, environmentally efficient edge routers, a prescriptive topology and peer termination strategy, and features delivered through IOS-XR to solve the needs of service and content providers. Also included as part of the Peering design are ways to monitor the health and operational status of the peering edge and, through Cisco NSO integration, to assist providers in automating peer configuration and validation. All designs are both feature tested and validated as a complete design to ensure stability once implemented. Peering Strategy The design proposes a localized peering strategy to reduce network cost for “eyeball” service providers by placing peering or content provider cache nodes closer to traffic consumers. This not only reduces capacity on long-haul backbone networks carrying traffic from IXPs to end users but also improves the quality of experience for users by reducing latency to content sources. The same design can also be used for content provider networks wishing to deploy a smaller footprint solution in an SP location or 3rd party peering facility. Topology and Peer Distribution The Cisco Peering Fabric introduces two options for fabric topology and peer termination. The first, similar to more traditional peering deployments, collapses the Peer Termination and Core Connectivity network functions into a single physical device using the device’s internal fabric to connect each function. The second option utilizes a fabric separating the network functions into separate physical layers, connected via an external fabric running over standard Ethernet. In many typical SP peering deployments, a traditional two-node setup is used where providers vertically upgrade nodes to support the higher capacity needs of the network. Some may employ technologies such as back-to-back or multi-chassis clusters in order to support more connections while keeping what seems like the operational footprint low.
However, failures and operational issues occurring in these types of systems are typically difficult to troubleshoot and repair. They also require lengthy planning and timeframes for performing system upgrades. We introduce a horizontally scalable distributed peering fabric, with the end result being more deterministic behavior during interface or node failures. Minimizing the loss of peering capacity is very important for both ingress-heavy SPs and egress-heavy content providers. The loss of local peering capacity means traffic must ingress or egress a sub-optimal network port. Making a conscious design decision to spread peer connections, even to the same peer, across multiple edge nodes helps increase resiliency and limit traffic-affecting network events. Platforms The Cisco NCS5500 platform is ideal for edge peer termination, given its high density, large RIB and FIB scale, buffering capability, and IOS-XR software feature set. The NCS5500 is also space and power efficient, with 36x100GE supporting up to 7.5M IPv4 routes in a 1RU fixed form factor or single modular line card. A minimal Peering Fabric deployment can provide 36x100GE, 144x10GE, or a mix of non-blocking peering connections with full resiliency in 4RU. The fabric can also scale to support 10s of terabits of capacity in a single rack for large peering deployments. Fixed chassis are ideal for incrementally building a peering edge fabric; the NC55-36X100G-A-SE and NC55A1-24H are efficient high-density building blocks which can be rapidly deployed as needed without installing a large footprint of devices day one. Deployments needing more capacity or interface flexibility such as IPoDWDM to extend peering can utilize the NCS5504 4-slot or NCS5508 8-slot modular chassis. If the peering location has a need for services termination, the ASR9000 family or XRv-9000 virtual edge node can be incorporated into the fabric. All NCS5500 routers also contain powerful Route Processors to unlock powerful telemetry and programmability. The Peering Fabric fixed chassis contain 1.6GHz 8-core processors and 32GB of RAM. The latest NC55-RP-E for the modular NCS5500 chassis has a 1.9GHz 6-core processor and 32GB of RAM. Control-Plane The peering fabric design introduces a simplified control-plane built upon IPv4/IPv6 with Segment Routing. In the collapsed design, each peering node is connected to EBGP peers and upstream to the core via standard IS-IS, OSPF, and TE protocols, acting as a PE or LER in a provider network. In the distributed design, network functions are separated. Peer Termination happens on Peering Fabric Leaf nodes. Peering Fabric Spine aggregation nodes are responsible for Core Connectivity and perform more advanced LER functions. The PFS routers use ECMP to balance traffic between PFL routers and are responsible for forwarding within the fabric and to the rest of the provider network. Each PFS acts as an LER, incorporated into the control-plane of the core network. The PFS, or alternatively vRRs, reflect learned peer routes from the PFL to the rest of the network. The SR control-plane supports several traffic engineering capabilities. EPE to a specific peer interface, PFL node, or PFS is supported. We also introduce the abstract peering concept, where PFS nodes utilize a next-hop address bound to an anycast SR SID to allow traffic engineering on a per-peering center basis. Telemetry The Peering fabric design uses the rich telemetry available in IOS-XR and the NCS5500 platform to enable an unprecedented level of insight into network and device behavior.
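As one concrete and purely illustrative example, a minimal model-driven telemetry subscription that streams interface counters and BGP neighbor state to an off-box collector could look roughly like the following; the collector address, sample interval and sensor paths shown here are assumptions, and the validated sensor paths are listed later in this document.

```
telemetry model-driven
 destination-group COLLECTOR
  address-family ipv4 192.0.2.10 port 57500
   encoding self-describing-gpb
   protocol grpc no-tls
  !
 !
 sensor-group PEERING-HEALTH
  sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
  sensor-path Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor
 !
 subscription PEERING
  sensor-group-id PEERING-HEALTH sample-interval 30000
  destination-id COLLECTOR
 !
!
```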
The Peering Fabric leverages Model-Driven Telemetry and NETCONF along with both standard and native YANG models for metric statistics collection. Telemetry configuration and applicable sensor paths have been identified to assist providers in knowing what to monitor and how to monitor it. Automation NETCONF and YANG using OpenConfig and native IOS-XR models are used to help automate peer configuration and validation. Cisco has developed specific Peering Fabric NSO service models to help automate common tasks such as peer interface configuration, peer BGP configuration, and adding physical interfaces to an existing peer bundle. Validated Design The design’s control, management, and forwarding planes have undergone validation testing to ensure individual design features work as intended and the peering fabric as a whole performs without fault. Validation is done exceeding real-world scaling requirements to ensure the design fulfills its role in existing networks with room for future growth. Peering Fabric Use Cases Traditional IXP Peering Migration to Peering Fabric A traditional SP IXP design typically uses one or two large modular systems terminating all peering connections. In many cases, since providers are constrained on space and power, they use a collapsed design where the minimal set of peering nodes not only terminates peer connections but also provides services and core connectivity to the location. The Peering Fabric uses best-of-breed high-density, low-footprint hardware requiring much less space than older generation modular systems. Many older systems provide densities at approximately 4x100GE per rack unit, while Peering Fabric PFL nodes start at 24x100GE or 36x100GE per 1RU with high FIB capability. Due to the superior space efficiency, there is no longer a limitation of using just a pair of nodes for these functions. In either a collapsed function or distributed function design, peers can be distributed across a number of devices to increase resiliency and lessen collateral impact when failures occur. The diagram below shows a fully distributed fabric, where peers are now distributed across three PFL nodes, each with full connectivity to upstream PFS nodes. Peering Fabric Extension In some cases, there may be peering facilities within close geographic proximity which need to integrate into a single fabric. This may happen if there are multiple 3rd party facilities in a close geographic area, each with unique peers you want to connect to. There may also be multiple independent peering facilities within a small geographic area into which you do not wish to install a complete peering fabric. In those cases, connecting remote PFL nodes to a larger peering fabric can be done using optical transport or longer range gray optics. Localized Metro Peering and Content Delivery In order to drive greater network efficiency, content sources should be placed as close to the end destination as possible. Traditional wireline and wireless service providers have heavy inbound traffic from content providers delivering OTT video. Providers may also be providing their own IP video services to on-net and off-net destinations via an SP CDN. Peering and internal CDN equipment can be placed within a localized peer or content delivery center, connected via a common peering fabric. In these cases the PFS nodes connect directly to the metro core to enable delivery across the region or metro. Express Peering Fabric An evolution to localized metro peering is to interconnect the PFS peering nodes directly or via a metro-wide peering core.
The main driver for direct interconnection is minimizing the number of router and transport network interfaces traffic must pass through. High-density optical muxponders such as the NCS1002, along with flexible photonic ROADM architectures enabled by the NCS2000, can help make the most efficient use of metro fiber assets.
Datacenter Edge Peering
In order to serve traffic as close to consumer endpoints as possible, a provider may construct a peering edge attached to an edge or central datacenter. As gateway functions in the network become virtualized for applications such as vPE, vCPE, and mobile 5G, the need to attach Internet peering to the SP DC becomes more important. The Peering Fabric supports interconnection to the DC via the SP core or with the PFS nodes acting as leafs to the DC spine. These would act as traditional border routers in the DC design.
Peer Traffic Engineering with Segment Routing
Segment Routing performs efficient source routing of traffic across a provider network. Traffic engineering is particularly applicable to peering as content providers look for ways to optimize egress network ports and eyeball providers work to reduce network hops between ingress and subscriber. There are also a number of advanced use cases based on using constraints, such as latency, to place traffic on optimal paths. An SRTE Policy represents a forwarding entity within the SR domain mapping traffic to a specific network path, defined statically on the node or computed by an external PCE. An additional benefit of SR is the ability to source route traffic based on a node SID or an anycast SID representing a set of nodes. ECMP behavior is preserved at each point in the network, redundancy is simplified, and traffic protection is supplied using TI-LFA. In the Low-Level Design we explore common peer engineering use cases. Much more information on Segment Routing technology and its future evolution can be found at http://segment-routing.net
Low-Level Design
Integrated Peering Fabric Reference Diagram
Distributed Peering Fabric Reference Diagram
Peering Fabric Hardware Detail
The NCS5500 family of routers provides high density, high routing scale, ideal buffer sizes, and environmental efficiency to help providers satisfy any peering fabric use case. Due to high FIB scale, large buffers, and a broad XR feature set, all prescribed hardware can serve in either a collapsed or distributed fabric. Further detailed information on each platform can be found at https://www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html.
NCS-5501-SE
The NCS 5501 is a 1RU fixed router with 40x10GE SFP+ and 4x100GE QSFP28 interfaces. The 5501 has an IPv4 FIB scale of at least 2M routes. The 5501-SE is ideal as a peering leaf node when providers need 10GE interface flexibility such as ER, ZR, or DWDM.
NCS-55A1-36H-SE
The 55A1-36H-SE is a second-generation 1RU NCS5500 fixed platform with 36 100GE QSFP28 ports operating at line rate. The -SE model contains an external TCAM, increasing route scale to a minimum of 3M IPv4/512K IPv6 routes in its FIB. It also contains a powerful multi-core route processor with 64GB of RAM and an on-board 64GB SSD. Its high density, efficiency, and buffering capability make it ideal in 10GE or 100GE deployments. Peering fabrics can scale to much higher capacity 1RU at a time by simply adding additional 55A1-36H-SE spine nodes.
NCS-55A1-24H
The NCS-55A1-24H is a second-generation 1RU NCS5500 fixed platform with 24 100GE QSFP28 ports. The device uses two 900Gbps NPUs, with 12x100GE ports connected to each NPU.
The 55A1-24H uses a high-scale NPU with a minimum of 1.3M IPv4/256K IPv6 routes. At just 675W it is ideal for 10GE peering fabric deployments with a migration path to 100GE connectivity. The 55A1-24H also has a powerful multi-core processor and 32GB of RAM.
NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE Line Card
Very large peering fabric deployments, or those needing interface flexibility such as IPoDWDM connectivity, can use the modular NCS5500 series chassis. Large deployments can utilize the second-generation 36X100G-A-SE line card with external TCAM, supporting a minimum of 3M IPv4 routes.
Peer Termination Strategy
Often overlooked when connecting to Internet peers is determining a strategy to maximize efficiency and resiliency within a local peering instance. Oftentimes a peer is connected to a single peering node, even when two nodes exist, for ease of configuration and coordination with the peering or transit partner. However, with minimal additional configuration and administration, assisted by automation, even single peers can be spread across multiple edge peering nodes. Ideally, within a peering fabric, a peer is connected to each leaf in the fabric. In cases where this cannot be done, the provider should use capacity planning processes to balance peers and transit connections across multiple leafs in the fabric. The added resiliency leads to greater efficiency when failures do happen, with less reliance on peering capacity further away from the traffic destination.
Distributed Fabric Device Roles
PFL – Peering Fabric Leaf
The Peering Fabric Leaf is the node physically connected to external peers. Peers could be aggregation routers or 3rd party CDN nodes. In a deconstructed design the PFL is analogous to a line card in a modular chassis solution. PFL nodes can be added as capacity needs grow.
PFS – Peering Fabric Spine
The Peering Fabric Spine acts as an aggregation node for the PFLs and is also physically connected to the rest of the provider network. The provider network could refer to a metro core in the case of localized peering, a backbone core in relation to IXP peering, or a DC spine layer in the case of DC peering.
Device Interconnection
In order to maximize resiliency in the fabric, each PFL node is connected to each PFS. While the design shown includes three PFLs and two PFS nodes, there could be any number of PFL and PFS nodes, scaling horizontally to keep up with traffic and interface growth. PFL nodes are not connected to each other; the PFS nodes provide the capacity for any traffic between those nodes. The PFS nodes are also not interconnected to each other, as no end device should terminate on the PFS, only other routers.
Capacity Scaling
Capacity of the peering fabric is scaled horizontally. The uplink capacity from PFL to PFS is set by an appropriate oversubscription factor determined by the service provider's capacity planning exercises. The leaf/spine architecture of the fabric connects each PFL to each PFS with equal capacity. In steady-state operation traffic is balanced between the PFS and PFL in both directions, maximizing the total capacity. The entropy in peering traffic generally ensures equal distribution between either ECMP paths or bundle interface member links in the egress direction. More information can be found in the forwarding plane section of the document. An example deployment may have two NC55-36X100G-A-SE spine nodes and two NC55A1-24H leaf nodes. In a 100GE peer deployment scenario each leaf would support 14x100GE client connections and 5x100GE to each spine node.
A 10GE deployment would support 72x10GE client ports and 3x100GE to each spine, at a 1.2:1 oversubscription ratio (720 Gbps of client capacity against 600 Gbps of uplink capacity).
Peering Fabric Control Plane
PFL to Peer
The Peering Fabric Leaf is connected directly to peers via traditional EBGP. BFD may additionally be used for fault detection if agreed to by the peer. Each EBGP peer will utilize SR EPE to enable TE to the peer from elsewhere on the provider network.
PFL to PFS
PFL to Peering Fabric Spine uses widely deployed standard routing protocols. IS-IS is the prescribed IGP protocol within the peering fabric. Each PFS is configured with the same IS-IS L1 area. In the case where OSPF is being used as an IGP, the PFL nodes will reside in an OSPF NSSA area. The peering fabric IGP is SR-enabled, with the loopback of each PFL assigned a globally unique SR Node SID. Each PFL also has an IBGP session to each PFS to distribute its learned EBGP routes upstream and learn routes from elsewhere on the provider network. If a provider is distributing routes from PFL to PFL, or from another peering location to local PFLs, it is important to enable the BGP "best-path-external" feature to ensure the PFS has the routing information to accelerate re-convergence if it loses the more preferred path.
Egress peer engineering will be enabled for EBGP peering connections, so that each peer or peer interface connected to a PFL is directly addressable by its Adj-Peer-SID from anywhere on the SP network. Adj-Peer-SID information is currently not carried in the IGP of the network. If utilized, it is recommended to distribute this information using BGP-LS to all controllers creating paths to the PFL EPE destinations.
Each PFS node will be configured with IBGP multipath so traffic is load balanced to PFL nodes, increasing resiliency in the case of peer failure. On reception of a BGP withdraw update for a multipath route, traffic loss is minimized as the existing valid route is still programmed into the FIB.
PFS to Core
The PFS nodes will participate in the global core control plane and act as the gateway between the peering fabric and the rest of the SP network. In order to create a more scalable and programmatic fabric, it is prescribed to use Segment Routing across the core infrastructure. IS-IS is the preferred protocol for transmitting SR SID information from the peering fabric to the rest of the core network and beyond. In deployments where it may be difficult to transition quickly to an all-SR infrastructure, the PFS nodes will also support OSPF and RSVP-TE for interconnection to the core. The PFS acts as an ABR or ASBR between the peering fabric and the larger metro or backbone core network.
SR Peer Traffic Engineering
Summary
SR allows a provider to create engineered paths to egress peering destinations or egress traffic destinations within the SP network. A stack of globally addressable labels is created at the traffic entry point, requiring no additional protocol state at midpoints in the network and preserving qualities of normal IGP routing such as ECMP at each hop. The Peering Fabric proposes end-to-end visibility from the PFL nodes to the destinations and vice versa. This will allow a range of TE capabilities targeting a peering location, a peering exit node, or something as granular as a specific peering interface on a particular node. The use of anycast SIDs within a group of PFS nodes increases resiliency and load balancing capability.
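As an illustrative sketch of such a path (not part of the validated configuration in this document; label values, color, policy name, and end-point address are hypothetical), an SR-TE policy built with the SR Policy syntax shown later in the Low-Level Design can end its segment list with an EPE peer adjacency SID so that traffic exits one specific peer interface:
segment-routing
 traffic-eng
  segment-list PFL1-PEER1-EPE
   index 1 mpls label 16101 ;Prefix/node SID of the PFL terminating the peer (hypothetical value)
   index 2 mpls label 50001 ;EPE adjacency peer SID allocated for the peer interface (hypothetical value)
  !
  policy pfl1_peer1_epe
   color 100 end-point ipv4 192.168.13.1 ;Loopback of the targeted PFL (hypothetical address)
   candidate-paths
    preference 100
     explicit segment-list PFL1-PEER1-EPE
      weight 1
The subsections that follow describe the variations of this idea: targeting a node, a single peer interface, or an entire peering fabric.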
Nodal EPE
Node EPE directs traffic to a specific peering node within the fabric. The node is targeted by first using the PFS cluster anycast SID, followed by the specific PFL node SID.
Peer Interface EPE
This example uses an Egress Peer Engineering peer-adj-SID value assigned to a single peer interface. The result is that traffic sent along this SR path will use only the prescribed interface for egress traffic.
Abstract Peering
Abstract peering allows a provider to simply address a Peering Fabric by the anycast SIDs of its cluster of PFS nodes. In this case PHP is used for the anycast SIDs and traffic is simply forwarded as IP to the final destination across the fabric.
Peering Fabric Telemetry
Once a peering fabric is deployed, it is extremely important to monitor the health of the fabric as well as harness the wealth of data provided by the enhanced telemetry on the NCS5500 platform and IOS-XR. Through streaming data mechanisms such as Model-Driven Telemetry, BMP, and Netflow, providers can extract data useful for operations, capacity planning, security, and many other use cases. In the diagram below, the telemetry collection hosts could be a single system or distributed systems used for collection. The distributed design of the peering fabric enhances the ability to collect telemetry data from the fabric by distributing resources across the fabric. Each PFL or PFS contains a modern multi-core CPU and at least 32GB of RAM (64GB in the NC55A1-36H-SE) to support not only built-in telemetry operation but also 3rd party applications a service or content provider may want to deploy to the node for additional telemetry. Examples of 3rd party telemetry applications include those storing temporary data for root-cause analysis if a node is isolated from the rest of the network, or performance measurement applications. The peering fabric also fully supports traditional collection methods, such as SNMP and NETCONF using YANG models, to integrate with legacy systems.
Telemetry Diagram
Model-Driven Telemetry
MDT uses standards-based or native IOS-XR YANG data models to stream operational state data from deployed devices. The ability to push statistics and state data from the device adds capabilities and efficiency not found using traditional SNMP. Sensors and collection hosts can be configured statically on the device (dial-out), or the set of sensors, collection hosts, and their attributes can be managed off-box using OpenConfig or native IOS-XR YANG models. Pipeline is Cisco's open source collector, which can take MDT data as an input and output it via a plugin architecture supporting scalable message buses such as Kafka, or directly to a TSDB such as InfluxDB or Prometheus. The appendix contains information about MDT YANG paths relevant to the peering fabric and their applicability to PFS and PFL nodes.
BGP Monitoring Protocol
BMP, defined in RFC 7854, is a protocol to monitor BGP RIB information, updates, and protocol statistics. BMP was created to alleviate the burden of collecting BGP routing information using inefficient mechanisms like screen scraping. BMP has two primary modes, Route Monitoring mode and Route Mirroring mode. The monitoring mode will initially transmit the adj-rib-in contents per peer to a monitoring station, and continue to send updates as they occur on the monitored device. Setting the L bit in the RM header to 1 conveys that a route is post-policy, while 0 indicates pre-policy. The mirroring mode simply reflects all received BGP messages to the monitoring host. IOS-XR supports sending pre- and post-policy routing information and updates to a station via the Route Monitoring mode. BMP can additionally send information on peer state change events, including why a peer went down in the case of a BGP event. There are drafts in the IETF process, led by Cisco, to extend BMP to report additional routing data, such as the loc-RIB and per-peer adj-RIB-out. Local-RIB is the full device RIB, including received BGP routes, routes from other protocols, and locally originated routes. Adj-RIB-out will add the ability to monitor routes advertised to peers pre and post routing policy.
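The BGP configurations later in this document activate BMP on peers with "bmp-activate server 1". As a minimal sketch (collector address, port, and timers are illustrative assumptions, not part of the validated design), the corresponding global BMP receiver definition on IOS-XR looks like:
bmp server 1
 host 192.0.2.100 port 5000 ;BMP collector address and port (hypothetical)
 description BMP collector for the peering fabric
 update-source Loopback0
 initial-delay 30 ;Delay before sending route monitoring data after session establishment
 stats-reporting-period 60 ;Send BMP statistics every 60 seconds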
Netflow / IPFIX
Netflow was invented by Cisco due to requirements for traffic visibility and accounting. Netflow in its simplest form exports 5-tuple data for each flow traversing a Netflow-enabled interface. Netflow data is further enhanced with the inclusion of BGP information in the exported Netflow data, namely AS_PATH and destination prefix. This inclusion makes it possible to see where traffic originated by ASN and derive the destination for the traffic per BGP prefix. The latest iteration of Cisco Netflow is Netflow v9, with the next-generation IETF-standardized version called IPFIX (IP Flow Information Export). IPFIX has expanded on Netflow's capabilities by introducing hundreds of entities. Netflow is traditionally partially processed telemetry data. The device itself keeps a running cache table of flow entries and counters associated with packets, bytes, and flow duration. At certain time intervals, or when triggered by an event, the flow entries are exported to a collector for further processing. The type 315 extension to IPFIX, supported on the NCS5500, does not process flow data on the device, but sends the raw sampled packet header to an external collector for all processing. Due to the high bandwidth, PPS rate, and large number of simultaneous flows on Internet routers, Netflow samples packets at a pre-configured rate for processing. Typical sampling values on peering routers are 1 in 8192 packets; however, customers implementing Netflow or IPFIX should work with Cisco to fine-tune parameters for optimal data fidelity and performance.
Automation and Programmability
Netconf
Netconf is an industry standard method for configuring network devices. Standardized in RFC 6241, Netconf has standard Remote Procedure Calls (RPCs) to manipulate configuration data and retrieve state data. Netconf on IOS-XR supports the candidate datastore, meaning configuration must be explicitly committed for application to the running configuration.
YANG Model Support
While Netconf created standard RPCs for managing configuration on a device, it did not define a language for expressing configuration. The configuration syntax communicated by Netconf followed the typical CLI configuration: proprietary to each network vendor and XML-formatted without following any common semantics. YANG, or Yet Another Next Generation, is a modeling language to express configuration using standard elements such as containers, groups, lists, and endpoint data called leafs. YANG 1.0 was defined in RFC 6020 and updated to version 1.1 in RFC 7950. Vendors cover the majority of device configuration and state using native YANG models unique to each vendor, but the industry is headed towards standardized models where applicable. Groups such as OpenConfig and the IETF are developing standardized YANG models allowing operators to write a configuration once across all vendors.
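Before any of these model-driven interfaces can be used, the corresponding agents must be enabled on the router. A minimal enablement sketch (defaults assumed; adjust VRFs and ports to your management design; this is not part of the validated configuration in the Appendix):
netconf-yang agent ssh ;Enable the NETCONF agent over SSH (port 830 by default)
ssh server netconf vrf default ;Allow the SSH server to accept NETCONF sessions
grpc
 port 57400 ;gRPC server used for dial-in model-driven telemetry and related services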
Cisco has implemented a number of standard OpenConfig network models relevant to peering, including the BGP protocol, BGP RIB, and Interfaces models. The appendix contains information about YANG paths relevant to configuring the peering fabric and their applicability to PFS and PFL nodes.
Cisco NSO Modules
Cisco Network Services Orchestrator is a widely deployed network automation and orchestration platform, performing intent-driven configuration and validation of networks from a single source-of-truth configuration database. The Peering design includes Cisco NSO modules to perform specific peering tasks such as peer turn-up, peer modification, and deploying routing policy and ACLs to multiple nodes, providing a jumpstart to peering automation.
3rd Party Hosted Applications
IOS-XR starting in 6.0 runs on an x86 64-bit Linux foundation. The move to an open and well supported operating system, with XR components running on top of it, allows network providers to run 3rd party applications directly on the router. There are a wide variety of applications which can run on the XR host, with fast path interfaces in and out of the application. Example applications are telemetry collection, custom network probes, or tools to manage other portions of the network within a location.
XR Service Layer API
The XR Service Layer API is a gRPC-based API to extract data from a device as well as provide a very fast programmatic path into the router's runtime state. One use case of the SL API in the peering fabric is to directly program FIB entries on a device, overriding the default path selection. Using telemetry extracted from a peering fabric, an external controller can use the data and additional external constraints to programmatically direct traffic across the fabric. The SL API also supports transmission of event data via subscriptions.
Recommended Device and Protocol Configuration
Overview
The following configuration guidelines will step through the major components of the device and protocol configuration specific to the peering fabric and highlight non-default configuration recommended for each device role and the reasons behind those choices. Complete example configurations for each role can be found in the Appendix of this document. Configuration specific to telemetry is covered in section 4.
Common Node Configuration
The following configuration is common to both PFL and PFS NCS5500 series nodes.
Enable LLDP Globally
lldp
PFS Nodes
As the PFS nodes will integrate into the core control-plane, only recommended configuration for connectivity to the PFL nodes is given.
IGP Configuration
router isis pf-internal-core set-overload-bit on-startup wait-for-bgp is-type level-1-2 net <L2 NET> net <L1 PF NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10!
interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10Segment Routing Traffic EngineeringIn IOS-XR there are two mechanisms for configuring SR-TE. Prior to IOS-XR 6.3.2 SR-TE was configured using the MPLS traffic engineering tunnel interface configuration. Starting in 6.3.2 SR-TE can now be configured using the more flexible SR-TE Policy model. The following examples show how to define a static SR-TE path from PFS node to exit PE node using both the legacy tunnel configuration model as well as the new SR Policy model.Paths to PE exit node being load balanced across two static P routers using legacy tunnel configexplicit-path name PFS1-P1-PE1-1 index 1 next-address 192.168.12.1 index 2 next-address 192.168.11.1!explicit-path name PFS1-P2-PE1-1 index 1 next-label 16221 index 2 next-label 16511!interface tunnel-te1 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.1 path-option 1 explicit name PFS1-P1-PE1-1 segment-routing!interface tunnel-te2 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.2 path-option 1 explicit name PFS1-P2-PE1-1 segment-routingIOS-XR 6.3.2+ SR Policy Configurationsegment-routingtraffic-eng segment-list PFS1-P1-PE1-SR-1 index 1 mpls label 16211 index 2 mpls label 16511 ! segment-list PFS1-P2-PE1-SR-1 index 1 mpls label 16221 index 2 mpls label 16511 ! policy pfs1_pe1_via_p1 binding-sid mpls 900001 color 1 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! ! policy pfs1_pe1_via_p2 binding-sid mpls 900002 color 2 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! !BGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATH is longer address-family ipv4 unicast additional-paths receive maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths receive bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF Model-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! 
sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1PFL NodesPeer QoS PolicyPolicy applied to edge of the network to rewrite any incoming DSCP valueto 0.policy-map peer-qos-in class class-default set dscp default ! end-policy-map!Peer Infrastructure ACLSee the Security section of the document for recommended best practicesfor ingress and egress infrastructure ACLs.access-group v4-infra-acl-in access-group v6-infra-acl-in access-group v4-infra-acl-out access-group v6-infra-acl-out Peer Interface Configurationinterface TenGigE0/0/0/0 description “external peer” service-policy input peer-qos-in ;Explicit policy to rewrite DSCP to 0 lldp transmit disable #Do not run LLDP on peer connected interfaces lldp receive disable #Do not run LLDP on peer connected interfaces ipv4 access-group v4-infra-acl-in #IPv4 Ingress infrastructure ACL ipv4 access-group v4-infra-acl-out #IPv4 Egress infrastructure ACL, BCP38 filtering ipv6 access-group v6-infra-acl-in #IPv6 Ingress infrastructure ACL ipv6 access-group v6-infra-acl-out #IPv6 Egress infrastructure ACL, BCP38 filtering IS-IS IGP Configurationrouter isis pf-internal set-overload-bit on-startup wait-for-bgp is-type level-1 net <L1 Area NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10 ! 
interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10BGP Add-Path Route Policyroute-policy advertise-all ;Create policy for add-path advertisements set path-selection all advertiseend-policyBGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATh is longer address-family ipv4 unicast bgp attribute-download ;Enable BGP information for Netflow/IPFIX export additional-paths send additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv4 NLRI to PFS maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths send additional-paths receive additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv6 NLRI to PFS bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF EBGP Peer Configurationsession-group peer-session ignore-connected-check #Allow loopback peering over ECMP w/o EBGP Multihop egress-engineering #Allocate adj-peer-SID ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1af-group v4-af-peer address-family ipv4 unicast soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 1000 80;Set maximum inbound prefixes, warning at 80% thresholdaf-group v6-af-peer soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 100 80 #Set maximum inbound prefixes, warning at 80% thresholdneighbor-group v4-peer use session-group peer-session dmz-link-bandwidth ;Propagate external link BW address-family ipv4 unicast af-group v4-af-peerneighbor-group v6-peer use session-group peer-session dmz-link-bandwidth address-family ipv6 unicast af-group v6-af-peer neighbor 1.1.1.1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v4-peer address-family ipv4 unicast route-policy v4-peer-in(12345) in route-policy v4-peer-out(12345) out neighbor 2001#dead#b33f#0#1#1#1#1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v6-peer address-family ipv6 unicast route-policy v6-peer-in(12345) in route-policy v6-peer-out(12345) out PFL to PFS IBGP Configurationsession-group pfs-session ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1 update-source Loopback0 #Set BGP session source address to Loopback0 address af-group v4-af-pfs address-family ipv4 unicast next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v4-pfs-in in route-policy v4-pfs-out out af-group v6-af-pfs next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath 
#Store multiple paths if using ECMP to neighbor route-policy v6-pfs-in in route-policy v6-pfs-out out neighbor-group v4-pfs ! use session-group pfs-session address-family ipv4 unicast af-group v4-af-pfsneighbor-group v6-pfs ! use session-group pfs-session address-family ipv6 unicast af-group v6-af-pfs neighbor <PFS IP> description ~PFS #1~ remote-as <local ASN> use neighbor-group v4-pfsNetflow/IPFIX Configurationflow exporter-map nf-export version v9 options interface-table timeout 60 options sampler-table timeout 60 template timeout 30 ! transport udp <port> source Loopback0 destination <dest>flow monitor-map flow-monitor-ipv4 record ipv4 option bgpattr exporter nf-export cache entries 50000 cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-ipv6 record ipv6 option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-mpls record mpls ipv4-ipv6-fields option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10 sampler-map nf-sample-8192 random 1 out-of 8192Peer Interfaceinterface Bundle-Ether100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressPFS Upstream Interfaceinterface HundredGigE0/0/0/100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressModel-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1Abstract Peering ConfigurationAbstract peering uses qualities of Segment Routing anycast addresses toallow a provider to steer traffic to a specific peering fabric by simplyaddressing a node SID assigned to all PFS members of the peeringcluster. All of the qualities of SR such as midpoint ECMP and TI-LFAfast protection are preserved for the end to end BGP path, improvingconvergence across the network to the peering fabric. Additionally,through the use of SR-TE Policy, source routed engineered paths can beconfigured to the peering fabric based on business logic and additionalpath constraints.PFS ConfigurationOnly the PFS nodes require specific configuration to perform abstractpeering. 
Configuration shown is for example only, with IS-IS configured as the IGP carrying SR information. The routing policy setting the next-hop to the AP anycast SID should be incorporated into the standard IBGP outbound routing policy.
interface Loopback1 ipv4 address x.x.x.x/32 ipv6 address x:x:x:x::x/128
router isis <ID> passive address-family ipv4 unicast prefix-sid absolute <Global IPv4 AP Node SID> address-family ipv6 unicast prefix-sid absolute <Global IPv6 AP Node SID>
route-policy v4-abstract-ibgp-out set next-hop <Loopback1 IPv4 address> route-policy v6-abstract-ibgp-out set next-hop <Loopback1 IPv6 address>
router bgp <ASN> ibgp policy out enforce-modifications ;Enables a PFS node to set a next-hop address on routes reflected to IBGP peers
router bgp <ASN> neighbor x.x.x.x address-family ipv4 unicast route-policy v4-abstract-ibgp-out neighbor x:x:x:x::x address-family ipv6 unicast route-policy v6-abstract-ibgp-out
Security
Peering is by definition at the edge of the network, where security is mandatory. While not exclusive to peering, there are a number of best practices and software features which, when implemented, will protect your own network as well as others from malicious sources within your network.
Infrastructure ACLs
Infrastructure ACLs and their associated ACEs (Access Control Entries) are the perimeter protection for a network. The recommended PFL device configuration uses IPv4 and IPv6 infrastructure ACLs on all edge interfaces. These ACLs are specific to each provider's security needs, but should include the following sections: Filter IPv4 and IPv6 BOGON space ingress and egress. Drop ingress packets with a source address matching your own aggregate IPv4/IPv6 prefixes. Rate-limit ingress traffic to Unix services typically used in DDoS attacks, such as chargen (TCP/19). On ingress and egress, allow specific ICMP types, rate-limit them to appropriate values, and filter out ones not needed on your network. ICMP ttl-exceeded, host unreachable, port unreachable, echo-reply, echo-request, and fragmentation needed should always be allowed in some capacity.
BCP Implementation
Best Current Practices are informational documents published by the IETF to give guidelines on operational practices. This document will not outline the contents of the recommended BCPs, but two in particular are of interest to Internet peering. BCP38 explains the need to filter unused address space at the edges of the network, minimizing the chances of spoofed traffic from DDoS sources reaching their intended target. BCP38 is applicable for ingress traffic and especially egress traffic, as it stops spoofed traffic before it reaches outside your network. BCP194, BGP Operations and Security, covers a number of BGP operational practices, many of which are used in Internet peering. IOS-XR supports all of the mechanisms recommended in BCP38, BCP84, and BCP194, including software features such as GTTL, BGP dampening, and prefix limits.
BGP Attribute and CoS Scrubbing
Scrubbing of data on ingress and egress of your network is an important security measure. Scrubbing falls into two categories: control-plane and dataplane. The control-plane for Internet peering is BGP, and there are a few BGP transitive attributes one should take care to normalize. Your internal BGP communities should be deleted from outbound BGP NLRI via egress policy. Most often you are setting communities on inbound prefixes; make sure you are replacing existing communities from the peer and not adding to them. Unless you have an agreement with the peer, normalize the MED attribute to zero or another standard value on all inbound prefixes.
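As a minimal sketch of that inbound scrubbing logic (the policy name and community value are illustrative; in practice this would be folded into the per-peer ingress policies such as v4-peer-in shown earlier in this document):
route-policy v4-peer-in-scrub
  delete community all ;Strip all communities received from the peer
  set community (65000:100) ;Apply your own ingress classification community (hypothetical value)
  set med 0 ;Normalize MED unless an agreement with the peer says otherwise
  pass
end-policy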
In the dataplane, it is important to treat the peering edge as untrusted and clear any CoS markings on inbound packets, assuming a prior agreement hasn't been reached with the peer to carry them across the network boundary. This is an overlooked aspect which could lead to peer traffic being prioritized on your network, leading to unexpected network behavior. An example PFL ingress QoS policy resetting incoming IPv4/IPv6 DSCP values to 0 is given in the configuration section of this document.
Per-Peer Control Plane Policers
BGP protocol packets are handled at the RP level, meaning each packet is handled by the router CPU, which has limited bandwidth and processing resources. In the case of a malicious or misconfigured peer this could exhaust the processing power of the CPU, impacting other important tasks. IOS-XR enforces protocol policers and BGP peer policers by default.
BGP Prefix Security
RPKI Origin Validation
Prefix hijacking has been prevalent throughout the last decade as the Internet became more integrated into our lives. This led to the creation of RPKI origin validation, a mechanism to validate that a prefix was originated by its rightful owner by checking the originating ASN against a secure database. IOS-XR fully supports RPKI for origin validation.
BGPSEC
RPKI origin validation works to validate the source of a prefix, but does not validate the entire path of the prefix. Origin validation also does not use cryptographic signatures to ensure the originator is who they say they are, so an attacker spoofing the originating ASN can still hijack a prefix. BGPSEC is an evolution where a BGP prefix is cryptographically signed with the key of its valid originator, and each BGP router receiving the path checks to ensure the prefix originated from the valid owner. BGPSEC standards are being worked on in the SIDR working group.
BGP Flowspec
BGP Flowspec was standardized in RFC 5575 and defines additional BGP NLRI to inject traffic manipulation policy information to be dynamically implemented by a receiving router. BGP acts as the control-plane for disseminating the policy information, while it is up to the BGP Flowspec receiver to implement the dataplane rules specified in the NLRI. At the Internet peering edge, DDoS protection has become extremely important, and automating the remediation of an incoming DDoS attack is a key capability. Automated DDoS protection is only one BGP Flowspec use case; any application needing a programmatic way to create interface packet filters can make use of its capabilities.
Appendix
Applicable YANG Models
Model – Data
openconfig-interfaces, Cisco-IOS-XR-infra-statsd-oper, Cisco-IOS-XR-pfi-im-cmd-oper – Interface config and state; common counters found in SNMP IF-MIB
openconfig-if-ethernet, Cisco-IOS-XR-drivers-media-eth-oper – Ethernet layer config and state; XR native transceiver monitoring
openconfig-platform – Inventory, transceiver monitoring
openconfig-bgp, Cisco-IOS-XR-ipv4-bgp-oper, Cisco-IOS-XR-ipv6-bgp-oper – BGP config and state; includes neighbor session state, message counts, etc.
openconfig-bgp-rib, Cisco-IOS-XR-ip-rib-ipv4-oper, Cisco-IOS-XR-ip-rib-ipv6-oper – BGP RIB information.
Note# Cisco native includes all protocols openconfig-routing-policyConfigure routing policy elements and combined policyopenconfig-telemetryConfigure telemetry sensors and destinations Cisco-IOS-XR-ip-bfd-cfg Cisco-IOS-XR-ip-bfd-operBFD config and state Cisco-IOS-XR-ethernet-lldp-cfg Cisco-IOS-XR-ethernet-lldp-operLLDP config and state openconfig-mplsMPLS config and state, including Segment RoutingCisco-IOS-XR-clns-isis-cfgCisco-IOS-XR-clns-isis-operIS-IS config and state Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-operNCS 5500 HW resources NETCONF YANG PathsNote that while paths are given to retrieve data from a specific leafnode, it is sometimes more efficient to retrieve all the data under aspecific heading and let a management station filter unwanted data thanperform operations on the router. Additionally, Model Driven Telemetrymay not work at a leaf level, requiring retrieval of an entire subset ofdata.The data is also available via NETCONF, which does allow subtree filtersand retrieval of specific data. However, this is a more resourceintensive operation on the router.MetricData     Logical Interface Admin State Enum SNMP OID IF-MIB#ifAdminStatus OC YANG openconfig-interfaces#interfaces/interface/state/admin-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Interface Operational State Enum SNMP OID IF-MIB#ifOperStatus OC YANG openconfig-interfaces#interfaces/interface/state/oper-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Last State Change (seconds) Counter SNMP OID IF-MIB#ifLastChange OC YANG openconfig-interfaces#interfaces/interface/state/last-change Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/last-state-transition-time     Logical Interface SNMP ifIndex Integer SNMP OID IF-MIB#ifIndex OC YANG openconfig-interfaces#interfaces/interface/state/if-index Native YANG Cisco-IOS-XR-snmp-agent-oper#snmp/interface-indexes/if-index     Logical Interface RX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCInOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-received     Logical Interface TX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCOutOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-sent     Logical Interface RX Errors Counter SNMP OID IF-MIB#ifInErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-errors MDT Native     Logical Interface TX Errors Counter SNMP OID IF-MIB#ifOutErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-errors     Logical Interface Unicast Packets RX Counter SNMP OID IF-MIB#ifHCInUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Unicast Packets TX Counter SNMP OID IF-MIB#ifHCOutUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-unicast-pkts 
Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Input Drops Counter SNMP OID IF-MIB#ifIntDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-drops     Logical Interface Output Drops Counter SNMP OID IF-MIB#ifOutDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-drops     Ethernet Layer Stats – All Interfaces Counters SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics     Ethernet PHY State – All Interfaces Counters SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info     Ethernet Input CRC Errors Counter SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state/oc-eth#counters/oc-eth#in-crc-errors Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics/statistic/dropped-packets-with-crc-align-errors The following transceiver paths retrieve the total power for thetransceiver, there are specific per-lane power levels which can beretrieved from both native and OC models, please refer to the model YANGfile for additionalinformation.     Ethernet Transceiver RX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-rx-power     Ethernet Transceiver TX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-tx-power BGP Operational StateGlobal BGP Protocol StateIOS-XR native models do not store route information in the BGP Opermodel, they are stored in the IPv4/IPv6 RIB models. These models containRIB information based on protocol, with a numeric identifier for eachprotocol with the BGP ProtoID being 5. The protoid must be specified orthe YANG path will return data for all configured routingprotocols.     BGP Total Paths (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-paths Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/num-active-paths MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native BGP Neighbor StateExample UsageDue the construction of the YANG model, the neighbor-address key must beincluded as a container in all OC BGP state RPCs. 
The following RPC getsthe session state for all configured peers#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address/> <state> <session-state/> </state> </neighbor> </neighbors> </bgp> </filter> </get></rpc>\t<nc#rpc-reply message-id=~urn#uuid#24db986f-de34-4c97-9b2f-ac99ab2501e3~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address>172.16.0.2</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> </neighbors> </bgp> </nc#data></nc#rpc-reply>     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Session State for all BGP neighbors Enum SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/session-state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/connection-state     Message counters for all BGP neighbors Counter SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/messages Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/message-statistics Current queue depth for all BGP neighborsCounterSNMP OIDNAOC YANG/openconfig-bgp#bgp/neighbors/neighbor/state/queuesNative YANGCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-outCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-inBGP RIB DataRIB data is retrieved per AFI/SAFI. To retrieve IPv6 unicast routesusing OC models, replace “ipv4-unicast” with “ipv6-unicast”IOS-XR native models do not have a BGP specific RIB, only RIB dataper-AFI/SAFI for all protocols. 
Retrieving RIB information from thesepaths will include this data.While this data is available via both NETCONF and MDT, it is recommendedto use BMP as the mechanism to retrieve RIB table data.Example UsageThe following retrieves a list of best-path IPv4 prefixes withoutattributes from the loc-RIB#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <loc-rib> <routes> <route> <prefix/> <best-path>true</best-path> </route> </routes> </loc-rib> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc>     IPv4 Local RIB – Prefix Count Counter OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/num-routes Native YANG       IPv4 Local RIB – IPv4 Prefixes w/o Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes/route/prefix     IPv4 Local RIB – IPv4 Prefixes w/Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes Native YANG   The following per-neighbor RIB paths can be qualified with a specificneighbor address to retrieve RIB data for a specific peer. Below is anexample of a NETCONF RPC to retrieve the number of post-policy routesfrom the 192.168.2.51 peer and the returned output.<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes/> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc><nc#rpc-reply message-id=~urn#uuid#7d9a0468-4d8d-4008-972b-8e703241a8e9~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <afi-safi-name xmlns#idx=~http#//openconfig.net/yang/rib/bgp-types~>idx#IPV4_UNICAST</afi-safi-name> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes>3</num-routes> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </nc#data></nc#rpc-reply>     IPv4 Neighbor adj-rib-in pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-re     IPv4 Neighbor adj-rib-in post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-post     IPv4 Neighbor adj-rib-out pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre     IPv4 Neighbor adj-rib-out post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre Device Resource YANG Paths     Device Inventory List OC YANG oc-platform#components     NCS5500 Dataplane Resources List OC YANG NA Native YANG Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Validated Model-Driven Telemetry Sensor PathsThe following represents a list of validated sensor paths useful formonitoring the Peering Fabric and the data which can be gathered byconfiguring these sensorpaths.Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform openconfig-platform#components 
cisco-ios-xr-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data cisco-ios-xr-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info cisco-ios-xr-shellutil-oper#system-time/uptime cisco-ios-xr-wdsysmon-fd-oper#system-monitoring/cpu-utilizationLLDP MonitoringCisco-IOS-XR-ethernet-lldp-oper#lldpCisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighborsInterface statistics and stateopenconfig-interfaces#interfacesCisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-countersCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interfaceCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statisticsCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics/basic-interface-statsThe following sub-paths can be used but it is recommended to use the base openconfig-interfaces modelopenconfig-interfaces#interfaces/interfaceopenconfig-interfaces#interfaces/interface/stateopenconfig-interfaces#interfaces/interface/state/countersopenconfig-interfaces#interfaces/interface/subinterfaces/subinterface/state/countersAggregate bundle information (use interface models for interface counters)sensor-group openconfig-if-aggregate#aggregatesensor-group openconfig-if-aggregate#aggregate/statesensor-group openconfig-lacp#lacpsensor-group Cisco-IOS-XR-bundlemgr-oper#bundlessensor-group Cisco-IOS-XR-bundlemgr-oper#bundle-information/bfd-countersBGP Peering informationsensor-path openconfig-bgp#bgpsensor-path openconfig-bgp#bgp/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticsIS-IS IGP informationsensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighborssensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfacessensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/adjacenciesIt is not recommended to monitor complete RIB tables using MDT but can be used for 
troubleshootingCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countQoS and ACL monitoringopenconfig-acl#aclCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/general-statsCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/queue-stats-arrayBGP RIB informationIt is not recommended to monitor these paths using MDT with large tablesopenconfig-rib-bgp#bgp-ribCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intRouting policy InformationCisco-IOS-XR-policy-repository-oper#routing-policy/policies", "url": "/blogs/2018-05-08-peering-fabric-hld/", "author": "Phil Bedard", "tags": "iosxr, Peering, Design" } , "blogs-2018-05-09-metro-design-implementation-guide": { "title": "Metro Design Implementation Guide", "content": " On This Page Targets Testbed Overview Devices Role-Based Configuration Transport IOS-XR – All IOS-XR nodes IGP Protocol (ISIS) and Segment Routing MPLS configuration MPLS Segment Routing Traffic Engineering (SRTE) configuration Transport IOS-XE – All IOS-XE nodes Segment Routing MPLS configuration IGP-ISIS configuration MPLS Segment Routing Traffic Engineering (SRTE) Area Border Routers (ABRs) IGP-ISIS Redistribution configuration BGP – Access or Provider Edge Routers IOS-XR configuration IOS-XE configuration Area Border Routers (ABRs) IGP Topology Distribution Transport Route Reflector (tRR) Services Route Reflector (sRR) Segment Routing Path Computation Element (SR-PCE) Segment Routing Traffic Engineering (SRTE) and Services Integration On Demand Next-Hop (ODN) configuration – IOS-XR On Demand Next-Hop (ODN) configuration – IOS-XE Preferred Path configuration – IOS-XR Preferred Path configuration – IOS-XE Services End-To-End Services L3VPN MP-BGP VPNv4 On-Demand Next-Hop Access Router Service Provisioning (IOS-XE)# L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Access Router Service Provisioning (IOS-XR)# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# End-To-End Services Data Plane Hierarchical Services L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning 
(IOS-XE)# Provider Edge Router Service Provisioning (IOS-XR)# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# Targets Hardware# ASR9000 as Provider Edge (PE) node NCS5500 as Aggregation and P-Aggregation Node ASR920 and NCS5500 (standing for the NCS540) as Access Router Software# IOS-XR 6.3.2 on ASR9000 and NCS5500 IOS-XE 16.8.1 on ASR920 Key technologies Transport# End-To-End Segment-Routing Network Programmability# SRTE Inter-Domain LSPs with On-DemandNext Hop Network Availability# TI-LFA/Anycast-SID Services# BGP-based L2 and L3 Virtual Private Network services(EVPN and L3VPN) Testbed OverviewFigure 1# Compass Metro Fabric High Level TopologyFigure 2# Testbed Physical TopologyFigure 3# Testbed Route-Reflector and SR-PCE physical connectivityFigure 4# Testbed IGP DomainsDevicesAccess Routers Cisco NCS5501-SE (IOS-XR) – A-PE1, A-PE2, A-PE3, A-PE7 Cisco ASR920 (IOS-XE) – A-PE4, A-PE5, A-PE6, A-PE9 Area Border Routers (ABRs) and Provider Edge Routers# Cisco ASR9000 (IOS-XR) – PE1, PE2, PE3, PE4Route Reflectors (RRs)# Cisco IOS XRv 9000 – tRR1-A, tRR1-B, sRR1-A, sRR1-B, sRR2-A, sRR2-B,sRR3-A, sRR3-BSegment Routing Path Computation Element (SR-PCE)# Cisco IOS XRv 9000 – SR-PCE1-A, SR-PCE1-B, SR-PCE2-A, SR-PCE2-B, SR-PCE3-A, SR-PCE3-BRole-Based ConfigurationTransport IOS-XR – All IOS-XR nodesIGP Protocol (ISIS) and Segment Routing MPLS configurationRouter isis configurationkey chain ISIS-KEY key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 00071A150754 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5All Routers, except Provider Edge (PE) Routers, are part of one IGPdomain (ISIS ACCESS or ISIS-CORE). PEs act as Area Border Routers (ABRs)and run two IGP processes (ISIS-ACCESS and ISIS-CORE). Please note thatLoopback 0 is part of both IGP processes.router isis ISIS-ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0101.0000.0110.00 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password keychain ISIS-KEY lsp-password keychain ISIS-KEY level 1 address-family ipv4 unicast metric-style wide spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 segment-routing mpls spf prefix-priority critical tag 5000 spf prefix-priority high tag 1000 !PEs Loopback 0 is part of both IGP processes together with same“prefix-sid index” value. interface Loopback0 address-family ipv4 unicast prefix-sid index 150 ! !TI-LFA FRR configuration interface TenGigE0/0/0/10 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 ! ! 
!interface Loopback0 ipv4 address 100.0.1.50 255.255.255.255!MPLS Interface configurationinterface TenGigE0/0/0/10 bfd mode ietf bfd address-family ipv4 timers start 180 bfd address-family ipv4 multiplier 3 bfd address-family ipv4 destination 10.1.2.1 bfd address-family ipv4 fast-detect bfd address-family ipv4 minimum-interval 50 mtu 9216 ipv4 address 10.15.150.1 255.255.255.254 ipv4 unreachables disable bundle minimum-active links 1 load-interval 30 dampening!MPLS Segment Routing Traffic Engineering (SRTE) configurationipv4 unnumbered mpls traffic-eng Loopback0router isis ACCESS address-family ipv4 unicast mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0Transport IOS-XE – All IOS-XE nodesSegment Routing MPLS configurationmpls label range 6001 32767 static 16 6000segment-routing mpls ! set-attributes address-family ipv4 sr-label-preferred exit-address-family ! global-block 16000 32000 !Prefix-SID assignment to loopback 0 configuration connected-prefix-sid-map address-family ipv4 100.0.1.51/32 index 151 range 1 exit-address-family !IGP-ISIS configurationkey chain ISIS-KEY key 1 key-string cisco accept-lifetime 00#00#00 Jan 1 2018 infinite send-lifetime 00#00#00 Jan 1 2018 infinite!router isis ACCESS net 49.0001.0102.0000.0254.00 is-type level-2-only authentication mode md5 authentication key-chain ISIS-KEY metric-style wide fast-flood 10 set-overload-bit on-startup 120 max-lsp-lifetime 65535 lsp-refresh-interval 65000 spf-interval 5 50 200 prc-interval 5 50 200 lsp-gen-interval 5 5 200 log-adjacency-changes segment-routing mpls segment-routing prefix-sid-map advertise-localTI-LFA FRR configuration fast-reroute per-prefix level-2 all fast-reroute ti-lfa level-2 microloop avoidance protected redistribute connected!interface Loopback0 ip address 100.0.1.51 255.255.255.255 ip router isis ACCESS isis circuit-type level-2-onlyendMPLS Interface configurationinterface TenGigabitEthernet0/0/12 mtu 9216 ip address 10.117.151.1 255.255.255.254 ip router isis ACCESS mpls ip isis circuit-type level-2-only isis network point-to-point isis metric 100endMPLS Segment Routing Traffic Engineering (SRTE)router isis ACCESS mpls traffic-eng router-id Loopback0 mpls traffic-eng level-2interface TenGigabitEthernet0/0/12 mpls traffic-eng tunnelsArea Border Routers (ABRs) IGP-ISIS Redistribution configurationPEs have to provide IP reachability for RRs, SR-PCEs and NSO between bothISIS-ACCESS and ISIS-CORE IGP domains. This is done by specific IPprefixes redistribution.router staticaddress-family ipv4 unicast 100.0.0.0/24 Null0 100.0.1.0/24 Null0 100.1.0.0/24 Null0 100.1.1.0/24 Null0prefix-set ACCESS-XTC_SvRR-LOOPBACKS 100.0.1.0/24, 100.1.1.0/24end-setprefix-set RR-LOOPBACKS 100.0.0.0/24, 100.1.0.0/24end-setredistribute Core SvRR and TvRR loopback into Access domainroute-policy CORE-TO-ACCESS1 if destination in RR-LOOPBACKS then pass else drop endifend-policyrouter isis ACCESS address-family ipv4 unicast redistribute static route-policy CORE-TO-ACCESS1 redistribute Access SR-PCE and SvRR loopbacks into Core domainroute-policy ACCESS1-TO-CORE if destination in ACCESS-XTC_SvRR-LOOPBACKS then pass else drop endif end-policy router isis CORE address-family ipv4 unicast redistribute static route-policy CORE-TO-ACCESS1 BGP – Access or Provider Edge RoutersIOS-XR configurationrouter bgp 100 nsr bgp router-id 100.0.1.50 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family l2vpn evpn ! 
neighbor-group SvRR remote-as 100 update-source Loopback0 address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family l2vpn evpn ! ! neighbor 100.0.1.201 use neighbor-group SvRR !IOS-XE configurationrouter bgp 100 bgp router-id 100.0.1.51 bgp log-neighbor-changes no bgp default ipv4-unicast neighbor SvRR peer-group neighbor SvRR remote-as 100 neighbor SvRR update-source Loopback0 neighbor 100.0.1.201 peer-group SvRR ! address-family ipv4 exit-address-family ! address-family vpnv4 neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! address-family l2vpn evpn neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family !Area Border Routers (ABRs) IGP Topology DistributionNext network diagram# “BGP-LS Topology Distribution” shows how AreaBorder Routers (ABRs) distribute IGP network topology from ISIS ACCESSand ISIS CORE to Transport Route-Reflectors (tRRs). tRRs then reflecttopology to Segment Routing Path Computation Element (SR-PCEs)Figure 5# BGP-LS Topology Distributionrouter isis ACCESS distribute link-state instance-id 101 net 49.0001.0101.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0router isis CORE distribute link-state instance-id 100 net 49.0001.0100.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0router bgp 100 address-family link-state link-state ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR !Transport Route Reflector (tRR)router static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.10 bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state additional-paths receive additional-paths send ! neighbor-group RRC remote-as 100 update-source Loopback0 address-family link-state link-state route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group RRC ! neighbor 100.0.0.2 use neighbor-group RRC ! neighbor 100.0.0.3 use neighbor-group RRC ! neighbor 100.0.0.4 use neighbor-group RRC ! neighbor 100.0.0.100 use neighbor-group RRC ! neighbor 100.0.1.101 use neighbor-group RRC ! neighbor 100.0.2.102 use neighbor-group RRC ! neighbor 100.1.1.101 use neighbor-group RRC !!Services Route Reflector (sRR)router static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.200 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast additional-paths receive additional-paths send ! address-family vpnv6 unicast additional-paths receive additional-paths send retain route-target all ! address-family l2vpn evpn additional-paths receive additional-paths send ! neighbor-group SvRR-Client remote-as 100 update-source Loopback0 address-family l2vpn evpn route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group SvRR-Client ! neighbor 100.0.0.2 use neighbor-group SvRR-Client ! neighbor 100.0.0.3 use neighbor-group SvRR-Client ! neighbor 100.0.0.4 use neighbor-group SvRR-Client ! neighbor 100.2.0.5 use neighbor-group SvRR-Client description Ixia-P1 ! neighbor 100.2.0.6 use neighbor-group SvRR-Client description Ixia-P2 ! neighbor 100.0.1.201 use neighbor-group SvRR-Client ! 
neighbor 100.0.2.202 use neighbor-group SvRR-Client !!Segment Routing Path Computation Element (SR-PCE)router static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.100 bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR !!pce address ipv4 100.0.0.100!Segment Routing Traffic Engineering (SRTE) and Services IntegrationThis section shows how to integrate Traffic Engineering (SRTE) withServices. Particular usecase refers to next sub-section.On Demand Next-Hop (ODN) configuration – IOS-XRsegment-routing traffic-eng logging policy status ! on-demand color 100 dynamic pce ! metric type igp ! ! ! pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! !extcommunity-set opaque BLUE 100end-setroute-policy ODN_EVPN set extcommunity color BLUEend-policyrouter bgp 100 address-family l2vpn evpn route-policy ODN_EVPN out !!On Demand Next-Hop (ODN) configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-allmpls traffic-eng auto-tunnel p2p config unnumbered-interface Loopback0mpls traffic-eng auto-tunnel p2p tunnel-num min 1000 max 5000!mpls traffic-eng lsp attributes L3VPN-SRTE path-selection metric igp pce!ip community-list 1 permit 9999route-map L3VPN-ODN-TE-INIT permit 10 match community 1 set attribute-set L3VPN-SRTE!route-map L3VPN-SR-ODN-Mark-Comm permit 10 match ip address L3VPN-ODN-Prefixes set community 9999!!endrouter bgp 100 address-family vpnv4 neighbor SvRR send-community both neighbor SvRR route-map L3VPN-ODN-TE-INIT in neighbor SvRR route-map L3VPN-SR-ODN-Mark-Comm outPreferred Path configuration – IOS-XRsegment-routing traffic-eng pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! 
!Preferred Path configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-allServicesEnd-To-End ServicesFigure 6# End-To-End Services TableL3VPN MP-BGP VPNv4 On-Demand Next-HopFigure 7# L3VPN MP-BGP VPNv4 On-Demand Next-Hop Control PlaneAccess Routers# Cisco ASR920 IOS-XE Operator# New VPNv4 instance via CLI or NSO Access Router# Advertises/receives VPNv4 routes to/from ServicesRoute-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Please refer to “On Demand Next-Hop (ODN) – IOS-XE” section forinitial ODN configuration.Access Router Service Provisioning (IOS-XE)#VRF definition configurationvrf definition L3VPN-SRODN-1 rd 100#100 route-target export 100#100 route-target import 100#100 address-family ipv4 exit-address-familyVRF Interface configurationinterface GigabitEthernet0/0/2 mtu 9216 vrf forwarding L3VPN-SRODN-1 ip address 10.5.1.1 255.255.255.0 negotiation autoendBGP VRF configuration Static & BGP neighbor Static routing configurationrouter bgp 100 address-family ipv4 vrf L3VPN-SRODN-1 redistribute connected exit-address-familyBGP neighbor configurationrouter bgp 100 neighbor Customer-1 peer-group neighbor Customer-1 remote-as 200 neighbor 10.10.10.1 peer-group Customer-1 address-family ipv4 vrf L3VPN-SRODN-2 neighbor 10.10.10.1 activate exit-address-familyL2VPN Single-Homed EVPN-VPWS On-Demand Next-HopFigure 8# L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Advertises/receives EVPN-VPWS instance to/fromServices Route-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Please refer to “On Demand Next-Hop (ODN) – IOS-XR” section forinitial ODN configuration.Access Router Service Provisioning (IOS-XR)#PORT Based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 1000 target 1 source 1 interface TenGigE0/0/0/5 l2transportVLAN Based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 interface TenGigE0/0/0/5.1 neighbor evpn evi 1000 target 1 source 1 interface TenGigE0/0/0/5.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!L2VPN Static Pseudowire (PW) – Preferred Path (PCEP)Figure 9# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) ControlPlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Access Router Service Provisioning (IOS-XR)#segment-routing traffic-eng policy GREEN-PE7 color 200 end-point ipv4 100.0.2.52 candidate-paths preference 1 dynamic pce ! 
metric type igpPort Based Service configurationinterface TenGigE0/0/0/15 l2transportl2vpn pw-class static-pw-class-PE7 encapsulation mpls control-word preferred-path sr-te policy GREEN-PE7 p2p Static-PW-to-PE7-1 interface TenGigE0/0/0/15 neighbor ipv4 100.0.2.52 pw-id 1000 mpls static label local 1000 remote 1000 pw-class static-pw-class-PE7 VLAN Based Service configurationinterface TenGigE0/0/0/5.1001 l2transport encapsulation dot1q 1001 rewrite ingress tag pop 1 symmetricl2vpn pw-class static-pw-class-PE7 encapsulation mpls control-word preferred-path sr-te policy GREEN-PE7 p2p Static-PW-to-PE7-2 interface TenGigE0/0/0/5.1001 neighbor ipv4 100.0.2.52 pw-id 1001 mpls static label local 1001 remote 1001 pw-class static-pw-class-PE7 Access Router Service Provisioning (IOS-XE)#Port Based service with Static OAM configurationinterface GigabitEthernet0/0/1 mtu 9216 no ip address negotiation auto no keepalive service instance 10 ethernet encapsulation default xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! pseudowire-static-oam class static-oam timeout refresh send 10 ttl 255 pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 status protocol notification static static-oam ! VLAN Based Service configurationinterface GigabitEthernet0/0/1 no ip address negotiation auto service instance 1 ethernet Static-VPWS-EVC encapsulation dot1q 10 rewrite ingress tag pop 1 symmetric xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word !pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 End-To-End Services Data PlaneFigure 10# End-To-End Services Data PlaneHierarchical ServicesFigure 11# Hierarchical Services TableL3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE)Figure 12# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Provider Edge Router# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together withPseudowire-Headend (PWHE) via CLI or NSO Provider Edge Router# Path to remote PE is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p L3VPN-VRF1 interface TenGigE0/0/0/5.501 neighbor evpn evi 13 target 501 source 501 ! ! 
!interface TenGigE0/0/0/5.501 l2transport encapsulation dot1q 501 rewrite ingress tag pop 1 symmetricPort based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 13 target 502 source 502 interface TenGigE0/0/0/5 l2transportAccess Router Service Provisioning (IOS-XE)#VLAN based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric !Port based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation defaultProvider Edge Router Service Provisioning (IOS-XR)#VRF configurationvrf L3VPN-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#501 ! export route-target 100#501 ! ! address-family ipv6 unicast import route-target 100#501 ! export route-target 100#501 ! !BGP configurationrouter bgp 100 vrf L3VPN-ODNTE-VRF1 rd 100#501 address-family ipv4 unicast redistribute connected ! address-family ipv6 unicast redistribute connected ! !PWHE configurationinterface PW-Ether1 vrf L3VPN-ODNTE-VRF1 ipv4 address 10.13.1.1 255.255.255.0 ipv6 address 1000#10#13##1/126 attach generic-interface-list PWHE!EVPN VPWS configuration towards Access PEl2vpn xconnect group evpn-vpws-l3vpn-A-PE3 p2p L3VPN-ODNTE-VRF1 interface PW-Ether1 neighbor evpn evi 13 target 501 source 501 !Figure 13# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 withPseudowire-Headend (PWHE) Data PlaneL3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRBFigure 14# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 withAnycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together with Anycast IRBvia CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! 
!interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override ribAnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012L2VPN configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! !EVPN configurationevpn evi 12001 ! advertise-mac ! ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30!VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !!BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! !Figure 15# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB Datal PlaneL2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRBFigure 16# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L2VPN Multipoint EVPN instance together withAnycast IRB via CLI or NSO (Anycast IRB is optional when L2 and L3is required in same service instance) Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Please note that provisioning on Access and Provider Edge routers issame as in “L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB”. In this use case there is BGP EVPN instead of MP-BGPVPNv4/6 in the core.Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! 
!interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! ! interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override ribAnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012L2VPN Configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! !EVPN configurationevpn evi 12001 ! advertise-mac ! ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30!VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !!BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 17# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Data Plane", "url": "/blogs/2018-05-09-metro-design-implementation-guide/", "author": "Jiri Chaloupka", "tags": "iosxr, cisco, Metro, Design" } , "#": {} , "blogs-2018-05-01-express-peering-fabric": { "title": "Express Peering Fabrics", "content": " On This Page Express Peering Fabric Overview Figure 1# Traditional Peering and Content Delivery Figure 2# Optimized Regional Express Peering Fabric Regional Transport Design Flexible Photonic Network On-net Content Source Facilities Coherent Optics Network Modeling Control Plane Design Should I build an Express Fabric? Cisco Express Peering Fabric Components Additional Efficiency Options Local Caching Nodes ICN Express Peering FabricOverviewRegional SP networks serving residential subscribers are typically deployed in an aggregation/access hierarchy using logical Ethernet connections over a regional optical transport network, as shown in Figure 1 below. The aggregation nodes serve as an aggregation point for connections to regional sites along with acting as the ingress point for traffic coming from the SP backbone. 
SPs can drive even greater efficiency by selecting specific high bandwidth regional sites for core bypass. This is simply connecting the regional hub routers directly to a localized peering facility or facilities, bypassing the regional core aggregation nodes which are simply acting as a pass through for the traffic. This is called an Express Peering Fabric. Due to the growth in Internet video traffic, this secondary express peering network in time will likely be higher capacity than the original SP converged network. The same express peering network can also be used to serve content originated by the SP, leaving the converged regional network to serve other higher priority traffic needs.Not only is the express delivery network design a more efficient logical design, it can also use a simplified control-plane as the network does not need to support more complex network services or multicast video delivery. The RIB and FIB resources to carry video delivery routes are also reduced, requiring less power and memory resources than a device capable of carrying a full Internet routing table. Service providers are advised to look for hardware supporting flexible FIB options delivering the greatest environmental efficiency.Figure 1# Traditional Peering and Content DeliveryFigure 2# Optimized Regional Express Peering FabricRegional Transport DesignFlexible Photonic NetworkOne of the key building blocks to express delivery networks is flexible placement of DWDM circuits between ingress peering endpoints and end user locations. The regional transport network must allow DWDM wavelengths direct reach between peering and content locations to subscriber locations without additional router hops. The lowest layer block is a flexible photonic layer providing any-to-any wavelength connectivity through multi-degree colorless and contentionless ROADMs and add-drop complexes. TOn-net Content Source FacilitiesIn some instances, providers have built linear extensions from regional peering locations to core aggregation sites since all connectivity went between the peering routers to the metro core aggregation routers. In order to eliminate redundant hops, the peering locations must be connected to upstream ROADMs to directly reach subscriber locations. There is generally very little cost incurred with adding additional multi-degree ROADMs today, and their use greatly increases network flexibility. While it’s most beneficial to have the peering location connected to diverse sites via a fiber ring, even a linear route connected via ROADM will pay dividends in network agility and efficient connectivity.Coherent OpticsAnother key to the transport design is the use of coherent transponders or coherent integrated IPoDWDM ports. Coherent optics give transport networks longer reach without regeneration and flexibility through tuning across 80+ channels. This tuning flexibility allows the ability to connect router interfaces to any wavelength across an optical transport network. High-density 100G is typically done through 100G muxponders and transponders, while integrated IPoDWDM coherent router interfaces support 100 or 200G per port for sites which may not support a transport shelf deployment.Network ModelingNetwork modeling must be performed to determine which sites are candidates for direct connectivity to content locations. The modeling is based on factors such as statmux gain, component cost, and resource cost such as DWDM wavelengths. 
A simplified traffic demand matrix needs to be computed from the ingress traffic location to the egress customer sites. Netflow can be used as a tool to determine how much traffic is being sent to customer prefixes at each site. Alternative to Netflow, networks using MPLS can derive the stats to each egress router using either MPLS FEC or TE Tunnel statistics. Once the traffic matrix has been computed, a network model can be created with and without bypass links to calculate the total number of router interfaces and transport links needed. There will be an optimal traffic percentage where connecting a bypass link aids efficiency. In some cases however, traffic growth may be projected to be high enough over time to connect all sites day one.Control Plane DesignIn most cases the peering or content location routers will be connected to both an end location as well as the metro core aggregation network. Care must be taken to make sure the end site locations do not act as transit paths between content location and the core. In order to create an isolated domain, use carefully selected metrics to ensure traffic does not flow through the wrong links. Another option is to use a separate IGP process entirely for the express network, ensuring the end site nodes cannot become transit nodes from the content location to the core aggregation nodes. Using multiple loopback addresses is recommended in that instance to create additional separation between networks. More advanced techniques may also be used such as using Segment Routing TE Policies to define an express routing plane across the regional network. Advancements in SR technology such as the Flexible Algorithm selection outlined at http#//www.segment-routing.net/tutorials/2018-03-06-segment-routing-igp-flex-algo/ can be used to build a virtual topology specific to Express Peering without considerable control-plane complexity.Should I build an Express Fabric?There are several factors that go into whether or not building an Express Peering fabric is the right approach for your network. Most important is to analyze the traffic coming into your network from external peers and determine the true network cost of the traffic path from ingress to egress. Building a detailed network cost model incorporating physical fiber, optical transport, and IP networks will allow you to gain insight into how much each hop of the network path costs at each layer and combined. An advanced network modeling tool such as Cisco WAE Planning can help build a network model and simulate the current network as well as potential Express Fabric designs to determine if building an Express Network is an efficient solution. However, if you have taken the steps to build a local peering location, an Express Fabric is the next logical step in reducing cost from ingress peer to customer endpoints.Cisco Express Peering Fabric ComponentsCisco’s family of Network Convergence System components bring both the scale and flexibility to maximize network efficiency for SP content traffic.One of the keys to building an optimized peering fabric is an agile photonic network. The Cisco NCS2000 with its intelligent high-density multi-degree ROADMs and GMPLS control-plane give providers the flexible photonic layer needed to construct a more efficient express traffic delivery network. The NCS2000 and its family of integrated muxponders can support 96 channels at 200G per wavelength. The NCS1010 flexible ROADM. The 2RU NCS1002 muxponder provides a flexible 2Tbps of client and trunk capacity. 
The new NCS1004 increases scale to 4.8Tbps of client and trunk capacity with wavelength capacities up to 600G in 2RU. The NCS 1002 and 1004 are powered by IOS-XR, supporting rich telemetry and automation capabilities. The NCS5500 routing platform has flexible fixed and modular chassis options. The 1RU NCS-55A1-36H-SE has 36 100G interfaces with a 4M IPv4 route FIB capacity. The modular NCS-5504 and NCS-5508 support the same scale in each line card slot. A 6x200G IPoDWDM line card can be used to extend connections over passive optical muxes or dark fiber at up to 200G per interface.Learn more about Cisco NCS optical networking at https#//www.cisco.com/c/en/us/products/optical-networking/index.htmlLearn more about the Cisco NCS 5500 series of IP routers at https#//www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.htmlAdditional Efficiency OptionsLocal Caching NodesPlacing CDN cache nodes directly into service provider aggregation or end subscriber locations can also reduce cost and network complexity. The CDN nodes can be 3rd party cache nodes supplied by a content provider, such as the Netflix OpenConnect appliance, or internal CDN nodes delivering service provider video content. The main benefits to using local cache nodes are reduction in network resources and improved QoE for subscribers. The cache hit rate or efficiency of the nodes varies, but in general it is very good; for very high bandwidth flash events, like the release of a new season of a TV series, the majority of content can be delivered locally. Using distributed caches which also serve as origins for downstream caches can help emulate a multicast delivery network without the operational headaches of multicast.Aggregating caching nodes requires high speed routers with adequate buffering capacity due to the bursty traffic profile of video traffic. The NCS-5500 has deep buffers along with the 10G, 25G, and 100G density required to satisfy cache node aggregation needs. Scale-out network design allows providers to build delivery fabrics in the hundreds of Tbps.Work has been done to standardize caching infrastructure through the Streaming Video Alliance, found at https#//www.streamingvideoalliance.org. The Streaming Video Alliance is a consortium of service providers, network hardware and software vendors, and content networks. The Open Cache initiative is meant to create a caching server capable of caching any content, owned and operated by the service provider. Work has been done by the IETF CDNI working group to define a framework of how caching nodes interconnect and route requests between providers, and the Open Cache WG in the SVA has adopted most of that architecture. There are, however, many challenges to open caching such as content encryption, quality of experience metrics, and efficient request routing.ICNInformation Centric Networking has gained much research exposure over the last several years, with two primary architectures being Content Centric Networking and Named Data Networking. The premise behind ICN is that the Internet is almost completely content-driven today, so request routing and delivery should be based on content names rather than IP addresses. It tackles the distinction between location and content identifier. Caching is ubiquitous in the ICN architecture to aid in efficient content delivery. Typically every ICN router has one or more cache nodes to serve local content from when additional requests are made. 
ICN currently is mostly a research effort, with work being led by the IETF ICNNG working group. CCN and NDN networks can be created as overlays over IP using Linux software as a way to explore the architecture and routing constructs of ICN.", "url": "/blogs/2018-05-01-express-peering-fabric/", "author": "Phil Bedard", "tags": "iosxr, Peering, Design" } , "#": {} , "blogs-latest-core-fabric-hld": { "title": "Core Fabric Design", "content": " On This Page Key Drivers Scale Network Availability Cost Automation Network Simplification High-Level Design Topology# Scale Out and Scale Up Platforms Control-Plane Telemetry Automation Validated Design Use Cases LDP to SR Core Migration Starting Point Step 1# Enable SR Step 2# Enable TI-LFA Step 3# Enable Mapping Servers Step 4# Protocol Simplification LDP over RSVP-TE to SR Migration LDP to SR End-to-End Migration Step 1# Enable SR on the PEs Step 2# Enable SR on the PEs Low Level Design Validation Topology Hardware Detail Control Plane Configuration Enable Segment Routing in ISIS Enable TI-LFA Enable Mapping Server Disable LDP Automation Validation SR Validation TI-LFA Validation Mapping Server Validation Model-Driven Telemetry Appendix Applicable YANG Models XML Configuration Examples Enable Segment Routing (XML) Enable TI-LFA (XML) Enable Mapping Server (XML) Disable LDP (XML) NSO SR Service Creation via Northbound RESTCONF API Examples Create SRGB (RESTCONF) Create SR-Infrastructure (RESTCONF) Create SR Service (RESTCONF) YANG Models for SR Operational Data Example Usage (IGP Verification) YANG Models for TILFA Operational Data Example Usage (IGP Backup Routes) YANG Models for SR Mapping Server Operational Data Example Usage (Mapping Server Verification) For More Information NSO IOS-XR Key DriversThe Core Fabric Design represents an evolution in backbone core networks. There are several factors driving this evolution.ScaleDriven by broadband speed improvements, the rise of video and the advent of 5G, global IP traffic is expected to triple in the next five years. However, not all of that traffic is destined to cross the traditional backbone core. Already, the majority of Internet traffic is delivered by Content Delivery Networks (CDNs) (e.g. Google, Apple, Amazon, Microsoft, Facebook, Netflix) on private CDN networks. CDNs are bringing content closer to the end user, delivering content into regional networks and, increasing, directly to the metro. With metro-based peering and caching, some Service Providers are already serving 70% of externally-sourced traffic from peering points or CDNs from within their major metros.The change in traffic patterns are significant in core design for several reasons. First, the traditional Service Provider backbone network will grow more slowly, moderating the rapidly increasing scale requirements that have characterized core design for the last 15 years. Second, the connectivity requirements of different core functions (e.g. connecting to other core routers over long-haul links, connecting to metro PE and/or aggregation routers, connecting to peering) may scale differently. Third, private backbones will continue to grow as CDN operators build out capacity to deliver their content to regional and metro networks. Lastly, the metro core will evolve to meet the scale demands of the metro-delivered content.Network AvailabilityNetwork availability has always been a key design goal of core networks. 
The industry’s attempt to provide network resilience through device-level redundancy mechanisms (NSF, NSR, ISSU, etc) is reaching the limits of effectiveness. While the typical 1+1 redundancy model provides “good enough” availability for many operators, others are looking for new ways to increase availability, whether through new topologies or simplified protocols. Instead of building ever-larger and more complex devices to connect expensive WAN links, core networks can leverage horizontal, DC-inspired fabrics to scale out while increasing availability. Because traffic is spread across more, smaller boxes in a fabric, the loss of any single device has a much smaller blast radius. Instead of using a complex and stateful protocol like RSVP-TE for Fast Re-Route (FRR), customers can use the simpler, built-in mechanisms of Segment Routing Transport Independent Loop Free Alternative (TI-LFA).CostFor many operators, bandwidth scales faster than revenue, so the core must be cost effective to build and operate. The lower cost of 100GE routers and line cards, fueled by the speed and power optimization of new silicon offerings, have accelerated the adoption of 100GE in high end routing. Today, simple, high-density routers can provide up to 57.6 Tbps in a single chassis at a fraction of the cost per 100 GigabitEthernet interface as on previous generations of silicon/network-processors. The increase in capacity enables providers to consider reducing or eliminating complex features (like QoS) that were developed to manage limited capacity. It is also possible to build fabrics out of smaller, simpler systems to achieve capex flexibility# you add more capacity when you need it. To scale the cost of operations, automation is the only answer.AutomationManual operations are expensive and don’t scale. The next-gen core design must be fully automatable, starting with data models and APIs in the network routers. Additional tools can provide layers of abstraction that make it easier for operator’s OSS/BSS to quickly translate business intent to network operation.Network SimplificationSimplicity scales; complexity does not. By reducing the amount of state and number of protocols in the core, operators can deploy networks that are easier to scale and simpler to operate. Simplification also enables automation# the simpler and more uniform the design, the easier it is to automate. Optimized control-plane elements and feature sets enhance stability and availability. Dedicating nodes to specific functions of the network also helps isolate the rest of the network from malicious behavior, defects, or instability.High-Level DesignThe Core design incorporates high-density routers and a small, recommended feature set delivered through IOS-XR to solve the needs of service providers. The Core design also includes ways to monitor the health and operational status of the core and assist providers in migration. All Cisco SP Validated designs are both feature tested and validated as a complete design.Topology# Scale Out and Scale UpThe gold standard of SP core deployments in the classic two-node setup, where two chassis connect to each other and the Metro or Peering elements in a redundant fashion. In very small deployments, the P and PE functions may be collapsed into a single device.In such a setup, scale is achieved by “scaling up” (aka vertical scaling). 
In a scale-up model, you add more capacity by replacing small chassis with large chassis or adding higher-density line cards to modular chassis.The ultimate “scale up” topology is a multi-chassis cluster that can add entire line card chassis to support more connections.While scale-up systems are suitable for many applications, failures and operational issues can be difficult to troubleshoot and repair. Moreover, when there are only one or two large boxes in a role, the “blast radius” of an outage can be substantial. The large blast radius means that scale-up systems require lengthy planning and timeframes for system upgrades. To address these concerns, the core design offers a horizontally scalable option with the goal of improving availability while simplifying failure types to more deterministic interface or node failures. Horizontal scaling is sometimes referred to as “scaling out.” It also enables throughput scaling beyond what is possible with today’s largest chassis.Scale out can be as simple as adding more standalone boxes in parallel (e.g. 4 or 8 smaller routers instead of the traditional 2).Having more, smaller routers increases the amount of connectivity for the metro and peering while reducing the blast radius of a single failure. A single router can fail (or be taken out of service for upgrade) with a much smaller impact on the network. While scale out results in more boxes to manage, automation can be used to reduce complexity and ensure consistency across the network.The reference topology for the Core design supports the standard 1+1 deployment, the collapsed P/PE deployment for small deployments, and a simple scale-out.PlatformsThe Cisco NCS5500 platform is ideal for backbone core routing, given its high-density, large RIB and FIB scale, buffer capacity, and IOS-XR software feature set. The NCS5500 is also space and power efficient with 36x100GE in a 1RU fixed form factor or single modular line card.A minimal core can provide 36x100GE, 144x10GE, or a mix of non-blocking connections with full resiliency in 4RU. The fabric can also scale to support tens of terabits of capacity in a single rack for large core deployments. Fixed chassis are ideal for incrementally building a fabric# the NCS NC55-36X100GE-A-S and NC55A1-24H are efficient high density building blocks which can be rapidly deployed as needed without installing a large footprint of devices day one. Deployments needing more capacity or interface flexibility can utilize the NCS5504 4-slot, NCS5508 8-slot or NCS5516 16-slot modular chassis. If the network has a need for multicast or other advanced features, the ASR9000 family or other node can be incorporated into the design.All NCS5500 routers also contain powerful Route Processors to unlock telemetry and programmability. The fixed chassis contain 1.6GHz 8-core processors and 32GB of RAM. The latest NC55-RP-E for the modular NCS5500 chassis has a 1.9GHz 6-core processor and 32GB of RAM.Control-PlaneThe core design introduces a simplified control-plane built on Segment Routing that supports scale and efficiency in transport.Segment Routing extensions advertise MPLS labels for devices (Prefix-SIDs) and links (Adjacency-SIDs) as part of the IGP, which enables a simplified forwarding plane, eliminates the need for LDP and RSVP, and increases scale and stability by eliminating the heavy control-plane state a router must otherwise manage to maintain an RSVP-TE mesh. 
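The configuration footprint of this control plane is intentionally small. The sketch below is a condensed illustration only — the SRGB range, instance name, and SID values are placeholders, and the full validated configuration appears later in the Low Level Design section — showing a node SID assigned under Loopback0 and, optionally, an anycast SID configured identically on a set of routers that share a second loopback (the same pattern used for the anycast loopbacks in the Metro design).

segment-routing global-block 17000 19000
!
router isis ISIS-CORE
 address-family ipv4 unicast
  segment-routing mpls
 !
 interface Loopback0
  address-family ipv4 unicast
   ! node SID, unique per router
   prefix-sid absolute 17001
  !
 !
 interface Loopback100
  address-family ipv4 unicast
   ! anycast SID, identical on every router sharing Loopback100 (illustrative)
   prefix-sid absolute 18900
  !
 !
!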
With Segment Routing, traffic can be source-routed to a node SID or an anycast SID representing a set of nodes. ECMP behavior is preserved at each point in the network and redundancy is simplified.Because each node in the domain calculates a primary and backup path, Segment Routing automatically provides a Fast ReRoute (FRR) mechanism using Topology Independent Loop Free Alternative (TI-LFA). TI-LFA requires no additional signaling protocol and typically provides convergence times below 50 ms with 100% link and node protection. More information on Segment Routing technology and its future evolution can be found at http#//segment-routing.netTelemetryThe core design uses the rich telemetry available in IOS-XR and the NCS5500 platform to enable an unprecedented level of insight into network and device behavior. The Cisco SP Validated Core leverages Model-Driven Telemetry and NETCONF along with both standard and native YANG models for metric statistics collection. Telemetry configuration and applicable sensor paths have been identified to assist providers in knowing what to monitor and how to monitor it. Through streaming data mechanisms such as Model-Driven Telemetry, providers can extract data useful for operations, capacity planning, security, and many other use cases.The core also fully supports traditional collections methods such as SNMP, and NETCONF using YANG models to integrate with legacy systems.Other telemetry mechanisms such as Netflow and BMP have limited applicability in the Core and are not in the scope of this design.AutomationNETCONF and YANG using OpenConfig and native IOS-XR data models are used to help automate configuration and validation. Cisco has developed Network Service Orchestrator (NSO) services to help automate common Segment Routing migration tasks using NETCONF NEDs.Validated DesignThe control, management, and forwarding planes in this design have undergone validation testing to ensure individual design features work as intended and the peering fabric as a whole performs without fault. Validation is done exceeding real-world scaling requirements to ensure the design fulfills its rule in existing networks with room for future growth.Use CasesLDP to SR Core MigrationMany customers run LDP in the core today. To simplify and scale, one of the most important steps is to migrate from LDP to Segment Routing (SR). The benefits of SR are clear# protocol simplification, simplified resiliency, and multi-domain programmability. What is less clear is how to accomplish the migration. Greenfield networks are rare and pockets of LDP will continue to exist in most networks for a long time to come. Cisco SP Validated Core approaches SR migration in a series of steps that are intended to maintain existing functionality while gradually enabling SR and transitioning traffic to SR LSPs. Each migration step has an NSO service profile associated with it to ensure a consistent, best-practice implementation.While end-to-end SR is the goal, it may not always be possible at a given point in time. Customers may be looking to refresh the P routers and not the PE routers. Legacy PE routers may not support SR even if new PEs do. 
Therefore, it is important to find value at each step of the migration and maintain support for non-SR PEs through the process.Starting PointThis migration use case assumes that the Core is already configured as a functional MPLS network with the following characteristics# Single instance of ISIS as the IGP LDP for label distribution BGP-free core Working L2/L3 VPN services from PE to PE using BGPThe goal of the migration is to preserve existing services while migrating the core in an incremental, validated, step-by-step fashion.Step 1# Enable SRIn the first step of migration, SR is enabled on the P routers using CLI or (preferably) NETCONF/YANG. The latter can be orchestrated using the NSO sr service, ensuring that# Every router uses the same ISIS instance, loopback interface and global block of labels. Every router is assigned a unique prefix-SID from the global block of labels. The SR service can be rolled out across multiple devices in a transactional fashion and seamlessly rolled back if needed.Deploying SR by itself in this way will not impact the forwarding plane in any way# the P routers will continue to use LDP labels to forward traffic until 1) the PE routers use SR or 2) LDP is disabled in the core. Nevertheless, it is possible to deeply validate the operation of SR even while still using LDP forwarding. The following validations can be performed via CLI or (preferably) a model-based query method such as NETCONF/YANG, YDK, or NSO’s live-status capability# Each router successfully assigns the desired block from the label database. The IGP is advertising SR labels for every SR-enabled router’s loopback. Each router is programming the SR labels in the RIB and FIB. Traffic is forwarded using LDP labels.YANG-modeled operational data can also be streamed using model-driven telemetry. In this example, model driven telemetry is streaming Cisco-IOS-XR-mpls-lsd-oper#mpls-lsd/label-summary data, making it easy to see that the number of labels assigned to ISIS jumps to 2000 when the SR global block of labels is configured.Step 2# Enable TI-LFAIn this step, TI-LFA is configured for link protection. The NSO ti-lfa service leverages the same resource pools as the “sr” service and greatly simplifies the configuration process by enabling TI-LFA under every non-loopback interface in the given ISIS instance.As soon as it is enabled, TI-LFA protects IP, LDP and SR traffic. This means that all traffic in the Core now has the benefit of sub-50 millisecond convergence times without complicated RSVP-TE tunnels. Network availability is improved even before the primary forwarding plane is switched to SR.The operation of TI-LFA should be validated to ensure that all paths are protected. YANG models can be used to retrieve or stream the relevant operational data.In the example below, model driven telemetry is streaming Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/topologies/topology/frr-summary, making it easy to see that the number of protected paths increases when TI-LFA is configured.Step 3# Enable Mapping ServersIn this step, mapping servers are configured to provide SR labels for LDP-only endpoints, specifically the loopback addresses of non-SR PEs. Mapping servers can be configured anywhere in the network. At least two mapping servers should be configured for redundancy.This step can be achieved through CLI or NETCONF/YANG. 
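In CLI form the mapping-server function is only a few lines. The sketch below uses illustrative values (the same ones reused in the Low Level Design section): the prefix-sid-map entry assigns SIDs starting at 18500 to 500 consecutive /32 loopbacks beginning at 192.168.0.100, and advertise-local injects those mappings into the IGP.

router isis ISIS-CORE
 address-family ipv4 unicast
  ! advertise the locally configured SID mappings into ISIS
  segment-routing prefix-sid-map advertise-local
 !
!
segment-routing
 mapping-server
  prefix-sid-map
   address-family ipv4
    ! non-SR PE loopbacks 192.168.0.100/32 onward mapped to SIDs 18500-18999
    192.168.0.100/32 18500 range 500
   !
  !
 !
!
! Repeat the same configuration on a second node for mapping-server redundancy.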
To automate the process, use the NSO sr-ms service which leverages the same infrastructure as the “sr” services to simplify and ensure consistency in the configuration process.At the end of this step, the P routers will have SR labels for all P and PE routers. However, the VPN services will still use the LDP LSPs from end-to-end since the non-SR PE routers still initiate the service with an LDP label.To validate the Mapping Server configuration, check that the non-SR endpoint addresses are represented by labels in the IGP, RIB and FIB as in Step 1. In addition, specific prefixes can be queried on the mapping server.Step 4# Protocol SimplificationOnce the P routers are fully configured for SR, LDP can optionally be disabled on a link-by-link basis for every link pair that has SR enabled on each end. When this step is accomplished, the P routers will use the SR label for the path across the core. The benefit of this step is fewer protocols to maintain and troubleshoot in the core. There should be no impact to the VPN services when the transition is made.The disable-ldp service can be used in NSO to orchestrate this step on a link-by-link basis. Telemetry can be used to track the impact of disabling LDP on core-facing interfaces using the Cisco-IOS-XR-mpls-ldp-oper#mpls-ldp/global/active/default-vrf/summary path as shown below.Some customers may choose not to disable LDP in the core until all PE routers have been migrated to SR as well, creating an end-to-end S deployment. In that case, LDP provides the primary forwarding path while SR provides TI-LFA until the rest of the network is ready to switch to SR.LDP over RSVP-TE to SR MigrationCustomers with LDP cores who need fast-recovery in the event of a failure often deploy RSVP-TE for fast reroute (FRR). A common design pattern is to create a mesh of TE tunnels among P routers and tunnel LDP over it. A P router mesh does not provide an end-to-end solution for fast-recovery but it is more scalable than a PE router mesh. But even a P router mesh can require a substantial amount of configuration as the number of tunnels scales as the square of the number of P routers. Optimizations such as auto-tunnel mesh groups can be used to simplify the configuration. Even so, the amount of RSVP state that the network is required to maintain even for a P router mesh is substantial.Migrating from LDP over RSVP to Segment Routing in the Core provides an exceptionally good value proposition. Because TI-LFA provides fast-route for all traffic, customers can remove their RSVP-TE FRR configuration and all the associated complexity and state while maintaining sub 50-milisecond convergence. Moreover, SR with TI-LFA enables 100% coverage and micro-loop avoidance.To achieve this use case, follow the same steps as above, disabling RSVP-TE in the core when all the core routers have been enabled for SR.LDP to SR End-to-End MigrationIn the case where the PE devices are SR-capable, the previous use cases can be extended to run SR end-to-end. This can be done incrementally on a PE by PE basis until all PEs are migrated. The benefits of this additional step include end-to-end TI-LFA and further reduction in LDP maintenance. In addition, once end-to-end SR transport has been implemented, the Core is ready to integrate with other SR designs in the Metro and Peering.The goal of this use case is the same as before# enable full or partial migration of the PE devices without service disruption.Step 1# Enable SR on the PEsEnabling SR on the PEs will happen in two stages. 
First, SR will be enabled but not preferred. This means that the PEs will learn SR labels for all the endpoints in the network but still initiate the VPN services using LDP labels. This step can be accomplished using the same NSO “sr” service as the P routers in the previous use case, and validation follows the same steps as for the P routers.Step 2# Prefer SR on the PEsFinally, the PEs will be configured for “sr-prefer” one by one, gradually transitioning the traffic to end-to-end SR. The NSO “sr” service includes an option for sr-prefer. LDP can then be disabled on the PEs as well.Low Level DesignValidation TopologyThe Core validation topology included three types of core sites# a standard 2 P x 2 PE design, a collapsed 2xP/PE design, and a scale-out design with 4xP routers.Hardware DetailThe NCS5500 family of routers provides high density, ideal buffer sizes, and environmental efficiency for core routing use cases. All of the following platforms can be used in the Core designs above. Further detailed information on each platform can be found at https#//www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html.NCS-55A1-36HThe 55A1-36H is a second generation 1RU NCS5500 fixed platform with 36 100GE QSFP28 ports operating at line rate. All the ports can support 100GE and 40GE optics as well as 25GE and 10GE breakout. It also contains a powerful multi-core route processor with 64GB of RAM and an on-board 64GB SSD. Its high density, efficiency, and buffering capability make it ideal in 10GE or 100GE deployments.NCS-55A1-24HThe NCS-55A1-24H is a second generation 1RU NCS5500 fixed platform with 24 100GE QSFP28 ports. It uses two 900GB NPUs, with 12x100GE ports connected to each NPU.NCS 5504 and 5508 Modular Chassis and NC55-36X100G line cardLarge deployments or those needing interface flexibility such as IPoDWDM connectivity can use the modular NCS5500 series chassis.Control PlaneThe Core uses a single instance of ISIS that encompasses all PE and P devices, with BFD for fault protection. BGP VPNv4/v6 runs between the PEs to provide services. For label distribution, LDP and/or RSVP-TE runs between the P devices and LDP runs between the P and PE devices, with the ultimate goal of transitioning all label distribution to Segment Routing.ConfigurationThe following configuration guidelines will step through the major components of the device and protocol configuration specific to SR migration in the Core. Only the net-new configuration for SR is included. It is assumed that an ISIS instance, as well as LDP, is fully configured and operational across all nodes.CLI examples are given here for readability. The equivalent NETCONF/YANG examples (preferred for automation) are in the appendix.
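The appendix additionally documents NSO's northbound RESTCONF API using curl. As a hedged illustration of driving that same API from a script, the sketch below reissues the appendix's "Create SR Service" request with the Python requests library; the NSO address, credentials, and service/device names are placeholders taken from that example.

```python
# Minimal sketch, assuming reachability to an NSO instance with the "sr" service
# package installed. Address, credentials, and device name are placeholders; the
# URL and XML body mirror the "Create SR Service (RESTCONF)" curl example in the
# appendix.
import requests

NSO_URL = "http://nso.example.net:8080/restconf/data/services"

SR_SERVICE = """
<sr xmlns="http://cisco.com/tailf/sr">
  <name>Denver</name>
  <router>
    <device-name>P3</device-name>
    <prefix-preference><auto-assign-prefix-sid/></prefix-preference>
    <instance-preference><use-sr-infrastructure/></instance-preference>
  </router>
</sr>
"""

resp = requests.post(
    NSO_URL,
    data=SR_SERVICE,
    auth=("admin", "admin"),  # placeholder credentials
    headers={"Content-Type": "application/yang-data+xml"},
)
resp.raise_for_status()  # 201/204 indicates the service transaction was committed
```

Because NSO applies the request as a single transaction, a failure on any device rolls the whole change back, which is the behavior the migration steps above rely on.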
Ideally, these configurations would be deployed via NETCONF/YANG using NSO service packages as described in the next section.Full configurations used in the validation testing are available on GitHub.Enable Segment Routing in ISISrouter isis ISIS-CORE net 49.1921.6800.0005.00 segment-routing global-block 17000 19000 address-family ipv4 unicast segment-routing mplsinterface Loopback0 address-family ipv4 unicast prefix-sid absolute 17000Enable TI-LFArouter isis ISIS-CORE interface Bundle101* address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa*repeat for each interface in the IGPEnable Mapping Serverrouter isis ISIS-CORE address-family ipv4 unicast segment-routing prefix-sid-map advertise-localsegment-routing mapping-server prefix-sid-map address-family ipv4 192.168.0.100/32 18500 range 500Disable LDPmpls ldp no interface g0/0/0/2AutomationThe configuration tasks required for the migration use cases are encapsulated in NSO resource-pools and service packages as summarized below. To download the service templates, visit the Devnet NSO Developer Forum. For examples of how to configure these services using the NSO Northbound RESTCONF API, see the Appendix. Name Purpose Example (ncs_cli) id-pool Resource-pool for common global block of SR labels. resource-pools id-pool SRGB-POOL1 range start 17000 end 19000 sr-infrastructure Associates an IGP Instance, a Loopback and a global block of labels sr-infrastructure instance-name ISIS-COREloopback 0sr-global-block-pools SRGB-POOL1 sr Defines an sr service. services sr DENVERrouter P3 instance-preference use-sr-infrastructureprefix-preference auto-assign-prefix-sid ti-lfa Defines a TI-LFA service. services ti-lfa DENVER-LFA address-family ipv4router P3 instance-name-preference use-sr-infrastructure interface-preference all-interfaces sr-ms Defines a service for creating SR Mapping Servers. services sr-ms MAP-SERV-1router P3 instance-name-preference use-sr-infrastructure address-family ipv4 ipv4-address 192.168.0.1 prefix-length 32 first-sid-value 25 number-of-allocated-sids 100 disable-ldp Defines a service for disabling LDP on a link-by-link basis. services disable-ldp 102router P3 interface-type HundredGigE interface-id 0/0/0/4 ValidationSR ValidationThe following table shows a series of validation steps. Operational commands are provided in CLI for readability. Operational YANG models are provided in the Appendix. Component Validation Common CLI Label Database SRGB Label Range Has Been Allocated to ISIS show mpls label table summary show mpls label table label 17000 detail IGP IGP Advertises Labels for Every SR Router’s Loopback show isis segment-routing label table RIB SR Labels are Programmed in RIB show route <address/prefix> detail FIB SR Labels are Programmed in FIB show mpls forwarding labels <label> Forwarding Traffic is Forwarded Using LDP Labels traceroute <address>traceroute sr-mpls <address/prefix> TI-LFA ValidationCLI is given below for readability. Operational YANG models are provided in the Appendix. Component Validation Common CLI IGP Every Prefix Has a Backup Path show isis fast-reroute IGP Number of Paths Protected show isis fast-reroute summary Mapping Server ValidationCLI is given below for readability. Operational YANG models are provided in the Appendix. Component Validation Common CLI SR Non-SR Endpoints Have Mapping Entries show segment-routing mapping-server prefix-sid-map ipv4 <address/prefix> Model-Driven TelemetryThe configuration below creates two sensor groups and two subscriptions.
Migration-Summary tracks important summary statistics when migrating to Segment Routing. The Interface sensor group contains commonly used interface counters. Putting the two sensor-groups in different subscriptions ensures that each one is assigned its own thread for more efficient collection. A single destination is used, however multiple destinations could be configured. The sensors and timers provided are for illustration only.telemetry model-driven destination-group DEST1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! !sensor-group Migration-Summary sensor-path Cisco-IOS-XR-mpls-lsd-oper#mpls-lsd/label-summary sensor-path Cisco-IOS-XR-mpls-ldp-oper#mpls-ldp/global/active/default-vrf/summary sensor-path Cisco-IOS-XR-fib-common-oper#mpls-forwarding/nodes/node/forwarding-summary sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/topologies/topology/frr-summary !sensor-group Interface-Counters sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters subscription Migration sensor-group-id Migration-Summary sample-interval 30000 destination-id DEST1subscription Interface sensor-group-id Interface-Counters sample-interval 30000 destination-id DEST1AppendixApplicable YANG ModelsModelDataopenconfig-interfacesCisco-IOS-XR-infra-statsd-operCisco-IOS-XR-pfi-im-cmd-operInterface config and state Common counters found in SNMP IF-MIB openconfig-if-ethernet Cisco-IOS-XR-drivers-media-eth-operEthernet layer config and stateXR native transceiver monitoringopenconfig-platformInventory, transceiver monitoring openconfig-telemetryConfigure telemetry sensors and destinations Cisco-IOS-XR-ip-bfd-cfg Cisco-IOS-XR-ip-bfd-operBFD config and state Cisco-IOS-XR-ethernet-lldp-cfg Cisco-IOS-XR-ethernet-lldp-operLLDP config and state openconfig-mplsMPLS config and state, including Segment RoutingCisco-IOS-XR-clns-isis-cfgCisco-IOS-XR-clns-isis-operIS-IS config and state Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-operNCS 5500 HW resources Cisco-IOS-XR-mpls-lsd-operMPLS Label Switch Database state data Cisco-IOS-XR-mpls-ldp-cfg Cisco-IOS-XR-mpls-ldp-operLDP config and state Cisco-IOS-XR-fib-common-operPlatform Independent FIB State Cisco-IOS-XR-ip-rib-ipv4-operPlatform Independent FIB State XML Configuration ExamplesEnable Segment Routing (XML)<isis xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-cfg~> <instances> <instance> <instance-name>ISIS-CORE</instance-name> <srgb> <lower-bound>17000</lower-bound> <upper-bound>19000</upper-bound> </srgb> <afs> <af> <af-name>ipv4</af-name> <saf-name>unicast</saf-name> <af-data> <segment-routing> <mpls>ldp</mpls> </segment-routing> </af-data> </af> </afs> <interfaces> <interface> <interface-name>Loopback0</interface-name> <interface-afs> <interface-af> <af-name>ipv4</af-name> <saf-name>unicast</saf-name> <interface-af-data> <prefix-sid> <type>absolute</type> <value>17000</value> </prefix-sid> </interface-af-data> </interface-af> </interface-afs> <running/> </interface> </interfaces> <running/> </instance> </instances></isis>Enable TI-LFA (XML)<isis xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-cfg~> <instances> <instance> <instance-name>ISIS-CORE</instance-name> <interfaces> 
<interface> <interface-name>HundredGigE0/0/0/0</interface-name> <interface-afs> <interface-af> <af-name>ipv4</af-name> <saf-name>unicast</saf-name> <interface-af-data> <interface-frr-table> <frrtilfa-types> <frrtilfa-type> <level>not-set</level> </frrtilfa-type> </frrtilfa-types> <frr-types> <frr-type> <level>not-set</level> <type>per-prefix</type> </frr-type> </frr-types> </interface-frr-table> </interface-af-data> </interface-af> </interface-afs> <running/> </interface> </interfaces> <running/> </instance> </instances></isis>Enable Mapping Server (XML)<isis xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-cfg~> <instances> <instance> <instance-name>ISIS-CORE</instance-name> <afs> <af> <af-name>ipv4</af-name> <saf-name>unicast</saf-name> <af-data> <segment-routing> <prefix-sid-map> <advertise-local/> </prefix-sid-map> <mpls>ldp</mpls> </segment-routing> </af-data> </af> </afs> <running/> </instance> </instances></isis> <sr xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-segment-routing-ms-cfg~> <mappings> <mapping> <af>ipv4</af> <ip>192.168.0.100</ip> <mask>32</mask> <sid-start>18500</sid-start> <sid-range>500</sid-range> </mapping> </mappings> <enable/></sr>Disable LDP (XML)<mpls-ldp xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-mpls-ldp-cfg~> <default-vrf> <interfaces> <interface tags=~delete~> <interface-name>HundredGigE0/0/0/0</interface-name> </interface> </interfaces> </default-vrf> </mpls-ldp>NSO SR Service Creation via Northbound RESTCONF API ExamplesThe following examples show how to configure the SR services and resources using the northbound RESTCONF API on NSO.Create SRGB (RESTCONF)curl -X POST \\ http#//X.X.X.X#8080/restconf/data/resource-pools \\ -H 'Authorization# Basic **************' \\ -H 'Cache-Control# no-cache' \\ -H 'Content-Type# application/yang-data+xml' \\ -d '<id-pool xmlns=~http#//tail-f.com/pkg/id-allocator~> <name>SRGB-POOL1</name> <range> <start>17000</start> <end>19000</end> </range> </id-pool>'Create SR-Infrastructure (RESTCONF)curl -X POST \\ http#//X.X.X.X#8080/restconf/data \\ -H 'Authorization# Basic **************' \\ -H 'Cache-Control# no-cache' \\ -H 'Content-Type# application/yang-data+xml' \\ -d '<sr-infrastructure xmlns=~http#//cisco.com/ns/tailf/cf-infra~ xmlns#y=~http#//tail-f.com/ns/rest~ xmlns#cfinfra=~http#//cisco.com/ns/tailf/cf-infra~> <sr-global-block-pools> <name>SRGB-POOL1</name> </sr-global-block-pools> <instance-name>ISIS-CORE</instance-name> <loopback>0</loopback></sr-infrastructure>'Create SR Service (RESTCONF)curl -X POST \\ http#//X.X.X.X#8080/restconf/data/services \\ -H 'Authorization# Basic **************' \\ -H 'Cache-Control# no-cache' \\ -H 'Content-Type# application/yang-data+xml' \\ -d '<sr xmlns=~http#//cisco.com/tailf/sr~> <name>Denver</name> <router xmlns=~http#//cisco.com/tailf/sr~> <device-name>P3</device-name> <prefix-preference> <auto-assign-prefix-sid/> </prefix-preference> <instance-preference> <use-sr-infrastructure/> </instance-preference> </router> <router xmlns=~http#//cisco.com/tailf/sr~> <device-name>P4</device-name> <prefix-preference> <auto-assign-prefix-sid/> </prefix-preference> <instance-preference> <use-sr-infrastructure/> </instance-preference> </router> <router xmlns=~http#//cisco.com/tailf/sr~> <device-name>P31</device-name> <prefix-preference> <auto-assign-prefix-sid/> </prefix-preference> <instance-preference> <use-sr-infrastructure/> </instance-preference> </router> </sr>'YANG Models for SR Operational DataThe following models show the relevant YANG data models for retrieving operational data
about the SR deployment.ComponentValidationModel SubstringLabel DatabaseSRGB Label Range Has Been Allocated to ISISCisco-IOS-XR-mpls-lsd-oper.yang mpls-lsd/label-summaryIGPIGP Is Advertising SR LabelsCisco-IOS-XR-clns-isis-oper.yang isis/instances/instance/topologies/topology/ipv4-routes/ipv4-route/native-status/native-details/primary/source/nodal-sidRIBSR Labels Are Programmed in RIBCisco-IOS-XR-ip-rib-ipv4-oper.yang rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes/routeFIBSR Labels Are Programmed in FIBCisco-IOS-XR-fib-common-oper.yang mpls-forwarding/nodes/node/label-fib/forwarding-details/forwarding-detail ForwardingTraffic is Forwarded using LDP LabelsNot available. Use “traceroute [mpls | sr-mpls]” CLI to validate forwarding.Example Usage (IGP Verification)The following RPC for NETCONF retrieves a list of prefixes in the ISIS database that have nodal-SIDs, thereby validating that the ISIS is correctly advertising SR labels.<get xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <filter> <isis xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-oper~> <instances> <instance> <topologies> <topology> <ipv4-routes> <ipv4-route> <native-status> <native-details> <primary> <source> <nodal-sid> <sid-value/> </nodal-sid> </source> </primary> </native-details> </native-status> </ipv4-route> </ipv4-routes> </topology> </topologies> <instance-name/> </instance> </instances> </isis> </filter></get>A partial sample of the returned data is shown below. Note that the SID value (“5”) returned in this model is an index, not an absolute value.<data xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <isis xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-oper~> <instances> <instance> <instance-name>ISIS-CORE</instance-name> <topologies> <topology> <af-name>ipv4</af-name> <saf-name>unicast</saf-name> <ipv4-routes> <ipv4-route> <prefix>192.168.0.1</prefix> <prefix-length>32</prefix-length> <native-status> <native-details> <primary> <source> <nodal-sid> <sid-value>5</sid-value> </nodal-sid> </source> </primary> </native-details> </native-status> </ipv4-route> </ipv4-routes> </topology> </topologies> </instance> </instances> </isis> </data>This same data can be retrieved using NSO’s live-status feature from CLI#admin@ncs# show devices device P31 live-status clns-isis-oper#isis instances instance topologies topology ipv4-routes ipv4-route native-status native-details primary sourceAnd via a RESTCONF call to NSO’s northbound API#curl -X GET \\ 'http#//X.X.X.X#8080/restconf/data/devices/device=P3/live-status/Cisco-IOS-XR-clns-isis-oper#isis/instances/instance=ISIS-CORE/topologies/topology?fields=ipv4-routes/ipv4-route/native-status/native-details/primary/source/nodal-sid/sid-value&depth=5' \\ -H 'Accept# application/vnd.yang.collection+json' \\ -H 'Cache-Control# no-cache' \\ -H 'Content-Type# application/yang-data+xml' \\YANG Models for TILFA Operational DataComponentValidationModel SubstringIGPGet A List of Every Prefix with a Backup PathCisco-IOS-XR-clns-isis-oper.yang isis/instances/instance/topologies/topology/ipv4frr-backups/ipv4frr-backup/prefixIGPNumber of Paths ProtectedCisco-IOS-XR-clns-isis-oper.yang isis/instances/instance/topologies/topology/frr-summaryExample Usage (IGP Backup Routes)The following query retrieves the prefixes that have a backup route. 
The <prefix/> filter can be removed for more detail.<get xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <filter> <isis xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-oper~> <instances> <instance> <topologies> <topology> <ipv4frr-backups> <ipv4frr-backup> <prefix/> </ipv4frr-backup> </ipv4frr-backups> </topology> </topologies> <instance-name/> </instance> </instances> </isis> </filter></get>An example of the returned data is shown below.<data xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <isis xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-oper~> <instances> <instance> <instance-name>ISIS-CORE</instance-name> <topologies> <topology> <af-name>ipv4</af-name> <saf-name>unicast</saf-name> <ipv4frr-backup> <prefix>192.168.0.1</prefix> <prefix-length>32</prefix-length> </ipv4frr-backup> <ipv4frr-backup> <prefix>192.168.0.2</prefix> <prefix-length>32</prefix-length> </ipv4frr-backup> <ipv4frr-backup> <prefix>192.168.0.3</prefix> <prefix-length>32</prefix-length> </ipv4frr-backup> <ipv4frr-backup> <prefix>192.168.0.4</prefix> <prefix-length>32</prefix-length> </ipv4frr-backup> </ipv4frr-backups> </topology> </topologies> </instance> </instances> </isis> </data>YANG Models for SR Mapping Server Operational DataComponentValidationModel SubstringMapping ServerNon-SR prefixes have assigned SIDsCisco-IOS-XR-segment-routing-ms-oper.yang srms/mapping/mapping-ipv4IGPIGP Is Advertising SR LabelsCisco-IOS-XR-clns-isis-oper.yang isis/instances/instance/topologies/topology/ipv4-routes/ipv4-route/native-status/native-details/primary/source/nodal-sidRIBSR Labels Are Programmed in RIBCisco-IOS-XR-ip-rib-ipv4-oper.yang rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes/routeFIBSR Labels Are Programmed in FIBCisco-IOS-XR-fib-common-oper.yang mpls-forwarding/nodes/node/label-fib/forwarding-details/forwarding-detailExample Usage (Mapping Server Verification)The following query can be qualified with a specific IP address and prefix (e.g. the loopback addresses of non-SR nodes in the network) to retrieve the mapped SID index for that address. Note that this query must be done against the node configured as a mapping server.<get xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <filter> <srms xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-segment-routing-ms-oper~> <mapping> <mapping-ipv4> <mapping-mi> <prefix>32</prefix> <ip>192.168.0.7</ip> <sid-start/> </mapping-mi> </mapping-ipv4> </mapping> </srms>An example of the returned data is shown below. 
Note that the SID value (“sid-start”) returned in this model is an index, not an absolute value.<data xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <srms xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-segment-routing-ms-oper~> <mapping> <mapping-ipv4> <mapping-mi> <ip>192.168.0.7</ip> <prefix>32</prefix> <sid-start>31</sid-start> </mapping-mi> </mapping-ipv4> </mapping> </srms>For More InformationNSONSO on DevNetNSO Example Services for Segment RoutingSimpler Segment Routing with NSO – The Inside StoryIOS-XRValidation Test Device ConfigurationsData-Model OverviewModel-Driven ProgrammabilityYANG Models by Release", "url": "/blogs/latest-core-fabric-hld", "author": "Shelly Cadora", "tags": "iosxr, Design, Core" } , "blogs-2018-10-01-metro-fabric-hld": { "title": "Metro Fabric High Level Design", "content": " On This Page Revision History Value Proposition Summary Technical Overview Transport – Design Use Cases Intra-Domain Intra-Domain Routing and Forwarding Intra-Domain Forwarding - Fast Re-Route Inter-Domain Inter-Domain Forwarding Area Border Routers – Prefix-SID vs Anycast-SID Inter-Domain Forwarding - Label Stack Optimization Inter-Domain Forwarding - High Availability and Fast Re-Route Transport Programmability Traffic Engineering (Tactical Steering) – SRTE Policy Transport Controller Path Computation Engine (PCE) Segment Routing Path Computation Element (SR-PCE) WAN Automation Engine (WAE) PCE Controller Summary – SR-PCE & WAE Path Computation Engine – Workflow Delegated Computation to SR-PCE WAE Instantiated LSP Delegated Computation to WAE Device Automation Zero Touch Provisioning Services – Design Overview Ethernet VPN (EVPN) Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or with Data Center End-To-End (Flat) – Services Hierarchical – Services Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHE Services – Route-Reflector (S-RR) Network Services Orchestrator (NSO) Metro Fabric Supported Service Models Transport and Services Integration The Compass Metro Fabric Design – Phase 1 Transport - Phase 1 Transport Programmability – Phase 1 Services – Phase 1 Transport and Services Integration – Phase 1 The Compass Metro Fabric Design - Summary Revision History Version Date Comments 1.0 05/08/2018 Initial Metro Fabric publication 1.5 09/24/2018 NCS540 Access, ZTP, NSO Services       Value PropositionService Providers are facing the challenge to provide next generationservices that can quickly adapt to market needs. New paradigms such as5G introduction, video traffic continuous growth, IoT proliferation andcloud services model require unprecedented flexibility, elasticity andscale from the network. Increasing bandwidth demands and decreasing ARPUput pressure on reducing network cost. At the same time, services needto be deployed faster and more cost effectively to stay competitive.Metro Access and Aggregation solutions have evolved from nativeEthernet/Layer 2 based, to Unified MPLS to address the above challenges.The Unified MPLS architecture provides a single converged networkinfrastructure with a common operational model. It has great advantagesin terms of network convergence, high scalability, high availability,and optimized forwarding. 
However, that architectural model is stillquite challenging to manage, especially on large-scale networks, becauseof the large number of distributed network protocols involved whichincreases operational complexity.Compass Metro Fabric (CMF) design introduces an SDN-ready architecturewhich evolves traditional Metro network design towards an SDN enabled,programmable network capable of delivering all services (Residential,Business, 5G Mobile Backhauling, Video, IoT) on the premise ofsimplicity, full programmability, and cloud integration, with guaranteedservice level agreements (SLAs).The Compass Metro Fabric design brings tremendous value to the ServiceProviders# Fast service deployment and rapid time to market throughfully automated service provisioning and end-to-end networkprogrammability Operational simplicity with less protocols to operate and manage Smooth migration towards an SDN-ready architecture thanks tobackward-compatibility with existing network protocols and services Next generation services creation leveraging guaranteed SLAs Enhanced and optimized operations using telemetry/analytics inconjunction with automation tools The Compass Metro Fabric design is targeted at Service Providercustomers who# Want to evolve their existing Unified MPLS Network Are looking for an SDN ready solution Need a simple, scalable design that can support future growth Want a future proof architecture built ousing industry-leading technology SummaryThe Compass Metro Fabric design meets the criteria identified forcompass designs# Simple# based on Segment Routing as unified forwarding plane andEVPN and L3VPN as a common BGP based control plane Programmable# it uses SR-PCE to program end-to-end paths across thenetwork with guaranteed SLAs Automatable# service provisioning is fully automated using NSOand Yang models; analytics with model driven telemetry inconjunction with automation tools will be used in the future toenhance operations and network and services optimization Repeatable# it’s an evolution of the Unified MPLS architectureand based on standard protocols Technical OverviewThe Compass Metro Fabric design evolves from the successful CiscoEvolved Programmable Network (EPN) 5.0 architecture framework, to bringgreater programmability and automation.In the Compass Metro Fabric design, the transport and service are builton-demand when the customer service is requested. 
The end-to-endinter-domain network path is programmed through controllers and selectedbased on the customer SLA, such as the need for a low latency path.The Compass Metro Fabric is made of the following main buildingblocks# IOS-XR as a common Operating System proved in Service ProviderNetworks Transport Layer based on Segment Routing as UnifiedForwarding Plane SDN - Segment Routing Path Computation Element (SR-PCE) as Cisco Path ComputationEngine (PCE) coupled with Segment Routing to provide simple andscalable inter-domain transport connectivity and TrafficEngineering and Path control Service Layer for Layer 2 (EVPN) and Layer 3 VPN services basedon BGP as Unified Control Plane Automation and Analytics NSO for service provisioning Netconf/YANG data models Telemetry to enhance and simplify operations Zero Touch Provisioning and Deployment (ZTP/ZTD) By leveraging analytics collected through model driven telemetry on IOS-XR platforms, in conjunction with automation tools, Compass Metro Fabric provides Service Providers with enhancements in network and services operations experience.Transport – DesignUse CasesService Provider networks must adopt a very flexible design that satisfyany to any connectivity requirements, without compromising in stabilityand availability. Moreover, transport programmability is essential tobring SLA awareness into the network,The goals of the Compass Metro Fabric is to provide a flexible networkblueprint that can be easily customized to meet customer specificrequirements.To provide unlimited network scale, the Compass Metro Fabric isstructured into multiple IGP Domains# Access, Aggregation, and Core.Refer to the network topology in Figure 1.Figure 1# Distributed Central OfficeThe network diagram in Figure 2 shows how a Service Provider network canbe simplified by decreasing the number of IGP domains. In this scenariothe Core domain is extended over the Aggregation domain, thus increasingthe number of nodes in theCore.Figure 2# Distributed Central Office with Core domain extensionA similar approach is shown in Figure 3. In this scenario the Coredomain remains unaltered and the Access domain is extended over theAggregation domain, thus increasing the number of nodes in the Accessdomain.Figure 3# Distributed Central Office with Access domain extensionThe Compass Metro Fabric transport design supports all three networkoptions, while remaining easily customizable.The first phase of the Compass Metro Fabric, discussed later in thisdocument, will cover in depth the scenario described in Figure 3.Intra-DomainIntra-Domain Routing and ForwardingThe Compass Metro Fabric is based on a fully programmable transport thatsatisfies the requirements described earlier. The foundation technologyused in the transport design is Segment Routing (SR) with a MPLS basedData Plane in Phase 1 and a IPv6 based Data Plane (SRv6) in future.Segment Routing dramatically reduces the amount of protocols needed in aService Provider Network. Simple extensions to traditional IGP protocolslike ISIS or OSPF provide full Intra-Domain Routing and ForwardingInformation over a label switched infrastructure, along with HighAvailability (HA) and Fast Re-Route (FRR) capabilities.Segment Routing defines the following routing related concepts# Prefix-SID – A node identifier that must be unique for each node ina IGP Domain. Prefix-SID is statically allocated by th3 networkoperator. Adjacency-SID – A node’s link identifier that must be unique foreach link belonging to the same node. 
Adjacency-SID is typicallydynamically allocated by the node, but can also be staticallyallocated. In the case of Segment Routing with a MPLS Data Plane, both Prefix-SIDand Adjacency-SID are represented by the MPLS label and both areadvertised by the IGP protocol. This IGP extension eliminates the needto use LDP or RSVP protocol to exchange MPLS labels.The Compass Metro Fabric design uses ISIS as the IGP protocol.Intra-Domain Forwarding - Fast Re-RouteSegment-Routing embeds a simple Fast Re-Route (FRR) mechanism known asTopology Independent Loop Free Alternate (TI-LFA).TI-LFA provides sub 50ms convergence for link and node protection.TI-LFA is completely Stateless and does not require any additionalsignaling mechanism as each node in the IGP Domain calculates a primaryand a backup path automatically and independently based on the IGPtopology. After the TI-LFA feature is enabled, no further care isexpected from the network operator to ensure fast network recovery fromfailures. This is in stark contrast with traditional MPLS-FRR, whichrequires RSVP and RSVP-TE and therefore adds complexity in the transportdesign.Please refer also to the Area Border Router Fast Re-Route covered inSection# “Inter-Domain Forwarding - High Availability and Fast Re-Route” for additional details.Inter-DomainInter-Domain ForwardingThe Compass Metro Fabric achieves network scale by IGP domainseparation. Each IGP domain is represented by separate IGP process onthe Area Border Routers (ABRs).Section# “Intra-Domain Routing and Forwarding” described basic Segment Routing concepts# Prefix-SID andAdjacency-SID. This section introduces the concept of Anycast SID.Segment Routing allows multiple nodes to share the same Prefix-SID,which is then called a “Anycast” Prefix-SID or Anycast-SID. Additionalsignaling protocols are not required, as the network operator simplyallocates the same Prefix SID (thus a Anycast-SID) to a pair of nodestypically acting as ABRs.Figure 4 shows two sets of ABRs# Aggregation ABRs – AG Provider Edge ABRs – PE Figure 4# IGP Domains - ABRs Anycast-SIDFigure 5 shows the End-To-End Stack of SIDs for packets traveling fromleft to right through thenetwork.Figure 5# Inter-Domain LSP – SRTE PolicyThe End-To-End Inter-Domain Label Switched Path (LSP) was computed viaSegment Routing Traffic Engineering (SRTE) Policies.On the Access router “A” the SRTE Policy imposes# Local Aggregation Area Border Routers Anycast-SID# Local-AGAnycast-SID Local Provider Edge Area Border Routers Anycast-SID# Local-PEAnycast SID Remote Provider Edge Area Border Routers Anycast-SID# Remote-PEAnycast-SID Remote Aggregation Area Border Routers Anycast-SID# Remote-AGAnycast-SID Remote/Destination Access Router# Destination-A Prefix-SID#Destination-A Prefix-SID The SRTE Policy is programmed on the Access device on-demand by anexternal Controller and does not require any state to be signaledthroughout the rest of the network. The SRTE Policy provides, by simpleSID stacking (SID-List), an elegant and robust way to programInter-Domain LSPs without requiring additional protocols such as BGP-LU(RFC3107).Please refer to Section# “Transport Programmability” for additional details.Area Border Routers – Prefix-SID vs Anycast-SIDSection# “Inter-Domain Forwarding” showed the use of Anycast-SID at the ABRs for theprovisioning of an Access to Access End-To-End LSP. When the LSP is setup between the Access Router and the AG/PE ABRs, there are two options# ABRs are represented by Anycast-SID; or Each ABR is represented by a unique Prefix-SID. 
Choosing between Anycast-SID or Prefix-SID depends on the requestedservice. Please refer to Section# “Services - Design”.Note that both options can be combined on the same network.Inter-Domain Forwarding - Label Stack OptimizationSection# “Inter-Domain Forwarding” described how SRTE Policy uses SID stacking (SID-List) to define the Inter-Domain End-To-End LSP. The SID-List has to be optimized to be able to support different HW capabilities on different service termination platforms, while retaining all the benefits of a clear, simple and robust design.Figure 6 shows the optimization indetail.Figure 6# Label Stack OptimizationThe Anycast-SIDs and the Anycast Loopback IP address of all PE ABRs inthe network are redistributed into the Aggregation IGP Domain by the local PE ABRs. By doing this, all nodes in a Aggregation IGP Domainknow, via IGP, the Anycast-SID of all PE ABRs in the network. Local AGABRs then redistribute the Anycast-SIDs and Anycast Loopback IP addressof all PE ABRs into the Access IGP Domain. By doing this, all nodes in aAccess IGP Domain also know, via IGP, the Anycast-SID of all PE ABRs inthe network.It is very important to note that this redistribution is asymmetric,thus it won’t cause any L3 routing loop in the network.Another important fact to consider is that there is only a limitedamount of PEs in a Service Provider Network, therefore theredistribution does not affect scalability in the Access IGP Domain.After Label Stack Optimization, the SRTE Policy on the Access routerimposes# Remote Provider Edge Area Border Routers Anycast-SID# Remote-PEAnycast-SID Remote Aggregation Are Border Routers Anycast-SID# Remote-AGAnycast-SID Remote/Destination Access Router# Destination-A Prefix-SID#Destination-A Prefix-SID Because of the Label Stack Optimization, the total amount of SIDsrequired for the Inter-Domain LSP is reduced to 3 instead of theoriginal 5.The Label Stack Optimization mechanism is very similar when an ABR isrepresented by a Prefix-SID instead of an Anycast-SID. The Prefix-SIDand the unicast Loopback IP address are redistributed into theAggregation IGP Domain by Local PE ABRs. By doing this, all nodes in theAggregation IGP Domain know, via IGP, the Prefix-SID of all PE ABRs inthe network. Local AG ABRs then redistribute the learned Prefix-SIDs andunicast Loopback IP address of all PE ABRs to the Access IGP Domain. Bydoing this, all nodes in a Access IGP Domain know, via IGP, thePrefix-SID of all PE ABRs in the network.Both Anycast-SID and Prefix-SID can be combined in the same network withor without Label Stack Optimization.Inter-Domain Forwarding - High Availability and Fast Re-RouteAG/PE ABRs redundancy enables high availability for Inter-DomainForwarding.Figure 7# IGP Domains - ABRs Anycast-SIDWhen Anycast-SID is used to represent AG or PE ABRs, no other mechanismis needed for Fast Re-Route (FRR). Each IGP Domain provides FRRindependently by TI-LFA as described in Section# “Intra-Domain Forwarding - Fast Re-Route”.Figure 8 shows how FRR is achieved for a Inter-DomainLSP.Figure 8# Inter-Domain - FRRThe access router on the left imposes the Anycast-SID of the ABRs andthe Prefix-SID of the destination access router. For FRR, any router inIGP1, including the Access router, looks at the top label# “ABRAnycast-SID”. For this label, each device maintains a primary and backuppath preprogrammed in the HW. In IGP2, the top label is “Destination-A”.For this label, each node in IGP2 has primary and backup pathspreprogrammed in the HW. 
The backup paths are computed by TI-LFA.As Inter-Domain forwarding is achieved via SRTE Policies, FRR iscompletely self-contained and does not require any additional protocol.Note that when traditional BGP-LU is used for Inter-Domain forwarding,BGP-PIC is also required for FRR.Inter-Domain LSPs provisioned by SRTE Policy are protected by FRR alsoin case of ABR failure (because of Anycast-SID). This is not possiblewith BGP-LU/BGP-PIC, since BGP-LU/BGP-PIC have to wait for the IGP toconverge first.Transport ProgrammabilityFigure 9 and Figure 10 show the design of Router-Reflectors (RR), Segment Routing Path Computation Element (SR-PCE) and WAN Automation Engines (WAE).High-Availability is achieved by device redundancy in the Aggregationand Core networks.Figure 9# Transport Programmability – PCEPRRs collect network topology from ABRs through BGP Link State (BGP-LS).Each ABR has a BGP-LS session with the two Domain RRs.Aggregation Domain RRs collect network topology information from theAccess and the Aggregation IGP Domain (Aggregation ABRs are part of theAccess and the Aggregation IGP Domain). Core Domain RRs collect networktopology information from the Core IGP Domain.Aggregation Domain RRs have BGP-LS sessions with Core RRs.Through the Core RRs, the Aggregation Domains RRs advertise localAggregation and Access IGP topologies and receive the network topologiesof the remote Access and Aggregation IGP Domains as well as the networktopology of the Core IGP Domain. Hence, each RR maintains the overallnetwork topology in BGP-LS.Redundant Domain SR-PCEs have BGP-LS sessions with the local Domain RRsthrough which they receive the overall network topology. Refer toSection# “Segment Routing Path Computation Element (SR-PCE)” for more details about SR-PCE.SR-PCE is then capable of computing the Inter-Domain LSP path on-demand andto instantiate it. The computed path (SID-List) is then advertised viathe Path Computation Element Protocol (PCEP), as shown in Figure 9, orBGP-SRTE, as shown in Figure 10, to the Service End Points. In the caseof PCEP, SR-PCEs and Service End Points communicate directly, while forBGP-SRTE, they communicate via RRs. Phase 1 uses PCEP only.The Service End Points program the SID-List via SRTE Policy.Service End Points can be co-located with the Access Routers for FlatServices or at the ABRs for Hierarchical Services. The SRTE Policy DataPlane in the case of Service End Point co-located with the Access routerwas described in Figure 5.The WAN Automation Engine (WAE) provides bandwidthoptimization.Figure 10# Transport Programmability – BGP-SRTEThe proposed design is very scalable and can be easily extended tosupport even higher numbers of BGP-SRTE/PCEP sessions by addingadditional RRs and SR-PCEs into the Access Domain.Figure 11 shows the Compass Metro Fabric physical topology with examplesof productplacement.Figure 11# Compass Metro Fabric – Physical Topology with transportprogrammabilityNote that the design of the Central Office is not covered by thisdocument.Traffic Engineering (Tactical Steering) – SRTE PolicyOperators want to fully monetize their network infrastructure byoffering differentiated services. Traffic engineering is used to providedifferent paths (optimized based on diverse constraints, such aslow-latency or disjoined paths) for different applications. Thetraditional RSVP-TE mechanism requires signaling along the path fortunnel setup or tear down, and all nodes in the path need to maintainstates. 
This approach doesn’t work well for cloud applications, whichhave hyper scale and elasticity requirements.Segment Routing provides a simple and scalable way of defining anend-to-end application-aware traffic engineering path computed onceagain through SRTE Policy.In the Compass Metro Fabric design, the Service End Point uses PCEP orBGP-SRTE (Phase 1 uses PCEP only) along with Segment Routing On-DemandNext-hop (SR-ODN) capability, to request from the controller a path thatsatisfies specific constraints (such as low latency). This is done byassociating an SLA tag/attribute to the path request. Upon receiving therequest, the SR-PCE controller calculates the path based on the requestedSLA, and uses PCEP or BGP-SRTE to dynamically program the ingress nodewith a specific SRTE Policy.The Compass Metro Fabric design also uses MPLS Performance Management tomonitor link delay/jitter/drop (RFC6374).Transport Controller Path Computation Engine (PCE)Segment Routing Path Computation Element (SR-PCE)Segment Routing Path Computation Element, or SR-PCE, is a Cisco Path Computation Engine(PCE) and it is implemented as a feature included as part of CiscoIOS-XR operating system. The function is typically deployed on a CiscoIOS-XR cloud appliance XRv9000, as it involves control plane operationsonly. The SR-PCE gains network topology awareness from BGP-LSadvertisements received from the underlying network. Such knowledge isleveraged by the embedded multi-domain computation engine to provideoptimal path to Path Computation Element Clients (PCCs) using the PathComputation Element Protocol (PCEP) or BGP-SRTE.The PCC is the device where the service originates and therefore itrequires end-to-end connectivity over the segment routing enabledmulti-domain network.The SR-PCE provides a path based on constraints such as# Shortest path (IGP metrics). Traffic-Engineering metrics. Disjoint path.
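To make the different optimization objectives concrete, the purely conceptual sketch below (not an SR-PCE interface or implementation) computes a path over a toy topology twice, once on IGP metric and once on measured delay, and derives a segment list from the result. The node names, SIDs, and metric values are invented for illustration only.

```python
# Conceptual sketch only: show how optimizing on IGP metric versus measured link
# delay can yield different paths, and therefore different SID-lists, for the
# same endpoints. Topology, SIDs and metric values are invented.
import networkx as nx

PREFIX_SID = {"A1": 16001, "AG1": 16011, "AG2": 16012, "PE1": 16021, "A2": 16002}

g = nx.Graph()
# (node, node, IGP metric, measured one-way delay in microseconds)
for a, b, igp, delay in [
    ("A1", "AG1", 10, 900),
    ("A1", "AG2", 10, 300),
    ("AG1", "PE1", 10, 200),
    ("AG2", "PE1", 20, 250),
    ("PE1", "A2", 10, 400),
]:
    g.add_edge(a, b, igp=igp, delay=delay)

for objective in ("igp", "delay"):
    path = nx.shortest_path(g, "A1", "A2", weight=objective)
    # Naively list one prefix-SID per hop; a real head-end only imposes the SIDs
    # needed to steer traffic away from the plain IGP shortest path.
    sid_list = [PREFIX_SID[n] for n in path[1:]]
    print(f"{objective:>5}-optimized path: {path}  SID-list: {sid_list}")
```

In the actual design this computation happens inside SR-PCE, using the BGP-LS-learned topology and the link delay measurements described above, and the resulting SID-list is delivered to the head-end via PCEP or BGP-SRTE.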
 Figure 12# XR Transport Controller – ComponentsWAN Automation Engine (WAE)WAE Automation combines the smart data collection, modeling, andpredictive analytics of Cisco WAE Planning with an extensible,API-driven configuration platform. The use of open APIs and standardizedprotocols provides a means for intelligent interaction betweenapplications and the network. Applications have visibility into theglobal network and can make requests for specific service levels.Section# “PCE Controller Summary - SR-PCE & WAE” compares SR-PCE and WAE.PCE Controller Summary – SR-PCE & WAESegment Routing Path Computation Element (SR-PCE)# Runs as a features in a IOS-XR node Collects topology from BGP, ISIS, OSPF and BGP Link State Deploys tunnel# PCEP SR/RSVP, BGP SR-TE Computes Shortest, Disjoint, Low Latency, and Avoidance paths North Bound interface with applications via REST API WAN Automation Engine (WAE)# Runs as a SR-PCE application Collects topology# via SR-PCE Collects BW utilization# Flexible NetFlow (FNF), StreamingTelemetry, SNMP Deploys tunnel via SR-PCE (preferred# stateful) or NSO (optional#stateless) Computes# Bandwidth Optimization, On demand BW. Path Computation Engine – WorkflowThere are three models available to program transport LSPs# Delegated Computation to SR-PCE WAE Instantiated LSP Delegated Computation to WAE All models assume SR-PCE has acquired full network topology throughBGP-LS.Figure 13# PCE Path ComputationDelegated Computation to SR-PCE NSO provisions the service. Alternatively, the service can beprovisioned via CLI Access Router requests a path SR-PCE computes the path SR-PCE provides the path to Access Router Access Router acknowledges (Optional) When WAE is deployed for LSP visibility, SR-PCE updates WAEwith the newer LSP WAE Instantiated LSP WAE computes the path WAE sends computed path to SR-PCE SR-PCE provides the path to Access Router Access Router confirms SR-PCE updates WAE with newer LSP Delegated Computation to WAE NSO provisions the service – Service can also be provisioned via CLI Access Router requests a path SR-PCE delegates computation to WAE WAE computes the path WAE sends computed path to SR-PCE SR-PCE provides the path to Access Router Access Router confirms SR-PCE updates WAE with newer LSP Device AutomationZero Touch ProvisioningIn addition to model-driven configuration and operation, Metro Fabric 1.5 supports ZTP operation for automated device provisioning. ZTP is useful both in production as well as staging environments to automate initial device software installation, deploy an initial bootstrap configuration, as well as advanced functionality triggered by ZTP scripts. ZTP is supported on both out of band management interfaces as well as in-band data interfaces. When a device first boots, the IOS-XR ZTP processbeging on the management interface of the device and if no response is received, or the the interface is not active, the ZTP process will begin the process on data ports. IOS-XRcan be part of an ecosystem of automated device and service provisioning via Cisco NSO.Services – DesignOverviewThe Compass Metro Fabric Design aims to enable simplification across alllayers of a Service Provider network. Thus, the Compass Metro Fabricservices layer focuses on a converged Control Plane based on BGP.BGP based Services include EVPNs and Traditional L3VPNs (VPNv4/VPNv6).EVPN is a technology initially designed for Ethernet multipoint servicesto provide advanced multi-homing capabilities. 
By using BGP fordistributing MAC address reachability information over the MPLS network,EVPN brought the same operational and scale characteristics of IP basedVPNs to L2VPNs. Today, beyond DCI and E-LAN applications, the EVPNsolution family provides a common foundation for all Ethernet servicetypes; including E-LINE, E-TREE, as well as data center routing andbridging scenarios. EVPN also provides options to combine L2 and L3services into the same instance.To simplify service deployment, provisioning of all services is fullyautomated using Cisco Network Services Orchestrator (NSO) using (YANG)models and NETCONF. Refer to Section# “Network Services Orchestrator (NSO)”.There are two types of services# End-To-End and Hierarchical. The nexttwo sections describe these two types of services in more detail.Ethernet VPN (EVPN)EVPNs solve two long standing limitations for Ethernet Services inService Provider Networks# Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or withData Center Multi-Homed & All-Active Ethernet AccessFigure 21 demonstrates the greatest limitation of traditional L2Multipoint solutions likeVPLS.Figure 21# EVPN All-Active AccessWhen VPLS runs in the core, loop avoidance requires that PE1/PE2 andPE3/PE4 only provide Single-Active redundancy toward their respectiveCEs. Traditionally, techniques such mLACP or Legacy L2 protocols likeMST, REP, G.8032, etc. were used to provide Single-Active accessredundancy.The same situation occurs with Hierarchical-VPLS (H-VPLS), where theaccess node is responsible for providing Single-Active H-VPLS access byactive and backup spoke pseudowire (PW).All-Active access redundancy models are not deployable as VPLStechnology lacks the capability of preventing L2 loops that derive fromthe forwarding mechanisms employed in the Core for certain categories oftraffic. Broadcast, Unknown-Unicast and Multicast (BUM) traffic sourcedfrom the CE is flooded throughout the VPLS Core and is received by allPEs, which in turn flood it to all attached CEs. 
In our example PE1would flood BUM traffic from CE1 to the Core, and PE2 would sends itback toward CE1 upon receiving it.EVPN uses BGP-based Control Plane techniques to address this issue andenables Active-Active access redundancy models for either Ethernet orH-EVPN access.Figure 22 shows another issue related to BUM traffic addressed byEVPN.Figure 22# EVPN BUM DuplicationIn the previous example, we described how BUM is flooded by PEs over theVPLS Core causing local L2 loops for traffic returning from the core.Another issue is related to BUM flooding over VPLS Core on remote PEs.In our example either PE3 or PE4 receive and send the BUM traffic totheir attached CEs, causing CE2 to receive duplicated BUM traffic.EVPN also addresses this second issue, since the BGP Control Planeallows just one PE to send BUM traffic to an All-Active EVPN access.Figure 23 describes the last important EVPNenhancement.Figure 23# EVPN MAC Flip-FloppingIn the case of All-Active access, traffic is load-balanced (per-flow)over the access PEs (CE uses LACP to bundle multiple physical ethernetports and uses hash algorithm to achieve per flow load-balancing).Remote PEs, PE3 and PE4, receive the same flow from different neighbors.With a VPLS core, PE3 and PE4 would rewrite the MAC address tablecontinuously, each time the same mac address is seen from a differentneighbor.EVPN solves this by mean of “Aliasing”, which is also signaled via theBGP Control Plane.Service Provider Network - Integration with Central Office or with Data CenterAnother very important EVPN benefit is the simple integration withCentral Office (CO) or with Data Center (DC). Note that Metro CentralOffice design is not covered by this document.The adoption of EVPNs provides huge benefits on how L2 Multipointtechnologies can be deployed in CO/DC. One such benefit is the convergedControl Plane (BGP) and converged data plane (SR MPLS/SRv6) over SP WANand CO/DC network.Moreover, EVPNs can replace existing proprietary EthernetMulti-Homed/All-Active solutions with a standard BGP-based ControlPlane.End-To-End (Flat) – ServicesThe End-To-End Services use cases are summarized in the table in Figure24 and shown in the network diagram in Figure 25.Figure 24# End-To-End – Services tableFigure 25# End-To-End – ServicesAll services use cases are based on BGP Control Plane.Refer also to Section# “Transport and Services Integration”.Hierarchical – ServicesHierarchical Services Use Cases are summarized in the table of Figure 26and shown in the network diagram of Figure 27.Figure 26# Hierarchical – Services tableFigure 27# Hierarchical - ServicesHierarchical services designs are critical for Service Providers lookingfor limiting requirements on the access platforms and deploying morecentralized provisioning models that leverage very rich features sets ona limited number of touch points.Hierarchical Services can also be required by Service Providers who wantto integrate their SP-WAN with the Central Office/Data Center networkusing well-established designs based on Data Central Interconnect (DCI).Figure 27 shows hierarchical services deployed on PE routers, but thesame design applies when services are deployed on AG or DCI routers.The Compass Metro Design offers scalable hierarchical services withsimplified provisioning. 
The three most important use cases aredescribed in the following sections# Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service(H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) andPWHE Hierarchical L2 Multipoint Multi-Homed/All-ActiveFigure 28 shows a very elegant way to take advantage of the benefits ofSegment-Routing Anycast-SID and EVPN. This use case providesHierarchical L2 Multipoint Multi-Homed/All-Active (Single-Homed Ethernetaccess) service with traditional access routerintegration.Figure 28# Hierarchical – Services (Anycast-PW)Access Router A1 establishes a Single-Active static pseudowire(Anycast-Static-PW) to the Anycast IP address of PE1/PE2. PEs anycast IPaddress is represented by Anycast-SID.Access Router A1 doesn’t need to establish active/backup PWs as in atraditional H-VPLS design and doesn’t need any enhancement on top of theestablished spoke pseudowire design.PE1 and PE2 use BGP EVPN Control Plane to provide Multi-Homed/All-Activeaccess, protecting from L2 loop, and providing efficient per-flowload-balancing (with aliasing) toward the remote PEs (PE3/PE4).A3, PE3 and PE4 do the same, respectively.Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRBFigure 29 shows how EVPNs can completely replace the traditional H-VPLSsolution. This use case provides the greatest flexibility asHierarchical L2 Multi/Single-Home, All/Single-Active modes are availableat each layer of the servicehierarchy.Figure 29# Hierarchical – Services (H-EVPN)Optionally, Anycast-IRB can be used to enable Hierarchical L2/L3Multi/Single-Home, All/Single-Active service and to provide optimal L3routing.Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHEFigure 30 shows how the previous H-EVPN can be extended by takingadvantage of Pseudowire Headend (PWHE). PWHE with the combination ofMulti-Homed, Single-Active EVPN provides an Hierarchical L2/L3Multi-Homed/Single-Active (H-EVPN) solution that supports QoS.It completely replaces traditional H-VPLS based solutions. This use caseprovides Hierarchical L2 Multi/Single-Home, All/Single-Activeservice.Figure 30# Hierarchical – Services (H-EVPN and PWHE)Refer also to the section# “Transport and Services Integration”.Services – Route-Reflector (S-RR)Figure 31 shows the design of Services Router-Reflectors(S-RRs).Figure 31# Services – Route-ReflectorsThe Compass Metro Fabric Design focuses mainly on BGP-based services,therefore it is important to provide a robust and scalable ServicesRoute-Reflector (S-RR) design.For Redundancy reasons, there are at least 2 S-RRs in any given IGPDomain, although Access and Aggregation are supported by the same pairof S-RRs.Each node participating in BGP-based service termination has two BGPsessions with Domain Specific S-RRs and supports multipleaddress-Families# VPNv4, VPNv6, EVPN.Core Domain S-RRs cover the core Domain. Aggregation Domain S-RRs coverAccess and Aggregation Domains. Aggregation Domain S-RRs and Core S-RRshave BGP sessions among each other.The described solution is very scalable and can be easily extended toscale to higher numbers of BGP sessions by adding another pair of S-RRsin the Access Domain.Network Services Orchestrator (NSO)The NSO is a management and orchestration (MANO) solution for networkservices and Network Functions Virtualization (NFV). 
The NSO includescapabilities for describing, deploying, configuring, and managingnetwork services and VNFs, as well as configuring the multi-vendorphysical underlay network elements with the help of standard open APIssuch as NETCONF/YANG or a vendor-specific CLI using Network ElementDrivers (NED).In the Compass Metro Fabric design, the NSO is used for ServicesManagement, Service Provisioning, and Service Orchestration.The NSO provides several options for service designing as shown inFigure 32 Service model with service template Service model with mapping logic Service model with mapping logic and servicetemplates Figure 32# NSO – ComponentsA service model is a way of defining a service in a template format.Once the service is defined, the service model accepts user inputs forthe actual provisioning of the service. For example, a E-Line servicerequires two endpoints and a unique virtual circuit ID to enable theservice. The end devices, attachment circuit UNI interfaces, and acircuit ID are required parameters that should be provided by the userto bring up the E-Line service. The service model uses the YANG modelinglanguage (RFC 6020) inside NSO to define a service.Once the service characteristics are defined based on the requirements,the next step is to build the mapping logic in NSO to extract the userinputs. The mapping logic can be implemented using Python or Java. Thepurpose of the mapping logic is to transform the service models todevice models. It includes mechanisms of how service related operationsare reflected on the actual devices. This involves mapping a serviceoperation to available operations on the devices.Finally, service templates need to be created in XML for each devicetype. In NSO, the service templates are required to translate theservice logic into final device configuration through CLI NED. The NSOcan also directly use the device YANG models using NETCONF for deviceconfiguration. These service templates enable NSO to operate in amulti-vendor environment.Metro Fabric Supported Service ModelsMetro Fabric 1.5 supports the following NSO service models for provisioning both hierarchical and flat services across the fabric. All NSO service modules in 1.5 utilize the IOS-XR and IOS-XE CLI NEDs for configuration.Figure 33# Automation – Flat Service ModelsFigure 34# Automation – Hierarchical Service ModelsTransport and Services IntegrationSection# “Transport - Design” described how Segment Routing provides flexible End-To-End andAny-To-Any Highly-Available transport together with Fast Re-Route. Aconverged BGP Control Plane provides a scalable and flexible solutionalso at the services layer.Figure 35 shows a consolidated view of the Compass Metro Fabric networkfrom a Control-Plane standpoint. Note that while network operators coulduse both PCEP and BGR-SRTE at the same time, it is nottypical.Figure 35# Compass Metro Fabric – Control-PlaneAs mentioned, service provisioning is independent of the transportlayer. However, transport is responsible for providing the path based onservice requirements (SLA). The component that enables such integrationis On-Demand Next Hop (ODN). ODN is the capability of requesting to acontroller a path that satisfies specific constraints (such as lowlatency). This is achieved by associating an SLA tag/attribute to thepath request. 
Upon receiving the request, the SR-PCE controller calculatesthe path based on the requested SLA and use PCEP or BGP-SRTE todynamically program the Service End Point with a specific SRTE Policy.The Compass Metro Fabric design also use MPLS Performance Management tomonitor link delay/jitter/drop (RFC6374) to be able to create a LowLatency topology dynamically.Figure 36 shows a consolidated view of Compass Metro Fabric network froma Data Planestandpoint.Figure 36# Compass Metro Fabric – Data-PlaneThe Compass Metro Fabric Design – Phase 1Transport - Phase 1This section describes in detail Phase 1 of the Compass Metro Fabricdesign. This Phase focuses on transport programmability and BGP-basedservices adoption.Figure 35 and Figure 36 show the network topology and transport DataPlane details for Phase 1. Refer also to the Access domain extension usecase in Section# “Use Cases”.The network is split into Access and Core IGP domains. Each IGP domainis represented by separate IGP processes. The Compass Metro Fabricdesign uses ISIS IGP protocol for validation.Validation will be done on two types of access platforms, IOS-XR andIOS-XE, to proveinteroperability.Figure 37# Access Domain Extension – End-To-End TransportFor the End-To-End LSP shown in Figure 35, the Access Router imposes 3transport labels (SID-list) An additional label, the TI-LFA label, canbe also added for FRR (node and link protection). In the Core and in theremote Access IGP Domain, 2 additional TI-LFA labels can be used for FRR(node and link protection). In Phase 1 PE ABRs are represented byPrefix-SID. Refer also to Section# “Transport Programmability - Phase 1”.Figure 38# Access Domain Extension – Hierarchical TransportFigure 38 shows how the Access Router imposes a single transport labelto reach local PE ABRs, where the hierarchical service is terminated.Similarly, in the Core and in the remote Access IGP domain, thetransport LSP is contained within the same IGP domain (Intra-DomainLSP). Routers in each IGP domain can also impose two additional TI-LFAlabels for FRR (to provide node and link protection).In the Hierarchical transport use case, PE ABRs are represented byAnycast-SID or Prefix-SID. Depending on the type of service, Anycast-SIDor Prefix-SID is used for the transport LSP.Transport Programmability – Phase 1The Compass Metro Fabric employs a distributed and highly available SR-PCEdesign as described in Section# “Transport Programmability”. Transport programmability is basedon PCEP. Figure 39 shows the design when SR-PCE uses PCEP.Figure 39# SR-PCE – PCEPSR-PCE in the Access domain is responsible for Inter-Domain LSPs andprovides the SID-list. PE ABRs are represented by Prefix-SID.SR-PCE in the Core domain is responsible for On-Demand Nexthop (ODN) forhierarchical services. Refer to the table in Figure 41 to see whatservices use ODN. Refer to Section# “Transport Controller - Path Computation Engine (PCE)” to see more details about XRTransport Controller (SR-PCE). 
Note that Phase 1 uses the “Delegated Computation to SR-PCE” mode described in Section# “Path Computation Engine - Workflow” without WAE, as shown in Figure 38. Figure 40# PCE Path Computation – Phase 1 Delegated Computation to SR-PCE# 1. NSO provisions the service (the service can also be provisioned via CLI); 2. the Access Router requests a path; 3. SR-PCE computes the path; 4. SR-PCE provides the path to the Access Router; 5. the Access Router confirms. Services – Phase 1 This section describes the Services used in the Compass Metro Fabric Phase 1. The table in Figure 41 describes the End-To-End services, while the network diagram in Figure 40 shows how services are deployed in the network. Refer also to Section# “Services - Design” of this document. Figure 41# End-To-End Services table Figure 42# End-To-End Services The table in Figure 42 describes the hierarchical services, while the network diagram in Figure 43 shows how services are deployed in the network. Refer also to Section# “Services - Design” of this document. In addition, the table in Figure 44 shows where the PE ABRs Anycast-SID is required and where ODN in the Core IGP domain is used. Figure 41# Hierarchical Services table Figure 42# Hierarchical Services The Compass Metro Fabric uses the hierarchical Services Route-Reflectors (S-RRs) design described in Section# “Services - Route-Reflector (S-RR)”. Figure 43 shows in detail the S-RRs design used for Phase 1. Figure 43# Services Route-Reflectors (S-RRs) Network Services Orchestrator (NSO) is used for service provisioning. Refer to Section# “Network Services Orchestrator (NSO)”. Transport and Services Integration – Phase 1 Transport and Services integration is described in Section# “Transport and Services Integration” of this document. Figure 44 shows an example of End-To-End LSP and services integration in Phase 1. Figure 44# Transport and Services Data-Plane Figure 45 shows a consolidated view of the Transport and Services Control-Plane. Figure 45# Transport and Services Control-Plane Figure 46 shows the physical topology of the testbed used for Phase 1 validation. Figure 46# Testbed – Phase 1 The Compass Metro Fabric Design - Summary The Compass Metro Fabric brings huge simplification at the Transport as well as at the Services layers of a Service Provider network. Simplification is a key factor for real Software Defined Networking (SDN). Cisco continuously improves Service Provider network designs to satisfy market needs for scalability and flexibility. From a very well established and robust Unified MPLS design, Cisco has embarked on a journey toward transport simplification and programmability, which started with the Transport Control Plane unification in Evolved Programmable Network 5.0 (EPN 5.0). The Cisco Metro Fabric provides another huge leap forward in simplification and programmability, adding Services Control Plane unification and centralized path computation. Figure 47# Compass Metro Fabric – Evolution The transport layer requires only IGP protocols with Segment Routing extensions for Intra and Inter Domain forwarding. Fast recovery for node and link failures leverages Fast Re-Route (FRR) by Topology Independent Loop Free Alternate (TI-LFA), which is a built-in function of Segment Routing. End-to-End LSPs are built using Traffic Engineering by Segment Routing, which does not require additional signaling protocols. Instead it relies solely on SDN controllers, thus increasing overall network scalability.
The controller layer is based on standard industryprotocols like BGP-LS, PCEP, BGP-SRTE, etc., for path computation andNETCONF/YANG for service provisioning, thus providing a on openstandards based solution.For all those reasons, the Cisco Metro Fabric design really brings anexciting evolution in Service Provider Networking.", "url": "/blogs/2018-10-01-metro-fabric-hld/", "author": "Phil Bedard", "tags": "iosxr, Metro, Design" } , "blogs-2018-10-01-peering-fabric-hld": { "title": "Peering Fabric High-Level Design", "content": " On This Page Revision History Key Drivers Traffic Growth Network Simplification Network Efficiency High-Level Design Peering Strategy Topology and Peer Distribution Platforms Control-Plane Telemetry Automation Zero Touch Provisioning Advanced Security using BGP Flowspec and QPPB (1.5) Internet and Peering in a VRF Validated Design Peering Fabric Use Cases Traditional IXP Peering Migration to Peering Fabric Peering Fabric Extension Localized Metro Peering and Content Delivery Express Peering Fabric Datacenter Edge Peering Peer Traffic Engineering with Segment Routing Low-Level Design Integrated Peering Fabric Reference Diagram Distributed Peering Fabric Reference Diagram Peering Fabric Hardware Detail NCS-5501-SE NCS-55A1-36H-SE NCS-55A1-24H NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card Peer Termination Strategy Distributed Fabric Device Roles PFL – Peering Fabric Leaf PFS – Peering Fabric Spine Device Interconnection Capacity Scaling Peering Fabric Control Plane PFL to Peer PFL to PFS PFS to Core SR Peer Traffic Engineering Summary Nodal EPE Peer Interface EPE Abstract Peering Peering Fabric Telemetry Telemetry Diagram Model-Driven Telemetry BGP Monitoring Protocol Netflow / IPFIX Automation and Programmability Cisco NSO Modules Netconf YANG Model Support 3rd Party Hosted Applications XR Service Layer API Recommended Device and Protocol Configuration Overview Common Node Configuration Enable LLDP Globally PFS Nodes IGP Configuration Segment Routing Traffic Engineering BGP Global Configuration Model-Driven Telemetry Configuration PFL Nodes Peer QoS Policy Peer Infrastructure ACL Peer Interface Configuration IS-IS IGP Configuration BGP Add-Path Route Policy BGP Global Configuration EBGP Peer Configuration PFL to PFS IBGP Configuration Netflow/IPFIX Configuration Model-Driven Telemetry Configuration Abstract Peering Configuration PFS Configuration BGP Flowspec Configuration and Operation Enabling BGP Flowspec Address Families on PFS and PFL Nodes BGP Flowspec Server Policy Definition BGP Flowspec Server Enablement BGP Flowspec Client Configuration QPPB Configuration and Operation Routing Policy Configuration Global BGP Configuration QoS Policy Definition Interface-Level Configuration Security Peering and Internet in a VRF VRF per Peer, default VRF for Internet Internet in a VRF Only VRF per Peer, Internet in a VRF Infrastructure ACLs BCP Implementation BGP Attribute and CoS Scrubbing Per-Peer Control Plane Policers BGP Prefix Security RPKI Origin Validation BGPSEC (Reference Only) Appendix Applicable YANG Models NETCONF YANG Paths BGP Operational State Global BGP Protocol State BGP Neighbor State Example Usage BGP RIB Data Example Usage BGP Flowspec Device Resource YANG Paths Validated Model-Driven Telemetry Sensor Paths Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform LLDP Monitoring Interface statistics and state The following sub-paths can be used but it is recommended to use the base 
openconfig-interfaces model Aggregate bundle information (use interface models for interface counters) BGP Peering information IS-IS IGP information It is not recommended to monitor complete RIB tables using MDT but can be used for troubleshooting QoS and ACL monitoring BGP RIB information It is not recommended to monitor these paths using MDT with large tables Routing policy Information Revision History Version Date Comments 1.0 05/08/2018 Initial Peering Fabric publication 1.5 07/31/2018 BGP-FS, QPPB, ZTP, Internet/Peering in a VRF, NSO Services Key Drivers Traffic Growth Internet traffic has seen a compounded annual growth rate of 30% or higher over the last five years, as more devices are connected and more content is consumed, fueled by the demand for video. Traffic will continue to grow as more content sources are added and Internet connection speeds increase. Service and content providers must design their peering networks to scale for a future of more connected devices with traffic sources and destinations spanning the globe. Efficient peering is required to deliver traffic to consumers. Network Simplification Simple networks are easier to build and easier to operate. As networks scale to handle traffic growth, the level of network complexity must remain flat. A prescriptive design using standard discrete components makes it easier for providers to scale from networks handling a small amount of traffic to 10s of Tbps without complete network forklifts. Fabrics with reduced control-plane elements and feature sets enhance stability and availability. Dedicating nodes to specific functions of the network also helps isolate the rest of the network from malicious behavior, defects, or instability. Network Efficiency Network efficiency refers not only to maximizing network resources but also to optimizing the environmental impact of the deployed network. Much of Internet peering today is done in 3rd party facilities where space, power, and cooling are at a premium. High-density, lower environmental footprint devices are critical to handling more traffic without exceeding the capabilities of a facility. In cases where multiple facilities must be connected, a simple and efficient way to extend networks must exist. High-Level Design The Peering design incorporates high-density, environmentally efficient edge routers, a prescriptive topology and peer termination strategy, and features delivered through IOS-XR to solve the needs of service and content providers. Also included as part of the Peering design are ways to monitor the health and operational status of the peering edge and, through Cisco NSO integration, to assist providers in automating peer configuration and validation. All designs are both feature tested and validated as a complete design to ensure stability once implemented. Peering Strategy The design proposes a localized peering strategy to reduce network cost for “eyeball” service providers by placing peering or content provider cache nodes closer to traffic consumers. This not only reduces capacity on long-haul backbone networks carrying traffic from IXPs to end users but also improves the quality of experience for users by reducing latency to content sources. The same design can also be used for content provider networks wishing to deploy a smaller footprint solution in an SP location or 3rd party peering facility. Topology and Peer Distribution The Cisco Peering Fabric introduces two options for fabric topology and peer termination.
The first, similar to more traditional peeringdeployments, collapses the Peer Termination and Core Connectivitynetwork functions into a single physical device using the device’sinternal fabric to connect each function. The second option utilizes afabric separating the network functions into separate physical layers,connected via an external fabric running over standard Ethernet.In many typical SP peering deployments, a traditional two-node setup isused where providers vertically upgrade nodes to support the highercapacity needs of the network. Some may employ technologies such as backto back or multi-chassis clusters in order to support more connectionswhile keeping what seems like the operational footprint low. However,failures and operational issues occurring in these types of systems aretypically difficult to troubleshoot and repair. They also requirelengthy planning and timeframes for performing system upgrades. Weintroduce a horizontally scalable distributed peering fabric, the endresult being more deterministic interface or node failures.Minimizing the loss of peering capacity is very important for bothingress-heavy SPs and egress-heavy content providers. The loss of localpeering capacity means traffic must ingress or egress a sub-optimalnetwork port. Making a conscious design decision to spread peerconnections, even to the same peer, across multiple edge nodes helpsincrease resiliency and limit traffic-affecting network events.PlatformsThe Cisco NCS5500 platform is ideal for edge peer termination, given itshigh-density, large RIB and FIB scale, buffering capability, and IOS-XRsoftware feature set. The NCS5500 is also space and power efficient with36x100GE supporting up to 7.5M IPv4 routes in a 1RU fixed form factor orsingle modular line card. A minimal The Peering fabric can provide36x100GE, 144x10GE, or a mix of non-blocking peering connections withfull resiliency in 4RU. The fabric can also scale to support 10s ofterabits of capacity in a single rack for large peering deployments.Fixed chassis are ideal for incrementally building a peering edgefabric, the NCS NC55-36X100GE-A-SE and NC55A1-24H are efficient highdensity building blocks which can be rapidly deployed as needed withoutinstalling a large footprint of devices day one. Deployments needingmore capacity or interface flexibility such as IPoDWDM to extend peeringcan utilize the NCS5504 4-slot or NCS5508 8-slot modular chassis. If thepeering location has a need for services termination the ASR9000 familyor XRv-9000 virtual edge node can be incorporated into the fabric.All NCS5500 routers also contain powerful Route Processors to unlockpowerful telemetry and programmability. The Peering Fabric fixedchassis contain 1.6Ghz 8-core processors and 32GB of RAM. The latestNC55-RP-E for the modular NCS5500 chassis has a 1.9Ghz 6-core processorand 32G of RAM.Control-PlaneThe peering fabric design introduces a simplified control-plane builtupon IPv4/IPv6 with Segment Routing. In the collapsed design, eachpeering node is connected to EBGP peers and upstream to the core viastandard IS-IS, OSPF, and TE protocols, acting as a PE or LER in aprovider network.In the distributed design, network functions are separated. PeerTermination happens on Peering Fabric Leaf nodes. Peering Fabric Spineaggregation nodes are responsible for Core Connectivity and perform moreadvanced LER functions. The PFS routers use ECMP to balance trafficbetween PFL routers and are responsible for forwarding within the fabricand to the rest of the provider network. 
Each PFS acts as an LER,incorporated into the control-plane of the core network. The PFS, oralternatively vRRs, reflect learned peer routes from the PFL to the restof the network. The SR control-plane supports several trafficengineering capabilities. EPE to a specific peer interface, PFL node, orPFS is supported. We also introduce the abstract peering concept wherePFS nodes utilize a next-hop address bound to an anycast SR SID to allowtraffic engineering on a per-peering center basis.TelemetryThe Peering fabric design uses the rich telemetry available in IOS-XRand the NCS5500 platform to enable an unprecedented level of insightinto network and device behavior. The Peering Fabric leverages Model-DrivenTelemetry and NETCONF along with both standard and native YANG modelsfor metric statistics collection. Telemetry configuration and applicablesensor paths have been identified to assist providers in knowing what tomonitor and how to monitor it.AutomationNETCONF and YANG using OpenConfig and native IOS-XR models are used tohelp automate peer configuration and validation. Cisco has developed specific Peering Fabric NSO service models to help automate common tasks suchas peer interface configuration, peer BGP configuration, and addingphysical interfaces to an existing peer bundle.Zero Touch ProvisioningIn addition to model-driven configuration and operation, Peering Fabric 1.5 alsosupports ZTP operation for automated device provisioning. ZTP is useful both in production as well as staging environments to automate initial device software installation, deploy an initial bootstrap configuration, as well as advanced functionality triggered by ZTP scripts. ZTP is supported on both out of band management interfaces as well as in-band data interfaces.Advanced Security using BGP Flowspec and QPPB (1.5)Release 1.5 of the Cisco Peering Fabric enhances the design by adding advancedsecurity capabilities using BGP Flowspec and QoS Policy Propagation using BGPor QPPB. BGP Flowspec was standardized in RFC 5575 and defines additional BGPNLRI to inject packet filter information to receiving routers. BGP is the control-plane fordisseminating the policy information while it is up to the BGP Flowspecreceiver to implement the dataplane rules specified in the NLRI. At theInternet peering edge, DDoS protection has become extremely important,and automating the remediation of an incoming DDoS attack has becomevery important. Automated DDoS protection is only one BGP Flowspec usecase, any application needing a programmatic way to create interfacepacket filters can make se use of its capabilities.QPPB allows using BGP attributes as a match criteria in dataplane packet filters. Matching packets based on attributes like BGP community and AS Path allows serviceproviders to create simplified edge QoS policies by not having to manage more cumbersome prefix lists or keep up to date when new prefixes are added. QPPB is supported in the peering fabric for destination prefix BGP attribute matching and has a number of use cases when delivering traffic from external providers to specific internal destinations.Internet and Peering in a VRFWhile Internet peering and carrying the Internet table in a provider network is typically done using the Global Routing Table (default VRF in IOS-XR) many modern networks are being built to isolate the GRT from the underlying infrastructure. 
In this case, the Internet global table is carried as a service just like any other VPN service, leaving the infrastructure layer protected from the global Internet. Another application using VRFs is to simply isolate peers to specific VRFs in order to isolate the forwarding plane of each peer from the others and to control which routes a peer sees by the use of VPN route-target communities as opposed to outbound routing policy. In this simplified use case the global table is still carried in the default VRF, using IOS-XR capabilities to import and export routes to and from specific peer VRFs. Separating Internet and Peering routes into specific VRFs also gives flexibility in creating custom routing tables for specific customers, giving a service provider the flexibility to offer separate regional or global reach on the same network. Internet in a VRF and Peering in a VRF for IPv4 and IPv6 are compatible with most Peering Fabric features. Specific caveats are documented in the Appendix of this document. Validated Design The Peering Fabric Design control, management, and forwarding planes have undergone validation testing to ensure individual design features work as intended and the peering fabric as a whole performs without fault. Validation is done exceeding real-world scaling requirements to ensure the design fulfills its role in existing networks with room for future growth. Peering Fabric Use Cases Traditional IXP Peering Migration to Peering Fabric A traditional SP IXP design typically uses one or two large modular systems terminating all peering connections. In many cases, since providers are constrained on space and power, they use a collapsed design where the minimal set of peering nodes not only terminates peer connections but also provides services and core connectivity to the location. The Peering Fabric uses best-of-breed high-density, low-footprint hardware requiring much less space than older generation modular systems. Many older systems provide densities at approximately 4x100GE per rack unit, while Peering Fabric PFL nodes start at 24x100GE or 36x100GE per 1RU with high FIB capability. Due to the superior space efficiency, there is no longer a limitation of using just a pair of nodes for these functions. In either a collapsed function or distributed function design, peers can be distributed across a number of devices to increase resiliency and lessen collateral impact when failures occur. The diagram below shows a fully distributed fabric, where peers are now distributed across three PFL nodes, each with full connectivity to upstream PFS nodes. Peering Fabric Extension In some cases, there may be peering facilities within close geographic proximity which need to integrate into a single fabric. This may happen if there are multiple 3rd party facilities in a close geographic area, each with unique peers you want to connect to. There may also be multiple independent peering facilities within a small geographic area into which you do not wish to install a complete peering fabric. In those cases, connecting remote PFL nodes to a larger peering fabric can be done using optical transport or longer range gray optics. Localized Metro Peering and Content Delivery In order to drive greater network efficiency, content sources should be placed as close to the end destination as possible. Traditional wireline and wireless service providers have heavy inbound traffic from content providers delivering OTT video.
Providers may also be providing their own IP video services to on-net and off-net destinations via an SP CDN. Peering and internal CDN equipment can be placed within a localized peer or content delivery center, connected via a common peering fabric. In these cases the PFS nodes connect directly to the metro core to enable delivery across the region or metro. Express Peering Fabric An evolution to localized metro peering is to interconnect the PFS peering nodes directly or via a metro-wide peering core. The main driver for direct interconnection is minimizing the number of router and transport network interfaces traffic must pass through. High-density optical muxponders such as the NCS1002, along with flexible photonic ROADM architectures enabled by the NCS2000, can help make the most efficient use of metro fiber assets. Datacenter Edge Peering In order to serve traffic as close to consumer endpoints as possible, a provider may construct a peering edge attached to an edge or central datacenter. As gateway functions in the network become virtualized for applications such as vPE, vCPE, and mobile 5G, the need to attach Internet peering to the SP DC becomes more important. The Peering Fabric supports interconnection to the DC via the SP core or with the PFS nodes acting as leafs to the DC spine. These would act as traditional border routers in the DC design. Peer Traffic Engineering with Segment Routing Segment Routing performs efficient source routing of traffic across a provider network. Traffic engineering is particularly applicable to peering as content providers look for ways to optimize egress network ports and eyeball providers work to reduce network hops between ingress and subscriber. There are also a number of advanced use cases based on using constraints, such as latency, to place traffic on optimal paths. An SRTE Policy represents a forwarding entity within the SR domain mapping traffic to a specific network path, defined statically on the node or computed by an external PCE. An additional benefit of SR is the ability to source route traffic based on a node SID or an anycast SID representing a set of nodes. ECMP behavior is preserved at each point in the network, redundancy is simplified, and traffic protection is supplied using TI-LFA. In the Low-Level Design we explore common peer engineering use cases. Much more information on Segment Routing technology and its future evolution can be found at http://segment-routing.net Low-Level Design Integrated Peering Fabric Reference Diagram Distributed Peering Fabric Reference Diagram Peering Fabric Hardware Detail The NCS5500 family of routers provides high density, high routing scale, ideal buffer sizes, and environmental efficiency to help providers satisfy any peering fabric use case. Due to high FIB scale, large buffers, and a broad XR feature set, all prescribed hardware can serve in either a collapsed or distributed fabric. Further detailed information on each platform can be found at https://www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html. NCS-5501-SE The NCS 5501 is a 1RU fixed router with 40X10GE SFP+ and 4X100GE QSFP28 interfaces. The 5501 has an IPv4 FIB scale of at least 2M routes. The 5501-SE is ideal as a peering leaf node when providers need 10GE interface flexibility such as ER, ZR, or DWDM. NCS-55A1-36H-SE The 55A1-36H-SE is a second generation 1RU NCS5500 fixed platform with 36 100GE QSFP28 ports operating at line rate. The –SE model contains an external TCAM increasing route scale to a minimum of 3M IPv4/512K IPv6 routes in its FIB.
It also contains a powerful multi-core route processor with 64GB of RAM and an on-board 64GB SSD. Its high density, efficiency, and buffering capability make it ideal in 10GE or 100GE deployments. Peering fabrics can scale to much higher capacity 1RU at a time by simply adding additional 55A1-36H-SE spine nodes. NCS-55A1-24H The NCS-55A1-24H is a second generation 1RU NCS5500 fixed platform with 24 100GE QSFP28 ports. The device uses two 900Gbps NPUs, with 12X100GE ports connected to each NPU. The 55A1-24H uses a high scale NPU with a minimum of 1.3M IPv4/256K IPv6 routes. At just 675W it is ideal for 10GE peering fabric deployments with a migration path to 100GE connectivity. The 55A1-24H also has a powerful multi-core processor and 32GB of RAM. NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card Very large peering fabric deployments or those needing interface flexibility such as IPoDWDM connectivity can use the modular NCS5500 series chassis. Large deployments can utilize the second-generation 36X100G-A-SE line card with external TCAM, supporting a minimum of 3M IPv4 routes. Peer Termination Strategy Often overlooked when connecting to Internet peers is determining a strategy to maximize efficiency and resiliency within a local peering instance. Oftentimes a peer is connected to a single peering node even when two nodes exist, for ease of configuration and coordination with the peering or transit partner. However, with minimal additional configuration and administration assisted by automation, even single peers can be spread across multiple edge peering nodes. Ideally, within a peering fabric, a peer is connected to each leaf in the fabric. In cases where this cannot be done, the provider should use capacity planning processes to balance peers and transit connections across multiple leafs in the fabric. The added resiliency leads to greater efficiency when failures do happen, with less reliance on peering capacity further away from the traffic destination. Distributed Fabric Device Roles PFL – Peering Fabric Leaf The Peering Fabric Leaf is the node physically connected to external peers. Peers could be aggregation routers or 3rd party CDN nodes. In a deconstructed design the PFL is analogous to a line card in a modular chassis solution. PFL nodes can be added as capacity needs grow. PFS – Peering Fabric Spine The Peering Fabric Spine acts as an aggregation node for the PFLs and is also physically connected to the rest of the provider network. The provider network could refer to a metro core in the case of localized peering, a backbone core in the case of IXP peering, or a DC spine layer in the case of DC peering. Device Interconnection In order to maximize resiliency in the fabric, each PFL node is connected to each PFS. While the design shown includes three PFLs and two PFS nodes, there could be any number of PFL and PFS nodes, scaling horizontally to keep up with traffic and interface growth. PFL nodes are not connected to each other; the PFS nodes provide the capacity for any traffic between those nodes. The PFS nodes are also not interconnected to each other, as no end device should terminate on the PFL, only other routers. Capacity Scaling Capacity of the peering fabric is scaled horizontally. The uplink capacity from PFL to PFS will be determined by an appropriate oversubscription factor derived from the service provider’s capacity planning exercises. The leaf/spine architecture of the fabric connects each PFL to each PFS with equal capacity.
In steady-state operation traffic is balanced between the PFS and PFL in both directions, maximizing the total capacity. The entropy in peering traffic generally ensures equal distribution between either ECMP paths or bundle interface member links in the egress direction. More information can be found in the forwarding plane section of the document. An example deployment may have two NC55-36X100G-A-SE spine nodes and two NC55A1-24H leaf nodes. In a 100GE peer deployment scenario each leaf would support 14x100GE client connections and 5x100GE to each spine node. A 10GE deployment would support 72x10GE client ports and 3x100GE to each spine, at a 1.2#1 oversubscription ratio. Peering Fabric Control Plane PFL to Peer The Peering Fabric Leaf is connected directly to peers via traditional EBGP. BFD may additionally be used for fault detection if agreed to by the peer. Each EBGP peer will utilize SR EPE to enable TE to the peer from elsewhere on the provider network. PFL to PFS PFL to Peering Fabric Spine uses widely deployed standard routing protocols. IS-IS is the prescribed IGP protocol within the peering fabric. Each PFS is configured with the same IS-IS L1 area. In the case where OSPF is being used as an IGP, the PFL nodes will reside in an OSPF NSSA area. The peering fabric IGP is SR-enabled, with the loopback of each PFL assigned a globally unique SR Node SID. Each PFL also has an IBGP session to each PFS to distribute its learned EBGP routes upstream and learn routes from elsewhere on the provider network. If a provider is distributing routes from PFL to PFL or from another peering location to local PFLs, it is important to enable the BGP “best-path-external” feature to ensure the PFS has the routing information to accelerate re-convergence if it loses the more preferred path. Egress peer engineering will be enabled for EBGP peering connections, so that each peer or peer interface connected to a PFL is directly addressable by its Adj-Peer-SID from anywhere on the SP network. Adj-Peer-SID information is currently not carried in the IGP of the network. If utilized, it is recommended to distribute this information using BGP-LS to all controllers creating paths to the PFL EPE destinations. Each PFS node will be configured with IBGP multipath so traffic is load balanced to PFL nodes, increasing resiliency in the case of peer failure. On reception of a BGP withdraw update for a multipath route, traffic loss is minimized as the existing valid route is still programmed into the FIB. PFS to Core The PFS nodes will participate in the global Core control plane and act as the gateway between the peering fabric and the rest of the SP network. In order to create a more scalable and programmatic fabric, it is prescribed to use Segment Routing across the core infrastructure. IS-IS is the preferred protocol for transmitting SR SID information from the peering fabric to the rest of the core network and beyond. In deployments where it may be difficult to transition quickly to an all-SR infrastructure, the PFS nodes will also support OSPF and RSVP-TE for interconnection to the core. The PFS acts as an ABR or ASBR between the peering fabric and the larger metro or backbone core network. SR Peer Traffic Engineering Summary SR allows a provider to create engineered paths to egress peering destinations or egress traffic destinations within the SP network. A stack of globally addressable labels is created at the traffic entry point, requiring no additional protocol state at midpoints in the network and preserving qualities of normal IGP routing such as ECMP at each hop.
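As a sketch of what such a label stack could look like when steering traffic to a single peer interface, an SRTE Policy of the same form used elsewhere in this document might be defined as follows; all SID values, names, and the endpoint address are illustrative assumptions.
segment-routing
 traffic-eng
  segment-list PF1-PFL1-PEER1
   index 10 mpls label 16011             ;Anycast or Prefix-SID of the PFS pair fronting the fabric (illustrative)
   index 20 mpls label 16101             ;Node SID of the target PFL (illustrative)
   index 30 mpls label 50001             ;EPE Adj-Peer-SID of the specific peer interface (illustrative)
  !
  policy to_peer_as64500
   color 100 end-point ipv4 192.0.2.1
   candidate-paths
    preference 100
     explicit segment-list PF1-PFL1-PEER1
Traffic entering the network with this policy applied follows normal IGP/ECMP forwarding toward the PFS anycast SID, is steered to the chosen PFL, and finally egresses the prescribed peer interface, which is the nodal and interface EPE behavior described next.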
The Peering Fabric proposes end-to-end visibility fromthe PFL nodes to the destinations and vice-versa. This will allow arange of TE capabilities targeting a peering location, peering exitnode, or as granular as a specific peering interface on a particularnode. The use of anycast SIDs within a group of PFS nodes increasesresiliency and load balancing capability.Nodal EPENode EPE directs traffic to a specific peering node within the fabric.The node is targeted using first the PFS cluster anycast IP along withthe specific PFL node SID.Peer Interface EPEThis example uses an Egress Peer Engineering peer-adj-SID value assignedto a single peer interface. The result is traffic sent along this SRpath will use only the prescribed interface for egress traffic.Abstract PeeringAbstract peering allows a provider to simply address a Peering Fabric bythe anycast SIDs of its cluster of PFS nodes. In this case PHP is usedfor the anycast SIDs and traffic is simply forwarded as IP to the finaldestination across the fabric.Peering Fabric TelemetryOnce a peering fabric is deployed, it is extremely important to monitorthe health of the fabric as well as harness the wealth of data providedby the enhanced telemetry on the NCS5500 platform and IOS-XR. Throughstreaming data mechanisms such as Model-Driven Telemetry, BMP, andNetflow, providers can extract data useful for operations, capacityplanning, security, and many other use cases. In the diagram below, thetelemetry collection hosts could be a single system or distributedsystems used for collection. The distributed design of the peeringfabric enhances the ability to collect telemetry data from the fabric bydistributing resources across the fabric. Each PFL or PFS contains amodern multi-core CPU and at least 32GB of RAM (64GB in NC55A1-36H-SE)to support not only built in telemetry operation but also 3rdparty applications a service or content provider may want to deploy tothe node for additional telemetry. Examples of 3rd partytelemetry applications include those storing temporary data forroot-cause analysis if a node is isolated from the rest of the networkor performance measurement applications.The peering fabric also fully supports traditional collections methodssuch as SNMP, and NETCONF using YANG models to integrate with legacysystems.Telemetry DiagramModel-Driven TelemetryMDT uses standards-based or native IOS-XR YANG data models to streamoperational state data from deployed devices. The ability to pushstatistics and state data from the device adds capabilities andefficiency not found using traditional SNMP. Sensors and collectionhosts can be configured statically on the host (dial-out) or the set ofsensors, collection hosts, and their attributes can be managed off-boxusing OpenConfig or native IOS-XR YANG models. Pipeline is Cisco’s opensource collector, which can take MDT data as an input and output it viaa plugin architecture supporting scalable messages buses such as Kafka,or directly to a TSDB such as InfluxDB or Prometheus. The appendixcontains information about MDT YANG paths relevant to the peering fabricand their applicability to PFS and PFL nodes.BGP Monitoring ProtocolBMP, defined in RFC7854, is a protocol to monitor BGP RIB information,updates, and protocol statistics. BMP was created to alleviate theburden of collecting BGP routing information using inefficientmechanisms like screen scraping. BMP has two primary modes, RouteMonitoring mode and Route Mirroring mode. 
The monitoring mode willinitially transmit the adj-rib-in contents per-peer to a monitoringstation, and continue to send updates as they occur on the monitoreddevice. Setting the L bits on the RM header to 1 will convey this is apost-policy route, 0 will indicate pre-policy. The mirroring mode simplyreflects all received BGP messages to the monitoring host. IOS-XRsupports sending pre and post policy routing information and updates toa station via the Route Monitoring mode. BMP can additionally sendinformation on peer state change events, including why a peer went downin the case of a BGP event.There are drafts in the IETF process led by Cisco to extend BMP toreport additional routing data, such as the loc-RIB and per-peeradj-RIB-out. Local-RIB is the full device RIB include ng received BGProutes, routes from other protocols, and locally originated routes.Adj-RIB-out will add the ability to monitor routes advertised to peerspre and post routing policy.Netflow / IPFIXNetflow was invented by Cisco due to requirements for traffic visibilityand accounting. Netflow in its simplest form exports 5-tuple data foreach flow traversing a Netflow-enabled interface. Netflow data isfurther enhanced with the inclusion of BGP information in the exportedNetflow data, namely AS_PATH and destination prefix. This inclusionmakes it possible to see where traffic originated by ASN and derive thedestination for the traffic per BGP prefix. The latest iteration ofCisco Netflow is Netflow v9, with the next-generation IETF standardizedversion called IPFIX (IP Flow Information Export). IPFIX has expanded onNetflow’s capabilities by introducing hundreds of entities.Netflow is traditionally partially processed telemetry data. The deviceitself keeps a running cache table of flow entries and countersassociated with packets, bytes, and flow duration. At certain timeintervals or event triggered, the flow entries are exported to acollector for further processing. The type 315 extension to IPFIX,supported on the NCS5500, does not process flow data on the device, butsends the raw sampled packet header to an external collector for allprocessing. Due to the high bandwidth, PPS rate, and large number ofsimultaneous flows on Internet routers, Netflow samples packets at apre-configured rate for processing. Typical sampling values on peeringrouters are 1 in 8192 packets, however customers implementing Netflow orIPFIX should work with Cisco to fine tune parameters for optimal datafidelity and performance.Automation and ProgrammabilityCisco NSO ModulesCisco Network Services Orchestrator is a widely deployed networkautomation and orchestration platform, performing intent-drivenconfiguration and validation of networks from a single source of truthconfiguration database. The Peering design includes a Cisco NSOmodules to perform specific peering tasks such as peer turn-up, peermodification, deploying routing policy and ACLs to multiple nodes,providing a jumpstart to peering automation. The following table highlights the currently available Peering NSO services. The current peering service models use the IOS-XR CLI NED and are validated with NSO 4.5.5. 
Service Description peering-service Manage full BGP and Interface Configuration for EBGP Peers peering-acl Manage infrastructure ACLs referenced by the peering service prefix-set Manage IOS-XR prefix-sets as-path-set Manage IOS-XR as-path sets route-policy Manage XR routing policies for deployment to multiple peering nodes peering-common A set of services to manage as-path sets, community sets, and static routing policies drain-service Service to automate draining traffic away from a node under maintenance telemetry Service to enable telemetry sensors and export to collector bmp Service to enable BMP on configured peers and export to monitoring station netflow Service to enable Netflow on configured peer interfaces and export to collector PFL-to-PFS-Routing Configures IGP and BGP routing between PFL and PFS nodes PFS-Global-BGP Configures global BGP parameters for PFS nodes PFS-Global-ISIS Configures global IS-IS parameters for PFS nodes Netconf Netconf is an industry standard method for configuring network devices. Standardized in RFC 6241, Netconf has standard Remote Procedure Calls (RPCs) to manipulate configuration data and retrieve state data. Netconf on IOS-XR supports the candidate datastore, meaning configuration must be explicitly committed for application to the running configuration. YANG Model Support While Netconf created standard RPCs for managing configuration on a device, it did not define a language for expressing configuration. The configuration syntax communicated by Netconf followed the typical CLI configuration, proprietary to each network vendor and XML-formatted without following any common semantics. YANG, or Yet Another Next Generation, is a modeling language to express configuration using standard elements such as containers, groups, lists, and endpoint data called leafs. YANG 1.0 was defined in RFC 6020 and updated to version 1.1 in RFC 7950. Vendors cover the majority of device configuration and state using native YANG models unique to each vendor, but the industry is headed towards standardized models where applicable. Groups such as OpenConfig and the IETF are developing standardized YANG models allowing operators to write a configuration once across all vendors. Cisco has implemented a number of standard OpenConfig network models relevant to peering, including the BGP protocol, BGP RIB, and Interfaces models. The appendix contains information about YANG paths relevant to configuring the peering fabric and their applicability to PFS and PFL nodes. 3rd Party Hosted Applications IOS-XR starting in 6.0 runs on an x86 64-bit Linux foundation. The move to an open and well supported operating system, with XR components running on top of it, allows network providers to run 3rd party applications directly on the router. There is a wide variety of applications which can run on the XR host, with fast path interfaces in and out of the application. Example applications are telemetry collection, custom network probes, or tools to manage other portions of the network within a location. XR Service Layer API The XR service layer API is a gRPC-based API to extract data from a device as well as provide a very fast programmatic path into the router’s runtime state. One use case of the SL API in the peering fabric is to directly program FIB entries on a device, overriding the default path selection. Using telemetry extracted from a peering fabric, an external controller can use the data and additional external constraints to programmatically direct traffic across the fabric.
SL API alsosupports transmission of event data via subscriptions.Recommended Device and Protocol ConfigurationOverviewThe following configuration guidelines will step through the majorcomponents of the device and protocol configuration specific to thepeering fabric and highlight non-default configuration recommended foreach device role and the reasons behind those choices. Complete exampleconfigurations for each role can be found in the Appendix of thisdocument. Configuration specific to telemetry is covered in section 4.Common Node ConfigurationThe following configuration is common to both PFL and PFS NCS5500 seriesnodes.Enable LLDP GloballylldpPFS NodesAs the PFS nodes will integrate into the core control-plane, onlyrecommended configuration for connectivity to the PFL nodes is given.IGP Configurationrouter isis pf-internal-core set-overload-bit on-startup wait-for-bgp is-type level-1-2 net <L2 NET> net <L1 PF NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10! interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10Segment Routing Traffic EngineeringIn IOS-XR there are two mechanisms for configuring SR-TE. Prior to IOS-XR 6.3.2 SR-TE was configured using the MPLS traffic engineering tunnel interface configuration. Starting in 6.3.2 SR-TE can now be configured using the more flexible SR-TE Policy model. The following examples show how to define a static SR-TE path from PFS node to exit PE node using both the legacy tunnel configuration model as well as the new SR Policy model.Paths to PE exit node being load balanced across two static P routers using legacy tunnel configexplicit-path name PFS1-P1-PE1-1 index 1 next-address 192.168.12.1 index 2 next-address 192.168.11.1!explicit-path name PFS1-P2-PE1-1 index 1 next-label 16221 index 2 next-label 16511!interface tunnel-te1 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.1 path-option 1 explicit name PFS1-P1-PE1-1 segment-routing!interface tunnel-te2 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.2 path-option 1 explicit name PFS1-P2-PE1-1 segment-routingIOS-XR 6.3.2+ SR Policy Configurationsegment-routingtraffic-eng segment-list PFS1-P1-PE1-SR-1 index 1 mpls label 16211 index 2 mpls label 16511 ! segment-list PFS1-P2-PE1-SR-1 index 1 mpls label 16221 index 2 mpls label 16511 ! policy pfs1_pe1_via_p1 binding-sid mpls 900001 color 1 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! ! policy pfs1_pe1_via_p2 binding-sid mpls 900002 color 2 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! 
!BGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATH is longer address-family ipv4 unicast additional-paths receive maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths receive bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF Model-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1PFL NodesPeer QoS PolicyPolicy applied to edge of the network to rewrite any incoming DSCP valueto 0.policy-map peer-qos-in class class-default set dscp default ! 
end-policy-map!Peer Infrastructure ACLSee the Security section of the document for recommended best practicesfor ingress and egress infrastructure ACLs.access-group v4-infra-acl-in access-group v6-infra-acl-in access-group v4-infra-acl-out access-group v6-infra-acl-out Peer Interface Configurationinterface TenGigE0/0/0/0 description “external peer” service-policy input peer-qos-in ;Explicit policy to rewrite DSCP to 0 lldp transmit disable #Do not run LLDP on peer connected interfaces lldp receive disable #Do not run LLDP on peer connected interfaces ipv4 access-group v4-infra-acl-in #IPv4 Ingress infrastructure ACL ipv4 access-group v4-infra-acl-out #IPv4 Egress infrastructure ACL, BCP38 filtering ipv6 access-group v6-infra-acl-in #IPv6 Ingress infrastructure ACL ipv6 access-group v6-infra-acl-out #IPv6 Egress infrastructure ACL, BCP38 filtering IS-IS IGP Configurationrouter isis pf-internal set-overload-bit on-startup wait-for-bgp is-type level-1 net <L1 Area NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10 ! interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10BGP Add-Path Route Policyroute-policy advertise-all ;Create policy for add-path advertisements set path-selection all advertiseend-policyBGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATh is longer address-family ipv4 unicast bgp attribute-download ;Enable BGP information for Netflow/IPFIX export additional-paths send additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv4 NLRI to PFS maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths send additional-paths receive additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv6 NLRI to PFS bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF EBGP Peer Configurationsession-group peer-session ignore-connected-check #Allow loopback peering over ECMP w/o EBGP Multihop egress-engineering #Allocate adj-peer-SID bmp-activate server 1 #Optional send BMP data to receiver 1af-group v4-af-peer address-family ipv4 unicast soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 1000 80;Set maximum inbound prefixes, warning at 80% thresholdaf-group v6-af-peer soft-reconfiguration inbound always #Store 
inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 100 80 #Set maximum inbound prefixes, warning at 80% thresholdneighbor-group v4-peer use session-group peer-session dmz-link-bandwidth ;Propagate external link BW address-family ipv4 unicast af-group v4-af-peerneighbor-group v6-peer use session-group peer-session dmz-link-bandwidth address-family ipv6 unicast af-group v6-af-peer neighbor 1.1.1.1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v4-peer address-family ipv4 unicast route-policy v4-peer-in(12345) in route-policy v4-peer-out(12345) out neighbor 2001#dead#b33f#0#1#1#1#1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v6-peer address-family ipv6 unicast route-policy v6-peer-in(12345) in route-policy v6-peer-out(12345) out PFL to PFS IBGP Configurationsession-group pfs-session bmp-activate server 1 #Optional send BMP data to receiver 1 update-source Loopback0 #Set BGP session source address to Loopback0 address af-group v4-af-pfs address-family ipv4 unicast next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v4-pfs-in in route-policy v4-pfs-out out af-group v6-af-pfs next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v6-pfs-in in route-policy v6-pfs-out out neighbor-group v4-pfs ! use session-group pfs-session address-family ipv4 unicast af-group v4-af-pfsneighbor-group v6-pfs ! use session-group pfs-session address-family ipv6 unicast af-group v6-af-pfs neighbor <PFS IP> description ~PFS #1~ remote-as <local ASN> use neighbor-group v4-pfsNetflow/IPFIX Configurationflow exporter-map nf-export version v9 options interface-table timeout 60 options sampler-table timeout 60 template timeout 30 ! transport udp <port> source Loopback0 destination <dest>flow monitor-map flow-monitor-ipv4 record ipv4 option bgpattr exporter nf-export cache entries 50000 cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-ipv6 record ipv6 option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-mpls record mpls ipv4-ipv6-fields option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10 sampler-map nf-sample-8192 random 1 out-of 8192Peer Interfaceinterface Bundle-Ether100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressPFS Upstream Interfaceinterface HundredGigE0/0/0/100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressModel-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. 
The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1Abstract Peering ConfigurationAbstract peering uses qualities of Segment Routing anycast addresses toallow a provider to steer traffic to a specific peering fabric by simplyaddressing a node SID assigned to all PFS members of the peeringcluster. All of the qualities of SR such as midpoint ECMP and TI-LFAfast protection are preserved for the end to end BGP path, improvingconvergence across the network to the peering fabric. Additionally,through the use of SR-TE Policy, source routed engineered paths can beconfigured to the peering fabric based on business logic and additionalpath constraints.PFS ConfigurationOnly the PFS nodes require specific configuration to perform abstractpeering. Configuration shown is for example only with IS-IS configuredas the IGP carrying SR information. The routing policy setting thenext-hop to the AP anycast SID should be incorporated into standard IBGPoutbound routing policy.interface Loopback1 ipv4 address x.x.x.x/32 ipv6 address x#x#x#x##x/128 router isis <ID> passive address-family ipv4 unicast prefix-sid absolute <Global IPv4 AP Node SID> address-family ipv6 unicast prefix-sid absolute <Global IPv6 AP Node SID> route-policy v4-abstract-ibgp-out set next-hop <Loopback1 IPv4 address> route-policy v6-abstract-ibgp-out set next-hop <Loopback1 IPv6 address> router bgp <ASN> ibgp policy out enforce-modifications ;Enables a PFS node to set a next-hop address on routes reflected to IBGP peersrouter bgp <ASN> neighbor x.x.x.x address-family ipv4 unicast route-policy v4-abstract-ibgp-out neighbor x#x#x#x##x address-family ipv6 unicast route-policy v6-abstract-ibgp-out BGP Flowspec Configuration and OperationBGP Flowspec consists of two different node types. The BGP Flowspec Server is where Flowspec policy is defined and sent to peers via BGP sessions with the BGP Flowspec IPv4 and IPv6 AFI/SAFI enabled. The BGP Flowspec Client receives Flowspec policy information and applies the proper dataplane match and action criteria via dynamic ACLs applied to each routerinterface. By default, IOS-XR applies the dynamic policy to all interfaces, with an interface-level configuration setting used to disable BGP Flowspec on specific interfaces.In the Peering Fabric, PFL nodes will act as Flowspec clients. The PFS nodes may act as Flowspec servers, but will never act as clients.Flowspec policies are typically defined on an external controller to be advertised to the rest of the network. The XRv-9000 virtual router works well in these instances. 
If one is using an external element to advertise Flowspec policies to the peering fabric, they should be advertised to the PFS nodes, which will reflect them to the PFL nodes. In the absence of an external policy injector, Flowspec policies can be defined on the Peering Fabric PFS nodes for advertisement to all PFL nodes. IPv6 Flowspec on the NCS5500 requires the use of the following global command, followed by a device reboot.
hw-module profile flowspec ipv6-enable
Enabling BGP Flowspec Address Families on PFS and PFL Nodes
Following the standard Peering Fabric BGP group definitions, the following new groups are added. The configuration below assumes the PFS node is the BGP Flowspec server.
PFS
router bgp <ASN>
 address-family ipv4 flowspec
 address-family ipv6 flowspec
 af-group v4-flowspec-af-pfl
  address-family ipv4 flowspec
   multipath
   route-reflector-client
   next-hop-self
 af-group v6-flowspec-af-pfl
  address-family ipv6 flowspec
   multipath
   route-reflector-client
   next-hop-self
 neighbor-group v4-pfl
  address-family ipv4 flowspec
   use af-group v4-flowspec-af-pfl
 neighbor-group v6-pfl
  address-family ipv6 flowspec
   use af-group v6-flowspec-af-pfl
PFL
router bgp <ASN>
 address-family ipv4 flowspec
 address-family ipv6 flowspec
 af-group v4-flowspec-af-pfs
  address-family ipv4 flowspec
   multipath
 af-group v6-flowspec-af-pfs
  address-family ipv6 flowspec
   multipath
 neighbor-group v4-pfs
  address-family ipv4 flowspec
   use af-group v4-flowspec-af-pfs
 neighbor-group v6-pfs
  address-family ipv6 flowspec
   use af-group v6-flowspec-af-pfs
BGP Flowspec Server Policy Definition
Policies are defined using standard IOS-XR QoS configuration. The first example below matches the recent memcached DDoS attack and drops all matching traffic. Additional examples are given covering various packet matching criteria and actions.
class-map type traffic match-all memcached
 match destination-port 11211
 match protocol udp tcp
 match destination-address ipv4 10.0.0.0 255.255.255.0
 end-class-map
!
policy-map type pbr drop-memcached
 class type traffic memcached
  drop
 !
 class type traffic class-default
 !
 end-policy-map
class-map type traffic match-all icmp-echo-flood
 match protocol icmp
 match ipv4 icmp type 8
 match destination-address ipv4 10.0.0.0 255.255.255.0
 end-class-map
!
policy-map type pbr limit-icmp-echo
 class type traffic icmp-echo-flood
  police rate 100 kbps
 !
 class type traffic class-default
 !
 end-policy-map
class-map type traffic match-all dns
 match protocol udp
 match source-port 53
 end-class-map
!
policy-map type pbr redirect-dns
 class type traffic dns
  police rate 100 kbps
  redirect nexthop 1.1.1.1
  redirect nexthop route-target 1000:1
 !
 class type traffic class-default
 !
 end-policy-map
BGP Flowspec Server Enablement
The following global configuration enables the Flowspec server and advertises the policy via the BGP Flowspec NLRI.
flowspec
 address-family ipv4
  service-policy type pbr drop-memcached
BGP Flowspec Client Configuration
The following global configuration enables the BGP Flowspec client function and installation of policies on all local interfaces. Flowspec can be disabled on individual interfaces using the [ipv4|ipv6] flowspec disable command in interface configuration mode.
flowspec
 address-family ipv4
  local-install interface-all
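As a small illustration of the interface-level override just described, the sketch below keeps Flowspec enabled globally on a PFL node but opts a single attachment out of dynamic policy installation. The interface name simply reuses the Bundle-Ether100 peer interface from the earlier examples; any interface could be substituted, and the IPv6 line assumes the NCS5500 IPv6 Flowspec profile noted above has been enabled.
interface Bundle-Ether100
 ! Opt this peer interface out of dynamically installed Flowspec ACLs
 ipv4 flowspec disable
 ipv6 flowspec disable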
QPPB Configuration and Operation
QoS Policy Propagation using BGP is described in more detail in the Security section. QPPB applies standard QoS policies to packets matching BGP prefix criteria such as BGP community or AS Path. QPPB is supported for both IPv4 and IPv6 address families and packets. QPPB on the NCS5500 supports matching destination prefix attributes only.
QPPB configuration starts with a standard RPL route policy that matches BGP attributes and sets a specific QoS group based on that criteria. This routing policy is applied to each address family as a table-policy in the global BGP configuration. A standard MQC QoS policy is then defined using the specific QoS groups as match criteria to apply additional QoS behavior such as filtering, marking, or policing. This policy is applied to a logical interface, with a specific QPPB command used to enable the propagation of BGP data as part of the dataplane ACL packet match criteria. IPv6 QPPB on the NCS5500 requires the use of the following global command, followed by a device reboot.
hw-module profile qos ipv6 short
Routing Policy Configuration
route-policy qppb-test
 if community matches-every (1000:1) then
  set qos-group 1
 endif
 if community matches-every (1000:2) then
  set qos-group 2
 endif
end-policy
Global BGP Configuration
router bgp <ASN>
 address-family ipv4 unicast
  table-policy qppb-test
 address-family ipv6 unicast
  table-policy qppb-test
QoS Policy Definition
class-map match-any qos-group-1
 match qos-group 1
 end-class-map
class-map match-any qos-group-2
 match qos-group 2
 end-class-map
policy-map remark-peer-traffic
 class qos-group-1
  set precedence 5
  set mpls experimental imposition 5
 !
 class qos-group-2
  set precedence 3
  set mpls experimental imposition 3
 !
 class class-default
 !
 end-policy-map
Interface-Level Configuration
interface GigabitEthernet0/0/0/1
 service-policy input remark-peer-traffic
 ipv4 bgp policy propagation input qos-group destination
 ipv6 bgp policy propagation input qos-group destination
Security
Peering by definition is at the edge of the network, where security is mandatory. While not exclusive to peering, there are a number of best practices and software features that, when implemented, will protect your own network as well as others from malicious sources within your network.
Peering and Internet in a VRF
Using VRFs to isolate peers and the Internet routing table from the infrastructure can enhance security by keeping internal infrastructure components separate from Internet and end user reachability. VRF separation can be done one of three different ways:
 Separate each peer into its own VRF, use the default VRF on the SP network
 Single VRF for all “Internet” endpoints, including peers
 Separate each peer into its own VRF, and use a separate “Internet” VRF
VRF per Peer, default VRF for Internet
In this method each peer, or group of peers, is configured under a separate VRF. The SP carries these and all other routes via the default VRF in IOS-XR, commonly known as the Global Routing Table. The VPNv4 and VPNv6 address families are NOT configured on the BGP peering sessions between the PFL and PFS nodes or between the PFS nodes and the rest of the network. IOS-XR provides the commands import from default-vrf and export to default-vrf with a route-policy to match specific routes to be imported to/from each peer VRF to the default VRF. This provides dataplane isolation between peers and another mechanism to determine which SP routes are advertised to each peer.
Internet in a VRF Only
In this method all Internet endpoints are configured in the same “Internet” VRF. The security benefit is removing dataplane connectivity between the global Internet and your underlying infrastructure, which uses the default VRF for all internal connectivity.
This method uses the VPNv4/VPNv6 address families on all BGP peers and requires the Internet VRF be configured on all peering fabric nodes as well as SP PEs participating in the global routing table. If there are VPN customers or public-facing services in their own VRF needing Internet access, routes can be imported/exported from the Internet VRF on the PE devices they attach to.VRF per Peer, Internet in a VRFThis method combines the properties and configuration of the previous two methods for a solution with dataplane isolation per peer and separation of all public Internet traffic from the SP infrastructure layer. The exchange of routes between the peer VRFs and Internet VRF takes place on the PFL nodes with the rest of the network operating the same as the Internet in a VRF use case.The VPNv4 and VPNv6 address families must be configured across all routers in the network.Infrastructure ACLsInfrastructure ACLs and their associated ACEs (Access Control Entries)are the perimeter protection for a network. The recommended PFL deviceconfiguration uses IPv4 and IPv6 infrastructure ACLs on all edgeinterfaces. These ACLs are specific to each provider’s security needs,but should include the following sections. Filter IPv4 and IPv6 BOGON space ingress and egress Drop ingress packets with a source address matching your own aggregate IPv4/IPv6 prefixes. Rate-limit ingress traffic to Unix services typically used in DDoSattacks, such as chargen (TCP/19). On ingress and egress, allow specific ICMP types and rate-limit toappropriate values, filter out ones not needed on your network. ICMPttl-exceeded, host unreachable, port unreachable, echo-reply,echo-request, and fragmentation needed should always be allowed in somecapacity.BCP ImplementationBest Current Practices are informational documents published by the IETFto give guidelines on operational practices. This document will notoutline the contents of the recommended BCPs, but two in particular areof interest to Internet peering. BCP38 explains the need to filterunused address space at the edges of the network, minimizing the chancesof spoofed traffic from DDoS sources reaching their intended target.BCP38 is applicable for ingress traffic and especially egress traffic,as it stops spoofed traffic before it reaches outside your network.BCP194, BGP Operations and Security, covers a number of BGP operationalpractices, many of which are used in Internet peering. IOS-XR supportsall of the mechanisms recommended in BCP38, BCP84, and BCP194, includingsoftware features such as GTTL, BGP dampening, and prefix limits.BGP Attribute and CoS ScrubbingScrubbing of data on ingress and egress of your network is an importantsecurity measure. Scrubbing falls into two categories, control-plane anddataplane. The control-plane for Internet peering is BGP and there are afew BGP transitive attributes one should take care to normalize. Yourinternal BGP communities should be deleted from outbound BGP NLRI viaegress policy. Most often you are setting communities on inboundprefixes, make sure you are replacing existing communities from the peerand not adding communities. Unless you have an agreement with the peer,normalize the MED attribute to zero or another standard value on allinbound prefixes.In the dataplane, it’s important to treat the peering edge as untrustedand clear any CoS markings on inbound packets, assuming a prioragreement hasn’t been reached with the peer to carry them across thenetwork boundary. 
It’s an overlooked aspect which could lead to peertraffic being prioritized on your network, leading to unexpected networkbehavior. An example PFL infrastructure ACL is given resetting incomingIPv4/IPv6 DSCP values to 0.Per-Peer Control Plane PolicersBGP protocol packets are handled at the RP level, meaning each packet ishandled by the router CPU with limited bandwidth and processingresources. In the case of a malicious or misconfigured peer this couldexhaust the processing power of the CPU impacting other important tasks.IOS-XR enforces protocol policers and BGP peer policers by default.BGP Prefix SecurityRPKI Origin ValidationPrefix hijacking has been prevalent throughout the last decade as theInternet became more integrated into our lives. This led to the creationof RPKI origin validation, a mechanism to validate a prefix was beingoriginated by its rightful owner by checking the originating ASN vs. asecure database. IOS-XR fully supports RPKI for origin validation.BGPSEC (Reference Only)RPKI origin validation works to validate the source of a prefix, butdoes not validate the entire path of the prefix. Origin validation alsodoes not use cryptographic signatures to ensure the originator is whothey say they are, so spoofing the ASN as well does not stop someoneform hijacking a prefix. BGPSEC is an evolution where a BGP prefix iscryptographically signed with the key of its valid originator, and eachBGP router receiving the path checks to ensure the prefix originatedfrom the valid owner. BGPSEC standards are being worked on in the SIDRworking group. Cisco continues to monitor the standards related to BGPSEC and similar technologies to determine which to implement to best serve our customers.AppendixApplicable YANG ModelsModelDataopenconfig-interfacesCisco-IOS-XR-infra-statsd-operCisco-IOS-XR-pfi-im-cmd-operInterface config and state Common counters found in SNMP IF-MIB openconfig-if-ethernet Cisco-IOS-XR-drivers-media-eth-operEthernet layer config and stateXR native transceiver monitoringopenconfig-platformInventory, transceiver monitoring openconfig-bgpCisco-IOS-XR-ipv4-bgp-oper Cisco-IOS-XR-ipv6-bgp-operBGP config and state Includes neighbor session state, message counts, etc.openconfig-bgp-rib Cisco-IOS-XR-ip-rib-ipv4-oper Cisco-IOS-XR-ip-rib-ipv6-operBGP RIB information. Note# Cisco native includes all protocols openconfig-routing-policyConfigure routing policy elements and combined policyopenconfig-telemetryConfigure telemetry sensors and destinations Cisco-IOS-XR-ip-bfd-cfg Cisco-IOS-XR-ip-bfd-operBFD config and state Cisco-IOS-XR-ethernet-lldp-cfg Cisco-IOS-XR-ethernet-lldp-operLLDP config and state openconfig-mplsMPLS config and state, including Segment RoutingCisco-IOS-XR-clns-isis-cfgCisco-IOS-XR-clns-isis-operIS-IS config and state Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-operNCS 5500 HW resources NETCONF YANG PathsNote that while paths are given to retrieve data from a specific leafnode, it is sometimes more efficient to retrieve all the data under aspecific heading and let a management station filter unwanted data thanperform operations on the router. Additionally, Model Driven Telemetrymay not work at a leaf level, requiring retrieval of an entire subset ofdata.The data is also available via NETCONF, which does allow subtree filtersand retrieval of specific data. 
However, this is a more resourceintensive operation on the router.MetricData     Logical Interface Admin State Enum SNMP OID IF-MIB#ifAdminStatus OC YANG openconfig-interfaces#interfaces/interface/state/admin-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Interface Operational State Enum SNMP OID IF-MIB#ifOperStatus OC YANG openconfig-interfaces#interfaces/interface/state/oper-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Last State Change (seconds) Counter SNMP OID IF-MIB#ifLastChange OC YANG openconfig-interfaces#interfaces/interface/state/last-change Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/last-state-transition-time     Logical Interface SNMP ifIndex Integer SNMP OID IF-MIB#ifIndex OC YANG openconfig-interfaces#interfaces/interface/state/if-index Native YANG Cisco-IOS-XR-snmp-agent-oper#snmp/interface-indexes/if-index     Logical Interface RX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCInOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-received     Logical Interface TX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCOutOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-sent     Logical Interface RX Errors Counter SNMP OID IF-MIB#ifInErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-errors MDT Native     Logical Interface TX Errors Counter SNMP OID IF-MIB#ifOutErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-errors     Logical Interface Unicast Packets RX Counter SNMP OID IF-MIB#ifHCInUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Unicast Packets TX Counter SNMP OID IF-MIB#ifHCOutUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Input Drops Counter SNMP OID IF-MIB#ifIntDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-drops     Logical Interface Output Drops Counter SNMP OID IF-MIB#ifOutDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-drops     Ethernet Layer Stats – All Interfaces Counters SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics     Ethernet PHY State – All Interfaces Counters SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver Native YANG 
Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info     Ethernet Input CRC Errors Counter SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state/oc-eth#counters/oc-eth#in-crc-errors Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics/statistic/dropped-packets-with-crc-align-errors The following transceiver paths retrieve the total power for thetransceiver, there are specific per-lane power levels which can beretrieved from both native and OC models, please refer to the model YANGfile for additionalinformation.     Ethernet Transceiver RX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-rx-power     Ethernet Transceiver TX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-tx-power BGP Operational StateGlobal BGP Protocol StateIOS-XR native models do not store route information in the BGP Opermodel, they are stored in the IPv4/IPv6 RIB models. These models containRIB information based on protocol, with a numeric identifier for eachprotocol with the BGP ProtoID being 5. The protoid must be specified orthe YANG path will return data for all configured routingprotocols.     BGP Total Paths (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-paths Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/num-active-paths MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native BGP Neighbor StateExample UsageDue the construction of the YANG model, the neighbor-address key must beincluded as a container in all OC BGP state RPCs. 
The following RPC getsthe session state for all configured peers#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address/> <state> <session-state/> </state> </neighbor> </neighbors> </bgp> </filter> </get></rpc>\t<nc#rpc-reply message-id=~urn#uuid#24db986f-de34-4c97-9b2f-ac99ab2501e3~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address>172.16.0.2</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> </neighbors> </bgp> </nc#data></nc#rpc-reply>     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Session State for all BGP neighbors Enum SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/session-state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/connection-state     Message counters for all BGP neighbors Counter SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/messages Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/message-statistics Current queue depth for all BGP neighborsCounterSNMP OIDNAOC YANG/openconfig-bgp#bgp/neighbors/neighbor/state/queuesNative YANGCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-outCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-inBGP RIB DataRIB data is retrieved per AFI/SAFI. To retrieve IPv6 unicast routesusing OC models, replace “ipv4-unicast” with “ipv6-unicast”IOS-XR native models do not have a BGP specific RIB, only RIB dataper-AFI/SAFI for all protocols. 
Retrieving RIB information from thesepaths will include this data.While this data is available via both NETCONF and MDT, it is recommendedto use BMP as the mechanism to retrieve RIB table data.Example UsageThe following retrieves a list of best-path IPv4 prefixes withoutattributes from the loc-RIB#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <loc-rib> <routes> <route> <prefix/> <best-path>true</best-path> </route> </routes> </loc-rib> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc>     IPv4 Local RIB – Prefix Count Counter OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/num-routes Native YANG       IPv4 Local RIB – IPv4 Prefixes w/o Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes/route/prefix     IPv4 Local RIB – IPv4 Prefixes w/Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes Native YANG   The following per-neighbor RIB paths can be qualified with a specificneighbor address to retrieve RIB data for a specific peer. Below is anexample of a NETCONF RPC to retrieve the number of post-policy routesfrom the 192.168.2.51 peer and the returned output.<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes/> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc><nc#rpc-reply message-id=~urn#uuid#7d9a0468-4d8d-4008-972b-8e703241a8e9~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <afi-safi-name xmlns#idx=~http#//openconfig.net/yang/rib/bgp-types~>idx#IPV4_UNICAST</afi-safi-name> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes>3</num-routes> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </nc#data></nc#rpc-reply>     IPv4 Neighbor adj-rib-in pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-re     IPv4 Neighbor adj-rib-in post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-post     IPv4 Neighbor adj-rib-out pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre     IPv4 Neighbor adj-rib-out post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre BGP Flowspec     BGP Flowspec Operational State Counters SNMP OID NA OC YANG NA Native YANG Cisco-IOS-XR-flowspec-oper MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native Device Resource YANG Paths     Device Inventory List OC YANG oc-platform#components     NCS5500 Dataplane Resources List OC YANG NA Native YANG 
Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Validated Model-Driven Telemetry Sensor PathsThe following represents a list of validated sensor paths useful formonitoring the Peering Fabric and the data which can be gathered byconfiguring these sensorpaths.Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform openconfig-platform#components cisco-ios-xr-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data cisco-ios-xr-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info cisco-ios-xr-shellutil-oper#system-time/uptime cisco-ios-xr-wdsysmon-fd-oper#system-monitoring/cpu-utilizationLLDP MonitoringCisco-IOS-XR-ethernet-lldp-oper#lldpCisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighborsInterface statistics and stateopenconfig-interfaces#interfacesCisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-countersCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interfaceCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statisticsCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics/basic-interface-statsThe following sub-paths can be used but it is recommended to use the base openconfig-interfaces modelopenconfig-interfaces#interfaces/interfaceopenconfig-interfaces#interfaces/interface/stateopenconfig-interfaces#interfaces/interface/state/countersopenconfig-interfaces#interfaces/interface/subinterfaces/subinterface/state/countersAggregate bundle information (use interface models for interface counters)sensor-group openconfig-if-aggregate#aggregatesensor-group openconfig-if-aggregate#aggregate/statesensor-group openconfig-lacp#lacpsensor-group Cisco-IOS-XR-bundlemgr-oper#bundlessensor-group Cisco-IOS-XR-bundlemgr-oper#bundle-information/bfd-countersBGP Peering informationsensor-path openconfig-bgp#bgpsensor-path openconfig-bgp#bgp/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticsIS-IS IGP informationsensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighborssensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfacessensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/adjacenciesIt is not 
recommended to monitor complete RIB tables using MDT but can be used for troubleshootingCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countQoS and ACL monitoringopenconfig-acl#aclCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/general-statsCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/queue-stats-arrayBGP RIB informationIt is not recommended to monitor these paths using MDT with large tablesopenconfig-rib-bgp#bgp-ribCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intRouting policy InformationCisco-IOS-XR-policy-repository-oper#routing-policy/policies", "url": "/blogs/2018-10-01-peering-fabric-hld/", "author": "Phil Bedard", "tags": "iosxr, Peering, Design" } , "#": {} , "blogs-2019-02-02-modernizing-ixp-design": { "title": "", "content": " On This Page Modern IX Fabric Design Internet Exchange History Initial Exchanges Growth through Internet Privatization Dawn of the NAP and Rapid IX Expansion IX Design Evolution Initial Exchange Design Ethernet Takes Over More Advanced Fabric Transport VPLS and P2P PW over MPLS EVPN over VXLAN TRILL, PBB, and other L2 Fabric Technology Modern IX Fabric Requirements Hardware High-Density and Future-Proof Edge and WAN Interface Flexibility Power and Space Packet Transport Requirements Service Types and Requirements Point to Point Connectivity Multi-point Connectivity (Multi-lateral Peering) Layer 3 Cloud or Transit Connectivity QoS Requirements Broadcast, Unknown Unicast, and Multicast Traffic Security Peer Connection Isolation L2 Security L3 Security Additional IX Components Fabric and Peer Service Automation Route Servers Route Looking Glass Analytics and Monitoring Fabric Telemetry Route Update History Modern IXP Fabric Network Design Topology Considerations Scale Out Design Fabric Design Benefits Segment Routing Underlay Segment Routing and Segment Routing Traffic Engineering SR-MPLS Data Plane Segment-Routing Flexible Algorithms Constraint-based SR Policies On-Demand SR Policies Segment Routing Benefits in IXP Use L2 IX Services using Segment Routing and EVPN EVPN Background EVPN Benefits for IX Use Modern IXP Deployment Background Single-plane Segment Routing Underlay Deployment Topology Diagram for Single-plane Fabric 
SRGB and SRLB Definition Base IGP / Segment Routing Configuration Enabling TI-LFA Dual-Plane Fabric using SR Flexible Algorithms Flex-Algo Background Diagram Dual-plane Flex-Algo Configuration Simple L2VPN Services using EVPN BGP AFI/SAFI Configuration EVPN Service Configuration Elements EVI - Ethernet Virtual Instance RD - Route Distinguisher RT - Route Target ESI - Ethernet Segment Identifier Attachment Circuit ID Topology Diagram for Example Services P2P Peer Interconnect using EVPN-VPWS Single-homed EVPN-VPWS service Multi-homed Single-active/All-active EVPN-VPWS service EVPN ELAN Services EVPN ELAN with Single-homed Endpoints EVPN ELAN with Dual-homed Endpoint Appendix Segment Routing and EVPN Troubleshooting Commands Periodic Model Driven Telemetry Device Health Infrastructure Monitoring Routing Protocols Service Monitoring Event Driven Telemetry In our next blog we will explore advanced Segment Routing TE using ODN/Flex-Algo and Layer 3 services using L3VPN and EVPN IRB Modern IX Fabric DesignInternet Exchange HistoryInitial ExchangesThe Internet was founded on a loosely coupled open inter-connectivity model. The ability for two networks to use a simple protocol to exchange IP routing data was essential in the growth of the initial research-focused Internet as well as what it has become today. As the Internet grew it made sense to create locations where multiple networks could connect to each other over a common multi-access network. The initial exchanges connected to NFSNet in the 1980s were located in San Francisco, New York, and Chicago, with an additional European exchange in Stockholm, Sweden. During these days the connectivity was between universities and other government research institutions, but that would soon change as commercial interest in the Internet grew.Growth through Internet PrivatizationIn the early 1990s those running the Internet saw the first commercial companies join. The ANS Internet backbone was built by a consortium of commercial companies (PSI, MCI, and Merit) and those companies wanted to offer commercial services such as Email across the infrastructure and do other entities already connected to the Internet. One of the first commercial exchanges created was simply called the “Commercial Internet Exchange” or CIX, located in Reston, Virginia.One issue encountered in the initial exchange which still happens today is disagreements in connectivity. The original ANS backbone network, the most widely used network for connecting to the Internet, refused to connect to the CIX. As the Internet transitioned to a commercial network as opposed to a research network, the role of ANS diminished as exchanges like CIX become more important.Dawn of the NAP and Rapid IX ExpansionIn the mid 1990s as the Internet became more of a privatized public network, the need arose to create a number of public Internet peering exchange locations to help universities, enterprises, and service providers connect to each other. The original NAPs in the United States were run by either regional or national telecommunications companies. Figure 1 and the table below lists the original five US NAP locations. There were other exchange locations other than the NAPs, but these were the main peering points in the US, and at one point 90% of the worldwide Internet traffic flowed through these five locations. 
NAP Name Location Operator AADS Chicago, IL Ameritech MAE-EAST Washington DC, DC MFN, Pre-Existing Consortium NYIX New York, NY Sprint *MAE-WEST San Jose, CA MCI PAIX Palo Alto, CA PacBell MAE-WEST was not one of the original four NAPs awarded by NFSNET but was already established as a west coast IX prior to 1993 when those NAPs were awarded.The United States was not the only location in the world seeing the formation of Internet exchanges. The Amsterdam Internet Exchange, AMS-IX was formed in 1994 and is still the largest Internet Exchange in Europe.IX Design EvolutionAs the Internet has evolved so has IX design, driven by bandwidth growth and the need for more flexible interconnection as the scope of traffic and who connects to the Internet evolves.Initial Exchange DesignThe initial Internet exchanges were built to be multi-access networks where a participant could use a single physical connection for both private point to point and public connections. These exchanges primarily used either IP over FDDI or IP over ATM (over TDM) as the transport between peers. Some more forward-looking exchanges also used switched Ethernet, but it was not widely deployed in the mid-1990s. FDDI and ATM allowed the use of virtual circuits to provide point to point and multipoint connections over a common fabric. One important aspect of the fabrics is they used variable length data-link encoding, enabling packet-level statistical multiplexing. A fabric could be built using much less overall capacity than one using traditional TDM circuit switching. Interest in FDDI quickly waned and some IXs like MAE-EAST created a second fabric using ATM due to its popularity in the late 1990s and early 2000s.Ethernet Takes OverIn the late 1990s Ethernet was becoming popular to build Local Area Networks due to its simplicity and cost. The use of VLANs to segment Ethernet traffic into a number of virtual networks also gave it the same flexibility of ATM and FDDI. The original NAPs began to transition to Ethernet at this point for intra-exchange connectivity, such as the case when two providers have equipment co-located at the same IXP facility. One hurdle however was Ethernet circuits for WAN connectivity had not become popular yet, so it took some time for Ethernet to overtake ATM and FDDI completely.After Ethernet took over as the main transport for IX fabrics, they were primarily built using simple L2 switch fabrics. These switch fabrics do not have the loop prevention of IP networks, so protocols like STP or MST must be used for loop prevention. Ethernet fabrics are still in widespread use with IX networks, especially those with smaller scale requirements.More Advanced Fabric TransportAs IX fabrics began to grow, there arose a need for better control of traffic paths and more capabilities than what simple L2 fabrics could offer. This is not an exhaustive list of potential transport options, but lists a few popular ones for IX use.VPLS and P2P PW over MPLSMPLS (Multi-Protocol Label Switching) has been a popular data plane shim layer for providing virtual private networks over a common fabric for more than a decade now. Distribution of labels is done using LDP or RSVP-TE. RSVP-TE offer resilience through the use of fast-reroute and the ability to engineer traffic paths based on constraints. In the design section we will examine the benefits of Segment Routing over both LDP and RSVP-TE. MPLS itself is not enough to interconnect IX participants, it requires using overlay VPN services. 
VPLS (Virtual Private Lan Service) is the service most widely deployed today, emulating the data plane of a L2 switch, but carrying the traffic as MPLS-encapsulated frames. Point to point services are commonly provisioned using point to point pseudowires signaled using either a BGP or LDP control-plane. Please see RFC 4761 and RFC 6624.EVPN over VXLANWe will speak more about EVPN in the design section, but it has become the modern way to deliver L2VPN services, using a control-plane similar to L3VPN. EVPN extends MP-BGP with signaling extensions for L2 and L3 services that can utilize different underlying transport methods. VXLAN is one method which encapsulates Layer2 frames into a VXLAN packet carried over IP/UDP. VXLAN is considered “overlay transport” since it is carried in IP over any underlying path. VXLAN has no inherent ability to provide resiliency or traffic engineering capabilities. Gaining that functionality requires layering VXLAN on top of MPLS transport, adding complexity to the overall network. Using simple IP/UDP encapsulation, VXLAN is well suited for overlays traversing 3rd party opaque L3 networks, but IXP networks do not generally have this requirement.TRILL, PBB, and other L2 Fabric TechnologyAt a point in the early 2010s there was industry momentum towards creating a more advanced network without introducing L3 routing into the network, considered complex by those in favor of L2. Two protocols with support were TRILL (Transparent Interconnection of Lots of Links), 802.1ah PBB (Provider Backbone Bridging), and its TE addition PBB-TE. Proprietary fabrics were also proposed like Cisco FabricPath and Juniper QFabric. Ultimately these technologies faded and did not see widespread industry adoption.Modern IX Fabric RequirementsHardwareThe heart of an IX is the connectivity to participants at the edge and the transport connecting them to other participants. There are several components that needs to be considered when looking at hardware to build a modern IX.High-Density and Future-ProofBandwidth growth today requires the proper density to support the needs of the specific provider within a specific facility. This can range from 10s of Gbps to 10s of Tbps, requiring the ability to support a variable number of 10G and 100G interfaces. The ability to expand without replacing a chassis is also important as it can be especially cumbersome to do so in non-provider facilities. Today’s chassis-based deployments must be able to support a 400G future.Edge and WAN Interface FlexibilityEthernet is the single type of connectivity used today for connecting to an IX, but an IX does need flexibility to peers and the connectivity between IX fabric elements. 10G connections have largely replaced 1G connections for peers due to cost reduction in both device ports and optics. However, there is still a need for 1G connectivity as well as different physical medium types such as 10GBaseT. In order to support connectivity such as dark fiber, IXs sometimes must also provide ZR, ER, or DWDM based Ethernet port types.Modern IXs are also not typically limited ot a single physical location, so WAN connectivity also becomes important. Using technologies like coherent CFP2-DCO may be a requirement to interconnect IX facilities via high-speed flexible links.Power and SpaceIt almost goes without saying devices in a IX location must have the lowest possible space, power, and cooling footprint per BPS of traffic delivered. 
As interconnection continues to grow technology must advance to support higher density devices without considerably higher power and cooling requirements.Packet Transport RequirementsResiliency is key within any fabric, and the IX fabric must be able to withstand the failure of a link or node with minimal traffic disruption. Boundaries between different IX facilities must be redundantly connected.Service Types and RequirementsIn a modern IX there can be both traditional L2 connectivity as well as L3 connectivity. The following outlines the different service types and their requirements. One common requirement across all service types is redundant attachment. Each service type should support either active/active or active/standby connectivity from the participant.Point to Point ConnectivityThe most basic service type an IX fabric must provide is point to point participant connectivity. The edge must be able to accept and map traffic to a specific service based on physical port or VLAN tagged frames. The ability to rewrite VLANs at the edge may also be a requirement.Multi-point Connectivity (Multi-lateral Peering)Multi-point connectivity is most commonly used for multi-lateral peering fabrics or “public” fabrics where all participants belong to the same bridge domain. This type of fabric requires less configuration since a single IP interface is used to connect to multiple peers. BGP configuration can also be greatly simplified if the IX provides a route-server to advertise routes to participant peers versus a full mesh of peering sessions.Layer 3 Cloud or Transit ConnectivityDepending on the IX provider, they may offer their own blended transit services or multi-cloud connections via L3 connectivity. In this service type the IX will peer directly with the participant or provide a L3 gateway if the participant is not using dynamic routing to the IX.QoS RequirementsQuality of Service or Class of Service (CoS) covers an array of traffic handling components. With the use of higher speed 10G and 100G interfaces as defacto physical connectivity, the most basic QoS needed is policing or shaping of ingress traffic at the edge to the contracted rate of the participant.An IX can also offer differentiated services for connectivity between participants requiring traffic marking at the edge and specific treatment across the core of the network.Broadcast, Unknown Unicast, and Multicast TrafficA L2 fabric, whether traditional L2 switching or emulated via a technology like EVPN must provide controls to limit the effects of BUM traffic having the potential to flood networks with unwanted or duplicate traffic. At the PE-CE boundary “storm” controls must be supported to limit these traffic types to sensible packet rates.SecurityPeer Connection IsolationOne of the key tenets of a multi-tenant network fabric is to provide secure traffic isolation between parties using the fabric. This can be done using “soft” or “hard” mechanisms. Soft isolation uses packet structure in order to isolate traffic between tenants, such as MPLS headers or specific service tags. As you move down the stack, the isolation becomes “harder”, first using VLAN tags, channels in the case of Flex Ethernet, or separate physical medium to completely isolate tenants. Harder isolation is typically less efficient and more difficult to operate. In modern networks, isolation using VPN services is regarded as sufficient for an IX fabric and offers the greatest flexibility and scale. 
The isolation must be performed on the IX fabric itself and protect against users spoofing MPLS headers or VLANs tags at the attachment point in the network.L2 SecurityThe table below lists the more common L2 security features required by an IX network. Some of these should perform the action of disabling either permanently or temporarily a connected port. Feature Description L2 ACL Ability to create filters based on L2 frame criteria (SRC/DST MAC, Ethertype, control BPDUs, etc) ARP/ND/RA policing Police ARP/ND/RA requests MAC scale limits Limit MAC scale for specific Bridge Domain Static ARP Override dynamic ARP with static ARP entries L3 Security Feature Description L3 ACL Ability filter on L3 criteria, useful for filtering attacks towards fabric subnets and participants ARP/ND/RA policing Police ARP/ND/RA requests MAC scale limits Limit MAC scale for specific Bridge Domain Static ARP Override dynamic ARP with static ARP entries Additional IX ComponentsFabric and Peer Service AutomationIdeally the management of the underlying network fabric, participant interfaces, and participant services are automated. Using a tool like Cisco NSO as a single source of truth for the network eliminates configuration errors, eases deployment, and eases the removal of configuration when it is no longer needed. NSO also allows easy abstraction and deployment of point to point and multi-point services through the use of defined service models and templates. Deployed services should be well-defined to reduce support complexity.Route ServersA redundant set of route servers is used in many IX deployments to eliminate each peer having to configure a BGP session to every other peer. The route server is similar to a BGP route reflector with the main difference being a route server operates with EBGP peers and not IBGP peers. The route server also acts as a point of route security since the filters governing advertisements between participants is typically performed on the route server. Route server definition can be found in RFC 7947 and route server operations in RFC 7948.Route Looking GlassLooking glasses allow an outside user or internal participant to view the current real-time routing for a specific prefix or set of prefixes. This is invaluable for troubleshooting routing issues.Analytics and MonitoringFabric TelemetryHaving accurate statistics on peer and fabric state is important for evaluating the current health of the fabric as well assist in capacity planning. Monitoring traffic utilization, device health, and protocol state using modern telemetry such as streaming telemetry can help rectify faults faster and improve reliability. See the automation section of the design for a list of common Cisco and OpenConfig models used with streaming telemetry.Route Update HistoryRoute update history is one area of IX operation that can assist not only with IX growth but also Internet health as a whole. Being able to trace the history of route updates coming through an IX helps both providers and enterprises determine root cause for traffic issues, identify the origin of Internet security events, and assist those researching Internet routing change over time. Route update history can be communicated by either BGP or using BGP Monitoring Protocol (BMP).Modern IXP Fabric Network DesignTopology ConsiderationsScale Out DesignWe can learn from modern datacenter design in how we build a modern IX fabric, at least the network located within a single facility or group of locations in close proximity. 
The use of smaller functional building blocks increases operational efficiency and resiliency within the fabric. Connecting devices in a Clos (leaf/spine or fat-tree are other names) fabric seen in Figure XX versus a large modular chassis approach has a number of benefits.
Fabric Design Benefits
 Scale the fabric by simply adding devices and interconnects
 Optimal connectivity between fabric endpoints
 Increased resiliency by utilizing ECMP across the fabric
 Ability to easily take nodes in and out of service without affecting many services
In the case of interconnecting remote datacenters, a more fabric-based approach also increases overall scale and resiliency.
Segment Routing Underlay
Segment Routing and Segment Routing Traffic Engineering
Segment Routing is the modern simplified packet transport control-plane for multi-service networks. Segment Routing eliminates separate IGP and label distribution protocols running in parallel while providing built-in resilience and traffic engineering capabilities. Much more information on Segment Routing can be found at http://www.segment-routing.net
SR-MPLS Data Plane
One important point about Segment Routing is that it is data plane agnostic, meaning the architecture is built to support multiple data plane types. The SR “SID” is an identifier expressed by the underlying data plane. The Segment Routing MPLS data plane uses standard MPLS headers to carry traffic end to end, with SR responsible for label distribution and path computation. In this design we will utilize the SR-MPLS data plane, but as other data planes such as Segment Routing IPv6 mature they could plug in as well.
Segment-Routing Flexible Algorithms
An exciting development in SR capabilities is the use of “Flexible Algorithms.” Simply put, Flex-Algo allows one to define multiple SIDs on a single device, with each one representing a specific “algorithm.” Using the algorithm as a constraint in head-end path computation simplifies the path to a single label, since the algorithm itself takes care of pruning links not applicable to the topology. See the figure below for an example of Flex-Algo using the initial topology definition to restrict the path to only encrypted links.
Constraint-based SR Policies
Path computation on the head-end SR node or an external PCE can include more advanced constraints such as latency, link affinity, hop count, or SRLG avoidance. Cisco supports a wide range of path constraints both within XR on the SR head-end node as well as through SR-PCE, Cisco’s external Path Computation Element for Segment Routing.
On-Demand SR Policies
Cisco has simplified the control-plane even more with a feature called ODN (On-Demand Next Hop) for SR-TE. When a head-end node receives an EVPN BGP route carrying a specific extended community, known as the color community, the color instructs the head-end node to create an SR Policy to the BGP next-hop following the defined constraints for that community. The head-end node can compute the SR Policy path itself, or the ODN policy can instruct the head-end to consult a PCE for end-to-end path computation.
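To make the ODN behavior concrete, the sketch below shows one possible shape of the configuration: an on-demand template for color 100 that requests a TE-metric-optimized dynamic path (optionally delegated to an SR-PCE via PCEP), plus a routing policy that attaches the color extended community to service routes. The color value, set names, and metric type are illustrative assumptions rather than values taken from this design.
segment-routing
 traffic-eng
  ! Template instantiated whenever a route with color 100 is received
  on-demand color 100
   dynamic
    pcep
    metric
     type te
!
! Color extended community and policy attaching it to advertised routes
extcommunity-set opaque COLOR-100
 100
end-set
!
route-policy SET-COLOR-100
 set extcommunity color COLOR-100
 pass
end-policy
Attaching SET-COLOR-100 where the service routes are originated or reflected causes receiving head-ends with a matching on-demand color template to instantiate the SR Policy automatically.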
Segment Routing Benefits in IXP Use
 Reduction of control-plane protocols across the fabric by eliminating additional label distribution protocols
 Advanced traffic engineering capabilities, all while reducing overall network complexity
 Built-in local protection through the use of TI-LFA, computing the post-convergence path for both link and node protection
 Advanced OAM capabilities using real-time performance measurement and automated data-plane path monitoring
 Ability to tie services to defined underlay paths, unlike pure overlays like VXLAN
 Quickly add to the topology by simply turning up new IGP links
L2 IX Services using Segment Routing and EVPN
EVPN Background
EVPN is the next-generation service type for creating L2 VPN overlay services across a Segment Routing underlay network. EVPN replaces services like VPLS, which emulate a physical Ethernet switch, with a scalable BGP-based control-plane. MAC addresses are no longer learned across the network as part of forwarded traffic; they are learned at the edges and distributed as BGP VPN routes. Below is a list of just a few of the advantages of EVPN over legacy service types such as VPLS or LDP-signaled P2P pseudowires.
 RFC 7432 is the initial RFC defining EVPN service types and operation, covering both MP and P2P L2 services
 VPWS point to point Ethernet VPN
 ELAN multi-point Ethernet VPN
 EVPN brings the paradigms of BGP L3VPN to Ethernet VPNs
 MAC and ARP/ND (MAC+IP) information is advertised using BGP NLRI
 EVPN signaling identifies the same ESI (Ethernet Segment) connected to multiple PE nodes, allowing active/active or active/standby multi-homing
 EVPN has been extended to support IRB (Integrated Routing and Bridging) for inter-subnet routing
EVPN Benefits for IX Use
 BGP-based control plane has obvious scaling and distribution benefits
 Eliminates mesh of pseudowires between L2 endpoints
 All-active per-flow load-balancing across redundant active links between IX and peers
 Reduced flooding scope for ARP traffic
 BUM labels act as a way to control flooding without complex split-horizon configuration
 Fast MAC withdrawal improves convergence vs. data plane learning
 Filter ARP/MAC advertisements via common BGP route policy
 Distributed MAC pinning: once a MAC is learned on a CE interface it is advertised with the EVPN BD as “sticky”; remote PEs will drop traffic sourced from a MAC labeled as sticky by another PE; works in redundancy scenarios
 Works seamlessly with existing L2 Ethernet fabrics without having to run L2 protocols such as STP within EVPN itself
 Provides L2 multi-homing replacement for MC-LAG and L3 multi-homing replacement for VRRP/HSRP
Modern IXP Deployment
Background
In the following section we will explore the deployment using IOS-XR devices and CLI. We will start with the most basic deployment and add additional components to enable features such as multi-plane design and L3 services.
Single-plane Segment Routing Underlay Deployment
In the simplest deployment example, Segment Routing is deployed by configuring either OSPF or IS-IS with SR-MPLS extensions enabled. The configuration example below utilizes IS-IS as the SR underlay IGP protocol.
The underlay is deployed as a single IS-IS L2 domain using Segment Routing MPLS.Topology Diagram for Single-plane FabricSRGB and SRLB DefinitionIt’s recommended to first configure the Segment Routing Global Block (SRGB) across all nodes needing connectivity between each other. In most instances a single SRGB will be used across the entire network. In a SR MPLS deployment the SRGB and SRLB correspond to the label blocks allocated to SR. IOS-XR has a maximum configurable SRGB limit of 512,000 labels, however please consult platform-specific documentation for maximum values. The SRLB corresponds to the labels allocated for SIDs local to the node, such as Adjacency-SIDs. It is recommended to configure the same SRLB block across all nodes. The SRLB must not overlap with the SRGB. The SRGB and SRLB are configured in IOS-XR with the following configuration#segment-routing segment-routing global-block 16000 16999 local-block 17000 17999 Base IGP / Segment Routing ConfigurationThe following configuration example shows an example IS-IS deployment with SR-MPLS extensions enabled for the IPv4 address family. The SR-enabling configuration lines are bolded, showing how Segment Routing and TI-LFA (FRR) can be deployed with very little configuration. SR must be deployed on all interconnected nodes to provide end to end reachability.router isis example set-overload-bit on-startup wait-for-bgp is-type level-2-only net 49.0002.1921.6801.4003.00 distribute link-state log adjacency changes log pdu drops lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password hmac-md5 encrypted 03276828295E731F70 address-family ipv4 unicast maximum-paths 16 metric-style wide mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0 maximum-paths 32 segment-routing mpls ! interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid absolute 16041 ! ! ! interface GigabitEthernet0/0/0/1 circuit-type level-2-only point-to-point address-family ipv4 unicast fast-reroute per-prefix ti-lfa metric 10 The two key elements to enable Segment Routing are segment-routing mpls under the ipv4 unicast address family and the node prefix-sid absolute 16041 definition under the Loopback0 interface. The prefix-sid can be defined as either an indexed value or absolute. The index value is added to the SRGB start (16000 in our case) to derive the SID value. Using absolute SIDs is recommended where possible, but in a multi-vendor network where one vendor may not be able to use the same SRGB as the other, using an indexed value is necessary.Enabling TI-LFATopology-Independent Loop-Free Alternates is not enabled by default. The above configuration enables TI-LFA on the Gigabit0/0/0/1 interface for IPv4 prefixes. TI-LFA can be enabled for all interfaces by using this command under the address-family ipv4 unicast in the IS-IS instance configuration. It is recommended to enable it at the interface level to control other TI-LFA attributes such as node protection and SRLG support.interface GigabitEthernet0/0/0/1 circuit-type level-1 point-to-point address-family ipv4 unicast fast-reroute per-prefix ti-lfa metric 10This is all that is needed to enable Segment Routing, and you can already see the simplicity in its deployment vs additional label distribution protocols like LDP and RSVP-TEDual-Plane Fabric using SR Flexible AlgorithmsThe dual plane design extends the base configuration by defining a topology based on SR flexible algorithms. 
Dual-Plane Fabric using SR Flexible AlgorithmsThe dual plane design extends the base configuration by defining a topology based on SR flexible algorithms. Defining two independent topologies allows us to easily support disjoint services across the IX fabric. IX operators can offer diverse services without the fear of convergence onto common links. This can be done with a minimal amount of configuration. Flex-algo also supports LFA and will ensure LFA paths are constrained to a specific topology.Flex-Algo BackgroundSR Flex-Algo is a simple extension to SR and its supporting IGPs that advertises membership in a logical network topology by attaching a specific “algorithm” to a node prefix-SID. All nodes with the same algorithm defined participate in the topology and use the algorithm definition to determine the behavior of a path. In the example below, when a head-end computes a path to node 9’s node-SID assigned to algorithm 129, it will only use nodes participating in the minimum-delay topology. This means a constraint can be met using a single node SID in the SID list instead of multiple explicit SIDs. Flex-algo is defined in IETF draft draft-ietf-lsr-flex-algo and more details on Flex-Algo can be found at http#//www.segment-routing.net Algo 0 = IGP Metric Algo 128 = Green = Minimize TE Metric Algo 129 = Red = Minimize DelayDiagramDual-plane Flex-Algo ConfigurationWe will not re-introduce all of the configuration, only the subset necessary to define both planes. To enable flexible algorithms you must first define the algorithms globally in IS-IS. The second step is to define a node prefix-sid on a Loopback interface and attach an algorithm to the SID. By default all nodes participate in algorithm 0, which simply computes a path based on minimal IGP metric.The advertise-definition option advertises the definition through the IGP domain. Using this command, the definition only needs to be configured on a subset of nodes and advertised to the rest of the domain, rather than configured on every node. It’s recommended to define the flex-algo identifiers on all participating nodes and advertise them. IS-IS Configurationrouter isis 1 flex-algo 128 advertise-definition !flex-algo 129 advertise-definition !interface Loopback0 address-family ipv4 unicast prefix-sid algorithm 128 absolute 16141 prefix-sid algorithm 129 absolute 16241 Simple L2VPN Services using EVPNWe will first look at EVPN configuration for deploying basic point-to-point and multi-point L2 services without specific traffic engineering constraints or path diversity requirements. These services will simply follow the shortest path across the network.BGP AFI/SAFI ConfigurationEVPN uses additional BGP address families in order to carry EVPN information across the network. EVPN uses the BGP L2VPN AFI of 25 and a SAFI of 70. In order to carry EVPN information between two peers, this AFI/SAFI must be enabled on all peers. The following shows the minimum BGP configuration to enable this at a global and peer level. router bgp 100 bgp router-id 100.0.0.1 address-family l2vpn evpn !!neighbor-group EVPN remote-as 100 update-source Loopback0 address-family l2vpn evpn !!neighbor 100.0.0.2 use neighbor-group EVPN At this point the two neighbors will become established over the EVPN AFI/SAFI. The command to view the relationship in IOS-XR is show bgp l2vpn evpn summary.
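In a fabric with more than a handful of PEs, the EVPN AFI/SAFI is usually distributed through BGP route reflectors rather than a full iBGP mesh. The snippet below is a minimal sketch of the reflector side of such a design; the router-id 100.0.0.10 and the neighbor-group name are hypothetical and not part of the example topology. Each PE then points its EVPN neighbor-group at the reflector instead of at the other PEs#
router bgp 100
 bgp router-id 100.0.0.10
 address-family l2vpn evpn
 !
 neighbor-group EVPN-CLIENTS
  remote-as 100
  update-source Loopback0
  address-family l2vpn evpn
   route-reflector-client
  !
 !
 neighbor 100.0.0.1
  use neighbor-group EVPN-CLIENTS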
EVPN Service Configuration ElementsEVI - Ethernet Virtual InstanceAll EVPN services require the configuration of the Ethernet Virtual Instance identifier, used to advertise the existence of an EVPN service endpoint to routers participating in the EVPN service network. The EVI is locally significant to a PE, but it’s recommended the same EVI be configured on all routers participating in a particular EVPN service.RD - Route DistinguisherSimilar to L3VPN and BGP-AD L2VPN, a Route Distinguisher is used to differentiate EVPN routes belonging to different EVIs. The RD is auto-generated based on the Loopback0 IP address as specified in the EVPN RFC.RT - Route TargetAlso similar to L3VPN and BGP-AD L2VPN, a Route Target extended community is defined so EVPN routes are imported into the correct EVI across the network. The RT is auto-generated based on the EVI ID, but can be manually configured. It is recommended to utilize the auto-generated RT value.ESI - Ethernet Segment IdentifierThe ESI is used to identify a particular Ethernet “segment” for the purpose of multi-homing. In single-homed scenarios, the ESI is set to 0 by default. In multi-homing scenarios such as all-active attachment, the same ESI is configured on multiple routers. If it is known an attachment will never be multi-homed, using the default ESI of 0 is recommended, but if there is a chance it may be multi-homed in the future, using a unique ESI is recommended. The ESI is a 10-octet value with 1 octet used for the type and 9 octets for the value. Values of 0 and the maximum value are reserved.Attachment Circuit IDThis value is used only with EVPN VPWS point-to-point services. It defines the local attachment circuit ID and the remote attachment circuit ID used to signal the endpoints and to direct traffic to the correct attachment point. The local ID does not have to be the same on both ends, but it is recommended.Topology Diagram for Example ServicesThe following is a topology diagram to follow along with the service endpoints in the below service configuration examples. Each CE node represents a peering fabric participant.P2P Peer Interconnect using EVPN-VPWSThe following highlights a simple P2P transparent L2 interconnect using EVPN-VPWS. It is assumed the EVPN BGP address family has been configured.Single-homed EVPN-VPWS serviceThe simplest P2P interconnect is single-homed on both ends. The single-homed service can use an entire physical interface or a VLAN tag to identify a specific service. This service originates on PE1 and terminates on PE3. The service data plane path utilizes ECMP across the core network, one of the benefits of using an SR underlay. As you can see in the config below, there is no static neighbor configuration; P2P VPWS connections are dynamically set up by matching the EVI, target, and source identifiers. The target identifier on one node must match the source identifier on the other node participating in the service.DiagramPE1interface TenGigabitEthernet0/0/0/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface TenGigabitEthernet0/0/0/1.100 neighbor evpn evi 10 target 100 source 101 PE3interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface TenGigabitEthernet0/0/1/1.100 neighbor evpn evi 10 target 101 source 100 
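Once both endpoints are configured, the state of the VPWS can be confirmed from either PE. The following is a minimal verification sketch using commands from the troubleshooting appendix and the group name from the example above#
show l2vpn xconnect group evpn-vpws-example
show l2vpn xconnect detail
show bgp l2vpn evpn
The xconnect should report an UP state against the EVPN neighbor, and the BGP EVPN table should contain the per-EVI Ethernet auto-discovery (route type 1) routes that signal the two attachment circuit identifiers.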
Multi-homed Single-active/All-active EVPN-VPWS serviceA multi-homed service uses two attachment circuits from the CE to unique PE devices on the provider side. LACP is used between the PEs and the CE device in both single-active and all-active multi-homing. This requires configuring a static LACP system MAC and ESI on both PE routers. Multi-chassis LACP protocols such as ICCP are not required; all multi-homed signaling is done with the EVPN control plane.In this example CE1 is configured in a multi-homed all-active configuration to PE1 and PE2, while CE2 continues to be configured as single-homed. In this configuration traffic will be hashed using header information across all active links in the bundle across all PEs. PE3 will receive two routes for the VPWS service and utilize both to balance traffic towards PE1 and PE2. Another option is to use single-active load-balancing mode, which will only forward traffic towards the ethernet-segment from the DF (designated forwarder). Single-active is commonly used to enforce customer bandwidth rates, while still providing redundancy. In the case where there are multiple EVPN services on the same bundle interface, they will be balanced across the interfaces using the DF election algorithm.DiagramNote the LACP system MAC and ethernet-segment (ESI) on both PE nodes must be configured with the same values.PE1lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!!evpn group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional core-isolation-group 1 !! l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface Bundle-Ether1.100 neighbor evpn evi 10 target 100 source 100 PE2lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!!evpn group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional core-isolation-group 1 !l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface Bundle-Ether1.100 neighbor evpn evi 10 target 100 source 100PE3interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface TenGigabitEthernet0/0/1/1.100 neighbor evpn evi 10 target 100 source 100 
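With the shared ESI configured, it is worth confirming that both PE1 and PE2 see the same Ethernet Segment and agree on the designated forwarder election before the peer goes live. A minimal verification sketch, reusing the bundle and service names from the example above#
show bundle Bundle-Ether1
show evpn ethernet-segment detail
show l2vpn xconnect group evpn-vpws-example
The bundle output confirms LACP state towards CE1, while the ethernet-segment output should list both PEs for the segment and indicate the elected DF.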
EVPN ELAN ServicesAn EVPN ELAN service is analogous to the function of VPLS, but modernized to eliminate the deficiencies of VPLS highlighted in earlier sections. ELAN is a multipoint service interconnecting all participating hosts connected to an ESI participating in the same EVI.EVPN ELAN with Single-homed EndpointsIn this configuration example the CE devices are connected to each PE using a single attachment interface. The EVI is set to a value of 100. It is considered a best practice to manually configure the ESI value on each participating interface, although it is not required in the case of a single-homed service. The ESI must be unique for each Ethernet Segment attached to the EVPN EVI.The core-isolation-group configuration is used to shut down CE access interfaces when a tracked core upstream interface goes down. This way a CE will not send traffic into a PE node isolated from the rest of the network.In the bridge configuration, L2 security for storm control is enabled for unknown-unicast and multicast traffic. Additionally, the MAC aging time is increased to 3600 seconds to decrease ARP traffic, and the MAC limit is set to 1 since all peers should be connected to the IX fabric with a routed L3 interface. Under the physical interface configuration an input QoS policy is configured to remark all inbound traffic with a DSCP of 0, and an L2 access list is configured to only allow 802.1Q TPID traffic with a VLAN tag of 100 from a specific MAC address.PE1ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! end-policy-map!interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface TenGigabitEthernet0/0/1/1.100 mac limit 1 maximum 1 mac aging time 3600 storm-control unknown-unicast pps 100 storm-control multicast pps 100 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface TenGigabitEthernet0/0/1/1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 ! core-isolation-group 1 ! ! PE2ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! end-policy-map!interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface TenGigabitEthernet0/0/1/1.100 mac limit 1 maximum 1 mac aging time 3600 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface TenGigabitEthernet0/0/1/1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.12 ! core-isolation-group 1 ! ! EVPN ELAN with Dual-homed EndpointIn this configuration example the CE1 device is connected to both PE1 and PE2. The EVI is set to a value of 100. The ESI value of 11.11.11.11.11.11.11.11.11 is configured on both PE devices connected to CE1.PE1ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! end-policy-map!lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface Bundle-Ether1.100 mac limit 1 maximum 1 mac aging time 3600 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional command to only forward through DF ! core-isolation-group 1 ! ! PE2ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! 
end-policy-map!lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface Bundle-Ether1.100 mac limit 1 maximum 1 mac aging time 3600 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional command to only forward through DF ! core-isolation-group 1 ! ! AppendixSegment Routing and EVPN Troubleshooting Commands Show command Function isis segment-routing label table Display learned node SIDs mpls forwarding detail Show general MPLS forwarding table mpls forwarding prefix [prefix] detail Show detail forwarding information for exact prefix cef Show FIB hardware forwarding information mpls forwarding labels [label] detail Display forwarding info and stats for EVPN label bgp l2vpn evpn Display EVPN NLRI bgp l2vpn evpn rd [rd] Display EVPN NLRI belonging to specific RD bgp l2vpn evpn route-type [type] Display EVPN routes of a specific route type evpn internal-label Display labels allocated to EVPN instances evpn ethernet-segment esi [esi] carving detail Display EVPN Ethernet Segment and DF election details evpn evi [vpn-id] mac Show MAC address tables and MPLS label info for all EVIs evpn evi vpn-id [vpn] detail Show detail info for a specific local EVI l2vpn forwarding location [location] L2 forwarding database l2vpn forwarding bridge-domain [bridge-group#bridge-domain] mac-address detail location [location] L2 forwarding info for local bridge domain l2vpn forwarding evpn [bridge-group#bridge-domain] mac-address detail location [location] L2 forwarding info for local bridge domain l2vpn forwarding bridge-domain evpn ipv4-mac detail location [location] Show EVPN IPv4 MAC info l2vpn forwarding bridge-domain evpn ipv6-mac detail location [location] Show EVPN IPv6 MAC info l2vpn xconnect detail Display EVPN VPWS info and state Periodic Model Driven TelemetryDevice Health Function Sensor Path Uptime Cisco-IOS-XR-shellutil-oper#system-time/uptime CPU Cisco-IOS-XR-wdsysmon-fd-oper#system-monitoring/cpu-utilization Memory Cisco-IOS-XR-nto-misc-oper#memory-summary/nodes/node/summary ASR9K Power Cisco-IOS-XR-asr9k-sc-envmon-oper#environmental-monitoring/racks/rack/slots/slot/modules/module/power/power-bag NCS 5500 Environmentals Cisco-IOS-XR-sysadmin-fretta-envmon-ui#environment/oper NCS 5500 FIB Resources Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper#dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Infrastructure Monitoring Function Sensor Path Interface Summary Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-summary Interface Counters Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters Interface Data/PPS Rates (show int) Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/cache/data-rate IS-IS Stats Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/statistics-global Optics Information Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info Aggregate Bundle Stats Cisco-IOS-XR-bundlemgr-oper#bundles LLDP Neighbor Information 
Cisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighbors QoS Input Stats Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input QoS Output Stats Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/output QoS VOQ Information Cisco-IOS-XR-qos-ma-oper#qos/qos-global/vo-q/vo-q-statistics/vo-qinterfaces/vo-qinterface LPTS (Control Plane) Flow Information Cisco-IOS-XR-lpts-pre-ifib-oper#lpts-pifib/nodes/node/dynamic-flows-stats/flow IPv4 ACL Resources Cisco-IOS-XR-ipv4-acl-oper#ipv4-acl-and-prefix-list/oor/access-list-summary/details IPv6 ACL Resources Cisco-IOS-XR-ipv6-acl-oper#ipv4-acl-and-prefix-list/oor/access-list-summary/details Routing Protocols Function Sensor Path IS-IS Protocol Stats Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/statistics-global IS-IS Interfaces and Stats Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfaces IS-IS Adjacencies Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/level/adjacencies/adjacency IS-IS Route Info Cisco-IOS-XR-ip-rib-ipv4-oper#rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/protocol/isis/as/information BFD Statistics Cisco-IOS-XR-ip-bfd-oper#bfd/summary BFD Session Details Cisco-IOS-XR-ip-bfd-oper#bfd/session-details IPv4 BGP GRT Process Info Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info IPv6 BGP GRT Process Info Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info IPv4 BGP GRT Neighbor Stats Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors IPv6 BGP GRT Neighbor Stats Cisco-IOS-XR-ipv6-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors BGP Route Target Entries Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/rt-entries/rt-entry RPKI Summary Stats Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/rpki-summary BGP Flowspec Stats Cisco-IOS-XR-flowspec-oper#flow-spec/vrfs/vrf/afs/af/flows MPLS Label Allocation Cisco-IOS-XR-mpls-lsd-oper#mpls-lsd/label-summary SR Node Prefix-SIDs Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/topologies/topology/ipv4-routes/ipv4-route/native-status/native-details/primary/source/nodal-sid Service Monitoring Function Sensor Path L2VPN FIB Summary Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary L2VPN Bridge Domain Info Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/bridge-domains/bridge-domain L2VPN BD MAC Details Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fibmac-details L2VPN BD Stats Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-bridge-domains EVPN IPv4 Learned IP/MAC Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip4macs EVPN IPv6 Learned IP/MAC Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip6macs L2VPN Xconnect Info Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnects Event Driven TelemetryThese telemetry paths can be configured as EDT, only sending data when an event is triggered, for example an interface state change.One configures a supported sensor-path as Event Driven by setting the sample-interval in the subscription to 0 Function Sensor Path Interface Admin State Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interfaces/interface/state Interface Oper State Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interfaces/interface/line-state IPv4 Route Attributes Cisco-IOS-XR-ip-rib-ipv4-oper#rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes IPv4 
Route Attributes Cisco-IOS-XR-ip-rib-ipv6-oper#rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes Optics Admin Sfxtate Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info/transport-admin-state Optics State Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info/controller-state In our next blog we will explore advanced Segment Routing TE using ODN/Flex-Algo and Layer 3 services using L3VPN and EVPN IRB", "url": "/blogs/2019-02-02-modernizing-ixp-design/", "author": "", "tags": "" } , "blogs-2019-03-29-back-to-the-future-with-fabrics": { "title": "The Future of Network Fabrics", "content": "Turning Fabrics Inside OutUntil recently, the notion of a “fabric” in a network was confined to single device# the switching fabric inside an NPU or ASIC, the fabric module of a modular router, or the fabric chassis of a multi-chassis system. All these internal fabrics fulfilled the same basic function# non-blocking connectivity between the input and output ports of the system.Largely driven by the needs of the massively scalable data center, a new network design has quite literally turned fabrics inside out. Instead of building fabrics inside silicon, data center fabrics provide (statistically) non-blocking connectivity by connecting simple devices in a densely connected spine and leaf topology. These designs can be relatively small (e.g. replacing a large modular device with a fabric of small devices) or as massive as the data centers they enable (e.g. Facebook’s Data Center Fabric).As every network architect knows, there is no one “best” design for a network. Every design represents tradeoffs. Large scale fabrics have been very successful in the data center. The question is# are fabrics sensible in other parts of the network as well? Let’s look at that from a couple of angles# scale, availability, traffic patterns, and cost.Scale# Up and OutWhen designing networks, scale is almost always top of mind. There are two basic paradigms for scale# scale out and scale up. Networking has traditionally relied primarily on a scale up paradigm# when you need to scale, buy a bigger box or more line cards or denser line cards. Scale out, used almost exclusively in cloud-scale computing, takes the opposite approach. Instead of buying bigger boxes, buy lots of smaller ones.Scale out can scale very large. The Cisco NCS 5516, a large 100GE routing platform, today provides 576 100GE ports. An external fabric built entirely of 48-port NCS 5502s could support up to twice that many ports. If you put NCS 5508s in the spine, you could get up to 6912 user-facing ports. That’s truly massive scale.Scale Impacts Availability Impacts Upgradability Impacts The Bottom LineHow you scale has a direct impact on how you achieve high availability in a given design. When you scale up, those ever-larger and denser devices create an ever-larger “blast radius.” If you could lose a big chunk of your network capacity when a single device goes down, you’ll want to harden that device with lots of redundant hardware and complex software to keep it all running. Unfortunately, complex, tightly coupled systems like this can be difficult to upgrade and troubleshoot. If you can’t upgrade quickly, you might delay critical bug fixes as well as new features that could enable profitable new services.When you scale out, the blast radius of an individual device is smaller. 
Without the complexities of large, redundant systems, networks made out of many small, simple devices can achieve very high availability. The smaller the blast radius of a device, the easier it is to take it out of service for upgrade. Since upgradability has a direct impact on both quality and service agility, a scale out network should be able to deliver higher quality services, faster.Traditionally, networking has had a bias for scale up, but in truth scale out and scale up exist on a continuum. Even in traditional designs, two routers are usually deployed in a given role for redundancy. Two is the smallest amount of “scale out” possible, but it’s still “scale out.” Of course, the ultimate in “scale out” is the full spine and leaf fabric design of data center fame, but there are designs that balance “scale out” and “scale up” to achieve the right fit for purpose.Traffic Patterns# Is An East-West Wind Blowing Your Way?Massive data center fabrics arose out of the need to provide equidistant, non-blocking bandwidth for traffic between storage and compute components distributed across the data center. This is what is called an “east-west” traffic pattern, in contrast to the “north-south” pattern that characterizes classic Campus and Service Provider designs. When north-south traffic predominates, one or two large chassis can provide the requisite aggregation function very efficiently.Having a clear understanding of the traffic pattern you need to support is crucial in understanding if fabrics are right for you. There are certainly places in the network where traffic patterns are changing very rapidly today. For example, the rise of local peering and caching in Service Provider networks means that traffic that might once have been aggregated and sent across the backbone can now be served locally instead. That introduces a strong east-west component into what was once an almost entirely north-south pattern. Introducing a fabric with a spine layer to provide connectivity between PE nodes and Peering or Caching nodes starts to make sense in that scenario.Having a spine allows you to attach diverse leaf devices, some “heavy” (richly featured, more expensive) and some “light” (basic features, less expensive). You can also scale each leaf type independently. If your peering traffic is growing faster than core-bound traffic, just add more peering leaves. When you want to introduce new features, simply plug a new-feature-capable-leaf into the spine and away you go. If it doesn’t work the way you want, unplug it. There’s no impact on other services, no downtime for upgrades.Good Fabrics Are Not FreeSome people assume that fabrics will be less expensive than modular systems because fabrics are built from smaller, simpler, cheaper devices. This overlooks a couple of important points. First, it takes a lot of spines and leaves to achieve a statistically non-blocking architecture. For example, to build a non-blocking 96-port fabric out of 48-port devices, you need…wait for it…six 48-port devices (2 spines and 4 leaves). Now we’ve gotten very good at building cost-effective NPUs for routers, but you still need to connect all those spines and leaves. The optics required for all that connectivity quickly comes to dominate the cost of an external fabric. Using Active Optical Cables (AOCs) can help mitigate the optics cost but it remains non-negligible.Speaking of connectivity, remember that connecting spines and leaves will take a lot of cables. 
For our example of 96 user-facing ports, you’ll need 96 cables between the spines and leaves for fabric connectivity. That effectively doubles the number of cables your ops team has to manage.In terms of space, power and cooling, fabrics exact a cost as well. We build large, modular systems for a reason# they are very efficient. In apples-to-apples comparisons (e.g. same ASIC family), a fabric of small devices always consumes more space and power for the equivalent number of ports in a large chassis.The larger number of devices and interfaces in a fabric will naturally have an impact on your control plane. Take our simple example of the 96-port fabric. Can your IGP scale 6X for those additional devices? Depending on the size of your network, that may be entirely reasonable, since IGPs today can easily handle thousands of nodes. For very large networks, however, this could be a significant concern.Given the ratio of capex to opex (typically 4#1 for Service Providers), it’s also important to take a good hard look at the impact fabrics can have on the ops team more generally. In our 96-port example, ops has to manage six devices (six management address, six ACL and QoS domains, six IGP instances, etc.) where before they might have had only one. We know from massive data center designs that the only way to scale ops like that is to go all-in on automation. From cable plans to config generation and deployment to upgrade to troubleshooting, every aspect of network operations will have to be automated.Conclusion# To Fabric or Not To Fabric?Thanks to the work done in data centers large and small, we know the costs and benefits of network fabrics. For sheer scale, availability, upgradability, and east-west traffic patterns, it’s hard to beat a fabric of small, simple devices. But the large number of devices in external fabrics makes large-scale automation an absolute requirement. Until fabrics go mainsteam, much of that automation will remain bespoke. And don’t expect to optimize cost by deploying a fabric. Port for port, optics, cooling and power all favor a modular system over an equivalent fabric by a significant margin.So are fabrics fated to stay inside NPUs, chassis and data centers forever? Not necessarily. At some point, the physics of NPUs will reach a point that we can no longer build highly efficient large modular routers. That point might be 10 years in the future, but it is coming. In that sense, fabrics are inevitable. And as William Gibson famously said, “the future is already here, it’s just not evenly distributed.” Even today, some service providers have weighed the cost and benefits of fabric architectures and decided that the availability, scale, upgradability and flexibility of fabrics outweigh the upfront capex and opex investments for the traffic patterns they need to support. Fabric architectures are not evenly distributed, but they are already emerging as a valid design pattern in modern networks.Finally, many network architects are coming to see that you can realize some of the benefits of fabrics without going to a full-scale spine and leaf design. For example, adding a little more scale out (say 4 or 8 routers in an LSR or LER role instead of the traditional 2) can enable a smaller blast radius and easier upgrades without the massive device scale of a full spine and leaf fabric. 
This might stretch (sorry) the data center definition of fabrics, but it’s a step closer to the availability, upgradability, quality and service agility that every network needs.", "url": "/blogs/2019-03-29-back-to-the-future-with-fabrics/", "author": "Shelly Cadora", "tags": "" } , "#": {} , "#": {} , "#": {} , "blogs-2019-04-01-converged-sdn-transport-implementation-guide": { "title": "Converged SDN Transport Implementation Guide", "content": " On This Page Targets Testbed Overview Devices Role-Based Configuration Transport IOS-XR – All IOS-XR nodes IGP Protocol (ISIS) and Segment Routing MPLS configuration MPLS Segment Routing Traffic Engineering (SRTE) configuration Transport IOS-XE – All IOS-XE nodes Segment Routing MPLS configuration IGP-ISIS configuration MPLS Segment Routing Traffic Engineering (SRTE) Area Border Routers (ABRs) IGP-ISIS Redistribution configuration BGP – Access or Provider Edge Routers IOS-XR configuration IOS-XE configuration Area Border Routers (ABRs) IGP Topology Distribution Transport Route Reflector (tRR) Services Route Reflector (sRR) Segment Routing Path Computation Element (SR-PCE) Segment Routing Traffic Engineering (SRTE) and Services Integration On Demand Next-Hop (ODN) configuration – IOS-XR On Demand Next-Hop (ODN) configuration – IOS-XE Preferred Path configuration – IOS-XR Preferred Path configuration – IOS-XE Services End-To-End Services L3VPN MP-BGP VPNv4 On-Demand Next-Hop Access Router Service Provisioning (IOS-XE)# L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Access Router Service Provisioning (IOS-XR)# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# End-To-End Services Data Plane Hierarchical Services L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Router Service Provisioning (IOS-XR)# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# Targets Hardware# ASR9000 as Provider Edge (PE) node NCS5500 as Aggregation and P-Aggregation Node ASR920 and NCS5500 (standing for the NCS540) as Access Router Software# IOS-XR 6.5.3 on ASR9000 and NCS5500 IOS-XE 16.8.1 on ASR920 Key technologies Transport# End-To-End Segment-Routing Network Programmability# SRTE Inter-Domain LSPs with On-DemandNext Hop Network Availability# TI-LFA/Anycast-SID Services# BGP-based L2 and L3 Virtual Private Network services(EVPN and L3VPN) Testbed OverviewFigure 1# Compass Converged SDN Transport High Level TopologyFigure 2# Testbed Physical TopologyFigure 3# Testbed Route-Reflector and SR-PCE physical connectivityFigure 4# Testbed IGP DomainsDevicesAccess Routers Cisco NCS5501-SE (IOS-XR) – A-PE1, A-PE2, A-PE3, A-PE7 Cisco ASR920 (IOS-XE) – A-PE4, A-PE5, A-PE6, A-PE9 Area Border Routers (ABRs) and Provider Edge Routers# Cisco ASR9000 (IOS-XR) – PE1, PE2, PE3, PE4Route Reflectors (RRs)# Cisco IOS XRv 9000 – tRR1-A, tRR1-B, sRR1-A, sRR1-B, sRR2-A, sRR2-B,sRR3-A, sRR3-BSegment Routing Path Computation Element (SR-PCE)# Cisco IOS XRv 9000 – SR-PCE1-A, SR-PCE1-B, SR-PCE2-A, 
SR-PCE2-B, SR-PCE3-A, SR-PCE3-BRole-Based ConfigurationTransport IOS-XR – All IOS-XR nodesIGP Protocol (ISIS) and Segment Routing MPLS configurationRouter isis configurationkey chain ISIS-KEY key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 00071A150754 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5All Routers, except Provider Edge (PE) Routers, are part of one IGPdomain (ISIS ACCESS or ISIS-CORE). PEs act as Area Border Routers (ABRs)and run two IGP processes (ISIS-ACCESS and ISIS-CORE). Please note thatLoopback 0 is part of both IGP processes.router isis ISIS-ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0101.0000.0110.00 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password keychain ISIS-KEY lsp-password keychain ISIS-KEY level 1 address-family ipv4 unicast metric-style wide spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 segment-routing mpls spf prefix-priority critical tag 5000 spf prefix-priority high tag 1000 !PEs Loopback 0 is part of both IGP processes together with same“prefix-sid index” value. interface Loopback0 address-family ipv4 unicast prefix-sid index 150 ! !TI-LFA FRR configuration interface TenGigE0/0/0/10 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 ! ! !interface Loopback0 ipv4 address 100.0.1.50 255.255.255.255!MPLS Interface configurationinterface TenGigE0/0/0/10 bfd mode ietf bfd address-family ipv4 timers start 180 bfd address-family ipv4 multiplier 3 bfd address-family ipv4 destination 10.1.2.1 bfd address-family ipv4 fast-detect bfd address-family ipv4 minimum-interval 50 mtu 9216 ipv4 address 10.15.150.1 255.255.255.254 ipv4 unreachables disable bundle minimum-active links 1 load-interval 30 dampening!MPLS Segment Routing Traffic Engineering (SRTE) configurationipv4 unnumbered mpls traffic-eng Loopback0router isis ACCESS address-family ipv4 unicast mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0Transport IOS-XE – All IOS-XE nodesSegment Routing MPLS configurationmpls label range 6001 32767 static 16 6000segment-routing mpls ! set-attributes address-family ipv4 sr-label-preferred exit-address-family ! 
global-block 16000 32000 !Prefix-SID assignment to loopback 0 configuration connected-prefix-sid-map address-family ipv4 100.0.1.51/32 index 151 range 1 exit-address-family !IGP-ISIS configurationkey chain ISIS-KEY key 1 key-string cisco accept-lifetime 00#00#00 Jan 1 2018 infinite send-lifetime 00#00#00 Jan 1 2018 infinite!router isis ACCESS net 49.0001.0102.0000.0254.00 is-type level-2-only authentication mode md5 authentication key-chain ISIS-KEY metric-style wide fast-flood 10 set-overload-bit on-startup 120 max-lsp-lifetime 65535 lsp-refresh-interval 65000 spf-interval 5 50 200 prc-interval 5 50 200 lsp-gen-interval 5 5 200 log-adjacency-changes segment-routing mpls segment-routing prefix-sid-map advertise-localTI-LFA FRR configuration fast-reroute per-prefix level-2 all fast-reroute ti-lfa level-2 microloop avoidance protected redistribute connected!interface Loopback0 ip address 100.0.1.51 255.255.255.255 ip router isis ACCESS isis circuit-type level-2-onlyendMPLS Interface configurationinterface TenGigabitEthernet0/0/12 mtu 9216 ip address 10.117.151.1 255.255.255.254 ip router isis ACCESS mpls ip isis circuit-type level-2-only isis network point-to-point isis metric 100endMPLS Segment Routing Traffic Engineering (SRTE)router isis ACCESS mpls traffic-eng router-id Loopback0 mpls traffic-eng level-2interface TenGigabitEthernet0/0/12 mpls traffic-eng tunnelsArea Border Routers (ABRs) IGP-ISIS Redistribution configurationPEs have to provide IP reachability for RRs, SR-PCEs and NSO between bothISIS-ACCESS and ISIS-CORE IGP domains. This is done by specific IPprefixes redistribution.router staticaddress-family ipv4 unicast 100.0.0.0/24 Null0 100.0.1.0/24 Null0 100.1.0.0/24 Null0 100.1.1.0/24 Null0prefix-set ACCESS-XTC_SvRR-LOOPBACKS 100.0.1.0/24, 100.1.1.0/24end-setprefix-set RR-LOOPBACKS 100.0.0.0/24, 100.1.0.0/24end-setredistribute Core SvRR and TvRR loopback into Access domainroute-policy CORE-TO-ACCESS1 if destination in RR-LOOPBACKS then pass else drop endifend-policyrouter isis ACCESS address-family ipv4 unicast redistribute static route-policy CORE-TO-ACCESS1 redistribute Access SR-PCE and SvRR loopbacks into Core domainroute-policy ACCESS1-TO-CORE if destination in ACCESS-XTC_SvRR-LOOPBACKS then pass else drop endif end-policy router isis CORE address-family ipv4 unicast redistribute static route-policy CORE-TO-ACCESS1 BGP – Access or Provider Edge RoutersIOS-XR configurationrouter bgp 100 nsr bgp router-id 100.0.1.50 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family l2vpn evpn ! neighbor-group SvRR remote-as 100 update-source Loopback0 address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family l2vpn evpn ! ! neighbor 100.0.1.201 use neighbor-group SvRR !IOS-XE configurationrouter bgp 100 bgp router-id 100.0.1.51 bgp log-neighbor-changes no bgp default ipv4-unicast neighbor SvRR peer-group neighbor SvRR remote-as 100 neighbor SvRR update-source Loopback0 neighbor 100.0.1.201 peer-group SvRR ! address-family ipv4 exit-address-family ! address-family vpnv4 neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! 
address-family l2vpn evpn neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family !Area Border Routers (ABRs) IGP Topology DistributionNext network diagram# “BGP-LS Topology Distribution” shows how AreaBorder Routers (ABRs) distribute IGP network topology from ISIS ACCESSand ISIS CORE to Transport Route-Reflectors (tRRs). tRRs then reflecttopology to Segment Routing Path Computation Element (SR-PCEs)Figure 5# BGP-LS Topology Distributionrouter isis ACCESS distribute link-state instance-id 101 net 49.0001.0101.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0router isis CORE distribute link-state instance-id 100 net 49.0001.0100.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0router bgp 100 address-family link-state link-state ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR !Transport Route Reflector (tRR)router static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.10 bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state additional-paths receive additional-paths send ! neighbor-group RRC remote-as 100 update-source Loopback0 address-family link-state link-state route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group RRC ! neighbor 100.0.0.2 use neighbor-group RRC ! neighbor 100.0.0.3 use neighbor-group RRC ! neighbor 100.0.0.4 use neighbor-group RRC ! neighbor 100.0.0.100 use neighbor-group RRC ! neighbor 100.0.1.101 use neighbor-group RRC ! neighbor 100.0.2.102 use neighbor-group RRC ! neighbor 100.1.1.101 use neighbor-group RRC !!Services Route Reflector (sRR)router static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.200 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast additional-paths receive additional-paths send ! address-family vpnv6 unicast additional-paths receive additional-paths send retain route-target all ! address-family l2vpn evpn additional-paths receive additional-paths send ! neighbor-group SvRR-Client remote-as 100 update-source Loopback0 address-family l2vpn evpn route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group SvRR-Client ! neighbor 100.0.0.2 use neighbor-group SvRR-Client ! neighbor 100.0.0.3 use neighbor-group SvRR-Client ! neighbor 100.0.0.4 use neighbor-group SvRR-Client ! neighbor 100.2.0.5 use neighbor-group SvRR-Client description Ixia-P1 ! neighbor 100.2.0.6 use neighbor-group SvRR-Client description Ixia-P2 ! neighbor 100.0.1.201 use neighbor-group SvRR-Client ! neighbor 100.0.2.202 use neighbor-group SvRR-Client !!Segment Routing Path Computation Element (SR-PCE)router static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.100 bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR !!pce address ipv4 100.0.0.100!Segment Routing Traffic Engineering (SRTE) and Services IntegrationThis section shows how to integrate Traffic Engineering (SRTE) withServices. 
Particular usecase refers to next sub-section.On Demand Next-Hop (ODN) configuration – IOS-XRsegment-routing traffic-eng logging policy status ! on-demand color 100 dynamic pce ! metric type igp ! ! ! pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! !extcommunity-set opaque BLUE 100end-setroute-policy ODN_EVPN set extcommunity color BLUEend-policyrouter bgp 100 address-family l2vpn evpn route-policy ODN_EVPN out !!On Demand Next-Hop (ODN) configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-allmpls traffic-eng auto-tunnel p2p config unnumbered-interface Loopback0mpls traffic-eng auto-tunnel p2p tunnel-num min 1000 max 5000!mpls traffic-eng lsp attributes L3VPN-SRTE path-selection metric igp pce!ip community-list 1 permit 9999route-map L3VPN-ODN-TE-INIT permit 10 match community 1 set attribute-set L3VPN-SRTE!route-map L3VPN-SR-ODN-Mark-Comm permit 10 match ip address L3VPN-ODN-Prefixes set community 9999!!endrouter bgp 100 address-family vpnv4 neighbor SvRR send-community both neighbor SvRR route-map L3VPN-ODN-TE-INIT in neighbor SvRR route-map L3VPN-SR-ODN-Mark-Comm outPreferred Path configuration – IOS-XRsegment-routing traffic-eng pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! !Preferred Path configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-allServicesEnd-To-End ServicesFigure 6# End-To-End Services TableL3VPN MP-BGP VPNv4 On-Demand Next-HopFigure 7# L3VPN MP-BGP VPNv4 On-Demand Next-Hop Control PlaneAccess Routers# Cisco ASR920 IOS-XE Operator# New VPNv4 instance via CLI or NSO Access Router# Advertises/receives VPNv4 routes to/from ServicesRoute-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Please refer to “On Demand Next-Hop (ODN) – IOS-XE” section forinitial ODN configuration.Access Router Service Provisioning (IOS-XE)#VRF definition configurationvrf definition L3VPN-SRODN-1 rd 100#100 route-target export 100#100 route-target import 100#100 address-family ipv4 exit-address-familyVRF Interface configurationinterface GigabitEthernet0/0/2 mtu 9216 vrf forwarding L3VPN-SRODN-1 ip address 10.5.1.1 255.255.255.0 negotiation autoendBGP VRF configuration Static & BGP neighbor Static routing configurationrouter bgp 100 address-family ipv4 vrf L3VPN-SRODN-1 redistribute connected exit-address-familyBGP neighbor configurationrouter bgp 100 neighbor Customer-1 peer-group neighbor Customer-1 remote-as 200 neighbor 10.10.10.1 peer-group Customer-1 address-family ipv4 vrf L3VPN-SRODN-2 neighbor 10.10.10.1 activate exit-address-familyL2VPN Single-Homed EVPN-VPWS On-Demand Next-HopFigure 8# L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Advertises/receives EVPN-VPWS instance to/fromServices Route-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic 
Engineering(SRTE) Policy to reach remote access router Please refer to “On Demand Next-Hop (ODN) – IOS-XR” section forinitial ODN configuration.Access Router Service Provisioning (IOS-XR)#PORT Based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 1000 target 1 source 1 interface TenGigE0/0/0/5 l2transportVLAN Based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 interface TenGigE0/0/0/5.1 neighbor evpn evi 1000 target 1 source 1 interface TenGigE0/0/0/5.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!L2VPN Static Pseudowire (PW) – Preferred Path (PCEP)Figure 9# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) ControlPlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Access Router Service Provisioning (IOS-XR)#segment-routing traffic-eng policy GREEN-PE7 color 200 end-point ipv4 100.0.2.52 candidate-paths preference 1 dynamic pce ! metric type igpPort Based Service configurationinterface TenGigE0/0/0/15 l2transportl2vpn pw-class static-pw-class-PE7 encapsulation mpls control-word preferred-path sr-te policy GREEN-PE7 p2p Static-PW-to-PE7-1 interface TenGigE0/0/0/15 neighbor ipv4 100.0.2.52 pw-id 1000 mpls static label local 1000 remote 1000 pw-class static-pw-class-PE7 VLAN Based Service configurationinterface TenGigE0/0/0/5.1001 l2transport encapsulation dot1q 1001 rewrite ingress tag pop 1 symmetricl2vpn pw-class static-pw-class-PE7 encapsulation mpls control-word preferred-path sr-te policy GREEN-PE7 p2p Static-PW-to-PE7-2 interface TenGigE0/0/0/5.1001 neighbor ipv4 100.0.2.52 pw-id 1001 mpls static label local 1001 remote 1001 pw-class static-pw-class-PE7 Access Router Service Provisioning (IOS-XE)#Port Based service with Static OAM configurationinterface GigabitEthernet0/0/1 mtu 9216 no ip address negotiation auto no keepalive service instance 10 ethernet encapsulation default xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! pseudowire-static-oam class static-oam timeout refresh send 10 ttl 255 pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 status protocol notification static static-oam ! VLAN Based Service configurationinterface GigabitEthernet0/0/1 no ip address negotiation auto service instance 1 ethernet Static-VPWS-EVC encapsulation dot1q 10 rewrite ingress tag pop 1 symmetric xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word !pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 End-To-End Services Data PlaneFigure 10# End-To-End Services Data PlaneHierarchical ServicesFigure 11# Hierarchical Services TableL3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE)Figure 12# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. 
Provider Edge Routers# Cisco ASR9000 IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Provider Edge Router# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together withPseudowire-Headend (PWHE) via CLI or NSO Provider Edge Router# Path to remote PE is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p L3VPN-VRF1 interface TenGigE0/0/0/5.501 neighbor evpn evi 13 target 501 source 501 ! ! !interface TenGigE0/0/0/5.501 l2transport encapsulation dot1q 501 rewrite ingress tag pop 1 symmetricPort based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 13 target 502 source 502 interface TenGigE0/0/0/5 l2transportAccess Router Service Provisioning (IOS-XE)#VLAN based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric !Port based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation defaultProvider Edge Router Service Provisioning (IOS-XR)#VRF configurationvrf L3VPN-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#501 ! export route-target 100#501 ! ! address-family ipv6 unicast import route-target 100#501 ! export route-target 100#501 ! !BGP configurationrouter bgp 100 vrf L3VPN-ODNTE-VRF1 rd 100#501 address-family ipv4 unicast redistribute connected ! address-family ipv6 unicast redistribute connected ! !PWHE configurationinterface PW-Ether1 vrf L3VPN-ODNTE-VRF1 ipv4 address 10.13.1.1 255.255.255.0 ipv6 address 1000#10#13##1/126 attach generic-interface-list PWHE!EVPN VPWS configuration towards Access PEl2vpn xconnect group evpn-vpws-l3vpn-A-PE3 p2p L3VPN-ODNTE-VRF1 interface PW-Ether1 neighbor evpn evi 13 target 501 source 501 !Figure 13# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 withPseudowire-Headend (PWHE) Data PlaneL3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRBFigure 14# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 withAnycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together with Anycast IRBvia CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! 
!interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override ribAnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012L2VPN configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! !EVPN configurationevpn evi 12001 ! advertise-mac ! ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30!VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !!BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! !Figure 15# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB Datal PlaneL2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRBFigure 16# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L2VPN Multipoint EVPN instance together withAnycast IRB via CLI or NSO (Anycast IRB is optional when L2 and L3is required in same service instance) Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Please note that provisioning on Access and Provider Edge routers issame as in “L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB”. 
In this use case there is BGP EVPN instead of MP-BGPVPNv4/6 in the core.Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! ! interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word !Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word !Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override ribAnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012L2VPN Configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! !EVPN configurationevpn evi 12001 ! advertise-mac ! ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30!VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !!BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! 
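As a quick post-provisioning check for this use case, the following is a minimal verification sketch. The object names (bridge-domain VRF1 in bridge group Static-VPWS-H-L3VPN-IRB, EVI 12001, VRF L3VPN-AnyCast-ODNTE-VRF1) are taken from the configuration above; exact command keywords and output formats vary by IOS-XR release.
show l2vpn bridge-domain bd-name VRF1 detail
show evpn evi vpn-id 12001 mac
show bgp vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 unicast
The bridge-domain output should show both static pseudowire neighbors and the routed BVI1 interface up, the EVI output should list the MAC routes being advertised, and the BGP VRF output should show the connected Anycast IRB subnet (12.0.1.0/24) being redistributed.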
Figure 17# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Data Plane", "url": "/blogs/2019-04-01-converged-sdn-transport-implementation-guide/", "author": "Jiri Chaloupka", "tags": "iosxr, cisco, Metro, Design" } , "blogs-2019-09-06-modernizing-ixp-design": { "title": "Modernizing IX Fabric Design Using Segment Routing and EVPN", "content": " On This Page Modern IX Fabric Design Internet Exchange History Initial Exchanges Growth through Internet Privatization Dawn of the NAP and Rapid IX Expansion IX Design Evolution Initial Exchange Design Ethernet Takes Over More Advanced Fabric Transport VPLS and P2P PW over MPLS EVPN over VXLAN TRILL, PBB, and other L2 Fabric Technology Modern IX Fabric Requirements Hardware High-Density and Future-Proof Edge and WAN Interface Flexibility Power and Space Packet Transport Requirements Service Types and Requirements Point to Point Connectivity Multi-point Connectivity (Multi-lateral Peering) Layer 3 Cloud or Transit Connectivity QoS Requirements Broadcast, Unknown Unicast, and Multicast Traffic Security Peer Connection Isolation L2 Security L3 Security Additional IX Components Fabric and Peer Service Automation Route Servers Route Looking Glass Analytics and Monitoring Fabric Telemetry Route Update History Modern IXP Fabric Network Design Topology Considerations Scale Out Design Fabric Design Benefits Segment Routing Underlay Segment Routing and Segment Routing Traffic Engineering SR-MPLS Data Plane Segment-Routing Flexible Algorithms Constraint-based SR Policies On-Demand SR Policies Segment Routing Benefits in IXP Use L2 IX Services using Segment Routing and EVPN EVPN Background EVPN Benefits for IX Use Modern IXP Deployment Background Single-plane Segment Routing Underlay Deployment Topology Diagram for Single-plane Fabric SRGB and SRLB Definition Base IGP / Segment Routing Configuration Enabling TI-LFA Dual-Plane Fabric using SR Flexible Algorithms Flex-Algo Background Diagram Dual-plane Flex-Algo Configuration Simple L2VPN Services using EVPN BGP AFI/SAFI Configuration EVPN Service Configuration Elements EVI - Ethernet Virtual Instance RD - Route Distinguisher RT - Route Target ESI - Ethernet Segment Identifier Attachment Circuit ID Topology Diagram for Example Services P2P Peer Interconnect using EVPN-VPWS Single-homed EVPN-VPWS service Multi-homed Single-active/All-active EVPN-VPWS service EVPN ELAN Services EVPN ELAN with Single-homed Endpoints EVPN ELAN with Dual-homed Endpoint Appendix Segment Routing and EVPN Troubleshooting Commands Periodic Model Driven Telemetry Device Health Infrastructure Monitoring Routing Protocols Service Monitoring Event Driven Telemetry In our next blog we will explore advanced Segment Routing TE using ODN/Flex-Algo and Layer 3 services using L3VPN and EVPN IRB Modern IX Fabric DesignInternet Exchange HistoryInitial ExchangesThe Internet was founded on a loosely coupled open inter-connectivity model. The ability for two networks to use a simple protocol to exchange IP routing data was essential in the growth of the initial research-focused Internet as well as what it has become today. As the Internet grew it made sense to create locations where multiple networks could connect to each other over a common multi-access network. The initial exchanges connected to NFSNet in the 1980s were located in San Francisco, New York, and Chicago, with an additional European exchange in Stockholm, Sweden. 
In those days the connectivity was between universities and other government research institutions, but that would soon change as commercial interest in the Internet grew.Growth through Internet PrivatizationIn the early 1990s those running the Internet saw the first commercial companies join. The ANS Internet backbone was built by a consortium of commercial companies (PSI, MCI, and Merit) and those companies wanted to offer commercial services such as Email across the infrastructure, as did other entities already connected to the Internet. One of the first commercial exchanges created was simply called the “Commercial Internet Exchange” or CIX, located in Reston, Virginia.One issue encountered in the initial exchanges which still happens today is disagreement over connectivity. The original ANS backbone network, the most widely used network for connecting to the Internet, refused to connect to the CIX. As the Internet transitioned to a commercial network as opposed to a research network, the role of ANS diminished as exchanges like CIX became more important.Dawn of the NAP and Rapid IX ExpansionIn the mid-1990s, as the Internet became more of a privatized public network, the need arose to create a number of public Internet peering exchange locations to help universities, enterprises, and service providers connect to each other. The original NAPs in the United States were run by either regional or national telecommunications companies. Figure 1 and the table below list the original five US NAP locations. There were exchange locations other than the NAPs, but these were the main peering points in the US, and at one point 90% of the worldwide Internet traffic flowed through these five locations. NAP Name Location Operator AADS Chicago, IL Ameritech MAE-EAST Washington DC, DC MFN, Pre-Existing Consortium NYIX New York, NY Sprint *MAE-WEST San Jose, CA MCI PAIX Palo Alto, CA PacBell MAE-WEST was not one of the original four NAPs awarded by NFSNET but was already established as a west coast IX prior to 1993 when those NAPs were awarded.The United States was not the only location in the world seeing the formation of Internet exchanges. The Amsterdam Internet Exchange, AMS-IX, was formed in 1994 and is still the largest Internet Exchange in Europe.IX Design EvolutionAs the Internet has evolved so has IX design, driven by bandwidth growth and the need for more flexible interconnection as the scope of traffic and who connects to the Internet evolves.Initial Exchange DesignThe initial Internet exchanges were built to be multi-access networks where a participant could use a single physical connection for both private point to point and public connections. These exchanges primarily used either IP over FDDI or IP over ATM (over TDM) as the transport between peers. Some more forward-looking exchanges also used switched Ethernet, but it was not widely deployed in the mid-1990s. FDDI and ATM allowed the use of virtual circuits to provide point to point and multipoint connections over a common fabric. One important aspect of these fabrics is that they used variable length data-link encoding, enabling packet-level statistical multiplexing. A fabric could be built using much less overall capacity than one using traditional TDM circuit switching. 
Interest in FDDI quickly waned and some IXs like MAE-EAST created a second fabric using ATM due to its popularity in the late 1990s and early 2000s.Ethernet Takes OverIn the late 1990s Ethernet was becoming popular to build Local Area Networks due to its simplicity and cost. The use of VLANs to segment Ethernet traffic into a number of virtual networks also gave it the same flexibility of ATM and FDDI. The original NAPs began to transition to Ethernet at this point for intra-exchange connectivity, such as the case when two providers have equipment co-located at the same IXP facility. One hurdle however was Ethernet circuits for WAN connectivity had not become popular yet, so it took some time for Ethernet to overtake ATM and FDDI completely.After Ethernet took over as the main transport for IX fabrics, they were primarily built using simple L2 switch fabrics. These switch fabrics do not have the loop prevention of IP networks, so protocols like STP or MST must be used for loop prevention. Ethernet fabrics are still in widespread use with IX networks, especially those with smaller scale requirements.More Advanced Fabric TransportAs IX fabrics began to grow, there arose a need for better control of traffic paths and more capabilities than what simple L2 fabrics could offer. This is not an exhaustive list of potential transport options, but lists a few popular ones for IX use.VPLS and P2P PW over MPLSMPLS (Multi-Protocol Label Switching) has been a popular data plane shim layer for providing virtual private networks over a common fabric for more than a decade now. Distribution of labels is done using LDP or RSVP-TE. RSVP-TE offer resilience through the use of fast-reroute and the ability to engineer traffic paths based on constraints. In the design section we will examine the benefits of Segment Routing over both LDP and RSVP-TE. MPLS itself is not enough to interconnect IX participants, it requires using overlay VPN services. VPLS (Virtual Private Lan Service) is the service most widely deployed today, emulating the data plane of a L2 switch, but carrying the traffic as MPLS-encapsulated frames. Point to point services are commonly provisioned using point to point pseudowires signaled using either a BGP or LDP control-plane. Please see RFC 4761 and RFC 6624.EVPN over VXLANWe will speak more about EVPN in the design section, but it has become the modern way to deliver L2VPN services, using a control-plane similar to L3VPN. EVPN extends MP-BGP with signaling extensions for L2 and L3 services that can utilize different underlying transport methods. VXLAN is one method which encapsulates Layer2 frames into a VXLAN packet carried over IP/UDP. VXLAN is considered “overlay transport” since it is carried in IP over any underlying path. VXLAN has no inherent ability to provide resiliency or traffic engineering capabilities. Gaining that functionality requires layering VXLAN on top of MPLS transport, adding complexity to the overall network. Using simple IP/UDP encapsulation, VXLAN is well suited for overlays traversing 3rd party opaque L3 networks, but IXP networks do not generally have this requirement.TRILL, PBB, and other L2 Fabric TechnologyAt a point in the early 2010s there was industry momentum towards creating a more advanced network without introducing L3 routing into the network, considered complex by those in favor of L2. Two protocols with support were TRILL (Transparent Interconnection of Lots of Links), 802.1ah PBB (Provider Backbone Bridging), and its TE addition PBB-TE. 
Proprietary fabrics were also proposed like Cisco FabricPath and Juniper QFabric. Ultimately these technologies faded and did not see widespread industry adoption.Modern IX Fabric RequirementsHardwareThe heart of an IX is the connectivity to participants at the edge and the transport connecting them to other participants. There are several components that needs to be considered when looking at hardware to build a modern IX.High-Density and Future-ProofBandwidth growth today requires the proper density to support the needs of the specific provider within a specific facility. This can range from 10s of Gbps to 10s of Tbps, requiring the ability to support a variable number of 10G and 100G interfaces. The ability to expand without replacing a chassis is also important as it can be especially cumbersome to do so in non-provider facilities. Today’s chassis-based deployments must be able to support a 400G future.Edge and WAN Interface FlexibilityEthernet is the single type of connectivity used today for connecting to an IX, but an IX does need flexibility to peers and the connectivity between IX fabric elements. 10G connections have largely replaced 1G connections for peers due to cost reduction in both device ports and optics. However, there is still a need for 1G connectivity as well as different physical medium types such as 10GBaseT. In order to support connectivity such as dark fiber, IXs sometimes must also provide ZR, ER, or DWDM based Ethernet port types.Modern IXs are also not typically limited ot a single physical location, so WAN connectivity also becomes important. Using technologies like coherent CFP2-DCO may be a requirement to interconnect IX facilities via high-speed flexible links.Power and SpaceIt almost goes without saying devices in a IX location must have the lowest possible space, power, and cooling footprint per BPS of traffic delivered. As interconnection continues to grow technology must advance to support higher density devices without considerably higher power and cooling requirements.Packet Transport RequirementsResiliency is key within any fabric, and the IX fabric must be able to withstand the failure of a link or node with minimal traffic disruption. Boundaries between different IX facilities must be redundantly connected.Service Types and RequirementsIn a modern IX there can be both traditional L2 connectivity as well as L3 connectivity. The following outlines the different service types and their requirements. One common requirement across all service types is redundant attachment. Each service type should support either active/active or active/standby connectivity from the participant.Point to Point ConnectivityThe most basic service type an IX fabric must provide is point to point participant connectivity. The edge must be able to accept and map traffic to a specific service based on physical port or VLAN tagged frames. The ability to rewrite VLANs at the edge may also be a requirement.Multi-point Connectivity (Multi-lateral Peering)Multi-point connectivity is most commonly used for multi-lateral peering fabrics or “public” fabrics where all participants belong to the same bridge domain. This type of fabric requires less configuration since a single IP interface is used to connect to multiple peers. 
BGP configuration can also be greatly simplified if the IX provides a route-server to advertise routes to participant peers versus a full mesh of peering sessions.Layer 3 Cloud or Transit ConnectivityDepending on the IX provider, they may offer their own blended transit services or multi-cloud connections via L3 connectivity. In this service type the IX will peer directly with the participant or provide a L3 gateway if the participant is not using dynamic routing to the IX.QoS RequirementsQuality of Service or Class of Service (CoS) covers an array of traffic handling components. With the use of higher speed 10G and 100G interfaces as defacto physical connectivity, the most basic QoS needed is policing or shaping of ingress traffic at the edge to the contracted rate of the participant.An IX can also offer differentiated services for connectivity between participants requiring traffic marking at the edge and specific treatment across the core of the network.Broadcast, Unknown Unicast, and Multicast TrafficA L2 fabric, whether traditional L2 switching or emulated via a technology like EVPN must provide controls to limit the effects of BUM traffic having the potential to flood networks with unwanted or duplicate traffic. At the PE-CE boundary “storm” controls must be supported to limit these traffic types to sensible packet rates.SecurityPeer Connection IsolationOne of the key tenets of a multi-tenant network fabric is to provide secure traffic isolation between parties using the fabric. This can be done using “soft” or “hard” mechanisms. Soft isolation uses packet structure in order to isolate traffic between tenants, such as MPLS headers or specific service tags. As you move down the stack, the isolation becomes “harder”, first using VLAN tags, channels in the case of Flex Ethernet, or separate physical medium to completely isolate tenants. Harder isolation is typically less efficient and more difficult to operate. In modern networks, isolation using VPN services is regarded as sufficient for an IX fabric and offers the greatest flexibility and scale. The isolation must be performed on the IX fabric itself and protect against users spoofing MPLS headers or VLANs tags at the attachment point in the network.L2 SecurityThe table below lists the more common L2 security features required by an IX network. Some of these should perform the action of disabling either permanently or temporarily a connected port. Feature Description L2 ACL Ability to create filters based on L2 frame criteria (SRC/DST MAC, Ethertype, control BPDUs, etc) ARP/ND/RA policing Police ARP/ND/RA requests MAC scale limits Limit MAC scale for specific Bridge Domain Static ARP Override dynamic ARP with static ARP entries L3 Security Feature Description L3 ACL Ability filter on L3 criteria, useful for filtering attacks towards fabric subnets and participants ARP/ND/RA policing Police ARP/ND/RA requests MAC scale limits Limit MAC scale for specific Bridge Domain Static ARP Override dynamic ARP with static ARP entries Additional IX ComponentsFabric and Peer Service AutomationIdeally the management of the underlying network fabric, participant interfaces, and participant services are automated. Using a tool like Cisco NSO as a single source of truth for the network eliminates configuration errors, eases deployment, and eases the removal of configuration when it is no longer needed. NSO also allows easy abstraction and deployment of point to point and multi-point services through the use of defined service models and templates. 
Deployed services should be well-defined to reduce support complexity.Route ServersA redundant set of route servers is used in many IX deployments to eliminate each peer having to configure a BGP session to every other peer. The route server is similar to a BGP route reflector with the main difference being a route server operates with EBGP peers and not IBGP peers. The route server also acts as a point of route security since the filters governing advertisements between participants is typically performed on the route server. Route server definition can be found in RFC 7947 and route server operations in RFC 7948.Route Looking GlassLooking glasses allow an outside user or internal participant to view the current real-time routing for a specific prefix or set of prefixes. This is invaluable for troubleshooting routing issues.Analytics and MonitoringFabric TelemetryHaving accurate statistics on peer and fabric state is important for evaluating the current health of the fabric as well assist in capacity planning. Monitoring traffic utilization, device health, and protocol state using modern telemetry such as streaming telemetry can help rectify faults faster and improve reliability. See the automation section of the design for a list of common Cisco and OpenConfig models used with streaming telemetry.Route Update HistoryRoute update history is one area of IX operation that can assist not only with IX growth but also Internet health as a whole. Being able to trace the history of route updates coming through an IX helps both providers and enterprises determine root cause for traffic issues, identify the origin of Internet security events, and assist those researching Internet routing change over time. Route update history can be communicated by either BGP or using BGP Monitoring Protocol (BMP).Modern IXP Fabric Network DesignTopology ConsiderationsScale Out DesignWe can learn from modern datacenter design in how we build a modern IX fabric, at least the network located within a single facility or group of locations in close proximity. The use of smaller functional building blocks increases operational efficiency and resiliency within the fabric. Connecting devices in a Clos (leaf/spine or fat-tree are other names) fabric seen in Figure XX versus a large modular chassis approach has a number of benefits.Fabric Design Benefits Scale the fabric by simply adding devices and interconnects Optimal connectivity between fabric endpoints Increased resiliency by utilizing ECMP across the fabric Ability to easily takes nodes in and out of service without affecting many services In the case of interconnecting remote datacenters using a more fabric based approach increases overall scale and resiliency.Segment Routing UnderlaySegment Routing and Segment Routing Traffic EngineeringSegment Routing is the modern simplified packet transport control-plane for multi-service networks. Segment Routing eliminates separate IGP and label distribution protocols running in parallel while providing built-in resilience and traffic engineering capabilities. Much more information on segment routing can be located at http#//www.segment-routing.netSR-MPLS Data PlaneOne important point with Segment Routing is that it is data plane agnostic, meaning the architecture is built to support multiple data plane types. The SR “SID” is an identifier expressed by the underlying data plane. 
The Segment Routing MPLS data plane uses standard MPLS headers to carry traffic end to end, with SR responsible for label distribution and path computation. In this design we will utilize the SR MPLS data plane, but as other data planes such as Segment Routing IPv6 become mature they could plug in as well.Segment-Routing Flexible AlgorithmsAn exciting development in SR capabilities is the use of “Flexible Algorithms.” Simply put, Flex-Algo allows one to define multiple SIDs on a single device with each one representing a specific “algorithm.” Using the algorithm as a constraint in head-end path computation simplifies the path to a single label since the algorithm itself takes care of pruning links not applicable to the topology. See the figure below for an example of Flex-Algo using the initial topology definition to restrict the path to only encrypted links.Constraint-based SR PoliciesPath computation on the head-end SR node or external PCE can include more advanced constraints such as latency, link affinity, hop count, or SRLG avoidance. Cisco supports a wide range of path constraints both within XR on the SR head-end node as well as through SR-PCE, Cisco’s external Path Computation Element for Segment Routing.On-Demand SR PoliciesCisco has simplified the control-plane even more with a feature called ODN (On Demand Next Hop) for SR-TE. When a head-end node receives an EVPN BGP route with a specific extended community, known as the color community, it creates an SR Policy to the BGP next-hop following the defined constraints for that community. The head-end node can compute the SR Policy path itself, or the ODN policy can instruct the head-end to consult a PCE for end-to-end path computation.Segment Routing Benefits in IXP Use Reduction of control-plane protocols across the fabric by eliminating additional label distribution protocols Advanced traffic engineering capabilities, all while reducing overall network complexity Built-in local protection through the use of TI-LFA, computing the post-convergence path for both link and node protection Advanced OAM capabilities using real-time performance measurement and automated data-plane path monitoring Ability to tie services to defined underlay paths, unlike pure overlays like VXLAN Quickly add to the topology by simply turning up new IGP linksL2 IX Services using Segment Routing and EVPNEVPN BackgroundEVPN is the next-generation service type for creating L2 VPN overlay services across a Segment Routing underlay network. EVPN replaces services like VPLS, which emulate a physical Ethernet switch, with a scalable BGP based control-plane. MAC addresses are no longer learned across the network as part of forwarded traffic; they are learned at the edges and distributed as BGP VPN routes. Below is a list of just a few of the advantages of EVPN over legacy service types such as VPLS or LDP-signaled P2P pseudowires. 
RFC 7432 is the initial RFC defining EVPN service types and operation covering both MP and P2P L2 services VPWS point to point Ethernet VPN ELAN multi-point Ethernet VPN EVPN brings the paradigms of BGP L3VPN to Ethernet VPNs MAC and ARP/ND (MAC+IP) information is advertised using BGP NLRI EVPN signaling identifies the same ESI (Ethernet Segment) connected to multiple PE nodes, allowing active/active or active/standby multi-homing EVPN has been extended to support IRB (Integrated Routing and Bridging) for inter-subnet routing EVPN Benefits for IX Use BGP-based control plane has obvious scaling and distribution benefits Eliminates mesh of pseudowires between L2 endpoints All-active per-flow load-balancing across redundant active links between IX and peers Reduced flooding scope for ARP traffic BUM labels act as a way to control flooding without complex split-horizon configuration Fast MAC withdrawal improves convergence vs. data plane learning Filter ARP/MAC advertisements via common BGP route policy Distributed MAC pinning Once MAC is learned on a CE interface, it is advertised with EVPN BD as “sticky” Remote PEs will drop traffic sourced from a MAC labeled as sticky from another PE Works in redundancy scenarios Works seamlessly with existing L2 Ethernet fabrics without having to run L2 protocols such as STP within EVPN itself Provides L2 multi-homing replacement for MC-LAG and L3 multi-homing replacement for VRRP/HSRPModern IXP DeploymentBackgroundIn the following section we will explore the deployment using IOS-XR devices and CLI. We will start with the most basic deployment and add additional components to enable features such as multi-plane design and L3 services.Single-plane Segment Routing Underlay DeploymentIn the simplest deployment example, Segment Routing is deployed by configuring either OSPF or IS-IS with SR MPLS extensions enabled. The configuration example below utilizes IS-IS as the SR underlay IGP protocol. The underlay is deployed as a single IS-IS L2 domain using Segment Routing MPLS.Topology Diagram for Single-plane FabricSRGB and SRLB DefinitionIt’s recommended to first configure the Segment Routing Global Block (SRGB) across all nodes needing connectivity between each other. In most instances a single SRGB will be used across the entire network. In a SR MPLS deployment the SRGB and SRLB correspond to the label blocks allocated to SR. IOS-XR has a maximum configurable SRGB limit of 512,000 labels, however please consult platform-specific documentation for maximum values. The SRLB corresponds to the labels allocated for SIDs local to the node, such as Adjacency-SIDs. It is recommended to configure the same SRLB block across all nodes. The SRLB must not overlap with the SRGB. The SRGB and SRLB are configured in IOS-XR with the following configuration#segment-routing segment-routing global-block 16000 16999 local-block 17000 17999 Base IGP / Segment Routing ConfigurationThe following configuration example shows an example IS-IS deployment with SR-MPLS extensions enabled for the IPv4 address family. The SR-enabling configuration lines are bolded, showing how Segment Routing and TI-LFA (FRR) can be deployed with very little configuration. SR must be deployed on all interconnected nodes to provide end to end reachability. This example shows defining an absolute prefix-sid as well as a manually defined adjacency SID.It is recommended to use absolute SIDs and manual adjacency SIDs if possible. Interoperability with other vendors requires the use of relative prefix-SIDs. 
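Because the SRGB is a contiguous label block, an indexed prefix-SID is simply an offset into it; with the SRGB above (16000-16999), index 41 resolves to label 16000 + 41 = 16041, the same value as the absolute form used in the configuration that follows. The fragment below is a sketch of the two equivalent ways to express that SID (only one would be configured at a time), reusing the IS-IS instance name from the example configuration.
router isis example
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid absolute 16041
!
router isis example
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid index 41
When interoperating with a vendor that cannot use the same SRGB, the indexed form lets each node derive its local label from its own SRGB base while all nodes still advertise a consistent index.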
A protected SID is eligible for backup path computation. In the case of having multiple adjacencies between the same two nodes, use the same adjacency-sid on each link.router isis example set-overload-bit on-startup wait-for-bgp is-type level-2-only net 49.0002.1921.6801.4003.00 distribute link-state log adjacency changes log pdu drops lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password keychain ISIS-KEY address-family ipv4 unicast maximum-paths 16 metric-style wide mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0 maximum-paths 32 segment-routing mpls ! interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid absolute 16041 ! ! ! interface GigabitEthernet0/0/0/1 circuit-type level-2-only point-to-point address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 10 adjacency-sid absolute 15002 protected The two key elements to enable Segment Routing are segment-routing mpls under the ipv4 unicast address family and the node prefix-sid absolute 16041 definition under the Loopback0 interface. The prefix-sid can be defined as either an indexed value or absolute. The index value is added to the SRGB start (16000 in our case) to derive the SID value. Using absolute SIDs is recommended where possible, but in a multi-vendor network where one vendor may not be able to use the same SRGB as the other, using an indexed value is necessary.Enabling TI-LFATopology-Independent Loop-Free Alternates is not enabled by default. The above configuration enables TI-LFA on the Gigabit0/0/0/1 interface for IPv4 prefixes. TI-LFA can be enabled for all interfaces by using this command under the address-family ipv4 unicast in the IS-IS instance configuration. It is recommended to enable it at the interface level to control other TI-LFA attributes such as node protection and SRLG support.interface GigabitEthernet0/0/0/1 circuit-type level-1 point-to-point address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 10This is all that is needed to enable Segment Routing, and you can already see the simplicity in its deployment vs additional label distribution protocols like LDP and RSVP-TEDual-Plane Fabric using SR Flexible AlgorithmsThe dual plane design extends the base configuration by defining a topology based on SR flexible algorithms. Defining two independent topologies allows us to easily support disjoint services across the IX fabric. IX operators can offer diverse services without the fear of convergence on common links. This can be done with a minimal amount of configuration. Flex-algo also supports LFA and will ensure LFA paths are also constrainted to a specific topology.Flex-Algo BackgroundSR Flex-Algo is a simple extension to SR and its compatible IGP protocols to advertise membership in a logical network topology by configuring a specific “algorithm” attached to a node prefix-sid. All nodes with the same algorithm defined participate in the topology and can use the definition to define the behavior of a path. In the below example when a head-end computes a path to node 9’s node-SID assigned to algorithm 129 it will only use nodes participating in the minimal delay topology. This means a constraint can be met using a single node SID in the SID list instead of multiple explicit SIDs. 
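As an operational aside, it is worth confirming which nodes have actually joined an algorithm before steering services onto it. A minimal sketch is shown below, assuming a recent IOS-XR release (command keywords vary by version) and the algorithm numbers used in the example that follows.
show isis flex-algo 128
show isis flex-algo 129
show isis segment-routing label table
The flex-algo output shows the advertised definition and participation for each algorithm, and the label table should include the per-algorithm prefix-SIDs alongside the default algorithm 0 SIDs.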
Flex-algo is defined in IETF draft# draft-ietf-lsr-flex-algo and more details on Flex-Algo can be found on http#//www.segment-routing.net Algo 0 = IGP Metric Algo 128 = Green = Minimize TE Metric Algo 129 = Red = Minimize DelayDiagramDual-plane Flex-Algo ConfigurationWe will not re-introduce all of the configuration but the subset necessary to define both planes. To enable flexible algorithms you must first define the algorithms globally in IS-IS. The second step is to define a node prefix-sid on a Loopback interface and attach an algorithm to the SID. By default all nodes participate in algorithm 0, which is to simply compute a path based on minimal IGP metric.The advertise-definition option advertises the definition through the IGP domain. Using this command the definition can be configured on a subset of nodes, making the global configuration unnecessary on all nodes. It’s recommended to define the flex-algo identifiers on all participating nodes and advertise them. IS-IS Configurationrouter isis 1 flex-algo 100 advertise-definition !flex-algo 200 advertise-definition !interface Loopback0 address-family ipv4 unicast prefix-sid algorithm 100 absolute 16141 prefix-sid algorithm 200 absolute 16241 Simple L2VPN Services using EVPNWe will first look at EVPN configuration for deploying basic point to point and multi-point L2 services without specific traffic engineering constraints or path diversity requirements. These services will simply follow the shortest path across the network.BGP AFI/SAFI ConfigurationEVPN uses additional BGP address families in order to carry EVPN information across the network. EVPN uses the BGP L2VPN AFI of 25 and a SAFI of 70. In order to carry EVPN information between two peers, this AFI/SAFI must be enabled on all peers. The following shows the minimum BGP configuration to enable this at a global and peer level. router bgp 100 bgp router-id 100.0.0.1 address-family l2vpn evpn !!neighbor-group EVPN remote-as 100 update-source Loopback0 address-family l2vpn evpn !!neighbor 100.0.0.2 use neighbor-group EVPN At this point the two neighbors will become established over the EVPN AFI/SAFI. The command to view the relationship in IOS-XR is show bgp l2vpn evpn summaryEVPN Service Configuration ElementsEVI - Ethernet Virtual InstanceAll EVPN services require the configuration of the Ethernet Virtual Instance identifier, used to advertise the existence of an EVPN service endpoint to routers participating in the EVPN service network. The EVI is locally significant to a PE but it’s recommended the same EVI be configured for all routers participating in a particular EVPN service.RD - Route DistinguisherSimilar to L3VPN and BGP-AD L2VPN, a Route Distinguisher is used to differentiate EVPN routes belonging to different EVIs. The RD is auto-generated based on the Loopback0 IP address as specified in the EVPN RFC.RT - Route TargetAlso similar to L3VPN and BGP-AD L2VPN, a Route Target extended community is defined so EVPN routes are imported into the correct EVI across the network. The RT is auto-generated based on the EVI ID, but can be manually configured. It is recommended to utilize the auto-generated RT value.ESI - Ethernet Segment IdentifierThe ESI is used to identify a particular Ethernet “segment” for the purpose of multi-homing. In single-homed scenarios, the ESI is set to 0 by default. In multi-homing scenarios such as all-active attachment, the ESI is configured the same on multiple routers. 
If it is known an attachment will never be multi-homed, using the default ESI of 0 is recommended, but if there is a chance it may be multi-homed in the future, using a unique ESI is recommended. The ESI is a 10-octet value with 1 byte used for the type and 9 octets for the value. Values of 0 and the maximum value are reserved.Attachment Circuit IDThis value is used only with EVPN VPWS point to point services. It defines a local attachment circuit ID and the remote attachment circuit ID to be used in signalling the endpoints and to direct traffic to the correct attachment point. The local ID does not have to be the same on both ends, but it is recommended.Topology Diagram for Example ServicesThe following is a topology diagram to follow along with the service endpoints in the below service configuration examples. Each CE node represents a peering fabric participant.P2P Peer Interconnect using EVPN-VPWSThe following highlights a simple P2P transparent L2 interconnect using EVPN-VPWS. It is assumed the EVPN BGP address family has been configured.Single-homed EVPN-VPWS serviceThe simplest P2P interconnect is single-homed on both ends. The single-homed service can use an entire physical interface or VLAN tags to identify a specific service. This service originates on PE1 and terminates on PE3. The service data plane path utilizes ECMP across the core network, one of the benefits of using an SR underlay. As you can see in the config below, there is no static neighbor config; P2P VPWS connections are dynamically set up by matching the EVI, target, and source identifiers. The target and source identifiers must match on the two nodes participating in the service.DiagramPE1interface TenGigabitEthernet0/0/0/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface TenGigabitEthernet0/0/0/1.100 neighbor evpn evi 10 target 100 source 101 PE3interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface TenGigabitEthernet0/0/1/1.100 neighbor evpn evi 10 target 101 source 100 Multi-homed Single-active/All-active EVPN-VPWS serviceA multi-homed service uses two attachment circuits from the CE to unique PE devices on the provider side. LACP is used between the PEs and CE device in single-active and all-active multi-homing. This requires configuring a static LACP ID and ESI on both PE routers. Multi-chassis LACP protocols such as ICCP are not required; all multi-homed signaling is done with the EVPN control-plane.In this example CE1 is configured in a multi-homed all-active configuration to PE1 and PE2, while CE2 continues to be configured as single-homed. In this configuration traffic will be hashed using header information across all active links in the bundle across all PEs. PE3 will receive two routes for the VPWS service and utilize both to balance traffic towards PE1 and PE2. Another option is to use single-active load-balancing mode, which will only forward traffic towards the ethernet-segment from the DF (designated forwarder). Single-active is commonly used to enforce customer bandwidth rates, while still providing redundancy. 
In the case where there are multiple EVPN services on the same bundle interface, they will be balanced across the interfaces using the DF election algorithm.DiagramNote the LACP system MAC and ethernet-segment (ESI) on both PE nodes must be configured with the same values.PE1lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!!evpn group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional core-isolation-group 1 !! l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface Bundle-Ether1.100 neighbor evpn evi 10 target 100 source 100 PE2lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!!evpn group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional core-isolation-group 1 !l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface Bundle-Ether1.100 neighbor evpn evi 10 target 100 source 100PE3interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric !!l2vpn xconnect group evpn-vpws-example p2p pe1-to-pe2 interface TenGigabitEthernet0/0/1/1.100 neighbor evpn evi 10 target 100 source 100 EVPN ELAN ServicesAn EVPN ELAN service is analogous to the function of VPLS, but modernized to eliminate the deficiencies with VPLS highlighted in earlier sections. ELAN is a multipoint service interconnecting all participating hosts connected to an Ethernet Segment in the same EVI.EVPN ELAN with Single-homed EndpointsIn this configuration example the CE devices are connected to each PE using a single attachment interface. The EVI is set to a value of 100. It is considered a best practice to manually configure the ESI value on each participating interface although not required in the case of a single-active service. The ESI must be unique for each Ethernet Segment attached to the EVPN EVI.The core-isolation-group configuration is used to shut down CE access interfaces when a tracked core upstream interface goes down. This way a CE will not send traffic into a PE node isolated from the rest of the network.In the bridge configuration, L2 security for storm control is enabled for unknown-unicast and multicast traffic. Additionally the MAC aging time is set to 30 minutes to decrease ARP traffic, and the MAC limit is set to 1 since all peers should be connected with a routed L3 interface to the IX fabric. Under the physical interface configuration an input QoS policy is configured to remark all inbound traffic with a DSCP of 0 and an L2 access list is configured to only allow 802.1Q TPID traffic with a VLAN tag of 100 from a specific MAC address.PE1ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! 
end-policy-map!interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface TenGigabitEthernet0/0/1/1.100 mac limit 1 maximum 1 mac aging time 3600 storm-control unknown-unicast pps 100 storm-control multicast pps 100 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface TenGigabitEthernet0/0/1/1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 ! core-isolation-group 1 ! ! PE2ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! end-policy-map!interface TenGigabitEthernet0/0/1/1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface TenGigabitEthernet0/0/1/1.100 mac limit 1 maximum 1 mac aging time 3600 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface TenGigabitEthernet0/0/1/1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.12 ! core-isolation-group 1 ! ! EVPN ELAN with Dual-homed EndpointIn this configuration example the CE1 device is connected to both PE1 and PE2. The EVI is set to a value of 100. The ESI value of 11.11.11.11.11.11.11.11.11 is configured on both PE devices connected to CE1.PE1ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! end-policy-map!lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface Bundle-Ether1.100 mac limit 1 maximum 1 mac aging time 3600 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional command to only forward through DF ! core-isolation-group 1 ! ! PE2ethernet-services access-list restrict_mac 10 permit host 00aa.dc11.ba99 any 8100 vlan 100 20 deny any any ! policy-map remark-ingress class class-default set dscp 0 ! end-policy-map!lacp system mac 1001.1001.1001!interface TenGigabitEthernet0/0/0/1 description ~To CE1~ bundle id 1 mode on !interface Bundle-Ether1.100 l2transport encapsulation dot1q 100 rewrite ingress tag pop 1 symmetric service-policy input remark-ingress ethernet-services access-group restrict_mac ingress !!l2vpn bridge group evpn bridge-domain evpn-elan interface Bundle-Ether1.100 mac limit 1 maximum 1 mac aging time 3600 ! evi 100 ! ! ! evpn evi 100 advertise-mac ! ! group 1 core interface TenGigabitEthernet0/0/1/24 ! interface Bundle-Ether1.100 ethernet-segment identifier type 0 11.11.11.11.11.11.11.11.11 load-balancing-mode single-active <-- Optional command to only forward through DF ! core-isolation-group 1 ! ! 
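Before moving to the full command reference in the appendix, the following is a minimal verification sketch for the VPWS and ELAN services configured above. The group, EVI, and bridge-domain names follow the earlier examples; command availability and output format vary by IOS-XR release.
show bgp l2vpn evpn summary
show evpn ethernet-segment detail
show l2vpn xconnect group evpn-vpws-example detail
show evpn evi vpn-id 100 mac
show l2vpn bridge-domain bd-name evpn-elan detail
Things to look for are established EVPN BGP sessions, the shared Ethernet Segment listing both PE1 and PE2 with a DF elected, the xconnect in the up state, and a MAC count in the bridge domain that stays within the configured limit of 1 per peer interface.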
AppendixSegment Routing and EVPN Troubleshooting Commands Show command Function isis segment-routing label table Display learned node SIDs mpls forwarding detail Show general MPLS forwarding table mpls forwarding prefix [prefix] detail Show detail forwarding information for exact prefix cef Show FIB hardware forwarding information mpls forwarding labels [label] detail Display forwarding info and stats for EVPN label bgp l2vpn evpn Display EVPN NLRI bgp l2vpn evpn rd [rd] Display EVPN NLRI belonging to specific RD bgp l2vpn evpn route-type [type] Display EVPN routes of a specific route type evpn internal-label Display labels allocated to EVPN instances evpn ethernet-segment esi [esi] carving detail Display EVPN service details evpn evi [vpn-id] mac Show MAC address tables and MPLS label info for all EVI evpn evi vpn-id [vpn] detail Show detail info for a specific local EVI evpn evi vpn-id [vpn] detail Show detail info for a specific local EVI l2vpn forwarding location [location] L2 forwarding database l2vpn forwarding bridge-domain [bridge-group#bridge-domain] mac-address detail location [location] l2 forwaridng info for local bridge domain l2vpn forwarding evpn[bridge-group#bridge-domain] mac-address detail location [location] l2 forwaridng info for local bridge domain l2vpn forwarding bridge-domain evpn ipv4-mac detail location [location] Show EVPN IPv4 MAC info l2vpn forwarding bridge-domain evpn ipv6-mac detail location [location] Show EVPN IPv6 MAC info l2vpn xconnect detail Display EVPN VPWS info and state Periodic Model Driven TelemetryDevice Health Function Sensor Path Uptime Cisco-IOS-XR-shellutil-oper#system-time/uptime CPU Cisco-IOS-XR-wdsysmon-fd-oper#system-monitoring/cpu-utilization Memory Cisco-IOS-XR-nto-misc-oper#memory-summary/nodes/node/summary ASR9K Power Cisco-IOS-XR-asr9k-sc-envmon-oper#environmental-monitoring/racks/rack/slots/slot/modules/module/power/power-bag NCS 5500 Environmentals Cisco-IOS-XR-sysadmin-fretta-envmon-ui#environment/oper NCS 5500 FIB Resources Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper#dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Infrastructure Monitoring Function Sensor Path Interface Summary Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-summary Interface Counters Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters Interface Data/PPS Rates (show int) Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/cache/data-rate IS-IS Stats Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/statistics-global Optics Information Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info Aggregate Bundle Stats Cisco-IOS-XR-bundlemgr-oper#bundles LLDP Neighbor Information Cisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighbors QoS Input Stats Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input QoS Output Stats Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/output QoS VOQ Information Cisco-IOS-XR-qos-ma-oper#qos/qos-global/vo-q/vo-q-statistics/vo-qinterfaces/vo-qinterface LPTS (Control Plane) Flow Information Cisco-IOS-XR-lpts-pre-ifib-oper#lpts-pifib/nodes/node/dynamic-flows-stats/flow IPv4 ACL Resources Cisco-IOS-XR-ipv4-acl-oper#ipv4-acl-and-prefix-list/oor/access-list-summary/details IPv6 ACL Resources Cisco-IOS-XR-ipv6-acl-oper#ipv4-acl-and-prefix-list/oor/access-list-summary/details Routing Protocols Function Sensor Path IS-IS Protocol Stats 
Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/statistics-global IS-IS Interfaces and Stats Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfaces IS-IS Adjacencies Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/level/adjacencies/adjacency IS-IS Route Info Cisco-IOS-XR-ip-rib-ipv4-oper#rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/protocol/isis/as/information BFD Statistics Cisco-IOS-XR-ip-bfd-oper#bfd/summary BFD Session Details Cisco-IOS-XR-ip-bfd-oper#bfd/session-details IPv4 BGP GRT Process Info Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info IPv6 BGP GRT Process Info Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info IPv4 BGP GRT Neighbor Stats Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors IPv6 BGP GRT Neighbor Stats Cisco-IOS-XR-ipv6-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors BGP Route Target Entries Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/rt-entries/rt-entry RPKI Summary Stats Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/rpki-summary BGP Flowspec Stats Cisco-IOS-XR-flowspec-oper#flow-spec/vrfs/vrf/afs/af/flows MPLS Label Allocation Cisco-IOS-XR-mpls-lsd-oper#mpls-lsd/label-summary SR Node Prefix-SIDs Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/topologies/topology/ipv4-routes/ipv4-route/native-status/native-details/primary/source/nodal-sid Service Monitoring Function Sensor Path L2VPN FIB Summary Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary L2VPN Bridge Domain Info Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/bridge-domains/bridge-domain L2VPN BD MAC Details Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fibmac-details L2VPN BD Stats Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-bridge-domains EVPN IPv4 Learned IP/MAC Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip4macs EVPN IPv6 Learned IP/MAC Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip6macs L2VPN Xconnect Info Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnects Event Driven TelemetryThese telemetry paths can be configured as EDT, only sending data when an event is triggered, for example an interface state change.One configures a supported sensor-path as Event Driven by setting the sample-interval in the subscription to 0 Function Sensor Path Interface Admin State Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interfaces/interface/state Interface Oper State Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interfaces/interface/line-state IPv4 Route Attributes Cisco-IOS-XR-ip-rib-ipv4-oper#rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes IPv4 Route Attributes Cisco-IOS-XR-ip-rib-ipv6-oper#rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/routes Optics Admin Sfxtate Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info/transport-admin-state Optics State Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info/controller-state In our next blog we will explore advanced Segment Routing TE using ODN/Flex-Algo and Layer 3 services using L3VPN and EVPN IRB", "url": "/blogs/2019-09-06-modernizing-ixp-design/", "author": "Phil Bedard", "tags": "iosxr, Peering, Design" } , "#": {} , "blogs-2020-03-09-converged-sdn-transport-convert": { "title": "Converged SDN Transport High Level Design v3.0", "content": 
" On This Page Revision History Value Proposition Summary Technical Overview Hardware Components in Design ASR 9000 NCS-560 NCS 5504, 5508, 5516 Modular Chassis NCS-5501, NCS-5501-SE, and N540-ACC-SYS NCS-55A2-MOD ASR-920 Transport – Design Components Network Domain Structure Topology options and PE placement - Inline and non-inline PE Connectivity using 100G/200G coherent optics w/MACSec Ring deployment without multiplexers Ring deployment with multiplexer Intra-Domain Intra-Domain Routing and Forwarding Intra-Domain Forwarding - Fast Re-Route Inter-Domain Inter-Domain Forwarding Area Border Routers – Prefix-SID vs Anycast-SID Inter-Domain Forwarding - High Availability and Fast Re-Route Transport Programmability Traffic Engineering (Tactical Steering) – SR-TE Policy Transport Controller Path Computation Engine (PCE) Segment Routing Path Computation Element (SR-PCE) PCE Controller Summary – SR-PCE Path Computation Engine – Workflow Delegated Computation to SR-PCE Segment Routing and Unified MPLS (BGP-LU) Co-existence Summary ABR BGP-LU design Quality of Service and Assurance Overview NCS 540, 560, and 5500 QoS Primer Hierarchical Edge QoS H-QoS platform support CST Core QoS mapping with five classes Example Core QoS Class and Policy Maps Class maps for ingress header matching Class maps for egress queuing and marking policies Egress QoS queuing policy Egress QoS marking policy Converged SDN Transport Use Cases 5G Mobile Networks Summary and 5G Service Types Key Validated Components Low latency SR-TE path computation SR Policy latency constraint configuration on configured policy SR Policy latency constraint configuration for ODN policies Static defined link delay metric TE metric definition End to end network QoS with H-QoS on Access PE CST QoS mapping with 5 classes Cable Converged Interconnect Network (CIN) Summary Distributed Access Architecture Remote PHY Components and Requirements Remote PHY Device (RPD) RPD Network Connections Cisco cBR-8 and cnBR cBR-8 Network Connections cBR-8 Redundancy Remote PHY Communication DHCP Remote PHY Standard Flows GCP UEPI and DEPI L2TPv3 Tunnels CIN Network Requirements IPv4/IPv6 Unicast and Multicast Network Timing QoS DHCPv4 and DHCPv6 Relay Converged SDN Transport CIN Design Deployment Topology Options High Scale Design (Recommended) Collapsed Digital PIC and SUP Uplink Connectivity Collapsed RPD and cBR-8 DPIC Connectivity Cisco Hardware Scalable L3 Routed Design L3 IP Routing CIN Router to Router Interconnection Leaf Transit Traffic cBR-8 DPIC to CIN Interconnection DPIC Interface Configuration Router Interface Configuration RPD to Router Interconnection Native IP or L3VPN/mVPN Deployment SR-TE CIN Quality of Service (QoS) CST Network Traffic Classification CST and Remote-PHY Load Balancing 4G Transport and Services Modernization L3 IP Multicast and mVPN LDP Auto-configuration LDP mLDP-only Session Capability (RFC 7473) LDP Unicast FEC Filtering for SR Unicast with mLDP Multicast EVPN Multicast LDP to Converged SDN Transport Migration Towards Converged SDN Transport Design Segment Routing Enablement Segment Routing Mapping Server Design Automation Zero Touch Provisioning Model-Driven Telemetry Network Services Orchestrator (NSO) Converged SDN Transport Supported Service Models Services – Design Overview Ethernet VPN (EVPN) Ethernet VPN Hardware Support Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or with Data Center End-To-End (Flat) – Services Hierarchical – Services Hierarchical L2 
Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHE Services – Route-Reflector (S-RR) Ethernet Services OAM using Ethernet CFM Transport and Services Integration The Converged SDN Transport Design Transport Transport Programmability Services Transport and Services Integration The Converged SDN Transport Design - Summary Revision History Version Date Comments 1.0 05/08/2018 Initial Converged SDN Transport publication 1.5 09/24/2018 NCS540 Access, ZTP, NSO Services 2.0 4/1/2019 Non-inline PE Topology, NCS-55A2-MOD, IPv4/IPv6/mLDP Multicast, LDP to SR Migration 3.0 1/20/2020 Converged Transport for Cable CIN, Multi-domain Multicast, Qos w/H-QoS access, MACSEC, Coherent Optic connectivity Value PropositionService Providers are facing the challenge to provide next generationservices that can quickly adapt to market needs. New paradigms such as5G introduction, video traffic continuous growth, IoT proliferation andcloud services model require unprecedented flexibility, elasticity andscale from the network. Increasing bandwidth demands and decreasing ARPUput pressure on reducing network cost. At the same time, services needto be deployed faster and more cost effectively to stay competitive.Metro Access and Aggregation solutions have evolved from nativeEthernet/Layer 2 based, to Unified MPLS to address the above challenges.The Unified MPLS architecture provides a single converged networkinfrastructure with a common operational model. It has great advantagesin terms of network convergence, high scalability, high availability,and optimized forwarding. However, that architectural model is stillquite challenging to manage, especially on large-scale networks, becauseof the large number of distributed network protocols involved whichincreases operational complexity.Converged SDN Transport design introduces an SDN-ready architecturewhich evolves traditional Metro network design towards an SDN enabled,programmable network capable of delivering all services (Residential,Business, 4G/5G Mobile Backhaul, Video, IoT) on the premise ofsimplicity, full programmability, and cloud integration, with guaranteedservice level agreements (SLAs).The Converged SDN Transport design brings tremendous value to ServiceProviders# Fast service deployment and rapid time to market throughfully automated service provisioning and end-to-end networkprogrammability Operational simplicity with less protocols to operate and manage Smooth migration towards an SDN-ready architecture thanks tobackward-compatibility with existing network protocols and services Next generation service creation leveraging guaranteed SLAs Enhanced and optimized operations using telemetry/analytics inconjunction with automation tools The Converged SDN Transport design is targeted at Service Providercustomers who# Want to evolve their existing Unified MPLS Network Are looking for an SDN ready solution Need a simple, scalable design that can support future growth Want a future proof architecture built using industry-leading technology SummaryThe Converged SDN Transport design satisfies the following criteria for scalable next-generation networks# Simple# based on Segment Routing as unified forwarding plane andEVPN and L3VPN as a common BGP based services control plane Programmable# Using SR-PCE to program end-to-end multi-domain paths across thenetwork with guaranteed SLAs Automated # Service provisioning is fully 
automated using NSOand YANG models; Analytics with model driven telemetry inconjunction with Crosswork Network Insights toenhance operations and network visibility Technical OverviewThe Converged SDN Transport design evolves from the successful CiscoEvolved Programmable Network (EPN) 5.0 architecture framework, to bringgreater programmability and automation.In the Converged SDN Transport design, the transport and service are builton-demand when the customer service is requested. The end-to-endinter-domain network path is programmed through controllers and selectedbased on the customer SLA, such as the need for a low latency path.The Converged SDN Transport is made of the following main buildingblocks# IOS-XR as a common Operating System proved in Service ProviderNetworks Transport Layer based on Segment Routing as UnifiedForwarding Plane SDN - Segment Routing Path Computation Element (SR-PCE) as Cisco Path ComputationEngine (PCE) coupled with Segment Routing to provide simple andscalable inter-domain transport connectivity and TrafficEngineering and Path control Service Layer for Layer 2 (EVPN) and Layer 3 VPN services basedon BGP as Unified Control Plane Automation and Analytics NSO for service provisioning Netconf/YANG data models Telemetry to enhance and simplify operations Zero Touch Provisioning and Deployment (ZTP/ZTD) Hardware Components in DesignASR 9000The ASR 9000 is the router of choice for high scale edge services. The Converged SDN Transport utilizes the ASR 9000 in a PE function role, performing high scale L2VPN, L3VPN, and Pseudowire headend termination. All testing up to 3.0 has been performed using Tomahawk series line cards on the ASR 9000.NCS-560The NCS-560 with RSP4 is a next-generation platform with high scale and modularity to fit in many access, pre-aggregation, and aggregation roles. Available in 4-slot and 7-slot versions, the NCS 560 is fully redundant with a variety of 40GE/100GE, 10GE, and 1GE modular adapters. The NCS 560 RSP4 has built-in GNSS timing support along with a high scale (-E) version to support full Internet routing tables or large VPN routing tables with room to spare for 5+ years of growth. The NCS 560 provides all of this with a very low power and space footprint with a depth of 9.5”.NCS 5504, 5508, 5516 Modular ChassisThe modular chassis version of the NCS 5500 is available in 4, 8, and 16 slot versions for flexible interfaces at high scale with dual RP modules. A variety of line cards are available with 10G, 40G, 100G, and 400G interface support. The NCS 5500 fully supports timing distribution for applications needing high accuracy clocks like mobile backhaul.NCS-5501, NCS-5501-SE, and N540-ACC-SYSThe NCS 5501, 5501-SE, and 540 hardware is validated in both an access and aggregation role in the Converged SDN Transport. The 5501 has 48x1G/10G SFP+ and 6x100G QSFP28 interfaces, the SE adds higher route scale via an external TCAM. The N540-ACC-SYS is a next-generation access node with 24x10G SFP+, 8x25G SFP28, and 2x100G QSFP28 interfaces. The NCS540 is available in extended temperature with a conformal coating for deployment deep into access networks.NCS-55A2-MODThe Converged SDN Transport design now supports the NCS-55A2-MOD access and aggregation router. The 55A2-MOD is a modular 2RU router with 24 1G/10G SFP+, 16 1G/10G/25G SFP28 onboard interfaces, and two modular slots capable of 400G of throughput per slot using Cisco NCS Modular Port Adapters or MPAs. MPAs add additional 1G/10G SFP+, 100G QSFP28, or 100G/200G CFP2 interfaces. 
The 55A2-MOD is available in an extended temperature version with a conformal coating as well as a high scale configuration (NCS-55A2-MOD-SE-S) scaling to millions of IPv4 and IPv6 routes.ASR-920The IOS-XE based ASR 920 is tested within the Converged SDN Transport as an access node. The Segment Routing data plane and supported service types are validated on the ASR 920 within the CST design. Please see the services support section for all service types supported on the ASR 920. Transport – Design ComponentsNetwork Domain StructureTo provide unlimited network scale, the Converged SDN Transport is structured into multiple IGP Domains# Access, Aggregation, and Core. Refer to the network topology in Figure 1.Figure 1# High scale fully distributedThe network diagram in Figure 2 shows how a Service Provider network can be simplified by decreasing the number of IGP domains. In this scenario the Core domain is extended over the Aggregation domain, thus increasing the number of nodes in the Core.Figure 2# Distributed with expanded accessA similar approach is shown in Figure 3. In this scenario the Core domain remains unaltered and the Access domain is extended over the Aggregation domain, thus increasing the number of nodes in the Access domain.Figure 3# Distributed with expanded coreThe Converged SDN Transport design supports all three network options, while remaining easily customizable.The first phase of the Converged SDN Transport, discussed later in this document, will cover in depth the scenario described in Figure 3.Topology options and PE placement - Inline and non-inline PEThe non-inline PE topology, shown in the figure below, moves the services edge PE device from the forwarding path between the access/aggregation networks and the core. There are several factors which can drive providers to this design vs. one with an in-line PE, some of which are outlined in the table below. The control-plane configuration of the Converged SDN Transport does not change; all existing ABR configuration remains the same, but the device no longer acts as a high-scale PE.Figure# Non-Inline Aggregation TopologyConnectivity using 100G/200G coherent optics w/MACSecIn Converged SDN Transport 3.0 we add support for the use of pluggable CFP2-DCO transceivers to enable high speed aggregation and access network infrastructure. As endpoint bandwidth increases due to technology innovation such as 5G and Remote PHY, access and aggregation networks must grow from 1G and 10G to 100G and beyond. Coherent router optics simplify this evolution by allowing an upgrade path to increase ring bandwidth up to 400Gbps without deploying costly DWDM optical line systems.MACSec is an industry standard protocol running at L2 to provide encryption across Ethernet links. In CST 3.0 MACSec is enabled across CFP2-DCO access to aggregation links. MACSec support is hardware dependent, so please consult individual hardware data sheets for MACSec support.Ring deployment without multiplexersIn the simplest deployment, access rings are deployed over dark fiber, enabling plug and play operation up to 80km without amplification.CFP2-DCO DWDM ring deploymentRing deployment with multiplexerIn this option the nodes are deployed with active or passive multiplexers to maximize fiber utilization for rings needing more bandwidth per ring site. 
While this example shows each site on the ring having direct DWDM links back to the aggregation nodes, a hybrid approach could also be supported targeting only high-bandwidth locations with direct links while leaving other sites on an aggregation ring.CFP2-DCO DWDM hub and spoke or partial mesh deploymentThe Cisco NCS 55A2-MOD and 55A2-MOD-SE hardened modular platform has a mix of fixed SFP+ and SFP28 ports along with two MPA slots. The coherent aggregation and access solution can utilize either the 2xCFP2-DCO MPA or 2xQSFP28+1xCFP2-DCO MPA. The same MPA modules can be used in the 5504, 5508, and 5516 chassis using the NC55-MOD-A-S and NC55-MOD-A-SE line cards, with 12xSFP+ and 2xQSFP+ ports.Cisco 55A2 modular hardened routerCisco NCS 5500 chassis modular line cardIntra-DomainIntra-Domain Routing and ForwardingThe Converged SDN Transport is based on a fully programmable transport that satisfies the requirements described earlier. The foundation technology used in the transport design is Segment Routing (SR) with an MPLS based Data Plane in Phase 1 and an IPv6 based Data Plane (SRv6) in the future.Segment Routing dramatically reduces the number of protocols needed in a Service Provider Network. Simple extensions to traditional IGP protocols like ISIS or OSPF provide full Intra-Domain Routing and Forwarding Information over a label switched infrastructure, along with High Availability (HA) and Fast Re-Route (FRR) capabilities.Segment Routing defines the following routing related concepts# Prefix-SID – A node identifier that must be unique for each node in an IGP Domain. Prefix-SID is statically allocated by the network operator. Adjacency-SID – A node’s link identifier that must be unique for each link belonging to the same node. Adjacency-SID is typically dynamically allocated by the node, but can also be statically allocated. In the case of Segment Routing with an MPLS Data Plane, both Prefix-SID and Adjacency-SID are represented by the MPLS label and both are advertised by the IGP protocol. This IGP extension eliminates the need to use LDP or RSVP to exchange MPLS labels.The Converged SDN Transport design uses ISIS as the IGP protocol.Intra-Domain Forwarding - Fast Re-RouteSegment-Routing embeds a simple Fast Re-Route (FRR) mechanism known as Topology Independent Loop Free Alternate (TI-LFA). TI-LFA provides sub 50ms convergence for link and node protection. TI-LFA is completely stateless and does not require any additional signaling mechanism as each node in the IGP Domain calculates a primary and a backup path automatically and independently based on the IGP topology. After the TI-LFA feature is enabled, no further care is expected from the network operator to ensure fast network recovery from failures. This is in stark contrast with traditional MPLS-FRR, which requires RSVP and RSVP-TE and therefore adds complexity in the transport design.Please refer also to the Area Border Router Fast Re-Route covered in Section# “Inter-Domain Forwarding - High Availability and Fast Re-Route” for additional details.Inter-DomainInter-Domain ForwardingThe Converged SDN Transport achieves network scale by IGP domain separation. Each IGP domain is represented by a separate IGP process on the Area Border Routers (ABRs).Section# “Intra-Domain Routing and Forwarding” described basic Segment Routing concepts# Prefix-SID and Adjacency-SID. This section introduces the concept of Anycast SID.Segment Routing allows multiple nodes to share the same Prefix-SID, which is then called an “Anycast” Prefix-SID or Anycast-SID. 
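To make these building blocks concrete, the following is a minimal IOS-XR sketch, not taken from the validated configurations, showing an IS-IS instance with Segment Routing and TI-LFA enabled, a node Prefix-SID on Loopback0, and an Anycast-SID on Loopback1 shared by a pair of ABRs; the instance name, interface names, and SID values are hypothetical.
router isis ACCESS
 address-family ipv4 unicast
  metric-style wide
  segment-routing mpls
 !
 ! Node Prefix-SID, unique per node in the IGP domain
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid absolute 16011
  !
 !
 ! Anycast-SID, the same value configured on both ABRs, N-flag cleared
 interface Loopback1
  address-family ipv4 unicast
   prefix-sid absolute 16100 n-flag-clear
  !
 !
 ! TI-LFA is enabled per interface and per address-family
 interface TenGigE0/0/0/0
  point-to-point
  address-family ipv4 unicast
   fast-reroute per-prefix
   fast-reroute per-prefix ti-lfa
  !
 !
With this in place each node independently pre-computes its TI-LFA backup paths, and the shared Loopback1 SID behaves as the Anycast-SID discussed next.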
Additionalsignaling protocols are not required, as the network operator simplyallocates the same Prefix SID (thus a Anycast-SID) to a pair of nodestypically acting as ABRs.Figure 4 shows two sets of ABRs# Aggregation ABRs – AG Provider Edge ABRs – PE Figure 4# IGP Domains - ABRs Anycast-SIDFigure 5 shows the End-To-End Stack of SIDs for packets traveling fromleft to right through thenetwork.Figure 5# Inter-Domain LSP – SR-TE PolicyThe End-To-End Inter-Domain Label Switched Path (LSP) was computed viaSegment Routing Traffic Engineering (SR-TE) Policies.On the Access router “A” the SR-TE Policy imposes# Local Aggregation Area Border Routers Anycast-SID# Local-AGAnycast-SID Local Provider Edge Area Border Routers Anycast-SID# Local-PEAnycast SID Remote Provider Edge Area Border Routers Anycast-SID# Remote-PEAnycast-SID Remote Aggregation Area Border Routers Anycast-SID# Remote-AGAnycast-SID Remote/Destination Access Router# Destination-A Prefix-SID#Destination-A Prefix-SID The SR-TE Policy is programmed on the Access device on-demand by anexternal Controller and does not require any state to be signaledthroughout the rest of the network. The SR-TE Policy provides, by simpleSID stacking (SID-List), an elegant and robust way to programInter-Domain LSPs without requiring additional protocols such as BGP-LU(RFC3107).Please refer to Section# “Transport Programmability” for additional details.Area Border Routers – Prefix-SID vs Anycast-SIDSection# “Inter-Domain Forwarding” showed the use of Anycast-SID at the ABRs for theprovisioning of an Access to Access End-To-End LSP. When the LSP is setup between the Access Router and the AG/PE ABRs, there are two options# ABRs are represented by Anycast-SID; or Each ABR is represented by a unique Prefix-SID. Choosing between Anycast-SID or Prefix-SID depends on the requestedservice. Please refer to Section# “Services - Design”.Note that both options can be combined on the same network.Inter-Domain Forwarding - High Availability and Fast Re-RouteAG/PE ABRs redundancy enables high availability for Inter-DomainForwarding.Figure 7# IGP Domains - ABRs Anycast-SIDWhen Anycast-SID is used to represent AG or PE ABRs, no other mechanismis needed for Fast Re-Route (FRR). Each IGP Domain provides FRRindependently by TI-LFA as described in Section# “Intra-Domain Forwarding - Fast Re-Route”.Figure 8 shows how FRR is achieved for a Inter-DomainLSP.Figure 8# Inter-Domain - FRRThe access router on the left imposes the Anycast-SID of the ABRs andthe Prefix-SID of the destination access router. For FRR, any router inIGP1, including the Access router, looks at the top label# “ABRAnycast-SID”. For this label, each device maintains a primary and backuppath preprogrammed in the HW. In IGP2, the top label is “Destination-A”.For this label, each node in IGP2 has primary and backup pathspreprogrammed in the HW. The backup paths are computed by TI-LFA.As Inter-Domain forwarding is achieved via SR-TE Policies, FRR iscompletely self-contained and does not require any additional protocol.Note that when traditional BGP-LU is used for Inter-Domain forwarding,BGP-PIC is also required for FRR.Inter-Domain LSPs provisioned by SR-TE Policy are protected by FRR alsoin case of ABR failure (because of Anycast-SID). 
This is not possiblewith BGP-LU/BGP-PIC, since BGP-LU/BGP-PIC have to wait for the IGP toconverge first.Transport ProgrammabilityFigure 9 and Figure 10 show the design of Route-Reflectors (RR), Segment Routing Path Computation Element (SR-PCE) and WAN Automation Engines (WAE).High-Availability is achieved by device redundancy in the Aggregationand Core networks.Figure 9# Transport Programmability – PCEPTransport RRs collect network topology from ABRs through BGP Link State (BGP-LS).Each Transport ABR has a BGP-LS session with the two Domain RRs. Each domain is represented by a different BGP-LS instance ID.Aggregation Domain RRs collect network topology information from theAccess and the Aggregation IGP Domain (Aggregation ABRs are part of theAccess and the Aggregation IGP Domain). Core Domain RRs collect networktopology information from the Core IGP Domain.Aggregation Domain RRs have BGP-LS sessions with Core RRs.Through the Core RRs, the Aggregation Domains RRs advertise localAggregation and Access IGP topologies and receive the network topologiesof the remote Access and Aggregation IGP Domains as well as the networktopology of the Core IGP Domain. Hence, each RR maintains the overallnetwork topology in BGP-LS.Redundant Domain SR-PCEs have BGP-LS sessions with the local Domain RRsthrough which they receive the overall network topology. Refer toSection# “Segment Routing Path Computation Element (SR-PCE)” for more details about SR-PCE.SR-PCE is then capable of computing the Inter-Domain LSP path on-demand. The computed path (Segment Routing SID List) is communicated to the Service End Points via a Path Computation Element Protocol (PCEP) response as shown in Figure 9, orBGP-SR-TE, as shown in Figure 10. In the caseof PCEP, SR-PCEs and Service End Points communicate directly, while forBGP-SR-TE, they communicate via RRs. Phase 1 uses PCEP only.The Service End Points create a SR-TE Policy and use the SID list returned by SR-PCE as the primary path.Service End Points can be located on the Access Routers for FlatServices or at both the Access and domain PE routers for Hierarchical Services. The domain PE routers and ABRs may or may not be the same router. The SR-TE Policy DataPlane in the case of Service End Point co-located with the Access routerwas described in Figure 5.Figure 10# Transport Programmability – BGP-SR-TEThe proposed design is very scalable and can be easily extended tosupport even higher numbers of BGP-SR-TE/PCEP sessions by addingadditional RRs and SR-PCE elements into the Access Domain.Figure 11 shows the Converged SDN Transport physical topology with examplesof product placement.Figure 11# Converged SDN Transport – Physical Topology with transportprogrammabilityTraffic Engineering (Tactical Steering) – SR-TE PolicyOperators want to fully monetize their network infrastructure byoffering differentiated services. Traffic engineering is used to providedifferent paths (optimized based on diverse constraints, such aslow-latency or disjoined paths) for different applications. Thetraditional RSVP-TE mechanism requires signaling along the path fortunnel setup or tear down, and all nodes in the path need to maintainstates. 
This approach doesn’t work well for cloud applications, which have hyper scale and elasticity requirements.Segment Routing provides a simple and scalable way of defining an end-to-end application-aware traffic engineering path computed once again through SR-TE Policy.In the Converged SDN Transport design, the Service End Point uses PCEP along with Segment Routing On-Demand Next-hop (SR-ODN) capability to request from the controller a path that satisfies specific constraints (such as low latency). This is done by associating an SLA tag/attribute to the path request. Upon receiving the request, the SR-PCE controller calculates the path based on the requested SLA, and uses PCEP to dynamically program the ingress node with a specific SR-TE Policy.Transport Controller Path Computation Engine (PCE)Segment Routing Path Computation Element (SR-PCE)Segment Routing Path Computation Element, or SR-PCE, is a Cisco Path Computation Engine (PCE) and is implemented as a feature included as part of the Cisco IOS-XR operating system. The function is typically deployed on a Cisco IOS-XR cloud appliance XRv-9000, as it involves control plane operations only. The SR-PCE gains network topology awareness from BGP-LS advertisements received from the underlying network. Such knowledge is leveraged by the embedded multi-domain computation engine to provide optimal path information to Path Computation Element Clients (PCCs) using the Path Computation Element Protocol (PCEP).The PCC is the device where the service originates (PE) and therefore it requires end-to-end connectivity over the segment routing enabled multi-domain network.The SR-PCE provides a path based on constraints such as# Shortest path (IGP metrics). Traffic-Engineering metrics. Disjoint paths starting on one or two nodes. Figure 12# XR Transport Controller – ComponentsPCE Controller Summary – SR-PCESegment Routing Path Computation Element (SR-PCE)# Runs as a feature on a physical or virtual IOS-XR node Collects topology from BGP using BGP-LS, ISIS, or OSPF Deploys SR Policies based on client requests Computes Shortest, Disjoint, Low Latency, and Avoidance paths North Bound interface with applications via REST APIPath Computation Engine – WorkflowThere are three models available to program transport LSPs# Delegated Computation to SR-PCEAll models assume SR-PCE has acquired full network topology through BGP-LS.Figure 13# PCE Path ComputationDelegated Computation to SR-PCE NSO provisions the service. Alternatively, the service can be provisioned via CLI Access Router requests a path SR-PCE computes the path SR-PCE provides the path to Access Router Access Router acknowledges Segment Routing and Unified MPLS (BGP-LU) Co-existenceSummaryIn the Converged SDN Transport 3.0 design we introduce validation for the co-existence of services using BGP Labeled Unicast transport for inter-domain forwarding and those using SR-TE. Many networks deployed today have an existing BGP-LU design which may not be easily migrated to SR, so graceful introduction between the two transport methods is required. In the case of a multipoint service such as EVPN ELAN or L3VPN, an endpoint may utilize BGP-LU to one endpoint and SR-TE to another.ABR BGP-LU designIn a BGP-LU design each IGP domain or ASBR boundary node will exchange BGP labeled prefixes between domains while resetting the BGP next-hop to its own loopback address. The labeled unicast label will change at each domain boundary across the end to end network. 
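As a minimal sketch of this ABR behavior (the AS number, neighbor address, and loopback are hypothetical and not part of the validated configuration), the boundary node allocates labels for the prefixes it advertises and sets the BGP next-hop to its own loopback#
router bgp 100
 address-family ipv4 unicast
  allocate-label all
 !
 ! iBGP labeled-unicast session towards the adjacent domain or route-reflector
 neighbor 192.0.2.1
  remote-as 100
  update-source Loopback0
  address-family ipv4 labeled-unicast
   next-hop-self
  !
 !
Only the labeled-unicast role of the ABR is shown; route-reflection, routing policy, and the per-domain IGP configuration are omitted.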
Within each IGP domain, a label distribution protocol is used to supply MPLS connectivity between the domain boundary and interior nodes. In the Converged SDN Transport design, IS-IS with SR-MPLS extensions is used to provide intra-domain MPLS transport. This ensures within each domain BGP-LU prefixes are protected using TI-LFA.The BGP-LU design utilized in the Converged SDN Transport validation is based on Cisco’s Unified MPLS design used in EPN 4.0. More information can be found at# <a href=https#//www.cisco.com/c/dam/en/us/td/docs/solutions/Enterprise/Mobility/EPN/4_0/EPN_4_Transport_Infrastructure_DIG.pdf></a>Quality of Service and AssuranceOverviewQuality of Service is of utmost importance in today’s multi-service converged networks. The Converged SDN Transport design has the ability to enforce end to end traffic path SLAs using Segment Routing Traffic Engineering. In addition to satisfying those path constraints, traditional QoS is used to make sure the PHB (Per-Hop Behavior) of each packet is enforced at each node across the converged network.NCS 540, 560, and 5500 QoS PrimerFull details of the NCS 540 and 5500 QoS capabilities and configuration can be found at# <a href=https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/qos/66x/b-qos-cg-ncs5500-66x/b-qos-cg-ncs5500-66x_chapter_010.html></a>The NCS platforms utilize the same MQC configuration for QoS as other IOS-XR platforms but based on their hardware architecture use different elements for implementing end to end QoS. On these platforms ingress traffic is# Matched using flexible criteria via Class Maps Assigned to a specific Traffic Class (TC) and/or QoS Group for further treatment on egress Has its header marked with a specific IPP, DSCP, or MPLS EXP valueTraffic Classes are used internally for determining fabric priority and as the match condition for egress queuing. QoS Groups are used internally as the match criteria for egress CoS header re-marking. IPP/DSCP marking and re-marking of ingress MPLS traffic is done using ingress QoS policies. MPLS EXP for imposed labels can be done on ingress or egress, but if you wish to rewrite both the IPP/DSCP and set an explicit EXP for imposed labels, the MPLS EXP must be set on egress.The priority-level command used in an egress QoS policy specifies the egress transmit priority of the traffic vs. other priority traffic. Priority levels can be configured as 1-7 with 1 being the highest priority. Priority level 0 is reserved for best-effort traffic.Please note, multicast traffic does not follow the same constructs as unicast traffic for prioritization. All multicast traffic assigned to Traffic Classes 1-4 are treated as Low Priority and traffic assigned to 5-6 treated as high priority.Hierarchical Edge QoSHierarchical QoS enables a provider to set an overall traffic rate across all services, and then configure parameters per-service via a child QoS policy where the percentages of guaranteed bandwidth are derived from the parent rateH-QoS platform supportNCS platforms support 2-level and 3-level H-QoS. 3-level H-QoS applies a policer (ingress) or shaper (egress) to a physical interface, with each sub-interface having a 2-level H-QoS policy applied. Hierarchical QoS is not enabled by default on the NCS 540 and 5500 platforms. H-QoS is enabled using the hw-module profile qos hqos-enable command. Once H-QoS is enabled, the number of priority levels which can be assigned is reduced from 1-7 to 1-4. 
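The following is a minimal two-level H-QoS sketch, assuming hypothetical policy names, rates, and sub-interface, and reusing the match-traffic-class-2 class map from the core QoS examples in this document# a parent shaper is applied to the sub-interface and a child policy queues a priority class and shares the remaining bandwidth.
hw-module profile qos hqos-enable
!
policy-map CHILD-QUEUING
 class match-traffic-class-2
  priority level 2
  shape average percent 20
 !
 class class-default
  bandwidth remaining percent 80
 !
 end-policy-map
!
policy-map PARENT-500M
 class class-default
  service-policy CHILD-QUEUING
  shape average 500 mbps
 !
 end-policy-map
!
interface TenGigE0/0/0/10.100
 service-policy output PARENT-500M
The parent rate and child percentages are placeholders; note the child priority class carries its own shaper, in line with the sub-interface requirement described next.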
Additionally, any hierarchical QoS policy assigned to a L3 sub-interface using priority levels must include a “shape” command.The ASR9000 supports multi-level H-QoS at high scale for edge aggregation function. In the case of hierarchical services, H-QoS can be applied to PWHE L3 interfaces.CST Core QoS mapping with five classesQoS designs are typically tailored for each provider, but we introduce a 5-level QoS design which can fit most provider needs. The design covers transport of both unicast and multicast traffic. Traffic Type Core Marking Core Priority Comments Network Control EXP 6 Highest Underlay network control plane Low latency EXP 5 Highest Low latency service, consistent delay High Priority 1 EXP 3 Medium-High High priority service traffic Medium Priority / Multicast EXP 2 Medium priority and multicast   Best Effort EXP 0 General user traffic   Example Core QoS Class and Policy MapsThese are presented for reference only, please see the implementation guide for the full QoS configurationClass maps for ingress header matchingclass-map match-any match-ef-exp5 description High priority, EF match dscp 46 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 end-class-mapIngress QoS policypolicy-map ingress-classifier class match-ef-exp5 set traffic-class 2 set qos-group 2 ! class match-cs5-exp4 set traffic-class 3 set qos-group 3 ! class class-default set traffic-class 0 set dscp 0 set qos-group 0 ! end-policy-mapClass maps for egress queuing and marking policiesclass-map match-any match-traffic-class-2 description ~Match highest priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match high priority traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-mapEgress QoS queuing policypolicy-map egress-queuing class match-traffic-class-2 priority level 2 ! class match-traffic-class-3 priority level 3 ! class class-default ! end-policy-mapEgress QoS marking policypolicy-map core-egress-exp-marking class match-qos-group-2 set mpls experimental imposition 5 ! class match-qos-group-3 set mpls experimental imposition 4 class class-default set mpls experimental imposition 0 ! end-policy-mapConverged SDN Transport Use CasesService Provider networks must adopt a very flexible design that satisfyany to any connectivity requirements, without compromising in stabilityand availability. Moreover, transport programmability is essential tobring SLA awareness into the network.The goal of the Converged SDN Transport is to provide a flexible networkblueprint that can be easily customized to meet customer specificrequirements. This blueprint must adapt to carry any service type, for examplecable access, mobile, and business services over the same converged network infrastructure.5G Mobile NetworksSummary and 5G Service TypesThe Converged SDN Transport design introduces initial support for 5G networks and 5G services. There are a variety of new service use cases being defined by 3GPP for use on 5G networks, illustrated by the figure below. Networks must now be built to support the stringent SLA requirements of Ultra-Reliable Low-Latency services while also being able to cope with the massive bandwidth introduced by Enhanced Mobile Broadband services. 
The initial support for 5G in the Converged SDN Transport design focuses on the backhaul and midhaul portions of the network utilizing end to end Segment Routing. The design introduces no new service types, the existing scalable L3VPN and EVPN based services using BGP are sufficient for carrying 5G control-plane and user-plane traffic.Key Validated ComponentsThe following key features have been added to the CST validated design to support initial 5G deploymentsLow latency SR-TE path computationIn this release of the CST design, we introduce a new validated constraint type for SR-TE paths used for carrying services across the network. The “latency” constraint used either with a configured SR Policy or ODN SR Policy specifies the computation engine to look for the lowest latency path across the network. The latency computation algorithm can use different mechanisms for computing the end to end path. The first and preferred mechanism is to use the realtime measured per-link one-way delay across the network. This measured information is distributed via IGP extensions across the IGP domain and then onto external PCEs using BGP-LS extensions for use in both intra-domain and inter-domain calculations. In version 3.0 of the CST this is supported on ASR9000 links using the Performance Measurement link delay feature. More detail on the configuration can be found at https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-0/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-70x/b-segment-routing-cg-asr9000-70x_chapter_010000.html#id_118505. In release 6.6.3 NCS 540 and NCS 5500 nodes support the configuration of static link-delay values which are distributed using the same method as the dynamic values. Two other metric types can also be utilized as part of the “latency” path computation. The TE metric, which can be defined on all SR IS-IS links and the regular IGP metric can be used in the absence of the link-delay metric.Different metric types can be used in a single path computation, with the following order used# Unidirectional link delay metric either computed or statically defined Statically defined TE metric IGP metricSR Policy latency constraint configuration on configured policy segment-routing traffic-eng policy LATENCY-POLICY color 20 end-point ipv4 1.1.1.3 candidate-paths preference 100 dynamic mpls metric type latencySR Policy latency constraint configuration for ODN policies segment-routing traffic-eng on-demand color 100 dynamic pcep ! metric type latencyStatic defined link delay metricperformance-measurement interface TenGigE0/0/0/10 delay-measurement advertise-delay 15000 interface TenGigE0/0/0/20 delay-measurement advertise-delay 10000TE metric definitionsegment-routing traffic-eng interface TenGigE0/0/0/10 metric 15 ! interface TenGigE0/0/0/20 metric 10The link-delay metrics are quantified in the unit of microseconds. On most networks this can be quite large and may be out of range from normal IGP metrics, so care must be taken to ensure proper compatibility when mixing metric types. The largest possible IS-IS metric is 16777214 which is equivalent to 16.77 seconds.End to end network QoS with H-QoS on Access PEQoS is of utmost importance for ensuring the mobile control plane and critical user plane traffic meets SLA requirements. 
Overall network QoS is covered in the QoS section in this document, this section will focus on basic Hierarchical QoS to support 5G services.H-QoS enables a provider to set an overall traffic rate across all services, and then configure parameters per-service via a child QoS policy where the percentages of guaranteed bandwidth are derived from the parent rate. NCS platforms support 2-level and 3-level H-QoS. 3-level H-QoS applies a policer (ingress) or shaper (egress) to a physical interface, with each sub-interface having a 2-level H-QoS policy applied.CST QoS mapping with 5 classes Traffic Type Ingress Marking Core Marking Comments Low latency IPP 5 EXP 5 URLLC, consistent delay, small buffer 5G Control Plane IPP 4 EXP 4 Mobile control and billing High Priority Service IPP 3 (in contract), 1 (out of contract) EXP 1,3 Business service Best Effort IPP 0 EXP 0 General user traffic Network Control IPP 6 EXP 6 Underlay network control plane Cable Converged Interconnect Network (CIN)SummaryThe Converged SDN Transport Design enables a multi-service CIN by adding support for the features and functions required to build a scalable next-generation Ethernet/IP cable access network. Differentiated from simple switch or L3 aggregation designs is the ability to support NG cable transport over the same common infrastructure already supporting other services like mobile backhaul and business VPN services. Cable Remote PHY is simply another service overlayed onto the existing Converged SDN Transport network architecture. We will cover all aspects of connectivity between the Cisco cBR-8 and the RPD device.Distributed Access ArchitectureThe cable Converged Interconnect Network is part of a next-generation Distributed Access Architecture (DAA), an architecture unlocking higher subscriber bandwidth by moving traditional cable functions deeper into the network closer to end users. R-PHY or Remote PHY, places the analog to digital conversion much closer to users, reducing the cable distance and thus enabling denser and higher order modulation used to achieve Gbps speeds over existing cable infrastructure. This reference design will cover the CIN design to support Remote PHY deployments.Remote PHY Components and RequirementsThis section will list some of the components of an R-PHY network and the network requirements driven by those components. It is not considered to be an exhaustive list of all R-PHY components, please see the CableLabs specification document, the latest which can be access via the following URL# https#//specification-search.cablelabs.com/CM-SP-R-PHYRemote PHY Device (RPD)The RPD unlocks the benefits of DAA by integrating the physical analog to digital conversions in a device deployed either in the field or located in a shelf in a facility. The uplink side of the RPD or RPHY shelf is simply IP/Ethernet, allowing transport across widely deployed IP infrastructure. The RPD-enabled node puts the PHY function much closer to an end user, allowing higher end-user speeds. The shelf allows cable operators to terminate only the PHY function in a hub and place the CMTS/MAC function in a more centralized facility, driving efficiency in the hub and overall network. The following diagram shows various options for how RPDs or an RPD shelf can be deployed. Since the PHY function is split from the MAC it allows independent placement of those functions.RPD Network ConnectionsEach RPD is typically deployed with a single 10GE uplink connection. 
The compact RPD shelf uses a single 10GE uplink for each RPD.Cisco cBR-8 and cnBRThe Cisco Converged Broadband Router performs many functions as part of a Remote PHY solution. The cBR-8 provisions RPDs, originates L2TPv3 tunnels to RPDs, provisions cable modems, performs cable subscriber aggregation functions, and acts as the uplink L3 router to the rest of the service provider network. In the Remote PHY architecture the cBR-8 acts as the DOCSIS core and can also serve as a GCP server and video core. The cBR-8 runs IOS-XE. The cnBR, cloud native Broadband Router, provides DOCSIS core functionality in a server-based software platform deployable anywhere in the SP network. CST 3.0 has been validated using the cBR-8, the cnBR will be validated in an upcoming release.cBR-8 Network ConnectionsThe cBR-8 is best represented as having “upstream” and “downstream” connectivity.The upstream connections are from the cBR8 Supervisor module to the SP network. Subscriber data traffic and video ingress these uplink connections for delivery to the cable access network. The cBR-8 SUP-160 has 8x10GE SFP+ physical connections, the SUP-250 has 2xQSFP28/QSFP+ interfaces for 40G/100G upstream connections.In a remote PHY deployment the downstream connections to the CIN are via the Digital PIC (DPIC-8X10G) providing 40G of R-PHY throughput with 8 SFP+ network interfaces.cBR-8 RedundancyThe cBR-8 supports both upstream and downstream redundancy. Supervisor redundancy uses active/standby connections to the SP network. Downstream redundancy can be configured at both the line card and port level. Line card redundancy uses an active/active mechanism where each RPD connects to the DOCSIS core function on both the active and hot standby Digital PIC line card. Port redundancy uses the concept of “port pairs” on each Digital PIC, with ports 0/1, 2/3, 4/6, and 6/7 using either an active/active (L2) or active/standby (L3) mechanism. In the CST design we utilize a L3 design with the active/standby mechanism. The mechanism uses the same IP address on both ports, with the standby port kept in a physical down state until switchover occurs.Remote PHY CommunicationDHCPThe RPD is provisioned using ZTP (Zero Touch Provisioning). DHCPv4 and DHCPv6 are used along with CableLabs DHCP options in order to attach the RPD to the correct GCP server for further provisioning.Remote PHY Standard FlowsThe following diagram shows the different core functions of a Remote PHY solution and the communication between those elements.GCPGeneric Communications Protocol is used for the initial provisioning of the RPD. When the RPD boots and received its configuration via DHCP, one of the DHCP options will direct the RPD to a GCP server which can be the cBR-8 or Cisco Smart PHY. GCP runs over TCP typically on port 8190.UEPI and DEPI L2TPv3 TunnelsThe upstream output from an RPD is IP/Ethernet, enabling the simplification of the cable access network. Tunnels are used between the RPD PHY functions and DOCSIS core components to transport signals from the RPD to the core elements, whether it be a hardware device like the Cisco cBR-8 or a virtual network function provided by the Cisco cnBR (cloud native Broadband Router).DEPI (Downstream External PHY Interface) comes from the M-CMTS architecture, where a distributed architecture was used to scale CMTS functions. In the Remote PHY architecture DEPI represents a tunnel used to encapsulate and transport from the DOCSIS MAC function to the RPD. 
UEPI (Upstream External PHY Interface) is new to Remote PHY, and is used to encode and transport analog signals from the RPD to the MAC function.In Remote PHY both DEPI and UEPI tunnels use L2TPv3, defined in RFC 3931, to transport frames over an IP infrastructure. Please see the following Cisco white paper for more information on how tunnels are created specific to upstream/downstream channels and how data is encoded in the specific tunnel sessions. https#//www.cisco.com/c/en/us/solutions/collateral/service-provider/converged-cable-access-platform-ccap-solution/white-paper-c11-732260.html. In general there will be one or two (standby configuration) UEPI and DEPI L2TPv3 tunnels to each RPD, with each tunnel having many L2TPv3 sessions for individual RF channels identified by a unique session ID in the L2TPv3 header. Since L2TPv3 is its own protocol, no port number is used between endpoints, the endpoint IP addresses are used to identify each tunnel. Unicast DOCSIS data traffic can utilize either or multicast L2TPv3 tunnels. Multicast tunnels are used with downstream virtual splitting configurations. Multicast video is encoded and delivered using DEPI tunnels as well, using a multipoint L2TPv3 tunnel to multiple RPDs to optimize video delivery.CIN Network RequirementsIPv4/IPv6 Unicast and MulticastDue to the large number of elements and generally greenfield network builds, the CIN network must support all functions using both IPv4 and IPv6. IPv6 may be carried natively across the network or within an IPv6 VPN across an IPv4 MPLS underlay network. Similarly the network must support multicast traffic delivery for both IPv4 and IPv6 delivered via the global routing table or Multicast VPN. Scalable dynamic multicast requires the use of PIMv4, PIMv6, IGMPv3, and MLDv2 so these protocols are validated as part of the overall network design. IGMPv2 and MLDv2 snooping are also required for designs using access bridge domains and BVI interfaces for aggregation.Network TimingFrequency and phase synchronization is required between the cBR-8 and RPD to properly handle upstream scheduling and downstream transmission. Remote PHY uses PTP (Precision Timing Protocol) for timing synchronization with the ITU-T G.8275.2 timing profile. This profile carries PTP traffic over IP/UDP and supports a network with partial timing support, meaning multi-hop sessions between Grandmaster, Boundary Clocks, and clients as shown in the diagram below. The cBR-8 and its client RPD require timing alignment to the same Primary Reference Clock (PRC). In order to scale, the network itself must support PTP G.8275.2 as a T-BC (Boundary Clock). Synchronous Ethernet (SyncE) is also recommended across the CIN network to maintain stability when timing to the PRC.QoSControl plane functions of Remote PHY are critical to achieving proper operation and subscriber traffic throughput. QoS is required on all RPD-facing ports, the cBR-8 DPIC ports, and all core interfaces in between. Additional QoS may be necessary between the cBR-8, RPD, and any PTP timing elements. See the design section for further details on QoS components.DHCPv4 and DHCPv6 RelayAs a critical component of the initial boot and provisioning of RPDs, the network must support DHCP relay functionality on all RPD-facing interfaces, for both IPv4 and IPv6.Converged SDN Transport CIN DesignDeployment Topology OptionsThe Converged SDN Transport design is extremely flexible in how Remote PHY components are deployed. 
Depending on the size of the deployment, components can be deployed in a scalable leaf-spine fabric with dedicated routers for RPD and cBR-8 DPIC connections or collapsed into a single pair of routers for smaller deployments. If a smaller deployment needs to be expanded, the flexible L3 routed design makes it very easy to simply interconnect new devices and scale the design to a fabric supporting thousands of RPD and other access network connections.High Scale Design (Recommended)This option maximizes statistical multiplexing by aggregating Digital PIC downstream connections on a separate leaf device, allowing one to connect a number of cBR-8 interfaces to a fabric with minimal 100GE uplink capacity. The topology also supports the connectivity of remote shelves for hub consolidation. Another benefit is the fabric has optimal HA and the ability to easily scale with more leaf and spine nodes.High scale topologyCollapsed Digital PIC and SUP Uplink ConnectivityThis design for smaller deployments connects both the downstream Digital PIC connections and uplinks on the same CIN core device. If there is enough physical port availability and future growth does not dictate capacity beyond these nodes this design can be used. This design still provides full redundancy and the ability to connect RPDs to any cBR-8. Care should be taken to ensure traffic between the DPIC and RPD does not traverse the SUP uplink interfaces.Collapsed cBR-8 uplink and Digital PIC connectivityCollapsed RPD and cBR-8 DPIC ConnectivityThis design connects each cBR-8 Digital PIC connection to the RPD leaf connected to the RPDs it will serve. This design can also be considered a “pod” design where cBR-8 and RPD connectivity is pre-planned. Careful planning is needed since the number of ports on a single device may not scale efficiently with bandwidth in this configuration.Collapsed or Pod cBR-8 Digital PIC and RPD connectivityIn the collapsed desigs care must be taken to ensure traffic between each RPD can reach the appropriate DPIC interface. If a leaf is single-homed to the aggregation router its DPIC interface is on, RPDs may not be able to reach their DPIC IP. The options with the shortest convergence time are# Adding interconnects between the agg devices or multiple uplinks from the leaf to agg devices.Cisco HardwareThe following table highlights the Cisco hardware utilized within the Converged SDN Transport design for Remote PHY. This table is non-exhaustive. One highlight is all NCS platforms listed are built using the same NPU family and share most features across all platforms. See specific platforms for supported scale and feature support. Product Role 10GE SFP+ 25G SFP28 100G QSFP28 Timing Comments NCS-55A1-24Q6H-S RPD leaf 48 24 6 Class B   N540-ACC-SYS RPD leaf 24 8 2 Class B Smaller deployments NCS-55A1-48Q6H-S DPIC leaf 48 48 6 Class B   NCS-55A2-MOD Remote agg 40 24 upto 8 Class B CFP2-DCO support NCS-55A1-36H-S Spine 144 (breakout) 0 36 Class B   NCS-5502 Spine 192 (breakout) 0 48 None   NCS-5504 Multi Upto 576 x Upto 144 Class B 4-slot modular platform Scalable L3 Routed DesignThe Cisco validated design for cable CIN utilizes a L3 design with or without Segment Routing. Pure L2 networks are no longer used for most networks due to their inability to scale, troubleshooting difficulty, poor network efficiency, and poor resiliency. 
L2 bridging can be utilized on RPD aggregation routers to simplify RPD connectivity.L3 IP RoutingLike the overall CST design, we utilize IS-IS for IPv4 and IPv6 underlay routing and BGP to carry endpoint information across the network. The following diagram illustrates routing between network elements using a reference deployment. The table below describes the routing between different functions and interfaces. See the implementation guide for specific configuration. Interface Routing Comments cBR-8 Uplink IS-IS Used for BGP next-hop reachability to SP Core cBR-8 Uplink BGP Advertise subscriber and cable-modem routes to SP Core cBR-8 DPIC Static default in VRF Each DPIC interface should be in its own VRF on the cBR-8 so it has a single routing path to its connected RPDs RPD Leaf Main IS-IS Used for BGP next-hop reachability RPD Leaf Main BGP Advertise RPD L3 interfaces to CIN for cBR-8 to RPD connectivity RPD Leaf Timing BGP Advertise RPD upstream timing interface IP to rest of network DPIC Leaf IS-IS Used for BGP next-hop reachability DPIC Leaf BGP Advertise cBR-8 DPIC L3 interfaces to CIN for cBR-8 to RPD connectivity CIN Spine IS-IS Used for reachability between BGP endpoints, the CIN Spine does not participate in BGP in a SR-enabled network CIN Spine RPD Timing IS-IS Used to advertise RPD timing interface BGP next-hop information and advertise default CIN Spine BGP (optional) In a native IP design the spine must learn BGP routes for proper forwarding CIN Router to Router InterconnectionIt is recommended to use multiple L3 links when interconnecting adjacent routers, as opposed to using LAG, if possible. Bundles increase the possibility for timing inaccuracy due to asymmetric timing traffic flow between slave and master. If bundle interfaces are utilized, care should be taken to ensure the difference in paths between two member links is kept to a minimum. All router links will be configured according to the global CST design. Leaf devices will be considered CST access PE devices and utilize BGP for all services routing.Leaf Transit TrafficIn a single IGP network with equal IGP metrics, certain link failures may cause a leaf to become a transit node. Several options are available to keep transit traffic from transiting a leaf and potentially causing congestion. Using high metrics on all leaf to agg uplinks will prohibit this and is recommended in all configurations.cBR-8 DPIC to CIN InterconnectionThe cBR-8 supports two mechanisms for DPIC high availability outlined in the overview section. DPIC line card and link redundancy is recommended but not a requirement. In the CST reference design, if link redundancy is being used each port pair on the active and standby line cards is connected to a different router and the default active ports (even port number) is connected to a different router. In the example figure, port 0 from active DPIC card 0 is connected to R1 and port 0 from standby DPIC card 1 is connected to R2. DPIC link redundancy MUST be configured using the “cold” method since the design is using L3 to each DPIC interface and no intermediate L2 switching. This is done with the cable rphy link redundancy cold global command and will keep the standby link in a down/down state until switchover occurs.DPIC line card and link HADPIC Interface ConfigurationEach DPIC interface should be configured in its own L3 VRF. 
This ensures traffic from an RPD assigned to a specific DPIC interface takes the traffic path via the specific interface and does not traverse the SUP interface for either ingress or egress traffic. It’s recommended to use a static default route within each DPIC VRF towards the CIN network. Dynamic routing protocols could be utilized, however it will slow convergence during redundancy switchover.Router Interface ConfigurationIf no link redundancy is utilized each DPIC interface will connect to the router using a point to point L3 interface.If using cBR-8 link HA, failover time is reduced by utilizing the same gateway MAC address on each router. Link HA uses the same IP and MAC address on each port pair on the cBR-8, and retains routing and ARP information for the L3 gateway. If a different MAC address is used on each router, traffic will be dropped until an ARP occurs to populate the GW MAC address on the router after failover. On the NCS platforms, a static MAC address cannot be set on a physical L3 interface. The method used to set a static MAC address is to use a BVI (Bridged Virtual Interface), which allows one to set a static MAC address. In the case of DPIC interface connectivity, each DPIC interface should be placed into its own bridge domain with an associated BVI interface. Since each DPIC port is directly connected to the router interface, the same MAC address can be utilized on each BVI.If using IS-IS to distribute routes across the CIN, each DPIC physical interface or BVI should be configured as a passive IS-IS interface in the topology. If using BGP to distribute routing information the “redistribute connected” command should be used with an appropriate route policy to restrict connected routes to only DPIC interface. The BGP configuration is the same whether using L3VPN or the global routing table.It is recommended to use a /31 for IPv4 and /127 for IPv6 addresses for each DPIC port whether using a L3 physical interface or BVI on the CIN router.RPD to Router InterconnectionThe Converged SDN Transport design supports both P2P L3 interfaces for RPD and DPIC aggregation as well as using Bridge Virtual Interfaces. A BVI is a logical L3 interface within a L2 bridge domain. In the BVI deployment the DPIC and RPD physical interfaces connected to a single leaf device share a common IP subnet with the gateway residing on the leaf router.It is recommended to configure the RPD leaf using bridge-domains and BVI interfaces. This eases configuration on the leaf device as well as the DHCP configuration used for RPD provisioning.The following shows the P2P and BVI deployment options.Native IP or L3VPN/mVPN DeploymentTwo options are available and validated to carry Remote PHY traffic between the RPD and MAC function. Native IP means the end to end communication occurs as part of the global routing table. In a network with SR-MPLS deployed such as the CST design, unicast IP traffic is still carried across the network using an MPLS header. This allows for fast reconvergence in the network by using SR and enabled the network to carry other VPN services on the network even if they are not used to carry Remote PHY traffic. In then native IP deployment, multicast traffic uses either PIM signaling with IP multicast forwarding or mLDP in-band signaling for label-switched multicast. The multicast profile used is profile 7 (Global mLDP in-band signaling). L3VPN and mVPN can also be utilized to carry Remote PHY traffic within a VPN service end to end. 
This has the benefit of separating Remote PHY traffic from the network underlay, improving security and treating Remote PHY as another service on a converged access network. Multicast traffic in this use case uses mVPN profile 14. mLDP is used for label-switched multicast, and the NG-MVPN BGP control plane is used for all multicast discovery and signaling. SR-TESegment Routing Traffic Engineering may be utilized to carry traffic end to end across the CIN network. Using On-Demand Networking simplifies the deployment of SR-TE Policies from ingress to egress by using specific color BGP communities to instruct head-end nodes to create policies satisfying specific user constraints. As an example, if RPD aggregation prefixes are advertised using BGP to the DPIC aggregation device, SR-TE tunnels following a user constraint can be built dynamically between those endpoints.CIN Quality of Service (QoS)QoS is a requirement for delivering trouble-free Remote PHY. This design uses sample QoS configurations for concept illustration, but QoS should be tailored for specific network deployments. New CIN builds can utilize the configurations in the implementation guide verbatim if no other services are being carried across the network. Please see the section in this document on QoS for general NCS QoS information and the implementation guide for specific details.CST Network Traffic ClassificationThe following lists specific traffic types which should be treated with specific priority, default markings, and network classification points. Traffic Type Ingress Interface Priority Default Marking Comments BGP Routers, cBR-8 Highest CS6 (DSCP 48) None IS-IS Routers, cBR-8 Highest CS6 IS-IS is single-hop and uses highest priority queue by default BFD Routers Highest CS6 BFD is single-hop and uses highest priority queue by default DHCP RPD High CS5 DHCP COS is set explicitly PTP All High DSCP 46 Default on all routers, cBR-8, and RPD DOCSIS MAP/UCD RPD, cBR-8 DPIC High DSCP 46 DOCSIS BWR RPD, cBR-8 DPIC High DSCP 46 GCP RPD, cBR-8 DPIC Low DSCP 0 DOCSIS Data RPD, cBR-8 DPIC Low DSCP 0 Video cBR-8 Medium DSCP 32 Video within multicast L2TPv3 tunnel when cBR-8 is video core MDD RPD, cBR-8 Medium DSCP 40 CST and Remote-PHY Load BalancingUnicast network traffic is load balanced based on MPLS labels and IP header criteria. The devices used in the CST design are capable of load balancing traffic based on MPLS labels used in the SR underlay and IP headers underneath any MPLS labels. In the higher bandwidth downstream direction, where a series of L2TPv3 tunnels are created from the cBR-8 to the RPD, traffic is hashed based on the source and destination IP addresses of those tunnels. Downstream L2TPv3 tunnels from a single Digital PIC interface to a set of RPDs will be distributed across the fabric based on RPD destination IP address. The following illustrates unicast load balancing across the network.Multicast traffic is not load balanced across the network. Whether the network is utilizing PIMv4, PIMv6, or mVPN, a multicast flow with two equal cost downstream paths will utilize only a single path, and only a single member link will be utilized in a link bundle. If using multicast, ensure sufficient bandwidth is available on a single link between two adjacencies.4G Transport and Services ModernizationWhile talk about deploying 5G services has reached a fever pitch, many providers are continuing to build and evolve their 4G networks. 
New services require more agile and scalable networks, satisfied by Cisco's Converged SDN Transport. The services modernization found in Converged SDN Transport 2.0 follows work done in EPN 4.0, located here: https://www.cisco.com/c/dam/en/us/td/docs/solutions/Enterprise/Mobility/EPN/4_0/EPN_4_Transport_Infrastructure_DIG.pdf. Transport modernization requires simplification and new abilities. We evolve the EPN 4.0 design, based on LDP and hierarchical BGP-LU, to one using Segment Routing with an MPLS data plane and the SR-PCE to add inter-domain path computation, scale, and programmability. L3VPN-based 4G services remain, but are modernized to utilize SR-TE On-Demand Next-Hop, reducing provisioning complexity, increasing scale, and adding advanced path computation constraints. 4G services utilizing L2VPN, such as VPWS and VPLS, transition to EVPN services. EVPN is the modern replacement for legacy LDP-signaled L2VPN services, reducing complexity and adding advanced multi-homing functionality. The following table highlights the legacy and new ways of delivering services for 4G.

| Element | EPN 4.0 | Converged SDN Transport |
|---|---|---|
| Intra-domain MPLS Transport | LDP | IS-IS w/Segment Routing |
| Inter-domain MPLS Transport | BGP Labeled Unicast | SR using SR-PCE for Computation |
| MPLS L3VPN (LTE S1,X2) | MPLS L3VPN | MPLS L3VPN w/ODN |
| L2VPN VPWS | LDP Pseudowire | EVPN VPWS w/ODN |
| eMBMS Multicast | Native / mLDP | Native / mLDP |

The CST 4G Transport modernization covers only MPLS-based access and not L2 access scenarios.

L3 IP Multicast and mVPN
IP multicast continues to be an optimization method for delivering content traffic to many endpoints, especially traditional broadcast video. Unicast content dominates the traffic patterns of most networks today, but multicast carries critical high-value services, so proper design and implementation is required. In Converged SDN Transport 2.0 we introduced multicast edge and core validation for native IPv4/IPv6 multicast using PIM, global multicast using in-band mLDP (profile 7), and mVPN using mLDP with in-band signaling (profile 6). Converged SDN Transport 3.0 extends this functionality by adding support for mLDP LSM with the NG-MVPN BGP control plane (profile 14). Using BGP signaling adds additional scale to the network over in-band mLDP signaling and fits with the overall design goals of CST. More information about deployment of profile 14 can be found in the Converged SDN Transport implementation guide. Converged SDN Transport 3.0 supports mLDP-based label-switched multicast within a single domain and across IGP domain boundaries. In the Converged SDN Transport design, multicast has been tested with the source and receivers on both access and ABR PE devices.

| Supported Multicast Profiles | Description |
|---|---|
| Profile 6 | mLDP VRF using in-band signaling |
| Profile 7 | mLDP global routing table using in-band signaling |
| Profile 14 | Partitioned MDT using BGP-AD and BGP c-multicast signaling |

LDP Auto-configuration
LDP can automatically be enabled on all IS-IS interfaces with the following configuration in the IS-IS configuration:
router isis ACCESS
 address-family ipv4 unicast
  mpls ldp auto-config

LDP mLDP-only Session Capability (RFC 7473)
In Converged SDN Transport 3.0 we introduce the ability to advertise only mLDP state on each router adjacency, eliminating the need to filter LDP unicast FECs from advertisement into the network. 
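Stepping back to the NG-MVPN profile 14 support introduced above, the following is a minimal PE-side sketch of the pieces involved: an RPF route-policy selecting the mLDP partitioned P2MP core tree, the partitioned MDT and BGP auto-discovery under multicast-routing, and BGP C-multicast signaling under PIM. The VRF name and AS number are placeholders; the Converged SDN Transport implementation guide contains the validated profile 14 configuration.

route-policy MVPN-PARTITIONED-MLDP
 set core-tree mldp-partitioned-p2mp
end-policy
!
multicast-routing
 vrf RPHY-VPN
  address-family ipv4
   mdt source Loopback0
   mdt partitioned mldp ipv4 p2mp
   bgp auto-discovery mldp
   interface all enable
  !
 !
!
router pim
 vrf RPHY-VPN
  address-family ipv4
   rpf topology route-policy MVPN-PARTITIONED-MLDP
   mdt c-multicast-routing bgp
!
router bgp 100
 address-family ipv4 mvpn
 !
 vrf RPHY-VPN
  address-family ipv4 mvpn

The ipv4 mvpn address-family must also be carried on the sessions to the services route reflectors so that the auto-discovery and C-multicast routes are exchanged between PEs.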
The mLDP-only restriction is implemented using the SAC (State Advertisement Control) TLV in the LDP initialization messages, which advertises the LDP FEC classes a node wishes to receive from an adjacent peer. We can restrict the capabilities to mLDP only using the following configuration. Please see the implementation guide and configurations for the full LDP configuration.
mpls ldp
 capabilities sac mldp-only

LDP Unicast FEC Filtering for SR Unicast with mLDP Multicast
The following is provided for historical context; please see the section above regarding disabling LDP unicast FECs using session capability advertisements. The Converged SDN Transport design utilizes Segment Routing with the MPLS data plane for all unicast traffic. The first phase of multicast support in Converged SDN Transport 2.0 uses mLDP, both for existing mLDP-based networks and for new networks wishing to utilize label-switched multicast across the core. LDP is enabled on an interface for both unicast and multicast by default. Since SR is being used for unicast, LDP unicast FECs must be filtered out to ensure they are not distributed across the network. SR is used for all unicast traffic even in the presence of an LDP FEC for the same prefix, but filtering the unicast FECs reduces control-plane activity, may aid in re-convergence, and simplifies troubleshooting. The following should be applied to all interfaces which have mLDP enabled:
ipv4 access-list no-unicast-ldp
 10 deny ipv4 any any
!
RP/0/RSP0/CPU0:Node-6#show run mpls ldp
mpls ldp
 log neighbor
 address-family ipv4
  label local allocate for no-unicast-ldp

EVPN Multicast
Multicast within an L2VPN EVPN has been supported since Converged SDN Transport 1.0. Multicast traffic within an EVPN is replicated to the endpoints interested in a specific group via EVPN signaling. EVPN utilizes ingress replication for all multicast traffic, meaning multicast is encapsulated with a specific EVPN label and unicast to each PE router with interested listeners for each multicast group. Ingress replication may add additional traffic to the network, but simplifies the core and data plane by eliminating multicast signaling, state, and hardware replication. EVPN multicast is also not subject to domain boundary restrictions.

LDP to Converged SDN Transport Migration
Very few networks today are built greenfield; most new designs are migrations from existing networks and must support some level of interoperability during the transition. In the Converged SDN Transport design we tackle one of the most common migration scenarios: LDP to the Converged SDN Transport design. The following sections explain the configuration and best practices for performing the migration. The design is applicable to transport and services originating and terminating in the same LDP domain.

Towards Converged SDN Transport Design
The Converged SDN Transport design utilizes isolated IGP domains in different parts of the network, with each domain separated at a logical boundary by an ASBR router. SR-PCE is used to provide end-to-end paths across the inter-domain network. LDP does not support inter-domain transport; it operates only between LDP FECs in the same IGP domain. It is recommended to plan logical boundaries, if necessary, when doing a flat LDP migration to the Converged SDN Transport design, so that the future scale benefits can be realized when migration is complete.

Segment Routing Enablement
One must define the global Segment Routing Block (SRGB) to be used across the network on every node participating in SR. 
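A minimal sketch of the Segment Routing enablement steps discussed in this section is shown below, modeled on the configurations in the implementation guide later in this document; the SRGB range, prefix-SID value, and interface names are examples only.

segment-routing
 global-block 16000 23999
!
router isis ACCESS
 address-family ipv4 unicast
  segment-routing mpls sr-prefer
 !
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid absolute 16150
  !
 !
 interface TenGigE0/0/0/10
  point-to-point
  address-family ipv4 unicast
   fast-reroute per-prefix
   fast-reroute per-prefix ti-lfa

The sr-prefer keyword implements the migration behavior described below, steering unicast traffic onto the SR prefix-SID even while LDP remains enabled on the same interfaces.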
A default SRGB is enabled out of the box, but it may not be large enough to support an entire network, so it is advised to right-size this value for your deployment. The current maximum SRGB size for SR-MPLS is 256K entries. Enabling SR in IS-IS requires only issuing the command "segment-routing mpls" under the IPv4 address-family and assigning a prefix-SID value to any loopback interface used to address the node as a service destination. Enabling TI-LFA is done on a per-interface basis in the IS-IS configuration for each interface. Enabling SR-Prefer within IS-IS aids in migration by preferring a prefix's SR prefix-SID over its LDP label, allowing a seamless migration to SR without needing to enable SR completely within a domain.

Segment Routing Mapping Server Design
One component introduced with Segment Routing is the SR Mapping Server (SRMS), a control-plane element converting unicast LDP FECs to Segment Routing prefix-SIDs for advertisement throughout the Segment Routing domain. Each separate IGP domain requires a pair of SRMS nodes until full migration to SR is complete.

Automation
Zero Touch Provisioning
In addition to model-driven configuration and operation, Converged SDN Transport 1.5 supports ZTP operation for automated device provisioning. ZTP is useful in both production and staging environments to automate initial device software installation, deploy an initial bootstrap configuration, and trigger advanced functionality via ZTP scripts. ZTP is supported on out-of-band management interfaces as well as in-band data interfaces. When a device first boots, the IOS-XR ZTP process begins on the management interface of the device; if no response is received, or the interface is not active, the ZTP process continues on the data ports. IOS-XR can be part of an ecosystem of automated device and service provisioning via Cisco NSO.

Model-Driven Telemetry
In the 3.0 release the implementation guide includes a table of model-driven telemetry paths applicable to different components within the design. More information on Cisco model-driven telemetry can be found at https://www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/telemetry/66x/b-telemetry-cg-ncs5500-66x.html. Additional information about how to consume and visualize telemetry data can be found at https://xrdocs.io/telemetry. We also introduce integration with Cisco Crosswork Health Insights, a telemetry and automated remediation platform, and sensor packs corresponding to Converged SDN Transport components. More information on Crosswork Health Insights can be found at https://www.cisco.com/c/en/us/support/cloud-systems-management/crosswork-health-insights/model.html.

Network Services Orchestrator (NSO)
The NSO is a management and orchestration (MANO) solution for network services and Network Functions Virtualization (NFV). 
The NSO includescapabilities for describing, deploying, configuring, and managingnetwork services and VNFs, as well as configuring the multi-vendorphysical underlay network elements with the help of standard open APIssuch as NETCONF/YANG or a vendor-specific CLI using Network ElementDrivers (NED).In the Converged SDN Transport design, the NSO is used for ServicesManagement, Service Provisioning, and Service Orchestration.The NSO provides several options for service designing as shown inFigure 32 Service model with service template Service model with mapping logic Service model with mapping logic and servicetemplates Figure 32# NSO – ComponentsA service model is a way of defining a service in a template format.Once the service is defined, the service model accepts user inputs forthe actual provisioning of the service. For example, a E-Line servicerequires two endpoints and a unique virtual circuit ID to enable theservice. The end devices, attachment circuit UNI interfaces, and acircuit ID are required parameters that should be provided by the userto bring up the E-Line service. The service model uses the YANG modelinglanguage (RFC 6020) inside NSO to define a service.Once the service characteristics are defined based on the requirements,the next step is to build the mapping logic in NSO to extract the userinputs. The mapping logic can be implemented using Python or Java. Thepurpose of the mapping logic is to transform the service models todevice models. It includes mechanisms of how service related operationsare reflected on the actual devices. This involves mapping a serviceoperation to available operations on the devices.Finally, service templates need to be created in XML for each devicetype. In NSO, the service templates are required to translate theservice logic into final device configuration through CLI NED. The NSOcan also directly use the device YANG models using NETCONF for deviceconfiguration. These service templates enable NSO to operate in amulti-vendor environment.Converged SDN Transport Supported Service ModelsConverged SDN Transport 1.5 and later supports the following NSO service models for provisioning both hierarchical and flat services across the fabric. All NSO service modules in 1.5 utilize the IOS-XR and IOS-XE CLI NEDs for configuration.Figure 33# Automation – End-to-End Service ModelsFigure 34# Automation – Hierarchical Service ModelsServices – DesignOverviewThe Converged SDN Transport Design aims to enable simplification across alllayers of a Service Provider network. Thus, the Converged SDN Transportservices layer focuses on a converged Control Plane based on BGP.BGP based Services include EVPNs and Traditional L3VPNs (VPNv4/VPNv6).EVPN is a technology initially designed for Ethernet multipoint servicesto provide advanced multi-homing capabilities. By using BGP fordistributing MAC address reachability information over the MPLS network,EVPN brought the same operational and scale characteristics of IP basedVPNs to L2VPNs. Today, beyond DCI and E-LAN applications, the EVPNsolution family provides a common foundation for all Ethernet servicetypes; including E-LINE, E-TREE, as well as data center routing andbridging scenarios. EVPN also provides options to combine L2 and L3services into the same instance.To simplify service deployment, provisioning of all services is fullyautomated using Cisco Network Services Orchestrator (NSO) using (YANG)models and NETCONF. 
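To make the E-Line example above concrete, the configuration that such a service model and template might ultimately render on an IOS-XR endpoint could look like the following sketch; the interface, VLAN, EVI, and attachment-circuit identifiers are placeholder user inputs rather than values from the validated design.

interface TenGigE0/0/0/5.1001 l2transport
 encapsulation dot1q 1001
 rewrite ingress tag pop 1 symmetric
!
l2vpn
 xconnect group NSO-ELINE
  p2p ELINE-1001
   interface TenGigE0/0/0/5.1001
   neighbor evpn evi 1001 target 11 source 12

The far-end endpoint carries the mirrored target and source values so that the two attachment circuits are stitched into a single EVPN-VPWS E-Line.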
Refer to Section# “Network Services Orchestrator (NSO)”.There are two types of services# End-To-End and Hierarchical. The nexttwo sections describe these two types of services in more detail.Ethernet VPN (EVPN)EVPNs solve two long standing limitations for Ethernet Services inService Provider Networks# Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or withData Center Ethernet VPN Hardware SupportIn CST 3.0 EVPN ELAN, ETREE, and VPWS services are supported on all IOS-XR devices. The ASR920 running IOS-XE does not support native EVPN services, but can integrate into an overall EVPN service by utilizing service hierarchy. Please see the tables under Flat and Hierarchical Services for supported service types. Please note ODN is NOT supported for EVPN ELAN services in IOS-XR 6.6.3.Multi-Homed & All-Active Ethernet AccessFigure 21 demonstrates the greatest limitation of traditional L2Multipoint solutions likeVPLS.Figure 21# EVPN All-Active AccessWhen VPLS runs in the core, loop avoidance requires that PE1/PE2 andPE3/PE4 only provide Single-Active redundancy toward their respectiveCEs. Traditionally, techniques such mLACP or Legacy L2 protocols likeMST, REP, G.8032, etc. were used to provide Single-Active accessredundancy.The same situation occurs with Hierarchical-VPLS (H-VPLS), where theaccess node is responsible for providing Single-Active H-VPLS access byactive and backup spoke pseudowire (PW).All-Active access redundancy models are not deployable as VPLStechnology lacks the capability of preventing L2 loops that derive fromthe forwarding mechanisms employed in the Core for certain categories oftraffic. Broadcast, Unknown-Unicast and Multicast (BUM) traffic sourcedfrom the CE is flooded throughout the VPLS Core and is received by allPEs, which in turn flood it to all attached CEs. In our example PE1would flood BUM traffic from CE1 to the Core, and PE2 would sends itback toward CE1 upon receiving it.EVPN uses BGP-based Control Plane techniques to address this issue andenables Active-Active access redundancy models for either Ethernet orH-EVPN access.Figure 22 shows another issue related to BUM traffic addressed byEVPN.Figure 22# EVPN BUM DuplicationIn the previous example, we described how BUM is flooded by PEs over theVPLS Core causing local L2 loops for traffic returning from the core.Another issue is related to BUM flooding over VPLS Core on remote PEs.In our example either PE3 or PE4 receive and send the BUM traffic totheir attached CEs, causing CE2 to receive duplicated BUM traffic.EVPN also addresses this second issue, since the BGP Control Planeallows just one PE to send BUM traffic to an All-Active EVPN access.Figure 23 describes the last important EVPNenhancement.Figure 23# EVPN MAC Flip-FloppingIn the case of All-Active access, traffic is load-balanced (per-flow)over the access PEs (CE uses LACP to bundle multiple physical ethernetports and uses hash algorithm to achieve per flow load-balancing).Remote PEs, PE3 and PE4, receive the same flow from different neighbors.With a VPLS core, PE3 and PE4 would rewrite the MAC address tablecontinuously, each time the same mac address is seen from a differentneighbor.EVPN solves this by mean of “Aliasing”, which is also signaled via theBGP Control Plane.Service Provider Network - Integration with Central Office or with Data CenterAnother very important EVPN benefit is the simple integration withCentral Office (CO) or with Data Center (DC). 
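Before turning to Central Office and Data Center integration, the all-active multi-homing behavior described above can be illustrated with a brief sketch of an Ethernet Segment shared by PE1 and PE2; the bundle number, ESI value, and EVI below are illustrative placeholders.

evpn
 evi 100
  advertise-mac
 !
 interface Bundle-Ether10
  ethernet-segment
   identifier type 0 00.11.22.33.44.55.66.77.88
   ! all-active (per-flow) load balancing is the default behavior;
   ! single-active can be selected with load-balancing-mode single-active
  !
 !
!
interface Bundle-Ether10.100 l2transport
 encapsulation dot1q 100
!
l2vpn
 bridge group EVPN
  bridge-domain ELAN-100
   interface Bundle-Ether10.100
   !
   evi 100

With the same ESI configured on both PEs, the EVPN BGP routes provide designated-forwarder election for BUM traffic and the aliasing behavior that prevents the MAC flip-flopping described above.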
Note that Metro CentralOffice design is not covered by this document.The adoption of EVPNs provides huge benefits on how L2 Multipointtechnologies can be deployed in CO/DC. One such benefit is the convergedControl Plane (BGP) and converged data plane (SR MPLS/SRv6) over SP WANand CO/DC network.Moreover, EVPNs can replace existing proprietary EthernetMulti-Homed/All-Active solutions with a standard BGP-based ControlPlane.End-To-End (Flat) – ServicesThe End-To-End Services use cases are summarized in the table in Figure24 and shown in the network diagram in Figure 25.Figure 24# End-To-End – Services tableFigure 25# End-To-End – ServicesAll services use cases are based on BGP Control Plane.Refer also to Section# “Transport and Services Integration”.Hierarchical – ServicesHierarchical Services Use Cases are summarized in the table of Figure 26and shown in the network diagram of Figure 27.Figure 26# Hierarchical – Services tableFigure 27# Hierarchical - ServicesHierarchical services designs are critical for Service Providers lookingfor limiting requirements on the access platforms and deploying morecentralized provisioning models that leverage very rich features sets ona limited number of touch points.Hierarchical Services can also be required by Service Providers who wantto integrate their SP-WAN with the Central Office/Data Center networkusing well-established designs based on Data Central Interconnect (DCI).Figure 27 shows hierarchical services deployed on PE routers, but thesame design applies when services are deployed on AG or DCI routers.The Converged SDN Transport Design offers scalable hierarchical services withsimplified provisioning. The three most important use cases aredescribed in the following sections# Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service(H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) andPWHE Hierarchical L2 Multipoint Multi-Homed/All-ActiveFigure 28 shows a very elegant way to take advantage of the benefits ofSegment-Routing Anycast-SID and EVPN. This use case providesHierarchical L2 Multipoint Multi-Homed/All-Active (Single-Homed Ethernetaccess) service with traditional access routerintegration.Figure 28# Hierarchical – Services (Anycast-PW)Access Router A1 establishes a Single-Active static pseudowire(Anycast-Static-PW) to the Anycast IP address of PE1/PE2. PEs anycast IPaddress is represented by Anycast-SID.Access Router A1 doesn’t need to establish active/backup PWs as in atraditional H-VPLS design and doesn’t need any enhancement on top of theestablished spoke pseudowire design.PE1 and PE2 use BGP EVPN Control Plane to provide Multi-Homed/All-Activeaccess, protecting from L2 loop, and providing efficient per-flowload-balancing (with aliasing) toward the remote PEs (PE3/PE4).A3, PE3 and PE4 do the same, respectively.Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRBFigure 29 shows how EVPNs can completely replace the traditional H-VPLSsolution. 
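Stepping back to the anycast pseudowire access shown in Figure 28, the access-router side can be sketched as a static pseudowire targeting the anycast loopback shared by PE1/PE2; the anycast address, pw-id, and static labels below are placeholders rather than validated values.

interface TenGigE0/0/0/5.200 l2transport
 encapsulation dot1q 200
!
l2vpn
 xconnect group ANYCAST
  p2p ANYCAST-PW-200
   interface TenGigE0/0/0/5.200
   neighbor ipv4 100.0.100.1 pw-id 200
    mpls static label local 4000 remote 4000

Because the pseudowire destination is the Anycast-SID shared by the PE pair, no active/backup pseudowire logic is needed on the access router. Returning to the H-EVPN use case of Figure 29: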
This use case provides the greatest flexibility asHierarchical L2 Multi/Single-Home, All/Single-Active modes are availableat each layer of the servicehierarchy.Figure 29# Hierarchical – Services (H-EVPN)Optionally, Anycast-IRB can be used to enable Hierarchical L2/L3Multi/Single-Home, All/Single-Active service and to provide optimal L3routing.Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHEFigure 30 shows how the previous H-EVPN can be extended by takingadvantage of Pseudowire Headend (PWHE). PWHE with the combination ofMulti-Homed, Single-Active EVPN provides an Hierarchical L2/L3Multi-Homed/Single-Active (H-EVPN) solution that supports QoS.It completely replaces traditional H-VPLS based solutions. This use caseprovides Hierarchical L2 Multi/Single-Home, All/Single-Activeservice.Figure 30# Hierarchical – Services (H-EVPN and PWHE)Refer also to the section# “Transport and Services Integration”.Services – Route-Reflector (S-RR)Figure 31 shows the design of Services Router-Reflectors(S-RRs).Figure 31# Services – Route-ReflectorsThe Converged SDN Transport Design focuses mainly on BGP-based services,therefore it is important to provide a robust and scalable ServicesRoute-Reflector (S-RR) design.For Redundancy reasons, there are at least 2 S-RRs in any given IGPDomain, although Access and Aggregation are supported by the same pairof S-RRs.Each node participating in BGP-based service termination has two BGPsessions with Domain Specific S-RRs and supports multipleaddress-Families# VPNv4, VPNv6, EVPN.Core Domain S-RRs cover the core Domain. Aggregation Domain S-RRs coverAccess and Aggregation Domains. Aggregation Domain S-RRs and Core S-RRshave BGP sessions among each other.The described solution is very scalable and can be easily extended toscale to higher numbers of BGP sessions by adding another pair of S-RRsin the Access Domain.Ethernet Services OAM using Ethernet CFMEthernet CFM using 802.1ag/Y.1731 has been added in the CST 3.0 design. Ethernet CFM provides end-to-end continuity monitoring and alerting on a per-service basis. Maintenance End Points (MEPs) are configured on PE-CE interfaces with periodic Continuity Check Messages (CCMs) sent between them utilizing the same forwarding path as service traffic. Ethernet CFM also enables the transmission of Alarm Indication Signal (AIS) messages to alert remote endpoints of local faults. Additional information on Ethernet CFM can be found in the CST Implementation Guide at https#//xrdocs.io/design/blogs/latest-converged-sdn-transport-implementation-guideTransport and Services IntegrationSection# “Transport - Design” described how Segment Routing provides flexible End-To-End andAny-To-Any Highly-Available transport together with Fast Re-Route. Aconverged BGP Control Plane provides a scalable and flexible solutionalso at the services layer.Figure 35 shows a consolidated view of the Converged SDN Transport networkfrom a Control-Plane standpoint. Note that while network operators coulduse both PCEP and BGR-SR-TE at the same time, it is nottypical.Figure 35# Converged SDN Transport – Control-PlaneAs mentioned, service provisioning is independent of the transportlayer. However, transport is responsible for providing the path based onservice requirements (SLA). The component that enables such integrationis On-Demand Next Hop (ODN). ODN is the capability of requesting to acontroller a path that satisfies specific constraints (such as lowlatency). This is achieved by associating an SLA tag/attribute to thepath request. 
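In practice, the SLA tag is a color extended community attached to the service routes; a minimal sketch, modeled on the ODN configuration in the implementation guide, is shown below. The color value, community name, and latency metric type are illustrative choices for a low-latency SLA.

extcommunity-set opaque LOW-LATENCY
 100
end-set
!
route-policy ODN_EVPN
 set extcommunity color LOW-LATENCY
end-policy
!
router bgp 100
 neighbor-group SvRR
  address-family l2vpn evpn
   route-policy ODN_EVPN out
!
segment-routing
 traffic-eng
  on-demand color 100
   dynamic
    pce
    !
    metric
     type latency

When a service route carrying color 100 is received, the head-end instantiates an SR-TE Policy toward the BGP next hop using the constraints tied to that color.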
Upon receiving the request, the SR-PCE controller calculatesthe path based on the requested SLA and use PCEP or BGP-SR-TE todynamically program the Service End Point with a specific SR-TE Policy.The Converged SDN Transport design also use MPLS Performance Management tomonitor link delay/jitter/drop (RFC6374) to be able to create a LowLatency topology dynamically.Figure 36 shows a consolidated view of the Converged SDN Transport network froma Data Planestandpoint.Figure 36# Converged SDN Transport – Data-PlaneThe Converged SDN Transport DesignTransportThis section describes in detail the Converged SDN Transportdesign. This Converged SDN Transport design focuses on transport programmability using Segment Routing and BGP-basedservices adoption.Figure 35 and Figure 36 show the network topology and transport DataPlane details for Phase 1. Refer also to the Access domain extension usecase in Section# “Use Cases”.The network is split into Access and Core IGP domains. Each IGP domainis represented by separate IGP processes. The Converged SDN Transportdesign uses ISIS IGP protocol for validation.Validation will be done on two types of access platforms, IOS-XR andIOS-XE, to proveinteroperability.Figure 37# Access Domain Extension – End-To-End TransportFor the End-To-End LSP shown in Figure 35, the Access Router imposes 3transport labels (SID-list) An additional label, the TI-LFA label, canbe also added for FRR (node and link protection). In the Core and in theremote Access IGP Domain, 2 additional TI-LFA labels can be used for FRR(node and link protection). In Phase 1 PE ABRs are represented byPrefix-SID. Refer also to Section# “Transport Programmability - Phase 1”.Figure 38# Access Domain Extension – Hierarchical TransportFigure 38 shows how the Access Router imposes a single transport labelto reach local PE ABRs, where the hierarchical service is terminated.Similarly, in the Core and in the remote Access IGP domain, thetransport LSP is contained within the same IGP domain (Intra-DomainLSP). Routers in each IGP domain can also impose two additional TI-LFAlabels for FRR (to provide node and link protection).In the Hierarchical transport use case, PE ABRs are represented byAnycast-SID or Prefix-SID. Depending on the type of service, Anycast-SIDor Prefix-SID is used for the transport LSP.Transport ProgrammabilityThe Converged SDN Transport employs a distributed and highly available SR-PCEdesign as described in Section# “Transport Programmability”. Transport programmability is basedon PCEP. Figure 39 shows the design when SR-PCE uses PCEP.Figure 39# SR-PCE – PCEPSR-PCE in the Access domain is responsible for Inter-Domain LSPs andprovides the SID-list. PE ABRs are represented by Prefix-SID.SR-PCE in the Core domain is responsible for On-Demand Nexthop (ODN) forhierarchical services. Refer to the table in Figure 41 to see whatservices use ODN. Refer to Section# “Transport Controller - Path Computation Engine (PCE)” to see more details about XRTransport Controller (SR-PCE). 
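On each SR-TE head-end, the relationship with the redundant SR-PCE instances amounts to a short PCC configuration; the following sketch uses the addresses from the implementation guide as placeholders.

segment-routing
 traffic-eng
  pcc
   source-address ipv4 100.0.1.50
   pce address ipv4 100.0.1.101
   !
   pce address ipv4 100.1.1.101

Configuring multiple PCE addresses provides the redundancy expected from the distributed and highly available SR-PCE design described earlier.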
Note that Phase 1 uses the “DelegatedComputation to SR-PCE” mode described in Section# “Path Computation Engine - Workflow” without WAE as shownin Figure38.Figure 40# PCE Path Computation – Phase 1Delegated Computation to SR-PCE NSO provisions the service – Service can also be provisioned via CLI Access Router requests a path SR-PCE computes the path SR-PCE provides the path to Access Router Access Router confirms ServicesThis section describes the Services used in the Converged SDN TransportPhase 1.The table in Figure 41 describes the End-To-End services, while thenetwork diagram in Figure 42 shows how services are deployed in thenetwork. Refer also to Section# “Services - Design” of this document.Figure 41# End-To-End Services tableFigure 42# End-To-End ServicesThe table in Figure 43 describes the hierarchical services, while thenetwork diagram in Figure 44 shows how services are deployed in thenetwork. Refer also to Section# “Services - Design” of this document.In addition, the table in Figure 44 shows where PE ABRs Anycast-SID isrequired and where ODN in the Core IGP domain is used.Figure 43# Hierarchical Services tableFigure 44# Hierarchical ServicesThe Converged SDN Transport uses the hierarchical Services Route-Reflectors(S-RRs) design described in Section# “Services - Route-Reflector (S-RR)”. Figure 45 shows in detail the S-RRs design used for Phase 1.Figure 45# Services Route-Reflectors (S-RRs)Network Services Orchestrator (NSO) is used for service provisioning.Refer to Section# “Network Services Orchestrator (NSO)”.Transport and Services IntegrationTransport and Services integration is described in Section# “Transport and Services Integration” of this document. Figure 46 shows an example of End-To-End LSP and servicesintegration.Figure 46# Transport and Services Data-PlaneFigure 47 shows a consolidated view of the Transport and ServicesControl-Plane.Figure 47# Transport and Services Control-PlaneFigure 48 shows the detailed topology of the testbed used for validation.Figure 48# TestbedFigure 49 shows the detailed topology of the testbed used for CIN and Remote PHY validation.Figure 49# Remote PHY/CIN Validation TestbedThe Converged SDN Transport Design - SummaryThe Converged SDN Transport brings huge simplification at the Transport aswell as at the Services layers of a Service Provider network.Simplification is a key factor for real Software Defined Networking(SDN). Cisco continuously improves Service Provider network designs tosatisfy market needs for scalability and flexibility.From a very well established and robust Unified MPLS design, Cisco hasembarked on a journey toward transport simplification andprogrammability, which started with the Transport Control Planeunification in Evolved Programmable Network 5.0 (EPN5.0). The CiscoConverged SDN Transport provides another huge leap forward in simplification andprogrammability adding Services Control Plane unification andcentralized path computation.Figure 50# Converged SDN Transport – EvolutionThe transport layer requires only IGP protocols with Segment Routingextensions for Intra and Inter Domain forwarding. Fast recovery for nodeand link failures leverages Fast Re-Route (FRR) by Topology IndependentLoop Free Alternate (TI-LFA), which is a built-in function of SegmentRouting. End to End LSPs are built using Traffic Engineering by SegmentRouting, which does not require additional signaling protocols. Insteadit solely relies on SDN controllers, thus increasing overall networkscalability. 
The controller layer is based on standard industryprotocols like BGP-LS, PCEP, BGP-SR-TE, etc., for path computation andNETCONF/YANG for service provisioning, thus providing a on openstandards based solution.For all those reasons, the Cisco Converged SDN Transport design really brings anexciting evolution in Service Provider Networking.", "url": "/blogs/2020-03-09-converged-sdn-transport-convert/", "author": "Phil Bedard", "tags": "iosxr, Metro, Design, 5G, Cable, CIN" } , "#": {} , "blogs-2019-12-10-cst-implementation-guide": { "title": "Converged SDN Transport Implementation Guide", "content": " On This Page Targets Testbed Overview Devices Key Resources to Allocate Role-Based Router Configuration IOS-XR Nodes - SR-MPLS Transport Underlay physical interface configuration with BFD SRGB and SRLB Definition IGP protocol (ISIS) and Segment Routing MPLS configuration IS-IS router configuration IS-IS Loopback and node SID configuration IS-IS interface configuration with TI-LFA MPLS Segment Routing Traffic Engineering (SRTE) configuration MPLS Segment Routing Traffic Engineering (SRTE) TE metric configuration Interface delay metric static configuration IOS-XE Nodes - SR-MPLS Transport Segment Routing MPLS configuration Prefix-SID assignment to loopback 0 configuration IGP protocol (ISIS) with Segment Routing MPLS configuration TI-LFA FRR configuration IS-IS and MPLS interface configuration MPLS Segment Routing Traffic Engineering (SRTE) Area Border Routers (ABRs) IGP-ISIS Redistribution configuration (IOS-XR) Redistribute Core SvRR and TvRR loopback into Access domain Redistribute Access SR-PCE and SvRR loopbacks into CORE domain Multicast transport using mLDP Overview mLDP core configuration LDP base configuration with defined interfaces LDP auto-configuration G.8275.1 and G.8275.2 PTP (1588v2) timing configuration Summary Enable frequency synchronization Optional Synchronous Ethernet configuration (PTP hybrid mode) PTP G.8275.2 global timing configuration PTP G.8275.2 interface profile definitions IPv4 G.8275.2 master profile IPv6 G.8275.2 master profile IPv4 G.8275.2 slave profile IPv6 G.8275.2 slave profile PTP G.8275.1 global timing configuration IPv6 G.8275.1 slave profile IPv6 G.8275.1 master profile Application of PTP profile to physical interface G.8275.2 interface configuration G.8275.1 interface configuration Segment Routing Path Computation Element (SR-PCE) configuration BGP - Services (sRR) and Transport (tRR) route reflector configuration Services Route Reflector (sRR) configuration Transport Route Reflector (tRR) configuration BGP – Provider Edge Routers (A-PEx and PEx) to service RR IOS-XR configuration IOS-XE configuration BGP-LU co-existence BGP configuration Segment Routing Global Block Configuration Boundary node configuration PE node configuration Area Border Routers (ABRs) IGP topology distribution Segment Routing Traffic Engineering (SRTE) and Services Integration On Demand Next-Hop (ODN) configuration – IOS-XR On Demand Next-Hop (ODN) configuration – IOS-XE SR-PCE configuration – IOS-XR SR-PCE configuration – IOS-XE QoS Implementation Summary Core QoS configuration Class maps used in QoS policies Core ingress classifier policy Core egress queueing map Core egress MPLS EXP marking map H-QoS configuration Enabling H-QoS on NCS 540 and NCS 5500 Example H-QoS policy for 5G services Class maps used in ingress H-QoS policies Parent ingress QoS policy H-QoS ingress child policies Egress H-QoS parent policy (Priority levels) Egress H-QoS child using priority only Egress H-QoS 
child using reserved bandwidth Egress H-QoS child using shaping Services End-To-End VPN Services L3VPN MP-BGP VPNv4 On-Demand Next-Hop Access Router Service Provisioning (IOS-XR) Access Router Service Provisioning (IOS-XE) L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Access Router Service Provisioning (IOS-XR)# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Ethernet CFM for L2VPN service assurance Maintenance Domain configuration MEP configuration for EVPN-VPWS services Multicast NG-MVPN Profile 14 using mLDP and ODN L3VPN Multicast core configuration Unicast L3VPN PE configuration Multicast PE configuration End-To-End VPN Services Data Plane Hierarchical Services L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Router Service Provisioning (IOS-XR)# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# Remote PHY CIN Implementation Summary Sample QoS Policies Class maps RPD and DPIC interface policy maps Core QoS CIN Timing Configuration Example CBR-8 RPD DTI Profile Multicast configuration Summary Global multicast configuration - Native multicast Global multicast configuration - LSM using profile 14 PIM configuration - Native multicast PIM configuration - LSM using profile 14 IGMPv3/MLDv2 configuration - Native multicast IGMPv3/MLDv2 configuration - LSM profile 14 IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation) RPD DHCPv4/v6 relay configuration Native IP / Default VRF RPHY L3VPN cBR-8 DPIC interface configuration without Link HA cBR-8 DPIC interface configuration with Link HA cBR-8 Digital PIC Interface Configuration RPD interface configuration P2P L3 BVI RPD/DPIC agg device IS-IS configuration Additional configuration for L3VPN Design Global VRF Configuration BGP Configuration Targets Hardware# ASR 9000 as Centralized Provider Edge (C-PE) router NCS 5500 and NCS 55A2 as Aggregation and Pre-Aggregation router NCS 5500 as P core router ASR 920, NCS 540, and NCS 5500 as Access Provider Edge (A-PE) cBR-8 CMTS with 8x10GE DPIC for Remote PHY Compact Remote PHY shelf with three 1x2 Remote PHY Devices (RPD) Software# IOS-XR 6.6.3 on ASR 9000, NCS 540, NCS 5500, and NCS 55A2 routers IOS-XE 16.8.1 on ASR 920 IOS-XE 16.10.1f on cBR-8 Key technologies Transport# End-To-End Segment-Routing Network Programmability# SR- TE Inter-Domain LSPs with On-DemandNext Hop Network Availability# TI-LFA/Anycast-SID Services# BGP-based L2 and L3 Virtual Private Network services(EVPN and L3VPN/mVPN) Network Timing# G.8275.1 and G.8275.2 Network Assurance# 802.1ag Testbed OverviewFigure 1# Compass Converged SDN Transport High Level TopologyFigure 2# Testbed Physical TopologyFigure 3# Testbed Route-Reflector and SR-PCE physical connectivityFigure 4# Testbed IGP DomainsDevicesAccess PE (A-PE) Routers Cisco NCS5501-SE (IOS-XR) – A-PE7 Cisco NCS540 (IOS-XR) - A-PE1, A-PE2, A-PE3, A-PE8 Cisco ASR920 (IOS-XE) – A-PE4, A-PE5, A-PE6, A-PE9Pre-Aggregation (PA) Routers Cisco NCS5501-SE (IOS-XR) – PA3, 
PA4Aggregation (PA) Routers Cisco NCS5501-SE (IOS-XR) – AG1, AG2, AG3, AG4High-scale Provider Edge Routers Cisco ASR9000 (IOS-XR) – PE1, PE2, PE3, PE4Area Border Routers (ABRs) Cisco ASR9000 (IOS-XR) – PE3, PE4 Cisco 55A2-MOD-SE - PA2 Cisco NCS540 - PA1Service and Transport Route Reflectors (RRs) Cisco IOS XRv 9000 – tRR1-A, tRR1-B, sRR1-A, sRR1-B, sRR2-A, sRR2-B,sRR3-A, sRR3-BSegment Routing Path Computation Element (SR-PCE) Cisco IOS XRv 9000 – SRPCE-A1-A, SRPCE-A1-B, SRPCE-A2-A, SRPCE-A2-A, SRPCE-CORE-A, SRPCE-CORE-BKey Resources to Allocate IP Addressing IPv4 address plan IPv6 address plan, recommend dual plane day 1 Plan for SRv6 in the future Color communities for ODN Segment Routing Blocks SRGB (segment-routing address block) Keep in mind anycast SID for ABR node pairs Allocate 3 SIDs for potential future Flex-algo use SRLB (segment routing local block) Local significance only Can be quite small and re-used on each node IS-IS unique instance identifiers for each domainRole-Based Router ConfigurationIOS-XR Nodes - SR-MPLS TransportUnderlay physical interface configuration with BFDinterface TenGigE0/0/0/10 bfd mode ietf bfd address-family ipv4 timers start 180 bfd address-family ipv4 multiplier 3 bfd address-family ipv4 destination 10.1.2.1 bfd address-family ipv4 fast-detect bfd address-family ipv4 minimum-interval 50 mtu 9216 ipv4 address 10.15.150.1 255.255.255.254 ipv4 unreachables disable load-interval 30 dampeningSRGB and SRLB DefinitionIt’s recommended to first configure the Segment Routing Global Block (SRGB) across all nodes needing connectivity between each other. In most instances a single SRGB will be used across the entire network. In a SR MPLS deployment the SRGB and SRLB correspond to the label blocks allocated to SR. IOS-XR has a maximum configurable SRGB limit of 512,000 labels, however please consult platform-specific documentation for maximum values. The SRLB corresponds to the labels allocated for SIDs local to the node, such as Adjacency-SIDs. It is recommended to configure the same SRLB block across all nodes. The SRLB must not overlap with the SRGB. The SRGB and SRLB are configured in IOS-XR with the following configuration#segment-routing global-block 16000 23999 local-block 15000 15999 IGP protocol (ISIS) and Segment Routing MPLS configurationKey chain global configuration for IS-IS authenticationkey chain ISIS-KEY key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 00071A150754 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5 IS-IS router configurationAll routers, except Area Border Routers (ABRs), are part of one IGPdomain and L2 area (ISIS-ACCESS or ISIS-CORE). Area border routersrun two IGP IS-IS processes (ISIS-ACCESS and ISIS-CORE). Note that Loopback0 is part of both IGP processes.router isis ISIS-ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0101.0000.0110.00 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 5 secondary-wait 100 lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password keychain ISIS-KEY address-family ipv4 unicast metric-style wide advertise link attributes spf-interval maximum-wait 1000 initial-wait 5 secondary-wait 100 segment-routing mpls spf prefix-priority high tag 1000 maximum-redistributed-prefixes 100 level 2 ! 
address-family ipv6 unicast metric-style wide spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 maximum-redistributed-prefixes 100 level 2Note# ABR Loopback 0 on domain boundary is part of both IGP processes together with same “prefix-sid absolute” valueNote# The prefix SID can be configured as either absolute or index. The index configuration is required for interop with nodes using a different SRGB.IS-IS Loopback and node SID configuration interface Loopback0 ipv4 address 100.0.1.50 255.255.255.255 address-family ipv4 unicast prefix-sid absolute 16150 tag 1000 IS-IS interface configuration with TI-LFAIt is recommended to use manual adjacency SIDs. A protected SID is eligible for backup path computation, meaning if a packet ingresses the node with the label a backup path will be provided in case of a failure. In the case of having multiple adjacencies between the same two nodes, use the same adjacency-sid on each link. interface TenGigE0/0/0/10 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa adjacency-sid absolute 15002 protected metric 100 ! address-family ipv6 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 MPLS Segment Routing Traffic Engineering (SRTE) configurationThe following configuration is done at the global ISIS configuration level and should be performed for all IOS-XR nodes.router isis ACCESS address-family ipv4 unicast mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0MPLS Segment Routing Traffic Engineering (SRTE) TE metric configurationThe TE metric is used when computing SR Policy paths with the “te” or “latency” constraint type. The TE metric is carried as a TLV within the TE opaque LSA distributed across the IGP area and to the PCE via BGP-LS.The TE metric is used in the CST 5G Transport use case. If no TE metric is defined the local CSPF or PCE will utilize the IGP metric.segment-routing traffic-eng interface TenGigE0/0/0/6 metric 1000Interface delay metric static configurationIn the absence of dynamic realtime one-way latency monitoring for physical interfaces, the interface delay can be set manually. The one-way delay measurement value is used when computing SR Policy paths with the “latency” constraint type. The configured value is advertised in the IGP using extensions defined in RFC 7810, and advertised to the PCE using BGP-LS extensions. Keep in mind the delay metric value is defined in microseconds, so if you are mixing dynamic computation with static values they should be set appropriately.performance-measurement interface TenGigE0/0/0/10 delay-measurement advertise-delay 15000 interface TenGigE0/0/0/20 delay-measurement advertise-delay 10000IOS-XE Nodes - SR-MPLS TransportSegment Routing MPLS configurationmpls label range 6001 32767 static 16 6000segment-routing mpls ! set-attributes address-family ipv4 sr-label-preferred exit-address-family ! global-block 16000 24999 ! Prefix-SID assignment to loopback 0 configuration connected-prefix-sid-map address-family ipv4 100.0.1.51/32 index 151 range 1 exit-address-family ! 
IGP protocol (ISIS) with Segment Routing MPLS configurationkey chain ISIS-KEY key 1 key-string cisco accept-lifetime 00#00#00 Jan 1 2018 infinite send-lifetime 00#00#00 Jan 1 2018 infinite!router isis ACCESS net 49.0001.0102.0000.0254.00 is-type level-2-only authentication mode md5 authentication key-chain ISIS-KEY metric-style wide fast-flood 10 set-overload-bit on-startup 120 max-lsp-lifetime 65535 lsp-refresh-interval 65000 spf-interval 5 50 200 prc-interval 5 50 200 lsp-gen-interval 5 5 200 log-adjacency-changes segment-routing mpls segment-routing prefix-sid-map advertise-local TI-LFA FRR configuration fast-reroute per-prefix level-2 all fast-reroute ti-lfa level-2 microloop avoidance protected!interface Loopback0 ip address 100.0.1.51 255.255.255.255 ip router isis ACCESS isis circuit-type level-2-onlyend IS-IS and MPLS interface configurationinterface TenGigabitEthernet0/0/12 mtu 9216 ip address 10.117.151.1 255.255.255.254 ip router isis ACCESS mpls ip isis circuit-type level-2-only isis network point-to-point isis metric 100end MPLS Segment Routing Traffic Engineering (SRTE)router isis ACCESS mpls traffic-eng router-id Loopback0 mpls traffic-eng level-2interface TenGigabitEthernet0/0/12 mpls traffic-eng tunnels Area Border Routers (ABRs) IGP-ISIS Redistribution configuration (IOS-XR)The ABR nodes must provide IP reachability for RRs, SR-PCEs and NSO between ISIS-ACCESS and ISIS-CORE IGP domains. This is done by IPprefix redistribution. The ABR nodes have static hold-down routes for the block of IP space used in each domain across the network, those static routes are then redistributed into the domains using the redistribute static command with a route-policy. The distance command is used to ensure redistributed routes are not preferred over local IS-IS routes on the opposite ABR. The distance command must be applied to both ABR nodes.router staticaddress-family ipv4 unicast 100.0.0.0/24 Null0 100.0.1.0/24 Null0 100.1.0.0/24 Null0 100.1.1.0/24 Null0prefix-set ACCESS-PCE_SvRR-LOOPBACKS 100.0.1.0/24, 100.1.1.0/24end-setprefix-set RR-LOOPBACKS 100.0.0.0/24, 100.1.0.0/24end-set Redistribute Core SvRR and TvRR loopback into Access domainroute-policy CORE-TO-ACCESS1 if destination in RR-LOOPBACKS then pass else drop endifend-policy!router isis ACCESS address-family ipv4 unicast distance 254 0.0.0.0/0 RR-LOOPBACKS redistribute static route-policy CORE-TO-ACCESS1 Redistribute Access SR-PCE and SvRR loopbacks into CORE domainroute-policy ACCESS1-TO-CORE if destination in ACCESS-PCE_SvRR-LOOPBACKS then pass else drop endif end-policy ! router isis CORE address-family ipv4 unicast distance 254 0.0.0.0/0 ACCESS-PCE_SvRR-LOOPBACKS redistribute static route-policy CORE-TO-ACCESS1 Multicast transport using mLDPOverviewThis portion of the implementation guide instructs the user how to configure mLDP end to end across the multi-domain network. Multicast service examples are given in the “Services” section of the implementation guide.mLDP core configurationIn order to use mLDP across the Converged SDN Transport network LDP must first be enabled. There are two mechanisms to enable LDP on physical interfaces across the network, LDP auto-configuration or manually under the MPLS LDP configuration context. The capabilities statement will ensure LDP unicast FECs are not advertised, only mLDP FECs. Recursive forwarding is required in a multi-domain network. 
mLDP must be enabled on all participating A-PE, PE, AG, PA, and P routers.LDP base configuration with defined interfacesmpls ldp capabilities sac mldp-only mldp logging notifications address-family ipv4 make-before-break delay 30 forwarding recursive recursive-fec ! ! router-id 100.0.2.53 session protection address-family ipv4 ! interface TenGigE0/0/0/6 ! interface TenGigE0/0/0/7 LDP auto-configurationLDP can automatically be enabled on all IS-IS interfaces with the following configuration in the IS-IS configuration. It is recommended to do this only after configuring all MPLS LDP properties.router isis ACCESS address-family ipv4 unicast segment-routing mpls sr-prefer mpls ldp auto-config G.8275.1 and G.8275.2 PTP (1588v2) timing configurationSummaryThis section contains the base configurations used for both G.8275.1 and G.8275.2 timing. Please see the CST 3.0 HLD for an overview on timing in general.Enable frequency synchronizationIn order to lock the internal oscillator to a PTP source, frequency synchronization must first be enabled globally.frequency synchronization quality itu-t option 1 clock-interface timing-mode system log selection changes! Optional Synchronous Ethernet configuration (PTP hybrid mode)If the end-to-end devices support SyncE it should be enabled. SyncE will allow much faster frequency sync and maintain integrity for long periods of time during holdover events. Using SyncE for frequency and PTP for phase is known as “Hybrid” mode. A lower priority is used on the SyncE input (50 for SyncE vs. 100 for PTP).interface TenGigE0/0/0/10 frequency synchronization selection input priority 50 !! PTP G.8275.2 global timing configurationAs of CST 3.0, IOS-XR supports a single PTP timing profile and single clock type in the global PTP configuration. The clock domain should follow the ITU-T guidelines for specific profiles using a domain >44 for G.8275.2 clocks.ptp clock domain 60 profile g.8275.2 clock-type T-BC ! frequency priority 100 time-of-day priority 50 log servo events best-master-clock changes ! PTP G.8275.2 interface profile definitionsIt is recommended to use “profiles” defined globally which are then applied to interfaces participating in timing. This helps minimize per-interface timing configuration. It is also recommended to define different profiles for “master” and “slave” interfaces.IPv4 G.8275.2 master profileThe master profile is assigned to interfaces for which the router is acting as a boundary clockptp profile g82752_master_v4 transport ipv4 port state master-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 5 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv6 G.8275.2 master profileThe master profile is assigned to interfaces for which the router is acting as a boundary clockptp profile g82752_master_v6 transport ipv6 port state master-only sync frequency 16 clock operation one-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv4 G.8275.2 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82752_master_v4 transport ipv4 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! 
IPv6 G.8275.2 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82752_master_v6 transport ipv6 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! PTP G.8275.1 global timing configurationAs of CST 3.0, IOS-XR supports a single PTP timing profile and single clock type in the global PTP configuration. The clock domain should follow the ITU-T guidelines for specific profiles using a domain <44 for G.8275.1 clocks.ptpclock domain 24 operation one-step Use one-step for NCS series, two-step for ASR 9000 physical-layer-frequency frequency priority 100 profile g.8275.1 clock-type T-BClog servo events best-master-clock changes IPv6 G.8275.1 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82751_slave port state slave-only clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 delay-request frequency 16 multicast transport ethernet !! IPv6 G.8275.1 master profileThe master profile is assigned to interfaces for which the router is acting as a master to slave devicesptp profile g82751_slave port state master-only clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step sync frequency 16 announce timeout 10 announce interval 1 delay-request frequency 16 multicast transport ethernet !! Application of PTP profile to physical interfaceNote# In CST 3.0 PTP may only be enabled on physical interfaces. G.8275.1 operates at L2 and supports PTP across Bundle member links and interfaces part of a bridge domain. G.8275.2 operates at L3 and does not support Bundle interfaces or BVI interfaces.G.8275.2 interface configurationThis example is of a slave device using a master of 2405#10#23#253##0.interface TenGigE0/0/0/6 ptp profile g82752_slave_v6 master ipv6 2405#10#23#253## ! ! G.8275.1 interface configurationinterface TenGigE0/0/0/6 ptp profile g82751_slave ! ! Segment Routing Path Computation Element (SR-PCE) configurationrouter static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.100 bgp graceful-restart graceful-reset bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR !!pce address ipv4 100.100.100.1 rest user rest_user password encrypted 00141215174C04140B ! authentication basic ! state-sync ipv4 100.100.100.2 peer-filter ipv4 access-list pe-routers! BGP - Services (sRR) and Transport (tRR) route reflector configurationServices Route Reflector (sRR) configurationIn the CST validation a sRR is used to reflect all service routes. In a production network each service could be allocated its own sRR based on resiliency and scale demands.router static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.200 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast nexthop trigger-delay critical 10 additional-paths receive additional-paths send ! 
address-family vpnv6 unicast nexthop trigger-delay critical 10 additional-paths receive additional-paths send retain route-target all ! address-family l2vpn evpn additional-paths receive additional-paths send ! address-family ipv4 mvpn nexthop trigger-delay critical 10 soft-reconfiguration inbound always ! address-family ipv6 mvpn nexthop trigger-delay critical 10 soft-reconfiguration inbound always ! neighbor-group SvRR-Client remote-as 100 bfd fast-detect bfd minimum-interval 3 update-source Loopback0 address-family l2vpn evpn route-reflector-client ! address-family vpnv4 unicast route-reflector-client ! address-family vpnv6 unicast route-reflector-client ! address-family ipv4 mvpn route-reflector-client ! address-family ipv6 mvpn route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group SvRR-Client !! Transport Route Reflector (tRR) configurationrouter static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.10 bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state additional-paths receive additional-paths send ! neighbor-group RRC remote-as 100 update-source Loopback0 address-family link-state link-state route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group RRC ! neighbor 100.0.0.2 use neighbor-group RRC! BGP – Provider Edge Routers (A-PEx and PEx) to service RREach PE router is configured with BGP sessions to service route-reflectors for advertising VPN service routes across the inter-domain network.IOS-XR configurationrouter bgp 100 nsr bgp router-id 100.0.1.50 bgp graceful-restart graceful-reset bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! address-family l2vpn evpn ! neighbor-group SvRR remote-as 100 bfd fast-detect bfd minimum-interval 3 update-source Loopback0 address-family vpnv4 unicast soft-reconfiguration inbound always ! address-family vpnv6 unicast soft-reconfiguration inbound always ! address-family ipv4 mvpn soft-reconfiguration inbound always ! address-family ipv6 mvpn soft-reconfiguration inbound always ! address-family l2vpn evpn soft-reconfiguration inbound always ! ! neighbor 100.0.1.201 use neighbor-group SvRR ! ! IOS-XE configurationrouter bgp 100 bgp router-id 100.0.1.51 bgp log-neighbor-changes no bgp default ipv4-unicast neighbor SvRR peer-group neighbor SvRR remote-as 100 neighbor SvRR update-source Loopback0 neighbor 100.0.1.201 peer-group SvRR ! address-family ipv4 exit-address-family ! address-family vpnv4 neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! address-family l2vpn evpn neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! BGP-LU co-existence BGP configurationCST 3.0 introduced co-existence between services using BGP-LU and SR endpoints. If you are using SR and BGP-LU within the same domain it requires using BGP-SR in order to resolve prefixes correctly on the each ABR. BGP-SR uses a new BGP community attached to the BGP-LU prefix to convey the SR prefix-sid index end to end across the network. Using the same prefix-sid index both within the SR-MPLS IGP domain and across the BGP-LU network simplifies the network from an operational perspective since the path to an end node can always be identified by that SID.It is recommended to enable the BGP-SR configuration when enabling SR on the PE node. 
See the PE configuration below for an example of this configuration.Segment Routing Global Block ConfigurationThe BGP process must know about the SRGB in order to properly allocate local BGP-SR labels when receiving a BGP-LU prefix with a BGP-SR index community. This is done via the following configuration. If a SRGB is defined under the IGP it must match the global SRGB value. The IGP will inherit this SRGB value if none is previously defined.segment-routing global-block 32000 64000 !! Boundary node configurationThe following configuration is necessary on all domain boundary nodes. Note the ibgp policy out enforce-modifications command is required to change the next-hop on reflected IBGP routes.router bgp 100 ibgp policy out enforce-modifications neighbor-group BGP-LU-PE remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast soft-reconfiguration inbound always route-reflector-client next-hop-self ! ! neighbor-group BGP-LU-PE remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast soft-reconfiguration inbound always route-reflector-client next-hop-self ! ! neighbor 100.0.2.53 use neighbor-group BGP-LU-PE ! neighbor 100.0.2.52 use neighbor-group BGP-LU-PE ! neighbor 100.0.0.1 use neighbor-group BGP-LU-BORDER ! neighbor 100.0.0.2 use neighbor-group BGP-LU-BORDER ! ! PE node configurationThe following configuration is necessary on all domain PE nodes participating in BGP-LU/BGP-SR. The label-index set must match the index of the Loopback addresses being advertised into BGP. This example shows a single Loopback address being advertised into BGP.route-policy LOOPBACK-INTO-BGP-LU($SID-LOOPBACK0) set label-index $SID-LOOPBACK0 set aigp-metric igp-costend-policy!router bgp 100 address-family ipv4 unicast network 100.0.2.53/32 route-policy LOOPBACK-INTO-BGP-LU(153) ! neighbor-group BGP-LU-BORDER remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast ! ! neighbor 100.0.0.3 use neighbor-group BGP-LU-BORDER ! neighbor 100.0.0.4 use neighbor-group BGP-LU-BORDER ! Area Border Routers (ABRs) IGP topology distributionNext network diagram# “BGP-LS Topology Distribution” shows how AreaBorder Routers (ABRs) distribute IGP network topology from ISIS ACCESSand ISIS CORE to Transport Route-Reflectors (tRRs). tRRs then reflecttopology to Segment Routing Path Computation Element (SR-PCEs). Each SR-PCE has full visibility of the entire inter-domain network.Note# Each IS-IS process in the network requires a unique instance-id to identify itself to the PCE.Figure 5# BGP-LS Topology Distributionrouter isis ACCESS **distribute link-state instance-id 101** net 49.0001.0101.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0 !! router isis CORE **distribute link-state instance-id 100** net 49.0001.0100.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0 !! router bgp 100 **address-family link-state link-state** ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR ! Segment Routing Traffic Engineering (SRTE) and Services IntegrationThis section shows how to integrate Traffic Engineering (SRTE) withservices. ODN is configured by first defining a global ODN color associated with specific SR Policy constraints. 
The color and BGP next-hop address on the service route will be used to dynamically instantiate an SR Policy to the remote VPN endpoint.On Demand Next-Hop (ODN) configuration – IOS-XRsegment-routing traffic-eng logging policy status ! on-demand color 100 dynamic pce ! metric type igp ! ! ! pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! !extcommunity-set opaque BLUE 100end-setroute-policy ODN_EVPN set extcommunity color BLUEend-policyrouter bgp 100 address-family l2vpn evpn route-policy ODN_EVPN out !! On Demand Next-Hop (ODN) configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-allmpls traffic-eng auto-tunnel p2p config unnumbered-interface Loopback0mpls traffic-eng auto-tunnel p2p tunnel-num min 1000 max 5000!mpls traffic-eng lsp attributes L3VPN-SRTE path-selection metric igp pce!ip community-list 1 permit 9999!route-map L3VPN-ODN-TE-INIT permit 10 match community 1 set attribute-set L3VPN-SRTE!route-map L3VPN-SR-ODN-Mark-Comm permit 10 match ip address L3VPN-ODN-Prefixes set community 9999 !!router bgp 100 address-family vpnv4 neighbor SvRR send-community both neighbor SvRR route-map L3VPN-ODN-TE-INIT in neighbor SvRR route-map L3VPN-SR-ODN-Mark-Comm out SR-PCE configuration – IOS-XRsegment-routing traffic-eng pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! ! SR-PCE configuration – IOS-XE mpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-all QoS ImplementationSummaryPlease see the CST 3.0 HLD for in-depth information on design choices.Core QoS configurationThe core QoS policies defined for CST 3.0 utilize priority levels, with no bandwidth guarantees per traffic class. In a production network it is recommended to analyze traffic flows and determine an appropriate BW guarantee per traffic class. The core QoS uses four classes. Note the “video” class uses priority level 6 since only levels 6 and 7 are supported for high priority multicast. 
Traffic Type Priority Level Core EXP Marking     Network Control 1 6   Voice 2 5   High Priority 3 4   Video 6 2   Default 0 0 Class maps used in QoS policiesClass maps are used within a policy map to match packet criteria or internal QoS markings like traffic-class or qos-groupclass-map match-any match-ef-exp5 description High priority, EF match dscp 46 match mpls experimental topmost 5 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 match mpls experimental topmost 4 end-class-map!class-map match-any match-video-cs4-exp2 description Video match dscp 32 match mpls experimental topmost 2 end-class-map!class-map match-any match-cs6-exp6 description Highest priority control-plane traffic match dscp cs6 match mpls experimental topmost 6 end-class-map!class-map match-any match-qos-group-1 match qos-group 1 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-map!class-map match-any match-qos-group-6 match qos-group 3 end-class-map!class-map match-any match-traffic-class-1 description ~Match highest priority traffic-class 1~ match traffic-class 1 end-class-map!class-map match-any match-traffic-class-2 description ~Match high priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match medium traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-traffic-class-6 description ~Match video traffic-class 6~ match traffic-class 6 end-class-map Core ingress classifier policypolicy-map core-ingress-classifier class match-cs6-exp6 set traffic-class 1 ! class match-ef-exp5 set traffic-class 2 ! class match-cs5-exp4 set traffic-class 3 ! class match-video-cs4-exp2 set traffic-class 6 ! class class-default set mpls experimental topmost 0 set traffic-class 0 set dscp 0 ! end-policy-map! Core egress queueing mappolicy-map core-egress-queuing class match-traffic-class-2 priority level 2 queue-limit 100 us ! class match-traffic-class-3 priority level 3 queue-limit 500 us ! class match-traffic-class-6 priority level 6 queue-limit 500 us ! class match-traffic-class-1 priority level 1 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map! Core egress MPLS EXP marking mapThe following policy must be applied for PE devices with MPLS-based VPN services in order for service traffic classified in a specific QoS Group to be marked. VLAN-based P2P L2VPN services will by default inspect the incoming 802.1p bits and copy those the egress MPLS EXP if no specific ingress policy overrides that behavior. Note the EXP can be set in either an ingress or egress QoS policy. This QoS example sets the EXP via the egress map.policy-map core-egress-exp-marking class match-qos-group-1 set mpls experimental imposition 6 ! class match-qos-group-2 set mpls experimental imposition 5 ! class match-qos-group-3 set mpls experimental imposition 4 ! class match-qos-group-6 set mpls experimental imposition 2 ! class class-default set mpls experimental imposition 0 ! end-policy-map! H-QoS configurationEnabling H-QoS on NCS 540 and NCS 5500Enabling H-QoS on the NCS platforms requires the following global command and requires a reload of the device.hw-module profile qos hqos-enable Example H-QoS policy for 5G servicesThe following H-QoS policy represents an example QoS policy reserving 5Gbps on a sub-interface. On ingress each child class is policed to a certain percentage of the 5Gbps policer. 
In the egress queuing policy, shaping is used with guaranteed each class a certain amount of egress bandwidth, with high priority traffic being serviced in a low-latency queue (LLQ).Class maps used in ingress H-QoS policiesclass-map match-any edge-hqos-2-in match dscp 46 end-class-map!class-map match-any edge-hqos-3-in match dscp 40 end-class-map!class-map match-any edge-hqos-6-in match dscp 32 end-class-map Parent ingress QoS policypolicy-map hqos-ingress-parent-5g class class-default service-policy hqos-ingress-child-policer police rate 5 gbps ! ! end-policy-map H-QoS ingress child policiespolicy-map hqos-ingress-child-policer class edge-hqos-2-in set traffic-class 2 police rate percent 10 ! ! class edge-hqos-3-in set traffic-class 3 police rate percent 30 ! ! class edge-hqos-6-in set traffic-class 6 police rate percent 30 ! ! class class-default set traffic-class 0 set dscp 0 police rate percent 100 ! ! end-policy-map Egress H-QoS parent policy (Priority levels)policy-map hqos-egress-parent-4g-priority class class-default service-policy hqos-egress-child-priority shape average 4 gbps ! end-policy-map! Egress H-QoS child using priority onlyIn this policy all classes can access 100% of the bandwidth, queues are services based on priority level. The lower priority level has preference.policy-map hqos-egress-child-priority class match-traffic-class-2 shape average percent 100 priority level 2 ! class match-traffic-class-3 shape average percent 100 priority level 3 ! class match-traffic-class-6 priority level 4 shape average percent 100 ! class class-default ! end-policy-map Egress H-QoS child using reserved bandwidthIn this policy each class is reserved a certain percentage of bandwidth. Each class may utilize up to 100% of the bandwidth, if traffic exceeds the guaranteed bandwidth it is eligible for drop.policy-map hqos-egress-child-bw class match-traffic-class-2 bandwidth remaining percent 30 ! class match-traffic-class-3 bandwidth remaining percent 30 ! class match-traffic-class-6 bandwidth remaining percent 30 ! class class-default bandwidth remaining percent 10 ! end-policy-map Egress H-QoS child using shapingIn this policy each class is shaped to a defined amount and cannot exceed the defined bandwidth.policy-map hqos-egress-child-shaping class match-traffic-class-2 shape average percent 30 ! class match-traffic-class-3 shape average percent 30 ! class match-traffic-class-6 shape average percent 30 ! class class-default shape average percent 10 ! end-policy-map! ServicesEnd-To-End VPN ServicesFigure 6# End-To-End Services TableL3VPN MP-BGP VPNv4 On-Demand Next-HopFigure 7# L3VPN MP-BGP VPNv4 On-Demand Next-Hop Control PlaneAccess Routers# Cisco ASR920 IOS-XE and NCS540 IOS-XR Operator# New VPNv4 instance via CLI or NSO Access Router# Advertises/receives VPNv4 routes to/from ServicesRoute-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Please refer to “On Demand Next-Hop (ODN)” sections for initial ODN configuration.Access Router Service Provisioning (IOS-XR)ODN route-policy configurationextcommunity-set opaque ODN-GREEN 100end-setroute-policy ODN-L3VPN-OUT set extcommunity color ODN-GREEN passend-policy VRF definition configurationvrf ODN-L3VPN rd 100#1 address-family ipv4 unicast import route-target 100#1 ! 
export route-target export route-policy ODN-L3VPN-OUT 100#1 ! ! address-family ipv6 unicast import route-target 100#1 ! export route-target export route-policy ODN-L3VPN-OUT 100#1 ! ! VRF Interface configurationinterface TenGigE0/0/0/23.2000 mtu 9216 vrf ODN-L3VPN ipv4 address 172.106.1.1 255.255.255.0 encapsulation dot1q 2000 BGP VRF configuration with static/connected onlyrouter bgp 100 vrf VRF-MLDP rd auto address-family ipv4 unicast redistribute connected redistribute static ! address-family ipv6 unicast redistribute connected redistribute static ! Access Router Service Provisioning (IOS-XE)VRF definition configurationvrf definition L3VPN-SRODN-1 rd 100#100 route-target export 100#100 route-target import 100#100 address-family ipv4 exit-address-family VRF Interface configurationinterface GigabitEthernet0/0/2 mtu 9216 vrf forwarding L3VPN-SRODN-1 ip address 10.5.1.1 255.255.255.0 negotiation autoend BGP VRF configuration Static & BGP neighbor Static routing configurationrouter bgp 100 address-family ipv4 vrf L3VPN-SRODN-1 redistribute connected exit-address-family BGP neighbor configurationrouter bgp 100 neighbor Customer-1 peer-group neighbor Customer-1 remote-as 200 neighbor 10.10.10.1 peer-group Customer-1 address-family ipv4 vrf L3VPN-SRODN-2 neighbor 10.10.10.1 activate exit-address-family L2VPN Single-Homed EVPN-VPWS On-Demand Next-HopFigure 8# L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Advertises/receives EVPN-VPWS instance to/fromServices Route-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Note# Please refer to On Demand Next-Hop (ODN) – IOS-XR section for initial ODN configuration. The correct EVPN L2VPN routes must be advertised with a specific color ext-community to trigger dynamic SR Policy instantiation.Access Router Service Provisioning (IOS-XR)#Port based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 1000 target 1 source 1 interface TenGigE0/0/0/5 l2transport VLAN Based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 neighbor evpn evi 1000 target 1 source 1 !! interface TenGigE0/0/0/5.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric! L2VPN Static Pseudowire (PW) – Preferred Path (PCEP)Figure 9# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) ControlPlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Access Router Service Provisioning (IOS-XR)#Note# EVPN VPWS dual homing is not supported when using an SR-TE preferred path.Note# In IOS-XR 6.6.3 the SR Policy used as the preferred path must be referenced by its generated name and not the configured policy name. This requires first issuing the commandDefine SR Policy traffic-eng policy GREEN-PE3-1 color 1001 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep ! 
metric type igp Determine auto-configured policy name The auto-configured policy name will be persistant and must be used as a reference in the L2VPN preferred-path configuration.RP/0/RP0/CPU0#A-PE8#show segment-routing traffic-eng policy candidate-path name GREEN-PE3-1   SR-TE policy database Color# 1001, End-point# 100.0.1.50 Name# srte_c_1001_ep_100.0.1.50 Port Based Service configurationinterface TenGigE0/0/0/15 l2transport ! ! l2vpn pw-class static-pw-class-PE3 encapsulation mpls control-word preferred-path sr-te policy srte_c_1001_ep_100.0.1.50 ! ! ! p2p Static-PW-to-PE3-1 interface TenGigE0/0/0/15 neighbor ipv4 100.0.0.3 pw-id 1000 mpls static label local 1000 remote 1000 pw-class static-pw-class-PE3 VLAN Based Service configurationinterface TenGigE0/0/0/5.1001 l2transport encapsulation dot1q 1001 rewrite ingress tag pop 1 symmetric ! ! l2vpn pw-class static-pw-class-PE3 encapsulation mpls control-word preferred-path sr-te policy srte_c_1001_ep_100.0.1.50 p2p Static-PW-to-PE7-2 interface TenGigE0/0/0/5.1001 neighbor ipv4 100.0.0.3 pw-id 1001 mpls static label local 1001 remote 1001 pw-class static-pw-class-PE3 Access Router Service Provisioning (IOS-XE)#Port Based service with Static OAM configurationinterface GigabitEthernet0/0/1 mtu 9216 no ip address negotiation auto no keepalive service instance 10 ethernet encapsulation default xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! pseudowire-static-oam class static-oam timeout refresh send 10 ttl 255 ! ! ! pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 status protocol notification static static-oam ! VLAN Based Service configurationinterface GigabitEthernet0/0/1 no ip address negotiation auto service instance 1 ethernet Static-VPWS-EVC encapsulation dot1q 10 rewrite ingress tag pop 1 symmetric xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! ! ! pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 Ethernet CFM for L2VPN service assuranceEthernet Connectivity Fault Management is an Ethernet OAM component used to validate end-to-end connectivity between service endpoints. Ethernet CFM is defined by two standards, 802.1ag and Y.1731. Within an SP network, Maintenance Domains are created based on service scope. Domains are typically separated by operator boundaries and may be nested but cannot overlap. Within each service, maintenance points can be created to verify bi-directional end to end connectivity. These are known as MEPs (Maintenance End-Point) and MIPs (Maintenance Intermediate Points). These maintenance points process CFM messages. A MEP is configured at service endpoints and has directionality where an “up” MEP faces the core of the network and a “down” MEP faces a CE device or NNI port. MIPs are optional and are created dynamically. Detailed information on Ethernet CFM configuration and operation can be found at https#//www.cisco.com/c/en/us/td/docs/routers/ncs5500/software/interfaces/configuration/guide/b-interfaces-hardware-component-cg-ncs5500-66x/b-interfaces-hardware-component-cg-ncs5500-66x_chapter_0101.htmlMaintenance Domain configurationA Maintenance Domain is defined by a unique name and associated level. The level can be 0-7. The numerical identifier usually corresponds to the scope of the MD, where 7 is associated with CE endpoints, 6 associated with PE devices connected to a CE. 
Additional levels may be required based on the topology and service boundaries which occur along the end-to-end service. In this example we only a single domain and utilize level 0 for all MEPs.ethernet cfm domain EVPN-VPWS-PE3-PE8 level 0 MEP configuration for EVPN-VPWS servicesFor L2VPN xconnect services, each service must have a MEP created on the end PE device. There are two components to defining a MEP, first defining the Ethernet CFM “service” and then defining the MEP on the physical or logical interface participating in the L2VPN xconnect service. In the following configuration the xconnect group “EVPN-VPWS-ODN-PE3” and P2P EVPN VPWS service odn-8 are already defined. The Ethernet CFM service of “odn-8” does NOT have to match the xconnect service name. The MEP crosscheck defines a remote MEP to listen for Continuity Check messages from. It does not have to be the same as the local MEP defined on the physical sub-interface (103), but for P2P services it is best practice to make them identical. This configuration will send Ethernet CFM Continuity Check (CC) messages every 1 minute to verify end to end reachability.L2VPN configurationl2vpn xconnect group EVPN-VPWS-ODN-PE3 p2p odn-8 interface TenGigE0/0/0/23.8 neighbor evpn evi 1318 target 8 source 8 ! ! !! Physical sub-interface configurationinterface TenGigE0/0/0/23.8 l2transport encapsulation dot1q 8 rewrite ingress tag pop 1 symmetric ethernet cfm mep domain EVPN-VPWS-PE3-PE8 service odn-8 mep-id 103 ! !! Ethernet CFM service configurationethernet cfm domain EVPN-VPWS-PE3-PE8 service odn-8 xconnect group EVPN-VPWS-ODN-PE3 p2p odn-8 mip auto-create all continuity-check interval 1m mep crosscheck mep-id 103 ! log crosscheck errors log continuity-check errors log continuity-check mep changes ! !! Multicast NG-MVPN Profile 14 using mLDP and ODN L3VPNIn ths service example we will implement multicast delivery across the CST network using mLDP transport for multicast and SR-MPLS for unicast traffic. L3VPN SR paths will be dynamically created using ODN. Multicast profile 14 is the “Partitioned MDT - MLDP P2MP - BGP-AD - BGP C-Mcast Signaling” Using this profile each mVPN will use a dedicated P2MP tree, endpoints will be auto-discovered using NG-MVPN BGP NLRI, and customer multicast state such as source streams, PIM, and IGMP membership data will be signaled using BGP. Profile 14 is the recommended profile for high scale and utilizing label-switched multicast (LSM) across the core.Multicast core configurationThe multicast “core” includes transit endpoints participating in mLDP only. See the mLDP core configuration section for details on end-to-end mLDP configuration.Unicast L3VPN PE configurationIn order to complete an RPF check for SSM sources, unicast L3VPN configuration is required. Additionally the VRF must be defined under the BGP configuration with the NG-MVPN address families configured. In our use case we are utilizing ODN for creating the paths between L3VPN endpoints with a route-policy attached to the mVPN VRF to set a specific color on advertised routes.ODN opaque ext-community setextcommunity-set opaque MLDP 1000end-set ODN route-policyroute-policy ODN-MVPN set extcommunity color MLDP passend-policy Global L3VPN VRF definitionvrf VRF-MLDP address-family ipv4 unicast import route-target 100#38 ! export route-policy ODN-MVPN export route-target 100#38 ! ! address-family ipv6 unicast import route-target 100#38 ! export route-policy ODN-MVPN export route-target 100#38 ! !! 
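Because the ODN-MVPN route-policy above colors exported routes with the opaque value 1000, a matching on-demand color definition is assumed under segment-routing traffic-eng on the participating L3VPN PEs, mirroring the ODN configuration shown earlier in this document. A minimal sketch (the color value comes from the MLDP extcommunity-set and the IGP metric constraint matches the earlier ODN example):

 segment-routing
  traffic-eng
   on-demand color 1000
    dynamic
     pce
     !
     metric
      type igp
     !
    !
   !
  !
 !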
BGP configurationrouter bgp 100 vrf VRF-MLDP rd auto address-family ipv4 unicast redistribute connected redistribute static ! address-family ipv6 unicast redistribute connected redistribute static ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! !! Multicast PE configurationThe multicast “edge” includes all endpoints connected to native multicast sources or receivers.Define RPF policyroute-policy mldp-partitioned-p2mp set core-tree mldp-partitioned-p2mpend-policy! Enable Multicast and define mVPN VRFmulticast-routing address-family ipv4 interface Loopback0 enable ! ! vrf VRF-MLDP address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix bgp auto-discovery mldp ! mdt partitioned mldp ipv4 p2mp mdt data 100 ! !! Enable PIM for mVPN VRF In this instance there is an interface TenGigE0/0/0/23.2000 which is using PIM within the VRFrouter pim address-family ipv4 rp-address 100.0.1.50 ! vrf VRF-MLDP address-family ipv4 rpf topology route-policy mldp-partitioned-p2mp mdt c-multicast-routing bgp ! interface TenGigE0/0/0/23.2000 enable ! ! Enable IGMP for mVPN VRF interface To discover listeners for a specific group, enable IGMP on interfaces within the VRF. These interested receivers will be advertised via BGP to establish end to end P2MP trees from the source.router igmp vrf VRF-MLDP interface TenGigE0/0/0/23.2001 ! version 3 !! End-To-End VPN Services Data PlaneFigure 10# End-To-End Services Data PlaneHierarchical ServicesFigure 11# Hierarchical Services TableL3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE)Figure 12# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Provider Edge Router# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together withPseudowire-Headend (PWHE) via CLI or NSO Provider Edge Router# Path to remote PE is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p L3VPN-VRF1 interface TenGigE0/0/0/5.501 neighbor evpn evi 13 target 501 source 501 ! ! !interface TenGigE0/0/0/5.501 l2transport encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric Port based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 13 target 502 source 502 ! ! !! interface TenGigE0/0/0/5 l2transport Access Router Service Provisioning (IOS-XE)#VLAN based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric ! Port based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation default Provider Edge Router Service Provisioning (IOS-XR)#VRF configurationvrf L3VPN-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#501 ! export route-target 100#501 ! ! address-family ipv6 unicast import route-target 100#501 ! 
export route-target 100#501 ! ! BGP configurationrouter bgp 100 vrf L3VPN-ODNTE-VRF1 rd 100#501 address-family ipv4 unicast redistribute connected ! address-family ipv6 unicast redistribute connected ! ! PWHE configurationinterface PW-Ether1 vrf L3VPN-ODNTE-VRF1 ipv4 address 10.13.1.1 255.255.255.0 ipv6 address 1000#10#13##1/126 attach generic-interface-list PWHE! EVPN VPWS configuration towards Access PEl2vpn xconnect group evpn-vpws-l3vpn-A-PE3 p2p L3VPN-ODNTE-VRF1 interface PW-Ether1 neighbor evpn evi 13 target 501 source 501 ! Figure 13# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 withPseudowire-Headend (PWHE) Data PlaneL3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRBFigure 14# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 withAnycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together with Anycast IRBvia CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2 l2transport !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 n-flag-clear L2VPN configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! 
virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30 VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 15# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB Datal PlaneL2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRBFigure 16# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L2VPN Multipoint EVPN instance together withAnycast IRB via CLI or NSO (Anycast IRB is optional when L2 and L3is required in same service instance) Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Please note that provisioning on Access and Provider Edge routers issame as in “L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB”. In this use case there is BGP EVPN instead of MP-BGPVPNv4/6 in the core.Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !!interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! 
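Before moving to the PE-side provisioning, the static pseudowires configured above can be spot-checked on both ends. The commands below are standard platform verification commands (IOS-XR on the NCS access routers, IOS-XE on the ASR920); the group name and PW ID follow the examples above and output will vary by release.

 IOS-XR:
  show l2vpn xconnect
  show l2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast detail

 IOS-XE:
  show xconnect all
  show mpls l2transport vc 4001 detail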
Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 L2VPN Configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30! VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 17# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Data PlaneRemote PHY CIN ImplementationSummaryDetail can be found in the CST 3.0 high-level design guide for design decisions, this section will provide sample configurations.Sample QoS PoliciesThe following are usable policies but policies should be tailored for specific network deployments.Class mapsClass maps are used within a policy map to match packet criteria for further treatmentclass-map match-any match-ef-exp5 description High priority, EF match dscp 46 match mpls experimental topmost 5 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 match mpls experimental topmost 4 end-class-map!class-map match-any match-video-cs4-exp2 description Video match dscp 32 match mpls experimental topmost 2 end-class-map!class-map match-any match-cs6-exp6 description Highest priority control-plane traffic match dscp cs6 match mpls experimental topmost 6 end-class-map!class-map match-any match-qos-group-1 match qos-group 1 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-map!class-map match-any match-qos-group-6 match qos-group 3 end-class-map!class-map match-any match-traffic-class-1 description ~Match highest priority traffic-class 1~ match traffic-class 1 end-class-map!class-map match-any match-traffic-class-2 description ~Match high priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match medium traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-traffic-class-6 description ~Match video traffic-class 6~ match traffic-class 6 end-class-map RPD and DPIC interface policy mapsThese are applied to all interfaces connected to cBR-8 DPIC and RPD devices.Note# Egress queueing maps are not supported on L3 BVI interfacesRPD/DPIC ingress classifier policy mappolicy-map rpd-dpic-ingress-classifier class match-cs6-exp6 set traffic-class 1 set qos-group 1 ! class match-ef-exp5 set traffic-class 2 set qos-group 2 ! class match-cs5-exp4 set traffic-class 3 set qos-group 3 ! class match-video-cs4-exp2 set traffic-class 6 set qos-group 6 ! 
class class-default set traffic-class 0 set dscp 0 set qos-group 0 ! end-policy-map! P2P RPD and DPIC egress queueing policy mappolicy-map rpd-dpic-egress-queuing class match-traffic-class-1 priority level 1 queue-limit 500 us ! class match-traffic-class-2 priority level 2 queue-limit 100 us ! class match-traffic-class-3 priority level 3 queue-limit 500 us ! class match-traffic-class-6 priority level 6 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map! Core QoSPlease see the general QoS section for core-facing QoS configurationCIN Timing ConfigurationPlease see the G.8275.2 timing configuration guide in this document for details on timing configuration. The following values should be used for PTP configuration attributes. Please note in CST 3.0 the use of an IOS-XR router as a Boundary Clock is only supported on P2P L3 interfaces. The use of a BVI for RPD aggregation requires the BC used for RPD nodes be located upstream, or alternatively a physical loopback cable may be used to provide timing off the IOS-XR based RPD leaf device. PTP variable IOS-XR configuration value IOS-XE value Announce Interval 1 1 Announce Timeout 5 5 Sync Frequency 16 -4 Delay Request Frequency 16 -4 Example CBR-8 RPD DTI Profileptp r-dti 4 profile G.8275.2 ptp-domain 60 clock-port 1 clock source ip 192.168.3.1 sync interval -4 announce timeout 5 delay-req interval -4 Multicast configurationSummaryWe present two different configuration options based on either native multicast deployment or the use of a L3VPN to carry Remote PHY traffic. The L3VPN option shown uses Label Switched Multicast profile 14 (partitioned mLDP) however profile 6 could also be utilized.Global multicast configuration - Native multicastOn CIN aggregation nodes all interfaces should have multicast enabled.multicast-routing address-family ipv4 interface all enable ! address-family ipv6 interface all enable enable ! Global multicast configuration - LSM using profile 14On CIN aggregation nodes all interfaces should have multicast enabled.vrf VRF-MLDP address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix bgp auto-discovery mldp ! mdt partitioned mldp ipv4 p2mp mdt data 100 ! ! PIM configuration - Native multicastPIM should be enabled for IPv4/IPv6 on all core facing interfacesrouter pim address-family ipv4 interface Loopback0 enable ! interface TenGigE0/0/0/6 enable ! interface TenGigE0/0/0/7 enable ! ! PIM configuration - LSM using profile 14The PIM configuration is utilized even though no PIM neighbors may be connected.route-policy mldp-partitioned-p2mp set core-tree mldp-partitioned-p2mpend-policy!router pim address-family ipv4 interface Loopback0 enable vrf rphy-vrf address-family ipv4 rpf topology route-policy mldp-partitioned-p2mp mdt c-multicast-routing bgp ! ! IGMPv3/MLDv2 configuration - Native multicastInterfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabledrouter igmp interface BVI100 version 3 ! interface TenGigE0/0/0/25 version 3 !!router mld interface BVI100 version 2 interface TenGigE0/0/0/25 version 3 ! ! IGMPv3/MLDv2 configuration - LSM profile 14Interfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabled as neededrouter igmp vrf rphy-vrf interface BVI101 version 3 ! interface TenGigE0/0/0/15 ! !!router mld vrf rphy-vrf interface TenGigE0/0/0/15 version 2 ! !! 
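With PIM, IGMP/MLD, and (for the L3VPN option) mLDP profile 14 configured, multicast state can be spot-checked with standard IOS-XR commands such as those below; the VRF name follows the examples above and output will differ per deployment.

 show pim vrf rphy-vrf neighbor
 show igmp vrf rphy-vrf groups
 show mrib vrf rphy-vrf route
 show mpls mldp database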
IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation)In order to limit L2 multicast replication for specific groups to only interfaces with interested receivers, IGMP and MLD snooping must be enabled.igmp snooping profile igmp-snoop-1!mld snooping profile mld-snoop-1! RPD DHCPv4/v6 relay configurationIn order for RPDs to self-provision DHCP relay must be enabled on all RPD-facing L3 interfaces. In IOS-XR the DHCP relay configuration is done in its own configuration context without any configuration on the interface itself.Native IP / Default VRFdhcp ipv4 profile rpd-dhcpv4 relay helper-address vrf default 10.0.2.3 ! interface BVI100 relay profile rpd-dhcpv4!dhcp ipv6 profile rpd-dhcpv6 relay helper-address vrf default 2001#10#0#2##3 iana-route-add source-interface BVI100 ! interface BVI100 relay profile rpd-dhcpv6 RPHY L3VPNIn this example it is assumed the DHCP server exists within the rphy-vrf VRF, if it does not then additional routing may be necessary to forward packets between VRFs.dhcp ipv4 vrf rphy-vrf relay profile rpd-dhcpv4-vrf profile rpd-dhcpv4-vrf relay helper-address vrf rphy-vrf 10.0.2.3 relay information option allow-untrusted ! inner-cos 5 outer-cos 5 interface BVI101 relay profile rpd-dhcpv4-vrf interface TenGigE0/0/0/15 relay profile rpd-dhcpv4-vrf! cBR-8 DPIC interface configuration without Link HAWithout link HA the DPIC port is configured as a normal physical interfaceinterface TenGigE0/0/0/25 description .. Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 cBR-8 DPIC interface configuration with Link HAWhen using Link HA faster convergence is achieved when each DPIC interface is placed into a BVI with a statically assigned MAC address. Each DPIC interface is placed into a separate bridge-domain with a unique BVI L3 interface. The same MAC address should be utilized on all BVI interfaces. Convergence using BVI interfaces is <50ms, L3 physical interfaces is 1-2s.Even DPIC port CIN interface configurationinterface TenGigE0/0/0/25 description ~Connected to cBR8 port Te1/1/0~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-0 interface TenGigE0/0/0/25 ! routed interface BVI500 ! ! ! interface BVI500 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! Odd DPIC port CIN interface configurationinterface TenGigE0/0/0/26 description ~Connected to cBR8 port Te1/1/1~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-1 interface TenGigE0/0/0/26 ! routed interface BVI501 ! ! ! interface BVI501 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! cBR-8 Digital PIC Interface Configurationinterface TenGigE0/0/0/25 description .. 
Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 RPD interface configurationP2P L3In this example the interface has PTP enabled towards the RPDinterface TenGigE0/0/0/15 description To RPD-1 mtu 9200 ptp profile g82752_master_v4 ! service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 192.168.2.0 255.255.255.254 ipv6 address 2001#192#168#2##0/127 ipv6 enable ! BVIl2vpn bridge group rpd bridge-domain rpd-1 mld snooping profile mld-snoop-1 igmp snooping profile igmp-snoop-1 interface TenGigE0/0/0/15 ! interface TenGigE0/0/0/16 ! interface TenGigE0/0/0/17 ! routed interface BVI100 ! ! ! !!interface BVI100 description ... to downstream RPD hosts service-policy input rpd-dpic-ingress-classifier ipv4 address 192.168.2.1 255.255.255.0 ipv6 address 2001#192#168#2##1/64 ipv6 enable ! RPD/DPIC agg device IS-IS configurationThe standard IS-IS configuration should be used on all core interfaces with the addition of specifying all DPIC- and RPD-connected interfaces as IS-IS passive interfaces. Using passive interfaces is preferred over redistributing connected routes. This configuration is needed for reachability between DPIC and RPDs across the CIN network.router isis ACCESS interface TenGigE0/0/0/25 passive address-family ipv4 unicast ! address-family ipv6 unicast Additional configuration for L3VPN DesignGlobal VRF ConfigurationThis configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elementsvrf rphy-vrf address-family ipv4 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! address-family ipv6 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! BGP ConfigurationThis configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elementsrouter bgp 100 vrf rphy-vrf rd auto address-family ipv4 unicast label mode per-vrf redistribute connected ! address-family ipv6 unicast label mode per-vrf redistribute connected ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! ! 
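Where the L3VPN option is used, basic reachability between DPIC, RPD, and ancillary elements inside rphy-vrf can be confirmed with the usual VRF-aware IOS-XR commands, for example toward the DHCP server referenced in the relay configuration (10.0.2.3). These are illustrative checks rather than an exhaustive validation procedure.

 show bgp vrf rphy-vrf ipv4 unicast summary
 show route vrf rphy-vrf ipv4
 show route vrf rphy-vrf ipv6
 show cef vrf rphy-vrf 10.0.2.3/32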
", "url": "/blogs/2019-12-10-cst-implementation-guide/", "author": "Phil Bedard", "tags": "iosxr, cisco, 5G, cin, rphy, Metro, Design" } , "blogs-2020-01-10-peering-fabric-hld": { "title": "Peering Fabric Design", "content": " On This Page Revision History Key Drivers Traffic Growth Network Simplification Network Efficiency High-Level Design Peering Strategy Content Cache Aggregation Topology and Peer Distribution Platforms Control-Plane Telemetry Automation Zero Touch Provisioning Cisco Crosswork Health Insights KPI pack Advanced Security using BGP Flowspec and QPPB (1.5) Radware validated DDoS solution Radware DefensePro Radware DefenseFlow Solution description Solution diagram Router SPAN (monitor) to physical interface configuration Router SPAN (monitor) to PWE Internet and Peering in a VRF RPKI Next-Generation IXP Fabric Validated Design Peering Fabric Design Use Cases Traditional IXP Peering Migration to Peering Fabric Peering Fabric Extension Localized Metro Peering and Content Delivery Express Peering Fabric Datacenter Edge Peering Peer Traffic Engineering with Segment Routing ODN (On-Demand Next-Hop) for Peering DDoS Traffic Steering using SR-TE and EPE Low-Level Design Integrated Peering Fabric Reference Diagram Distributed Peering Fabric Reference Diagram Peering Fabric Hardware Detail NCS-5501-SE NCS-55A1-36H-SE NCS-55A1-24H NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card NCS-55A2-MOD-SE-S Peer Termination Strategy Distributed Fabric Device Roles PFL – Peering Fabric Leaf PFS – Peering Fabric Spine Device Interconnection Capacity Scaling Peering Fabric Control Plane PFL to Peer PFL to PFS PFS to Core SR Peer Traffic Engineering Summary Nodal EPE Peer Interface EPE Abstract Peering SR-TE On-Demand Next-Hop for Peering ODN Configuration IXP Fabric Low Level Design Segment Routing Underlay EVPN L2VPN Services Peering Fabric Telemetry Telemetry Diagram Model-Driven Telemetry BGP Monitoring Protocol Netflow / IPFIX Automation and Programmability Cisco NSO Modules Netconf YANG Model Support 3rd Party Hosted Applications XR Service Layer API Recommended Device and Protocol Configuration Overview Common Node Configuration Enable LLDP Globally PFS Nodes IGP Configuration Segment Routing Traffic Engineering BGP Global Configuration Model-Driven Telemetry Configuration PFL Nodes Peer QoS Policy Peer Infrastructure ACL Peer Interface Configuration IS-IS IGP Configuration BGP Add-Path Route Policy BGP Global Configuration EBGP Peer Configuration PFL to PFS IBGP Configuration Netflow/IPFIX Configuration Model-Driven Telemetry Configuration Abstract Peering Configuration PFS Configuration BGP Flowspec Configuration and Operation Enabling BGP Flowspec Address Families on PFS and PFL Nodes BGP Flowspec Server Policy Definition BGP Flowspec Server Enablement BGP Flowspec Client Configuration QPPB Configuration and Operation Routing Policy Configuration Global BGP Configuration QoS Policy Definition Interface-Level Configuration BGP Graceful Shutdown Outbound graceful shutdown configuration Inbound graceful shutdown configuration Activating graceful shutdown Security Peering and Internet in a VRF VRF per Peer, default VRF for Internet Internet in a VRF Only VRF per Peer, Internet in a VRF Infrastructure ACLs BCP Implementation BGP Attribute and CoS Scrubbing Per-Peer Control Plane Policers BGP Prefix Security RPKI Origin Validation BGP RPKI and ROV Confguration Create ROV Routing Policies Configure RPKI Server and ROV Options Enabling RPKI ROV on BGP Neighbors Communicating ROV 
Status via Well-Known BGP Community BGPSEC (Reference Only) DDoS traffic steering using SR-TE SR-TE Policy configuration Egress node BGP configuration Egress node MPLS static LSP configuration Appendix Applicable YANG Models NETCONF YANG Paths BGP Operational State Global BGP Protocol State BGP Neighbor State Example Usage BGP RIB Data Example Usage BGP Flowspec Device Resource YANG Paths Validated Model-Driven Telemetry Sensor Paths Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform LLDP Monitoring Interface statistics and state The following sub-paths can be used but it is recommended to use the base openconfig-interfaces model Aggregate bundle information (use interface models for interface counters) BGP Peering information IS-IS IGP information It is not recommended to monitor complete RIB tables using MDT but can be used for troubleshooting QoS and ACL monitoring BGP RIB information It is not recommended to monitor these paths using MDT with large tables Routing policy Information Revision History Version Date Comments 1.0 05/08/2018 Initial Peering Fabric publication 1.5 07/31/2018 BGP-FS, QPPB, ZTP, Internet/Peering in a VRF, NSO Services 2.0 04/01/2019 IXP Fabric, ODN and SR-PCE for Peering, RPKI 3.0 01/10/2020 SR-TE steering for DDoS, BGP graceful shutdown, Radware DDoS validation Key DriversTraffic GrowthInternet traffic has seen a compounded annual growth rate of 30% orhigher over the last five years, as more devices are connected and morecontent is consumed, fueled by the demand for video. Traffic willcontinue to grow as more content sources are added and Internetconnections speeds increase. Service and content providers must designtheir peering networks to scale for a future of more connected deviceswith traffic sources and destinations spanning the globe. Efficientpeering is required to deliver traffic to consumers.Network SimplificationSimple networks are easier to build and easier to operate. As networksscale to handle traffic growth, the level of network complexity mustremain flat. A prescriptive design using standard discrete componentsmakes it easier for providers to scale from networks handling a smallamount of traffic to 10s of Tbps without complete network forklifts.Fabrics with reduced control-plane elements and feature sets enhancestability and availability. Dedicating nodes to specific functions ofthe network also helps isolate the rest of the network from maliciousbehavior, defects, or instability.Network EfficiencyNetwork efficiency refers not only to maximizing network resources butalso optimizing the environmental impact of the deployed network. Muchof Internet peering today is done in 3rd party facilitieswhere space, power, and cooling are at a premium. High-density, lowerenvironmental footprint devices are critical to handling more trafficwithout exceeding the capabilities of a facility. In cases wheremultiple facilities must be connected, a simple and efficient way toextend networks must exist.High-Level DesignThe Peering design incorporates high-density environmentallyefficient edge routers, a prescriptive topology and peer terminationstrategy, and features delivered through IOS-XR to solve the needs ofservice and content providers. Also included as part of the Peeringdesign are ways to monitor the health and operational status of thepeering edge and through Cisco NSO integration assist providers inautomating peer configuration and validation. 
All designs areboth feature tested and validated as a complete design to ensurestability once implemented.Peering Strategyproposes a localized peering strategy to reduce network cost for“eyeball” service providers by placing peering or content provider cachenodes closer to traffic consumers. This reduces not only reducescapacity on long-haul backbone networks carrying traffic from IXPs toend users but also improves the quality of experience for users byreducing latency to content sources. The same design can also be usedfor content provider networks wishing to deploy a smaller footprintsolution in a SP location or 3rd party peering facility.Content Cache AggregationTraditional peering via EBGP at defined locations or over point to point circuits between routers is not sufficient enough today to optimize and efficiently deliver content between content providers and end consumers. Caching has been used for decades now performing traffic offload closer to eyeballs, and plays a critical role in today’s networks. The Peering Fabric design considers cache aggregation another role in “Peering” in creating a cost-optimized and scalable way to aggregate both provider and 3rd party caching servers such as those from Netflix, Google, or Akamai. The following diagram ** depicts a typical cache aggregation scenario at a metro aggregation facility. In larger high bandwidth facilities it is recommended to place caching nodes on a separate scalable set of devices separate from functions such as PE edge functions. Deeper in the network, Peering Fabric devices have the flexibility to integrate other functions such as small edge PE and compute termination such as in a 5G Mobile Edge Compute edge DC. Scale limitations are not a consideration with the ability to support full routing tablesin an environmentally optimized 1RU/2RU footprint.Topology and Peer DistributionThe Cisco Peering Fabric introduces two options for fabric topology andpeer termination. The first, similar to more traditional peeringdeployments, collapses the Peer Termination and Core Connectivitynetwork functions into a single physical device using the device’sinternal fabric to connect each function. The second option utilizes afabric separating the network functions into separate physical layers,connected via an external fabric running over standard Ethernet.In many typical SP peering deployments, a traditional two-node setup isused where providers vertically upgrade nodes to support the highercapacity needs of the network. Some may employ technologies such as backto back or multi-chassis clusters in order to support more connectionswhile keeping what seems like the operational footprint low. However,failures and operational issues occurring in these types of systems aretypically difficult to troubleshoot and repair. They also requirelengthy planning and timeframes for performing system upgrades. Weintroduce a horizontally scalable distributed peering fabric, the endresult being more deterministic interface or node failures.Minimizing the loss of peering capacity is very important for bothingress-heavy SPs and egress-heavy content providers. The loss of localpeering capacity means traffic must ingress or egress a sub-optimalnetwork port. 
Making a conscious design decision to spread peerconnections, even to the same peer, across multiple edge nodes helpsincrease resiliency and limit traffic-affecting network events.PlatformsThe Cisco NCS5500 platform is ideal for edge peer termination, given itshigh-density, large RIB and FIB scale, buffering capability, and IOS-XRsoftware feature set. The NCS5500 is also space and power efficient with36x100GE supporting up to 4M IPv4 routes in a 1RU fixed form factor orsingle modular line card. The Peering fabric can provide36x100GE, 144x10GE, or a mix of non-blocking peering connections withfull resiliency in 4RU. The fabric can also scale to support 10s ofterabits of capacity in a single rack for large peering deployments.Fixed chassis are ideal for incrementally building a peering edgefabric, the NCS NC55-36X100GE-A-SE and NC55A1-24H are efficient highdensity building blocks which can be rapidly deployed as needed withoutinstalling a large footprint of devices day one. Deployments needingmore capacity or interface flexibility such as IPoDWDM to extend peeringcan utilize the NCS5504 4-slot or NCS5508 8-slot modular chassis. If thepeering location has a need for services termination the ASR9000 familyor XRv-9000 virtual edge node can be incorporated into the fabric.All NCS5500 routers also contain powerful Route Processors to unlockpowerful telemetry and programmability. The Peering Fabric fixedchassis contain 1.6Ghz 8-core processors and 32GB of RAM. The latestNC55-RP-E for the modular NCS5500 chassis has a 1.9Ghz 6-core processorand 32G of RAM.Control-PlaneThe peering fabric design introduces a simplified control-plane builtupon IPv4/IPv6 with Segment Routing. In the collapsed design, eachpeering node is connected to EBGP peers and upstream to the core viastandard IS-IS, OSPF, and TE protocols, acting as a PE or LER in aprovider network.In the distributed design, network functions are separated. PeerTermination happens on Peering Fabric Leaf nodes. Peering Fabric Spineaggregation nodes are responsible for Core Connectivity and perform moreadvanced LER functions. The PFS routers use ECMP to balance trafficbetween PFL routers and are responsible for forwarding within the fabricand to the rest of the provider network. Each PFS acts as an LER,incorporated into the control-plane of the core network. The PFS, oralternatively vRRs, reflect learned peer routes from the PFL to the restof the network. The SR control-plane supports several trafficengineering capabilities. EPE to a specific peer interface, PFL node, orPFS is supported. We also introduce the abstract peering concept wherePFS nodes utilize a next-hop address bound to an anycast SR SID to allowtraffic engineering on a per-peering center basis.TelemetryThe Peering fabric design uses the rich telemetry available in IOS-XRand the NCS5500 platform to enable an unprecedented level of insightinto network and device behavior. The Peering Fabric leverages Model-DrivenTelemetry and NETCONF along with both standard and native YANG modelsfor metric statistics collection. Telemetry configuration and applicablesensor paths have been identified to assist providers in knowing what tomonitor and how to monitor it.AutomationNETCONF and YANG using OpenConfig and native IOS-XR models are used tohelp automate peer configuration and validation. 
Cisco has developed specific Peering Fabric NSO service models to help automate common tasks such as peer interface configuration, peer BGP configuration, and adding physical interfaces to an existing peer bundle.

Zero Touch Provisioning
In addition to model-driven configuration and operation, Peering Fabric 1.5 also supports ZTP operation for automated device provisioning. ZTP is useful in both production and staging environments to automate initial device software installation, deploy an initial bootstrap configuration, and trigger more advanced functionality via ZTP scripts. ZTP is supported on both out-of-band management interfaces and in-band data interfaces.

Cisco Crosswork Health Insights KPI pack
To ease the monitoring of common peering telemetry using CW Health Insights, a peering sensor pack is available containing common elements monitored for peering that are not included in the baseline CW HI KPI definitions. These include BGP session monitoring, RIB/FIB counts, and Flowspec statistics.

Advanced Security using BGP Flowspec and QPPB (1.5)
Release 1.5 of the Cisco Peering Fabric enhances the design by adding advanced security capabilities using BGP Flowspec and QoS Policy Propagation using BGP, or QPPB. BGP Flowspec was standardized in RFC 5575 and defines additional BGP NLRI to inject packet filter information to receiving routers. BGP is the control-plane for disseminating the policy information, while it is up to the BGP Flowspec receiver to implement the dataplane rules specified in the NLRI. At the Internet peering edge, DDoS protection has become extremely important, and so has automating the remediation of an incoming DDoS attack. Automated DDoS protection is only one BGP Flowspec use case; any application needing a programmatic way to create interface packet filters can make use of its capabilities.

QPPB allows using BGP attributes as match criteria in dataplane packet filters. Matching packets based on attributes like BGP community and AS Path allows service providers to create simplified edge QoS policies without having to manage more cumbersome prefix lists or keep them up to date when new prefixes are added. QPPB is supported in the peering fabric for destination prefix BGP attribute matching and has a number of use cases when delivering traffic from external providers to specific internal destinations.

Radware validated DDoS solution
Radware, a Cisco partner, provides a robust and intelligent DDoS detection and mitigation solution covering both volumetric and application-layer DDoS attacks. The validated solution includes the following elements:

Radware DefensePro
DefensePro is used for attack detection and traffic scrubbing. DefensePro can be deployed at the edge of the network or centralized, as is the case with a centralized scrubbing center. DefensePro uses realtime traffic analysis through SPAN (monitor) sessions from the edge routers to the DefensePro virtual machine or hardware appliance.

Radware DefenseFlow
DefenseFlow can work in a variety of ways as part of a comprehensive DDoS mitigation solution. DefenseFlow performs anomaly detection by using advanced network behavioral analysis to first baseline a network during peacetime and then evaluate anomalies to determine when an attack is occurring. DefenseFlow can also incorporate third party data such as flow data to enhance its attack detection capability.
DefenseFlow also coordinates the mitigation actions of other solution components such as DefensePro and initiates traffic redirection through the use of BGP and BGP Flowspec on edge routers.

Solution description
The following steps describe the analysis and mitigation of DDoS attacks using Radware components. Radware DefenseFlow is deployed to orchestrate DDoS attack detection and mitigation. A virtual or appliance version of Radware DefensePro is deployed to a peering fabric location or a centralized location. PFL nodes use interface monitoring sessions to mirror specific ingress traffic to an interface connected to the DefensePro element. The interface can be local to the PFL node, or SPAN over Pseudowire can be used to tunnel traffic to an interface attached to a centralized DefensePro.

Solution diagram

Router SPAN (monitor) to physical interface configuration
The following is used to direct traffic to a DefensePro virtual machine or appliance.

monitor-session radware ethernet
 destination interface TenGigE0/0/2/2
!
interface TenGigE0/0/2/1
 description "DefensePro clean interface"
 ipv4 address 182.10.1.1 255.255.255.252
!
interface TenGigE0/0/2/2
 description "SPAN interface to DefensePro"
!
interface TenGigE0/0/2/3
 description "Transit peer connection"
 ipv4 address 182.30.1.1 255.255.255.252
 monitor-session radware ethernet port-level
!
end

Router SPAN (monitor) to PWE
The following is used to direct traffic to a DefensePro virtual machine or appliance at a remote location.

monitor-session radware ethernet
 destination pseudowire
!
l2vpn
 xconnect group defensepro-remote
  p2p dp1
   monitor-session radware
   neighbor ipv4 100.0.0.1 pw-id 1
!
interface TenGigE0/0/2/3
 description "Transit peer connection"
 ipv4 address 182.30.1.1 255.255.255.252
 monitor-session radware ethernet port-level
!
end

Internet and Peering in a VRF
While Internet peering and carrying the Internet table in a provider network are typically done using the Global Routing Table (the default VRF in IOS-XR), many modern networks are being built to isolate the GRT from the underlying infrastructure. In this case the Internet global table is carried as a service just like any other VPN service, leaving the infrastructure layer protected from the global Internet. Another application of VRFs is to simply isolate peers into specific VRFs in order to isolate the forwarding plane of each peer from the others, and to control which routes a peer sees through VPN route-target communities rather than outbound routing policy. In this simplified use case the global table is still carried in the default VRF, using IOS-XR capabilities to import and export routes to and from specific peer VRFs. Separating Internet and Peering routes into specific VRFs also gives flexibility in creating custom routing tables for specific customers, giving a service provider the flexibility to offer separate regional or global reach on the same network. Internet in a VRF and Peering in a VRF for IPv4 and IPv6 are compatible with most Peering Fabric features. Specific caveats are documented in the Appendix of this document.

RPKI
RPKI stands for Resource Public Key Infrastructure and is a repository for attaching a trust anchor to Internet routing resources such as Autonomous Systems and IP Prefixes.
Each RIR (Regional Internet Registry) houses the signed resource records it is responsible for, giving a trust anchor to those resources. The RPKI contains Route Origin Authorization (ROA) objects, used to uniquely identify the ASN originating a prefix and, optionally, the longer sub-prefixes covered by it. RPKI records are published by each RIR and consumed by offline RPKI validators. The RPKI validator is an on-premises application responsible for compiling a list of routes considered VALID. Keep in mind these are only the routes which are registered in the RPKI database; no information is gathered from the global routing table. Once resource records are validated, the validator uses the RTR protocol (RFC 6810, updated for version 1 in RFC 8210) to communicate with client routers, which periodically request an updated database. The router uses this database along with policy to validate incoming BGP prefixes, a process called Route Origin Validation (ROV). ROV verifies that the origin ASN in the AS_PATH of the prefix NLRI matches the RPKI database. A communication flow diagram is given below. RPKI configuration examples are given in the implementation section. The Peering Fabric design was validated using the Routinator RPKI validator. Please see the security section for configuration of RPKI ROV in IOS-XR.

Next-Generation IXP Fabric
Introduced in Peering Fabric 2.0 is a modern design for IXP fabrics. The design creates a simplified, fault-tolerant L2VPN fabric with point-to-point and multi-point peer connectivity. Segment Routing brings a simplified MPLS underlay with resilience using TI-LFA and traffic engineering capabilities using Segment Routing Traffic Engineering Policies. Today’s IX fabrics utilize either traditional L2 networks or emulated L2 using VPLS with LDP/RSVP-TE underlays. The Cisco NG IX Fabric uses EVPN for all L2VPN services, replacing complicated LDP-signaled services with a scalable BGP control-plane. See the implementation section for more details on configuring the IX fabric underlay and EVPN services. The IX fabric can also utilize the NSO automation created in the Metro Fabric design for deploying EVPN VPWS (point-to-point) and multi-point EVPN ELAN services.

Validated Design
The Peering Fabric Design control, management, and forwarding planes have undergone validation testing to ensure individual design features work as intended and the peering fabric as a whole performs without fault. Validation is done exceeding real-world scaling requirements to ensure the design fulfills its role in existing networks with room for future growth.

Peering Fabric Design Use Cases

Traditional IXP Peering Migration to Peering Fabric
A traditional SP IXP design uses one or two large modular systems terminating all peering connections. In many cases, since providers are constrained on space and power, they use a collapsed design where the minimal set of peering nodes not only terminates peer connections but also provides services and core connectivity to the location. The Peering Fabric uses best-of-breed high density, low footprint hardware requiring much less space than older generation modular systems. Many older systems provide densities of approximately 4x100GE per rack unit, while Peering Fabric PFL nodes start at 24x100GE or 36x100GE per 1RU with high FIB capability. Due to the superior space efficiency, there is no longer a limitation of using just a pair of nodes for these functions.
In either a collapsed function or distributed function design, peers can be distributed across a number of devices to increase resiliency and lessen collateral impact when failures occur. The diagram below shows a fully distributed fabric, where peers are distributed across three PFL nodes, each with full connectivity to upstream PFS nodes.

Peering Fabric Extension
In some cases, there may be peering facilities within close geographic proximity which need to integrate into a single fabric. This may happen if there are multiple 3rd party facilities in a close geographic area, each with unique peers you want to connect to. There may also be multiple independent peering facilities within a small geographic area into which you do not wish to install a complete peering fabric. In those cases, connecting remote PFL nodes to a larger peering fabric can be done using optical transport or longer range gray optics.

Localized Metro Peering and Content Delivery
In order to drive greater network efficiency, content sources should be placed as close to the end destination as possible. Traditional wireline and wireless service providers have heavy inbound traffic from content providers delivering OTT video. Providers may also be delivering their own IP video services to on-net and off-net destinations via an SP CDN. Peering and internal CDN equipment can be placed within a localized peer or content delivery center, connected via a common peering fabric. In these cases the PFS nodes connect directly to the metro core to enable delivery across the region or metro.

Express Peering Fabric
An evolution of localized metro peering is to interconnect the PFS peering nodes directly or via a metro-wide peering core. The main driver for direct interconnection is minimizing the number of router and transport network interfaces traffic must pass through. High density optical muxponders such as the NCS1002, along with flexible photonic ROADM architectures enabled by the NCS2000, can help make the most efficient use of metro fiber assets.

Datacenter Edge Peering
In order to serve traffic as close to consumer endpoints as possible, a provider may construct a peering edge attached to an edge or central datacenter. As gateway functions in the network become virtualized for applications such as vPE, vCPE, and mobile 5G, the need to attach Internet peering to the SP DC becomes more important. The Peering Fabric supports interconnection to the DC via the SP core, or with the PFS nodes acting as leafs to the DC spine. These would act as traditional border routers in the DC design.

Peer Traffic Engineering with Segment Routing
Segment Routing performs efficient source routing of traffic across a provider network. Traffic engineering is particularly applicable to peering, as content providers look for ways to optimize egress network ports and eyeball providers work to reduce network hops between ingress and subscriber. There are also a number of advanced use cases based on using constraints to place traffic on optimal paths, such as latency. An SR-TE Policy represents a forwarding entity within the SR domain mapping traffic to a specific network path, defined statically on the node or computed by an external PCE. An additional benefit of SR is the ability to source route traffic based on a node SID or an anycast SID representing a set of nodes; a brief illustrative sketch of this follows.
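To make the node-SID and anycast-SID idea concrete, the sketch below shows one way an upstream ingress node could statically steer traffic toward a peering fabric: the first segment is an anycast prefix SID shared by the fabric's PFS nodes and the second is the node SID of a specific PFL. The syntax mirrors the SR Policy examples later in this document, but the label values, names, color, and endpoint address here are illustrative assumptions, not values from the validated design.

segment-routing
 traffic-eng
  segment-list PF1-ANYCAST-TO-PFL1
   index 1 mpls label 16500         ;Assumed anycast SID shared by the PF1 PFS nodes
   index 2 mpls label 16101         ;Assumed node SID of PFL1
  !
  policy to_pf1_pfl1
   color 20 end-point ipv4 192.0.2.101
   candidate-paths
    preference 100
     explicit segment-list PF1-ANYCAST-TO-PFL1
    !
   !
  !
 !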
ECMP behavior is preserved at each point in the network, redundancy is simplified, and traffic protection is supplied using TI-LFA. In the Low-Level Design we explore common peer engineering use cases. Much more information on Segment Routing technology and its future evolution can be found at http://segment-routing.net

ODN (On-Demand Next-Hop) for Peering
The 2.0 release of Peering Fabric introduces ODN as a method for dynamically provisioning SR-TE Policies to nodes based on specific “color” extended communities attached to advertised BGP routes. The color represents a set of constraints used for the provisioned SR-TE Policy, applied to traffic automatically steered into the Policy once the SR-TE Policy is instantiated. An applicable example is the use case where I have several types of peers on the same device sending traffic to destinations across my larger SP network. Some of this traffic may be Best Effort with no constraints, other traffic from cloud partners may be considered low-latency traffic, and traffic from a services partner may have additional constraints such as maintaining a path disjoint from the same peer on another router. Traffic in the reverse direction, egressing a peer from an SP location, can also utilize the same mechanisms to apply constraints to egress traffic.

DDoS Traffic Steering using SR-TE and EPE
SR-TE and Egress Peer Engineering can be utilized to direct DDoS traffic to a specific end node and a specific DDoS destination interface without the complexities of using VRFs to separate dirty/clean traffic. On ingress, traffic is immediately steered into an SR-TE Policy and no IP lookup is performed between the ingress node and the egress DDoS “dirty” interface. In the 3.0 design using IOS-XR 6.6.3, Flowspec redirects traffic to a next-hop IP pointing to a pre-configured “DDoS” SR Policy. An MPLS xconnect is used to map DDoS traffic carrying a specific EPE label on the egress node to a specific egress interface.

Low-Level Design

Integrated Peering Fabric Reference Diagram

Distributed Peering Fabric Reference Diagram

Peering Fabric Hardware Detail
The NCS5500 family of routers provides high density, high routing scale, ideal buffer sizes, and environmental efficiency to help providers satisfy any peering fabric use case. Due to high FIB scale, large buffers, and a broad XR feature set, all prescribed hardware can serve in either a collapsed or distributed fabric. Further detailed information on each platform can be found at https://www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html.

NCS-5501-SE
The NCS 5501 is a 1RU fixed router with 40x10GE SFP+ and 4x100GE QSFP28 interfaces. The 5501 has an IPv4 FIB scale of at least 2M routes. The 5501-SE is ideal as a peering leaf node when providers need 10GE interface flexibility such as ER, ZR, or DWDM.

NCS-55A1-36H-SE
The 55A1-36H-SE is a second generation 1RU NCS5500 fixed platform with 36 100GE QSFP28 ports operating at line rate. The -SE model contains an external TCAM increasing route scale to a minimum of 3M IPv4/512K IPv6 routes in its FIB. It also contains a powerful multi-core route processor with 64GB of RAM and an on-board 64GB SSD. Its high density, efficiency, and buffering capability make it ideal in 10GE or 100GE deployments. Peering fabrics can scale to much higher capacity 1RU at a time by simply adding additional 55A1-36H-SE spine nodes.

NCS-55A1-24H
The NCS-55A1-24H is a second generation 1RU NCS5500 fixed platform with 24 100GE QSFP28 ports. The device uses two 900Gbps NPUs, with 12x100GE ports connected to each NPU.
The 55A1-24H uses a high scale NPU with a minimum of 1.3M IPv4/256K IPv6 routes. At just 675W it is ideal for 10GE peering fabric deployments with a migration path to 100GE connectivity. The 55A1-24H also has a powerful multi-core processor and 32GB of RAM.

NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card
Very large peering fabric deployments, or those needing interface flexibility such as IPoDWDM connectivity, can use the modular NCS5500 series chassis. Large deployments can utilize the second-generation 36X100G-A-SE line card with external TCAM, supporting a minimum of 3M IPv4 routes.

NCS-55A2-MOD-SE-S
The NCS-55A2-MOD router is a 2RU router with 24x10G SFP+ interfaces, 16x25G SFP28 interfaces, and two Modular Port Adapter (MPA) slots with 400Gbps of full-duplex bandwidth. A variety of MPAs are available, adding additional 10GE, 100GE QSFP28, and 100G/200G CFP2 interfaces. The CFP2 interfaces support CFP2-DCO Digital Coherent Optics, simplifying deployment for peering extensions connected over dark fiber or DWDM multiplexers. The 55A2-MOD-SE-S uses a next-generation external TCAM with a minimum route scale of 3M IPv4/512K IPv6. The 55A2-MOD-SE-S also supports advanced security using BGP Flowspec and QPPB.

Peer Termination Strategy
Often overlooked when connecting to Internet peers is determining a strategy to maximize efficiency and resiliency within a local peering instance. Often a peer is connected to a single peering node, even when two nodes exist, for ease of configuration and coordination with the peering or transit partner. However, with minimal additional configuration and administration assisted by automation, even single peers can be spread across multiple edge peering nodes. Ideally, within a peering fabric, a peer is connected to each leaf in the fabric. In cases where this cannot be done, the provider should use capacity planning processes to balance peers and transit connections across multiple leafs in the fabric. The added resiliency leads to greater efficiency when failures do happen, with less reliance on peering capacity further away from the traffic destination.

Distributed Fabric Device Roles

PFL – Peering Fabric Leaf
The Peering Fabric Leaf is the node physically connected to external peers. Peers could be aggregation routers or 3rd party CDN nodes. In a deconstructed design the PFL is analogous to a line card in a modular chassis solution. PFL nodes can be added as capacity needs grow.

PFS – Peering Fabric Spine
The Peering Fabric Spine acts as an aggregation node for the PFLs and is also physically connected to the rest of the provider network. The provider network could refer to a metro core in the case of localized peering, a backbone core in relation to IXP peering, or a DC spine layer in the case of DC peering.

Device Interconnection
In order to maximize resiliency in the fabric, each PFL node is connected to each PFS. While the design shown includes three PFLs and two PFS nodes, there could be any number of PFL and PFS nodes, scaling horizontally to keep up with traffic and interface growth. PFL nodes are not connected to each other; the PFS nodes provide the capacity for any traffic between those nodes. The PFS nodes are also not interconnected to each other, as no end devices terminate on the PFS, only other routers.

Capacity Scaling
Capacity of the peering fabric is scaled horizontally. The uplink capacity from PFL to PFS is determined by an appropriate oversubscription factor derived from the service provider’s capacity planning exercises.
The leaf/spine architecture of the fabric connects each PFL to each PFS with equal capacity. In steady-state operation traffic is balanced between the PFS and PFL in both directions, maximizing the total capacity. The entropy in peering traffic generally ensures equal distribution across either ECMP paths or bundle interface member links in the egress direction. More information can be found in the forwarding plane section of the document. An example deployment may have two NC55-36X100G-A-SE spine nodes and two NC55A1-24H leaf nodes. In a 100GE peer deployment scenario each leaf would support 14x100GE client connections and 5x100GE to each spine node. A 10GE deployment would support 72x10GE client ports and 3x100GE to each spine, at a 1.2:1 oversubscription ratio (720 Gbps of client capacity over 600 Gbps of uplink capacity).

Peering Fabric Control Plane

PFL to Peer
The Peering Fabric Leaf is connected directly to peers via traditional EBGP. BFD may additionally be used for fault detection if agreed to by the peer. Each EBGP peer will utilize SR EPE to enable TE to the peer from elsewhere on the provider network.

PFL to PFS
PFL to Peering Fabric Spine uses widely deployed standard routing protocols. IS-IS is the prescribed IGP protocol within the peering fabric. Each PFS is configured with the same IS-IS L1 area. In the case where OSPF is being used as an IGP, the PFL nodes will reside in an OSPF NSSA area. The peering fabric IGP is SR-enabled, with the loopback of each PFL assigned a globally unique SR Node SID. Each PFL also has an IBGP session to each PFS to distribute its learned EBGP routes upstream and learn routes from elsewhere on the provider network. If a provider is distributing routes from PFL to PFL, or from another peering location to local PFLs, it is important to enable the BGP “best-path-external” feature to ensure the PFS has the routing information to accelerate re-convergence if it loses the more preferred path. Egress peer engineering will be enabled for EBGP peering connections, so that each peer or peer interface connected to a PFL is directly addressable by its Adj-Peer-SID from anywhere on the SP network. Adj-Peer-SID information is currently not carried in the IGP of the network. If utilized, it is recommended to distribute this information using BGP-LS to all controllers creating paths to the PFL EPE destinations. Each PFS node will be configured with IBGP multipath so traffic is load balanced to PFL nodes, increasing resiliency in the case of peer failure. On reception of a BGP withdraw update for a multipath route, traffic loss is minimized as the existing valid route is still programmed into the FIB.

PFS to Core
The PFS nodes will participate in the global Core control plane and act as the gateway between the peering fabric and the rest of the SP network. In order to create a more scalable and programmatic fabric, it is prescribed to use Segment Routing across the core infrastructure. IS-IS is the preferred protocol for transmitting SR SID information from the peering fabric to the rest of the core network and beyond. In deployments where it may be difficult to transition quickly to an all-SR infrastructure, the PFS nodes will also support OSPF and RSVP-TE for interconnection to the core. The PFS acts as an ABR or ASBR between the peering fabric and the larger metro or backbone core network.

SR Peer Traffic Engineering

Summary
SR allows a provider to create engineered paths to egress peering destinations or egress traffic destinations within the SP network.
A stack of globally addressable labels is created at the traffic entry point, requiring no additional protocol state at midpoints in the network and preserving qualities of normal IGP routing such as ECMP at each hop. The Peering Fabric proposes end-to-end visibility from the PFL nodes to the destinations and vice-versa. This allows a range of TE capabilities targeting a peering location, a peering exit node, or something as granular as a specific peering interface on a particular node. The use of anycast SIDs within a group of PFS nodes increases resiliency and load balancing capability.

Nodal EPE
Node EPE directs traffic to a specific peering node within the fabric. The node is targeted using first the PFS cluster anycast SID and then the specific PFL node SID.

Peer Interface EPE
This example uses an Egress Peer Engineering peer-adj-SID value assigned to a single peer interface. The result is that traffic sent along this SR path will use only the prescribed interface for egress traffic.

Abstract Peering
Abstract peering allows a provider to simply address a Peering Fabric by the anycast SIDs of its cluster of PFS nodes. In this case PHP is used for the anycast SIDs and traffic is simply forwarded as IP to the final destination across the fabric.

SR-TE On-Demand Next-Hop for Peering
SR-TE On-Demand Next-Hop is a method to dynamically create specific constraint-based tunnels across an SP network to and from edge peering nodes. ODN utilizes Cisco’s Segment Routing Path Computation Element (SR-PCE) to compute paths on demand based on the BGP next-hop and associated “color” communities. When a node receives a route with a specific community, it builds an SR-TE Policy to the BGP next-hop based on policy. One provider example is the case where I have DIA (Direct Internet Access) customers with different levels of service. I can create a specific SLA for “Gold” customers so their traffic takes a lower latency path across the network. In B2B peering arrangements, I can ensure voice or video traffic I am ingesting from a partner network takes priority. I can do this without creating a number of static tunnels on the network.

ODN Configuration
ODN requires a few components to be configured. In this example we tag routes coming from a specific provider with the color “BLUE”, which has a numerical value of 100. In IOS-XR we first define an extended community set defining our color with a unique string identifier of BLUE. This configuration should be present on both the ingress and egress nodes of the SR Policy.

extcommunity-set opaque BLUE
  100
end-set

The next step is to define an inbound routing policy on the PFL nodes tagging all inbound routes from PEER1 with the BLUE extended community.

route-policy PEER1-IN
  set community (65000:100)
  set local-preference 100
  set extcommunity color BLUE
  pass
end-policy

In order for the head-end node to process the color community and create an SR Policy with constraints, the color must be configured under SR Traffic Engineering. The following configuration defines a color value of 100, the same as our extended community BLUE, and instructs the router how to create the SR-TE Policy to the BGP next-hop address of the prefix received with the community. In this instance it instructs the router to utilize an external PCE, SR-PCE, to compute the path and to use the lower IGP metric path cost to reach the destination. Other options available are TE metric, latency, hop count, and others covered in the SR Traffic Engineering documentation found on cisco.com.

segment-routing
 traffic-eng
  on-demand color 100
   dynamic
    pcep
    !
    metric type igp

The head-end router will only create a single SR-TE Policy to the next-hop address; other prefixes matching the original next-hop and constraints will utilize the pre-existing tunnel. The tunnels are ephemeral, meaning they will not persist across router reboots.

IXP Fabric Low Level Design

Segment Routing Underlay
The underlay network used in the IXP Fabric design is the same as utilized in the regular Peering Fabric design. The validated IGP used for all iterations of the IXP Fabric is IS-IS, with all elements of the fabric belonging to the same Level 2 IS-IS domain.

EVPN L2VPN Services
Comprehensive configuration for EVPN L2VPN services is outside the scope of this document; please consult the Converged SDN Transport design guide or associated Cisco documentation for low level details on configuring EVPN VPWS and EVPN ELAN services. The Converged SDN Transport design guide can be found at the following URL: https://xrdocs.io/design/blogs/latest-converged-sdn-transport-hld

Peering Fabric Telemetry
Once a peering fabric is deployed, it is extremely important to monitor the health of the fabric as well as harness the wealth of data provided by the enhanced telemetry on the NCS5500 platform and IOS-XR. Through streaming data mechanisms such as Model-Driven Telemetry, BMP, and Netflow, providers can extract data useful for operations, capacity planning, security, and many other use cases. In the diagram below, the telemetry collection hosts could be a single system or distributed systems used for collection. The distributed design of the peering fabric enhances the ability to collect telemetry data by distributing resources across the fabric. Each PFL or PFS contains a modern multi-core CPU and at least 32GB of RAM (64GB in the NC55A1-36H-SE) to support not only built-in telemetry operation but also 3rd party applications a service or content provider may want to deploy to the node for additional telemetry. Examples of 3rd party telemetry applications include those storing temporary data for root-cause analysis if a node is isolated from the rest of the network, or performance measurement applications. The peering fabric also fully supports traditional collection methods such as SNMP, as well as NETCONF using YANG models, to integrate with legacy systems.

Telemetry Diagram

Model-Driven Telemetry
MDT uses standards-based or native IOS-XR YANG data models to stream operational state data from deployed devices. The ability to push statistics and state data from the device adds capabilities and efficiency not found using traditional SNMP. Sensors and collection hosts can be configured statically on the device (dial-out), or the set of sensors, collection hosts, and their attributes can be managed off-box using OpenConfig or native IOS-XR YANG models. Pipeline is Cisco’s open source collector, which can take MDT data as an input and output it via a plugin architecture supporting scalable message buses such as Kafka, or directly to a TSDB such as InfluxDB or Prometheus. The appendix contains information about MDT YANG paths relevant to the peering fabric and their applicability to PFS and PFL nodes.

BGP Monitoring Protocol
BMP, defined in RFC 7854, is a protocol to monitor BGP RIB information, updates, and protocol statistics. BMP was created to alleviate the burden of collecting BGP routing information using inefficient mechanisms like screen scraping. BMP has two primary modes, Route Monitoring mode and Route Mirroring mode, described in more detail below; a minimal export configuration sketch is shown first.
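The following minimal sketch shows how a BMP export target might be defined in IOS-XR; the per-neighbor bmp-activate server 1 statement that appears later in the EBGP peer configuration refers to a server definition of this form. The station address, port, and source interface are placeholder assumptions, and the exact set of available options varies by IOS-XR release.

bmp server 1
 host 192.0.2.50 port 5000        ;Assumed BMP monitoring station address and port
 update-source Loopback0          ;Source BMP sessions from the loopback interface
!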
In Route Monitoring mode, the device initially transmits its per-peer adj-rib-in contents to a monitoring station and continues to send updates as they occur on the monitored device. Setting the L flag in the RM header to 1 conveys a post-policy route, while 0 indicates pre-policy. Route Mirroring mode simply reflects all received BGP messages to the monitoring host. IOS-XR supports sending pre- and post-policy routing information and updates to a station via Route Monitoring mode. BMP can additionally send information on peer state change events, including why a peer went down in the case of a BGP event. There are drafts in the IETF process, led by Cisco, to extend BMP to report additional routing data, such as the loc-RIB and per-peer adj-RIB-out. The local-RIB is the full device RIB, including received BGP routes, routes from other protocols, and locally originated routes. Adj-RIB-out will add the ability to monitor routes advertised to peers pre and post routing policy.

Netflow / IPFIX
Netflow was invented by Cisco to meet requirements for traffic visibility and accounting. Netflow in its simplest form exports 5-tuple data for each flow traversing a Netflow-enabled interface. Netflow data is further enhanced with the inclusion of BGP information in the exported Netflow data, namely AS_PATH and destination prefix. This inclusion makes it possible to see where traffic originated by ASN and derive the destination for the traffic per BGP prefix. The latest iteration of Cisco Netflow is Netflow v9, with the next-generation IETF standardized version called IPFIX (IP Flow Information Export). IPFIX has expanded on Netflow’s capabilities by introducing hundreds of entities. Netflow is traditionally partially processed telemetry data: the device itself keeps a running cache table of flow entries and counters associated with packets, bytes, and flow duration. At certain time intervals, or when triggered by events, the flow entries are exported to a collector for further processing. The type 315 extension to IPFIX, supported on the NCS5500, does not process flow data on the device, but sends the raw sampled packet header to an external collector for all processing. Due to the high bandwidth, PPS rate, and large number of simultaneous flows on Internet routers, Netflow samples packets at a pre-configured rate for processing. Typical sampling values on peering routers are 1 in 8192 packets; however, customers implementing Netflow or IPFIX should work with Cisco to fine tune parameters for optimal data fidelity and performance.

Automation and Programmability

Cisco NSO Modules
Cisco Network Services Orchestrator is a widely deployed network automation and orchestration platform, performing intent-driven configuration and validation of networks from a single source-of-truth configuration database. The Peering design includes Cisco NSO modules to perform specific peering tasks such as peer turn-up, peer modification, and deploying routing policy and ACLs to multiple nodes, providing a jumpstart to peering automation. The following table highlights the currently available Peering NSO services. The current peering service models use the IOS-XR CLI NED and are validated with NSO 4.5.5.
peering-service: Manage full BGP and interface configuration for EBGP peers
peering-acl: Manage infrastructure ACLs referenced by the peering service
prefix-set: Manage IOS-XR prefix-sets
as-path-set: Manage IOS-XR as-path sets
route-policy: Manage XR routing policies for deployment to multiple peering nodes
peering-common: A set of services to manage as-path sets, community sets, and static routing policies
drain-service: Service to automate draining traffic away from a node under maintenance
telemetry: Service to enable telemetry sensors and export to a collector
bmp: Service to enable BMP on configured peers and export to a monitoring station
netflow: Service to enable Netflow on configured peer interfaces and export to a collector
PFL-to-PFS-Routing: Configures IGP and BGP routing between PFL and PFS nodes
PFS-Global-BGP: Configures global BGP parameters for PFS nodes
PFS-Global-ISIS: Configures global IS-IS parameters for PFS nodes

Netconf
Netconf is an industry standard method for configuring network devices. Standardized in RFC 6241, Netconf defines standard Remote Procedure Calls (RPCs) for manipulating configuration data and retrieving state data. Netconf on IOS-XR supports the candidate datastore, meaning configuration must be explicitly committed before it is applied to the running configuration.

YANG Model Support
While Netconf created standard RPCs for managing configuration on a device, it did not define a language for expressing configuration. The configuration payload carried by Netconf simply followed each vendor’s proprietary CLI configuration, formatted as XML without following any common semantics. YANG (Yet Another Next Generation) is a modeling language to express configuration using standard elements such as containers, groups, lists, and endpoint data called leafs. YANG 1.0 was defined in RFC 6020 and updated to version 1.1 in RFC 7950. Vendors cover the majority of device configuration and state using native YANG models unique to each vendor, but the industry is headed towards standardized models where applicable. Groups such as OpenConfig and the IETF are developing standardized YANG models allowing operators to write a configuration once across all vendors. Cisco has implemented a number of standard OpenConfig network models relevant to peering, including the BGP protocol, BGP RIB, and Interfaces models. The appendix contains information about YANG paths relevant to configuring the peering fabric and their applicability to PFS and PFL nodes.

3rd Party Hosted Applications
IOS-XR, starting in 6.0, runs on an x86 64-bit Linux foundation. The move to an open and well supported operating system, with XR components running on top of it, allows network providers to run 3rd party applications directly on the router. There are a wide variety of applications which can run on the XR host, with fast path interfaces in and out of the application. Example applications are telemetry collection, custom network probes, or tools to manage other portions of the network within a location.

XR Service Layer API
The XR Service Layer API is a gRPC-based API to extract data from a device as well as provide a very fast programmatic path into the router’s runtime state. One use case of SL API in the peering fabric is to directly program FIB entries on a device, overriding the default path selection. Using telemetry extracted from a peering fabric, an external controller can use the data and additional external constraints to programmatically direct traffic across the fabric.
SL API alsosupports transmission of event data via subscriptions.Recommended Device and Protocol ConfigurationOverviewThe following configuration guidelines will step through the majorcomponents of the device and protocol configuration specific to thepeering fabric and highlight non-default configuration recommended foreach device role and the reasons behind those choices. Complete exampleconfigurations for each role can be found in the Appendix of thisdocument. Configuration specific to telemetry is covered in section 4.Common Node ConfigurationThe following configuration is common to both PFL and PFS NCS5500 seriesnodes.Enable LLDP GloballylldpPFS NodesAs the PFS nodes will integrate into the core control-plane, onlyrecommended configuration for connectivity to the PFL nodes is given.IGP Configurationrouter isis pf-internal-core set-overload-bit on-startup wait-for-bgp is-type level-1-2 net <L2 NET> net <L1 PF NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10! interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10Segment Routing Traffic EngineeringIn IOS-XR there are two mechanisms for configuring SR-TE. Prior to IOS-XR 6.3.2 SR-TE was configured using the MPLS traffic engineering tunnel interface configuration. Starting in 6.3.2 SR-TE can now be configured using the more flexible SR-TE Policy model. The following examples show how to define a static SR-TE path from PFS node to exit PE node using both the legacy tunnel configuration model as well as the new SR Policy model.Paths to PE exit node being load balanced across two static P routers using legacy tunnel configexplicit-path name PFS1-P1-PE1-1 index 1 next-address 192.168.12.1 index 2 next-address 192.168.11.1!explicit-path name PFS1-P2-PE1-1 index 1 next-label 16221 index 2 next-label 16511!interface tunnel-te1 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.1 path-option 1 explicit name PFS1-P1-PE1-1 segment-routing!interface tunnel-te2 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.2 path-option 1 explicit name PFS1-P2-PE1-1 segment-routingIOS-XR 6.3.2+ SR Policy Configurationsegment-routingtraffic-eng segment-list PFS1-P1-PE1-SR-1 index 1 mpls label 16211 index 2 mpls label 16511 ! segment-list PFS1-P2-PE1-SR-1 index 1 mpls label 16221 index 2 mpls label 16511 ! policy pfs1_pe1_via_p1 binding-sid mpls 900001 color 1 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! ! policy pfs1_pe1_via_p2 binding-sid mpls 900002 color 2 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! 
!BGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATH is longer address-family ipv4 unicast additional-paths receive maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths receive bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF Model-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1PFL NodesPeer QoS PolicyPolicy applied to edge of the network to rewrite any incoming DSCP valueto 0.policy-map peer-qos-in class class-default set dscp default ! 
end-policy-map!Peer Infrastructure ACLSee the Security section of the document for recommended best practicesfor ingress and egress infrastructure ACLs.access-group v4-infra-acl-in access-group v6-infra-acl-in access-group v4-infra-acl-out access-group v6-infra-acl-out Peer Interface Configurationinterface TenGigE0/0/0/0 description “external peer” service-policy input peer-qos-in ;Explicit policy to rewrite DSCP to 0 lldp transmit disable #Do not run LLDP on peer connected interfaces lldp receive disable #Do not run LLDP on peer connected interfaces ipv4 access-group v4-infra-acl-in #IPv4 Ingress infrastructure ACL ipv4 access-group v4-infra-acl-out #IPv4 Egress infrastructure ACL, BCP38 filtering ipv6 access-group v6-infra-acl-in #IPv6 Ingress infrastructure ACL ipv6 access-group v6-infra-acl-out #IPv6 Egress infrastructure ACL, BCP38 filtering IS-IS IGP Configurationrouter isis pf-internal set-overload-bit on-startup wait-for-bgp is-type level-1 net <L1 Area NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10 ! interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10BGP Add-Path Route Policyroute-policy advertise-all ;Create policy for add-path advertisements set path-selection all advertiseend-policyBGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATh is longer address-family ipv4 unicast bgp attribute-download ;Enable BGP information for Netflow/IPFIX export additional-paths send additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv4 NLRI to PFS maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths send additional-paths receive additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv6 NLRI to PFS bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF EBGP Peer Configurationsession-group peer-session ignore-connected-check #Allow loopback peering over ECMP w/o EBGP Multihop egress-engineering #Allocate adj-peer-SID ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1af-group v4-af-peer address-family ipv4 unicast soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 1000 80;Set maximum inbound prefixes, warning at 80% 
thresholdaf-group v6-af-peer soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 100 80 #Set maximum inbound prefixes, warning at 80% thresholdneighbor-group v4-peer use session-group peer-session dmz-link-bandwidth ;Propagate external link BW address-family ipv4 unicast af-group v4-af-peerneighbor-group v6-peer use session-group peer-session dmz-link-bandwidth address-family ipv6 unicast af-group v6-af-peer neighbor 1.1.1.1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v4-peer address-family ipv4 unicast route-policy v4-peer-in(12345) in route-policy v4-peer-out(12345) out neighbor 2001#dead#b33f#0#1#1#1#1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v6-peer address-family ipv6 unicast route-policy v6-peer-in(12345) in route-policy v6-peer-out(12345) out PFL to PFS IBGP Configurationsession-group pfs-session ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1 update-source Loopback0 #Set BGP session source address to Loopback0 address af-group v4-af-pfs address-family ipv4 unicast next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v4-pfs-in in route-policy v4-pfs-out out af-group v6-af-pfs next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v6-pfs-in in route-policy v6-pfs-out out neighbor-group v4-pfs ! use session-group pfs-session address-family ipv4 unicast af-group v4-af-pfsneighbor-group v6-pfs ! use session-group pfs-session address-family ipv6 unicast af-group v6-af-pfs neighbor <PFS IP> description ~PFS #1~ remote-as <local ASN> use neighbor-group v4-pfsNetflow/IPFIX Configurationflow exporter-map nf-export version v9 options interface-table timeout 60 options sampler-table timeout 60 template timeout 30 ! transport udp <port> source Loopback0 destination <dest>flow monitor-map flow-monitor-ipv4 record ipv4 option bgpattr exporter nf-export cache entries 50000 cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-ipv6 record ipv6 option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-mpls record mpls ipv4-ipv6-fields option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10 sampler-map nf-sample-8192 random 1 out-of 8192Peer Interfaceinterface Bundle-Ether100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressPFS Upstream Interfaceinterface HundredGigE0/0/0/100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressModel-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. 
The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1Abstract Peering ConfigurationAbstract peering uses qualities of Segment Routing anycast addresses toallow a provider to steer traffic to a specific peering fabric by simplyaddressing a node SID assigned to all PFS members of the peeringcluster. All of the qualities of SR such as midpoint ECMP and TI-LFAfast protection are preserved for the end to end BGP path, improvingconvergence across the network to the peering fabric. Additionally,through the use of SR-TE Policy, source routed engineered paths can beconfigured to the peering fabric based on business logic and additionalpath constraints.PFS ConfigurationOnly the PFS nodes require specific configuration to perform abstractpeering. Configuration shown is for example only with IS-IS configuredas the IGP carrying SR information. The routing policy setting thenext-hop to the AP anycast SID should be incorporated into standard IBGPoutbound routing policy.interface Loopback1 ipv4 address x.x.x.x/32 ipv6 address x#x#x#x##x/128 router isis <ID> passive address-family ipv4 unicast prefix-sid absolute <Global IPv4 AP Node SID> address-family ipv6 unicast prefix-sid absolute <Global IPv6 AP Node SID> route-policy v4-abstract-ibgp-out set next-hop <Loopback1 IPv4 address> route-policy v6-abstract-ibgp-out set next-hop <Loopback1 IPv6 address> router bgp <ASN> ibgp policy out enforce-modifications ;Enables a PFS node to set a next-hop address on routes reflected to IBGP peersrouter bgp <ASN> neighbor x.x.x.x address-family ipv4 unicast route-policy v4-abstract-ibgp-out neighbor x#x#x#x##x address-family ipv6 unicast route-policy v6-abstract-ibgp-out BGP Flowspec Configuration and OperationBGP Flowspec consists of two different node types. The BGP Flowspec Server is where Flowspec policy is defined and sent to peers via BGP sessions with the BGP Flowspec IPv4 and IPv6 AFI/SAFI enabled. The BGP Flowspec Client receives Flowspec policy information and applies the proper dataplane match and action criteria via dynamic ACLs applied to each routerinterface. By default, IOS-XR applies the dynamic policy to all interfaces, with an interface-level configuration setting used to disable BGP Flowspec on specific interfaces.In the Peering Fabric, PFL nodes will act as Flowspec clients. The PFS nodes may act as Flowspec servers, but will never act as clients.Flowspec policies are typically defined on an external controller to be advertised to the rest of the network. The XRv-9000 virtual router works well in these instances. 
If one is using an external element to advertise Flowspec policies to the peering fabric, they should be advertised to the PFS nodes which will reflect them to the PFL nodes. In the absence of an external policy injector Flowspec policies can be defined on the Peering Fabric PFS nodes for advertisement to all PFL nodes. IPv6 Flowspec on the NCS5500 requires the use of the following global command, followed by a device reboot. hw-module profile flowspec ipv6-enableEnabling BGP Flowspec Address Families on PFS and PFL NodesFollowing the standard Peering Fabric BGP group definitions the following new groups are augmented. The following configuration assumes the PFS node is the BGP Flowspec server.PFSrouter bgp <ASN>address-family ipv4 flowspec address-family ipv6 flowspec af-group v4-flowspec-af-pfl address-family ipv4 flowspec multipath route-reflector-client next-hop-self af-group v6-flowspec-af-pfl address-family ipv4 flowspec multipath route-reflector-client next-hop-self neighbor-group v4-pfl address-family ipv4 flowspec use af-group v4-flowspec-af-pfl neighbor-group v6-pfl address-family ipv6 flowspec use af-group v6-flowspec-af-pfl PFLrouter bgp <ASN>address-family ipv4 flowspec address-family ipv6 flowspec af-group v4-flowspec-af-pfs address-family ipv4 flowspec multipath af-group v6-flowspec-af-pfs address-family ipv4 flowspec multipath neighbor-group v4-pfs address-family ipv4 flowspec use af-group v4-flowspec-af-pfl neighbor-group v6-pfs address-family ipv6 flowspec use af-group v6-flowspec-af-pfl BGP Flowspec Server Policy DefinitionPolicies are defined using the standard IOS-XR QoS Configuration, the first example below matches the recent memcached DDoS attack and drops all traffic. Additional examples are given covering various packet matching criteria and actions.class-map type traffic match-all memcached match destination-port 11211 match protocol udp tcp match destination-address ipv4 10.0.0.0 255.255.255.0 end-class-map!!policy-map type pbr drop-memcached class type traffic memcached drop ! class type traffic class-default ! end-policy-mapclass-map type traffic match-all icmp-echo-flood match protocol icmp match ipv4 icmp type 8 match destination-address ipv4 10.0.0.0 255.255.255.0 end-class-map!!policy-map type pbr limit-icmp-echo class type traffic memcached police rate 100 kbps ! class type traffic class-default ! end-policy-mapclass-map type traffic match-all dns match protocol udp match source port 53 end-class-map!!policy-map type pbr redirect-dns class type traffic dns police rate 100 kbps ! class type traffic class-default redirect nexthop 1.1.1.1 redirect nexthop route-target 1000#1 ! end-policy-mapBGP Flowspec Server EnablementThe following global configuration will enable the Flowspec server and advertisethe policy via the BGP Flowspec NLRIflowspec address-family ipv4 service-policy type pbr drop-memcachedBGP Flowspec Client ConfigurationThe following global configuration enables the BGP Flowspec client function and installation of policies on all local interfaces. Flowspec can be disabled on individual interfaces using the [ipv4|ipv6] flowspec disable command in interface configuration mode.flowspec address-family ipv4 local-install interface-all QPPB Configuration and OperationQoS Policy Propagation using BGP is described in more detail in the Security section.QPPB applies standard QoS policies to packets matching BGP prefix criteria such as BGP community or AS Path. QPPB is supported for both IPv4 and IPv6 address families and packets. 
QPPB on the NCS5500 supports matching destination prefix attributes only. QPPB configuration starts with a standard RPL route policy that matches BGP attributes and sets a specific QoS group based on that criteria. This routing policy is applied to each address-family as a table-policy in the global BGP configuration. A standard MQC QoS policy is then defined using the specific QoS groups as match criteria to apply additional QoS behavior such as filtering, marking, or policing. This policy is applied to a logical interface, with a specific QPPB command used to enable the propagation of BGP data as part of the dataplane ACL packet match criteria. IPv6 QPPB on the NCS5500 requires the use of the following global command, followed by a device reboot. hw-module profile qos ipv6 short Routing Policy Configuration route-policy qppb-test if community matches-every (1000#1) then set qos-group 1 endif if community matches-every (1000#2) then set qos-group 2 endif end-policy Global BGP Configuration router bgp <ASN> address-family ipv4 unicast table-policy qppb-test address-family ipv6 unicast table-policy qppb-test QoS Policy Definition class-map match-any qos-group-1 match qos-group 1 end-class-map class-map match-any qos-group-2 match qos-group 2 end-class-map policy-map remark-peer-traffic class qos-group-1 set precedence 5 set mpls experimental imposition 5 ! class qos-group-2 set precedence 3 set mpls experimental imposition 3 ! class class-default ! end-policy-map Interface-Level Configuration interface gigabitethernet0/0/0/1 service-policy input remark-peer-traffic ipv4 bgp policy propagation input qos-group destination ipv6 bgp policy propagation input qos-group destination BGP Graceful Shutdown BGP graceful shutdown is an IETF standard mechanism for notifying an IBGP or EBGP peer that the peer will be going offline. Graceful shutdown uses a well-known community, the GSHUT community (65535#0), on each prefix advertised to a peer so the peer can match the community and perform an action to move traffic gracefully away from the peer before it goes down. In the example in the peering design we will lower the local preference on the route. Outbound graceful shutdown configuration Graceful shutdown is part of the graceful maintenance configuration within BGP. Graceful maintenance can also perform an AS prepend operation when activated. Sending the GSHUT community is enabled using the send-community-gshut-ebgp command under each address family. Graceful maintenance is enabled using the “activate” keyword in the configuration for the neighbor, neighbor-group, or globally for the BGP process. neighbor 1.1.1.1 graceful-maintenance as-prepends 3 address-family ipv4 unicast send-community-gshut-ebgp ! address-family ipv6 unicast send-community-gshut-ebgp Inbound graceful shutdown configuration Inbound prefixes tagged with the GSHUT community should be processed with a local-preference of 0 applied so that if there is another path for the traffic it can be utilized prior to the peer going down. The following is a simple example of a community-set and routing policy to perform this. This could also be added to an existing peer routing policy. community-set graceful-shutdown 65535#0 end-set ! route-policy gshut-inbound if community matches-any graceful-shutdown then set local-preference 0 endif end-policy Activating graceful shutdown Graceful maintenance can be activated globally or for a specific neighbor/neighbor-group. To enable graceful shutdown use the activate keyword under the “graceful-maintenance” configuration context. 
Without the “all-neighbors” flag maintenance will only be enabled for peers with their own graceful-maintenance configuration. The activate command is persistent. Global router bgp 100 graceful-maintenance activate [ all-neighbors ] Individual neighbor router bgp 100 neighbor 1.1.1.1 graceful-maintenance activate Peers in specific neighbor-group neighbor-group peer-group graceful-maintenance activate Security Peering by definition is at the edge of the network, where security is mandatory. While not exclusive to peering, there are a number of best practices and software features that, when implemented, will protect your own network as well as others from malicious sources within your network. Peering and Internet in a VRF Using VRFs to isolate peers and the Internet routing table from the infrastructure can enhance security by keeping internal infrastructure components separate from Internet and end user reachability. VRF separation can be done in one of three different ways# Separate each peer into its own VRF, use default VRF on SP Network Single VRF for all “Internet” endpoints, including peers Separate each peer into its own VRF, and use a separate “Internet” VRF VRF per Peer, default VRF for Internet In this method each peer, or group of peers, is configured under a separate VRF. The SP carries these and all other routes via the default VRF in IOS-XR, commonly known as the Global Routing Table. The VPNv4 and VPNv6 address families are NOT configured on the BGP peering sessions between the PFL and PFS nodes and the PFS nodes and the rest of the network. IOS-XR provides the command import from default-vrf and export to default-vrf with a route-policy to match specific routes to be imported to/from each peer VRF to the default VRF. This provides dataplane isolation between peers and another mechanism to determine which SP routes are advertised to each peer. Internet in a VRF Only In this method all Internet endpoints are configured in the same “Internet” VRF. The security benefit is removing dataplane connectivity between the global Internet and your underlying infrastructure, which is using the default VRF for all internal connectivity. This method uses the VPNv4/VPNv6 address families on all BGP peers and requires the Internet VRF be configured on all peering fabric nodes as well as SP PEs participating in the global routing table. If there are VPN customers or public-facing services in their own VRF needing Internet access, routes can be imported/exported from the Internet VRF on the PE devices they attach to. VRF per Peer, Internet in a VRF This method combines the properties and configuration of the previous two methods for a solution with dataplane isolation per peer and separation of all public Internet traffic from the SP infrastructure layer. The exchange of routes between the peer VRFs and Internet VRF takes place on the PFL nodes with the rest of the network operating the same as the Internet in a VRF use case. The VPNv4 and VPNv6 address families must be configured across all routers in the network. Infrastructure ACLs Infrastructure ACLs and their associated ACEs (Access Control Entries) are the perimeter protection for a network. The recommended PFL device configuration uses IPv4 and IPv6 infrastructure ACLs on all edge interfaces. These ACLs are specific to each provider’s security needs, but should include the following sections# Filter IPv4 and IPv6 BOGON space ingress and egress. Drop ingress packets with a source address matching your own aggregate IPv4/IPv6 prefixes. Rate-limit ingress traffic to Unix services typically used in DDoS attacks, such as chargen (TCP/19). On ingress and egress, allow specific ICMP types and rate-limit them to appropriate values, filtering out ones not needed on your network. ICMP ttl-exceeded, host unreachable, port unreachable, echo-reply, echo-request, and fragmentation needed should always be allowed in some capacity. 
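To make the structure concrete, the following skeleton is for illustration only and is not part of the validated design# the ACL name, the 198.51.100.0/24 aggregate, and the interfaces shown are placeholders, and the rate-limiting of ICMP and small UDP/TCP services called out above would normally be applied separately via an ingress QoS policy or LPTS rather than in the ACL itself.

ipv4 access-list PEER-EDGE-V4-IN
 10 deny ipv4 198.51.100.0/24 any
 20 deny tcp any any eq 19
 30 permit icmp any any echo
 40 permit icmp any any echo-reply
 50 permit icmp any any ttl-exceeded
 60 permit icmp any any port-unreachable
 100 permit ipv4 any any
!
interface TenGigE0/0/0/0
 ipv4 access-group PEER-EDGE-V4-IN ingress
!

A matching ipv6 access-list would be applied in the same way with the ipv6 access-group command, and BOGON filtering would typically appear as additional deny entries at the top of the list.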
BCP Implementation Best Current Practices are informational documents published by the IETF to give guidelines on operational practices. This document will not outline the contents of the recommended BCPs, but two in particular are of interest to Internet peering. BCP38 explains the need to filter unused address space at the edges of the network, minimizing the chances of spoofed traffic from DDoS sources reaching their intended target. BCP38 is applicable for ingress traffic and especially egress traffic, as it stops spoofed traffic before it reaches outside your network. BCP194, BGP Operations and Security, covers a number of BGP operational practices, many of which are used in Internet peering. IOS-XR supports all of the mechanisms recommended in BCP38, BCP84, and BCP194, including software features such as GTSM (TTL security), BGP dampening, and prefix limits. BGP Attribute and CoS Scrubbing Scrubbing of data on ingress and egress of your network is an important security measure. Scrubbing falls into two categories, control-plane and dataplane. The control-plane for Internet peering is BGP, and there are a few BGP transitive attributes one should take care to normalize. Your internal BGP communities should be deleted from outbound BGP NLRI via egress policy. Since you are most often setting communities on inbound prefixes, make sure you are replacing existing communities from the peer and not simply adding to them. Unless you have an agreement with the peer, normalize the MED attribute to zero or another standard value on all inbound prefixes. In the dataplane, it’s important to treat the peering edge as untrusted and clear any CoS markings on inbound packets, assuming a prior agreement hasn’t been reached with the peer to carry them across the network boundary. It’s an overlooked aspect which could lead to peer traffic being prioritized on your network, leading to unexpected network behavior. An example PFL infrastructure ACL is given resetting incoming IPv4/IPv6 DSCP values to 0. 
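As an illustrative sketch only (the community range, community value, MED value, and policy names below are placeholders and not part of the peering fabric design), the control-plane normalization described above could be expressed in RPL along these lines#

community-set internal-communities
  65000#[1000..2000]
end-set
!
route-policy v4-peer-out
  delete community in internal-communities
  pass
end-policy
!
route-policy v4-peer-in
  set community (65000#1100)
  set med 0
  pass
end-policy

Note that set community without the additive keyword replaces any communities received from the peer, which matches the guidance above to replace rather than add to peer communities. These statements would normally be folded into the existing inbound and outbound peer policies rather than applied on their own.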
Per-Peer Control Plane Policers BGP protocol packets are handled at the RP level, meaning each packet is handled by the router CPU with limited bandwidth and processing resources. In the case of a malicious or misconfigured peer this could exhaust the processing power of the CPU, impacting other important tasks. IOS-XR enforces protocol policers and BGP peer policers by default. BGP Prefix Security RPKI Origin Validation Prefix hijacking has been prevalent throughout the last decade as the Internet became more integrated into our lives. This led to the creation of RPKI origin validation, a mechanism to validate a prefix was being originated by its rightful owner by checking the originating ASN vs. a secure database. IOS-XR fully supports RPKI for origin validation. BGP RPKI and ROV Configuration The following section outlines an example configuration for RPKI and Route Origin Validation (ROV) within IOS-XR. Create ROV Routing Policies In order to apply specific attributes to routes tagged with an ROV status, one must use a routing policy. The “invalid”, “valid”, and “unconfigured” states can be matched upon and then used to set specific BGP attributes as well as accept or drop the route. In the following example a route’s local-preference attribute is set based on ROV status. route-policy rpki if validation-state is invalid then set local-preference 50 endif if validation-state is not-found then set local-preference 75 endif if validation-state is valid then set local-preference 100 endif pass end-policy Configure RPKI Server and ROV Options An RPKI server is defined using the “rpki server” section under the global BGP hierarchy. Also configurable is whether or not the ROV status is taken into account as part of the BGP best path selection process. A route with a “valid” status is preferred over a route with a “not-found” or “invalid” status. There is also a configuration option for whether or not to allow invalid routes at all as part of the selection process. Both options are shown in the example below. router bgp 65536 bgp router-id 192.168.0.1 rpki server 172.16.0.254 transport tcp port 32000 refresh-time 120 bgp bestpath origin-as use validity bgp bestpath origin-as allow invalid Enabling RPKI ROV on BGP Neighbors ROV is done at the global BGP level, but the treatment of routes is done at the neighbor level. This requires applying the pre-defined ROV route-policy to the neighbors you wish to apply policy to based on ROV status. neighbor 192.168.0.254 remote-as 64555 address-family ipv4 unicast route-policy rpki in Communicating ROV Status via Well-Known BGP Community RPKI ROV is typically only done on the edges of the network, and in IOS-XR is only done on EBGP sessions. In a network with multiple ASNs under the same administrative control, one should configure the following to signal ROV validation status via a well-known community to peers within the same administrative domain. This way only the nodes connected to external peers have RTR sessions to the RPKI ROV validators and are responsible for applying ROV policy, adding efficiency to the process and reducing load on the validator. address-family ipv4 unicast bgp origin-as validation signal ibgp BGPSEC (Reference Only) RPKI origin validation works to validate the source of a prefix, but does not validate the entire path of the prefix. Origin validation also does not use cryptographic signatures to ensure the originator is who they say they are, so spoofing the ASN as well does not stop someone from hijacking a prefix. BGPSEC is an evolution where a BGP prefix is cryptographically signed with the key of its valid originator, and each BGP router receiving the path checks to ensure the prefix originated from the valid owner. BGPSEC standards are being worked on in the SIDR working group. Cisco continues to monitor the standards related to BGPSEC and similar technologies to determine which to implement to best serve our customers. DDoS traffic steering using SR-TE See the overview design section for more details. This shows the configuration of a single SR-TE Policy which will balance traffic to two different egress DDoS “dirty” interfaces. If a BGP session is enabled between the DDoS mitigation appliance and the router, an EPE label can be assigned to the interface. In the absence of EPE, an MPLS static LSP can be created on the core-facing interfaces on the egress node, with the action set to “pop” towards the DDoS mitigation interface. SR-TE Policy configuration In this example the node SID is 16441. The EPE or manual xconnect SIDs for the specific egress interfaces are 28000 and 28001. 
The weight of each path is 100, so traffic will be equally balanced across the paths. segment-routing traffic-eng segment-list pr1-ddos-1 index 1 mpls label 16441 index 2 mpls label 28000 segment-list pr1-ddos-2 index 1 mpls label 16441 index 2 mpls label 28001 policy pr1_ddos1_epe color 999 end-point ipv4 192.168.14.4 candidate-paths preference 100 explicit segment-list pr1-ddos-1 weight 100 ! explicit segment-list pr1-ddos-2 weight 100 Egress node BGP configuration On the egress BGP node, 192.168.14.4, prefixes are set with a specific “DDoS” color to enable the ingress node to steer traffic into the correct SR Policy. An example is given of injecting the 50.50.50.50/32 route with the “DDoS” color of 999. extcommunity-set opaque DDOS 999 end-set ! route-policy SET-DDOS-COLOR set extcommunity color DDOS pass end-policy ! router static address-family ipv4 unicast 50.50.50.50/32 null0 ! ! router bgp 100 address-family ipv4 unicast network 50.50.50.50/32 route-policy SET-DDOS-COLOR ! ! Egress node MPLS static LSP configuration If EPE is not being utilized, the last label in the SR Policy path must be matched to a static LSP. The ingress label on the egress node is used to map traffic to a specific IP next-hop and interface. We will give an example using the label 28000 in the SR Policy path. The core-facing ingress interface is HundredGigE0/0/0/1, and the egress DDoS “dirty” interface is TenGigE0/0/0/1 with a NH address of 192.168.100.1. mpls static interface HundredGigE0/0/0/1 lsp ddos-interface-1 in-label 28000 allocate forward path 1 nexthop TenGigE0/0/0/1 192.168.100.1 out-label pop ! !! Appendix Applicable YANG Models Each model and the data it covers# openconfig-interfaces, Cisco-IOS-XR-infra-statsd-oper, Cisco-IOS-XR-pfi-im-cmd-oper - Interface config and state, common counters found in SNMP IF-MIB. openconfig-if-ethernet, Cisco-IOS-XR-drivers-media-eth-oper - Ethernet layer config and state, XR native transceiver monitoring. openconfig-platform - Inventory, transceiver monitoring. openconfig-bgp, Cisco-IOS-XR-ipv4-bgp-oper, Cisco-IOS-XR-ipv6-bgp-oper - BGP config and state, including neighbor session state, message counts, etc. openconfig-bgp-rib, Cisco-IOS-XR-ip-rib-ipv4-oper, Cisco-IOS-XR-ip-rib-ipv6-oper - BGP RIB information. Note# Cisco native includes all protocols. openconfig-routing-policy - Configure routing policy elements and combined policy. openconfig-telemetry - Configure telemetry sensors and destinations. Cisco-IOS-XR-ip-bfd-cfg, Cisco-IOS-XR-ip-bfd-oper - BFD config and state. Cisco-IOS-XR-ethernet-lldp-cfg, Cisco-IOS-XR-ethernet-lldp-oper - LLDP config and state. openconfig-mpls - MPLS config and state, including Segment Routing. Cisco-IOS-XR-clns-isis-cfg, Cisco-IOS-XR-clns-isis-oper - IS-IS config and state. Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper - NCS 5500 HW resources. NETCONF YANG Paths Note that while paths are given to retrieve data from a specific leaf node, it is sometimes more efficient to retrieve all the data under a specific heading and let a management station filter unwanted data than perform operations on the router. Additionally, Model Driven Telemetry may not work at a leaf level, requiring retrieval of an entire subset of data. The data is also available via NETCONF, which does allow subtree filters and retrieval of specific data. However, this is a more resource intensive operation on the router. 
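As an example of retrieving a subtree rather than individual leaves, the following NETCONF RPC is a minimal sketch (the message-id and the choice of subtree are illustrative) that pulls the name and full counter set for every interface under openconfig-interfaces in a single get operation#

<rpc message-id=~102~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~>
 <get>
  <filter>
   <interfaces xmlns=~http#//openconfig.net/yang/interfaces~>
    <interface>
     <name/>
     <state>
      <counters/>
     </state>
    </interface>
   </interfaces>
  </filter>
 </get>
</rpc>

A management station would then filter or aggregate the returned counters, rather than issuing one RPC per leaf against the router.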
Metric Data Logical Interface Admin State Enum SNMP OID IF-MIB#ifAdminStatus OC YANG openconfig-interfaces#interfaces/interface/state/admin-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Interface Operational State Enum SNMP OID IF-MIB#ifOperStatus OC YANG openconfig-interfaces#interfaces/interface/state/oper-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Last State Change (seconds) Counter SNMP OID IF-MIB#ifLastChange OC YANG openconfig-interfaces#interfaces/interface/state/last-change Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/last-state-transition-time     Logical Interface SNMP ifIndex Integer SNMP OID IF-MIB#ifIndex OC YANG openconfig-interfaces#interfaces/interface/state/if-index Native YANG Cisco-IOS-XR-snmp-agent-oper#snmp/interface-indexes/if-index     Logical Interface RX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCInOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-received     Logical Interface TX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCOutOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-sent     Logical Interface RX Errors Counter SNMP OID IF-MIB#ifInErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-errors MDT Native     Logical Interface TX Errors Counter SNMP OID IF-MIB#ifOutErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-errors     Logical Interface Unicast Packets RX Counter SNMP OID IF-MIB#ifHCInUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Unicast Packets TX Counter SNMP OID IF-MIB#ifHCOutUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Input Drops Counter SNMP OID IF-MIB#ifIntDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-drops     Logical Interface Output Drops Counter SNMP OID IF-MIB#ifOutDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-drops     Ethernet Layer Stats – All Interfaces Counters SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics     Ethernet PHY State – All Interfaces Counters SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info     Ethernet Input CRC Errors 
Counter SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state/oc-eth#counters/oc-eth#in-crc-errors Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics/statistic/dropped-packets-with-crc-align-errors The following transceiver paths retrieve the total power for the transceiver. There are specific per-lane power levels which can be retrieved from both native and OC models; please refer to the model YANG file for additional information.     Ethernet Transceiver RX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-rx-power     Ethernet Transceiver TX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#output-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-tx-power BGP Operational State Global BGP Protocol State IOS-XR native models do not store route information in the BGP Oper model; it is stored in the IPv4/IPv6 RIB models. These models contain RIB information based on protocol, with a numeric identifier for each protocol, the BGP ProtoID being 5. The protoid must be specified or the YANG path will return data for all configured routing protocols.     BGP Total Paths (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-paths Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/num-active-paths MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native BGP Neighbor State Example Usage Due to the construction of the YANG model, the neighbor-address key must be included as a container in all OC BGP state RPCs. 
The following RPC getsthe session state for all configured peers#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address/> <state> <session-state/> </state> </neighbor> </neighbors> </bgp> </filter> </get></rpc>\t<nc#rpc-reply message-id=~urn#uuid#24db986f-de34-4c97-9b2f-ac99ab2501e3~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address>172.16.0.2</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> </neighbors> </bgp> </nc#data></nc#rpc-reply>     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Session State for all BGP neighbors Enum SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/session-state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/connection-state     Message counters for all BGP neighbors Counter SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/messages Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/message-statistics Current queue depth for all BGP neighborsCounterSNMP OIDNAOC YANG/openconfig-bgp#bgp/neighbors/neighbor/state/queuesNative YANGCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-outCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-inBGP RIB DataRIB data is retrieved per AFI/SAFI. To retrieve IPv6 unicast routesusing OC models, replace “ipv4-unicast” with “ipv6-unicast”IOS-XR native models do not have a BGP specific RIB, only RIB dataper-AFI/SAFI for all protocols. 
Retrieving RIB information from thesepaths will include this data.While this data is available via both NETCONF and MDT, it is recommendedto use BMP as the mechanism to retrieve RIB table data.Example UsageThe following retrieves a list of best-path IPv4 prefixes withoutattributes from the loc-RIB#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <loc-rib> <routes> <route> <prefix/> <best-path>true</best-path> </route> </routes> </loc-rib> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc>     IPv4 Local RIB – Prefix Count Counter OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/num-routes Native YANG       IPv4 Local RIB – IPv4 Prefixes w/o Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes/route/prefix     IPv4 Local RIB – IPv4 Prefixes w/Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes Native YANG   The following per-neighbor RIB paths can be qualified with a specificneighbor address to retrieve RIB data for a specific peer. Below is anexample of a NETCONF RPC to retrieve the number of post-policy routesfrom the 192.168.2.51 peer and the returned output.<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes/> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc><nc#rpc-reply message-id=~urn#uuid#7d9a0468-4d8d-4008-972b-8e703241a8e9~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <afi-safi-name xmlns#idx=~http#//openconfig.net/yang/rib/bgp-types~>idx#IPV4_UNICAST</afi-safi-name> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes>3</num-routes> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </nc#data></nc#rpc-reply>     IPv4 Neighbor adj-rib-in pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-re     IPv4 Neighbor adj-rib-in post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-post     IPv4 Neighbor adj-rib-out pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre     IPv4 Neighbor adj-rib-out post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre BGP Flowspec     BGP Flowspec Operational State Counters SNMP OID NA OC YANG NA Native YANG Cisco-IOS-XR-flowspec-oper MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native Device Resource YANG Paths     Device Inventory List OC YANG oc-platform#components     NCS5500 Dataplane Resources List OC YANG NA Native YANG 
Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Validated Model-Driven Telemetry Sensor PathsThe following represents a list of validated sensor paths useful formonitoring the Peering Fabric and the data which can be gathered byconfiguring these sensorpaths.Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform openconfig-platform#components cisco-ios-xr-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data cisco-ios-xr-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info cisco-ios-xr-shellutil-oper#system-time/uptime cisco-ios-xr-wdsysmon-fd-oper#system-monitoring/cpu-utilizationLLDP MonitoringCisco-IOS-XR-ethernet-lldp-oper#lldpCisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighborsInterface statistics and stateopenconfig-interfaces#interfacesCisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-countersCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interfaceCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statisticsCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics/basic-interface-statsThe following sub-paths can be used but it is recommended to use the base openconfig-interfaces modelopenconfig-interfaces#interfaces/interfaceopenconfig-interfaces#interfaces/interface/stateopenconfig-interfaces#interfaces/interface/state/countersopenconfig-interfaces#interfaces/interface/subinterfaces/subinterface/state/countersAggregate bundle information (use interface models for interface counters)sensor-group openconfig-if-aggregate#aggregatesensor-group openconfig-if-aggregate#aggregate/statesensor-group openconfig-lacp#lacpsensor-group Cisco-IOS-XR-bundlemgr-oper#bundlessensor-group Cisco-IOS-XR-bundlemgr-oper#bundle-information/bfd-countersBGP Peering informationsensor-path openconfig-bgp#bgpsensor-path openconfig-bgp#bgp/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticsIS-IS IGP informationsensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighborssensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfacessensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/adjacenciesIt is not 
recommended to monitor complete RIB tables using MDT but can be used for troubleshootingCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countQoS and ACL monitoringopenconfig-acl#aclCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/general-statsCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/queue-stats-arrayBGP RIB informationIt is not recommended to monitor these paths using MDT with large tablesopenconfig-rib-bgp#bgp-ribCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intRouting policy InformationCisco-IOS-XR-policy-repository-oper#routing-policy/policies", "url": "/blogs/2020-01-10-peering-fabric-hld/", "author": "Phil Bedard", "tags": "iosxr, design, peering, ddos, ixp" } , "blogs-2020-10-01-cst-implementation-guide-3-5": { "title": "Converged SDN Transport Implementation Guide", "content": " On This Page Version Targets Testbed Overview Devices Key Resources to Allocate Role-Based Router Configuration IOS-XR Router Configuration Underlay physical interface configuration with BFD MPLS Performance Measurement Interface delay metric dynamic configuration Interface delay metric static configuration SR-MPLS Transport Segment Routing SRGB and SRLB Definition IGP protocol (ISIS) and Segment Routing MPLS configuration IS-IS router configuration IS-IS Loopback and node SID configuration Unnumbered Interfaces Unnumbered Interface IS-IS Database Anycast SID ABR node configuration IS-IS logical interface configuration with TI-LFA Segment Routing Data Plane Monitoring MPLS Segment Routing Traffic Engineering (SRTE) configuration MPLS Segment Routing Traffic Engineering (SRTE) TE metric configuration IOS-XE Nodes - SR-MPLS Transport Segment Routing MPLS configuration Prefix-SID assignment to loopback 0 configuration IGP protocol (ISIS) with Segment Routing MPLS configuration TI-LFA FRR configuration IS-IS and MPLS interface configuration MPLS Segment Routing Traffic Engineering (SRTE) Area Border Routers (ABRs) IGP-ISIS Redistribution configuration (IOS-XR) Redistribute Core SvRR and TvRR loopback into Access domain Redistribute Access SR-PCE and SvRR loopbacks into CORE domain Multicast transport using mLDP Overview mLDP core configuration LDP base configuration with defined 
interfaces LDP auto-configuration G.8275.1 and G.8275.2 PTP (1588v2) timing configuration Summary Enable frequency synchronization Optional Synchronous Ethernet configuration (PTP hybrid mode) PTP G.8275.2 global timing configuration PTP G.8275.2 interface profile definitions IPv4 G.8275.2 master profile IPv6 G.8275.2 master profile IPv4 G.8275.2 slave profile IPv6 G.8275.2 slave profile PTP G.8275.1 global timing configuration IPv6 G.8275.1 slave profile IPv6 G.8275.1 master profile Application of PTP profile to physical interface G.8275.2 interface configuration G.8275.1 interface configuration Segment Routing Path Computation Element (SR-PCE) configuration BGP - Services (sRR) and Transport (tRR) route reflector configuration Services Route Reflector (sRR) configuration Transport Route Reflector (tRR) configuration BGP – Provider Edge Routers (A-PEx and PEx) to service RR IOS-XR configuration IOS-XE configuration BGP-LU co-existence BGP configuration Segment Routing Global Block Configuration Boundary node configuration PE node configuration Area Border Routers (ABRs) IGP topology distribution Segment Routing Traffic Engineering (SRTE) and Services Integration On Demand Next-Hop (ODN) configuration – IOS-XR On Demand Next-Hop (ODN) configuration – IOS-XE SR-PCE configuration – IOS-XR SR-PCE configuration – IOS-XE SR-TE Policy Configuration SR-TE Color and Endpoint SR-TE Candidate Paths Service to SR-TE Policy Forwarding SR-TE Configuration Examples SR Policy using IGP computation, head-end computation SR Policy using lowest IGP metric computation and PCEP SR Policy using lowest latency metric and PCEP SR Policy using explicit segment list QoS Implementation Summary Core QoS configuration Class maps used in QoS policies Core ingress classifier policy Core egress queueing map Core egress MPLS EXP marking map H-QoS configuration Enabling H-QoS on NCS 540 and NCS 5500 Example H-QoS policy for 5G services Class maps used in ingress H-QoS policies Parent ingress QoS policy H-QoS ingress child policies Egress H-QoS parent policy (Priority levels) Egress H-QoS child using priority only Egress H-QoS child using reserved bandwidth Egress H-QoS child using shaping Services End-To-End VPN Services End-To-End VPN Services Data Plane L3VPN MP-BGP VPNv4 On-Demand Next-Hop Access Router Service Provisioning (IOS-XR) Access Router Service Provisioning (IOS-XE) L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Access Router Service Provisioning (IOS-XR)# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# L2VPN EVPN E-Tree IOS-XR Root Node Configuraiton IOS-XR Leaf Node Configuration Hierarchical Services L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Router Service Provisioning (IOS-XR)# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# Ethernet CFM for L2VPN service assurance Maintenance Domain configuration MEP configuration for EVPN-VPWS services Multicast NG-MVPN Profile 14 using mLDP and ODN L3VPN 
Multicast core configuration Unicast L3VPN PE configuration Multicast PE configuration Multicast distribution using TreeSID with static S,G Mapping TreeSID SR-PCE Configuration Endpoint Set Configuration P2MP TreeSID SR Policy Configuration TreeSID Common Config on All Nodes Segment Routing Local Block PCEP Configuration TreeSID Source Node Multicast Configuration TreeSID Receiver Node Multicast Configuration Global Routing Table Multicast mVPN Multicast Configuration TreeSID Verification on PCE End-To-End VPN Services Data Plane Hierarchical Services L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Router Service Provisioning (IOS-XR)# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# Remote PHY CIN Implementation Summary Sample QoS Policies Class maps RPD and DPIC interface policy maps Core QoS CIN Timing Configuration Example CBR-8 RPD DTI Profile Multicast configuration Summary Global multicast configuration - Native multicast Global multicast configuration - LSM using profile 14 PIM configuration - Native multicast PIM configuration - LSM using profile 14 IGMPv3/MLDv2 configuration - Native multicast IGMPv3/MLDv2 configuration - LSM profile 14 IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation) RPD DHCPv4/v6 relay configuration Native IP / Default VRF RPHY L3VPN cBR-8 DPIC interface configuration without Link HA cBR-8 DPIC interface configuration with Link HA cBR-8 Digital PIC Interface Configuration RPD interface configuration P2P L3 BVI RPD/DPIC agg device IS-IS configuration Additional configuration for L3VPN Design Global VRF Configuration BGP Configuration Model-Driven Telemetry Configuration Summary Device inventory and monitoring Interface Data LLDP Monitoring Aggregate bundle information (use interface models for interface counters) PTP and SyncE Information BGP Information IS-IS Information Routing protocol RIB information BGP RIB information Routing policy Information EVPN Information Per-Interface QoS Statistics Information Per-Policy, Per-Interface, Per-Class statistics L2VPN Information SR-PCE PCC and SR Policy Information MPLS performance measurement mLDP Information ACL Information VersionThe following aligns to and uses features from Converged SDN Transport 3.5, pleasesee the overview High Level Design document at https#//xrdocs.io/design/blogs/latest-converged-sdn-transport-hldTargets Hardware# ASR 9000 as Centralized Provider Edge (C-PE) router NCS 5500, NCS 560, and NCS 55A2 as Aggregation and Pre-Aggregation router NCS 5500 as P core router ASR 920, NCS 540, and NCS 5500 as Access Provider Edge (A-PE) cBR-8 CMTS with 8x10GE DPIC for Remote PHY Compact Remote PHY shelf with three 1x2 Remote PHY Devices (RPD) Software# IOS-XR 7.1.2 on ASR 9000, NCS 560, NCS 540, NCS 5500, and NCS 55A2 routers IOS-XE 16.12.03 on ASR 920 IOS-XE 16.10.1f on cBR-8 Key technologies Transport# End-To-End Segment-Routing Network Programmability# SR- TE Inter-Domain LSPs with On-DemandNext Hop Network Availability# TI-LFA/Anycast-SID Services# BGP-based L2 and L3 Virtual 
Private Network services(EVPN and L3VPN/mVPN) Network Timing# G.8275.1 and G.8275.2 Network Assurance# 802.1ag Testbed OverviewFigure 1# Compass Converged SDN Transport High Level TopologyFigure 2# Testbed Physical TopologyFigure 3# Testbed Route-Reflector and SR-PCE physical connectivityFigure 4# Testbed IGP DomainsDevicesAccess PE (A-PE) Routers Cisco NCS5501-SE (IOS-XR) – A-PE7 Cisco NCS540 (IOS-XR) - A-PE1, A-PE2, A-PE3, A-PE8 Cisco ASR920 (IOS-XE) – A-PE4, A-PE5, A-PE6, A-PE9Pre-Aggregation (PA) Routers Cisco NCS5501-SE (IOS-XR) – PA3, PA4Aggregation (AG) Routers Cisco NCS5501-SE (IOS-XR) – AG2, AG3, AG4 Cisco NCS 560-4 w/RSP-4E (IOS-XR) - AG1High-scale Provider Edge Routers Cisco ASR9000 w/Tomahawk Line Cards (IOS-XR) – PE1, PE2, PE3, PE4Area Border Routers (ABRs) Cisco ASR9000 (IOS-XR) – PE3, PE4 Cisco 55A2-MOD-SE - PA2 Cisco NCS540 - PA1Service and Transport Route Reflectors (RRs) Cisco IOS XRv 9000 – tRR1-A, tRR1-B, sRR1-A, sRR1-B, sRR2-A, sRR2-B,sRR3-A, sRR3-BSegment Routing Path Computation Element (SR-PCE) Cisco IOS XRv 9000 – SRPCE-A1-A, SRPCE-A1-B, SRPCE-A2-A, SRPCE-A2-A, SRPCE-CORE-A, SRPCE-CORE-BKey Resources to Allocate IP Addressing IPv4 address plan IPv6 address plan, recommend dual plane day 1 Plan for SRv6 in the future Color communities for ODN Segment Routing Blocks SRGB (segment-routing address block) Keep in mind anycast SID for ABR node pairs Allocate 3 SIDs for potential future Flex-algo use SRLB (segment routing local block) Local significance only Can be quite small and re-used on each node IS-IS unique instance identifiers for each domainRole-Based Router ConfigurationIOS-XR Router ConfigurationUnderlay physical interface configuration with BFDinterface TenGigE0/0/0/10 bfd mode ietf bfd address-family ipv4 timers start 180 bfd address-family ipv4 multiplier 3 bfd address-family ipv4 destination 10.1.2.1 bfd address-family ipv4 fast-detect bfd address-family ipv4 minimum-interval 50 mtu 9216 ipv4 address 10.15.150.1 255.255.255.254 ipv4 unreachables disable load-interval 30 dampeningMPLS Performance MeasurementInterface delay metric dynamic configurationStarting with CST 3.5 we now support end to end dynamic link delay measurements across all IOS-XR nodes. The feature in IOS-XR is called Performance Measurement and all configuration is found under the performance-measurement configuration hierarchy. There are a number of configuration options utilized when configuring performance measurement, but the below configuration will enable one-way delay measurements on physical links. The probe measurement-mode options are either one-way or two-way One-way mode requires nodes be time synchronized to a common PTP clock, and should be used if available. In the absence of a common PTP clock, two-mode can be used which calculates the one-way delay using multiple timestamps at the querier and responder.The advertisement options specify when the advertisements are made into the IGP. The periodic interval sets the minimum interval, with the threshold setting the difference required to advertise a new delay value. The accelerated threshold option sets a percentage change required to trigger and advertisement prior to the periodic interval timer expiring. 
Performance measurement takes a series of measurements within each computation interval and uses this information to derive the min, max, and average link delay.Full documentation on Performance Measurement can be found at# https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/segment-routing/72x/b-segment-routing-cg-ncs5500-72x/configure-performance-measurement.htmlPlease note while this is the IOS-XR 7.2.1 documentation it also applies to IOS-XR 7.1.2.performance-measurement interface TenGigE0/0/0/20 delay-measurement ! ! interface TenGigE0/0/0/21 delay-measurement ! ! protocol twamp-light measurement delay unauthenticated querier-dst-port 12345 ! ! ! delay-profile interfaces advertisement accelerated threshold 25 ! periodic interval 120 threshold 10 ! ! probe measurement-mode two-way protocol twamp-light computation-interval 60 ! !!endInterface delay metric static configurationIn the absence of dynamic realtime one-way latency monitoring for physical interfaces, the interface delay can be set manually. The one-way delay measurement value is used when computing SR Policy paths with the “latency” constraint type. The configured value is advertised in the IGP using extensions defined in RFC 7810, and advertised to the PCE using BGP-LS extensions. Keep in mind the delay metric value is defined in microseconds, so if you are mixing dynamic computation with static values they should be set appropriately.performance-measurement interface TenGigE0/0/0/10 delay-measurement advertise-delay 15000 interface TenGigE0/0/0/20 delay-measurement advertise-delay 10000SR-MPLS TransportSegment Routing SRGB and SRLB DefinitionIt’s recommended to first configure the Segment Routing Global Block (SRGB) across all nodes needing connectivity between each other. In most instances a single SRGB will be used across the entire network. In a SR MPLS deployment the SRGB and SRLB correspond to the label blocks allocated to SR. IOS-XR has a maximum configurable SRGB limit of 512,000 labels, however please consult platform-specific documentation for maximum values. The SRLB corresponds to the labels allocated for SIDs local to the node, such as Adjacency-SIDs. It is recommended to configure the same SRLB block across all nodes. The SRLB must not overlap with the SRGB. The SRGB and SRLB are configured in IOS-XR with the following configuration#segment-routing global-block 16000 23999 local-block 15000 15999 IGP protocol (ISIS) and Segment Routing MPLS configurationKey chain global configuration for IS-IS authenticationkey chain ISIS-KEY key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 00071A150754 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5 IS-IS router configurationAll routers, except Area Border Routers (ABRs), are part of one IGPdomain and L2 area (ISIS-ACCESS or ISIS-CORE). Area border routersrun two IGP IS-IS processes (ISIS-ACCESS and ISIS-CORE). Note that Loopback0 is part of both IGP processes.router isis ISIS-ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0101.0000.0110.00 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 5 secondary-wait 100 lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password keychain ISIS-KEY address-family ipv4 unicast metric-style wide advertise link attributes spf-interval maximum-wait 1000 initial-wait 5 secondary-wait 100 segment-routing mpls spf prefix-priority high tag 1000 maximum-redistributed-prefixes 100 level 2 ! 
address-family ipv6 unicast metric-style wide spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 maximum-redistributed-prefixes 100 level 2Note# ABR Loopback 0 on domain boundary is part of both IGP processes together with same “prefix-sid absolute” valueNote# The prefix SID can be configured as either absolute or index. The index configuration is required for interop with nodes using a different SRGB.IS-IS Loopback and node SID configuration interface Loopback0 ipv4 address 100.0.1.50 255.255.255.255 address-family ipv4 unicast prefix-sid absolute 16150 tag 1000 Unnumbered InterfacesIS-IS and Segment Routing/SR-TE utilized in the Converged SDN Transport design supports using unnumbered interfaces. SR-PCE used to compute inter-domain SR-TE paths also supports the use of unnumbered interfaces. In the topology database each interface is uniquely identified by a combination of router ID and SNMP IfIndex value.Unnumbered interface configurationinterface TenGigE0/0/0/2 description to-AG2 mtu 9216 ptp profile My-Slave port state slave-only local-priority 10 ! service-policy input core-ingress-classifier service-policy output core-egress-exp-marking ipv4 point-to-point ipv4 unnumbered Loopback0 frequency synchronization selection input priority 10 wait-to-restore 1 !!Unnumbered Interface IS-IS DatabaseThe IS-IS database will reference the node SNMP IfIndex valueMetric# 10 IS-Extended A-PE1.00 Local Interface ID# 1075, Remote Interface ID# 40 Affinity# 0x00000000 Physical BW# 10000000 kbits/sec Reservable Global pool BW# 0 kbits/sec Global Pool BW Unreserved# [0]# 0 kbits/sec [1]# 0 kbits/sec [2]# 0 kbits/sec [3]# 0 kbits/sec [4]# 0 kbits/sec [5]# 0 kbits/sec [6]# 0 kbits/sec [7]# 0 kbits/sec Admin. Weight# 90 Ext Admin Group# Length# 32 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 Link Average Delay# 1 us Link Min/Max Delay# 1/1 us Link Delay Variation# 0 us Link Maximum SID Depth# Label Imposition# 12 ADJ-SID# F#0 B#1 V#1 L#1 S#0 P#0 weight#0 Adjacency-sid#24406 ADJ-SID# F#0 B#0 V#1 L#1 S#0 P#0 weight#0 Adjacency-sid#24407Anycast SID ABR node configurationAnycast SIDs are SIDs existing on two more ABR nodes to offer a redundant fault tolerant path for traffic between Access PEs and remote PE devices. In CST 3.5 and above, anycast SID paths can either be manually configured on the head-end or computed by the SR-PCE. When SR-PCE computes a path it will inspect the topology database to ensure the next SID in the computed segment list is reachable from all anycast nodes. If not, the anycast SID will not be used. The same IP address and prefix-sid must be configured on all shared anycast nodes, with the n-flag clear option set. Note when anycast SID path computation is used with SR-PCE, only IGP metrics are supported.IS-IS Configuration for Anycast SIDrouter isis ACCESS interface Loopback100 ipv4 address 100.100.100.1 255.255.255.255 address-family ipv4 unicast prefix-sid absolute 16150 n-flag clear tag 1000 Conditional IGP Loopback advertisement While not the only use case for conditional advertisement, it is a required component when using anycast SIDs with static segment list. Conditional advertisement will not advertise the Loopback interface if certain routes are not found in the RIB. If the anycast Loopback is withdrawn, the segment list will be considered invalid on the head-end node. 
The conditional prefixes should be all or a subset of prefixes from the adjacent IGP domain.route-policy check if rib-has-route in async remote-prefixes pass endif end-policyprefix-set remote-prefixes 100.0.2.52, 100.0.2.53router isis ACCESS interface Loopback100 address-family ipv4 unicast advertise prefix route-policy checkIS-IS logical interface configuration with TI-LFAIt is recommended to use manual adjacency SIDs. A protected SID is eligible for backup path computation, meaning if a packet ingresses the node with the label a backup path will be provided in case of a link failure. In the case of having multiple adjacencies between the same two nodes, use the same adjacency-sid on each link. Unnumbered interfaces are configured using the same configuration. interface TenGigE0/0/0/10 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa adjacency-sid absolute 15002 protected metric 100 ! address-family ipv6 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 Segment Routing Data Plane MonitoringIn CST 3.5 we introduce SR DPM across all IOS-XR platforms. SR DPM uses MPLS OAM mechanisms along with specific SID lists in order to exercise the dataplane of the originating node, detecting blackholes typically difficult to diagnose. SR DPM ensures the nodes SR-MPLS forwarding plane is valid without a drop in traffic towards adjacent nodes and other nodes in the same IGP domain. SR DPM is a proactive approach to blackhole detection and mitigation.SR DPM first performs interface adjacency checks by sending an MPLS OAM packet to adjacent nodes using the interface adjacency SID and its own node SID in the SID list. This ensures the adjacent node is sending traffic back to the node correctly.Once this connectivity is verified, SR DPM will then test forwarding to all other node SIDs in the IGP domain across each adjacency. This is done by crafting a MPLS OAM packet with SID list {Adj-SID, Target Node SID} with TTL=2. The packet is sent to the adjacent node, back to the SR DPM testing node, and then onto the target node via SR-MPLS forwarding. The downstream node towards the target node will receive the packet with TTL=0 and send an MPLS OAM response to the SR DPM originating node. This communicates valid forwarding across the originating node towards the target node.It is recommended to enable SR DPM on all CST IOS-XR nodes.SR Data Plane Monitoring Configurationmpls oam dpm pps 10 interval 60 (minutes) MPLS Segment Routing Traffic Engineering (SRTE) configurationThe following configuration is done at the global ISIS configuration level and should be performed for all IOS-XR nodes.router isis ACCESS address-family ipv4 unicast mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0MPLS Segment Routing Traffic Engineering (SRTE) TE metric configurationThe TE metric is used when computing SR Policy paths with the “te” or “latency” constraint type. The TE metric is carried as a TLV within the TE opaque LSA distributed across the IGP area and to the PCE via BGP-LS.The TE metric is used in the CST 5G Transport use case. If no TE metric is defined the local CSPF or PCE will utilize the IGP metric.segment-routing traffic-eng interface TenGigE0/0/0/6 metric 1000IOS-XE Nodes - SR-MPLS TransportSegment Routing MPLS configurationmpls label range 6001 32767 static 16 6000segment-routing mpls ! set-attributes address-family ipv4 sr-label-preferred exit-address-family ! global-block 16000 24999 ! 
Prefix-SID assignment to loopback 0 configuration connected-prefix-sid-map address-family ipv4 100.0.1.51/32 index 151 range 1 exit-address-family ! IGP protocol (ISIS) with Segment Routing MPLS configurationkey chain ISIS-KEY key 1 key-string cisco accept-lifetime 00#00#00 Jan 1 2018 infinite send-lifetime 00#00#00 Jan 1 2018 infinite!router isis ACCESS net 49.0001.0102.0000.0254.00 is-type level-2-only authentication mode md5 authentication key-chain ISIS-KEY metric-style wide fast-flood 10 set-overload-bit on-startup 120 max-lsp-lifetime 65535 lsp-refresh-interval 65000 spf-interval 5 50 200 prc-interval 5 50 200 lsp-gen-interval 5 5 200 log-adjacency-changes segment-routing mpls segment-routing prefix-sid-map advertise-local TI-LFA FRR configuration fast-reroute per-prefix level-2 all fast-reroute ti-lfa level-2 microloop avoidance protected!interface Loopback0 ip address 100.0.1.51 255.255.255.255 ip router isis ACCESS isis circuit-type level-2-onlyend IS-IS and MPLS interface configurationinterface TenGigabitEthernet0/0/12 mtu 9216 ip address 10.117.151.1 255.255.255.254 ip router isis ACCESS mpls ip isis circuit-type level-2-only isis network point-to-point isis metric 100end MPLS Segment Routing Traffic Engineering (SRTE)router isis ACCESS mpls traffic-eng router-id Loopback0 mpls traffic-eng level-2 Area Border Routers (ABRs) IGP-ISIS Redistribution configuration (IOS-XR)The ABR nodes must provide IP reachability for RRs, SR-PCEs and NSO between ISIS-ACCESS and ISIS-CORE IGP domains. This is done by IPprefix redistribution. The ABR nodes have static hold-down routes for the block of IP space used in each domain across the network, those static routes are then redistributed into the domains using the redistribute static command with a route-policy. The distance command is used to ensure redistributed routes are not preferred over local IS-IS routes on the opposite ABR. The distance command must be applied to both ABR nodes.router staticaddress-family ipv4 unicast 100.0.0.0/24 Null0 100.0.1.0/24 Null0 100.1.0.0/24 Null0 100.1.1.0/24 Null0prefix-set ACCESS-PCE_SvRR-LOOPBACKS 100.0.1.0/24, 100.1.1.0/24end-setprefix-set RR-LOOPBACKS 100.0.0.0/24, 100.1.0.0/24end-set Redistribute Core SvRR and TvRR loopback into Access domainroute-policy CORE-TO-ACCESS1 if destination in RR-LOOPBACKS then pass else drop endifend-policy!router isis ACCESS address-family ipv4 unicast distance 254 0.0.0.0/0 RR-LOOPBACKS redistribute static route-policy CORE-TO-ACCESS1 Redistribute Access SR-PCE and SvRR loopbacks into CORE domainroute-policy ACCESS1-TO-CORE if destination in ACCESS-PCE_SvRR-LOOPBACKS then pass else drop endif end-policy ! router isis CORE address-family ipv4 unicast distance 254 0.0.0.0/0 ACCESS-PCE_SvRR-LOOPBACKS redistribute static route-policy CORE-TO-ACCESS1 Multicast transport using mLDPOverviewThis portion of the implementation guide instructs the user how to configure mLDP end to end across the multi-domain network. Multicast service examples are given in the “Services” section of the implementation guide.mLDP core configurationIn order to use mLDP across the Converged SDN Transport network LDP must first be enabled. There are two mechanisms to enable LDP on physical interfaces across the network, LDP auto-configuration or manually under the MPLS LDP configuration context. The capabilities statement will ensure LDP unicast FECs are not advertised, only mLDP FECs. Recursive forwarding is required in a multi-domain network. 
mLDP must be enabled on all participating A-PE, PE, AG, PA, and P routers.
LDP base configuration with defined interfaces
mpls ldp capabilities sac mldp-only mldp logging notifications address-family ipv4 make-before-break delay 30 forwarding recursive recursive-fec ! ! router-id 100.0.2.53 session protection address-family ipv4 ! interface TenGigE0/0/0/6 ! interface TenGigE0/0/0/7
LDP auto-configuration
LDP can automatically be enabled on all IS-IS interfaces with the following configuration in the IS-IS configuration. It is recommended to do this only after configuring all MPLS LDP properties.
router isis ACCESS address-family ipv4 unicast segment-routing mpls sr-prefer mpls ldp auto-config
G.8275.1 and G.8275.2 PTP (1588v2) timing configuration
Summary
This section contains the base configurations used for both G.8275.1 and G.8275.2 timing. Please see the CST 3.0 HLD for an overview of timing in general.
Enable frequency synchronization
In order to lock the internal oscillator to a PTP source, frequency synchronization must first be enabled globally.
frequency synchronization quality itu-t option 1 clock-interface timing-mode system log selection changes!
Optional Synchronous Ethernet configuration (PTP hybrid mode)
If the end-to-end devices support SyncE it should be enabled. SyncE allows much faster frequency sync and maintains integrity for long periods of time during holdover events. Using SyncE for frequency and PTP for phase is known as “hybrid” mode. A lower priority is used on the SyncE input (50 for SyncE vs. 100 for PTP).
interface TenGigE0/0/0/10 frequency synchronization selection input priority 50 !!
PTP G.8275.2 global timing configuration
As of CST 3.0, IOS-XR supports a single PTP timing profile and single clock type in the global PTP configuration. The clock domain should follow the ITU-T guidelines for specific profiles, using a domain >44 for G.8275.2 clocks.
ptp clock domain 60 profile g.8275.2 clock-type T-BC ! frequency priority 100 time-of-day priority 50 log servo events best-master-clock changes !
PTP G.8275.2 interface profile definitions
It is recommended to use “profiles” defined globally which are then applied to interfaces participating in timing. This helps minimize per-interface timing configuration. It is also recommended to define different profiles for “master” and “slave” interfaces.
IPv4 G.8275.2 master profile
The master profile is assigned to interfaces for which the router is acting as a boundary clock.
ptp profile g82752_master_v4 transport ipv4 port state master-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 5 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !!
IPv6 G.8275.2 master profile
The master profile is assigned to interfaces for which the router is acting as a boundary clock.
ptp profile g82752_master_v6 transport ipv6 port state master-only sync frequency 16 clock operation one-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !!
IPv4 G.8275.2 slave profile
The slave profile is assigned to interfaces for which the router is acting as a slave to another master clock.
ptp profile g82752_slave_v4 transport ipv4 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !!
IPv6 G.8275.2 slave profile
The slave profile is assigned to interfaces for which the router is acting as a slave to another master clock.
ptp profile g82752_slave_v6 transport ipv6 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !!
PTP G.8275.1 global timing configuration
As of CST 3.0, IOS-XR supports a single PTP timing profile and single clock type in the global PTP configuration. The clock domain should follow the ITU-T guidelines for specific profiles, using a domain <44 for G.8275.1 clocks.
ptp clock domain 24 operation one-step <-- Use one-step for NCS series, two-step for ASR 9000 physical-layer-frequency frequency priority 100 profile g.8275.1 clock-type T-BC log servo events best-master-clock changes
G.8275.1 slave profile
The slave profile is assigned to interfaces for which the router is acting as a slave to another master clock.
ptp profile g82751_slave port state slave-only clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 delay-request frequency 16 multicast transport ethernet !!
G.8275.1 master profile
The master profile is assigned to interfaces for which the router is acting as a master to slave devices.
ptp profile g82751_master port state master-only clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step sync frequency 16 announce timeout 10 announce interval 1 delay-request frequency 16 multicast transport ethernet !!
Application of PTP profile to physical interface
Note: In CST 3.0 PTP may only be enabled on physical interfaces. G.8275.1 operates at L2 and supports PTP across Bundle member links and interfaces that are part of a bridge domain. G.8275.2 operates at L3 and does not support Bundle interfaces or BVI interfaces.
G.8275.2 interface configuration
This example is of a slave device using a master of 2405:10:23:253::0.
interface TenGigE0/0/0/6 ptp profile g82752_slave_v6 master ipv6 2405:10:23:253:: ! !
G.8275.1 interface configuration
interface TenGigE0/0/0/6 ptp profile g82751_slave ! !
Segment Routing Path Computation Element (SR-PCE) configuration
router static address-family ipv4 unicast 0.0.0.0/1 Null0
router bgp 100 nsr bgp router-id 100.0.0.100 bgp graceful-restart graceful-reset bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR !!
pce address ipv4 100.100.100.1 rest user rest_user password encrypted 00141215174C04140B ! authentication basic ! state-sync ipv4 100.100.100.2 peer-filter ipv4 access-list pe-routers!
BGP - Services (sRR) and Transport (tRR) route reflector configuration
Services Route Reflector (sRR) configuration
In the CST validation an sRR is used to reflect all service routes. In a production network each service could be allocated its own sRR based on resiliency and scale demands.
router static address-family ipv4 unicast 0.0.0.0/1 Null0
router bgp 100 nsr bgp router-id 100.0.0.200 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast nexthop trigger-delay critical 10 additional-paths receive additional-paths send !
address-family vpnv6 unicast nexthop trigger-delay critical 10 additional-paths receive additional-paths send retain route-target all ! address-family l2vpn evpn additional-paths receive additional-paths send ! address-family ipv4 mvpn nexthop trigger-delay critical 10 soft-reconfiguration inbound always ! address-family ipv6 mvpn nexthop trigger-delay critical 10 soft-reconfiguration inbound always ! neighbor-group SvRR-Client remote-as 100 bfd fast-detect bfd minimum-interval 3 update-source Loopback0 address-family l2vpn evpn route-reflector-client ! address-family vpnv4 unicast route-reflector-client ! address-family vpnv6 unicast route-reflector-client ! address-family ipv4 mvpn route-reflector-client ! address-family ipv6 mvpn route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group SvRR-Client !! Transport Route Reflector (tRR) configurationrouter static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.10 bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state additional-paths receive additional-paths send ! neighbor-group RRC remote-as 100 update-source Loopback0 address-family link-state link-state route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group RRC ! neighbor 100.0.0.2 use neighbor-group RRC! BGP – Provider Edge Routers (A-PEx and PEx) to service RREach PE router is configured with BGP sessions to service route-reflectors for advertising VPN service routes across the inter-domain network.IOS-XR configurationrouter bgp 100 nsr bgp router-id 100.0.1.50 bgp graceful-restart graceful-reset bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! address-family l2vpn evpn ! neighbor-group SvRR remote-as 100 bfd fast-detect bfd minimum-interval 3 update-source Loopback0 address-family vpnv4 unicast soft-reconfiguration inbound always ! address-family vpnv6 unicast soft-reconfiguration inbound always ! address-family ipv4 mvpn soft-reconfiguration inbound always ! address-family ipv6 mvpn soft-reconfiguration inbound always ! address-family l2vpn evpn soft-reconfiguration inbound always ! ! neighbor 100.0.1.201 use neighbor-group SvRR ! ! IOS-XE configurationrouter bgp 100 bgp router-id 100.0.1.51 bgp log-neighbor-changes no bgp default ipv4-unicast neighbor SvRR peer-group neighbor SvRR remote-as 100 neighbor SvRR update-source Loopback0 neighbor 100.0.1.201 peer-group SvRR ! address-family ipv4 exit-address-family ! address-family vpnv4 neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! address-family l2vpn evpn neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! BGP-LU co-existence BGP configurationCST 3.0 introduced co-existence between services using BGP-LU and SR endpoints. If you are using SR and BGP-LU within the same domain it requires using BGP-SR in order to resolve prefixes correctly on the each ABR. BGP-SR uses a new BGP attribute attached to the BGP-LU prefix to convey the SR prefix-sid index end to end across the network. Using the same prefix-sid index both within the SR-MPLS IGP domain and across the BGP-LU network simplifies the network from an operational perspective since the path to an end node can always be identified by that SID.It is recommended to enable the BGP-SR configuration when enabling SR on the PE node. 
See the PE configuration below for an example of this configuration.Segment Routing Global Block ConfigurationThe BGP process must know about the SRGB in order to properly allocate local BGP-SR labels when receiving a BGP-LU prefix with a BGP-SR index community. This is done via the following configuration. If a SRGB is defined under the IGP it must match the global SRGB value. The IGP will inherit this SRGB value if none is previously defined.segment-routing global-block 32000 64000 !! Boundary node configurationThe following configuration is necessary on all domain boundary nodes. Note the ibgp policy out enforce-modifications command is required to change the next-hop on reflected IBGP routes.router bgp 100 ibgp policy out enforce-modifications neighbor-group BGP-LU-PE remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast soft-reconfiguration inbound always route-reflector-client next-hop-self ! ! neighbor-group BGP-LU-PE remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast soft-reconfiguration inbound always route-reflector-client next-hop-self ! ! neighbor 100.0.2.53 use neighbor-group BGP-LU-PE ! neighbor 100.0.2.52 use neighbor-group BGP-LU-PE ! neighbor 100.0.0.1 use neighbor-group BGP-LU-BORDER ! neighbor 100.0.0.2 use neighbor-group BGP-LU-BORDER ! ! PE node configurationThe following configuration is necessary on all domain PE nodes participating in BGP-LU/BGP-SR. The label-index set must match the index of the Loopback addresses being advertised into BGP. This example shows a single Loopback address being advertised into BGP.route-policy LOOPBACK-INTO-BGP-LU($SID-LOOPBACK0) set label-index $SID-LOOPBACK0 set aigp-metric igp-costend-policy!router bgp 100 address-family ipv4 unicast network 100.0.2.53/32 route-policy LOOPBACK-INTO-BGP-LU(153) ! neighbor-group BGP-LU-BORDER remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast ! ! neighbor 100.0.0.3 use neighbor-group BGP-LU-BORDER ! neighbor 100.0.0.4 use neighbor-group BGP-LU-BORDER ! Area Border Routers (ABRs) IGP topology distributionNext network diagram# “BGP-LS Topology Distribution” shows how AreaBorder Routers (ABRs) distribute IGP network topology from ISIS ACCESSand ISIS CORE to Transport Route-Reflectors (tRRs). tRRs then reflecttopology to Segment Routing Path Computation Element (SR-PCEs). Each SR-PCE has full visibility of the entire inter-domain network.Note# Each IS-IS process in the network requires a unique instance-id to identify itself to the PCE.Figure 5# BGP-LS Topology Distributionrouter isis ACCESS **distribute link-state instance-id 101** net 49.0001.0101.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0 !! router isis CORE **distribute link-state instance-id 100** net 49.0001.0100.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0 !! router bgp 100 **address-family link-state link-state** ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR ! Segment Routing Traffic Engineering (SRTE) and Services IntegrationThis section shows how to integrate Traffic Engineering (SRTE) withservices. ODN is configured by first defining a global ODN color associated with specific SR Policy constraints. 
The color and BGP next-hop address on the service route will be used to dynamically instantiate a SR Policy to the remote VPN endpoint.On Demand Next-Hop (ODN) configuration – IOS-XRsegment-routing traffic-eng logging policy status ! on-demand color 100 dynamic pce ! metric type igp ! ! ! pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! !extcommunity-set opaque BLUE 100end-setroute-policy ODN_EVPN set extcommunity color BLUEend-policyrouter bgp 100 address-family l2vpn evpn route-policy ODN_EVPN out !! On Demand Next-Hop (ODN) configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-allmpls traffic-eng auto-tunnel p2p config unnumbered-interface Loopback0mpls traffic-eng auto-tunnel p2p tunnel-num min 1000 max 5000!mpls traffic-eng lsp attributes L3VPN-SRTE path-selection metric igp pce!ip community-list 1 permit 9999!route-map L3VPN-ODN-TE-INIT permit 10 match community 1 set attribute-set L3VPN-SRTE!route-map L3VPN-SR-ODN-Mark-Comm permit 10 match ip address L3VPN-ODN-Prefixes set community 9999 !!router bgp 100 address-family vpnv4 neighbor SvRR send-community both neighbor SvRR route-map L3VPN-ODN-TE-INIT in neighbor SvRR route-map L3VPN-SR-ODN-Mark-Comm out SR-PCE configuration – IOS-XRsegment-routing traffic-eng pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 ! ! SR-PCE configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-all SR-TE Policy ConfigurationAt the foundation of CST is the use of Segment Routing Traffic Engineering Policies. SR-TE allow providers to create end to end traffic paths with engineered constraints to achieve a SLA objective. SR-TE Policies are either dynamically created by ODN (see ODN section) or users can configure SR-TE Policies on the head-end node.SR-TE Color and EndpointThe components uniquely identifying a SR-TE Policy to a destination PE node are its endpoint and color. The endpoint is the destination node loopback address. Note the endpoint address should not be an anycast address. The color is a 32-bit value which should have a SLA meaning to the network. The color allows for multiple SR-TE Policies to exist between a pair of nodes, each one with its own set of metrics and constraints.SR-TE Candidate Paths Each SR-TE Policy configured on a node must have at least one candidate path defined. If multiple candidate paths are defined, only one is active at any one time. The candidate path with the higher preference value is preferred over candidate paths with a lower preference value. The candidate path configuration specifies whether the path is dynamic or uses an explicit segment list. 
Within the dynamic configuration one can specify whether to use a PCE or not, the metric type used in the path computation (IGP metric, latency, TE metric, hop count), and the additional constraints placed on the path (link affinities, flex-algo constraints, or a cumulative metric of type IGP metric, latency, TE Metric, or hop count) There is a default candidate path with a preference of 200 using head-end IGP path computation Each candidate path can have multiple explicit segment lists defined with a bandwidth weight value to load balance traffic across multiple explicit pathsService to SR-TE Policy ForwardingService traffic is forwarded over SR-TE Policies in the CST design using per-destination automated steering. Per-destination steering utilizes two BGP components of the service route to forward traffic to a matching SR Policy A color extended community attached to the service route matching the SR Policy color The BGP next-hop address of the service route to match the endpoint of the SR Policy SR-TE Configuration ExamplesSR Policy using IGP computation, head-end computationThe local PE device will compute a path using the lowest cumulative path to 100.0.1.50. Note in the multi-domain CST design, this computation will fail to nodes not found within the same IS-IS domain as the PE.segment-routing traffic-eng policy GREEN-PE3-24 color 1024 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type igpSR Policy using lowest IGP metric computation and PCEPThis policy will request a path from the configured primary PCE with the lowest cumulative IGP metric to the endpoint 100.0.1.50segment-routing traffic-eng policy GREEN-PE3-24 color 1024 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type igpSR Policy using lowest latency metric and PCEPThis policy will request a path from the configured primary PCE with the lowest cumulative latency to the endpoint 100.0.1.50. As covered in the performance-measurement section, the per-link latency metric value used will be the dynamic/static PM value, a configured TE metric value, or the IGP metric.segment-routing traffic-eng policy GREEN-PE3-24 color 1024 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type latency SR Policy using explicit segment listThis policy does not perform any path computation, it will utilize the statically defined segment lists as the forwarding path across the network. The node does however check the validity of the node segments in the list. Each node SID in the segment list can be defined by either IP address or SID. The full path to the egress node must be defined in the list, but you do not need to define every node explicitly in the path. If you want the path to take a specific link the correct node and adjacency SID must be defined in the list.segment-routing traffic-eng segment-list anycast-path index 1 mpls label 17034 index 2 mpls label 16150 ! policy anycast-path-ape3 color 9999 end-point ipv4 100.0.1.50 candidate-paths preference 1 explicit segment-list anycast-pathQoS ImplementationSummaryPlease see the CST 3.0 HLD for in-depth information on design choices.Core QoS configurationThe core QoS policies defined for CST 3.0 utilize priority levels, with no bandwidth guarantees per traffic class. In a production network it is recommended to analyze traffic flows and determine an appropriate BW guarantee per traffic class. The core QoS uses four classes. 
Note the “video” class uses priority level 6 since only levels 6 and 7 are supported for high priority multicast. Traffic Type Priority Level Core EXP Marking     Network Control 1 6   Voice 2 5   High Priority 3 4   Video 6 2   Default 0 0 Class maps used in QoS policiesClass maps are used within a policy map to match packet criteria or internal QoS markings like traffic-class or qos-groupclass-map match-any match-ef-exp5 description High priority, EF match dscp 46 match mpls experimental topmost 5 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 match mpls experimental topmost 4 end-class-map!class-map match-any match-video-cs4-exp2 description Video match dscp 32 match mpls experimental topmost 2 end-class-map!class-map match-any match-cs6-exp6 description Highest priority control-plane traffic match dscp cs6 match mpls experimental topmost 6 end-class-map!class-map match-any match-qos-group-1 match qos-group 1 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-map!class-map match-any match-qos-group-6 match qos-group 3 end-class-map!class-map match-any match-traffic-class-1 description ~Match highest priority traffic-class 1~ match traffic-class 1 end-class-map!class-map match-any match-traffic-class-2 description ~Match high priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match medium traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-traffic-class-6 description ~Match video traffic-class 6~ match traffic-class 6 end-class-map Core ingress classifier policypolicy-map core-ingress-classifier class match-cs6-exp6 set traffic-class 1 ! class match-ef-exp5 set traffic-class 2 ! class match-cs5-exp4 set traffic-class 3 ! class match-video-cs4-exp2 set traffic-class 6 ! class class-default set mpls experimental topmost 0 set traffic-class 0 set dscp 0 ! end-policy-map! Core egress queueing mappolicy-map core-egress-queuing class match-traffic-class-2 priority level 2 queue-limit 100 us ! class match-traffic-class-3 priority level 3 queue-limit 500 us ! class match-traffic-class-6 priority level 6 queue-limit 500 us ! class match-traffic-class-1 priority level 1 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map! Core egress MPLS EXP marking mapThe following policy must be applied for PE devices with MPLS-based VPN services in order for service traffic classified in a specific QoS Group to be marked. VLAN-based P2P L2VPN services will by default inspect the incoming 802.1p bits and copy those the egress MPLS EXP if no specific ingress policy overrides that behavior. Note the EXP can be set in either an ingress or egress QoS policy. This QoS example sets the EXP via the egress map.policy-map core-egress-exp-marking class match-qos-group-1 set mpls experimental imposition 6 ! class match-qos-group-2 set mpls experimental imposition 5 ! class match-qos-group-3 set mpls experimental imposition 4 ! class match-qos-group-6 set mpls experimental imposition 2 ! class class-default set mpls experimental imposition 0 ! end-policy-map! 
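The classification, queuing, and marking policy-maps above take effect only once attached to interfaces with the service-policy command. The following is a minimal sketch of attaching the core QoS policies to a core-facing interface; the interface name is illustrative, and the egress EXP-marking policy (core-egress-exp-marking) from above is omitted here for brevity, so treat this as an example rather than the validated CST attachment.
interface TenGigE0/0/0/20
 description Example core-facing interface (illustrative)
 service-policy input core-ingress-classifier
 service-policy output core-egress-queuing
!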
H-QoS configurationEnabling H-QoS on NCS 540 and NCS 5500Enabling H-QoS on the NCS platforms requires the following global command and requires a reload of the device.hw-module profile qos hqos-enable Example H-QoS policy for 5G servicesThe following H-QoS policy represents an example QoS policy reserving 5Gbps on a sub-interface. On ingress each child class is policed to a certain percentage of the 5Gbps policer. In the egress queuing policy, shaping is used with guaranteed each class a certain amount of egress bandwidth, with high priority traffic being serviced in a low-latency queue (LLQ).Class maps used in ingress H-QoS policiesclass-map match-any edge-hqos-2-in match dscp 46 end-class-map!class-map match-any edge-hqos-3-in match dscp 40 end-class-map!class-map match-any edge-hqos-6-in match dscp 32 end-class-map Parent ingress QoS policypolicy-map hqos-ingress-parent-5g class class-default service-policy hqos-ingress-child-policer police rate 5 gbps ! ! end-policy-map H-QoS ingress child policiespolicy-map hqos-ingress-child-policer class edge-hqos-2-in set traffic-class 2 police rate percent 10 ! ! class edge-hqos-3-in set traffic-class 3 police rate percent 30 ! ! class edge-hqos-6-in set traffic-class 6 police rate percent 30 ! ! class class-default set traffic-class 0 set dscp 0 police rate percent 100 ! ! end-policy-map Egress H-QoS parent policy (Priority levels)policy-map hqos-egress-parent-4g-priority class class-default service-policy hqos-egress-child-priority shape average 4 gbps ! end-policy-map! Egress H-QoS child using priority onlyIn this policy all classes can access 100% of the bandwidth, queues are services based on priority level. The lower priority level has preference.policy-map hqos-egress-child-priority class match-traffic-class-2 shape average percent 100 priority level 2 ! class match-traffic-class-3 shape average percent 100 priority level 3 ! class match-traffic-class-6 priority level 4 shape average percent 100 ! class class-default ! end-policy-map Egress H-QoS child using reserved bandwidthIn this policy each class is reserved a certain percentage of bandwidth. Each class may utilize up to 100% of the bandwidth, if traffic exceeds the guaranteed bandwidth it is eligible for drop.policy-map hqos-egress-child-bw class match-traffic-class-2 bandwidth remaining percent 30 ! class match-traffic-class-3 bandwidth remaining percent 30 ! class match-traffic-class-6 bandwidth remaining percent 30 ! class class-default bandwidth remaining percent 10 ! end-policy-map Egress H-QoS child using shapingIn this policy each class is shaped to a defined amount and cannot exceed the defined bandwidth.policy-map hqos-egress-child-shaping class match-traffic-class-2 shape average percent 30 ! class match-traffic-class-3 shape average percent 30 ! class match-traffic-class-6 shape average percent 30 ! class class-default shape average percent 10 ! end-policy-map! 
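As with the core policies, the H-QoS parent policies only act once attached to an access sub-interface. The following is a minimal sketch; the sub-interface, VLAN, and the pairing of the 5 Gbps ingress parent with the 4 Gbps egress priority parent are illustrative assumptions for a 5G access service.
interface TenGigE0/0/0/5.100
 description Example 5G access sub-interface (illustrative)
 encapsulation dot1q 100
 service-policy input hqos-ingress-parent-5g
 service-policy output hqos-egress-parent-4g-priority
!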
ServicesEnd-To-End VPN ServicesFigure 6# End-To-End Services TableEnd-To-End VPN Services Data PlaneFigure 10# End-To-End Services Data PlaneL3VPN MP-BGP VPNv4 On-Demand Next-HopFigure 7# L3VPN MP-BGP VPNv4 On-Demand Next-Hop Control PlaneAccess Routers# Cisco ASR920 IOS-XE and NCS540 IOS-XR Operator# New VPNv4 instance via CLI or NSO Access Router# Advertises/receives VPNv4 routes to/from ServicesRoute-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Please refer to “On Demand Next-Hop (ODN)” sections for initial ODN configuration.Access Router Service Provisioning (IOS-XR)ODN route-policy configurationextcommunity-set opaque ODN-GREEN 100end-setroute-policy ODN-L3VPN-OUT set extcommunity color ODN-GREEN passend-policy VRF definition configurationvrf ODN-L3VPN rd 100#1 address-family ipv4 unicast import route-target 100#1 ! export route-target export route-policy ODN-L3VPN-OUT 100#1 ! ! address-family ipv6 unicast import route-target 100#1 ! export route-target export route-policy ODN-L3VPN-OUT 100#1 ! ! VRF Interface configurationinterface TenGigE0/0/0/23.2000 mtu 9216 vrf ODN-L3VPN ipv4 address 172.106.1.1 255.255.255.0 encapsulation dot1q 2000 BGP VRF configuration with static/connected onlyrouter bgp 100 vrf VRF-MLDP rd auto address-family ipv4 unicast redistribute connected redistribute static ! address-family ipv6 unicast redistribute connected redistribute static ! Access Router Service Provisioning (IOS-XE)VRF definition configurationvrf definition L3VPN-SRODN-1 rd 100#100 route-target export 100#100 route-target import 100#100 address-family ipv4 exit-address-family VRF Interface configurationinterface GigabitEthernet0/0/2 mtu 9216 vrf forwarding L3VPN-SRODN-1 ip address 10.5.1.1 255.255.255.0 negotiation autoend BGP VRF configuration Static & BGP neighborStatic routing configurationrouter bgp 100 address-family ipv4 vrf L3VPN-SRODN-1 redistribute connected exit-address-family BGP neighbor configurationrouter bgp 100 neighbor Customer-1 peer-group neighbor Customer-1 remote-as 200 neighbor 10.10.10.1 peer-group Customer-1 address-family ipv4 vrf L3VPN-SRODN-2 neighbor 10.10.10.1 activate exit-address-family L2VPN Single-Homed EVPN-VPWS On-Demand Next-HopFigure 8# L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Advertises/receives EVPN-VPWS instance to/fromServices Route-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Note# Please refer to On Demand Next-Hop (ODN) – IOS-XR section for initial ODN configuration. The correct EVPN L2VPN routes must be advertised with a specific color ext-community to trigger dynamic SR Policy instantiation.Access Router Service Provisioning (IOS-XR)#Port based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 1000 target 1 source 1 interface TenGigE0/0/0/5 l2transport VLAN Based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 neighbor evpn evi 1000 target 1 source 1 !! 
interface TenGigE0/0/0/5.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric! L2VPN Static Pseudowire (PW) – Preferred Path (PCEP)Figure 9# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) ControlPlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Access Router Service Provisioning (IOS-XR)#Note# EVPN VPWS dual homing is not supported when using an SR-TE preferred path.Note# In IOS-XR 6.6.3 the SR Policy used as the preferred path must be referenced by its generated name and not the configured policy name. This requires first issuing the commandDefine SR Policy traffic-eng policy GREEN-PE3-1 color 1001 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type igp Determine auto-configured policy name The auto-configured policy name will be persistant and must be used as a reference in the L2VPN preferred-path configuration.RP/0/RP0/CPU0#A-PE8#show segment-routing traffic-eng policy candidate-path name GREEN-PE3-1   SR-TE policy database Color# 1001, End-point# 100.0.1.50 Name# srte_c_1001_ep_100.0.1.50 Port Based Service configurationinterface TenGigE0/0/0/15 l2transport ! ! l2vpn pw-class static-pw-class-PE3 encapsulation mpls control-word preferred-path sr-te policy srte_c_1001_ep_100.0.1.50 ! ! ! p2p Static-PW-to-PE3-1 interface TenGigE0/0/0/15 neighbor ipv4 100.0.0.3 pw-id 1000 mpls static label local 1000 remote 1000 pw-class static-pw-class-PE3 VLAN Based Service configurationinterface TenGigE0/0/0/5.1001 l2transport encapsulation dot1q 1001 rewrite ingress tag pop 1 symmetric ! ! l2vpn pw-class static-pw-class-PE3 encapsulation mpls control-word preferred-path sr-te policy srte_c_1001_ep_100.0.1.50 p2p Static-PW-to-PE7-2 interface TenGigE0/0/0/5.1001 neighbor ipv4 100.0.0.3 pw-id 1001 mpls static label local 1001 remote 1001 pw-class static-pw-class-PE3 Access Router Service Provisioning (IOS-XE)#Port Based service with Static OAM configurationinterface GigabitEthernet0/0/1 mtu 9216 no ip address negotiation auto no keepalive service instance 10 ethernet encapsulation default xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! pseudowire-static-oam class static-oam timeout refresh send 10 ttl 255 ! ! ! pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 status protocol notification static static-oam ! VLAN Based Service configurationinterface GigabitEthernet0/0/1 no ip address negotiation auto service instance 1 ethernet Static-VPWS-EVC encapsulation dot1q 10 rewrite ingress tag pop 1 symmetric xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! ! ! pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 L2VPN EVPN E-TreeNote# ODN support for EVPN E-Tree is supported on ASR9K only in CST 3.5. Support for E-Tree across all CST IOS-XR nodes will be covered in CST 4.0 based on IOS-XR 7.2.2. In CST 3.5, if using E-Tree across multiple IGP domains, SR-TE Policies must be configured between all Root nodes and between all Root and Leaf nodes.IOS-XR Root Node Configuraitonevpn evi 100 advertise-mac ! ! 
l2vpn bridge group etree bridge-domain etree-ftth interface TenGigE0/0/0/14.100 routed interface BVI100 ! evi 100 IOS-XR Leaf Node ConfigurationA single command is needed to enable leaf function for an EVI. Configuring “etree leaf” will signal to other nodes this is a leaf node. In this case we also have a L3 IRB configured within the EVI. In order to isolate the two ACs, each AC is configured with the “split-horizon group” configuration command. The BVI interfaceis configured with “local-proxy-arp” to intercept ARP requests between hosts on each AC. This is needed if hosts in two different ACs are using the same IP address subnet, since ARP traffic will be suppressed acrossed the ACs.evpn evi 100 etree leaf ! advertise-mac ! ! l2vpn bridge group etree bridge-domain etree-ftth interface TenGigE0/0/0/23.1098 split-horizon-group interface TenGigE0/0/0/24.1098 split-horizon group routed interface BVI100 ! evi 100 interface BVI11011 local-proxy-arpHierarchical ServicesFigure 11# Hierarchical Services TableL3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE)Figure 12# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Provider Edge Router# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together withPseudowire-Headend (PWHE) via CLI or NSO Provider Edge Router# Path to remote PE is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p L3VPN-VRF1 interface TenGigE0/0/0/5.501 neighbor evpn evi 13 target 501 source 501 ! ! !interface TenGigE0/0/0/5.501 l2transport encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric Port based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 13 target 502 source 502 ! ! !! interface TenGigE0/0/0/5 l2transport Access Router Service Provisioning (IOS-XE)#VLAN based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric ! Port based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation default Provider Edge Router Service Provisioning (IOS-XR)#VRF configurationvrf L3VPN-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#501 ! export route-target 100#501 ! ! address-family ipv6 unicast import route-target 100#501 ! export route-target 100#501 ! ! BGP configurationrouter bgp 100 vrf L3VPN-ODNTE-VRF1 rd 100#501 address-family ipv4 unicast redistribute connected ! address-family ipv6 unicast redistribute connected ! ! PWHE configurationinterface PW-Ether1 vrf L3VPN-ODNTE-VRF1 ipv4 address 10.13.1.1 255.255.255.0 ipv6 address 1000#10#13##1/126 attach generic-interface-list PWHE! 
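The PW-Ether interface above references a generic interface list named PWHE, which must also be defined on the PE; it pins the pseudowire headend to a set of core-facing interfaces over which the pseudowire traffic can arrive. A minimal sketch follows, with illustrative member interfaces.
generic-interface-list PWHE
 interface Bundle-Ether100
 interface TenGigE0/0/0/10
!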
EVPN VPWS configuration towards Access PEl2vpn xconnect group evpn-vpws-l3vpn-A-PE3 p2p L3VPN-ODNTE-VRF1 interface PW-Ether1 neighbor evpn evi 13 target 501 source 501 ! Figure 13# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 withPseudowire-Headend (PWHE) Data PlaneL3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRBFigure 14# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 withAnycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together with Anycast IRBvia CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2 l2transport !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 n-flag-clear L2VPN configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30 VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! 
BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 15# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB Datal PlaneL2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRBFigure 16# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L2VPN Multipoint EVPN instance together withAnycast IRB via CLI or NSO (Anycast IRB is optional when L2 and L3is required in same service instance) Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Please note that provisioning on Access and Provider Edge routers issame as in “L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB”. In this use case there is BGP EVPN instead of MP-BGPVPNv4/6 in the core.Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !!interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 L2VPN Configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! ! 
virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30! VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 17# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Data PlaneEthernet CFM for L2VPN service assuranceEthernet Connectivity Fault Management is an Ethernet OAM component used to validate end-to-end connectivity between service endpoints. Ethernet CFM is defined by two standards, 802.1ag and Y.1731. Within an SP network, Maintenance Domains are created based on service scope. Domains are typically separated by operator boundaries and may be nested but cannot overlap. Within each service, maintenance points can be created to verify bi-directional end to end connectivity. These are known as MEPs (Maintenance End-Point) and MIPs (Maintenance Intermediate Points). These maintenance points process CFM messages. A MEP is configured at service endpoints and has directionality where an “up” MEP faces the core of the network and a “down” MEP faces a CE device or NNI port. MIPs are optional and are created dynamically. Detailed information on Ethernet CFM configuration and operation can be found at https#//www.cisco.com/c/en/us/td/docs/routers/ncs5500/software/interfaces/configuration/guide/b-interfaces-hardware-component-cg-ncs5500-66x/b-interfaces-hardware-component-cg-ncs5500-66x_chapter_0101.htmlMaintenance Domain configurationA Maintenance Domain is defined by a unique name and associated level. The level can be 0-7. The numerical identifier usually corresponds to the scope of the MD, where 7 is associated with CE endpoints, 6 associated with PE devices connected to a CE. Additional levels may be required based on the topology and service boundaries which occur along the end-to-end service. In this example we only a single domain and utilize level 0 for all MEPs.ethernet cfm domain EVPN-VPWS-PE3-PE8 level 0 MEP configuration for EVPN-VPWS servicesFor L2VPN xconnect services, each service must have a MEP created on the end PE device. There are two components to defining a MEP, first defining the Ethernet CFM “service” and then defining the MEP on the physical or logical interface participating in the L2VPN xconnect service. In the following configuration the xconnect group “EVPN-VPWS-ODN-PE3” and P2P EVPN VPWS service odn-8 are already defined. The Ethernet CFM service of “odn-8” does NOT have to match the xconnect service name. The MEP crosscheck defines a remote MEP to listen for Continuity Check messages from. It does not have to be the same as the local MEP defined on the physical sub-interface (103), but for P2P services it is best practice to make them identical. This configuration will send Ethernet CFM Continuity Check (CC) messages every 1 minute to verify end to end reachability.L2VPN configurationl2vpn xconnect group EVPN-VPWS-ODN-PE3 p2p odn-8 interface TenGigE0/0/0/23.8 neighbor evpn evi 1318 target 8 source 8 ! ! !! Physical sub-interface configurationinterface TenGigE0/0/0/23.8 l2transport encapsulation dot1q 8 rewrite ingress tag pop 1 symmetric ethernet cfm mep domain EVPN-VPWS-PE3-PE8 service odn-8 mep-id 103 ! !! 
Ethernet CFM service configurationethernet cfm domain EVPN-VPWS-PE3-PE8 service odn-8 xconnect group EVPN-VPWS-ODN-PE3 p2p odn-8 mip auto-create all continuity-check interval 1m mep crosscheck mep-id 103 ! log crosscheck errors log continuity-check errors log continuity-check mep changes ! !! Multicast NG-MVPN Profile 14 using mLDP and ODN L3VPNIn ths service example we will implement multicast delivery across the CST network using mLDP transport for multicast and SR-MPLS for unicast traffic. L3VPN SR paths will be dynamically created using ODN. Multicast profile 14 is the “Partitioned MDT - MLDP P2MP - BGP-AD - BGP C-Mcast Signaling” Using this profile each mVPN will use a dedicated P2MP tree, endpoints will be auto-discovered using NG-MVPN BGP NLRI, and customer multicast state such as source streams, PIM, and IGMP membership data will be signaled using BGP. Profile 14 is the recommended profile for high scale and utilizing label-switched multicast (LSM) across the core.Please note that mLDP requires an IGP path to the source PE loopback address. The CST design utilizes a multi-domain approach which normally does not advertise IGP routes across domain boundaries. If mLDP is being utilized across domains, controlled redistribution should be used to advertise the source PE loopback addresses to receiver PEsMulticast core configurationThe multicast “core” includes transit endpoints participating in mLDP only. See the mLDP core configuration section for details on end-to-end mLDP configuration.Unicast L3VPN PE configurationIn order to complete an RPF check for SSM sources, unicast L3VPN configuration is required. Additionally the VRF must be defined under the BGP configuration with the NG-MVPN address families configured. In our use case we are utilizing ODN for creating the paths between L3VPN endpoints with a route-policy attached to the mVPN VRF to set a specific color on advertised routes.ODN opaque ext-community setextcommunity-set opaque MLDP 1000end-set ODN route-policyroute-policy ODN-MVPN set extcommunity color MLDP passend-policy Global L3VPN VRF definitionvrf VRF-MLDP address-family ipv4 unicast import route-target 100#38 ! export route-policy ODN-MVPN export route-target 100#38 ! ! address-family ipv6 unicast import route-target 100#38 ! export route-policy ODN-MVPN export route-target 100#38 ! !! BGP configurationrouter bgp 100 vrf VRF-MLDP rd auto address-family ipv4 unicast redistribute connected redistribute static ! address-family ipv6 unicast redistribute connected redistribute static ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! !! Multicast PE configurationThe multicast “edge” includes all endpoints connected to native multicast sources or receivers.Define RPF policyroute-policy mldp-partitioned-p2mp set core-tree mldp-partitioned-p2mpend-policy! Enable Multicast and define mVPN VRFmulticast-routing address-family ipv4 interface Loopback0 enable ! ! vrf VRF-MLDP address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix bgp auto-discovery mldp ! mdt partitioned mldp ipv4 p2mp mdt data 100 ! !! Enable PIM for mVPN VRF In this instance there is an interface TenGigE0/0/0/23.2000 which is using PIM within the VRFrouter pim address-family ipv4 rp-address 100.0.1.50 ! vrf VRF-MLDP address-family ipv4 rpf topology route-policy mldp-partitioned-p2mp mdt c-multicast-routing bgp ! interface TenGigE0/0/0/23.2000 enable ! ! 
Enable IGMP for mVPN VRF interface To discover listeners for a specific group, enable IGMP on interfaces within the VRF. These interested receivers will be advertised via BGP to establish end to end P2MP trees from the source.router igmp vrf VRF-MLDP interface TenGigE0/0/0/23.2001 ! version 3 !! Multicast distribution using TreeSID with static S,G MappingTreeSID utilizes only Segment Routing to create and forward multicast traffic across an optimized tree. The TreeSID tree is configured on the SR-PCE for deployment to the network. PCEP is used to instantiate the correct computed segments end to end. On the head-end source node,Note# TreeSID requires all nodes in the multicast distribution network to have connections to the same SR-PCE instances, please see the PCEP configuration section of the Implmentation GuideTreeSID SR-PCE ConfigurationEndpoint Set ConfigurationThe P2MP endpoint sets are defined outside of the SR TreeSID Policy configuration in order to be reusaable across multiple trees. This is a required step in the configuration of TreeSID.pce address ipv4 100.0.1.101 timers reoptimization 600 ! segment-routing traffic-eng p2mp endpoint-set APE7-APE8 ipv4 100.0.2.57 ipv4 100.0.2.58 ! timers reoptimization 120 timers cleanup 30P2MP TreeSID SR Policy ConfigurationThis configuration defines the TreeSID P2MP SR Policy to be used across the network. Note the name of the TreeSID must be unique across the netowrk and referenced explicitly on all source and receiver nodes. Within the policy configuration, supported constraints can be applied during path computation of the optimized P2MP tree. Note the source address must be specified and the MPLS label used must be within the SRLB for all nodes across the network.pce segment-routing traffic-eng policy treesid-1 source ipv4 100.0.0.1 color 100 endpoint-set APE7-APE8 treesid mpls 18600 candidate-paths constraints affinity include-any color1 ! ! ! preference 100 dynamic metric type igp ! ! !TreeSID Common Config on All NodesSegment Routing Local BlockWhile the SRLB config is covered elsewhere in this guide, it is recommended to set the values the same across the TreeSID domain. The values shown are for demonstration only.segment-routing local-block 18000 19000 !!PCEP ConfigurationTreeSID relies on PCE initiated segments to the node, so a session to the PCE is required for all nodes in the domain.segment-routing traffic-eng pcc source-address ipv4 100.0.2.53 pce address ipv4 100.0.1.101 precedence 200 ! pce address ipv4 100.0.2.101 precedence 100 ! pce address ipv4 100.0.2.102 precedence 100 ! report-all timers delegation-timeout 10 timers deadtimer 60 timers initiated state 15 timers initiated orphan 10 ! !!TreeSID Source Node Multicast ConfigurationPIM ConfigurationIn this configuration a single S,G of 232.0.0.20 with a source of 104.14.1.2 is mapped to TreeSID treesid-1 for distribution across the network.router pim address-family ipv4 interface Loopback0 enable ! interface Bundle-Ether111 enable ! interface Bundle-Ether112 enable ! interface TenGigE0/0/0/16 enable ! sr-p2mp-policy treesid-1 static-group 232.0.0.20 104.14.1.2 !!Multicast Routing Configurationmulticast-routing address-family ipv4 interface all enable mdt static segment-routing ! address-family ipv6 mdt static segment-routing ! !TreeSID Receiver Node Multicast ConfigurationGlobal Routing Table MulticastPIM Configurationrouter pim address-family ipv4 rp-address 100.0.0.1 ! 
!!
On the router connected to the receivers, configure the address family to use the TreeSID for static S,G mapping.
multicast-routing address-family ipv4 mdt source Loopback0 rate-per-route interface all enable static sr-policy TreeSID-GRT mdt static segment-routing accounting per-prefix address-family ipv6 mdt source Loopback0 rate-per-route interface all enable static sr-policy TreeSID-GRT mdt static segment-routing accounting per-prefix !!
Multicast Routing Configuration
multicast-routing address-family ipv4 interface all enable static sr-policy treesid-1 ! address-family ipv6 static sr-policy treesid-1 ! !
mVPN Multicast Configuration
PIM Configuration
In this configuration we are mapping the PIM RP to the TreeSID source.
router pim vrf TREESID address-family ipv4 rp-address 100.0.0.1 ! !!
Multicast Routing Configuration
On the PE connected to the receivers, within the VRF associated with the TreeSID SR Policy, enable the TreeSID for static mapping of S,G multicast.
multicast-routing vrf TREESID address-family ipv4 interface all enable static sr-policy treesid-1 ! address-family ipv6 static sr-policy treesid-1 ! !
TreeSID Verification on PCE
You can view the end-to-end path using the “show pce lsp p2mp” command.
RP/0/RP0/CPU0:XTC-ACCESS1-PHY#show pce lsp p2mp
Wed Sep 2 19:31:50.745 UTC
Tree: treesid-1 Label: 18600 Operational: up Admin: up Transition count: 1 Uptime: 00:06:39 (since Wed Sep 02 19:25:11 UTC 2020) Source: 100.0.0.1 Destinations: 100.0.2.53, 100.0.2.52
Nodes:
Node[0]: 100.0.2.3 (AG3) Role: Transit Hops: Incoming: 18600 CC-ID: 1 Outgoing: 18600 CC-ID: 1 (10.23.253.1) Outgoing: 18600 CC-ID: 1 (10.23.252.0)
Node[1]: 100.0.2.1 (PA3) Role: Transit Hops: Incoming: 18600 CC-ID: 2 Outgoing: 18600 CC-ID: 2 (10.21.23.1)
Node[2]: 100.0.0.3 (PE3) Role: Transit Hops: Incoming: 18600 CC-ID: 3 Outgoing: 18600 CC-ID: 3 (10.3.21.1)
Node[3]: 100.0.0.5 (P1) Role: Transit Hops: Incoming: 18600 CC-ID: 4 Outgoing: 18600 CC-ID: 4 (10.3.5.0)
Node[4]: 100.0.0.7 (P3) Role: Transit Hops: Incoming: 18600 CC-ID: 5 Outgoing: 18600 CC-ID: 5 (10.5.7.0)
Node[5]: 100.0.1.1 (NCS540-PA1) Role: Transit Hops: Incoming: 18600 CC-ID: 6 Outgoing: 18600 CC-ID: 6 (10.1.7.1)
Node[6]: 100.0.0.1 (PE1) Role: Ingress Hops: Incoming: 18600 CC-ID: 7 Outgoing: 18600 CC-ID: 7 (10.1.11.1)
Node[7]: 100.0.2.53 (A-PE8) Role: Egress Hops: Incoming: 18600 CC-ID: 8
Node[8]: 100.0.2.52 (A-PE7) Role: Egress Hops: Incoming: 18600 CC-ID: 9
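In addition to the PCE view, the tree can be spot-checked on individual routers. Two possible checks, assuming the TreeSID label (18600) and group (232.0.0.20) used in the examples above, are the MPLS forwarding entry for the tree label on a transit node and the MRIB entry on a receiver-connected node; the hostnames shown are illustrative.
RP/0/RP0/CPU0:AG3#show mpls forwarding labels 18600
RP/0/RP0/CPU0:A-PE7#show mrib route 232.0.0.20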
!interface TenGigE0/0/0/5.501 l2transport encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric Port based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 13 target 502 source 502 ! ! !! interface TenGigE0/0/0/5 l2transport Access Router Service Provisioning (IOS-XE)#VLAN based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric ! Port based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation default Provider Edge Router Service Provisioning (IOS-XR)#VRF configurationvrf L3VPN-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#501 ! export route-target 100#501 ! ! address-family ipv6 unicast import route-target 100#501 ! export route-target 100#501 ! ! BGP configurationrouter bgp 100 vrf L3VPN-ODNTE-VRF1 rd 100#501 address-family ipv4 unicast redistribute connected ! address-family ipv6 unicast redistribute connected ! ! PWHE configurationinterface PW-Ether1 vrf L3VPN-ODNTE-VRF1 ipv4 address 10.13.1.1 255.255.255.0 ipv6 address 1000#10#13##1/126 attach generic-interface-list PWHE! EVPN VPWS configuration towards Access PEl2vpn xconnect group evpn-vpws-l3vpn-A-PE3 p2p L3VPN-ODNTE-VRF1 interface PW-Ether1 neighbor evpn evi 13 target 501 source 501 ! Figure 13# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 withPseudowire-Headend (PWHE) Data PlaneL3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRBFigure 14# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 withAnycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together with Anycast IRBvia CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2 l2transport !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! 
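After provisioning the static pseudowire on the IOS-XR access router, the cross-connect state can be checked before moving on to the IOS-XE examples below. This is a minimal verification sketch using the xconnect group name from the configuration above; exact output fields vary by platform and release.

show l2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast
show l2vpn xconnect detail

Both the attachment circuit and pseudowire segments of the cross-connect should report UP, and the static MPLS labels shown should match the local and remote label values configured above.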
Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 n-flag-clear L2VPN configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30 VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 15# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB Datal PlaneL2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRBFigure 16# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L2VPN Multipoint EVPN instance together withAnycast IRB via CLI or NSO (Anycast IRB is optional when L2 and L3is required in same service instance) Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Please note that provisioning on Access and Provider Edge routers issame as in “L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB”. In this use case there is BGP EVPN instead of MP-BGPVPNv4/6 in the core.Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! 
Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !!interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 L2VPN Configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30! VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! 
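Before moving to the data plane view, a few verification commands on the IOS-XR PE can confirm that the multipoint EVPN service and Anycast IRB are operational. This is a hedged sketch using the EVI, bridge-domain, and VRF names from the samples above; command availability and output format vary by release.

show evpn evi vpn-id 12001 detail
show evpn ethernet-segment
show l2vpn bridge-domain bd-name VRF1 detail
show bgp vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 unicast summary

The bridge domain should show the static pseudowires and the BVI1 routed interface as up, and the EVI should list the MAC/IP routes advertised for hosts learned on the Anycast IRB.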
Figure 17: L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Data Plane

Remote PHY CIN Implementation

Summary

Design decisions are covered in the CST 3.0 high-level design guide; this section provides sample configurations.

Sample QoS Policies

The following policies are usable as-is, but should be tailored for specific network deployments.

Class maps

Class maps are used within a policy map to match packet criteria for further treatment.

class-map match-any match-ef-exp5
 description High priority, EF
 match dscp 46
 match mpls experimental topmost 5
 end-class-map
!
class-map match-any match-cs5-exp4
 description Second highest priority
 match dscp 40
 match mpls experimental topmost 4
 end-class-map
!
class-map match-any match-video-cs4-exp2
 description Video
 match dscp 32
 match mpls experimental topmost 2
 end-class-map
!
class-map match-any match-cs6-exp6
 description Highest priority control-plane traffic
 match dscp cs6
 match mpls experimental topmost 6
 end-class-map
!
class-map match-any match-qos-group-1
 match qos-group 1
 end-class-map
!
class-map match-any match-qos-group-2
 match qos-group 2
 end-class-map
!
class-map match-any match-qos-group-3
 match qos-group 3
 end-class-map
!
class-map match-any match-qos-group-6
 match qos-group 6
 end-class-map
!
class-map match-any match-traffic-class-1
 description "Match highest priority traffic-class 1"
 match traffic-class 1
 end-class-map
!
class-map match-any match-traffic-class-2
 description "Match high priority traffic-class 2"
 match traffic-class 2
 end-class-map
!
class-map match-any match-traffic-class-3
 description "Match medium traffic-class 3"
 match traffic-class 3
 end-class-map
!
class-map match-any match-traffic-class-6
 description "Match video traffic-class 6"
 match traffic-class 6
 end-class-map

RPD and DPIC interface policy maps

These are applied to all interfaces connected to cBR-8 DPIC and RPD devices.

Note: Egress queueing policy maps are not supported on L3 BVI interfaces.

RPD/DPIC ingress classifier policy map

policy-map rpd-dpic-ingress-classifier
 class match-cs6-exp6
  set traffic-class 1
  set qos-group 1
 !
 class match-ef-exp5
  set traffic-class 2
  set qos-group 2
 !
 class match-cs5-exp4
  set traffic-class 3
  set qos-group 3
 !
 class match-video-cs4-exp2
  set traffic-class 6
  set qos-group 6
 !
 class class-default
  set traffic-class 0
  set dscp 0
  set qos-group 0
 !
 end-policy-map
!

P2P RPD and DPIC egress queueing policy map

policy-map rpd-dpic-egress-queuing
 class match-traffic-class-1
  priority level 1
  queue-limit 500 us
 !
 class match-traffic-class-2
  priority level 2
  queue-limit 100 us
 !
 class match-traffic-class-3
  priority level 3
  queue-limit 500 us
 !
 class match-traffic-class-6
  priority level 6
  queue-limit 500 us
 !
 class class-default
  queue-limit 250 ms
 !
 end-policy-map
!

Core QoS

Please see the general QoS section for core-facing QoS configuration.

CIN Timing Configuration

Please see the G.8275.2 timing configuration guide in this document for details on timing configuration. The following values should be used for PTP configuration attributes. Please note in CST 3.0 the use of an IOS-XR router as a Boundary Clock is only supported on P2P L3 interfaces. The use of a BVI for RPD aggregation requires the BC used for RPD nodes to be located upstream; alternatively, a physical loopback cable may be used to provide timing off the IOS-XR based RPD leaf device.
| PTP variable | IOS-XR configuration value | IOS-XE value |
|---|---|---|
| Announce Interval | 1 | 1 |
| Announce Timeout | 5 | 5 |
| Sync Frequency | 16 | -4 |
| Delay Request Frequency | 16 | -4 |

Example cBR-8 RPD DTI Profile

ptp r-dti 4
 profile G.8275.2
 ptp-domain 60
 clock-port 1
  clock source ip 192.168.3.1
  sync interval -4
  announce timeout 5
  delay-req interval -4

Multicast configuration

Summary

We present two different configuration options based on either native multicast deployment or the use of a L3VPN to carry Remote PHY traffic. The L3VPN option shown uses Label Switched Multicast profile 14 (partitioned mLDP); however, profile 6 could also be utilized.

Global multicast configuration - Native multicast

On CIN aggregation nodes all interfaces should have multicast enabled.

multicast-routing
 address-family ipv4
  interface all enable
 !
 address-family ipv6
  interface all enable
 !

Global multicast configuration - LSM using profile 14

On CIN aggregation nodes all interfaces should have multicast enabled.

vrf VRF-MLDP
 address-family ipv4
  mdt source Loopback0
  rate-per-route
  interface all enable
  accounting per-prefix
  bgp auto-discovery mldp
  !
  mdt partitioned mldp ipv4 p2mp
  mdt data 100
 !
!

PIM configuration - Native multicast

PIM should be enabled for IPv4/IPv6 on all core-facing interfaces.

router pim
 address-family ipv4
  interface Loopback0
   enable
  !
  interface TenGigE0/0/0/6
   enable
  !
  interface TenGigE0/0/0/7
   enable
  !
 !

PIM configuration - LSM using profile 14

The PIM configuration is utilized even though no PIM neighbors may be connected.

route-policy mldp-partitioned-p2mp
  set core-tree mldp-partitioned-p2mp
end-policy
!
router pim
 address-family ipv4
  interface Loopback0
   enable
 vrf rphy-vrf
  address-family ipv4
   rpf topology route-policy mldp-partitioned-p2mp
   mdt c-multicast-routing bgp
   !
  !

IGMPv3/MLDv2 configuration - Native multicast

Interfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabled.

router igmp
 interface BVI100
  version 3
 !
 interface TenGigE0/0/0/25
  version 3
 !
!
router mld
 interface BVI100
  version 2
 interface TenGigE0/0/0/25
  version 3
 !
!

IGMPv3/MLDv2 configuration - LSM profile 14

Interfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabled as needed.

router igmp
 vrf rphy-vrf
  interface BVI101
   version 3
  !
  interface TenGigE0/0/0/15
  !
 !
!
router mld
 vrf rphy-vrf
  interface TenGigE0/0/0/15
   version 2
  !
 !
!

IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation)

In order to limit L2 multicast replication for specific groups to only interfaces with interested receivers, IGMP and MLD snooping must be enabled.

igmp snooping profile igmp-snoop-1
!
mld snooping profile mld-snoop-1
!

RPD DHCPv4/v6 relay configuration

In order for RPDs to self-provision, DHCP relay must be enabled on all RPD-facing L3 interfaces. In IOS-XR the DHCP relay configuration is done in its own configuration context without any configuration on the interface itself.

Native IP / Default VRF

dhcp ipv4
 profile rpd-dhcpv4 relay
  helper-address vrf default 10.0.2.3
 !
 interface BVI100 relay profile rpd-dhcpv4
!
dhcp ipv6
 profile rpd-dhcpv6 relay
  helper-address vrf default 2001:10:0:2::3 iana-route-add
  source-interface BVI100
 !
 interface BVI100 relay profile rpd-dhcpv6

RPHY L3VPN

In this example it is assumed the DHCP server exists within the rphy-vrf VRF; if it does not, additional routing may be necessary to forward packets between VRFs.

dhcp ipv4
 vrf rphy-vrf relay profile rpd-dhcpv4-vrf
 profile rpd-dhcpv4-vrf relay
  helper-address vrf rphy-vrf 10.0.2.3
  relay information option allow-untrusted
 !
inner-cos 5 outer-cos 5 interface BVI101 relay profile rpd-dhcpv4-vrf interface TenGigE0/0/0/15 relay profile rpd-dhcpv4-vrf! cBR-8 DPIC interface configuration without Link HAWithout link HA the DPIC port is configured as a normal physical interfaceinterface TenGigE0/0/0/25 description .. Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 cBR-8 DPIC interface configuration with Link HAWhen using Link HA faster convergence is achieved when each DPIC interface is placed into a BVI with a statically assigned MAC address. Each DPIC interface is placed into a separate bridge-domain with a unique BVI L3 interface. The same MAC address should be utilized on all BVI interfaces. Convergence using BVI interfaces is <50ms, L3 physical interfaces is 1-2s.Even DPIC port CIN interface configurationinterface TenGigE0/0/0/25 description ~Connected to cBR8 port Te1/1/0~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-0 interface TenGigE0/0/0/25 ! routed interface BVI500 ! ! ! interface BVI500 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! Odd DPIC port CIN interface configurationinterface TenGigE0/0/0/26 description ~Connected to cBR8 port Te1/1/1~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-1 interface TenGigE0/0/0/26 ! routed interface BVI501 ! ! ! interface BVI501 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! cBR-8 Digital PIC Interface Configurationinterface TenGigE0/0/0/25 description .. Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 RPD interface configurationP2P L3In this example the interface has PTP enabled towards the RPDinterface TeGigE0/0/0/15 description To RPD-1 mtu 9200 ptp profile g82752_master_v4 ! service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 192.168.2.0 255.255.255.254 ipv6 address 2001#192#168#2##0/127 ipv6 enable ! BVIl2vpn bridge group rpd bridge-domain rpd-1 mld snooping profile mld-snoop-1 igmp snooping profile igmp-snoop-1 interface TenGigE0/0/0/15 ! interface TenGigE0/0/0/16 ! interface TenGigE0/0/0/17 ! routed interface BVI100 ! ! ! !!interface BVI100 description ... to downstream RPD hosts service-policy input rpd-dpic-ingress-classifier ipv4 address 192.168.2.1 255.255.255.0 ipv6 address 2001#192#168#2##1/64 ipv6 enable ! RPD/DPIC agg device IS-IS configurationThe standard IS-IS configuration should be used on all core interfaces with the addition of specifying all DPIC and RPD connected as IS-IS passive interfaces. Using passive interfaces is preferred over redistributing connected routes. This configuration is needed for reachability between DPIC and RPDs across the CIN network.router isis ACCESS interface TenGigE0/0/0/25 passive address-family ipv4 unicast ! 
address-family ipv6 unicast Additional configuration for L3VPN DesignGlobal VRF ConfigurationThis configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elementsvrf rphy-vrf address-family ipv4 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! address-family ipv6 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! BGP ConfigurationThis configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elementsrouter bgp 100 vrf rphy-vrf rd auto address-family ipv4 unicast label mode per-vrf redistribute connected ! address-family ipv6 unicast label mode per-vrf redistribute connected ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! ! Model-Driven Telemetry ConfigurationSummaryThis is not an exhaustive list of IOS-XR model-driven telemetry sensor paths, but gives some basic paths used to monitor a Converged SDN Transport deployment. Each sensor path may have its own cadence of collection and transmission, but it’s recommended to not use values less than 60s when using many sensor paths.Device inventory and monitoring Metric Sensor path Full inventory via OpenConfig model openconfig-platform#components NCS 540/5500 NPU resources cisco-ios-xr-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Optics information cisco-ios-xr-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info System uptime cisco-ios-xr-shellutil-oper#system-time/uptime System CPU utilization cisco-ios-xr-wdsysmon-fd-oper#system-monitoring/cpu-utilization Interface Data| Metric | Sensor path | ———————–| —————————————————||Interface optics state | Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info/transport-admin-state||OpenConfig interface stats|openconfig-interfaces#interfaces||Interface data rates, based on load-interval|Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/data-rate| |Interface counters similar to “show int”|Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters||Full interface information|Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface||Interface stats|Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics||Subset of interface stats|Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics/basic-interface-stats|LLDP Monitoring| Metric | Sensor path | ———————–| —————————————————||All LLDP Info| Cisco-IOS-XR-ethernet-lldp-oper#lldp||LLDP neighbor info|Cisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighbors|Aggregate bundle information (use interface models for interface counters)| Metric | Sensor path | ———————–| —————————————————||OpenConfig LAG information|sensor-group openconfig-if-aggregate#aggregate||OpenConfig LAG state only|sensor-group openconfig-if-aggregate#aggregate/state||OpenConfig LACP information|sensor-group openconfig-lacp#lacp||Cisco full bundle information|sensor-group Cisco-IOS-XR-bundlemgr-oper#bundles||Cisco BFD over Bundle stats|sensor-group Cisco-IOS-XR-bundlemgr-oper#bundle-information/bfd-counters|PTP and SyncE Information| Metric | Sensor path | ———————–| —————————————————||PTP servo status | Cisco-IOS-XR-ptp-oper#ptp/platform/servo/device-status ||PTP servo statistics | Cisco-IOS-XR-ptp-oper#ptp/platform/servo ||PTP foreign master information | Cisco-IOS-XR-ptp-oper#ptp/interface-foreign-masters 
||PTP interface counters, key is interface name | Cisco-IOS-XR-ptp-oper#ptp/interface-packet-counters | |Frequency sync info | Cisco-IOS-XR-freqsync-oper#frequency-synchronization/summary/frequency-summary ||SyncE interface information, key is interface name | Cisco-IOS-XR-freqsync-oper#frequency-synchronization/interface-datas/interface-data |BGP Information| Metric | Sensor path | |———————–| —————————————————||BGP established neighbor count across all AF | Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/vrfs/vrf/process-info/global/established-neighbors-count-total||BGP total neighbor count| Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/vrfs/vrf/process-info/global/neighbors-count-total||BGP prefix SID count| Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/vrfs/vrf/process-info/global/prefix-sid-label-index-count||BGP total VRF count including default VRF| Cisco-IOS-XR-ipv4-bgp-oper#process-info/ipv4-bgp-oper#global/ipv4-bgp-oper#total-vrf-count||BGP convergence|Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/afs/af/af-process-info/performance-statistics/global/|has-converged||BGP IPv4 route count| Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-ext/active-routes-count||OpenConfig BGP information|openconfig-bgp#bgp||OpenConfig BGP neighbor info only| openconfig-bgp#bgp/neighbors|IS-IS Information| Metric | Sensor path | |———————–| —————————————————||IS-IS neighbor info| sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighbors||IS-IS interface info| sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfaces||IS-IS adj information| sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/adjacencies||IS-IS neighbor summary| sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighbor-summaries||IS-IS node count| Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/topologies/topology/topology-levels/topology-level/topology-summary/router-node-count/reachable-node-count||IS-IS adj state| Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/level/adjacencies/adjacency/adjacency-state||IS-IS neighbor count| Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighbor-summaries/neighbor-summary/level2-neighbors/neighbor-up-count||IS-IS total route count| Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2/active-routes-count|Routing protocol RIB information| Metric | Sensor path | |———————–| —————————————————||IS-IS L1 Info|Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1||IS-IS L2 Info|Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2|IS-IS Summary|Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sum||Total route count per protocol|Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count||IPv6 IS-IS L1 info|Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1||IPv6 IS-IS L2 info|Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2||IPv6 IS-IS summary|Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sum||IPv6 total route count per 
protocol|Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count|BGP RIB informationIt is not recommended to monitor these paths using MDT with large tables| Metric | Sensor path | |———————–| —————————————————||openconfig-rib-bgp#bgp-rib||Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-ext||Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-int||Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-ext||Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-int|Routing policy Information| Metric | Sensor path | |———————–| —————————————————||Routing policy information | Cisco-IOS-XR-policy-repository-oper#routing-policy/policies|EVPN Information| Metric | Sensor path | |———————–| —————————————————||EVPN information| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/evpn-summary||Total EVPN| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/evpn-summary/total-count||EVPN total ES entries| Cisco-IOS-XR-evpn-oper#evpn/active/summary/es-entries||EVPN local Eth Auto Discovery routes| Cisco-IOS-XR-evpn-oper#evpn/active/summary/local-ead-routes||EVPN remote Eth Auto Discovery routes| Cisco-IOS-XR-evpn-oper#evpn/active/summary/remote-ead-routes|Per-Interface QoS Statistics Information| Metric | Sensor path | |———————–| —————————————————||Input stats | Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/||General QoS Stats | Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/general-stats||Per-queue stats | Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/queue-stats-array||General service policy information, keys are policy name and interface applied| Cisco-IOS-XR-qos-ma-oper#qos/interface-table/interface/input/service-policy-names|Per-Policy, Per-Interface, Per-Class statisticsSee sensor path name for detailed information on data leafs| Sensor path | |———————–| 
|Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/match-data-rate||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/pre-policy-matched-bytes||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/pre-policy-matched-packets||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-drop-bytes||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-drop-packets||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-drop-rate||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-transmit-rate||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/transmit-bytes||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/queue-instance-length/value||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/queue-max-length/unit||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/queue-max-length/value||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/random-drop-bytes||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/random-drop-packets||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/tail-drop-bytes||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/tail-drop-packets||Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/shared-queue-id|L2VPN Information| Metric | Sensor path | |———————–| —————————————————||L2VPN general forwarding information including EVPN and Bridge Domains| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary||Bridge domain information| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/bridge-domain-summary| |Total BDs active| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/bridge-domain-summary/bridge-domain-count||Total BDs using EVPN| 
Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/bridge-domain-summary/bridge-domain-with-evpn-enabled||Total MAC count (Local+remote)| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/mac-summary/mac-count||L2VPN xconnect Forwarding information| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/xconnect-summary||Xconnect total count| Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects||Xconnect down count| Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects-down||Xconnect up count| Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects-up||Xconnect unresolved| Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects-unresolved||Xconnect with down attachment circuits| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/xconnect-summary/ac-down-count-l2vpn||Per-xconnect detailed information including state| xconnect group and name are keys# Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnects/xconnect||L2VPN bridge domain specific information, will have the BD name as a key| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-bridge-domains/l2fib-bridge-domain||L2VPN EVPN IPv4 MAC/IP information| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip4macs||L2VPN EVPN IPv6 MAC/IP information| Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip6macs|SR-PCE PCC and SR Policy Information| Metric | Sensor path | |———————–| —————————————————||PCC to PCE peer information| Cisco-IOS-XR-infra-xtc-agent-oper#pcc/peers||SR policy summary info| Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary||Specific SR policy information| Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary/configured-down-policy-count||Specific SR policy information| Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary/configured-total-policy-count||Specific SR policy information| Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary/configured-up-policy-count||SR policy information, key is SR policy name| Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policies/policy||SR policy forwarding info including packet and byte stats per candidate path, key is policy name and candidate path| Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-forwardings|MPLS performance measurement| Metric | Sensor path | |———————–| —————————————————||Summary info| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/summary||Interface stats for delay measurements| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/summary/delay-summary/interface-delay-summary/delay-transport-counters/generic-counters||Interface stats for loss measurement| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/summary/loss-summary/interface-loss-summary||SR policy PM statistics| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/sr-policies/sr-policy-delay ||Parent interface oper data sensor path| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces ||Delay values for each probe measurement| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces/delay/interface-last-probes||Delay values aggregated at computation interval| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces/delay/interface-last-aggregations||Delay values aggregated at advertisement interval| Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces/delay/interface-last-advertisements||SR Policy measurement information| 
Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/sr-policies|mLDP Information| Metric | Sensor path | |———————–| —————————————————||mLDP LSP count| Cisco-IOS-XR-mpls-ldp-mldp-oper#mpls-mldp/active/default-context/context/lsp-count||mLDP peer count| Cisco-IOS-XR-mpls-ldp-mldp-oper#mpls-mldp/active/default-context/context/peer-count||mLDP database info, where specific LSP information is stored| Cisco-IOS-XR-mpls-ldp-mldp-oper#mpls-mldp/active/default-context/databases/database|ACL Information| Metric | Sensor path | |———————–| —————————————————||Details on ACL resource consumption| Cisco-IOS-XR-ipv4-acl-oper#ipv4-acl-and-prefix-list/oor/access-list-summary/details/current-configured-ac-es||OpenConfig full ACL information | openconfig-acl#acl|```", "url": "/blogs/2020-10-01-cst-implementation-guide-3_5/", "author": "Phil Bedard", "tags": "iosxr, cisco, 5G, cin, rphy, Metro, Design" } , "blogs-cin-design-guide": { "title": "Cisco Remote PHY Converged Interconnect Network Design Guide", "content": " On This Page Converged Interconnect Network Design Guide Links to complete Converged SDN Transport Documents Cisco Hardware RPD Aggregation Leaf DPIC Aggregation Aggregation Spine Core / PE Cable Converged Interconnect Network (CIN) High Level Design Summary Distributed Access Architecture Remote PHY Components and Requirements Remote PHY Device (RPD) RPD Network Connections Cisco cBR-8 and cnBR cBR-8 Network Connections cBR-8 Redundancy Remote PHY Communication DHCP Remote PHY Standard Flows GCP UEPI and DEPI L2TPv3 Tunnels CIN Network Requirements IPv4/IPv6 Unicast and Multicast Network Timing QoS DHCPv4 and DHCPv6 Relay Converged SDN Transport CIN Design Deployment Topology Options High Scale Design (Recommended) Collapsed Digital PIC and SUP Uplink Connectivity Collapsed RPD and cBR-8 DPIC Connectivity Cisco Hardware Scalable L3 Routed Design L3 IP Routing CIN Router to Router Interconnection Leaf Transit Traffic cBR-8 DPIC to CIN Interconnection DPIC Interface Configuration Router Interface Configuration RPD to Router Interconnection Native IP or L3VPN/mVPN Deployment SR-TE CIN Quality of Service (QoS) CST Network Traffic Classification CST and Remote-PHY Load Balancing Low-Level CIN Design and Configuration IOS-XR Nodes - SR-MPLS Transport Underlay physical interface configuration with BFD SRGB and SRLB Definition IGP protocol (ISIS) and Segment Routing MPLS configuration Key chain global configuration for IS-IS authentication IS-IS router configuration IS-IS Loopback and node SID configuration IS-IS interface configuration with TI-LFA Multicast transport using mLDP Overview mLDP core configuration LDP base configuration with defined interfaces LDP auto-configuration G.8275.2 PTP (1588v2) timing configuration Summary Enable frequency synchronization Optional Synchronous Ethernet configuration (PTP hybrid mode) PTP G.8275.2 global timing configuration PTP G.8275.2 interface profile definitions IPv4 G.8275.2 master profile IPv6 G.8275.2 master profile IPv4 G.8275.2 slave profile IPv6 G.8275.2 slave profile Application of G.8275.2 PTP profile to physical interface G.8275.2 interface configuration CIN Remote-PHY Specific Deployment Configuration Summary Sample QoS Policies Class maps RPD and DPIC interface policy maps Core QoS CIN Timing Configuration Example CBR-8 RPD DTI Profile Multicast configuration Summary Global multicast configuration - Native multicast Global multicast configuration - LSM using profile 14 PIM configuration - Native multicast PIM configuration - LSM using 
profile 14 IGMPv3/MLDv2 configuration - Native multicast IGMPv3/MLDv2 configuration - LSM profile 14 IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation) RPD DHCPv4/v6 relay configuration Native IP / Default VRF RPHY L3VPN cBR-8 DPIC interface configuration without Link HA cBR-8 DPIC interface configuration with Link HA cBR-8 Digital PIC Interface Configuration RPD interface configuration P2P L3 BVI RPD/DPIC agg device IS-IS configuration Additional configuration for L3VPN Design Global VRF Configuration BGP Configuration Network Deployment Example Summary Network Diagrams Global Routing Table with DPIC Leaf L3VPN Collapsed Connectivity Table Consistent Configuration across GRT and L3VPN Designs PE4 Configuration PE4 to Core Sample Interface Configuration IS-IS Configuration PE4 Multicast and PIM Configuration PE4 BGP Configuration to CBR8 cBR-8 Configuration cBR-8 Line Card Redundancy Configuration cBR-8 Link HA Configuration cBR-8 DPIC and DPIC Routing Configuration cBR-8 SUP Routing CIN Core Configuration Network Timing Configuration Timing Configuration between PA4 and ASR-903 Grandmaster Physical Interface Configuration for Timing Master IS-IS Configuration - PA3, PA4, AG3, AG4, A-PE8 CIN to cBR8 DPIC Configurations CIN to DPIC Global Routing Table PA4 Te0/0/0/26 to cBR8 DPIC Te0/1/1 primary active interface IS-IS Configuration CIN to DPIC L3VPN BGP Configuration VRF Configuration AG4 Te0/0/0/26 to cBR8 DPIC Te1/1/2 primary active interface CIN to RPD Configuration CIN to RPD Router Timing Configuration Global timing configuration Core-facing (slave) timing configuration GRT Specific Configuration DHCP Configuration Multicast Routing Configuration PIM Configuration IGMP/MLD and Snooping Configuration DHCP Configuration Physical Interface Configuration L2VPN Bridge Domain Configuration IRB/BRI Logical Interface Configuration IS-IS Routing Configuration for RPD Interface L3VPN Configuration VRF Configuration DHCP Configuration MPLS mLDP Configuration BGP Configuration Multicast Configuration PIM Configuration MLD and PIM Configuration Router to RPD Physical Interface Configuration Converged Interconnect Network Design GuideThis CIN design guide is an excerpt from the complete Converged SDN Transport architecture. If you are implementing a converged residential, mobile, and business access and aggregation network, please visit the links below for more details on the holistic design. This design focuses on elements specific to the support for Remote PHY over CIN.Links to complete Converged SDN Transport Documentshttps#//xrdocs.io/design/blogs/latest-converged-sdn-transport-hldhttps#//xrdocs.io/design/blogs/latest-converged-sdn-transport-igCisco HardwareThe design utilizes the following Cisco hardware in the following roles. While these are the roles used in the design, all of the devices utilize IOS-XR and could be utilized in any role in the CIN depending on scale requirements. All devices presented support class B timing using PTP.RPD Aggregation LeafDue to the 10G requirements of RPDs and future RMD connections, these leaf devices are dense in 10G SFP+ connectivity required to support longer distance DWDM optics. In addition to supporting 10GE SFP+ on all ports the N540-24Z8Q2C has 8x25G SFP28 ports and the 55A1-24Q6H-S has 24x25G SFP28 ports.DPIC AggregationIn larger deployments a DPIC aggregation leaf is recommended. Depending on the cBR8 DPIC card being used, a 10GE or 100GE aggregation device is required. 
The above RPD leaf devices can be utilized for high density DPIC 10GE aggregation. 100GE DPIC aggregation can also be served by the NCS-55A1-24H (24x100GE) or NCS-55A1-36H-S (36x100GE), or a variety of line cards for the NCS 5504 or 5508 modular chassis. All 100GE ports support 4x10GE breakouts for seamless migration to 100GE.Aggregation SpineThe CIN aggregation spine router needs high density 100GE connectivity. Cisco has fixed or modular chassis variants supporting high density 100GE.Core / PEThis device is typically used for cBR8 SUP uplinks. The ASR 9000 series or NCS 5500 can fulfill this role at high density including 400GE support.Cable Converged Interconnect Network (CIN) High Level DesignSummaryThe Converged SDN Transport Design enables a multi-service CIN by adding support for the features and functions required to build a scalable next-generation Ethernet/IP cable access network. Differentiated from simple switch or L3 aggregation designs is the ability to support NG cable transport over the same common infrastructure already supporting other services like mobile backhaul and business VPN services. Cable Remote PHY is simply another service overlayed onto the existing Converged SDN Transport network architecture. We will cover all aspects of connectivity between the Cisco cBR-8 and the RPD device.Distributed Access ArchitectureThe cable Converged Interconnect Network is part of a next-generation Distributed Access Architecture (DAA), an architecture unlocking higher subscriber bandwidth by moving traditional cable functions deeper into the network closer to end users. R-PHY or Remote PHY, places the analog to digital conversion much closer to users, reducing the cable distance and thus enabling denser and higher order modulation used to achieve Gbps speeds over existing cable infrastructure. This reference design will cover the CIN design to support Remote PHY deployments.Remote PHY Components and RequirementsThis section will list some of the components of an R-PHY network and the network requirements driven by those components. It is not considered to be an exhaustive list of all R-PHY components, please see the CableLabs specification document, the latest which can be access via the following URL# https#//specification-search.cablelabs.com/CM-SP-R-PHYRemote PHY Device (RPD)The RPD unlocks the benefits of DAA by integrating the physical analog to digital conversions in a device deployed either in the field or located in a shelf in a facility. The uplink side of the RPD or RPHY shelf is simply IP/Ethernet, allowing transport across widely deployed IP infrastructure. The RPD-enabled node puts the PHY function much closer to an end user, allowing higher end-user speeds. The shelf allows cable operators to terminate only the PHY function in a hub and place the CMTS/MAC function in a more centralized facility, driving efficiency in the hub and overall network. The following diagram shows various options for how RPDs or an RPD shelf can be deployed. Since the PHY function is split from the MAC it allows independent placement of those functions.RPD Network ConnectionsEach RPD is typically deployed with a single 10GE uplink connection. The compact RPD shelf uses a single 10GE uplink for each RPD.Cisco cBR-8 and cnBRThe Cisco Converged Broadband Router performs many functions as part of a Remote PHY solution. 
The cBR-8 provisions RPDs, originates L2TPv3 tunnels to RPDs, provisions cable modems, performs cable subscriber aggregation functions, and acts as the uplink L3 router to the rest of the service provider network. In the Remote PHY architecture the cBR-8 acts as the DOCSIS core and can also serve as a GCP server and video core. The cBR-8 runs IOS-XE. The cnBR, cloud native Broadband Router, provides DOCSIS core functionality in a server-based software platform deployable anywhere in the SP network. CST 3.0 has been validated using the cBR-8; the cnBR will be validated in an upcoming release.

cBR-8 Network Connections

The cBR-8 is best represented as having “upstream” and “downstream” connectivity. The upstream connections are from the cBR-8 Supervisor module to the SP network. Subscriber data traffic and video ingress these uplink connections for delivery to the cable access network. The cBR-8 SUP-160 has 8x10GE SFP+ physical connections, while the SUP-250 has 2xQSFP28/QSFP+ interfaces for 40G/100G upstream connections. In a Remote PHY deployment the downstream connections to the CIN are via the Digital PIC (DPIC-8X10G), providing 40G of R-PHY throughput with 8 SFP+ network interfaces.

cBR-8 Redundancy

The cBR-8 supports both upstream and downstream redundancy. Supervisor redundancy uses active/standby connections to the SP network. Downstream redundancy can be configured at both the line card and port level. Line card redundancy uses an active/active mechanism where each RPD connects to the DOCSIS core function on both the active and hot standby Digital PIC line card. Port redundancy uses the concept of “port pairs” on each Digital PIC, with ports 0/1, 2/3, 4/5, and 6/7 using either an active/active (L2) or active/standby (L3) mechanism. In the CST design we utilize a L3 design with the active/standby mechanism. The mechanism uses the same IP address on both ports, with the standby port kept in a physical down state until switchover occurs.

Remote PHY Communication

DHCP

The RPD is provisioned using ZTP (Zero Touch Provisioning). DHCPv4 and DHCPv6 are used along with CableLabs DHCP options in order to attach the RPD to the correct GCP server for further provisioning.

Remote PHY Standard Flows

The following diagram shows the different core functions of a Remote PHY solution and the communication between those elements.

GCP

Generic Communications Protocol is used for the initial provisioning of the RPD. When the RPD boots and receives its configuration via DHCP, one of the DHCP options will direct the RPD to a GCP server, which can be the cBR-8 or Cisco Smart PHY. GCP runs over TCP, typically on port 8190.

UEPI and DEPI L2TPv3 Tunnels

The upstream output from an RPD is IP/Ethernet, enabling the simplification of the cable access network. Tunnels are used between the RPD PHY functions and DOCSIS core components to transport signals from the RPD to the core elements, whether it be a hardware device like the Cisco cBR-8 or a virtual network function provided by the Cisco cnBR (cloud native Broadband Router). DEPI (Downstream External PHY Interface) comes from the M-CMTS architecture, where a distributed architecture was used to scale CMTS functions. In the Remote PHY architecture DEPI represents a tunnel used to encapsulate and transport traffic from the DOCSIS MAC function to the RPD.
UEPI (Upstream External PHY Interface) is new to Remote PHY, and is used to encode and transport analog signals from the RPD to the MAC function. In Remote PHY both DEPI and UEPI tunnels use L2TPv3, defined in RFC 3931, to transport frames over an IP infrastructure. Please see the following Cisco white paper for more information on how tunnels are created specific to upstream/downstream channels and how data is encoded in the specific tunnel sessions: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/converged-cable-access-platform-ccap-solution/white-paper-c11-732260.html. In general there will be one or two (standby configuration) UEPI and DEPI L2TPv3 tunnels to each RPD, with each tunnel having many L2TPv3 sessions for individual RF channels identified by a unique session ID in the L2TPv3 header. Since L2TPv3 is its own protocol, no port number is used between endpoints; the endpoint IP addresses are used to identify each tunnel. Unicast DOCSIS data traffic can utilize either unicast or multicast L2TPv3 tunnels. Multicast tunnels are used with downstream virtual splitting configurations. Multicast video is encoded and delivered using DEPI tunnels as well, using a multipoint L2TPv3 tunnel to multiple RPDs to optimize video delivery.

CIN Network Requirements

IPv4/IPv6 Unicast and Multicast

Due to the large number of elements and generally greenfield network builds, the CIN network must support all functions using both IPv4 and IPv6. IPv6 may be carried natively across the network or within an IPv6 VPN across an IPv4 MPLS underlay network. Similarly, the network must support multicast traffic delivery for both IPv4 and IPv6, delivered via the global routing table or Multicast VPN. Scalable dynamic multicast requires the use of PIMv4, PIMv6, IGMPv3, and MLDv2, so these protocols are validated as part of the overall network design. IGMPv2 and MLDv2 snooping are also required for designs using access bridge domains and BVI interfaces for aggregation.

Network Timing

Frequency and phase synchronization is required between the cBR-8 and RPD to properly handle upstream scheduling and downstream transmission. Remote PHY uses PTP (Precision Time Protocol) for timing synchronization with the ITU-T G.8275.2 timing profile. This profile carries PTP traffic over IP/UDP and supports a network with partial timing support, meaning multi-hop sessions between Grandmaster, Boundary Clocks, and clients as shown in the diagram below. The cBR-8 and its client RPD require timing alignment to the same Primary Reference Clock (PRC). In order to scale, the network itself must support PTP G.8275.2 as a T-BC (Boundary Clock). Synchronous Ethernet (SyncE) is also recommended across the CIN network to maintain stability when timing to the PRC.

QoS

Control plane functions of Remote PHY are critical to achieving proper operation and subscriber traffic throughput. QoS is required on all RPD-facing ports, the cBR-8 DPIC ports, and all core interfaces in between. Additional QoS may be necessary between the cBR-8, RPD, and any PTP timing elements. See the design section for further details on QoS components.

DHCPv4 and DHCPv6 Relay

As a critical component of the initial boot and provisioning of RPDs, the network must support DHCP relay functionality on all RPD-facing interfaces, for both IPv4 and IPv6.

Converged SDN Transport CIN Design

Deployment Topology Options

The Converged SDN Transport design is extremely flexible in how Remote PHY components are deployed.
Depending on the size of the deployment, components can be deployed in a scalable leaf-spine fabric with dedicated routers for RPD and cBR-8 DPIC connections or collapsed into a single pair of routers for smaller deployments. If a smaller deployment needs to be expanded, the flexible L3 routed design makes it very easy to simply interconnect new devices and scale the design to a fabric supporting thousands of RPD and other access network connections.High Scale Design (Recommended)This option maximizes statistical multiplexing by aggregating Digital PIC downstream connections on a separate leaf device, allowing one to connect a number of cBR-8 interfaces to a fabric with minimal 100GE uplink capacity. The topology also supports the connectivity of remote shelves for hub consolidation. Another benefit is the fabric has optimal HA and the ability to easily scale with more leaf and spine nodes.High scale topologyCollapsed Digital PIC and SUP Uplink ConnectivityThis design for smaller deployments connects both the downstream Digital PIC connections and uplinks on the same CIN core device. If there is enough physical port availability and future growth does not dictate capacity beyond these nodes this design can be used. This design still provides full redundancy and the ability to connect RPDs to any cBR-8. Care should be taken to ensure traffic between the DPIC and RPD does not traverse the SUP uplink interfaces.Collapsed cBR-8 uplink and Digital PIC connectivityCollapsed RPD and cBR-8 DPIC ConnectivityThis design connects each cBR-8 Digital PIC connection to the RPD leaf connected to the RPDs it will serve. This design can also be considered a “pod” design where cBR-8 and RPD connectivity is pre-planned. Careful planning is needed since the number of ports on a single device may not scale efficiently with bandwidth in this configuration.Collapsed or Pod cBR-8 Digital PIC and RPD connectivityIn the collapsed desigs care must be taken to ensure traffic between each RPD can reach the appropriate DPIC interface. If a leaf is single-homed to the aggregation router its DPIC interface is on, RPDs may not be able to reach their DPIC IP. The options with the shortest convergence time are# Adding interconnects between the agg devices or multiple uplinks from the leaf to agg devices.Cisco HardwareThe following table highlights the Cisco hardware utilized within the Converged SDN Transport design for Remote PHY. This table is non-exhaustive. One highlight is all NCS platforms listed are built using the same NPU family and share most features across all platforms. See specific platforms for supported scale and feature support. Product Role 10GE SFP+ 25G SFP28 100G QSFP28 Timing Comments NCS-55A1-24Q6H-S RPD leaf 48 24 6 Class B   N540-ACC-SYS RPD leaf 24 8 2 Class B Smaller deployments N540-28Z4C-A/D RPD leaf 28 0 4 Class B Smaller deployments NCS-55A1-48Q6H-S DPIC leaf 48 48 6 Class B   NCS-55A2-MOD Remote agg 40 24 upto 8 Class B CFP2-DCO support NCS-55A1-36H-S Spine 144 (breakout) 0 36 Class B   NCS-5502 Spine 192 (breakout) 0 48 None   NCS-5504 Multi Upto 576 x Upto 144 Class B 4-slot modular platform Scalable L3 Routed DesignThe Cisco validated design for cable CIN utilizes a L3 design with or without Segment Routing. Pure L2 networks are no longer used for most networks due to their inability to scale, troubleshooting difficulty, poor network efficiency, and poor resiliency. 
L2 bridging can be utilized on RPD aggregation routers to simplify RPD connectivity.L3 IP RoutingLike the overall CST design, we utilize IS-IS for IPv4 and IPv6 underlay routing and BGP to carry endpoint information across the network. The following diagram illustrates routing between network elements using a reference deployment. The table below describes the routing between different functions and interfaces. See the implementation guide for specific configuration. Interface Routing Comments cBR-8 Uplink IS-IS Used for BGP next-hop reachability to SP Core cBR-8 Uplink BGP Advertise subscriber and cable-modem routes to SP Core cBR-8 DPIC Static default in VRF Each DPIC interface should be in its own VRF on the cBR-8 so it has a single routing path to its connected RPDs RPD Leaf Main IS-IS Used for BGP next-hop reachability RPD Leaf Main BGP Advertise RPD L3 interfaces to CIN for cBR-8 to RPD connectivity RPD Leaf Timing BGP Advertise RPD upstream timing interface IP to rest of network DPIC Leaf IS-IS Used for BGP next-hop reachability DPIC Leaf BGP Advertise cBR-8 DPIC L3 interfaces to CIN for cBR-8 to RPD connectivity CIN Spine IS-IS Used for reachability between BGP endpoints, the CIN Spine does not participate in BGP in a SR-enabled network CIN Spine RPD Timing IS-IS Used to advertise RPD timing interface BGP next-hop information and advertise default CIN Spine BGP (optional) In a native IP design the spine must learn BGP routes for proper forwarding CIN Router to Router InterconnectionIt is recommended to use multiple L3 links when interconnecting adjacent routers, as opposed to using LAG, if possible. Bundles increase the possibility for timing inaccuracy due to asymmetric timing traffic flow between slave and master. If bundle interfaces are utilized, care should be taken to ensure the difference in paths between two member links is kept to a minimum. All router links will be configured according to the global CST design. Leaf devices will be considered CST access PE devices and utilize BGP for all services routing.Leaf Transit TrafficIn a single IGP network with equal IGP metrics, certain link failures may cause a leaf to become a transit node. Several options are available to keep transit traffic from transiting a leaf and potentially causing congestion. Using high metrics on all leaf to agg uplinks will prohibit this and is recommended in all configurations.cBR-8 DPIC to CIN InterconnectionThe cBR-8 supports two mechanisms for DPIC high availability outlined in the overview section. DPIC line card and link redundancy is recommended but not a requirement. In the CST reference design, if link redundancy is being used each port pair on the active and standby line cards is connected to a different router and the default active ports (even port number) is connected to a different router. In the example figure, port 0 from active DPIC card 0 is connected to R1 and port 0 from standby DPIC card 1 is connected to R2. DPIC link redundancy MUST be configured using the “cold” method since the design is using L3 to each DPIC interface and no intermediate L2 switching. This is done with the cable rphy link redundancy cold global command and will keep the standby link in a down/down state until switchover occurs.DPIC line card and link HADPIC Interface ConfigurationEach DPIC interface should be configured in its own L3 VRF. 
This ensures traffic from an RPD assigned to a specific DPIC interface takes the path via that specific interface and does not traverse the SUP interface for either ingress or egress traffic. It’s recommended to use a static default route within each DPIC VRF towards the CIN network. Dynamic routing protocols could be utilized; however, they will slow convergence during redundancy switchover.Router Interface ConfigurationIf no link redundancy is utilized, each DPIC interface will connect to the router using a point-to-point L3 interface.If using cBR-8 link HA, failover time is reduced by utilizing the same gateway MAC address on each router. Link HA uses the same IP and MAC address on each port pair on the cBR-8, and retains routing and ARP information for the L3 gateway. If a different MAC address is used on each router, traffic will be dropped until an ARP occurs to populate the GW MAC address on the router after failover. On the NCS platforms, a static MAC address cannot be set on a physical L3 interface. The method used to set a static MAC address is to use a BVI (Bridged Virtual Interface), which allows one to set a static MAC address. In the case of DPIC interface connectivity, each DPIC interface should be placed into its own bridge domain with an associated BVI interface. Since each DPIC port is directly connected to the router interface, the same MAC address can be utilized on each BVI.If using IS-IS to distribute routes across the CIN, each DPIC physical interface or BVI should be configured as a passive IS-IS interface in the topology. If using BGP to distribute routing information, the “redistribute connected” command should be used with an appropriate route policy to restrict connected routes to only the DPIC interfaces. The BGP configuration is the same whether using L3VPN or the global routing table.It is recommended to use a /31 for IPv4 and /127 for IPv6 addresses for each DPIC port whether using an L3 physical interface or BVI on the CIN router.RPD to Router InterconnectionThe Converged SDN Transport design supports both P2P L3 interfaces for RPD and DPIC aggregation as well as using Bridge Virtual Interfaces. A BVI is a logical L3 interface within an L2 bridge domain. In the BVI deployment the DPIC and RPD physical interfaces connected to a single leaf device share a common IP subnet with the gateway residing on the leaf router.It is recommended to configure the RPD leaf using bridge-domains and BVI interfaces. This eases configuration on the leaf device as well as the DHCP configuration used for RPD provisioning.The following shows the P2P and BVI deployment options.Native IP or L3VPN/mVPN DeploymentTwo options are available and validated to carry Remote PHY traffic between the RPD and MAC function. Native IP means the end to end communication occurs as part of the global routing table. In a network with SR-MPLS deployed such as the CST design, unicast IP traffic is still carried across the network using an MPLS header. This allows for fast reconvergence in the network using SR and enables the network to carry other VPN services even if they are not used to carry Remote PHY traffic. In the native IP deployment, multicast traffic uses either PIM signaling with IP multicast forwarding or mLDP in-band signaling for label-switched multicast. The multicast profile used is profile 7 (Global mLDP in-band signaling). L3VPN and mVPN can also be utilized to carry Remote PHY traffic within a VPN service end to end. 
This has the benefit of separating Remote PHY traffic from the network underlay, improving security and treating Remote PHY as another service on a converged access network. Multicast traffic in this use case uses mVPN profile 14. mLDP is used for label-switched multicast, and the NG-MVPN BGP control plane is used for all multicast discovery and signaling. SR-TESegment Routing Traffic Engineering may be utilized to carry traffic end to end across the CIN network. Using On-Demand Networking simplifies the deployment of SR-TE Policies from ingress to egress by using specific color BGP communities to instruct head-end nodes to create policies satisfying specific user constraints. As an example, if RPD aggregation prefixes are advertised using BGP to the DPIC aggregation device, SR-TE tunnels following a user constraint can be built dynamically between those endpoints.CIN Quality of Service (QoS)QoS is a requirement for delivering trouble-free Remote PHY. This design uses sample QoS configurations for concept illustration, but QoS should be tailored for specific network deployments. New CIN builds can utilize the configurations in the implementation guide verbatim if no other services are being carried across the network. Please see the section in this document on QoS for general NCS QoS information and the implementation guide for specific details.CST Network Traffic ClassificationThe following table lists specific traffic types which should be treated with specific priority, default markings, and network classification points. Traffic Type Ingress Interface Priority Default Marking Comments BGP Routers, cBR-8 Highest CS6 (DSCP 48) None IS-IS Routers, cBR-8 Highest CS6 IS-IS is single-hop and uses highest priority queue by default BFD Routers Highest CS6 BFD is single-hop and uses highest priority queue by default DHCP RPD High CS5 DHCP COS is set explicitly PTP All High DSCP 46 Default on all routers, cBR-8, and RPD DOCSIS MAP/UCD RPD, cBR-8 DPIC High DSCP 46   DOCSIS BWR RPD, cBR-8 DPIC High DSCP 46   GCP RPD, cBR-8 DPIC Low DSCP 0   DOCSIS Data RPD, cBR-8 DPIC Low DSCP 0   Video cBR-8 Medium DSCP 32 Video within multicast L2TPv3 tunnel when cBR-8 is video core MDD RPD, cBR-8 Medium DSCP 40   CST and Remote-PHY Load BalancingUnicast network traffic is load balanced based on MPLS labels and IP header criteria. The devices used in the CST design are capable of load balancing traffic based on MPLS labels used in the SR underlay and IP headers underneath any MPLS labels. In the higher bandwidth downstream direction, where a series of L2TPv3 tunnels are created from the cBR-8 to the RPD, traffic is hashed based on the source and destination IP addresses of those tunnels. Downstream L2TPv3 tunnels from a single Digital PIC interface to a set of RPDs will be distributed across the fabric based on RPD destination IP address. The following illustrates unicast load balancing across the network.Multicast traffic is not load balanced across the network. Whether the network is utilizing PIMv4, PIMv6, or mVPN, a multicast flow with two equal cost downstream paths will utilize only a single path, and only a single member link will be utilized in a link bundle. 
If using multicast, ensure sufficient bandwidth is available on a single link between two adjacencies.Low-Level CIN Design and ConfigurationIOS-XR Nodes - SR-MPLS TransportUnderlay physical interface configuration with BFDinterface TenGigE0/0/0/10 bfd mode ietf bfd address-family ipv4 timers start 180 bfd address-family ipv4 multiplier 3 bfd address-family ipv4 destination 10.1.2.1 bfd address-family ipv4 fast-detect bfd address-family ipv4 minimum-interval 50 mtu 9216 ipv4 address 10.15.150.1 255.255.255.254 ipv4 unreachables disable load-interval 30 dampeningSRGB and SRLB DefinitionIt’s recommended to first configure the Segment Routing Global Block (SRGB) across all nodes needing connectivity between each other. In most instances a single SRGB will be used across the entire network. In an SR MPLS deployment the SRGB and SRLB correspond to the label blocks allocated to SR. IOS-XR has a maximum configurable SRGB limit of 512,000 labels; however, please consult platform-specific documentation for maximum values. The SRLB corresponds to the labels allocated for SIDs local to the node, such as Adjacency-SIDs. It is recommended to configure the same SRLB block across all nodes. The SRLB must not overlap with the SRGB. The SRGB and SRLB are configured in IOS-XR with the following configuration#segment-routing global-block 16000 23999 local-block 15000 15999 IGP protocol (ISIS) and Segment Routing MPLS configurationKey chain global configuration for IS-IS authenticationkey chain ISIS-KEY key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 00071A150754 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5 IS-IS router configurationAll routers, except Area Border Routers (ABRs), are part of one IGP domain and L2 area (ISIS-ACCESS or ISIS-CORE). Area border routers run two IGP IS-IS processes (ISIS-ACCESS and ISIS-CORE). Note that Loopback0 is part of both IGP processes.router isis ISIS-ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0101.0000.0110.00 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 5 secondary-wait 100 lsp-refresh-interval 65000 max-lsp-lifetime 65535 address-family ipv4 unicast metric-style wide advertise link attributes spf-interval maximum-wait 1000 initial-wait 5 secondary-wait 100 segment-routing mpls spf prefix-priority high tag 1000 maximum-redistributed-prefixes 100 level 2 ! address-family ipv6 unicast metric-style wide spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 maximum-redistributed-prefixes 100 level 2Note# ABR Loopback 0 on domain boundary is part of both IGP processes together with the same “prefix-sid absolute” value, giving resiliency to the border.Note# The prefix SID can be configured as either absolute or index. The index configuration is required for interop with nodes using a different SRGB.IS-IS Loopback and node SID configuration interface Loopback0 ipv4 address 100.0.1.50 255.255.255.255 address-family ipv4 unicast prefix-sid absolute 16150 tag 1000 IS-IS interface configuration with TI-LFAIt is recommended to use manual adjacency SIDs. A protected SID is eligible for backup path computation, meaning that if a packet ingresses the node with that label, a backup path will be provided in case of a failure. In the case of having multiple adjacencies between the same two nodes, use the same adjacency-sid on each link. 
interface TenGigE0/0/0/10 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa adjacency-sid absolute 15002 protected metric 100 ! address-family ipv6 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 Multicast transport using mLDPOverviewThis portion of the implementation guide instructs the user how to configure mLDP end to end across the multi-domain network. Multicast service examples are given in the “Services” section of the implementation guide.mLDP core configurationIn order to use mLDP across the Converged SDN Transport network LDP must first be enabled. There are two mechanisms to enable LDP on physical interfaces across the network, LDP auto-configuration or manually under the MPLS LDP configuration context. The capabilities statement will ensure LDP unicast FECs are not advertised, only mLDP FECs. Recursive forwarding is required in a multi-domain network. mLDP must be enabled on all participating A-PE, PE, AG, PA, and P routers.LDP base configuration with defined interfacesmpls ldp capabilities sac mldp-only mldp logging notifications address-family ipv4 make-before-break delay 30 forwarding recursive recursive-fec ! ! router-id 100.0.2.53 session protection address-family ipv4 ! interface TenGigE0/0/0/6 ! interface TenGigE0/0/0/7 LDP auto-configurationLDP can automatically be enabled on all IS-IS interfaces with the following configuration in the IS-IS configuration. It is recommended to do this only after configuring all MPLS LDP properties.router isis ACCESS address-family ipv4 unicast segment-routing mpls sr-prefer mpls ldp auto-config G.8275.2 PTP (1588v2) timing configurationSummaryThis section contains the base configurations used for both G.8275.1 and G.8275.2 timing. Please see the CST 3.0 HLD for an overview on timing in general.Enable frequency synchronizationIn order to lock the internal oscillator to a PTP source, frequency synchronization must first be enabled globally.frequency synchronization quality itu-t option 1 clock-interface timing-mode system log selection changes! Optional Synchronous Ethernet configuration (PTP hybrid mode)If the end-to-end devices support SyncE it should be enabled. SyncE will allow much faster frequency sync and maintain integrity for long periods of time during holdover events. Using SyncE for frequency and PTP for phase is known as “Hybrid” mode. A lower priority is used on the SyncE input (50 for SyncE vs. 100 for PTP).interface TenGigE0/0/0/10 frequency synchronization selection input priority 50 !! PTP G.8275.2 global timing configurationAs of CST 3.0, IOS-XR supports a single PTP timing profile and single clock type in the global PTP configuration. The clock domain should follow the ITU-T guidelines for specific profiles using a domain >44 for G.8275.2 clocks.ptp clock domain 60 profile g.8275.2 clock-type T-BC ! frequency priority 100 time-of-day priority 50 log servo events best-master-clock changes ! PTP G.8275.2 interface profile definitionsIt is recommended to use “profiles” defined globally which are then applied to interfaces participating in timing. This helps minimize per-interface timing configuration. 
It is also recommended to define different profiles for “master” and “slave” interfaces.IPv4 G.8275.2 master profileThe master profile is assigned to interfaces for which the router is acting as a boundary clockptp profile g82752_master_v4 transport ipv4 port state master-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 5 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv6 G.8275.2 master profileThe master profile is assigned to interfaces for which the router is acting as a boundary clockptp profile g82752_master_v6 transport ipv6 port state master-only sync frequency 16 clock operation one-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv4 G.8275.2 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82752_slave_v4 transport ipv4 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv6 G.8275.2 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82752_slave_v6 transport ipv6 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! Application of G.8275.2 PTP profile to physical interfaceNote# In CST 3.0 PTP may only be enabled on physical interfaces. G.8275.1 operates at L2 and supports PTP across Bundle member links and interfaces part of a bridge domain. G.8275.2 operates at L3 and does not support Bundle interfaces or BVI interfaces.G.8275.2 interface configurationThis example is of a slave device using a master of 2405#10#23#253##0.interface TenGigE0/0/0/6 ptp profile g82752_slave_v6 master ipv6 2405#10#23#253## ! ! 
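Once the profiles are applied, it can be useful to confirm frequency and phase lock from the CLI before moving on. The following is a minimal verification sketch using common IOS-XR show commands; exact command options and output vary by platform and release, so treat it as a starting point only.
show frequency synchronization interfaces brief
show ptp interfaces brief
show ptp foreign-masters
The frequency synchronization output should show the SyncE input selected as the frequency source, and the PTP output should show the expected port states and the foreign master the slave port has selected.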
CIN Remote-PHY Specific Deployment ConfigurationSummaryDesign decisions are detailed in the CST 3.0 high-level design guide; this section provides sample configurations.Sample QoS PoliciesThe following are usable policies, but they should be tailored for specific network deployments.Class mapsClass maps are used within a policy map to match packet criteria for further treatmentclass-map match-any match-ef-exp5 description High priority, EF match dscp 46 match mpls experimental topmost 5 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 match mpls experimental topmost 4 end-class-map!class-map match-any match-video-cs4-exp2 description Video match dscp 32 match mpls experimental topmost 2 end-class-map!class-map match-any match-cs6-exp6 description Highest priority control-plane traffic match dscp cs6 match mpls experimental topmost 6 end-class-map!class-map match-any match-qos-group-1 match qos-group 1 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-map!class-map match-any match-qos-group-6 match qos-group 6 end-class-map!class-map match-any match-traffic-class-1 description ~Match highest priority traffic-class 1~ match traffic-class 1 end-class-map!class-map match-any match-traffic-class-2 description ~Match high priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match medium traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-traffic-class-6 description ~Match video traffic-class 6~ match traffic-class 6 end-class-map RPD and DPIC interface policy mapsThese are applied to all interfaces connected to cBR-8 DPIC and RPD devices.Note# Egress queueing maps are not supported on L3 BVI interfacesRPD/DPIC ingress classifier policy mappolicy-map rpd-dpic-ingress-classifier class match-cs6-exp6 set traffic-class 1 set qos-group 1 ! class match-ef-exp5 set traffic-class 2 set qos-group 2 ! class match-cs5-exp4 set traffic-class 3 set qos-group 3 ! class match-video-cs4-exp2 set traffic-class 6 set qos-group 6 ! class class-default set traffic-class 0 set dscp 0 set qos-group 0 ! end-policy-map! P2P RPD and DPIC egress queueing policy mappolicy-map rpd-dpic-egress-queuing class match-traffic-class-1 priority level 1 queue-limit 500 us ! class match-traffic-class-2 priority level 2 queue-limit 100 us ! class match-traffic-class-3 priority level 3 queue-limit 500 us ! class match-traffic-class-6 priority level 6 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map! Core QoSPlease see the general QoS section for core-facing QoS configurationCIN Timing ConfigurationPlease see the G.8275.2 timing configuration guide in this document for details on timing configuration. The following values should be used for PTP configuration attributes. Please note in CST 3.0 the use of an IOS-XR router as a Boundary Clock is only supported on P2P L3 interfaces. The use of a BVI for RPD aggregation requires the BC used for RPD nodes be located upstream, or alternatively a physical loopback cable may be used to provide timing off the IOS-XR based RPD leaf device. 
PTP variable IOS-XR configuration value IOS-XE value Announce Interval 1 1 Announce Timeout 5 5 Sync Frequency 16 -4 Delay Request Frequency 16 -4 Example CBR-8 RPD DTI Profileptp r-dti 4 profile G.8275.2 ptp-domain 60 clock-port 1 clock source ip 192.168.3.1 sync interval -4 announce timeout 5 delay-req interval -4 Multicast configurationSummaryWe present two different configuration options based on either native multicast deployment or the use of a L3VPN to carry Remote PHY traffic. The L3VPN option shown uses Label Switched Multicast profile 14 (partitioned mLDP) however profile 6 could also be utilized.Global multicast configuration - Native multicastOn CIN aggregation nodes all interfaces should have multicast enabled.multicast-routing address-family ipv4 interface all enable ! address-family ipv6 interface all enable enable ! Global multicast configuration - LSM using profile 14On CIN aggregation nodes all interfaces should have multicast enabled.vrf VRF-MLDP address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix bgp auto-discovery mldp ! mdt partitioned mldp ipv4 p2mp mdt data 100 ! ! PIM configuration - Native multicastPIM should be enabled for IPv4/IPv6 on all core facing interfacesrouter pim address-family ipv4 interface Loopback0 enable ! interface TenGigE0/0/0/6 enable ! interface TenGigE0/0/0/7 enable ! ! PIM configuration - LSM using profile 14The PIM configuration is utilized even though no PIM neighbors may be connected.route-policy mldp-partitioned-p2mp set core-tree mldp-partitioned-p2mpend-policy!router pim address-family ipv4 interface Loopback0 enable vrf rphy-vrf address-family ipv4 rpf topology route-policy mldp-partitioned-p2mp mdt c-multicast-routing bgp ! ! IGMPv3/MLDv2 configuration - Native multicastInterfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabledrouter igmp interface BVI100 version 3 ! interface TenGigE0/0/0/25 version 3 !!router mld interface BVI100 version 2 interface TenGigE0/0/0/25 version 3 ! ! IGMPv3/MLDv2 configuration - LSM profile 14Interfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabled as neededrouter igmp vrf rphy-vrf interface BVI101 version 3 ! interface TenGigE0/0/0/15 ! !!router mld vrf rphy-vrf interface TenGigE0/0/0/15 version 2 ! !! IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation)In order to limit L2 multicast replication for specific groups to only interfaces with interested receivers, IGMP and MLD snooping must be enabled.igmp snooping profile igmp-snoop-1!mld snooping profile mld-snoop-1! RPD DHCPv4/v6 relay configurationIn order for RPDs to self-provision DHCP relay must be enabled on all RPD-facing L3 interfaces. In IOS-XR the DHCP relay configuration is done in its own configuration context without any configuration on the interface itself.Native IP / Default VRFdhcp ipv4 profile rpd-dhcpv4 relay helper-address vrf default 10.0.2.3 ! interface BVI100 relay profile rpd-dhcpv4!dhcp ipv6 profile rpd-dhcpv6 relay helper-address vrf default 2001#10#0#2##3 iana-route-add source-interface BVI100 ! interface BVI100 relay profile rpd-dhcpv6 RPHY L3VPNIn this example it is assumed the DHCP server exists within the rphy-vrf VRF, if it does not then additional routing may be necessary to forward packets between VRFs.dhcp ipv4 vrf rphy-vrf relay profile rpd-dhcpv4-vrf profile rpd-dhcpv4-vrf relay helper-address vrf rphy-vrf 10.0.2.3 relay information option allow-untrusted ! 
inner-cos 5 outer-cos 5 interface BVI101 relay profile rpd-dhcpv4-vrf interface TenGigE0/0/0/15 relay profile rpd-dhcpv4-vrf! cBR-8 DPIC interface configuration without Link HAWithout link HA the DPIC port is configured as a normal physical interfaceinterface TenGigE0/0/0/25 description .. Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 cBR-8 DPIC interface configuration with Link HAWhen using Link HA, faster convergence is achieved when each DPIC interface is placed into a BVI with a statically assigned MAC address. Each DPIC interface is placed into a separate bridge-domain with a unique BVI L3 interface. The same MAC address should be utilized on all BVI interfaces. Convergence using BVI interfaces is <50ms; with L3 physical interfaces it is 1-2s.Even DPIC port CIN interface configurationinterface TenGigE0/0/0/25 description ~Connected to cBR8 port Te1/1/0~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-0 interface TenGigE0/0/0/25 ! routed interface BVI500 ! ! ! interface BVI500 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! Odd DPIC port CIN interface configurationinterface TenGigE0/0/0/26 description ~Connected to cBR8 port Te1/1/1~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-1 interface TenGigE0/0/0/26 ! routed interface BVI501 ! ! ! interface BVI501 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! cBR-8 Digital PIC Interface Configurationinterface TenGigE0/0/0/25 description .. Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 RPD interface configurationP2P L3In this example the interface has PTP enabled towards the RPDinterface TenGigE0/0/0/15 description To RPD-1 mtu 9200 ptp profile g82752_master_v4 ! service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 192.168.2.0 255.255.255.254 ipv6 address 2001#192#168#2##0/127 ipv6 enable ! BVIl2vpn bridge group rpd bridge-domain rpd-1 mld snooping profile mld-snoop-1 igmp snooping profile igmp-snoop-1 interface TenGigE0/0/0/15 ! interface TenGigE0/0/0/16 ! interface TenGigE0/0/0/17 ! routed interface BVI100 ! ! ! !!interface BVI100 description ... to downstream RPD hosts service-policy input rpd-dpic-ingress-classifier ipv4 address 192.168.2.1 255.255.255.0 ipv6 address 2001#192#168#2##1/64 ipv6 enable ! RPD/DPIC agg device IS-IS configurationThe standard IS-IS configuration should be used on all core interfaces with the addition of specifying all DPIC and RPD connected interfaces as IS-IS passive interfaces. Using passive interfaces is preferred over redistributing connected routes. This configuration is needed for reachability between DPIC and RPDs across the CIN network.router isis ACCESS interface TenGigE0/0/0/25 passive address-family ipv4 unicast ! 
address-family ipv6 unicast Additional configuration for L3VPN DesignGlobal VRF ConfigurationThis configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elementsvrf rphy-vrf address-family ipv4 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! address-family ipv6 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! BGP ConfigurationThis configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elementsrouter bgp 100 vrf rphy-vrf rd auto address-family ipv4 unicast label mode per-vrf redistribute connected ! address-family ipv6 unicast label mode per-vrf redistribute connected ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! ! Network Deployment ExampleSummaryIn this section we will show an example deployments with complete configurations. There are two flavors of deployment covered in this section, the first using the Global Routing Table (GRT) to carry CIN traffic, the second using a L3VPN to carry all CIN traffic. The recommended design option for those building a converged access and aggregation network is to utilize a L3VPN to carry CIN traffic. In both designs we will enable Segment Routing to provide 50ms failover across the CIN network.Network DiagramsGlobal Routing Table with DPIC LeafL3VPN CollapsedConnectivity Table A Node A Int Z Node Z Int Role PE4 Te0/0/0/19 CBR8 Te4/1/6 SUP Uplink PE3 Te0/0/0/19 CBR8 Te4/1/5 SUP Uplink PA3 Te0/0/0/25 CBR8 Te0/1/0 GRT DPIC to CIN Secondary Active PA3 Te0/0/0/26 CBR8 Te1/1/1 GRT DPIC to CIN Primary Standby PA3 BE321 PE3 BE321 Core PA3 BE421 PE4 BE421 Core PA3 Te0/0/0/21 AG3 Te0/0/0/20 Core PA3 Te0/0/0/20 AG4 Te0/0/0/21 Core PA4 Te0/0/0/25 CBR8 Te0/1/1 GRT DPIC to CIN Secondary Standby PA4 Te0/0/0/26 CBR8 Te1/1/0 GRT DPIC to CIN Primary Active PA4 Te0/0/0/5 TGM-903 Te0/3/2 CIN to PTP GM PA4 BE322 PE3 BE322 Core PA4 BE422 PE4 BE422 Core PA4 Te0/0/0/21 AG3 Te0/0/0/21 Core PA4 Te0/0/0/20 AG4 Te0/0/0/20 Core AG3 Hu0/0/1/0 Ag4 Hu0/0/1/1 Core AG3 Te0/0/0/30 A-PE8 Te0/0/0/6 Core (RPD leaf) AG3 Te0/0/0/25 CBR8 Te0/1/2 L3VPN DPIC to CIN Secondary Active AG3 Te0/0/0/26 CBR8 Te1/1/3 L3VPN DPIC to CIN Primary Standby AG4 Te0/0/0/30 A-PE8 Te0/0/0/7 Core (RPD leaf) AG4 Te0/0/0/25 CBR8 Te0/1/3 L3VPN DPIC to CIN Secondary Standby AG4 Te0/0/0/26 CBR8 Te1/1/2 L3VPN DPIC to CIN Primary Active A-PE8 Te0/0/0/15 RPD0 Eth0 RPD A-PE8 Te0/0/0/16 RPD1 Eth0 RPD A-PE8 Te0/0/0/17 RPD2 Eth0 RPD Consistent Configuration across GRT and L3VPN DesignsPE4 ConfigurationPE4 connects to the uplink interfaces on the cBR-8 SUP-250 supervisor module. Its configuration is the same across both designs.PE4 to Core Sample Interface Configurationinterface Bundle-Ether34 bfd mode ietf bfd address-family ipv4 timers start 180 bfd address-family ipv4 multiplier 3 bfd address-family ipv4 destination 10.3.4.0 bfd address-family ipv4 fast-detect bfd address-family ipv4 minimum-interval 50 mtu 9216 service-policy input core-ingress-classifier-9k service-policy output core-egress-queuing-9k ipv4 address 10.3.4.1 255.255.255.254 ipv4 unreachables disable ipv6 address 2405#10#3#4##1/127 bundle minimum-active links 1 load-interval 30 dampening! 
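As a quick sanity check after bringing up a core bundle like the one above, the bundle and BFD state can be verified from the CLI. This is a hedged sketch using standard IOS-XR show commands, with the interface name taken from the example configuration above; output format varies by release.
show bundle Bundle-Ether34
show bfd session
show isis adjacency
All bundle members should be Active, a BFD session should be up over the bundle, and, once the IS-IS configuration that follows is applied, an IS-IS adjacency should be established toward the neighboring core node.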
IS-IS Configurationkey chain ISIS-CBR key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 08285F471A09040401 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5 !!router isis ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0102.0000.0004.00 segment-routing global-block 32000 64000 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 lsp-refresh-interval 65000 max-lsp-lifetime 65535 address-family ipv4 unicast metric-style wide mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0 spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 segment-routing mpls spf prefix-priority critical tag 5000 spf prefix-priority high tag 1000 ! address-family ipv6 unicast metric-style wide ! interface TenGigE0/0/0/19 circuit-type level-2-only point-to-point hello-password keychain ISIS-CBR address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 ! address-family ipv6 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 ! !! PE4 Multicast and PIM ConfigurationIn IOS-XR multicast must be enabled on all participating interfaces for IPv4 and IPv6. It is easiest to enable multicast on all interfaces with the “interface all enable” command. PIM should be enabled for IPv4 and IPv6 on all core interfaces as well as the interface to the cBR8 if the cBR8 is acting as a video core.multicast-routing address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix ! address-family ipv6 rate-per-route interface all enable accounting per-prefix !!router pim address-family ipv4 interface Loopback0 enable ! interface Bundle-Ether421 enable ! interface Bundle-Ether422 enable ! interface TenGigE0/0/0/19 enable address-family ipv6 interface Loopback0 enable ! interface Bundle-Ether421 enable ! interface Bundle-Ether422 enable ! interface TenGigE0/0/0/19 enable PE4 BGP Configuration to CBR8router bgp 100 nsr bgp router-id 100.0.0.4 bgp graceful-restart ibgp policy out enforce-modifications address-family ipv4 unicast ! address-family ipv6 unicast ! neighbor 1.0.0.13 remote-as 100 update-source Loopback0 address-family ipv4 unicast ! ! neighbor 2001#1##13 remote-as 100 update-source Loopback0 address-family ipv6 unicast ! !! cBR-8 ConfigurationThe cBR-8 configuration will remain the same across each design option.cBR-8 Line Card Redundancy Configurationredundancy mode sso linecard-group 0 internal-switch class 1#N member slot 1 primary member slot 0 secondary no revertive cBR-8 Link HA Configurationcable rphy link redundancy cold cBR-8 DPIC and DPIC Routing ConfigurationEach DPIC interface is placed in a VRF with associated default routes pointing to the CIN device as its gateway. This configuration ensures traffic destined for the DPIC IP address will only ingress the DPIC interface and not the SUP interface.vrf definition lc0_p0 ! address-family ipv4 exit-address-family ! address-family ipv6 exit-address-family!!interface TenGigabitEthernet0/1/0 description ~Connected to PA4 TenGigE0/0/0/25~ vrf forwarding lc0_p0 ip address 3.3.9.100 255.255.255.0 logging event link-status cdp enable ipv6 address 2001#3#3#9##100/64 ipv6 enable!ip route vrf lc0_p0 0.0.0.0 0.0.0.0 3.3.9.101!ipv6 route vrf lc0_p0 ##/0 2001#4#4#9##101! vrf definition lc0_p1 ! address-family ipv4 exit-address-family ! 
address-family ipv6 exit-address-family!!interface TenGigabitEthernet0/1/2 description ~Connected to AG3 TenGigE0/0/0/25~ vrf forwarding lc0_p1 ip address 5.5.9.100 255.255.255.0 logging event link-status cdp enable ipv6 address 2001#5#5#9##100/64 ipv6 enable!ip route vrf lc0_p1 0.0.0.0 0.0.0.0 5.5.9.101!ipv6 route vrf lc0_p1 ##/0 2001#5#5#9##101! vrf definition lc1_p0 ! address-family ipv4 exit-address-family ! address-family ipv6 exit-address-family!!interface TenGigabitEthernet1/1/0 description ~Connected to PA4 TenGigE0/0/0/25~ vrf forwarding lc1_p0 ip address 4.4.9.100 255.255.255.0 logging event link-status load-interval 30 cdp enable ipv6 address 2001#4#4#9##100/64 ipv6 enable!ip route vrf lc1_p0 0.0.0.0 0.0.0.0 4.4.9.101!ipv6 route vrf lc1_p0 ##/0 2001#4#4#9##101!vrf definition lc1_p1 ! address-family ipv4 exit-address-family ! address-family ipv6 exit-address-family!!interface TenGigabitEthernet1/1/2 description ~Connected to AG3 TenGigE0/0/0/25~ vrf forwarding lc1_p1 ip address 6.6.9.100 255.255.255.0 logging event link-status cdp enable ipv6 address 2001#6#6#9##100/64 ipv6 enable!ip route vrf lc1_p1 0.0.0.0 0.0.0.0 6.6.9.101!ipv6 route vrf lc1_p1 ##/0 2001#6#6#9##101 cBR-8 SUP RoutingIn this example we will utilize IS-IS between the cBR-8 and provider network, and utilize BGP to advertise subscriber and cable modem address space to the rest of the network.IS-IS Configurationkey chain ISIS-KEYCHAIN key 0 key-string isispass accept-lifetime 00#00#00 Jan 1 2018 infinite send-lifetime 00#00#00 Jan 1 2018 infinite cryptographic-algorithm md5 ! !!router isis access net 49.0001.0010.0000.0013.00 is-type level-2-only router-id Loopback0 metric-style wide log-adjacency-changes ! address-family ipv6 multi-topology exit-address-family!interface Loopback0 ip address 1.0.0.13 255.255.255.255 ip router isis access isis circuit-type level-2-onlyend!interface TenGigabitEthernet4/1/6 description ~Connected to PE4 TenGigE 0/0/0/19~ ip address 4.1.6.1 255.255.255.0 ip router isis access load-interval 30 cdp enable ipv6 address 2001#4#1#6##1/64 ipv6 router isis access !! mpls ip optional for LDP enabled CMTS mpls ip isis circuit-type level-2-only isis network point-to-point isis authentication mode md5 isis authentication key-chain ISIS-KEYCHAIN isis csnp-interval 10 hold-queue 400 inend BGP Configurationrouter bgp 100 bgp router-id 1.0.0.13 bgp log-neighbor-changes bgp graceful-restart timers bgp 5 60 neighbor 2001#100##4 remote-as 100 neighbor 2001#100##4 ha-mode sso neighbor 2001#100##4 update-source Loopback0 neighbor 2001#100##4 ha-mode graceful-restart neighbor 100.0.0.4 remote-as 100 neighbor 100.0.0.4 ha-mode sso neighbor 100.0.0.4 update-source Loopback0 neighbor 100.0.0.4 ha-mode graceful-restart ! address-family ipv4 redistribute connected redistribute static route-map static-route no neighbor 2001#100##4 activate neighbor 100.0.0.4 activate neighbor 100.0.0.4 send-community extended neighbor 100.0.0.4 next-hop-self neighbor 100.0.0.4 soft-reconfiguration inbound exit-address-family ! address-family ipv6 redistribute connected neighbor 2001#100##4 activate neighbor 2001#100##4 send-community extended neighbor 2001#100##4 next-hop-self neighbor 2001#100##4 soft-reconfiguration inbound exit-address-family CIN Core ConfigurationMuch of the configuration across the IOS-XR CIN nodes is the same with regards to routing. 
Single examples will be given which can be applied across the specific devices and interfaces in the design.Network Timing ConfigurationSee the timing configuration section in this document for standard timing profiles. The following diagram shows the flow of timing end to end across the network.Timing Configuration between PA4 and ASR-903 GrandmasterThe following configuration enables IPv4 G.8275.2 and SyncE between the Grandmaster and the CIN networkinterface TenGigE0/0/0/5 description Connected to PTP Grandmaster ptp profile g82752_slave_v4 transport ipv4 port state slave-only master ipv4 23.23.23.1 ! ! ipv4 address 10.1.15.1 255.255.255.0 ipv6 nd prefix default no-adv ipv6 nd other-config-flag ipv6 nd managed-config-flag ipv6 address 2001#10#1#15##1/64 load-interval 30 frequency synchronization selection input priority 1 wait-to-restore 1 !! Physical Interface Configuration for Timing MasterThe following is used on core interfaces with downstream slave interfacesinterface TenGigE0/0/0/20 mtu 9216 ptp profile g82752_master_v4 ! service-policy input core-ingress-classifier service-policy output core-egress-queuing service-policy output core-egress-exp-marking ipv4 address 10.22.24.0 255.255.255.254 ipv4 unreachables disable ipv6 address 2405#10#22#24##/127 load-interval 30 dampening! IS-IS Configuration - PA3, PA4, AG3, AG4, A-PE8The following IS-IS configuration is used on all CIN IOS-XR nodes. An example is given for a core interface along with the loopback interface with SR enabled. The prefix-sid index of Lo0 must be unique on each node.router isis ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0102.0000.0022.00 segment-routing global-block 32000 64000 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 lsp-refresh-interval 65000 max-lsp-lifetime 65535 address-family ipv4 unicast metric-style wide microloop avoidance segment-routing mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0 spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 segment-routing mpls spf prefix-priority critical tag 5000 spf prefix-priority high tag 1000 ! address-family ipv6 unicast metric-style wide ! interface Loopback0 address-family ipv4 unicast prefix-sid index 22 ! address-family ipv6 unicast ! ! interface Bundle-Ether322 bfd minimum-interval 50 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 adjacency-sid absolute 15322 protected ! address-family ipv6 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 ! ! CIN to cBR8 DPIC ConfigurationsThe following gives an example of the configuration between the CIN interface and the DPIC interface on the cBR8 for both global routing table (GRT) and L3VPN deployments. This also includes the routing configuration needed to advertise the DPIC IP address across the CIN for reachability between RPD and DPIC.CIN to DPIC Global Routing TableIn this use case we are utilizing the IGP (IS-IS) for reachability between the RPD and DPIC interface.PA4 Te0/0/0/26 to cBR8 DPIC Te0/1/1 primary active interfaceinterface TenGigE0/0/0/26 description Connected to cbr8 port te0/1/1 cdp service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 3.3.9.101 255.255.255.0 ipv6 address 2001#3#3#9##101/64 carrier-delay up 150 down 0 load-interval 30! 
IS-IS Configurationrouter isis ACCESS interface TenGigE0/0/0/26 passive address-family ipv4 unicast ! address-family ipv6 unicast ! !! CIN to DPIC L3VPNThe L3VPN configuration requires additional configuration for the RPHY VRF as well as BGP configuration to exchange VPNv4 prefixes between the RPD leaf node and the DPIC leaf node. In this use case an external route-reflector is used to exchange routes between all CIN routers.BGP Configurationrouter bgp 100 nsr bgp router-id 100.0.2.4 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast neighbor-group SvRR remote-as 100 update-source Loopback0 address-family vpnv4 unicast soft-reconfiguration inbound always ! address-family vpnv6 unicast soft-reconfiguration inbound always ! neighbor 100.0.2.202 use neighbor-group SvRR ! vrf rphy-vrf rd auto address-family ipv4 unicast label mode per-vrf redistribute connected redistribute static ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! !! VRF Configurationvrf rphy-vrf rd auto address-family ipv4 unicast label mode per-vrf redistribute connected redistribute static ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! !! AG4 Te0/0/0/26 to cBR8 DPIC Te1/1/2 primary active interfaceinterface TenGigE0/0/0/26 description .. Connected to cbr8 port te0/1/3 cdp service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing vrf rphy-vrf ipv4 address 5.5.9.101 255.255.255.0 ipv6 address 2001#5#5#9##101/64 ipv6 address 2001#6#6#9##101/64 load-interval 30! CIN to RPD ConfigurationThis section will highlight configurations used to support the RPD in both GRT and L3VPN configurations, including all relevant protocols.CIN to RPD Router Timing ConfigurationThe timing configuration is shared across both GRT and L3VPN configurations.Global timing configurationptp clock domain 60 profile g.8275.2 clock-type T-BC ! profile g82752_slave_v4 transport ipv4 port state slave-only sync frequency 16 clock operation one-step announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 ! profile g82752_slave_v6 transport ipv6 port state slave-only sync frequency 16 clock operation one-step announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 ! profile g82752_master_v4 transport ipv4 port state master-only sync frequency 16 clock operation one-step announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 ! profile g82752_master_v6 transport ipv6 port state master-only sync frequency 16 clock operation one-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 ! frequency priority 5 time-of-day priority 5 log servo events best-master-clock changes !! Core-facing (slave) timing configurationThis node has two core-facing interfaced to PA3 and PA4. On interface Te0/0/0/6 to PA3 we will use IPv6 and on Te0/0/0/7 to PA4 we will use IPv4. There is also an example of utilizing a VRF and sub-interface to force traffic into this physical interface. PTP packets must ingress the physical interface to be considered valid. There can be a situation with a dual-homed leaf where traffic to the Te0/0/0/6 interface comes in through the Te0/0/0/7 interface. 
One way to alleviate the issue is via proper IGP routing configuration; the second is to use a local VRF with no reachability except via a single physical interface.interface TenGigE0/0/0/6 cdp service-policy input core-ingress-classifier service-policy output core-egress-queuing service-policy output core-egress-exp-marking ipv4 address 10.23.253.1 255.255.255.254 ipv6 address 2405#10#23#253##1/127 load-interval 30!interface TenGigE0/0/0/6.1000 ptp profile g82752_master_v6 master ipv6 2001#999## priority 5 ! ! vrf PTP-test ipv4 address 10.1.1.1 255.255.255.254 ipv6 address 2001#999##1/64 load-interval 30 encapsulation dot1q 1000!interface TenGigE0/0/0/7 cdp ptp profile g82752_slave_v4 master ipv4 10.24.253.0 ! master ipv4 100.100.100.1 priority 10 ! ! service-policy input core-ingress-classifier service-policy output core-egress-queuing service-policy output core-egress-exp-marking ipv4 address 10.24.253.1 255.255.255.254 ipv6 address 2405#10#24#253##1/127 load-interval 30! GRT Specific ConfigurationDHCP Configurationdhcp ipv4 profile rpd-dhcpv4 relay helper-address vrf default 4.4.9.100 helper-address vrf default 10.0.2.3 ! inner-cos 5 outer-cos 5 interface TenGigE0/0/0/16 relay profile rpd-dhcpv4 interface TenGigE0/0/0/17 relay profile rpd-dhcpv4 Multicast Routing Configurationmulticast-routing address-family ipv4 interface Loopback0 enable ! interface TenGigE0/0/0/6 enable ! interface TenGigE0/0/0/7 enable ! mdt source Loopback0 rate-per-route interface all enable accounting per-prefix ! address-family ipv6 interface Loopback0 enable ! interface TenGigE0/0/0/6 enable ! interface TenGigE0/0/0/7 enable ! rate-per-route interface all enable accounting per-prefix ! PIM Configurationrouter pim address-family ipv4 interface Loopback0 enable ! interface TenGigE0/0/0/6 enable ! interface TenGigE0/0/0/7 enable ! IGMP/MLD and Snooping Configurationrouter mld interface BVI100 version 2 !!router igmp interface BVI100 version 3 !!mld snooping profile mld-snoop-1!igmp snooping profile igmp-snoop-1 DHCP Configurationdhcp ipv4 vrf rphy-vrf relay profile rpd-dhcpv4-vrf profile rpd-dhcpv4 relay helper-address vrf default 4.4.9.100 helper-address vrf default 10.0.2.3 ! profile rpd-dhcpv4-vrf relay helper-address vrf rphy-vrf 10.0.2.3 relay information option allow-untrusted ! inner-cos 5 outer-cos 5 interface BVI100 relay profile rpd-dhcpv4! Physical Interface ConfigurationIn this configuration we are using a BVI to aggregate RPDs into a single L3 IP subnet, so note the “l2transport” keyword which places the RPD port in L2 modeinterface TenGigE0/0/0/16 description .. to RPD1 load-interval 30 l2transport L2VPN Bridge Domain ConfigurationIn IOS-XR all L2 configuration is done under the L2VPN context.l2vpn bridge group rpd bridge-domain rpd-1 mld snooping profile mld-snoop-1 igmp snooping profile igmp-snoop-1 interface TenGigE0/0/0/16 ! interface TenGigE0/0/0/17 ! routed interface BVI100 ! ! IRB/BVI Logical Interface ConfigurationThe BVI acts as the gateway interface for all RPDs placed within the same bridge-domain with BVI100 assigned as its routed interface. The command “local-proxy-arp” requires all traffic to bridge through the L3 interface; otherwise, ARP traffic is broadcast between all connected ports.interface BVI100 description ... to downstream RPD service-policy input rpd-dpic-ingress-classifier ipv4 address 192.168.2.1 255.255.255.0 local-proxy-arp ipv6 nd suppress-ra ipv6 nd other-config-flag ipv6 nd managed-config-flag ipv6 address 2001#192#168#2##1/64 ipv6 enable! 
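To confirm the RPD ports have joined the bridge domain and the BVI gateway is operational, the bridge-domain state can be inspected. This is a minimal verification sketch; adjust the bridge group and bridge-domain names to match your configuration, and note that command output varies by release.
show l2vpn bridge-domain group rpd brief
show l2vpn bridge-domain bd-name rpd-1 detail
show arp BVI100
The detail output should list each RPD-facing interface and the routed BVI100 interface as up, and ARP entries for the RPDs should appear against BVI100 once they begin provisioning.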
IS-IS Routing Configuration for RPD InterfaceCommunication between the DPIC interface and RPD is realized by advertising the BVI interface throughout the IS-IS domain. We utilize the “passive” option to advertise the interface without it participating in IS-IS itself.router isis ACCESS interface BVI100 passive address-family ipv4 unicast ! address-family ipv6 unicast L3VPN ConfigurationThe L3VPN configuration is similar to the GRT configuration with the following major differences# Multicast traffic is carried using Label Switched Multicast with mLDP and using the NG-MVPN BGP control-plane. This is known as multicast profile 14. Like the DPIC configuration, the RPD interface is advertised using MP-BGP using the VPNv4/VPNv6 address families. Changes are needed to enable multicast, IGMP/MLD, and DHCP for VRF awareness. In the following configuration, the RPD is connected to the router via a point-to-point L3 interface; no bridge-domain is utilized. There is no restriction on using L3VPN with BVI interfaces.VRF Configuration vrf rphy-vrf rd auto address-family ipv4 unicast label mode per-ce redistribute connected redistribute static ! address-family ipv6 unicast label mode per-ce redistribute connected redistribute static ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! ! DHCP Configurationdhcp ipv4 vrf rphy-vrf relay profile rpd-dhcpv4-vrf profile rpd-dhcpv4-vrf relay helper-address vrf rphy-vrf 10.0.2.3 relay information option allow-untrusted ! inner-cos 5 outer-cos 5 interface TenGigE0/0/0/15 relay profile rpd-dhcpv4-vrf! MPLS mLDP ConfigurationLDP is configured with mLDP-only on the two core interfaces to PA3/PA4.mpls ldp capabilities sac mldp-only mldp address-family ipv4 make-before-break delay 30 forwarding recursive recursive-fec ! ! router-id 100.0.2.53 session protection address-family ipv4 ! interface TenGigE0/0/0/6 ! interface TenGigE0/0/0/7 !! BGP ConfigurationBGP is configured for two purposes# to distribute VPNv4/VPNv6 routes across the network for RPD to DPIC connectivity, and to exchange multicast routes between source and receiver nodes to build dynamic mLDP P2MP trees. Again, a centralized route reflector is used between all CIN elements.router bgp 100 nsr bgp router-id 100.0.2.53 bgp graceful-restart ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! neighbor-group SvRR remote-as 100 update-source Loopback0 address-family vpnv4 unicast soft-reconfiguration inbound always ! address-family vpnv6 unicast soft-reconfiguration inbound always ! address-family ipv4 mvpn soft-reconfiguration inbound always ! address-family ipv6 mvpn soft-reconfiguration inbound always ! ! neighbor 100.0.2.202 use neighbor-group SvRR ! vrf rphy-vrf rd auto address-family ipv4 unicast label mode per-ce redistribute connected redistribute static ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! Multicast Configurationmulticast-routing address-family ipv4 interface Loopback0 enable ! ! ! vrf rphy-vrf address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix bgp auto-discovery mldp ! mdt partitioned mldp ipv4 p2mp ! ! PIM Configurationrouter pim vrf rphy-vrf address-family ipv4 rpf topology route-policy mldp-partitioned-p2mp mdt c-multicast-routing bgp ! ! MLD and IGMP Configurationrouter mld vrf rphy-vrf interface TenGigE0/0/0/15 version 2 ! !!router igmp vrf rphy-vrf interface BVI101 version 3 ! interface TenGigE0/0/0/15 ! !! 
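Before configuring the RPD-facing interface it can be helpful to verify that the L3VPN and multicast control planes are up. The following is a hedged verification sketch using common IOS-XR show commands; the VRF name matches the example configuration and the exact output format differs between releases.
show bgp vpnv4 unicast summary
show bgp vrf rphy-vrf ipv4 unicast
show mpls mldp database
show mrib vrf rphy-vrf ipv4 route
The BGP commands confirm the session to the route reflector and the presence of DPIC and RPD prefixes in the VRF, while the mLDP and MRIB commands confirm that the partitioned MDT and any dynamic P2MP trees have been built.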
Router to RPD Physical Interface Configurationinterface TenGigE0/0/0/15 description .. to RPD0 ptp profile g82752_master_v4 ! service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing vrf rphy-vrf ipv4 address 192.168.3.1 255.255.255.0 load-interval 30! ", "url": "/blogs/cin-design-guide", "author": "Phil Bedard", "tags": "iosxr, Design, Cable, CIN" } , "blogs-routinator-hosted-on-xr": { "title": "Hosting the Routinator 3000 RPKI validator and RTR server on IOS-XR", "content": " On This Page Blog Summary Why host an RPKI validator and cache on a router? Environment TL;DR RPKI Overview Quick RPKI Terminology What security problems does RPKI solve? How does RPKI solve route origin hijacking? How is RPKI data fetched and validated to create a set of “valid” prefixes? How do I get a set of valid prefixes from a ROA validated cache to my router? How do I get my router to do something with the data? IETF RFCs related to RPKI Routinator 3000 Overview Future enhancements to RPKI ASPA RTRv2 Additional information IOS-XR Third Party Applications Overview of TPA Networking Third party app forwarding overview Forwarding via XR control-plane (RP) Third party app binding (listening) address Protecting third party applications Diagram of networking in a running Docker container IOS-XR Third Party Application documentation Running Routinator in a Docker container in IOS-XR Overview Communication Flows Container overview Prerequisites Building the Docker container Build Steps Dockerfile build script entrypoint.sh script Copying the built docker container to an IOS-XR VM or device Loading the docker image into the local registry Running the routinator-xr container Docker Options Environment variables Verifying running state of the container Router configuration Hosting router RPKI RTR configuration Remote router RPKI RTR configuration Verify successful connectivity Appendix RTR over SSH RTR over SSH Docker container Environment variables How RTR over SSH proxy works RTR over SSH Router Configuration Additional RPKI open source and information NLNet Labs Krill project Cloudflare RPKI toolkit RIPE RPKI Validator v3 Blog SummaryThis blog assumes the reader is already familiar with the basic concepts of BGP, RPKI, and IOS-XR. Some previous knowledge of Docker is recommended but not 100% necessary. We will briefly cover the origins of RPKI, its components, and the basic operation of how a ROA in a regional registry ends up being a part of the BGP route processing criteria on a router. We will utilize Routinator 3000 from NLNet Labs as our RPKI validator and cache, but the concepts and solution would be similar for other options as well. A list of additional RPKI validators/caches and RPKI-RTR servers is located in the appendix.Why host an RPKI validator and cache on a router?Many organizations distribute BGP routing information using distributed route reflectors. Implementing the RPKI Relying Party function on a router follows the same deployment model, with the route reflectors also feeding edge routers with ROA information. Additionally, one does not have to maintain external server resources and network connectivity to perform what is a relatively simple network function. The number of Relying Party routers required depends on the size of the network, but an organization should have at least two in the network.EnvironmentTesting was done using a Cisco NCS 55A1-24H router using IOS-XR 6.6.3. 
Most of the information is relevant to IOS-XR 6.3.1 or newer.TL;DRSkip ahead to the section on running the container here or see https#//github.com/philbedard/routinator-xrRPKI OverviewQuick RPKI Terminology RIR# Regional Internet Registry RPKI# Resource Public Key Infrastructure ROA# Route Origin Authorization RTR# RPKI to Router Protocol RRDP# RPKI Repository Delta Protocol Relying Party# Anyone who wants to utilize the RIR ROA data for authorization (You!) RPKI Validator# Software for validating ROAs from the RIRs to build a list of prefixes with valid ROAs TAL# Trust Anchor Locator VRP# Validated ROA Payload Validated cache# List of validated prefixes, their max prefix length, and their origin ASNWhat security problems does RPKI solve?First and foremost I’ll touch upon what problems RPKI solves and what problems it doesn’t solve. The initial problem RPKI was created to solve is BGP route hijacking from rogue ASNs. Without RPKI, any ASN connected to the global Internet routing table can originate a prefix and potentially funnel destination traffic to themselves for malicious reasons. There are also ways to create man in the middle attacks by continuing to pass traffic to the original destination while extracting information as it passes through the rogue ASN. Longer prefix hijacking is also possible, since the Internet and routers today route packets based on a longest prefix match criteria. The RPKI infrastructure today can protect against these type of origin hijacks, but does not solve the issue of ASN hijacking or attest to the validity of the end to end chain of BGP advertisements (path validation).How does RPKI solve route origin hijacking?In order to verify a BGP prefix advertisement really originated from the organization (ASN) who owns the prefix, we must have a traceable way to attest they are the correct origin ASN for the prefix. This is done using X.509 PKI cryptographic certificates with records for organizations, ASNs and IPv4/IPv6 prefixes. The record responsible for certifying prefix validity for an ASN is the Route Origin Authorization record, or ROA. The ROA defined in RFC6482 uses a standard template simply containing the following elements# Authorized origin ASN, IPv4 or IPv6 prefixes, and the maximum length of the prefixes in the ROA. Each ROA is signed with the private key of an organization/ASN for validation by a Relying Party.Quoting RFC6483, A resource certificate describes an action by an issuer that binds a list of IP address blocks and Autonomous System (AS) numbers to the subject of a certificate, identified by the unique association of the subject’s private key with the public key contained in the resource certificate.How is RPKI data fetched and validated to create a set of “valid” prefixes?The steps how ASN and ROA records are used to validate a specific prefix is outlined in RFC6488, section 3. The consumer of the RPKI data, the relying party, must go through these steps in order to validate the signed objects and generate a list of prefixes with a valid cryptographic chain. The validation software will first use a seeded Trust Anchor Location or TAL for each RIR to begin downloading the chain of certificates used to eventually create prefix data for the router to use. RPKI data is stored in RIR, LIR, and delegated repositories so all of those must be traversed to gather data and validate the objects against the signers public key. RPKI data is downloaded using either rsync (legacy, may be deprecated) or RRDP. 
RRDP offers a more efficient way to download the data and runs over HTTPS.https#//datatracker.ietf.org/doc/draft-ietf-sidrops-rp covers Requirements for RPKI Relying Parties with more detail on these procedures.Once the validator has validated the signed RPKI objects, it will compile a list of {prefix,maxLength,ASN} entries. Each entry is known as a Validated ROA Payload (VRP). The combined list of VRPs forms the validated cache of prefixes.The validator refreshes its data periodically based on a default or user defined interval.How do I get a set of valid prefixes from a ROA validated cache to my router?RTR, the RPKI to Router protocol, is defined in RFC6810. RTR uses a pull mechanism to download the validated cache data (set of VRPs) from the validator. The cache can signal the router it has new updates, forcing the router to download the new updates, or the router can periodically fetch the entries based on a timer. In IOS-XR the default cache refresh timer is 600 seconds. A serial number is used to keep track of incremental changes in cache data.How do I get my router to do something with the data?References on configuring IOS-XR based devices to use RPKI data can be found at https#//xrdocs.io/design/blogs/latest-peering-fabric-hld. We will briefly cover the IOS-XR configuration to connect to the validator cache in the following sections, but not the policy and logic to process routes based on their ROA validation state.IETF RFCs related to RPKIThis is a non-exhaustive list. Please see the SIDR and SIDROPS IETF working groups for a complete list of completed RFCs and in-progress drafts.https#//tools.ietf.org/html/rfc6480 - An Infrastructure to Support Secure Internet Routinghttps#//tools.ietf.org/html/rfc6482 - A Profile for Route Origin Authorizations (ROAs)https#//tools.ietf.org/html/rfc6483 - Validation of Route Origin using PKI and ROAshttps#//tools.ietf.org/html/rfc6810 - The RPKI to Router Protocol (RTR)Routinator 3000 OverviewRoutinator 3000 is an RPKI validator and cache combined into a single application. Routinator is developed by NLnet Labs using the Rust programming language. Routinator utilizes the RTR protocol to provide valid prefix data to the downstream routers for announcement validation. The Routinator 3000 project can be found at https#//github.com/NLnetLabs/routinator. The complete Routinator 3000 documentation can be found at# https#//rpki.readthedocs.io/en/latest/index.htmlFuture enhancements to RPKIASPAASPA or Autonomous System Provider Authorization extends the capabilities of RPKI to attest to ASN pairs in a BGP prefix AS path. A new ASPA object will be created with these ASN pairs to validate the AS path for a prefix advertisement. ASPA will add the ability to detect rogue ASNs within the path and not only as origins, giving a means to perform path validation. The latest version of the ASPA draft is located at# https#//tools.ietf.org/html/draft-azimov-sidrops-aspa-profileRTRv2RTRv2 is being defined in the SIDROPS working group via the https#//datatracker.ietf.org/doc/draft-ymbk-sidrops-8210bis/ draft. 
RTRv2 adds additional capabilities to the existing RTR protocol along with support for ASPA records.Additional information Great in-depth information on RPKI# https#//rpki.readthedocs.io/en/latest/index.html ARIN landing page for RPKI# https#//www.arin.net/resources/manage/rpki/ RIPE landing page for RPKI# https#//www.ripe.net/manage-ips-and-asns/resource-management/certificationIOS-XR Third Party ApplicationsA third party application, or TPA, gives one the ability to run applications not supplied by Cisco on hardware devices and virtual routers like the IOS-XRv9000. Because IOS-XR 64-bit is based on Linux, it supports a variety of methods for hosting applications on the device. Native Linux applications, Vagrant virtualized environments, and LXC/Docker containers are supported. In this blog we will focus on Docker containers, since Docker is the de facto way of building, distributing, and running Linux applications without having to worry about the base OS and other external dependencies.Overview of TPA NetworkingIn order to isolate different components between the base Linux OS and IOS-XR control-plane, Linux namespaces and specific routing tables are used. By default, a Docker container has no access to the XR global namespace. In addition to the global namespace, named “global-vrf”, each VRF instance created in XR creates a new Linux namespace. This isolation can be used creatively to limit communication between IOS-XR and third party applications, or between third party applications themselves.Third party app forwarding overviewThe XR Linux environment uses specific virtual interfaces to forward packets from a container to the local XR control-plane or to send packets outside the router using the XR FIB. In the default configuration the fwdintf interface is used to route packets out the appropriate external or management interface using the XR FIB. The fwd_ew interface is used for East-West communication between third party applications and the XR control plane.The default configuration uses the Loopback1 interface for East-West communication, but an alternative interface can be configured using the tpa east-west [interface] configuration command.The default configuration uses the Loopback0 interface for the source address of packets sent from the Docker container either to the XR control plane via the fwd_ew interface or externally using the fwdintf interface. The source address can be set to any available interface on the router via the tpa address-family ipv4|ipv6 update-source [interface] configuration command. As an example, if the “Internet” is reachable via the management interface one would set the source address to the management interface.The following shows what a default routing table looks like with no additional TPA configuration. The Loopback0 interface is 192.168.11.1/32 and the Loopback1 interface is 192.168.99.1.[Compass-PEF-PE1#~]$ ip route showdefault dev fwdintf scope link src 192.168.11.1172.20.33.0/24 dev Mg0_RP0_CPU0_0 scope link src 172.20.33.36192.168.99.1 dev fwd_ew scope link src 192.168.11.1In our example we will configure a Loopback0 and Loopback1 interface and communicate with the third party Docker container via those interfaces. External hosts will utilize the Loopback0 interface.Forwarding via XR control-plane (RP)If you have a default 0.0.0.0/0 route in the XR control plane to the management interface, you will need to change the TPA configuration to route all traffic using the fwd_ew interface instead of the fwdintf interface.
This is done using the “tpa address-family ipv4 default-route mgmt” command. You may also want to then change the source address to the mgmt interface instead of the default Loopback0 interface. This blog will not cover all network scenarios so please consult the documentation.Third party app binding (listening) addressIn the default configuration the third party application listens on all external interfaces, including Loopback0. Loopback0 is represented in the Linux environment as the “lo” interface shown below. The application itself can bind to a specific address and IOS-XR will adhere to the configuration if the address exists within the namespace.[Compass-PEF-PE1#~]$ ip addr show lo1# lo# <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default link/loopback 00#00#00#00#00#00 brd 00#00#00#00#00#00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet 192.168.11.1/32 scope global lo#0 valid_lft forever preferred_lft forever inet6 2001#192#168#11##1/128 scope global valid_lft forever preferred_lft forever inet6 ##1/128 scope host valid_lft forever preferred_lft foreverOne can restrict traffic to the third party app to specific external interfaces using the following configuration#tpa vrf default address-family ipv4 protection allow protocol tcp local-port 3323 interface Loopback0 ! ! !!Protecting third party applicationsIOS-XR’s TPA configuration allows one to filter incoming packets based on a number of criteria. Since the RTR protocol is only used within the network, it’s recommended to explicitly define which hosts are allowed to access the third party application, as shown below. One could also create a standard infrastructure ACL to apply to external interfaces as an alternative.tpa vrf default address-family ipv4 protection allow protocol tcp local-port 3323 remote-address 1.1.1.1/32 allow protocol tcp local-port 3323 remote-address 1.1.1.2/32 ! ! !!Diagram of networking in a running Docker containerIOS-XR Third Party Application documentation http#//xrdocs.io has an entire section dedicated to app hosting on IOS-XR. You will find in-depth information and real-world examples using native Linux applications, LXC/Docker containers, and Vagrant boxes to host applications. Official app hosting configuration guide# https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/app-hosting/b-application-hosting-configuration-guide-ncs5500/b-application-hosting-configuration-guide-ncs5500_chapter_010.htmlRunning Routinator in a Docker container in IOS-XROverviewIn the rest of the blog we will work on building a Routinator 3000 Docker container to run on IOS-XR. Routinator will run as both an RPKI validator/cache and an RTR server.Communication FlowsRoutinator uses rsync and/or RRDP to download RPKI data from the various registries hosting the data, thus the router or IOS-XR VM needs access to the global Internet. The default port for rsync is 873/tcp. RRDP uses port 443 as it runs over HTTPS.Routinator has a web API running over HTTP you can use to retrieve the VRPs in a variety of output formats. For example, if the server IP is 172.20.1.1 with the HTTP server running on port 9556, the URL http#//172.20.1.1#9556/csv will return a full list of VRPs in CSV format. You can also check a single prefix for validity by using the following URL# http#//172.20.1.1#9556/validity?asn=209&prefix=100.0.0.0/24.
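From any host that can reach the Routinator HTTP port, the same endpoints can be queried with curl; the IP, port, ASN, and prefix below simply reuse the example values above.
curl http://172.20.1.1:9556/csv
curl "http://172.20.1.1:9556/validity?asn=209&prefix=100.0.0.0/24"
The csv endpoint returns one VRP per line, while the validity endpoint returns a JSON verdict for the queried announcement.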
See the Routinator documentation for more details.RTR in our example uses port 3323/tcp but could utilize any TCP port not already in use on the router.There is also an option in IOS-XR to use RTR over SSH. Routinator does not natively support RTR over SSH, so a proxy is used to bridge traffic between SSH and Routinator. Please see the “RTR over SSH” section here for more details.Container overviewThe Routinator 3000 for XR container is built using an Ubuntu 18.04 base. The resulting image is about 82MB in size. The standard public Routinator docker image uses an Alpine Linux base, which could also have been used here; Ubuntu was chosen more for familiarity than for any technical reason, and all of the Linux packages used are available for Alpine. The Github repository will contain Dockerfiles for both an Alpine Linux and Ubuntu based Docker image. This Docker image with minor changes could run on any Linux system with Docker installed.The Routinator program runs under the user “routinator” once started, unless re-defined within the Dockerfile during build.PrerequisitesIf you wish to build the Docker container yourself you will need a Linux host (Ubuntu, Centos, etc) with Docker installed. Instructions for installing Docker on different Linux distributions can be found at https#//docs.docker.com/get-docker.One option for running the Docker image is to copy it to/from the router using the “docker save” and “docker load” commands, requiring the user to scp the file to the router or XR VM. If using Linux (including downloading from a host directly from the router itself) or MacOS the standard scp command will work. Windows will require an scp tool like WinSCP.Running the Docker container from the router using a public or private Docker registry is also 100% supported as long as the router has IP access to the registry. The up-to-date built image is located on Docker Hub as philxor/routinator-xr. Information on creating a private registry to load containers from is covered in Akshat Sharma’s excellent XR third party application blog here# https#//xrdocs.io/application-hosting/tutorials/2017-02-26-running-docker-containers-on-ios-xr-6-1-2/#public-dockerhub-registryBuilding the Docker containerIf you wish to skip ahead and download the prebuilt Docker image you can access the routinator-xr.tar.gz image at this URL and skip ahead to the section on loading the Docker image. https#//github.com/philbedard/routinator-xr.tar.gz You can also load the image off the public Docker hub by issuing docker pull philxor/routinator-xrBuild StepsBuilding the routinator image happens in two stages, defined in a single Dockerfile. First we build the routinator application, which requires a much larger Rust compilation environment. The next stage of the Docker build creates the image the router will run. We will use Docker’s ability to copy files from one container to another during the build stage to copy the compiled Routinator binary to the new IOS-XR container image.We will now go over the different components in the Dockerfile used to build the image which will ultimately validate and serve RPKI prefix data to the local XR instance and the rest of the network.Dockerfile build scriptThe following is the annotated build file.
Also available at https#//github.com/philbedard/routinator-xr# Build routinator binary for Linux glibcFROM ubuntu#18.04 as build# Proxy environment variables if needed for cargo and git ENV http_proxy=http#//myproxy.com#80ENV https_proxy=http#//myproxy.com#80# Add TiniENV TINI_VERSION v0.15.0ADD https#//github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tiniRUN apt-get update && apt-get upgrade -y && apt-get install -y \\ git \\ cargo \\ libssl-dev \\ && rm -rf /var/lib/apt/lists/*WORKDIR /tmp/routinatorRUN git clone --depth 1 https#//github.com/NLnetLabs/routinator .RUN cargo build \\ --release \\ --locked# Create actual routinator container with runtime argumentsFROM ubuntu#18.04MAINTAINER bedard.phil@gmail.com# Copy routinator binary from build image COPY --from=build /tmp/routinator/target/release/routinator /usr/local/bin# Install Tini to capture ^C if running in foregroundCOPY --from=build /tini /sbin/tiniRUN chmod +x /sbin/tiniARG RUN_USER=routinatorARG RUN_USER_UID=1012ARG RUN_USER_GID=1012RUN apt-get update && apt-get install -y \\ rsync \\ iproute2 \\ iputils-ping \\ sudo \\ && rm -rf /var/lib/apt/lists/*RUN useradd -u $RUN_USER_GID -U $RUN_USER && \\ mkdir -p /home/${RUN_USER}/.rpki-cache/repository /home/${RUN_USER}/.rpki-cache/tals/arin && \\ chown -R ${RUN_USER_UID}#${RUN_USER_GID} /usr/local/bin/routinator /home/${RUN_USER}/.rpki-cache# Copy TAL files from source to user directory# Requires acceptance of ARIN TAL at https#//www.arin.net/resources/rpki/tal.htmlCOPY --from=build /tmp/routinator/tals/*.tal /home/${RUN_USER}/.rpki-cache/tals/# The ARIN TAL is distributed with the image but you will not use it by# default. If you have accepted the ARIN TAL usage at https#//www.arin.net/resources/rpki/tal.html# you can comment out the following line so all TALs are in the TAL directory at build time# Otherwise you must interactively accept the ARIN RPA when using the init command RUN mv /home/${RUN_USER}/.rpki-cache/tals/arin.tal /home/${RUN_USER}/.rpki-cache/tals/arin/# Copy entrypoint.sh to root of image for execuationCOPY entrypoint.sh /entrypoint.shRUN chmod +x /entrypoint.sh# Default ports to expose. Not 100% necessary in IOS-XR since it will globally listen to ports within the container.EXPOSE 3323/tcpEXPOSE 9556/tcpENTRYPOINT [~/sbin/tini~, ~--~, ~/entrypoint.sh~]entrypoint.sh scriptThis Routinator Docker image uses an entrypoint script to setup the environment based on options passed via the “docker run” command and then execute routinator. Here is the entire entrypoint.sh script, annotated with comments for each section. The entrypoint.sh script in this case is included as an external file on the host but could be incorporated into the Docker “Dockerfile” build script.The entrypoint script creates a a directory in the XR host /app_host/ directory. Due to the disk space required for the RPKI data we must store the data outside the internal Docker container filesystem. It’s also a best practice for data persistance, so the data does not need to be re-downloaded when the Docker container is removed. 
Any directory can be used as long as it's mounted to /data within the container.#!/bin/bashif [[ $1 == ~init~ ]];then echo ~Creating RPKI data directory at /misc/app_host/rpki~ mkdir -p /data/rpki/tals mkdir -p /data/rpki/repository chown -R routinator#routinator /data/rpki/ echo ~Copying TAL data from container to host directory~ sudo -u routinator cp /home/routinator/.rpki-cache/tals/* /data/rpki/tals 2>&1 echo ~Please read the ARIN RPA at https#//www.arin.net/resources/manage/rpki/rpa.pdf~ read -p ~If you agree with the ARIN RPA type 'yes', any other input will mean non-agreement and the ARIN TAL will NOT be installed# ~ ARIN_RPA if [ ~$ARIN_RPA~ = ~yes~ ]; then echo ~User agreed to ARIN TAL, copying ARIN TAL file~ sudo -u routinator cp /home/routinator/.rpki-cache/tals/arin/* /data/rpki/tals else echo ~User declined ARIN TAL, will not be installed in TALS directory. Rerun init to copy, or copy manually after agreeing~ fielif [ ! ~$#~ -eq 0 ];then echo ~Starting command $@~ exec ~$@~else if [[ ! -d ~/data/rpki~ ]]; then echo ~Please run container with 'init' to create directories and copy TAL files~ exit 1 fi VRF=~${VRF#-global-vrf}~ echo ~Using $VRF as namespace, override default of global-vrf with -e VRF=vrf if using a different VRF for TPA~ RTR_PORT=~${RTR_PORT#-3323}~ echo ~Using $RTR_PORT as RTR server port, override default of 3323 with -e RTR_PORT=port~ HTTP_PORT=~${HTTP_PORT#-9556}~ echo ~Using $HTTP_PORT as Routinator HTTP server port, override default of 9556 with -e HTTP_PORT=port~ if [[ -v RSYNC_PROXY ]]; then echo ~Using $RSYNC_PROXY as rsync proxy~ else echo ~No rsync proxy set, set using -e RSYNC_PROXY=proxy (not URI) in docker run if required~ fi if [[ -v RRDP_PROXY ]]; then echo ~Using $RRDP_PROXY as rrdp proxy~ RRDP_ARG=~--rrdp-proxy=${RRDP_PROXY}~ else echo ~No RRDP proxy set, set using -e RRDP_PROXY=proxy (URI form) in docker run if required~ fi NS1=~${NS1#-208.67.222.222}~ echo ~Using ~$NS1~ as primary DNS server, override with -e NS1=nameserver to override default of 208.67.222.222~ echo ~nameserver ~$NS1~~ > /etc/resolv.conf NS2=~${NS2#-208.67.220.220}~ echo ~Using ~$NS2~ as secondary DNS server, override with -e NS2=nameserver to override default of 208.67.220.220~ echo ~nameserver ~$NS2~~ >> /etc/resolv.conf echo ~Starting Routinator~ ip netns exec ${VRF} sudo -E -u routinator routinator \\ --base-dir /data/rpki/ \\ --verbose \\ $RRDP_ARG \\ server --rtr 0.0.0.0#$RTR_PORT --http 0.0.0.0#$HTTP_PORTfiCopying the built docker container to an IOS-XR VM or deviceAs noted, we will be using the docker save/load commands as opposed to a registry. On the Linux host where the docker image was built, execute the following command to save the docker image to a .tar file and gzip the file to minimize space#myhost$ docker save --output routinator-xr.tar routinator-xr && gzip routinator-xr.tar In this case I will be copying from the host to the router by utilizing scp on the router itself.RP/0/RP0/CPU0#Compass-PEF-PE1# bash [Compass-PEF-PE1#~]$ scp cisco@myhost#/home/cisco/routinator-xr.tar.gz . [Compass-PEF-PE1#~]$ gzip -d routinator-xr.tar.gzLoading the docker image into the local registryThe Docker image is now ready to load. The default user in XR is root so there is no need to sudo to load the docker image.
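As a small aside not covered in the original workflow, recent Docker releases can load a compressed archive directly, so the gzip -d step can be skipped if preferred, for example docker load --input routinator-xr.tar.gz.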
Once loaded we can issue the “docker images” command to view the local images.[Compass-PEF-PE1#~]$ docker load --input routinator-xr.tar[Compass-PEF-PE1#~]$ docker imagesREPOSITORY TAG IMAGE ID CREATED SIZEroutinator-xr latest 65e4574eb6cb 20 hours ago 79.51 MB Running the routinator-xr containerThe routinator Docker container is started from the IOS-XR bash shell, entered using the “bash” command or by utilizing ssh to the shell if it’s enabled. The docker image used can either be the image transferred to the IOS-XR host or it can run from the public docker registry via the philxor/routinator-xr tag.The container must initially be run with the “init” command to initialize the data directories and accept the ARIN RPA. The following will run a single-use container that will set up the directories and present the user with an option to accept the ARIN RPA. The container must be run with the -it switch for interactive input.docker run --name routinator \\ -it \\ --rm \\ -v /misc/app_host#/data \\ routinator-xr initThe following docker run command shows starting the container with both mandatory and optional parameters after the first “init” run.docker run --name routinator \\ --restart always \\ --detach \\ --cap-add SYS_ADMIN \\ -v /misc/app_host#/data \\ -v /var/run/netns/global-vrf#/var/run/netns/global-vrf \\ -e VRF=global-vrf \\ -e RSYNC_PROXY=proxy.esl.cisco.com#80 \\ -e RRDP_PROXY=http#//proxy.esl.cisco.com#80 \\ -e NS1=171.70.168.183 \\ -e NS2=171.70.168.184 \\ -e RTR_PORT=3323 \\ -e HTTP_PORT=9556 \\ philxor/routinator-xrDocker Options Options Purpose --name Sets the name of the running container, default is generated by Docker --restart always Will automatically restart a container on exit or reload of the host --detach Run container in the background, running in the foreground is useful for debugging --cap-add Must be set to SYS_ADMIN to allow access to the network namespace -v This option mounts host volumes within the container. The namespace utilized by the container must be mounted and the RPKI host data directory must be mounted to /data Environment variablesEnvironment variables specified with -e are how docker passes arguments to containers. Environment Variable Default Definition VRF global-vrf Sets the IOS-XR namespace routinator runs in. The default is the global-vrf namespace. See the network section for more information RSYNC_PROXY none If your environment requires a proxy to reach RSYNC destinations, use this variable. The rsync proxy is not prefixed by http/https RRDP_PROXY none RRDP uses HTTPS, so if you require a HTTP/HTTPS proxy use this variable NS1 208.67.222.222 Primary nameserver NS2 208.67.220.220 Secondary nameserver RTR_PORT 3323 RTR server port HTTP_PORT 9556 Routinator HTTP API port Verifying running state of the containerOnce you have started the container you can issue a “docker ps” from the bash prompt and should see something like this#[Compass-PEF-PE1#~]$ docker psCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMESbea1760f73b6 routinator-xr ~/sbin/tini -- /entry~ 6 minutes ago Up 6 minutes routinatorYou can check to make sure the third party application (Docker container in our case) is listening on the correct ports by using the “netstat -lt” or “ss -lt” commands.[Compass-PEF-PE1#~]$ netstat -ltnActive Internet connections (only servers)Proto Recv-Q Send-Q Local Address Foreign Address Statetcp 0 0 0.0.0.0#9556 0.0.0.0#* LISTENtcp 0 0 0.0.0.0#3323 0.0.0.0#* LISTENRouter configurationThe following diagram shows our basic testing configuration.
The router hosting routinator uses it as a RTR server as well as other routers in the network. Ideally the RTR server would be deployed similarly to a route reflector in the network.Hosting router RPKI RTR configurationThe only special configuration required when using the hosted routinator as a destination on its hosting router is to specify a source interface to use. In our case this interface will be Loopback1, the default east-west forwarding interface. The ability to specify a source interface for RPKI RTR sessions was added in IOS-XR 6.6.25. 192.168.11.1 is the Lo0 IP address.router bgp 100 rpki server 192.168.11.1 bind-source interface Loopback1 transport tcp port 3323 !!Remote router RPKI RTR configurationNo special configuration is necessary, the configuration will look like any other RTR server.router bgp 100 rpki server 192.168.11.1 transport tcp port 3323 !!Verify successful connectivityUse the show rpki server <server> command to view details about connectivity to the RPKI serverRPKI Cache-Server 192.168.11.1 Transport# TCP port 3323 Bind source# Loopback1 Connect state# ESTAB Conn attempts# 1 Total byte RX# 3372924 Total byte TX# 1112RPKI-RTR protocol information Serial number# 8 Cache nonce# 0xC1ED Protocol state# DATA_END Refresh time# 600 seconds Response time# 30 seconds Purge time# 60 seconds Protocol exchange ROAs announced# 126492 IPv4 21740 IPv6 ROAs withdrawn# 5450 IPv4 1139 IPv6 Error Reports # 0 sent 0 rcvdAppendixRTR over SSHThe data transmitted between the validator cache and router in a default configuration is public data. In the case where a provider is using their own delegated RPKI or SLURM to supply local prefixes outside the global RPKI system, it’s up the provider’s discretion. Since the cache is internal to the network and the rest of the network control-plane is not encrypted, running RTR over SSH is not entirely necessary but the following can be used to accomplish it.On IOS-XR the RTR client can run over cleartext TCP or use SSH. Routinator does not natively support SSH connections, so an intermediate proxy must be used to interconnect the SSH connection to routinator’s RTR port.The below OpenSSH server configuration can be done in the native IOS-XR shell, but in the spirit of using Docker containers for specific functions, we can use another Docker container running on an external host or the router to proxy SSH to routinator.RTR over SSH Docker containerThe following Docker Dockerfile will build a simple rtr over ssh proxy container. The container is meant to run on a host or under IOS-XR itself. The up to date Dockerfile can be found at https#//github.com/philbedard/docker-rtr-ssh-proxy# This builds a simple SSH over RTR proxy Docker containerFROM ubuntu#18.04MAINTAINER bedard.phil@gmail.com#Static values if building container once, can be overridden by environment variablesARG SSHD_PORT=2222ARG SERVER_IP=127.0.0.1ARG SERVER_PORT=3323#Default user is rpki, can be changed to any userARG RUN_USER=rpkiARG RUN_USER_UID=1013ARG RUN_USER_GID=1013### !!! Change this !!! ###ARG RUN_USER_PW=# If building the container once with defined default values and using namespaces uncomment this line,# otherwise you can define a namespace as docker run environment variable#ARG NAMESPACE=global-vrfRUN apt-get update && apt-get install -y \\ iproute2 \\ netcat \\ openssh-server && \\ apt-get -y purge python3 && \\ rm -rf /var/lib/apt/lists/*# SSH login fix. 
Otherwise user is kicked off after loginRUN sed 's@session\\s*required\\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshdRUN echo ~Port $SSHD_PORT~ > /etc/ssh/sshd_config && \\ echo ~PasswordAuthentication yes~ >> /etc/ssh/sshd_config && \\ echo ~UsePAM yes~ >> /etc/ssh/sshd_config && \\ echo ~Ciphers +3des-cbc~ >> /etc/ssh/sshd_config && \\ echo ~KexAlgorithms +diffie-hellman-group1-sha1~ >> /etc/ssh/sshd_config && \\# echo ~Subsystem rpki-rtr /bin/nc $SERVER_IP $SERVER_PORT~ >> /etc/ssh/sshd_config && \\ mkdir -p /var/run/sshdRUN useradd -u $RUN_USER_GID -U $RUN_USER && \\ echo ~$RUN_USER#$RUN_USER_PW~ | chpasswdENV SSHD_PORT=${SSHD_PORT}ENV SERVER_IP=${SERVER_IP}ENV SERVER_PORT=${SERVER_PORT}EXPOSE $SSHD_PORT# Uncomment this line to use a default namespace without -e docker run option#ENV NAMESPACE=${NAMESPACE}SHELL [~/bin/bash~, ~-c~]# If a namespace is set run the command under the namespace otherwise run without namespaceCMD if [[ -v NAMESPACE ]] ; \\ then ip netns exec $NAMESPACE /usr/sbin/sshd -D -p $SSHD_PORT -o ~Subsystem rpki-rtr /bin/nc $SERVER_IP $SERVER_PORT~ -f /etc/ssh/sshd_config ; \\ else /usr/sbin/sshd -D -p $SSHD_PORT -o ~Subsystem rpki-rtr /bin/nc $SERVER_IP $SERVER_PORT~ -f /etc/ssh/sshd_config ; \\ fiDue to the fact the user password is defined in cleartext the container must be built by the user, it will not be supplied as a pre-built Docker image.You can run the RTR over SSH Docker container in IOS-XR with the following docker run command#docker run --name rtr-proxy \\ --cap-add SYS_ADMIN \\ -v /var/run/netns/global-vrf#/var/run/netns/global-vrf \\ -e NAMESPACE=global-vrf \\ -e SSHD_PORT=2222 \\ -e SERVER_IP=172.20.33.36 \\ -e SERVER_PORT=3323 \\ rtr-ssh-proxy Environment variablesEnvironment variables specified with -e are how docker passes arguments to containers. Environment Variable Default Definition NAMESPACE none Sets the namespace routinator runs in. Default is no specific namespace SSHD_PORT 2222 This is the SSH server port the router will connect to SERVER_IP 127.0.0.1 RTR server to proxy connections to, does not have to be local to the Docker host SERVER_PORT 3323 RTR server cleartext TCP port How RTR over SSH proxy worksThis container uses what is known as a OpenSSH subsystem to pipe the output of the SSHD session to another application. “rpki-rtr” is a well-known subsystem. When the router executes the ssh session to the RTR server it uses the “-S” ssh client flag to notify the SSH server the session should be handled by the rpki-rtr subsystem. In our example we use netcat to facilitate the proxy but other tools like socat could also be used.The /etc/sshd_config config file on the host as following (netcat is used for proxying in the example below)#Subsystem rpki-rtr /bin/nc 127.0.0.1 3323RTR over SSH Router ConfigurationThis is the basic configuration for defining a RPKI cache utilizing SSH transport. The password configuration will NOT be stored in the visible configurationrouter bgp 100 rpki server 172.27.223.244router bgp 100 rpki server 172.27.223.244 username rpkirouter bgp 100 rpki server 172.27.223.244 password password router bgp 100 rpki server 172.27.223.244 transport ssh port 57322 A recent version of OpenSSH no longer recognizes the version string of “1.99” identifying support for both SSH v1 and SSH v2. 
If you see a “major protocol version mismatch” error in the SSH server logs and the router cannot connect to the RPKI cache over SSH the following must be enabled to force the router SSH client protocol version to v2.RP/0/RP0/CPU0#Compass-PEF-PE1#ssh client v2Additional RPKI open source and informationNLNet Labs Krill projecthttps#//www.nlnetlabs.nl/projects/rpki/krill/ This is an open source RPKI Certificate Authority, allowing a provider to run their own delegated RPKI system. Having your own delegated authority allows one to manage ROAs yourself rather than go through an RIR portal. You can utilize your own in-house processes then to generate and manage ROAs and other RPKI resources.Cloudflare RPKI toolkithttps#//github.com/cloudflare/cfrpkiThe Cloudflare RPKI tooklkit consists of OctoRPKI, the Relying Party (validator) component of RPKI, and GoRTR the RPKI to RTR implementation. GoRTR uses a JSON list of prefixes (can be generated by OctoRPKI) as source information to feed routers. GoRTR supports cleartext, TLS, and SSH router to server connections.RIPE RPKI Validator v3https#//github.com/RIPE-NCC/rpki-validator-3/wikiThis is version 3 of RIPE NCCs RPKI Validator. It supports validation of all RPKI objects, has a full API to interact with the validator, and also contains a full web UI to explore RPKI data and validation status. There is also an included rpki-rtr server.", "url": "/blogs/routinator-hosted-on-xr", "author": "Phil Bedard", "tags": "rpki, peering, internet, security" } , "#": {} , "#": {} , "blogs-2020-10-01-peering-fabric-hld-3-5": { "title": "Peering Fabric Design", "content": " On This Page Revision History Key Drivers Traffic Growth Network Simplification Network Efficiency High-Level Design Peering Strategy Content Cache Aggregation Topology and Peer Distribution Platforms Control-Plane Slow Peer Detection for BGP Telemetry Automation Zero Touch Provisioning Cisco Crosswork Health Insights KPI pack Advanced Security using BGP Flowspec and QPPB (1.5) Radware validated DDoS solution Radware DefensePro Radware DefenseFlow Solution description Solution diagram Router SPAN (monitor) to physical interface configuration Router SPAN (monitor) to PWE Netscout Arbor validated DDoS Solution Solution Diagram Netscout Arbor Sightline Sightline Appliance Roles Netscout Arbor Threat Management System (TMS) Solution description Edge Mitigation Options Traffic Redirection Options Netscout Arbor TMS Blacklist Offloading Mitigation Example Internet and Peering in a VRF RPKI and Route Origin Validation Next-Generation IXP Fabric Validated Design Peering Fabric Design Use Cases Traditional IXP Peering Migration to Peering Fabric Peering Fabric Extension Localized Metro Peering and Content Delivery Express Peering Fabric Datacenter Edge Peering Peer Traffic Engineering with Segment Routing ODN (On-Demand Next-Hop) for Peering DDoS Traffic Steering using SR-TE and EPE Low-Level Design Integrated Peering Fabric Reference Diagram Distributed Peering Fabric Reference Diagram Peering Fabric Hardware Detail NCS-5501-SE NCS-55A1-36H-SE NCS-55A1-24H NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card NCS-55A2-MOD-SE-S Peer Termination Strategy Distributed Fabric Device Roles PFL – Peering Fabric Leaf PFS – Peering Fabric Spine Device Interconnection Capacity Scaling Peering Fabric Control Plane PFL to Peer PFL to PFS PFS to Core SR Peer Traffic Engineering Summary Nodal EPE Peer Interface EPE Abstract Peering SR-TE On-Demand Next-Hop for Peering ODN Configuration IXP Fabric Low Level 
Design Segment Routing Underlay EVPN L2VPN Services Peering Fabric Telemetry Telemetry Diagram Model-Driven Telemetry BGP Monitoring Protocol Netflow / IPFIX Automation and Programmability Cisco NSO Modules Netconf YANG Model Support 3rd Party Hosted Applications XR Service Layer API Recommended Device and Protocol Configuration Overview Common Node Configuration Enable LLDP Globally PFS Nodes IGP Configuration Segment Routing Traffic Engineering BGP Global Configuration Model-Driven Telemetry Configuration PFL Nodes Peer QoS Policy Peer Infrastructure ACL Peer Interface Configuration IS-IS IGP Configuration BGP Add-Path Route Policy BGP Global Configuration EBGP Peer Configuration PFL to PFS IBGP Configuration Netflow/IPFIX Configuration Model-Driven Telemetry Configuration Abstract Peering Configuration PFS Configuration BGP Flowspec Configuration and Operation Enabling BGP Flowspec Address Families on PFS and PFL Nodes BGP Flowspec Server Policy Definition BGP Flowspec Server Enablement BGP Flowspec Client Configuration QPPB Configuration and Operation Routing Policy Configuration Global BGP Configuration QoS Policy Definition Interface-Level Configuration BGP Graceful Shutdown Outbound graceful shutdown configuration Inbound graceful shutdown configuration Activating graceful shutdown Security Peering and Internet in a VRF VRF per Peer, default VRF for Internet Internet in a VRF Only VRF per Peer, Internet in a VRF Infrastructure ACLs BCP Implementation BGP Attribute and CoS Scrubbing BGP Control-Plane Type 6 Encryption Configuration TCP Authentication Option, MD5 Deprecation Per-Peer Control Plane Policers BGP Prefix Security RPKI Origin Validation BGP RPKI and ROV Confguration Create ROV Routing Policies Configure RPKI Server and ROV Options Enabling RPKI ROV on BGP Neighbors Communicating ROV Status via Well-Known BGP Community BGPSEC (Reference Only) DDoS traffic steering using SR-TE SR-TE Policy configuration Egress node BGP configuration Egress node MPLS static LSP configuration Appendix Applicable YANG Models NETCONF YANG Paths BGP Operational State Global BGP Protocol State BGP Neighbor State Example Usage BGP RIB Data Example Usage BGP Flowspec Device Resource YANG Paths Validated Model-Driven Telemetry Sensor Paths Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform LLDP Monitoring Interface statistics and state The following sub-paths can be used but it is recommended to use the base openconfig-interfaces model Aggregate bundle information (use interface models for interface counters) BGP Peering information IS-IS IGP information It is not recommended to monitor complete RIB tables using MDT but can be used for troubleshooting QoS and ACL monitoring BGP RIB information It is not recommended to monitor these paths using MDT with large tables Routing policy Information Revision History Version Date Comments 1.0 05/08/2018 Initial Peering Fabric publication 1.5 07/31/2018 BGP-FS, QPPB, ZTP, Internet/Peering in a VRF, NSO Services 2.0 04/01/2019 IXP Fabric, ODN and SR-PCE for Peering, RPKI 3.0 01/10/2020 SR-TE steering for DDoS, BGP graceful shutdown, Radware DDoS validation 3.5 11/01/2020 BGP slow peer detection, Type-6 Password Encryption, Arbor DDoS validation Key DriversTraffic GrowthInternet traffic has seen a compounded annual growth rate of 30% orhigher over the last five years, as more devices are connected and morecontent is consumed, fueled by the demand for video. 
Traffic willcontinue to grow as more content sources are added and Internetconnections speeds increase. Service and content providers must designtheir peering networks to scale for a future of more connected deviceswith traffic sources and destinations spanning the globe. Efficientpeering is required to deliver traffic to consumers.Network SimplificationSimple networks are easier to build and easier to operate. As networksscale to handle traffic growth, the level of network complexity mustremain flat. A prescriptive design using standard discrete componentsmakes it easier for providers to scale from networks handling a smallamount of traffic to 10s of Tbps without complete network forklifts.Fabrics with reduced control-plane elements and feature sets enhancestability and availability. Dedicating nodes to specific functions ofthe network also helps isolate the rest of the network from maliciousbehavior, defects, or instability.Network EfficiencyNetwork efficiency refers not only to maximizing network resources butalso optimizing the environmental impact of the deployed network. Muchof Internet peering today is done in 3rd party facilitieswhere space, power, and cooling are at a premium. High-density, lowerenvironmental footprint devices are critical to handling more trafficwithout exceeding the capabilities of a facility. In cases wheremultiple facilities must be connected, a simple and efficient way toextend networks must exist.High-Level DesignThe Peering design incorporates high-density environmentallyefficient edge routers, a prescriptive topology and peer terminationstrategy, and features delivered through IOS-XR to solve the needs ofservice and content providers. Also included as part of the Peeringdesign are ways to monitor the health and operational status of thepeering edge and through Cisco NSO integration assist providers inautomating peer configuration and validation. All designs areboth feature tested and validated as a complete design to ensurestability once implemented.Peering Strategyproposes a localized peering strategy to reduce network cost for“eyeball” service providers by placing peering or content provider cachenodes closer to traffic consumers. This reduces not only reducescapacity on long-haul backbone networks carrying traffic from IXPs toend users but also improves the quality of experience for users byreducing latency to content sources. The same design can also be usedfor content provider networks wishing to deploy a smaller footprintsolution in a SP location or 3rd party peering facility.Content Cache AggregationTraditional peering via EBGP at defined locations or over point to point circuits between routers is not sufficient enough today to optimize and efficiently deliver content between content providers and end consumers. Caching has been used for decades now performing traffic offload closer to eyeballs, and plays a critical role in today’s networks. The Peering Fabric design considers cache aggregation another role in “Peering” in creating a cost-optimized and scalable way to aggregate both provider and 3rd party caching servers such as those from Netflix, Google, or Akamai. The following diagram ** depicts a typical cache aggregation scenario at a metro aggregation facility. In larger high bandwidth facilities it is recommended to place caching nodes on a separate scalable set of devices separate from functions such as PE edge functions. 
Deeper in the network, Peering Fabric devices have the flexibility to integrate other functions such as small edge PE and compute termination such as in a 5G Mobile Edge Compute edge DC. Scale limitations are not a consideration with the ability to support full routing tablesin an environmentally optimized 1RU/2RU footprint.Topology and Peer DistributionThe Cisco Peering Fabric introduces two options for fabric topology andpeer termination. The first, similar to more traditional peeringdeployments, collapses the Peer Termination and Core Connectivitynetwork functions into a single physical device using the device’sinternal fabric to connect each function. The second option utilizes afabric separating the network functions into separate physical layers,connected via an external fabric running over standard Ethernet.In many typical SP peering deployments, a traditional two-node setup isused where providers vertically upgrade nodes to support the highercapacity needs of the network. Some may employ technologies such as backto back or multi-chassis clusters in order to support more connectionswhile keeping what seems like the operational footprint low. However,failures and operational issues occurring in these types of systems aretypically difficult to troubleshoot and repair. They also requirelengthy planning and timeframes for performing system upgrades. Weintroduce a horizontally scalable distributed peering fabric, the endresult being more deterministic interface or node failures.Minimizing the loss of peering capacity is very important for bothingress-heavy SPs and egress-heavy content providers. The loss of localpeering capacity means traffic must ingress or egress a sub-optimalnetwork port. Making a conscious design decision to spread peerconnections, even to the same peer, across multiple edge nodes helpsincrease resiliency and limit traffic-affecting network events.PlatformsThe Cisco NCS5500 platform is ideal for edge peer termination, given itshigh-density, large RIB and FIB scale, buffering capability, and IOS-XRsoftware feature set. The NCS5500 is also space and power efficient with36x100GE supporting up to 4M IPv4 routes in a 1RU fixed form factor orsingle modular line card. The Peering fabric can provide36x100GE, 144x10GE, or a mix of non-blocking peering connections withfull resiliency in 4RU. The fabric can also scale to support 10s ofterabits of capacity in a single rack for large peering deployments.Fixed chassis are ideal for incrementally building a peering edgefabric, the NCS NC55-36X100GE-A-SE and NC55A1-24H are efficient highdensity building blocks which can be rapidly deployed as needed withoutinstalling a large footprint of devices day one. Deployments needingmore capacity or interface flexibility such as IPoDWDM to extend peeringcan utilize the NCS5504 4-slot or NCS5508 8-slot modular chassis. If thepeering location has a need for services termination the ASR9000 familyor XRv-9000 virtual edge node can be incorporated into the fabric.All NCS5500 routers also contain powerful Route Processors to unlockpowerful telemetry and programmability. The Peering Fabric fixedchassis contain 1.6Ghz 8-core processors and 32GB of RAM. The latestNC55-RP-E for the modular NCS5500 chassis has a 1.9Ghz 6-core processorand 32G of RAM.Control-PlaneThe peering fabric design introduces a simplified control-plane builtupon IPv4/IPv6 with Segment Routing. 
In the collapsed design, eachpeering node is connected to EBGP peers and upstream to the core viastandard IS-IS, OSPF, and TE protocols, acting as a PE or LER in aprovider network.In the distributed design, network functions are separated. PeerTermination happens on Peering Fabric Leaf nodes. Peering Fabric Spineaggregation nodes are responsible for Core Connectivity and perform moreadvanced LER functions. The PFS routers use ECMP to balance trafficbetween PFL routers and are responsible for forwarding within the fabricand to the rest of the provider network. Each PFS acts as an LER,incorporated into the control-plane of the core network. The PFS, oralternatively vRRs, reflect learned peer routes from the PFL to the restof the network. The SR control-plane supports several trafficengineering capabilities. EPE to a specific peer interface, PFL node, orPFS is supported. We also introduce the abstract peering concept wherePFS nodes utilize a next-hop address bound to an anycast SR SID to allowtraffic engineering on a per-peering center basis.Slow Peer Detection for BGPIn the Peering Fabric 3.5 design and IOS-XR 7.1.2 slow-peer detection is enabled by default. Slow peers are those who are slow to receive and process inbound BGPupdates and ack those to the sender. If the slow peer is participating in the same update group as other peers, this can slow down the update process for all peers. In this release when IOS-XR detects a slow peer, it will create a syslogmention with information about the specific peer.TelemetryThe Peering fabric design uses the rich telemetry available in IOS-XRand the NCS5500 platform to enable an unprecedented level of insightinto network and device behavior. The Peering Fabric leverages Model-DrivenTelemetry and NETCONF along with both standard and native YANG modelsfor metric statistics collection. Telemetry configuration and applicablesensor paths have been identified to assist providers in knowing what tomonitor and how to monitor it.AutomationNETCONF and YANG using OpenConfig and native IOS-XR models are used tohelp automate peer configuration and validation. Cisco has developed specific Peering Fabric NSO service models to help automate common tasks suchas peer interface configuration, peer BGP configuration, and addingphysical interfaces to an existing peer bundle.Zero Touch ProvisioningIn addition to model-driven configuration and operation, Peering Fabric 1.5 alsosupports ZTP operation for automated device provisioning. ZTP is useful both in production as well as staging environments to automate initial device software installation, deploy an initial bootstrap configuration, as well as advanced functionality triggered by ZTP scripts. ZTP is supported on both out of band management interfaces as well as in-band data interfaces.Cisco Crosswork Health Insights KPI packTo ease the monitoring of common peering telemetry using CW Health Insights, a peering sensor pack is available containing common elements monitored for peering not included in the baseline CW HI KPI definitions. These include BGP session monitoring, RIB/FIB counts, and Flowspec statistics.Advanced Security using BGP Flowspec and QPPB (1.5)Release 1.5 of the Cisco Peering Fabric enhances the design by adding advancedsecurity capabilities using BGP Flowspec and QoS Policy Propagation using BGPor QPPB. BGP Flowspec was standardized in RFC 5575 and defines additional BGPNLRI to inject packet filter information to receiving routers. 
BGP is the control-plane fordisseminating the policy information while it is up to the BGP Flowspecreceiver to implement the dataplane rules specified in the NLRI. At theInternet peering edge, DDoS protection has become extremely important,and automating the remediation of an incoming DDoS attack has becomevery important. Automated DDoS protection is only one BGP Flowspec usecase, any application needing a programmatic way to create interfacepacket filters can make se use of its capabilities.QPPB allows using BGP attributes as a match criteria in dataplane packet filters. Matching packets based on attributes like BGP community and AS Path allows serviceproviders to create simplified edge QoS policies by not having to manage more cumbersome prefix lists or keep up to date when new prefixes are added. QPPB is supported in the peering fabric for destination prefix BGP attribute matching and has a number of use cases when delivering traffic from external providers to specific internal destinations.Radware validated DDoS solutionRadware, a Cisco partner, provides a robust and intelligent DDoS detection and mitigation solution covering both volumetric and application-layer DDoS attacks. The validated solution includes the following elements#Radware DefenseProDefensePro is used for attack detection and traffic scrubbing. DefensePro can be deployed at the edge of the network or centralized as is the case with a centralized scrubbing center. DefensePro uses realtime traffic analysis through SPAN (monitor) sessions from the edge routers to the DefensePro virtual machine or hardware appliance.Radware DefenseFlowDefenseFlow can work in a variety of ways as part of a comprehensive DDoS mitigation solution. DefenseFlow performs $anomaly detection by using advanced network behavioral analysis to first baseline a network during peacetime and then evaluate anomalies to determine when an attack is occurring. DefenseFlow can also incorporate third party data such as flow data or other data to enhance its attack detection capability. DefenseFlow also coordinates the mitigation actions of other solution components such as DefensePro and initiates traffic redirection through the use of BGP and BGP Flowspec on edge routers.Solution descriptionThe following steps describe the analysis and mitigation of DDoS attacks using Radware components. Radware DefenseFlow is deployed to orchestrate DDoS attack detection and mitigation. Virtual or appliance version of Radware DefensePro is deployed to a peering fabric location or centralized location. PFL nodes use interface monitoring sessions to mirror specific ingress traffic to an interface connected to the DefensePro element. The interface can be local to the PFL node or traffic or SPAN over Pseudowire can be used to tunnel traffic to an interface attached to a centralized DefensePro.Solution diagramRouter SPAN (monitor) to physical interface configurationThe following is used to direct traffic to a DefensePro virtual machine or appliance.monitor-session radware ethernet destination interface TenGigE0/0/2/2!interface TenGigE0/0/2/1 description ~DefensePro clean interface~ ipv4 address 182.10.1.1 255.255.255.252! 
interface TenGigE0/0/2/2 description ~SPAN interface to DefensePro~ !interface TenGigE0/0/2/3 description ~Transit peer connection~ ipv4 address 182.30.1.1 255.255.255.252 monitor-session radware ethernet port-level !end Router SPAN (monitor) to PWEThe following is used to direct traffic to a DefensePro virtual machine or appliance at a remote locationmonitor-session radware ethernet destination pseudowire !l2vpn xconnect group defensepro-remote p2p dp1 monitor-session radware neighbor ipv4 100.0.0.1 pw-id 1!interface TenGigE0/0/2/3 description ~Transit peer connection~ ipv4 address 182.30.1.1 255.255.255.252 monitor-session radware ethernet port-level !end Netscout Arbor validated DDoS SolutionNetscout, a Cisco partner, has deployed its Arbor solution at SPs around the world for advanced DDoS detection and mitigation. Using network analysis at the flow and packet level along with BGP and network statistic data, Arbor categorizes traffic based on user defined and learned criteria to quickly detect attacks. Once those attacks are detected SPs can mitigate those attacks using a combination of Route Triggered Blackhole, ACLs, and BGP Flowspec. Additionally, SPs can deploy the Arbor TMS or vTMS scrubbing appliances on-net to separate and block malicious traffic from legitimate traffic. Now we walk through the various solution components used in the Netscout Arbor solution.Information about all of Netscout’s traffic visibility and security solutions can be found at https#//www.netscout.comSolution DiagramNetscout Arbor SightlineSightline Appliance RolesSightline comprises the scalable distributed services responsible for network data collection, attack analysis, and mitigation coordination across the network. Each Sightline virtual machine appliance can be configured in a specific solution role. One of the deployed appliances is configured as the leader appliance maintaining the configuration for the entire Sightline cluster. In order to scale collection of network data, multiple collectors can be deployed to collect Netflow, SNMP, and BGP data. This data is then aggregated and used for traffic analysis and attack detection. Sightline elements can be configured via CLI or via the web UI once the UI appliance is operational. The following lists the different roles for Sightline VM appliances# Role Description Required UI Provides the web UI and all API access to the Sightline system Yes (recommended as Leader) Traffic and Routing Analysis Provides Netflow, BGP, and SNMP collection from the network along with DDoS analytics Yes Data Storage Separate data storage for Managed Object data, increasing scale of the overall solution for large deployments No Flow Sensor Generates flow data from the network when router export of Netflow is not capable No As seen in the table, UI and Traffic and Routing Analysis appliances are required.Netscout Arbor Threat Management System (TMS)The TMS or vTMS appliances provide deeper visibility into network traffic and acts as a scrubber as part of a holistic DDoS mitigation solution. The TMS performs deep packet inspection to identify application layer attacks at a packet and payload level, performs granular mitigation of the attack traffic, and provides reporting for the traffic. 
The TMS is integrated with a Routing and Analytics appliance so when attacks are detected by the R&A appliance it can then be redirected to a tethered TMS appliance for further inspection or mitigation.Solution descriptionThe following steps describe the analysis and mitigation of DDoS attacks using Netscout Arbor components. Netscout Arbor Sightline UI leader virtual appliance One or more Netscout Arbor Sightline Routing and Analytics appliances One or more Netscout Arbor TMS or vTMS appliances All routers in the network configured to export Netflow data to the R&A appliances Sightline and the network configured for SNMP and BGP collection from each network router, assigned to proper roles (Edge, Core) If necessary, configure Netscout Arbor Managed Objects to collect and analyze specific traffic for anomalies and attacks Configure mitigation components such as RTBH next-hops, TMS mitigation, and BGP Flowspec redirect and drop parametersEdge Mitigation OptionsThe methods to either police or drop traffic at the edge of the network are# Route Triggered Blackhole The RTBH IPv4 or IPv6 BGP prefix is advertised from the Routing and Analytics node, directing edge routers to Null route a specific prefix being attackced. This will cause all traffic to the destination prefix to be dropped on the edge router. Access Control Lists ACLs are generated and deployed on edge interfaces to mitigate attacks. In addition to matching either source or destination prefixes, ACLs can also match additional packet header information such as protocol and port. ACLs can be created to either drop all traffic matching the specific defined rules or rate-limit traffic to a configured policing rate. ACLs BGP Flowspec BGP Flowspec mitigation allows the provider to distribute edge mitigation in a scalable way using BGP. In a typical BGP Flowspec deployment the Netscout Arbor R&A node will advertise the BGP Flowspec policy to a provider Route Reflector which then distributes the BGP FS routes to all edge routers. BGP Flowspec rules can match a variety of header criteria and perform drop, police, or redirect actions. Traffic Redirection Options BGP Flowspec It is recommended to use BGP Flowspec to redirect traffic on PFL nodes to TMS appliances. This can be done through traditional configuration with next-hop redirection in the global routing table, redirection into a “dirty” VRF, or using static next-hops into SR-TE tunnels in the case where the scrubbing appliances are not connected via a directly attached interface to the PFL or PFS nodes. Netscout Arbor TMS Blacklist OffloadingBlacklist offloading is a combination of traffic scrubbing using the TMS along with filtering/dropping traffic on each edge router. The Netscout Arbor system identifies the top sources of attack traffic and automatically generates the BGP Flowspec rules to drop traffic on the edge router before it is redirected to the TMS. This makes the most efficient use of the TMS mitigation resources.Mitigation ExampleThe graphic below shows an example of traffic mitigation via RTBH. Netscout Arbor still receives flow information from the network edge for mitigated traffic, so Arbor is able to detect the amount of traffic which has been mitigates using the appropriate mitigation method.Internet and Peering in a VRFWhile Internet peering and carrying the Internet table in a provider network is typically done using the Global Routing Table (default VRF in IOS-XR) many modern networks are being built to isolate the GRT from the underlying infrastructure. 
In this case, the Internet global table is carried as a service just like any other VPN service, leaving the infrastructure layer protected from the global Internet. Another application of VRFs is to simply isolate peers in specific VRFs in order to separate the forwarding plane of each peer and control which routes a peer sees through the use of VPN route target communities as opposed to outbound routing policy. In this simplified use case the global table is still carried in the default VRF, using IOS-XR capabilities to import and export routes to and from specific peer VRFs. Separating Internet and Peering routes into specific VRFs also gives flexibility in creating custom routing tables for specific customers, giving a service provider the flexibility to offer separate regional or global reach on the same network.Internet in a VRF and Peering in a VRF for IPv4 and IPv6 are compatible with most Peering Fabric features. Specific caveats are documented in the Appendix of the document.RPKI and Route Origin ValidationRPKI stands for Resource Public Key Infrastructure and is a repository for attaching a trust anchor to Internet routing resources such as Autonomous Systems and IP Prefixes. Each RIR (Regional Internet Registry) houses the signed resource records it is responsible for, giving a trust anchor to those resources.The RPKI contains a Route Origin Authorization object, used to uniquely identify the ASN originating a prefix and, optionally, the longer sub-prefixes covered by it. RPKI records are published by each Regional Internet Registry (RIR) and consumed by offline RPKI validators. The RPKI validator is an on-premise application responsible for compiling a list of routes considered VALID. Keep in mind these are only the routes which are registered in the RPKI database, no information is gathered from the global routing table. Once resource records are validated, the validator uses the RTR protocol (RFC 6810) to communicate with client routers, which periodically make requests for an updated database.The router uses this database along with policy to validate incoming BGP prefixes against the database, a process called Route Origin Validation (ROV). ROV verifies that the origin ASN in the AS_PATH of the prefix NLRI matches the RPKI database. A communication flow diagram is given below. RPKI configuration examples are given in the implementation section.The Peering Fabric design was validated using the Routinator RPKI validator. Please see the security section for configuration of RPKI ROV in IOS-XR.For more information on RPKI and RPKI deployment with IOS-XR please see# https#//xrdocs.io/design/blogs/routinator-hosted-on-xrNext-Generation IXP FabricIntroduced in Peering Fabric 2.0 is a modern design for IXP fabrics. The design creates a simplified fault-tolerant L2VPN fabric with point to point and multi-point peer connectivity. Segment Routing brings a simplified MPLS underlay with resilience using TI-LFA and traffic engineering capabilities using Segment Routing - Traffic Engineering Policies. Today’s IX Fabrics utilize either traditional L2 networks or emulated L2 using VPLS and LDP/RSVP-TE underlays. The Cisco NG IX Fabric uses EVPN for all L2VPN services, replacing complicated LDP signaled services with a scalable BGP control-plane.
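As a rough sketch of how a point-to-point peer interconnect could be expressed on such a fabric, the IOS-XR style EVPN-VPWS cross-connect below stitches a peer-facing l2transport sub-interface into the BGP-signaled EVPN control-plane; the interface, EVI, and attachment circuit identifiers are purely illustrative.
l2vpn
 xconnect group IX-FABRIC
  p2p PEER-A-TO-PEER-B
   interface TenGigE0/0/0/10.100
   neighbor evpn evi 100 target 1 source 2
  !
 !
!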
See the implementation section for more details on configuring the IX fabric underlay and EVPN services.The IX fabric can also utilize the NSO automation created in the Metro Fabric design for deploying EVPN VPWS (point-to-point) and multi-point EVPN ELAN services.Validated DesignThe Peering Fabric Design control, management, and forwarding planes have undergone validation testing to ensure individual design features work as intended and the peering fabric as a whole performs without fault. Validation is done exceeding real-world scaling requirements to ensure the design fulfills its role in existing networks with room for future growth.Peering Fabric Design Use CasesTraditional IXP Peering Migration to Peering FabricA traditional SP IXP design uses one or two large modular systems terminating all peering connections. In many cases, since providers are constrained on space and power, they use a collapsed design where a minimal set of peering nodes not only terminates peer connections but also provides services and core connectivity to the location. The Peering Fabric uses best-of-breed high-density, low-footprint hardware requiring much less space than older generation modular systems. Many older systems provide densities at approximately 4x100GE per rack unit, while Peering Fabric PFL nodes start at 24x100GE or 36x100GE per 1RU with high FIB capability. Due to the superior space efficiency, there is no longer a limitation of using just a pair of nodes for these functions. In either a collapsed function or distributed function design, peers can be distributed across a number of devices to increase resiliency and lessen collateral impact when failures occur. The diagram below shows a fully distributed fabric, where peers are now distributed across three PFL nodes, each with full connectivity to upstream PFS nodes.Peering Fabric ExtensionIn some cases, there may be peering facilities within close geographic proximity which need to integrate into a single fabric. This may happen if there are multiple 3rd party facilities in a close geographic area, each with unique peers you want to connect to. There may also be multiple independent peering facilities within a small geographic area into which you do not wish to install a complete peering fabric. In those cases, connecting remote PFL nodes to a larger peering fabric can be done using optical transport or longer-range gray optics.Localized Metro Peering and Content DeliveryIn order to drive greater network efficiency, content sources should be placed as close to the end destination as possible. Traditional wireline and wireless service providers have heavy inbound traffic from content providers delivering OTT video. Providers may also be providing their own IP video services to on-net and off-net destinations via an SP CDN. Peering and internal CDN equipment can be placed within a localized peer or content delivery center, connected via a common peering fabric. In these cases the PFS nodes connect directly to the metro core to enable delivery across the region or metro.Express Peering FabricAn evolution to localized metro peering is to interconnect the PFS peering nodes directly or via a metro-wide peering core. The main driver for direct interconnection is minimizing the number of router and transport network interfaces traffic must pass through.
High density optical muxponders such as the NCS1002, along with flexible photonic ROADM architectures enabled by the NCS2000, can help make the most efficient use of metro fiber assets.Datacenter Edge PeeringIn order to serve traffic as close to consumer endpoints as possible, a provider may construct a peering edge attached to an edge or central datacenter. As gateway functions in the network become virtualized for applications such as vPE, vCPE, and mobile 5G, the need to attach Internet peering to the SP DC becomes more important. The Peering Fabric supports interconnection to the DC via the SP core or with the PFS nodes acting as leafs to the DC spine. These would act as traditional border routers in the DC design.Peer Traffic Engineering with Segment RoutingSegment Routing performs efficient source routing of traffic across a provider network. Traffic engineering is particularly applicable to peering as content providers look for ways to optimize egress network ports and eyeball providers work to reduce network hops between ingress and subscriber. There are also a number of advanced use cases based on using constraints to place traffic on optimal paths, such as latency. An SR-TE Policy represents a forwarding entity within the SR domain mapping traffic to a specific network path, defined statically on the node or computed by an external PCE. An additional benefit of SR is the ability to source route traffic based on a node SID or an anycast SID representing a set of nodes. ECMP behavior is preserved at each point in the network, redundancy is simplified, and traffic protection is supplied using TI-LFA.In the Low-Level Design we explore common peer engineering use cases. Much more information on Segment Routing technology and its future evolution can be found at http://segment-routing.net ODN (On-Demand Next-Hop) for PeeringThe 2.0 release of Peering Fabric introduces ODN as a method for dynamically provisioning SR-TE Policies to nodes based on specific “color” extended communities attached to advertised BGP routes. The color represents a set of constraints used for the provisioned SR-TE Policy, applied to traffic automatically steered into the Policy once the SR-TE Policy is instantiated.An applicable example is the use case where I have several types of peers on the same device sending traffic to destinations across my larger SP network. Some of this traffic may be Best Effort with no constraints, other traffic from cloud partners may be considered low-latency traffic, and traffic from a services partner may have additional constraints such as maintaining a disjoint path from the same peer on another router. Traffic in the reverse direction egressing a peer from an SP location can also utilize the same mechanisms to apply constraints to egress traffic.DDoS Traffic Steering using SR-TE and EPESR-TE and Egress Peer Engineering can be utilized to direct DDoS traffic to a specific end node and specific DDoS destination interface without the complexities of using VRFs to separate dirty/clean traffic. On ingress, traffic is immediately steered into an SR-TE Policy and no IP lookup is performed between the ingress node and egress DDoS “dirty” interface. In the 3.0 design using IOS-XR 6.6.3, Flowspec redirects traffic to a next-hop IP pointing to a pre-configured “DDoS” SR-Policy.
An MPLS xconnect is used to map DDoS traffic carrying a specific EPE label on the egress node to a specific egress interface.Low-Level DesignIntegrated Peering Fabric Reference DiagramDistributed Peering Fabric Reference DiagramPeering Fabric Hardware DetailThe NCS5500 family of routers provides high density, high routing scale, ideal buffer sizes, and environmental efficiency to help providers satisfy any peering fabric use case. Due to high FIB scale, large buffers, and a broad XR feature set, all prescribed hardware can serve in either a collapsed or distributed fabric. Further detailed information on each platform can be found at https://www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html.NCS-5501-SEThe NCS 5501-SE is a 1RU fixed router with 40x10GE SFP+ and 4x100GE QSFP28 interfaces. The 5501-SE has an IPv4 FIB scale of at least 2M routes and is ideal as a peering leaf node when providers need 10GE interface flexibility such as ER, ZR, or DWDM.NCS-55A1-36H-SEThe 55A1-36H-SE is a second generation 1RU NCS5500 fixed platform with 36 100GE QSFP28 ports operating at line rate. The -SE model contains an external TCAM increasing route scale to a minimum of 3M IPv4/512K IPv6 routes in its FIB. It also contains a powerful multi-core route processor with 64GB of RAM and an on-board 64GB SSD. Its high density, efficiency, and buffering capability make it ideal in 10GE or 100GE deployments. Peering fabrics can scale to much higher capacity 1RU at a time by simply adding additional 55A1-36H-SE spine nodes.NCS-55A1-24HThe NCS-55A1-24H is a second generation 1RU NCS5500 fixed platform with 24 100GE QSFP28 ports. The device uses two 900Gbps NPUs, with 12x100GE ports connected to each NPU. The 55A1-24H uses a high scale NPU with a minimum of 1.3M IPv4/256K IPv6 routes. At just 675W it is ideal for 10GE peering fabric deployments with a migration path to 100GE connectivity. The 55A1-24H also has a powerful multi-core processor and 32GB of RAM.NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card Very large peering fabric deployments or those needing interface flexibility such as IPoDWDM connectivity can use the modular NCS5500 series chassis. Large deployments can utilize the second-generation 36X100G-A-SE line card with external TCAM, supporting a minimum of 3M IPv4 routes.NCS-55A2-MOD-SE-SThe NCS-55A2-MOD router is a 2RU router with 24x10G SFP+ interfaces, 16x25GE SFP28 interfaces, and two Modular Port Adapter (MPA) slots with 400Gbps of full-duplex bandwidth. A variety of MPAs are available, adding additional 10GE, 100GE QSFP28, and 100G/200G CFP2 interfaces. The CFP2 interfaces support CFP2-DCO Digital Coherent Optics, simplifying deployment for peering extensions connected over dark fiber or DWDM multiplexers.The 55A2-MOD-SE-S uses a next-generation external TCAM with a minimum route scale of 3M IPv4/512K IPv6. The 55A2-MOD-SE-S also supports advanced security using BGP Flowspec and QPPB.Peer Termination StrategyOften overlooked when connecting to Internet peers is determining a strategy to maximize efficiency and resiliency within a local peering instance. Oftentimes a peer is connected to a single peering node, even when two nodes exist, for ease of configuration and coordination with the peering or transit partner. However, with minimal additional configuration and administration assisted by automation, even single peers can be spread across multiple edge peering nodes. Ideally, within a peering fabric, a peer is connected to each leaf in the fabric.
In cases where this cannot be done, the provider should use capacity planning processes to balance peers and transit connections across multiple leafs in the fabric. The added resiliency leads to greater efficiency when failures do happen, with less reliance on peering capacity further away from the traffic destination.Distributed Fabric Device RolesPFL – Peering Fabric LeafThe Peering Fabric Leaf is the node physically connected to external peers. Peers could be aggregation routers or 3rd party CDN nodes. In a deconstructed design the PFL is analogous to a line card in a modular chassis solution. PFL nodes can be added as capacity needs grow.PFS – Peering Fabric SpineThe Peering Fabric Spine acts as an aggregation node for the PFLs and is also physically connected to the rest of the provider network. The provider network could refer to a metro core in the case of localized peering, a backbone core in relation to IXP peering, or a DC spine layer in the case of DC peering.Device InterconnectionIn order to maximize resiliency in the fabric, each PFL node is connected to each PFS. While the design shown includes three PFLs and two PFS nodes, there could be any number of PFL and PFS nodes, scaling horizontally to keep up with traffic and interface growth. PFL nodes are not connected to each other; the PFS nodes provide the capacity for any traffic between those nodes. The PFS nodes are also not interconnected to each other, as no end device should terminate on the PFS, only other routers.Capacity ScalingCapacity of the peering fabric is scaled horizontally. The uplink capacity from PFL to PFS will be determined by an appropriate oversubscription factor derived from the service provider’s capacity planning exercises. The leaf/spine architecture of the fabric connects each PFL to each PFS with equal capacity. In steady-state operation traffic is balanced between the PFS and PFL in both directions, maximizing the total capacity. The entropy in peering traffic generally ensures equal distribution between either ECMP paths or bundle interface member links in the egress direction. More information can be found in the forwarding plane section of the document. An example deployment may have two NC55-36X100G-A-SE spine nodes and two NC55A1-24H leaf nodes. In a 100GE peer deployment scenario each leaf would support 14x100GE client connections and 5x100GE to each spine node. A 10GE deployment would support 72x10GE client ports and 3x100GE to each spine, at a 1.2:1 oversubscription ratio.Peering Fabric Control PlanePFL to PeerThe Peering Fabric Leaf is connected directly to peers via traditional EBGP. BFD may additionally be used for fault detection if agreed to by the peer. Each EBGP peer will utilize SR EPE to enable TE to the peer from elsewhere on the provider network.PFL to PFSPFL to Peering Fabric Spine uses widely deployed standard routing protocols. IS-IS is the prescribed IGP protocol within the peering fabric. Each PFS is configured with the same IS-IS L1 area. In the case where OSPF is being used as an IGP, the PFL nodes will reside in an OSPF NSSA area. The peering fabric IGP is SR-enabled with the loopback of each PFL assigned a globally unique SR Node SID. Each PFL also has an IBGP session to each PFS to distribute its learned EBGP routes upstream and learn routes from elsewhere on the provider network.
If a provider is distributing routes from PFL to PFL or from another peering location to local PFLs it is important to enable the BGP “best-path-external” feature to ensure the PFS has the routing information to accelerate re-convergence if it loses the more preferred path.Egress peer engineering will be enabled for EBGP peering connections, so that each peer or peer interface connected to a PFL is directly addressable by its Adj-Peer-SID from anywhere on the SP network. Adj-Peer-SID information is currently not carried in the IGP of the network. If utilized it is recommended to distribute this information using BGP-LS to all controllers creating paths to the PFL EPE destinations.Each PFS node will be configured with IBGP multipath so traffic is load balanced to PFL nodes, increasing resiliency in the case of peer failure. On reception of a BGP withdraw update for a multipath route, traffic loss is minimized as the remaining valid route is still programmed into the FIB.PFS to CoreThe PFS nodes will participate in the global Core control plane and act as the gateway between the peering fabric and the rest of the SP network. In order to create a more scalable and programmatic fabric, it is prescribed to use Segment Routing across the core infrastructure. IS-IS is the preferred protocol for transmitting SR SID information from the peering fabric to the rest of the core network and beyond. In deployments where it may be difficult to transition quickly to an all-SR infrastructure, the PFS nodes will also support OSPF and RSVP-TE for interconnection to the core. The PFS acts as an ABR or ASBR between the peering fabric and the larger metro or backbone core network.SR Peer Traffic EngineeringSummarySR allows a provider to create engineered paths to egress peering destinations or egress traffic destinations within the SP network. A stack of globally addressable labels is created at the traffic entry point, requiring no additional protocol state at midpoints in the network and preserving qualities of normal IGP routing such as ECMP at each hop. The Peering Fabric proposes end-to-end visibility from the PFL nodes to the destinations and vice-versa. This will allow a range of TE capabilities targeting a peering location, a peering exit node, or as granular as a specific peering interface on a particular node. The use of anycast SIDs within a group of PFS nodes increases resiliency and load balancing capability.Nodal EPENode EPE directs traffic to a specific peering node within the fabric. The node is targeted using first the PFS cluster anycast IP and then the specific PFL node SID.Peer Interface EPEThis example uses an Egress Peer Engineering peer-adj-SID value assigned to a single peer interface. The result is that traffic sent along this SR path will use only the prescribed interface for egress traffic.Abstract PeeringAbstract peering allows a provider to simply address a Peering Fabric by the anycast SIDs of its cluster of PFS nodes. In this case PHP is used for the anycast SIDs and traffic is simply forwarded as IP to the final destination across the fabric.SR-TE On-Demand Next-Hop for PeeringSR-TE On-Demand Next-Hop is a method to dynamically create specific constraint-based tunnels across an SP network to/from edge peering nodes.
ODN utilizes Cisco’s Segment Routing Path Computation Element (SR-PCE) to compute paths on demand based on the BGP next-hop and associated “color” communities. When a node receives a route with a specific community, it builds an SR-TE Policy to the BGP next-hop based on policy.One provider example is the case where I have DIA (Direct Internet Access) customers with different levels of service. I can create a specific SLA for “Gold” customers so their traffic takes a lower latency path across the network. In B2B peering arrangements, I can ensure voice or video traffic I am ingesting from a partner network takes priority. I can do this without creating a number of static tunnels on the network.ODN ConfigurationODN requires a few components to be configured. In this example we tag routes coming from a specific provider with the color “BLUE” with a numerical value of 100. In IOS-XR we first define an extended community set defining our color with a unique string identifier of BLUE. This configuration should be found on both the ingress and egress nodes of the SR Policy.extcommunity-set opaque BLUE 100 end-setThe next step is to define an inbound routing policy on the PFL nodes tagging all inbound routes from PEER1 with the BLUE extended community.route-policy PEER1-IN set community (65000:100) set local-preference 100 set extcommunity color BLUE pass end-policyIn order for the head-end node to process the color community and create an SR Policy with constraints, the color must be configured under SR Traffic Engineering. The following configuration defines a color value of 100, the same as our extended community BLUE, and instructs the router how to handle creating the SR-TE Policy to the BGP next-hop address of the prefix received with the community. In this instance it instructs the router to utilize an external PCE, SR-PCE, to compute the path and use the lowest IGP metric path cost to reach the destination. Other options available are TE metric, latency, hop count, and others covered in the SR Traffic Engineering documentation found on cisco.com.segment-routing traffic-eng on-demand color 100 dynamic pcep ! metric type igpThe head-end router will only create a single SR-TE Policy to the next-hop address; other prefixes matching the original next-hop and constraints will utilize the pre-existing tunnel. The tunnels are ephemeral, meaning they will not persist across router reboots.IXP Fabric Low Level DesignSegment Routing UnderlayThe underlay network used in the IXP Fabric design is the same as utilized with the regular Peering Fabric design. The validated IGP used for all iterations of the IXP Fabric is IS-IS, with all elements of the fabric belonging to the same Level 2 IS-IS domain.EVPN L2VPN ServicesComprehensive configuration for EVPN L2VPN services is outside the scope of this document; please consult the Converged SDN Transport design guide or associated Cisco documentation for low level details on configuring EVPN VPWS and EVPN ELAN services. The Converged SDN Transport design guide can be found at the following URL: https://xrdocs.io/design/blogs/latest-converged-sdn-transport-hld Peering Fabric TelemetryOnce a peering fabric is deployed, it is extremely important to monitor the health of the fabric as well as harness the wealth of data provided by the enhanced telemetry on the NCS5500 platform and IOS-XR. Through streaming data mechanisms such as Model-Driven Telemetry, BMP, and Netflow, providers can extract data useful for operations, capacity planning, security, and many other use cases.
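As one example, the BMP export referenced later in the EBGP peer configuration (bmp-activate server 1) relies on a global BMP server definition. A minimal sketch is shown below, assuming a hypothetical collection station at 192.0.2.50; the port and timers are illustrative only.
bmp server 1
 host 192.0.2.50 port 5000 ;Hypothetical BMP collection station
 description Peering fabric BMP collection station
 update-source Loopback0
 initial-delay 30
 stats-reporting-period 60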
In the diagram below, the telemetry collection hosts could be a single system or distributed systems used for collection. The distributed design of the peering fabric enhances the ability to collect telemetry data from the fabric by distributing resources across the fabric. Each PFL or PFS contains a modern multi-core CPU and at least 32GB of RAM (64GB in the NC55A1-36H-SE) to support not only built-in telemetry operation but also 3rd party applications a service or content provider may want to deploy to the node for additional telemetry. Examples of 3rd party telemetry applications include those storing temporary data for root-cause analysis if a node is isolated from the rest of the network, or performance measurement applications.The peering fabric also fully supports traditional collection methods such as SNMP, and NETCONF using YANG models, to integrate with legacy systems.Telemetry DiagramModel-Driven TelemetryMDT uses standards-based or native IOS-XR YANG data models to stream operational state data from deployed devices. The ability to push statistics and state data from the device adds capabilities and efficiency not found using traditional SNMP. Sensors and collection hosts can be configured statically on the device (dial-out), or the set of sensors, collection hosts, and their attributes can be managed off-box using OpenConfig or native IOS-XR YANG models. Pipeline is Cisco’s open source collector, which can take MDT data as an input and output it via a plugin architecture supporting scalable message buses such as Kafka, or directly to a TSDB such as InfluxDB or Prometheus. The appendix contains information about MDT YANG paths relevant to the peering fabric and their applicability to PFS and PFL nodes.BGP Monitoring ProtocolBMP, defined in RFC7854, is a protocol to monitor BGP RIB information, updates, and protocol statistics. BMP was created to alleviate the burden of collecting BGP routing information using inefficient mechanisms like screen scraping. BMP has two primary modes, Route Monitoring mode and Route Mirroring mode. The monitoring mode will initially transmit the adj-rib-in contents per-peer to a monitoring station, and continue to send updates as they occur on the monitored device. Setting the L bit in the RM header to 1 conveys a post-policy route, while 0 indicates pre-policy. The mirroring mode simply reflects all received BGP messages to the monitoring host. IOS-XR supports sending pre- and post-policy routing information and updates to a station via the Route Monitoring mode. BMP can additionally send information on peer state change events, including why a peer went down in the case of a BGP event.There are drafts in the IETF process led by Cisco to extend BMP to report additional routing data, such as the loc-RIB and per-peer adj-RIB-out. Local-RIB is the full device RIB including received BGP routes, routes from other protocols, and locally originated routes. Adj-RIB-out will add the ability to monitor routes advertised to peers pre and post routing policy.Netflow / IPFIXNetflow was invented by Cisco due to requirements for traffic visibility and accounting. Netflow in its simplest form exports 5-tuple data for each flow traversing a Netflow-enabled interface. Netflow data is further enhanced with the inclusion of BGP information in the exported Netflow data, namely AS_PATH and destination prefix. This inclusion makes it possible to see where traffic originated by ASN and derive the destination for the traffic per BGP prefix.
The latest iteration of Cisco Netflow is Netflow v9, with the next-generation IETF standardized version called IPFIX (IP Flow Information Export). IPFIX has expanded on Netflow’s capabilities by introducing hundreds of entities.Netflow is traditionally partially processed telemetry data. The device itself keeps a running cache table of flow entries and counters associated with packets, bytes, and flow duration. At certain time intervals, or when triggered by an event, the flow entries are exported to a collector for further processing. The type 315 extension to IPFIX, supported on the NCS5500, does not process flow data on the device, but sends the raw sampled packet header to an external collector for all processing. Due to the high bandwidth, PPS rate, and large number of simultaneous flows on Internet routers, Netflow samples packets at a pre-configured rate for processing. Typical sampling values on peering routers are 1 in 8192 packets; however, customers implementing Netflow or IPFIX should work with Cisco to fine tune parameters for optimal data fidelity and performance.Automation and ProgrammabilityCisco NSO ModulesCisco Network Services Orchestrator is a widely deployed network automation and orchestration platform, performing intent-driven configuration and validation of networks from a single source of truth configuration database. The Peering design includes Cisco NSO modules to perform specific peering tasks such as peer turn-up, peer modification, and deploying routing policy and ACLs to multiple nodes, providing a jumpstart to peering automation. The following table highlights the currently available Peering NSO services. The current peering service models use the IOS-XR CLI NED and are validated with NSO 4.5.5.
Service - Description
peering-service - Manage full BGP and Interface Configuration for EBGP Peers
peering-acl - Manage infrastructure ACLs referenced by the peering service
prefix-set - Manage IOS-XR prefix-sets
as-path-set - Manage IOS-XR as-path sets
route-policy - Manage XR routing policies for deployment to multiple peering nodes
peering-common - A set of services to manage as-path sets, community sets, and static routing policies
drain-service - Service to automate draining traffic away from a node under maintenance
telemetry - Service to enable telemetry sensors and export to collector
bmp - Service to enable BMP on configured peers and export to monitoring station
netflow - Service to enable Netflow on configured peer interfaces and export to collector
PFL-to-PFS-Routing - Configures IGP and BGP routing between PFL and PFS nodes
PFS-Global-BGP - Configures global BGP parameters for PFS nodes
PFS-Global-ISIS - Configures global IS-IS parameters for PFS nodes
NetconfNetconf is an industry standard method for configuring network devices. Standardized in RFC 6241, Netconf defines standard Remote Procedure Calls (RPCs) to manipulate configuration data and retrieve state data. Netconf on IOS-XR supports the candidate datastore, meaning configuration must be explicitly committed for application to the running configuration.YANG Model SupportWhile Netconf created standard RPCs for managing configuration on a device, it did not define a language for expressing configuration. The configuration syntax communicated by Netconf followed each vendor's proprietary CLI configuration, formatted as XML without following any common semantics. YANG, or Yet Another Next Generation, is a modeling language to express configuration using standard elements such as containers, groupings, lists, and endpoint data called leafs.
YANG1.0 was defined in RFC 6020 and updated to version 1.1 in RFC 7950.Vendors cover the majority of device configuration and state usingNative YANG models unique to each vendor, but the industry is headedtowards standardized models where applicable. Groups such as OpenConfigand the IETF are developing standardized YANG models allowing operatorsto write a configuration once across all vendors. Cisco has implementeda number of standard OpenConfig network models relevant to peeringincluding the BGP protocol, BGP RIB, and Interfaces model.The appendix contains information about YANG paths relevant toconfiguring the peering fabric and their applicability to PFS and PFLnodes.3rd Party Hosted ApplicationsIOS-XR starting in 6.0 runs on an x86 64-bit Linux foundation. The moveto an open and well supported operating system, with XR componentsrunning on top of it, allows network providers to run 3rdparty applications directly on the router. There are a wide variety ofapplications which can run on the XR host, with fast path interfaces inand out of the application. Example applications are telemetrycollection, custom network probes, or tools to manage other portions ofthe network within a location.XR Service Layer APIThe XR service layer API is a gRPC based API to extract data from adevice as well as provide a very fast programmatic path into therouter’s runtime state. One use case of SL API in the peering fabricis to directly program FIB entries on a device, overriding the defaultpath selection. Using telemetry extracted from a peering fabric, anexternal controller can use the data and additional external constraintsto programmatically direct traffic across the fabric. SL API alsosupports transmission of event data via subscriptions.Recommended Device and Protocol ConfigurationOverviewThe following configuration guidelines will step through the majorcomponents of the device and protocol configuration specific to thepeering fabric and highlight non-default configuration recommended foreach device role and the reasons behind those choices. Complete exampleconfigurations for each role can be found in the Appendix of thisdocument. Configuration specific to telemetry is covered in section 4.Common Node ConfigurationThe following configuration is common to both PFL and PFS NCS5500 seriesnodes.Enable LLDP GloballylldpPFS NodesAs the PFS nodes will integrate into the core control-plane, onlyrecommended configuration for connectivity to the PFL nodes is given.IGP Configurationrouter isis pf-internal-core set-overload-bit on-startup wait-for-bgp is-type level-1-2 net <L2 NET> net <L1 PF NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10! 
interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10Segment Routing Traffic EngineeringIn IOS-XR there are two mechanisms for configuring SR-TE. Prior to IOS-XR 6.3.2 SR-TE was configured using the MPLS traffic engineering tunnel interface configuration. Starting in 6.3.2 SR-TE can now be configured using the more flexible SR-TE Policy model. The following examples show how to define a static SR-TE path from PFS node to exit PE node using both the legacy tunnel configuration model as well as the new SR Policy model.Paths to PE exit node being load balanced across two static P routers using legacy tunnel configexplicit-path name PFS1-P1-PE1-1 index 1 next-address 192.168.12.1 index 2 next-address 192.168.11.1!explicit-path name PFS1-P2-PE1-1 index 1 next-label 16221 index 2 next-label 16511!interface tunnel-te1 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.1 path-option 1 explicit name PFS1-P1-PE1-1 segment-routing!interface tunnel-te2 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.2 path-option 1 explicit name PFS1-P2-PE1-1 segment-routingIOS-XR 6.3.2+ SR Policy Configurationsegment-routingtraffic-eng segment-list PFS1-P1-PE1-SR-1 index 1 mpls label 16211 index 2 mpls label 16511 ! segment-list PFS1-P2-PE1-SR-1 index 1 mpls label 16221 index 2 mpls label 16511 ! policy pfs1_pe1_via_p1 binding-sid mpls 900001 color 1 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! ! policy pfs1_pe1_via_p2 binding-sid mpls 900002 color 2 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! !BGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATH is longer address-family ipv4 unicast additional-paths receive maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths receive bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF Model-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! 
sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1PFL NodesPeer QoS PolicyPolicy applied to edge of the network to rewrite any incoming DSCP valueto 0.policy-map peer-qos-in class class-default set dscp default ! end-policy-map!Peer Infrastructure ACLSee the Security section of the document for recommended best practicesfor ingress and egress infrastructure ACLs.access-group v4-infra-acl-in access-group v6-infra-acl-in access-group v4-infra-acl-out access-group v6-infra-acl-out Peer Interface Configurationinterface TenGigE0/0/0/0 description “external peer” service-policy input peer-qos-in ;Explicit policy to rewrite DSCP to 0 lldp transmit disable #Do not run LLDP on peer connected interfaces lldp receive disable #Do not run LLDP on peer connected interfaces ipv4 access-group v4-infra-acl-in #IPv4 Ingress infrastructure ACL ipv4 access-group v4-infra-acl-out #IPv4 Egress infrastructure ACL, BCP38 filtering ipv6 access-group v6-infra-acl-in #IPv6 Ingress infrastructure ACL ipv6 access-group v6-infra-acl-out #IPv6 Egress infrastructure ACL, BCP38 filtering IS-IS IGP Configurationrouter isis pf-internal set-overload-bit on-startup wait-for-bgp is-type level-1 net <L1 Area NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10 ! 
interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10BGP Add-Path Route Policyroute-policy advertise-all ;Create policy for add-path advertisements set path-selection all advertiseend-policyBGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATh is longer address-family ipv4 unicast bgp attribute-download ;Enable BGP information for Netflow/IPFIX export additional-paths send additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv4 NLRI to PFS maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths send additional-paths receive additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv6 NLRI to PFS bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF EBGP Peer Configurationsession-group peer-session ignore-connected-check #Allow loopback peering over ECMP w/o EBGP Multihop egress-engineering #Allocate adj-peer-SID ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1af-group v4-af-peer address-family ipv4 unicast soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 1000 80;Set maximum inbound prefixes, warning at 80% thresholdaf-group v6-af-peer soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 100 80 #Set maximum inbound prefixes, warning at 80% thresholdneighbor-group v4-peer use session-group peer-session dmz-link-bandwidth ;Propagate external link BW address-family ipv4 unicast af-group v4-af-peerneighbor-group v6-peer use session-group peer-session dmz-link-bandwidth address-family ipv6 unicast af-group v6-af-peer neighbor 1.1.1.1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v4-peer address-family ipv4 unicast route-policy v4-peer-in(12345) in route-policy v4-peer-out(12345) out neighbor 2001#dead#b33f#0#1#1#1#1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v6-peer address-family ipv6 unicast route-policy v6-peer-in(12345) in route-policy v6-peer-out(12345) out PFL to PFS IBGP Configurationsession-group pfs-session ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1 update-source Loopback0 #Set BGP session source address to Loopback0 address af-group v4-af-pfs address-family ipv4 unicast next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v4-pfs-in in route-policy v4-pfs-out out af-group v6-af-pfs next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath 
#Store multiple paths if using ECMP to neighbor route-policy v6-pfs-in in route-policy v6-pfs-out out neighbor-group v4-pfs ! use session-group pfs-session address-family ipv4 unicast af-group v4-af-pfsneighbor-group v6-pfs ! use session-group pfs-session address-family ipv6 unicast af-group v6-af-pfs neighbor <PFS IP> description ~PFS #1~ remote-as <local ASN> use neighbor-group v4-pfsNetflow/IPFIX Configurationflow exporter-map nf-export version v9 options interface-table timeout 60 options sampler-table timeout 60 template timeout 30 ! transport udp <port> source Loopback0 destination <dest>flow monitor-map flow-monitor-ipv4 record ipv4 option bgpattr exporter nf-export cache entries 50000 cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-ipv6 record ipv6 option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-mpls record mpls ipv4-ipv6-fields option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10 sampler-map nf-sample-8192 random 1 out-of 8192Peer Interfaceinterface Bundle-Ether100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressPFS Upstream Interfaceinterface HundredGigE0/0/0/100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressModel-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1Abstract Peering ConfigurationAbstract peering uses qualities of Segment Routing anycast addresses toallow a provider to steer traffic to a specific peering fabric by simplyaddressing a node SID assigned to all PFS members of the peeringcluster. All of the qualities of SR such as midpoint ECMP and TI-LFAfast protection are preserved for the end to end BGP path, improvingconvergence across the network to the peering fabric. Additionally,through the use of SR-TE Policy, source routed engineered paths can beconfigured to the peering fabric based on business logic and additionalpath constraints.PFS ConfigurationOnly the PFS nodes require specific configuration to perform abstractpeering. 
Configuration shown is for example only with IS-IS configuredas the IGP carrying SR information. The routing policy setting thenext-hop to the AP anycast SID should be incorporated into standard IBGPoutbound routing policy.interface Loopback1 ipv4 address x.x.x.x/32 ipv6 address x#x#x#x##x/128 router isis <ID> passive address-family ipv4 unicast prefix-sid absolute <Global IPv4 AP Node SID> address-family ipv6 unicast prefix-sid absolute <Global IPv6 AP Node SID> route-policy v4-abstract-ibgp-out set next-hop <Loopback1 IPv4 address> route-policy v6-abstract-ibgp-out set next-hop <Loopback1 IPv6 address> router bgp <ASN> ibgp policy out enforce-modifications ;Enables a PFS node to set a next-hop address on routes reflected to IBGP peersrouter bgp <ASN> neighbor x.x.x.x address-family ipv4 unicast route-policy v4-abstract-ibgp-out neighbor x#x#x#x##x address-family ipv6 unicast route-policy v6-abstract-ibgp-out BGP Flowspec Configuration and OperationBGP Flowspec consists of two different node types. The BGP Flowspec Server is where Flowspec policy is defined and sent to peers via BGP sessions with the BGP Flowspec IPv4 and IPv6 AFI/SAFI enabled. The BGP Flowspec Client receives Flowspec policy information and applies the proper dataplane match and action criteria via dynamic ACLs applied to each routerinterface. By default, IOS-XR applies the dynamic policy to all interfaces, with an interface-level configuration setting used to disable BGP Flowspec on specific interfaces.In the Peering Fabric, PFL nodes will act as Flowspec clients. The PFS nodes may act as Flowspec servers, but will never act as clients.Flowspec policies are typically defined on an external controller to be advertised to the rest of the network. The XRv-9000 virtual router works well in these instances. If one is using an external element to advertise Flowspec policies to the peering fabric, they should be advertised to the PFS nodes which will reflect them to the PFL nodes. In the absence of an external policy injector Flowspec policies can be defined on the Peering Fabric PFS nodes for advertisement to all PFL nodes. IPv6 Flowspec on the NCS5500 requires the use of the following global command, followed by a device reboot. hw-module profile flowspec ipv6-enableEnabling BGP Flowspec Address Families on PFS and PFL NodesFollowing the standard Peering Fabric BGP group definitions the following new groups are augmented. The following configuration assumes the PFS node is the BGP Flowspec server.PFSrouter bgp <ASN>address-family ipv4 flowspec address-family ipv6 flowspec af-group v4-flowspec-af-pfl address-family ipv4 flowspec multipath route-reflector-client next-hop-self af-group v6-flowspec-af-pfl address-family ipv4 flowspec multipath route-reflector-client next-hop-self neighbor-group v4-pfl address-family ipv4 flowspec use af-group v4-flowspec-af-pfl neighbor-group v6-pfl address-family ipv6 flowspec use af-group v6-flowspec-af-pfl PFLrouter bgp <ASN>address-family ipv4 flowspec address-family ipv6 flowspec af-group v4-flowspec-af-pfs address-family ipv4 flowspec multipath af-group v6-flowspec-af-pfs address-family ipv4 flowspec multipath neighbor-group v4-pfs address-family ipv4 flowspec use af-group v4-flowspec-af-pfl neighbor-group v6-pfs address-family ipv6 flowspec use af-group v6-flowspec-af-pfl BGP Flowspec Server Policy DefinitionPolicies are defined using the standard IOS-XR QoS Configuration, the first example below matches the recent memcached DDoS attack and drops all traffic. 
Additional examples are given covering various packet matching criteria and actions.class-map type traffic match-all memcached match destination-port 11211 match protocol udp tcp match destination-address ipv4 10.0.0.0 255.255.255.0 end-class-map!!policy-map type pbr drop-memcached class type traffic memcached drop ! class type traffic class-default ! end-policy-mapclass-map type traffic match-all icmp-echo-flood match protocol icmp match ipv4 icmp type 8 match destination-address ipv4 10.0.0.0 255.255.255.0 end-class-map!!policy-map type pbr limit-icmp-echo class type traffic icmp-echo-flood police rate 100 kbps ! class type traffic class-default ! end-policy-mapclass-map type traffic match-all dns match protocol udp match source port 53 end-class-map!!policy-map type pbr redirect-dns class type traffic dns police rate 100 kbps redirect nexthop 1.1.1.1 redirect nexthop route-target 1000:1 ! class type traffic class-default ! end-policy-mapBGP Flowspec Server EnablementThe following global configuration will enable the Flowspec server and advertise the policy via the BGP Flowspec NLRI.flowspec address-family ipv4 service-policy type pbr drop-memcachedBGP Flowspec Client ConfigurationThe following global configuration enables the BGP Flowspec client function and installation of policies on all local interfaces. Flowspec can be disabled on individual interfaces using the [ipv4|ipv6] flowspec disable command in interface configuration mode.flowspec address-family ipv4 local-install interface-all QPPB Configuration and OperationQoS Policy Propagation using BGP is described in more detail in the Security section.QPPB applies standard QoS policies to packets matching BGP prefix criteria such as BGP community or AS Path. QPPB is supported for both IPv4 and IPv6 address families and packets. QPPB on the NCS5500 supports matching destination prefix attributes only.QPPB configuration starts with a standard RPL route policy that matches BGP attributes and sets a specific QoS group based on that criteria. This routing policy is applied to each address-family as a table-policy in the global BGP configuration. A standard MQC QoS policy is then defined using the specific QoS groups as match criteria to apply additional QoS behavior such as filtering, marking, or policing. This policy is applied to a logical interface, with a specific QPPB command used to enable the propagation of BGP data as part of the dataplane ACL packet match criteria.IPv6 QPPB on the NCS5500 requires the use of the following global command, followed by a device reboot. hw-module profile qos ipv6 short Routing Policy Configurationroute-policy qppb-test if community matches-every (1000:1) then set qos-group 1 endif if community matches-every (1000:2) then set qos-group 2 endif end-policyGlobal BGP Configurationrouter bgp <ASN> address-family ipv4 unicast table-policy qppb-test address-family ipv6 unicast table-policy qppb-test QoS Policy Definitionclass-map match-any qos-group-1 match qos-group 1 end-class-map class-map match-any qos-group-2 match qos-group 2 end-class-map policy-map remark-peer-traffic class qos-group-1 set precedence 5 set mpls experimental imposition 5 ! class qos-group-2 set precedence 3 set mpls experimental imposition 3 ! class class-default !
end-policy-mapInterface-Level Configurationinterface gigabitethernet0/0/0/1 service-policy input remark-peer-traffic ipv4 bgp policy propagation input qos-group destination ipv6 bgp policy propagation input qos-group destination BGP Graceful ShutdownBGP graceful shutdown is an IETF standard mechanism for notifying an IBGP or EBGP peer that the advertising peer will be going offline. Graceful shutdown uses a well-known community, the GSHUT community (65535:0), on each prefix advertised to a peer so the peer can match the community and perform an action to move traffic gracefully away from the peer before it goes down. In the example in the peering design we lower the local preference on the route.Outbound graceful shutdown configurationGraceful shutdown is part of the graceful maintenance configuration within BGP. Graceful maintenance can also perform an AS prepend operation when activated. Sending the GSHUT community is enabled using the send-community-gshut-ebgp command under each address family. Graceful maintenance is enabled using the “activate” keyword in the configuration for the neighbor, neighbor-group, or globally for the BGP process.neighbor 1.1.1.1 graceful-maintenance as-prepends 3 address-family ipv4 unicast send-community-gshut-ebgp ! address-family ipv6 unicast send-community-gshut-ebgp Inbound graceful shutdown configurationInbound prefixes tagged with the GSHUT community should be processed with a local-preference of 0 applied so that if there is another path for traffic it can be utilized prior to the peer going down. The following is a simple example of a community-set and routing policy to perform this. This could also be added to an existing peer routing policy.community-set graceful-shutdown 65535:0 end-set ! route-policy gshut-inbound if community matches-any graceful-shutdown then set local-preference 0 endif pass end-policy Activating graceful shutdownGraceful maintenance can be activated globally or for a specific neighbor/neighbor-group. To enable graceful shutdown use the activate keyword under the “graceful-maintenance” configuration context. Without the “all-neighbors” flag maintenance will only be enabled for peers with their own graceful-maintenance configuration. The activate command is persistent.Global router bgp 100 graceful-maintenance activate [ all-neighbors ] Individual neighbor router bgp 100 neighbor 1.1.1.1 graceful-maintenance activate Peers in specific neighbor-group neighbor-group peer-group graceful-maintenance activate SecurityPeering by definition is at the edge of the network, where security is mandatory. While not exclusive to peering, there are a number of best practices and software features which, when implemented, will protect your own network as well as others from malicious sources within your network.Peering and Internet in a VRFUsing VRFs to isolate peers and the Internet routing table from the infrastructure can enhance security by keeping internal infrastructure components separate from Internet and end user reachability. VRF separation can be done one of three different ways: Separate each peer into its own VRF, use default VRF on SP Network Single VRF for all “Internet” endpoints, including peers Separate each peer into its own VRF, and use a separate “Internet” VRFVRF per Peer, default VRF for InternetIn this method each peer, or group of peers, is configured under a separate VRF. The SP carries these and all other routes via the default VRF in IOS-XR, commonly known as the Global Routing Table.
The VPNv4 and VPNv6 address families are NOT configured on the BGP peering sessions between the PFL and PFS nodes and the PFS nodes and the rest of the network. IOS-XR provides the command import from default-vrf and export to default-vrf with a route-policy to match specific routes to be imported to/from each peer VRF to the default VRF. This provides dataplane isolation between peers and another mechanism to determine which SP routes are advertised to each peer.Internet in a VRF OnlyIn this method all Internet endpoints are configured in the same “Internet” VRF. The security benefit is removing dataplane connectivity between the global Internet and your underlying infrastructure, which is using the default VRF for all internal connectivity. This method uses the VPNv4/VPNv6 address families on all BGP peers and requires the Internet VRF be configured on all peering fabric nodes as well as SP PEs participating in the global routing table. If there are VPN customers or public-facing services in their own VRF needing Internet access, routes can be imported/exported from the Internet VRF on the PE devices they attach to.VRF per Peer, Internet in a VRFThis method combines the properties and configuration of the previous two methods for a solution with dataplane isolation per peer and separation of all public Internet traffic from the SP infrastructure layer. The exchange of routes between the peer VRFs and Internet VRF takes place on the PFL nodes with the rest of the network operating the same as the Internet in a VRF use case.The VPNv4 and VPNv6 address families must be configured across all routers in the network.Infrastructure ACLsInfrastructure ACLs and their associated ACEs (Access Control Entries) are the perimeter protection for a network. The recommended PFL device configuration uses IPv4 and IPv6 infrastructure ACLs on all edge interfaces. These ACLs are specific to each provider’s security needs, but should include the following sections. Filter IPv4 and IPv6 BOGON space ingress and egress Drop ingress packets with a source address matching your own aggregate IPv4/IPv6 prefixes. Rate-limit ingress traffic to Unix services typically used in DDoS attacks, such as chargen (TCP/19). On ingress and egress, allow specific ICMP types and rate-limit to appropriate values, filter out ones not needed on your network. ICMP ttl-exceeded, host unreachable, port unreachable, echo-reply, echo-request, and fragmentation needed should always be allowed in some capacity.BCP ImplementationBest Current Practices are informational documents published by the IETF to give guidelines on operational practices. This document will not outline the contents of the recommended BCPs, but two in particular are of interest to Internet peering. BCP38 explains the need to filter unused address space at the edges of the network, minimizing the chances of spoofed traffic from DDoS sources reaching their intended target. BCP38 is applicable for ingress traffic and especially egress traffic, as it stops spoofed traffic before it reaches outside your network. BCP194, BGP Operations and Security, covers a number of BGP operational practices, many of which are used in Internet peering. IOS-XR supports all of the mechanisms recommended in BCP38, BCP84, and BCP194, including software features such as GTTL, BGP dampening, and prefix limits.BGP Attribute and CoS ScrubbingScrubbing of data on ingress and egress of your network is an important security measure. Scrubbing falls into two categories, control-plane and dataplane.
The control-plane for Internet peering is BGP and there are a few BGP transitive attributes one should take care to normalize. Your internal BGP communities should be deleted from outbound BGP NLRI via egress policy. Most often you are setting communities on inbound prefixes; make sure you are replacing existing communities from the peer and not adding communities. Unless you have an agreement with the peer, normalize the MED attribute to zero or another standard value on all inbound prefixes.

In the dataplane, it's important to treat the peering edge as untrusted and clear any CoS markings on inbound packets, assuming a prior agreement hasn't been reached with the peer to carry them across the network boundary. It's an overlooked aspect which could lead to peer traffic being prioritized on your network, leading to unexpected network behavior. An example PFL infrastructure ACL is given resetting incoming IPv4/IPv6 DSCP values to 0.

BGP Control-Plane

Type 6 Encryption Configuration
Type 6 encryption provides stronger on-box storage of control-plane secrets than legacy methods, which use relatively weak encryption. Type 6 encryption uses the onboard Trust Anchor Module, or TAM, to store the encrypted key outside of the device configuration, meaning simply having access to the config does not expose the keys used in control-plane protocol security.

Create key (exec mode, not config mode)
key config-key password-encryption (enter key)
password6 encryption aes

Key chain configuration
At the "key-string" command simply enter the unencrypted string. If Type 6 encryption is enabled, the key will automatically use the "password6" encryption type.

key chain bgp_type6
 key 1
  accept-lifetime 01:00:00 october 24 2005 infinite
  key-string password6 634d695d4848565e5a5d49604741465566496568575046455a6265414142
  send-lifetime 01:00:00 october 24 2005 infinite
  cryptographic-algorithm HMAC-MD5

TCP Authentication Option, MD5 Deprecation
TCP Authentication Option, commonly known as TCP-AO, is a modern way to authenticate TCP sessions. TCP-AO is defined in RFC 5925. TCP-AO replaces MD5 authentication, which has been deprecated for a number of years due to its weak security. TCP-AO does NOT encrypt BGP session traffic; it authenticates the TCP header to ensure the neighbor is the correct sender. TCP-AO should be used along with Type 6 encryption to best secure BGP sessions.

tcp ao
 keychain TCP-AO-KEY
  key 1 SendID 100 ReceiveID 100
 !
!
key chain TCP-AO-KEY
 key 1
  accept-lifetime 00:00:00 january 01 2018 infinite
  key-string password6 5d574a574d5b6657555c534c62485b51584b57655351495352564f55575060525a60504b
  send-lifetime 00:00:00 january 01 2018 infinite
  cryptographic-algorithm AES-128-CMAC-96
 !
!

BGP Neighbor Configuration
router bgp 100
 neighbor 1.2.3.4
  remote-as 101
  ao TCP-AO-KEY include-tcp-options enable

Per-Peer Control Plane Policers
BGP protocol packets are handled at the RP level, meaning each packet is handled by the router CPU with limited bandwidth and processing resources. In the case of a malicious or misconfigured peer this could exhaust the processing power of the CPU, impacting other important tasks. IOS-XR enforces protocol policers and BGP peer policers by default.

BGP Prefix Security

RPKI Origin Validation
Prefix hijacking has been prevalent throughout the last decade as the Internet became more integrated into our lives. This led to the creation of RPKI origin validation, a mechanism to validate a prefix was being originated by its rightful owner by checking the originating ASN against a secure database.
IOS-XR fully supports RPKI for origin validation.

BGP RPKI and ROV Configuration
The following section outlines an example configuration for RPKI and Route Origin Validation (ROV) within IOS-XR.

Create ROV Routing Policies
In order to apply specific attributes to routes tagged with an ROV status, one must use a routing policy. The "invalid", "valid", and "unconfigured" states can be matched upon and then used to set specific BGP attributes as well as accept or drop the route. In the following example a route's local-preference attribute is set based on ROV status.

route-policy rpki
 if validation-state is invalid then
  set local-preference 50
 endif
 if validation-state is not-found then
  set local-preference 75
 endif
 if validation-state is valid then
  set local-preference 100
 endif
 pass
end-policy

Configure RPKI Server and ROV Options
An RPKI server is defined using the "rpki server" section under the global BGP hierarchy. Also configurable is whether or not the ROV status is taken into account as part of the BGP best path selection process. A route with a "valid" status is preferred over a route with a "not-found" or "invalid" status. There is also a configuration option for whether or not to allow invalid routes at all as part of the selection process. It is recommended to include the following configuration:

router bgp 65536
 bgp router-id 192.168.0.1
 rpki server 172.16.0.254
  transport tcp port 32000
  refresh-time 120
 bgp bestpath origin-as use validity
 bgp bestpath origin-as allow invalid

Enabling RPKI ROV on BGP Neighbors
ROV is done at the global BGP level, but the treatment of routes is done at the neighbor level. This requires applying the pre-defined ROV route-policy to the neighbors you wish to apply policy to based on ROV status.

neighbor 192.168.0.254
 remote-as 64555
 address-family ipv4 unicast
  route-policy rpki in

Communicating ROV Status via Well-Known BGP Community
RPKI ROV is typically only done on the edges of the network, and in IOS-XR is only done on EBGP sessions. In a network with multiple ASNs under the same administrative control, one should configure the following to signal ROV validation status via a well-known community to peers within the same administrative domain. This way only the nodes connected to external peers have RTR sessions to the RPKI ROV validators and are responsible for applying ROV policy, adding efficiency to the process and reducing load on the validator.

address-family ipv4 unicast
 bgp origin-as validation signal ibgp

BGPSEC (Reference Only)
RPKI origin validation works to validate the source of a prefix, but does not validate the entire path of the prefix. Origin validation also does not use cryptographic signatures to ensure the originator is who they say they are, so spoofing the ASN as well does not stop someone from hijacking a prefix. BGPSEC is an evolution where a BGP prefix is cryptographically signed with the key of its valid originator, and each BGP router receiving the path checks to ensure the prefix originated from the valid owner. BGPSEC standards are being worked on in the SIDR working group. Cisco continues to monitor the standards related to BGPSEC and similar technologies to determine which to implement to best serve our customers.

DDoS traffic steering using SR-TE
See the overview design section for more details. This shows the configuration of a single SR-TE Policy which will balance traffic to two different egress DDoS "dirty" interfaces. If a BGP session is enabled between the DDoS mitigation appliance and the router, an EPE label can be assigned to the interface.
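For reference, a BGP EPE peer SID is typically enabled by adding egress-engineering to the BGP session facing the mitigation appliance; the sketch below is only a hypothetical illustration (the neighbor address, AS numbers, and policy names are placeholders, not values from this design):

router bgp 100
 neighbor 192.168.100.2
  remote-as 65001
  egress-engineering
  address-family ipv4 unicast
   route-policy ddos-in in
   route-policy ddos-out out

The allocated EPE label could then be referenced as the final SID in the SR-TE Policy segment lists shown below.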
In the absence of EPE, an MPLS static LSP can be created on the core-facing interfaces on the egress node, with the action set to "pop" towards the DDoS mitigation interface.

SR-TE Policy configuration
In this example the node SID is 16441. The EPE or manual xconnect SIDs for the specific egress interfaces are 28000 and 28001. The weight of each path is 100, so traffic will be equally balanced across the paths.

segment-routing
 traffic-eng
  segment-list pr1-ddos-1
   index 1 mpls label 16441
   index 2 mpls label 28000
  segment-list pr1-ddos-2
   index 1 mpls label 16441
   index 2 mpls label 28001
  policy pr1_ddos1_epe
   color 999 end-point ipv4 192.168.14.4
   candidate-paths
    preference 100
     explicit segment-list pr1-ddos-1
      weight 100
     !
     explicit segment-list pr1-ddos-2
      weight 100

Egress node BGP configuration
On the egress BGP node, 192.168.14.4, prefixes are set with a specific "DDoS" color to enable the ingress node to steer traffic into the correct SR Policy. An example is given of injecting the 50.50.50.50/32 route with the "DDoS" color of 999.

extcommunity-set opaque DDOS
 999
end-set
!
route-policy SET-DDOS-COLOR
 set extcommunity color DDOS
 pass
end-policy
!
router static
 address-family ipv4 unicast
  50.50.50.50/32 null0
 !
!
router bgp 100
 address-family ipv4 unicast
  network 50.50.50.50/32 route-policy SET-DDOS-COLOR
 !
!

Egress node MPLS static LSP configuration
If EPE is not being utilized, the last label in the SR Policy path must be matched to a static LSP. The ingress label on the egress node is used to map traffic to a specific IP next-hop and interface. We will give an example using the label 28000 in the SR Policy path. The core-facing ingress interface is HundredGigE0/0/0/1, the egress DDoS "dirty" interface is TenGigE0/0/0/1 with a NH address of 192.168.100.1.

mpls static
 interface HundredGigE0/0/0/1
 lsp ddos-interface-1
  in-label 28000 allocate
  forward
   path 1 nexthop TenGigE0/0/0/1 192.168.100.1 out-label pop
 !
!

Appendix

Applicable YANG Models

Model – Data
openconfig-interfaces, Cisco-IOS-XR-infra-statsd-oper, Cisco-IOS-XR-pfi-im-cmd-oper – Interface config and state; common counters found in SNMP IF-MIB
openconfig-if-ethernet, Cisco-IOS-XR-drivers-media-eth-oper – Ethernet layer config and state; XR native transceiver monitoring
openconfig-platform – Inventory, transceiver monitoring
openconfig-bgp, Cisco-IOS-XR-ipv4-bgp-oper, Cisco-IOS-XR-ipv6-bgp-oper – BGP config and state; includes neighbor session state, message counts, etc.
openconfig-bgp-rib, Cisco-IOS-XR-ip-rib-ipv4-oper, Cisco-IOS-XR-ip-rib-ipv6-oper – BGP RIB information. Note: Cisco native includes all protocols
openconfig-routing-policy – Configure routing policy elements and combined policy
openconfig-telemetry – Configure telemetry sensors and destinations
Cisco-IOS-XR-ip-bfd-cfg, Cisco-IOS-XR-ip-bfd-oper – BFD config and state
Cisco-IOS-XR-ethernet-lldp-cfg, Cisco-IOS-XR-ethernet-lldp-oper – LLDP config and state
openconfig-mpls – MPLS config and state, including Segment Routing
Cisco-IOS-XR-clns-isis-cfg, Cisco-IOS-XR-clns-isis-oper – IS-IS config and state
Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper – NCS 5500 HW resources

NETCONF YANG Paths
Note that while paths are given to retrieve data from a specific leaf node, it is sometimes more efficient to retrieve all the data under a specific heading and let a management station filter unwanted data than perform operations on the router. Additionally, Model Driven Telemetry may not work at a leaf level, requiring retrieval of an entire subset of data. The data is also available via NETCONF, which does allow subtree filters and retrieval of specific data.
However, this is a more resourceintensive operation on the router. Metric Data Logical Interface Admin State Enum SNMP OID IF-MIB#ifAdminStatus OC YANG openconfig-interfaces#interfaces/interface/state/admin-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Interface Operational State Enum SNMP OID IF-MIB#ifOperStatus OC YANG openconfig-interfaces#interfaces/interface/state/oper-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Last State Change (seconds) Counter SNMP OID IF-MIB#ifLastChange OC YANG openconfig-interfaces#interfaces/interface/state/last-change Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/last-state-transition-time     Logical Interface SNMP ifIndex Integer SNMP OID IF-MIB#ifIndex OC YANG openconfig-interfaces#interfaces/interface/state/if-index Native YANG Cisco-IOS-XR-snmp-agent-oper#snmp/interface-indexes/if-index     Logical Interface RX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCInOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-received     Logical Interface TX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCOutOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-sent     Logical Interface RX Errors Counter SNMP OID IF-MIB#ifInErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-errors MDT Native     Logical Interface TX Errors Counter SNMP OID IF-MIB#ifOutErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-errors     Logical Interface Unicast Packets RX Counter SNMP OID IF-MIB#ifHCInUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Unicast Packets TX Counter SNMP OID IF-MIB#ifHCOutUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Input Drops Counter SNMP OID IF-MIB#ifIntDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-drops     Logical Interface Output Drops Counter SNMP OID IF-MIB#ifOutDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-drops     Ethernet Layer Stats – All Interfaces Counters SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics     Ethernet PHY State – All Interfaces Counters SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver Native YANG 
Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info     Ethernet Input CRC Errors Counter SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state/oc-eth#counters/oc-eth#in-crc-errors Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics/statistic/dropped-packets-with-crc-align-errors The following transceiver paths retrieve the total power for thetransceiver, there are specific per-lane power levels which can beretrieved from both native and OC models, please refer to the model YANGfile for additionalinformation.     Ethernet Transceiver RX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-rx-power     Ethernet Transceiver TX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-tx-power BGP Operational StateGlobal BGP Protocol StateIOS-XR native models do not store route information in the BGP Opermodel, they are stored in the IPv4/IPv6 RIB models. These models containRIB information based on protocol, with a numeric identifier for eachprotocol with the BGP ProtoID being 5. The protoid must be specified orthe YANG path will return data for all configured routingprotocols.     BGP Total Paths (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-paths Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/num-active-paths MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native BGP Neighbor StateExample UsageDue the construction of the YANG model, the neighbor-address key must beincluded as a container in all OC BGP state RPCs. 
The following RPC getsthe session state for all configured peers#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address/> <state> <session-state/> </state> </neighbor> </neighbors> </bgp> </filter> </get></rpc>\t<nc#rpc-reply message-id=~urn#uuid#24db986f-de34-4c97-9b2f-ac99ab2501e3~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address>172.16.0.2</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> </neighbors> </bgp> </nc#data></nc#rpc-reply>     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Session State for all BGP neighbors Enum SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/session-state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/connection-state     Message counters for all BGP neighbors Counter SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/messages Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/message-statistics Current queue depth for all BGP neighborsCounterSNMP OIDNAOC YANG/openconfig-bgp#bgp/neighbors/neighbor/state/queuesNative YANGCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-outCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-inBGP RIB DataRIB data is retrieved per AFI/SAFI. To retrieve IPv6 unicast routesusing OC models, replace “ipv4-unicast” with “ipv6-unicast”IOS-XR native models do not have a BGP specific RIB, only RIB dataper-AFI/SAFI for all protocols. 
Retrieving RIB information from thesepaths will include this data.While this data is available via both NETCONF and MDT, it is recommendedto use BMP as the mechanism to retrieve RIB table data.Example UsageThe following retrieves a list of best-path IPv4 prefixes withoutattributes from the loc-RIB#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <loc-rib> <routes> <route> <prefix/> <best-path>true</best-path> </route> </routes> </loc-rib> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc>     IPv4 Local RIB – Prefix Count Counter OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/num-routes Native YANG       IPv4 Local RIB – IPv4 Prefixes w/o Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes/route/prefix     IPv4 Local RIB – IPv4 Prefixes w/Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes Native YANG   The following per-neighbor RIB paths can be qualified with a specificneighbor address to retrieve RIB data for a specific peer. Below is anexample of a NETCONF RPC to retrieve the number of post-policy routesfrom the 192.168.2.51 peer and the returned output.<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes/> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc><nc#rpc-reply message-id=~urn#uuid#7d9a0468-4d8d-4008-972b-8e703241a8e9~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <afi-safi-name xmlns#idx=~http#//openconfig.net/yang/rib/bgp-types~>idx#IPV4_UNICAST</afi-safi-name> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes>3</num-routes> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </nc#data></nc#rpc-reply>     IPv4 Neighbor adj-rib-in pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-re     IPv4 Neighbor adj-rib-in post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-post     IPv4 Neighbor adj-rib-out pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre     IPv4 Neighbor adj-rib-out post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre BGP Flowspec     BGP Flowspec Operational State Counters SNMP OID NA OC YANG NA Native YANG Cisco-IOS-XR-flowspec-oper MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native Device Resource YANG Paths     Device Inventory List OC YANG oc-platform#components     NCS5500 Dataplane Resources List OC YANG NA Native YANG 
Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Validated Model-Driven Telemetry Sensor PathsThe following represents a list of validated sensor paths useful formonitoring the Peering Fabric and the data which can be gathered byconfiguring these sensorpaths.Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform openconfig-platform#components cisco-ios-xr-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data cisco-ios-xr-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info cisco-ios-xr-shellutil-oper#system-time/uptime cisco-ios-xr-wdsysmon-fd-oper#system-monitoring/cpu-utilizationLLDP MonitoringCisco-IOS-XR-ethernet-lldp-oper#lldpCisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighborsInterface statistics and stateopenconfig-interfaces#interfacesCisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-countersCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interfaceCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statisticsCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics/basic-interface-statsThe following sub-paths can be used but it is recommended to use the base openconfig-interfaces modelopenconfig-interfaces#interfaces/interfaceopenconfig-interfaces#interfaces/interface/stateopenconfig-interfaces#interfaces/interface/state/countersopenconfig-interfaces#interfaces/interface/subinterfaces/subinterface/state/countersAggregate bundle information (use interface models for interface counters)sensor-group openconfig-if-aggregate#aggregatesensor-group openconfig-if-aggregate#aggregate/statesensor-group openconfig-lacp#lacpsensor-group Cisco-IOS-XR-bundlemgr-oper#bundlessensor-group Cisco-IOS-XR-bundlemgr-oper#bundle-information/bfd-countersBGP Peering informationsensor-path openconfig-bgp#bgpsensor-path openconfig-bgp#bgp/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticsIS-IS IGP informationsensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighborssensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfacessensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/adjacenciesIt is not 
recommended to monitor complete RIB tables using MDT but can be used for troubleshootingCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countQoS and ACL monitoringopenconfig-acl#aclCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/general-statsCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/queue-stats-arrayBGP RIB informationIt is not recommended to monitor these paths using MDT with large tablesopenconfig-rib-bgp#bgp-ribCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intRouting policy InformationCisco-IOS-XR-policy-repository-oper#routing-policy/policies", "url": "/blogs/2020-10-01-peering-fabric-hld_3_5/", "author": "Phil Bedard", "tags": "iosxr, design, peering, ddos, ixp" } , "blogs-2021-01-20-converged-sdn-transport-4-0-hld": { "title": "Converged SDN Transport High Level Design v4.0", "content": " On This Page Revision History Minimum supported IOS-XR Release Minimum supported IOS-XE Release Value Proposition Summary Technical Overview Hardware Components in Design ASR 9000 NCS-560 NCS 5504, 5508, 5516 Modular Chassis NCS-5501, NCS-5501-SE, and N540-ACC-SYS NCS-55A2-MOD ASR 920 NCS 520 Transport – Design Components Network Domain Structure Topology options and PE placement - Inline and non-inline PE Connectivity using 100G/200G coherent optics w/MACSec Ring deployment without multiplexers Ring deployment with multiplexer Unnumbered Interface Support Intra-Domain Intra-Domain Routing and Forwarding Intra-Domain Forwarding - Fast Re-Route using TI-LFA Inter-Domain Inter-Domain Forwarding Area Border Routers – Prefix-SID vs Anycast-SID Inter-Domain Forwarding - High Availability and Fast Re-Route Inter-Domain Open Ring Support Transport Programmability Traffic Engineering (Tactical Steering) – SR-TE Policy Traffic Engineering - Dynamic Anycast-SID Paths and Black Hole Avoidance Transport Controller Path Computation Engine (PCE) Segment Routing Path Computation Element (SR-PCE) PCE Controller Summary – SR-PCE Converged SDN Transport Path Computation Workflows Static SR-TE Policy Configuration On-Demand Next-Hop Driven Configuration Segment Routing Flexible Algorithms (Flex-Algo) Flex-Algo Node SID Assignment Flex-Algo IGP Definition Path Computation across SR 
Flex-Algo Network Flex-Algo Dual-Plane Example Segment Routing and Unified MPLS (BGP-LU) Co-existence Summary ABR BGP-LU design Quality of Service and Assurance Overview NCS 540, 560, and 5500 QoS Primer Hierarchical Edge QoS H-QoS platform support CST Core QoS mapping with five classes Example Core QoS Class and Policy Maps Class maps for ingress header matching Class maps for egress queuing and marking policies Egress QoS queuing policy Egress QoS marking policy Converged SDN Transport Use Cases 5G Mobile Networks Summary and 5G Service Types Key Validated Components End to End Timing Validation Low latency SR-TE path computation Dynamic Link Performance Measurement SR Policy latency constraint configuration on configured policy SR Policy latency constraint configuration for ODN policies Dynamic link delay metric configuration Static defined link delay metric TE metric definition SR Policy one-way delay measurement Segment Routing Flexible Algorithms for 5G Slicing End to end network QoS with H-QoS on Access PE CST QoS mapping with 5 classes FTTH Design using EVPN E-Tree Summary E-Tree Diagram E-Tree Operation Split-Horizon Groups L3 IRB Support Multicast Traffic Ease of Configuration Cable Converged Interconnect Network (CIN) Summary Distributed Access Architecture Remote PHY Components and Requirements Remote PHY Device (RPD) RPD Network Connections Cisco cBR-8 and cnBR cBR-8 Network Connections cBR-8 Redundancy Remote PHY Communication DHCP Remote PHY Standard Flows GCP UEPI and DEPI L2TPv3 Tunnels CIN Network Requirements IPv4/IPv6 Unicast and Multicast Network Timing CST 4.0+ Update to CIN Timing Design QoS DHCPv4 and DHCPv6 Relay Converged SDN Transport CIN Design Deployment Topology Options High Scale Design (Recommended) Collapsed Digital PIC and SUP Uplink Connectivity Collapsed RPD and cBR-8 DPIC Connectivity Cisco Hardware Scalable L3 Routed Design L3 IP Routing CIN Router to Router Interconnection Leaf Transit Traffic cBR-8 DPIC to CIN Interconnection DPIC Interface Configuration Router Interface Configuration RPD to Router Interconnection Native IP or L3VPN/mVPN Deployment SR-TE CIN Quality of Service (QoS) CST Network Traffic Classification CST and Remote-PHY Load Balancing SmartPHY RPD Automation 4G Transport and Services Modernization L3 IP Multicast and mVPN LDP Auto-configuration LDP mLDP-only Session Capability (RFC 7473) LDP Unicast FEC Filtering for SR Unicast with mLDP Multicast L3 Multicast using Segment Routing TreeSID w/Static S,G Mapping TreeSID Diagram TreeSID Overview EVPN Multicast LDP to Converged SDN Transport Migration Towards Converged SDN Transport Design Segment Routing Enablement Segment Routing Mapping Server Design Automation Zero Touch Provisioning Model-Driven Telemetry Network Services Orchestrator (NSO) Converged SDN Transport Supported Service Models Base Services supporting Advanced Use Cases Overview Ethernet VPN (EVPN) Ethernet VPN Hardware Support Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or with Data Center End-To-End (Flat) – Services Hierarchical – Services Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHE Services – Route-Reflector (S-RR) Ethernet Services OAM using Ethernet CFM Transport and Services Integration The Converged SDN Transport Design Transport Transport Programmability Services Transport and Services 
Integration The Converged SDN Transport Design - Summary Revision History Version Date Comments 1.0 05/08/2018 Initial Converged SDN Transport publication 1.5 09/24/2018 NCS540 Access, ZTP, NSO Services 2.0 4/1/2019 Non-inline PE Topology, NCS-55A2-MOD, IPv4/IPv6/mLDP Multicast, LDP to SR Migration 3.0 1/20/2020 Converged Transport for Cable CIN, Multi-domain Multicast, Qos w/H-QoS access, MACSEC, Coherent Optic connectivity 3.5 10/15/2020 Unnumbered access rings, Anycast SID ABR Resiliency, E-Tree for FTTH deployments, SR Multicast using Tree-SID, NCS 560, SmartPHY for R-PHY, Performance Measurement 4.0 2/1/2020 SR Flexible Algorithms inc. Inter-Domain, PTP multi-profile inc. G.82751<>G.8275.2 interworking, G.8275.2 on BVI, ODN support for EVPN ELAN, TI-LFA Open Ring support, NCS 520, SR on cBR8 Minimum supported IOS-XR Release CST Version XR version 1.0 6.3.2 1.5 6.5.1 2.0 6.5.3 3.0 6.6.3 3.5 7.1.2 4.0 7.2.2 on NCS, 7.1.3 on ASR9K Minimum supported IOS-XE Release CST Version XR version 4.0 16.12.03 on NCS 520, ASR920; 17.03.01w Value PropositionService Providers are facing the challenge to provide next generationservices that can quickly adapt to market needs. New paradigms such as5G introduction, video traffic continuous growth, IoT proliferation andcloud services model require unprecedented flexibility, elasticity andscale from the network. Increasing bandwidth demands and decreasing ARPUput pressure on reducing network cost. At the same time, services needto be deployed faster and more cost effectively to stay competitive.Metro Access and Aggregation solutions have evolved from nativeEthernet/Layer 2 based, to Unified MPLS to address the above challenges.The Unified MPLS architecture provides a single converged networkinfrastructure with a common operational model. It has great advantagesin terms of network convergence, high scalability, high availability,and optimized forwarding. 
However, that architectural model is stillquite challenging to manage, especially on large-scale networks, becauseof the large number of distributed network protocols involved whichincreases operational complexity.Converged SDN Transport design introduces an SDN-ready architecturewhich evolves traditional Metro network design towards an SDN enabled,programmable network capable of delivering all services (Residential,Business, 4G/5G Mobile Backhaul, Video, IoT) on the premise ofsimplicity, full programmability, and cloud integration, with guaranteedservice level agreements (SLAs).The Converged SDN Transport design brings tremendous value to ServiceProviders# Fast service deployment and rapid time to market throughfully automated service provisioning and end-to-end networkprogrammability Operational simplicity with less protocols to operate and manage Smooth migration towards an SDN-ready architecture thanks tobackward-compatibility with existing network protocols and services Next generation service creation leveraging guaranteed SLAs Enhanced and optimized operations using telemetry/analytics inconjunction with automation tools The Converged SDN Transport design is targeted at Service Providercustomers who# Want to evolve their existing Unified MPLS Network Are looking for an SDN ready solution Need a simple, scalable design that can support future growth Want a future proof architecture built using industry-leading technology SummaryThe Converged SDN Transport design satisfies the following criteria for scalable next-generation networks# Simple# based on Segment Routing as unified forwarding plane andEVPN and L3VPN as a common BGP based services control plane Programmable# Using SR-PCE to program end-to-end multi-domain paths across thenetwork with guaranteed SLAs Automated # Service provisioning is fully automated using NSOand YANG models; Analytics with model driven telemetry inconjunction with Crosswork Network Insights toenhance operations and network visibility Technical OverviewThe Converged SDN Transport design evolves from the successful CiscoEvolved Programmable Network (EPN) 5.0 architecture framework, to bringgreater programmability and automation.In the Converged SDN Transport design, the transport and service are builton-demand when the customer service is requested. The end-to-endinter-domain network path is programmed through controllers and selectedbased on the customer SLA, such as the need for a low latency path.The Converged SDN Transport is made of the following main buildingblocks# IOS-XR as a common Operating System proven in Service ProviderNetworks Transport Layer based on Segment Routing as UnifiedForwarding Plane SDN - Segment Routing Path Computation Element (SR-PCE) as Cisco Path ComputationEngine (PCE) coupled with Segment Routing to provide simple andscalable inter-domain transport connectivity, TrafficEngineering, and advanced Path control with constraints Service Layer for Layer 2 (EVPN) and Layer 3 VPN services basedon BGP as Unified Control Plane Automation and Analytics NSO for service provisioning Netconf/YANG data models Telemetry to enhance and simplify operations Zero Touch Provisioning and Deployment (ZTP/ZTD) Hardware Components in DesignASR 9000The ASR 9000 is the router of choice for high scale edge services. The Converged SDN Transport utilizes the ASR 9000 in a PE function role, performing high scale L2VPN, L3VPN, and Pseudowire headend termination. 
All testing up to 3.0 has been performed using Tomahawk series line cards on the ASR 9000.NCS-560The NCS-560 with RSP4 is a next-generation platform with high scale and modularity to fit in many access, pre-aggregation, and aggregation roles. Available in 4-slot and 7-slot versions, the NCS 560 is fully redundant with a variety of 40GE/100GE, 10GE, and 1GE modular adapters. The NCS 560 RSP4 has built-in GNSS timing support along with a high scale (-E) version to support full Internet routing tables or large VPN routing tables with room to spare for 5+ years of growth. The NCS 560 provides all of this with a very low power and space footprint with a depth of 9.5”.NCS 5504, 5508, 5516 Modular ChassisThe modular chassis version of the NCS 5500 is available in 4, 8, and 16 slot versions for flexible interfaces at high scale with dual RP modules. A variety of line cards are available with 10G, 40G, 100G, and 400G interface support. The NCS 5500 fully supports timing distribution for applications needing high accuracy clocks like mobile backhaul.NCS-5501, NCS-5501-SE, and N540-ACC-SYSThe NCS 5501, 5501-SE, and 540 hardware is validated in both an access and aggregation role in the Converged SDN Transport. The 5501 has 48x1G/10G SFP+ and 6x100G QSFP28 interfaces, the SE adds higher route scale via an external TCAM. The N540-ACC-SYS is a next-generation access node with 24x10G SFP+, 8x25G SFP28, and 2x100G QSFP28 interfaces. The NCS540 is available in extended temperature with a conformal coating for deployment deep into access networks.NCS-55A2-MODThe Converged SDN Transport design now supports the NCS-55A2-MOD access and aggregation router. The 55A2-MOD is a modular 2RU router with 24 1G/10G SFP+, 16 1G/10G/25G SFP28 onboard interfaces, and two modular slots capable of 400G of throughput per slot using Cisco NCS Modular Port Adapters or MPAs. MPAs add additional 1G/10G SFP+, 100G QSFP28, or 100G/200G CFP2 interfaces. The 55A2-MOD is available in an extended temperature version with a conformal coating as well as a high scale configuration (NCS-55A2-MOD-SE-S) scaling to millions of IPv4 and IPv6 routes.ASR 920The IOS-XE based ASR 920 is tested within the Converged SDN Transport as an access node. The Segment Routing data plane and supported service types are validated on the ASR 920 within the CST design. Please see the services support section for all service types supported on the ASR 920. NCS 520The IOS-XE based NCS 520 acts as an Ethernet demarcation device (NID) or carrier Ethernet switch in the Converged SDN Transport design. The MEF 3.0 certified device acts as a customer equipment termination point where QoS, OAM (Y.1731,802.3ah), and service validation/testing using Y.1564 can be performed. The NCS 520 is available in a variety of models covering different port requirements including industrial temp and conformal coated models for harsher environments.Transport – Design ComponentsNetwork Domain StructureTo provide unlimited network scale, the Converged SDN Transport isstructured into multiple IGP Domains# Access, Aggregation, and Core. However as we will illustrate in the next section, the number of domains is completely flexible based on provider need.Refer to the network topology in Figure 1.Figure 1# High scale fully distributedThe network diagram in Figure 2 shows how a Service Provider network canbe simplified by decreasing the number of IGP domains. 
In this scenario the Core domain is extended over the Aggregation domain, thus increasing the number of nodes in the Core.

Figure 2: Distributed with expanded access

A similar approach is shown in Figure 3. In this scenario the Core domain remains unaltered and the Access domain is extended over the Aggregation domain, thus increasing the number of nodes in the Access domain.

Figure 3: Distributed with expanded core

The Converged SDN Transport design supports all three network options, while remaining easily customizable. The first phase of the Converged SDN Transport, discussed later in this document, will cover in depth the scenario described in Figure 3.

Topology options and PE placement - Inline and non-inline PE
The non-inline PE topology, shown in the figure below, moves the services edge PE device out of the forwarding path between the access/aggregation networks and the core. There are several factors which can drive providers to this design vs. one with an in-line PE, some of which are outlined in the table below. The control-plane configuration of the Converged SDN Transport does not change; all existing ABR configuration remains the same, but the device no longer acts as a high-scale PE.

Figure: Non-Inline Aggregation Topology

Connectivity using 100G/200G coherent optics w/MACSec
In Converged SDN Transport 3.0 we add support for the use of pluggable CFP2-DCO transceivers to enable high speed aggregation and access network infrastructure. As endpoint bandwidth increases due to technology innovation such as 5G and Remote PHY, access and aggregation networks must grow from 1G and 10G to 100G and beyond. Coherent router optics simplify this evolution by allowing an upgrade path to increase ring bandwidth up to 400Gbps without deploying costly DWDM optical line systems.

MACSec is an industry standard protocol running at L2 to provide encryption across Ethernet links. In CST 3.0 MACSec is enabled across CFP2-DCO access to aggregation links. MACSec support is hardware dependent; please consult individual hardware data sheets for MACSec support.

Ring deployment without multiplexers
In the simplest deployment access rings are deployed over dark fiber, enabling plug and play operation up to 80km without amplification.

CFP2-DCO DWDM ring deployment

Ring deployment with multiplexer
In this option the nodes are deployed with active or passive multiplexers to maximize fiber utilization for rings needing more bandwidth per ring site. While this example shows each site on the ring having direct DWDM links back to the aggregation nodes, a hybrid approach could also be supported, targeting only high-bandwidth locations with direct links while leaving other sites on an aggregation ring.

CFP2-DCO DWDM hub and spoke or partial mesh deployment

The Cisco NCS 55A2-MOD and 55A2-MOD-SE hardened modular platform has a mix of fixed SFP+ and SFP28 ports along with two MPA slots. The coherent aggregation and access solution can utilize either the 2xCFP2-DCO MPA or the 2xQSFP28+1xCFP2-DCO MPA. The same MPA modules can be used in the 5504, 5508, and 5516 chassis using the NC55-MOD-A-S and NC55-MOD-A-SE line cards, with 12xSFP+ and 2xQSFP+ ports. The NCS 560 also now supports a CFP2-DCO line card to support using DWDM links with the NCS 560.

Cisco 55A2 modular hardened router

Cisco NCS 5500 chassis modular line card

Unnumbered Interface Support
In CST 3.5, starting at IOS-XR 7.1.1 we have added support for unnumbered interfaces.
Using unnumbered interfaces in the network eases the burden of deploying nodes by not requiring specific IPv4 or IPv6 interface addresses between adjacent nodes. When inserting a new node into an existing access ring the provider only needs to configure each interface to use a Loopback address on the East and West interfaces of the nodes. IGP adjacencies will be formed over the unnumbered interfaces. IS-IS and Segment Routing/SR-TE as utilized in the Converged SDN Transport design support using unnumbered interfaces. SR-PCE, used to compute inter-domain SR-TE paths, also supports the use of unnumbered interfaces. In the topology database each interface is uniquely identified by a combination of router ID and SNMP IfIndex value.

Unnumbered node insertion

Unnumbered interface configuration:

interface TenGigE0/0/0/2
 description to-AG2
 mtu 9216
 ptp
  profile My-Slave
  port state slave-only
  local-priority 10
 !
 service-policy input core-ingress-classifier
 service-policy output core-egress-exp-marking
 ipv4 point-to-point
 ipv4 unnumbered Loopback0
 frequency synchronization
  selection input
  priority 10
  wait-to-restore 1
 !
!

Intra-Domain

Intra-Domain Routing and Forwarding
The Converged SDN Transport is based on a fully programmable transport that satisfies the requirements described earlier. The foundation technology used in the transport design is Segment Routing (SR) with an MPLS based Data Plane in Phase 1 and an IPv6 based Data Plane (SRv6) in the future.

Segment Routing dramatically reduces the number of protocols needed in a Service Provider Network. Simple extensions to traditional IGP protocols like ISIS or OSPF provide full Intra-Domain Routing and Forwarding Information over a label switched infrastructure, along with High Availability (HA) and Fast Re-Route (FRR) capabilities.

Segment Routing defines the following routing related concepts:

 Prefix-SID – A node identifier that must be unique for each node in an IGP Domain. Prefix-SID is statically allocated by the network operator.
 Adjacency-SID – A node's link identifier that must be unique for each link belonging to the same node. Adjacency-SID is typically dynamically allocated by the node, but can also be statically allocated.

In the case of Segment Routing with an MPLS Data Plane, both Prefix-SID and Adjacency-SID are represented by the MPLS label and both are advertised by the IGP protocol. This IGP extension eliminates the need to use the LDP or RSVP protocol to exchange MPLS labels. The Converged SDN Transport design uses ISIS as the IGP protocol.

Intra-Domain Forwarding - Fast Re-Route using TI-LFA
Segment-Routing embeds a simple Fast Re-Route (FRR) mechanism known as Topology Independent Loop Free Alternate (TI-LFA). TI-LFA provides sub 50ms convergence for link and node protection. TI-LFA is completely stateless and does not require any additional signaling mechanism, as each node in the IGP Domain calculates a primary and a backup path automatically and independently based on the IGP topology. After the TI-LFA feature is enabled, no further care is expected from the network operator to ensure fast network recovery from failures. This is in stark contrast with traditional MPLS-FRR, which requires RSVP and RSVP-TE and therefore adds complexity to the transport design.

Please refer also to the Area Border Router Fast Re-Route covered in Section: "Inter-Domain Forwarding - High Availability and Fast Re-Route" for additional details.

Inter-Domain

Inter-Domain Forwarding
The Converged SDN Transport achieves network scale by IGP domain separation.
Each IGP domain is represented by separate IGP process onthe Area Border Routers (ABRs).Section# “Intra-Domain Routing and Forwarding” described basic Segment Routing concepts# Prefix-SID andAdjacency-SID. This section introduces the concept of Anycast SID.Segment Routing allows multiple nodes to share the same Prefix-SID,which is then called a “Anycast” Prefix-SID or Anycast-SID. Additionalsignaling protocols are not required, as the network operator simplyallocates the same Prefix SID (thus a Anycast-SID) to a pair of nodestypically acting as ABRs.Figure 4 shows two sets of ABRs# Aggregation ABRs – AG Provider Edge ABRs – PE Figure 4# IGP Domains - ABRs Anycast-SIDFigure 5 shows the End-To-End Stack of SIDs for packets traveling fromleft to right through thenetwork.Figure 5# Inter-Domain LSP – SR-TE PolicyThe End-To-End Inter-Domain Label Switched Path (LSP) was computed viaSegment Routing Traffic Engineering (SR-TE) Policies.On the Access router “A” the SR-TE Policy imposes# Local Aggregation Area Border Routers Anycast-SID# Local-AGAnycast-SID Local Provider Edge Area Border Routers Anycast-SID# Local-PEAnycast SID Remote Provider Edge Area Border Routers Anycast-SID# Remote-PEAnycast-SID Remote Aggregation Area Border Routers Anycast-SID# Remote-AGAnycast-SID Remote/Destination Access Router# Destination-A Prefix-SID#Destination-A Prefix-SID The SR-TE Policy is programmed on the Access device on-demand by anexternal Controller and does not require any state to be signaledthroughout the rest of the network. The SR-TE Policy provides, by simpleSID stacking (SID-List), an elegant and robust way to programInter-Domain LSPs without requiring additional protocols such as BGP-LU(RFC3107).Please refer to Section# “Transport Programmability” for additional details.Area Border Routers – Prefix-SID vs Anycast-SIDSection# “Inter-Domain Forwarding” showed the use of Anycast-SID at the ABRs for theprovisioning of an Access to Access End-To-End LSP. When the LSP is setup between the Access Router and the AG/PE ABRs, there are two options# ABRs are represented by Anycast-SID; or Each ABR is represented by a unique Prefix-SID. Choosing between Anycast-SID or Prefix-SID depends on the requestedservice and inclusion of Anycast SIDs in the SR-TE Policy. Please refer to Section# “Services - Design”. If one is using the SR-PCE, such as the case of ODN SR-TE paths, the inclusion of Anycast SIDs is done via configuration.Note that both options can be combined on the same network.Inter-Domain Forwarding - High Availability and Fast Re-RouteAG/PE ABRs redundancy enables high availability for Inter-DomainForwarding.Figure 7# IGP Domains - ABRs Anycast-SIDWhen Anycast-SID is used to represent AG or PE ABRs, no other mechanismis needed for Fast Re-Route (FRR). Each IGP Domain provides FRRindependently by TI-LFA as described in Section# “Intra-Domain Forwarding - Fast Re-Route”.Figure 8 shows how FRR is achieved for a Inter-DomainLSP.Figure 8# Inter-Domain - FRRThe access router on the left imposes the Anycast-SID of the ABRs andthe Prefix-SID of the destination access router. For FRR, any router inIGP1, including the Access router, looks at the top label# “ABRAnycast-SID”. For this label, each device maintains a primary and backuppath preprogrammed in the HW. In IGP2, the top label is “Destination-A”.For this label, each node in IGP2 has primary and backup pathspreprogrammed in the HW. 
The backup paths are computed by TI-LFA.As Inter-Domain forwarding is achieved via SR-TE Policies, FRR iscompletely self-contained and does not require any additional protocol.Note that when traditional BGP-LU is used for Inter-Domain forwarding,BGP-PIC is also required for FRR.Inter-Domain LSPs provisioned by SR-TE Policy are protected by FRR alsoin case of ABR failure (because of Anycast-SID). This is not possiblewith BGP-LU/BGP-PIC, since BGP-LU/BGP-PIC have to wait for the IGP toconverge first.SR Data Plane Monitoring provides proactive method to ensure reachability between all SR enabled nodes in an IGP domain. SR DPM utilizes well known MPLS OAM capabilities with crafted SID lists to ensure valid forwarding across the entire IGP domain. See the CST Implementation Guide for more details on SR Data Plane monitoring.Inter-Domain Open Ring SupportPrior to CST 4.0 and XR 7.2.1, the use of TI-LFA within a ring topology required the ring be closed within the IGP domain. This required an interconnect at the ASBR domain node for each IGP domain terminating on the ASBR. This type of connectivity was not always possible in an aggregation network due to fiber or geographic constraints. In CST 4.0 we have introduced support for open rings by utilizing MPLSoGRE tunnels between terminating boundary nodes across the upstream IGP domain. The following picture illustrates open ring support between an access and aggregation network.In the absence of a physical link between the boundary nodes PA1 and PA2, GRE tunnels can be created to interconnect each domain over its adjacent domain. During a protection event, such as the link failure between PA1 and GA1, traffic will enter the tunnel on the protection node, in this case PA1 towards PA2. Keep in mind traffic will loop back through the domain until re-convergence occurs. In the case of a core failure, bandwidth may not be available in an access ring to carry all core traffic, so care must be taken to determine traffic impact.Transport ProgrammabilityFigure 9 and Figure 10 show the design of Route-Reflectors (RR), Segment Routing Path Computation Element (SR-PCE) and WAN Automation Engines (WAE).High-Availability is achieved by device redundancy in the Aggregationand Core networks.Figure 9# Transport Programmability – PCEPTransport RRs collect network topology from ABRs through BGP Link State (BGP-LS).Each Transport ABR has a BGP-LS session with the two Domain RRs. Each domain is represented by a different BGP-LS instance ID.Aggregation Domain RRs collect network topology information from theAccess and the Aggregation IGP Domain (Aggregation ABRs are part of theAccess and the Aggregation IGP Domain). Core Domain RRs collect networktopology information from the Core IGP Domain.Aggregation Domain RRs have BGP-LS sessions with Core RRs.Through the Core RRs, the Aggregation Domains RRs advertise localAggregation and Access IGP topologies and receive the network topologiesof the remote Access and Aggregation IGP Domains as well as the networktopology of the Core IGP Domain. Hence, each RR maintains the overallnetwork topology in BGP-LS.Redundant Domain SR-PCEs have BGP-LS sessions with the local Domain RRsthrough which they receive the overall network topology. Refer toSection# “Segment Routing Path Computation Element (SR-PCE)” for more details about SR-PCE.SR-PCE is capable of computing the Inter-Domain LSP path on-demand. 
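For reference, the BGP-LS topology feed described above is generally enabled on the ABRs with configuration along these lines; the IGP instance names, instance IDs, neighbor address, and AS number shown are illustrative placeholders rather than values from this design:

router isis ACCESS
 distribute link-state instance-id 101
!
router isis CORE
 distribute link-state instance-id 102
!
router bgp 100
 address-family link-state link-state
 !
 neighbor 10.0.0.10
  remote-as 100
  description Transport RR
  update-source Loopback0
  address-family link-state link-state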
The computed path (Segment Routing SID List) is communicated to the Service End Points via a Path Computation Element Protocol (PCEP) response as shown in Figure 9. The Service End Points create an SR-TE Policy and use the SID list returned by SR-PCE as the primary path.

Service End Points can be located on the Access Routers for Flat Services or at both the Access and domain PE routers for Hierarchical Services. The domain PE routers and ABRs may or may not be the same router. The SR-TE Policy Data Plane in the case of a Service End Point co-located with the Access router was described in Figure 5.

The proposed design is very scalable and can be easily extended to support even higher numbers of PCEP sessions by adding additional RRs and SR-PCE elements into the Access Domain.

Figure 11 shows the Converged SDN Transport physical topology with examples of product placement.

Figure 11: Converged SDN Transport – Physical Topology with transport programmability

Traffic Engineering (Tactical Steering) – SR-TE Policy
Operators want to fully monetize their network infrastructure by offering differentiated services. Traffic engineering is used to provide different paths (optimized based on diverse constraints, such as low-latency or disjoint paths) for different applications. The traditional RSVP-TE mechanism requires signaling along the path for tunnel setup or tear down, and all nodes in the path need to maintain state. This approach doesn't work well for cloud applications, which have hyper scale and elasticity requirements.

Segment Routing provides a simple and scalable way of defining an end-to-end application-aware traffic engineering path known as an SR-TE Policy. The SR-TE Policy expresses the intent of the application's constraints across the network.

In the Converged SDN Transport design, the Service End Point uses PCEP along with the Segment Routing On-Demand Next-hop (SR-ODN) capability to request from the controller a path that satisfies specific constraints (such as low latency). This is done by associating SLA tags/attributes to the path request. Upon receiving the request, the SR-PCE controller calculates the path based on the requested SLA, and uses PCEP to dynamically program the ingress node with a specific SR-TE Policy.

Traffic Engineering - Dynamic Anycast-SID Paths and Black Hole Avoidance
As shown in Figure 7, inter-domain resilience and load-balancing is satisfied by using the same Anycast SID on each boundary node. Starting in CST 3.5 Anycast SIDs are used by a centralized SR-PCE without having to define an explicit SID list. Anycast SIDs are learned via the topology information distributed to the SR-PCE using BGP-LS. Once the SR-PCE knows the location of a set of Anycast SIDs, it will utilize the SID in the path computation to an egress node. The SR-PCE will only utilize the Anycast SID if it has a valid path to the next SID in the computed path, meaning if one ABR loses its path to the adjacent domain, the SR-PCE will update the head-end path with one utilizing a normal node SID to ensure traffic is not blackholed.

It is also possible to withdraw an anycast SID from the topology by using the conditional route advertisement feature for IS-IS, new in 3.5. Once the anycast SID Loopback has been withdrawn, it will no longer be used in an SR Policy path. Conditional route advertisement can be used for SR-TE Policies with Anycast SIDs in either dynamic or static SID candidate paths.
It is also possible to withdraw an Anycast SID from the topology by using the conditional route advertisement feature for IS-IS, new in CST 3.5. Once the Anycast SID loopback has been withdrawn, it will no longer be used in an SR Policy path. Conditional route advertisement can be used for SR-TE Policies with Anycast SIDs in either dynamic or static SID candidate paths. Conditional route advertisement is implemented by supplying the router with a list of remote prefixes to monitor for reachability in the RIB. If those routes disappear from the RIB, the interface route will be withdrawn. Please see the CST Implementation Guide for instructions on configuring Anycast SID inclusion and blackhole avoidance.

Transport Controller Path Computation Engine (PCE)
Segment Routing Path Computation Element (SR-PCE)
Segment Routing Path Computation Element, or SR-PCE, is a Cisco Path Computation Engine (PCE) implemented as a feature included in the Cisco IOS-XR operating system. The function is typically deployed on a Cisco IOS-XR cloud appliance, XRv-9000, as it involves control-plane operations only. The SR-PCE gains network topology awareness from BGP-LS advertisements received from the underlying network. This knowledge is leveraged by the embedded multi-domain computation engine to provide optimal path information to Path Computation Element Clients (PCCs) using the Path Computation Element Protocol (PCEP).
The PCC is the device where the service originates (PE), and it therefore requires end-to-end connectivity over the Segment Routing enabled multi-domain network.
The SR-PCE provides a path based on constraints such as:
- Shortest path (IGP metrics)
- Traffic-engineering metrics
- Disjoint paths starting on one or two nodes
- Latency
Figure 12: XR Transport Controller – Components

PCE Controller Summary – SR-PCE
Segment Routing Path Computation Element (SR-PCE):
- Runs as a feature on a physical or virtual IOS-XR node
- Collects topology from BGP using BGP-LS, IS-IS, or OSPF
- Deploys SR Policies based on client requests
- Computes Shortest, Disjoint, Low Latency, and Avoidance paths
- North-bound interface with applications via REST API

Converged SDN Transport Path Computation Workflows
Static SR-TE Policy Configuration
1. NSO provisions the service. Alternatively, the service can be provisioned via CLI.
2. The SR-TE Policy is configured via NSO or CLI on the access node to the other service end points, specifying pcep as the computation method (a hedged configuration sketch follows these workflows).
3. The Access Router requests a path from SR-PCE with the metric type and constraints.
4. SR-PCE computes the path.
5. SR-PCE provides the path to the Access Router.
6. The Access Router acknowledges and installs the SR Policy as the forwarding path for the service.
On-Demand Next-Hop Driven Configuration
1. NSO provisions the service. Alternatively, the service can be provisioned via CLI.
2. On-demand colors are configured on each node, specifying specific constraints and pcep as the dynamic computation method.
3. On reception of service routes with a specific ODN color community, the Access Router requests a path from SR-PCE to the BGP next-hop as the SR-TE endpoint.
4. SR-PCE computes the path.
5. SR-PCE provides the path to the Access Router.
6. The Access Router acknowledges and installs the SR Policy as the forwarding path for the service.
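The following is a hedged sketch of the access-node side of the static workflow above: the PCC points at two redundant SR-PCEs and a policy delegates path computation to the PCE via PCEP. The SR-PCE addresses, source address, policy name, color, and endpoint are illustrative assumptions, not values from the validated design.
segment-routing
 traffic-eng
  pcc
   source-address ipv4 100.0.1.50
   pce address ipv4 100.0.0.100
    precedence 10
   !
   pce address ipv4 100.0.0.101
    precedence 20
   !
  !
  policy TO-PE3-EXAMPLE
   color 200 end-point ipv4 100.0.0.3
   candidate-paths
    preference 100
     dynamic
      pcep
      !
      metric
       type igp
The lower precedence value selects the preferred SR-PCE; if it becomes unreachable, the PCC falls back to the second controller.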
Segment Routing Flexible Algorithms (Flex-Algo)
A powerful tool used to create traffic-engineered Segment Routing paths is SR Flexible Algorithms, better known as SR Flex-Algo. Flex-Algo assigns a specific set of "algorithms" to a Segment. The algorithm identifies a specific computation constraint the segment supports. There are standards-based algorithm definitions, such as least-cost IGP path and latency, or providers can define their own algorithms to satisfy their business needs. CST 4.0 supports computation of Flex-Algo paths in intra-domain and inter-domain deployments. In CST 4.0 (IOS-XR 7.2.2), inter-domain Flex-Algo using SR-PCE is limited to IGP lowest-metric path computation.
Flex-Algo limits the computation of a path to only those nodes participating in that algorithm. This gives a powerful way to create multiple network domains within a single larger network, constraining an SR path computation to segments satisfying the metrics defined by the algorithm. As you will see, we can now use a single node SID to reach a node via a path satisfying an advanced constraint such as delay.

Flex-Algo Node SID Assignment
Nodes participating in a specific algorithm must have a unique node SID prefix assigned to the algorithm. In a typical deployment, the same Loopback address is used for multiple algorithms. IGP extensions advertise algorithm membership throughout the network. Below is an example of a node with multiple algorithms and node SID assignments. By default, the basic IGP path computation is assigned to algorithm "0". Algorithm "1" is also reserved. Algorithms 128-255 are user-definable. All Flex-Algo SIDs belong to the same global SRGB, so providers deploying SR should take this into account. Each algorithm should be assigned its own block of SIDs within the SRGB; in the example below the SRGB is 16000-32000 and each algorithm is assigned 1000 SIDs.
interface Loopback0
 address-family ipv4 unicast
  prefix-sid index 150
  prefix-sid algorithm 128 absolute 18003
  prefix-sid algorithm 129 absolute 19003
  prefix-sid algorithm 130 absolute 20003

Flex-Algo IGP Definition
Flexible algorithms being used within a network must be defined in the IGP domains in the network. The configuration is typically done on at least one node under the IGP configuration for the domain. Under the definition, the metric type used for computation is defined along with any link affinities. Link affinities are used to constrain the algorithm to not only specific nodes, but also specific links. These affinities are the same ones previously used by RSVP-TE.
Note: Inter-domain Flex-Algo path computation requires synchronized Flex-Algo definitions across the end-to-end path.
flex-algo 130
 metric-type delay
 advertise-definition
!
flex-algo 131
 advertise-definition
 affinity exclude-any red

Path Computation across an SR Flex-Algo Network
Flex-Algo works by creating a separate topology for each algorithm. By default, all links interconnecting nodes participating in the same algorithm can be used for those paths. If the algorithm is defined to include or exclude specific link affinities, the topology will reflect it. An SR-TE path computation using a specific Flex-Algo will use the Algo's topology for the end-to-end path computation. It will also look at the metric type defined for the Algo and use it for the path computation. Even with a complex topology, a single SID is used for the end-to-end path, as opposed to using a series of node and adjacency SIDs to steer traffic across a shared topology. Each node participating in the algorithm has adjacencies to other nodes utilizing the same algorithm, so when an incoming MPLS label matching the Algo SID enters, it will utilize the path specific to the Algo. A Flex-Algo can also be used as a constraint in an ODN policy.
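The link-affinity and ODN points above can be illustrated with the following hedged sketch: a link affinity referenced by a Flex-Algo definition is mapped to a bit position and assigned to an IS-IS interface, and an ODN color is constrained to an algorithm. The affinity name, bit position, interface, and color are illustrative assumptions rather than validated configuration.
router isis ACCESS
 affinity-map red bit-position 10
 interface TenGigE0/0/0/10
  affinity flex-algo red
 !
!
segment-routing
 traffic-eng
  on-demand color 129
   dynamic
    sid-algorithm 129
With this in place, any service route carrying color community 129 would trigger a dynamically computed path restricted to nodes and links participating in algorithm 129.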
Flex-Algo Dual-Plane Example
A very simple use case for Flex-Algo is to easily define a dual-plane network topology where algorithm 129 is red and algorithm 130 is green. Nodes A1 and A6 participate in both algorithms. When a path request is made for algorithm 129, the head-end nodes A1 and A6 will only use paths specific to the algorithm. The SR-TE Policy does not need to reference a specific SID, only the Algo being used as the constraint. The local node or SR-PCE will utilize the Algo to compute the path dynamically. The following policy configuration is an example of constraining the path to the Algo 129 "Red" path.
segment-routing
 traffic-eng
  policy GREEN-PE8-128
   color 1128 end-point ipv4 100.0.2.53
   candidate-paths
    preference 1
     dynamic
      pcep
      !
      metric
       type igp
      !
     !
     constraints
      segments
       sid-algorithm 129

Segment Routing and Unified MPLS (BGP-LU) Co-existence
Summary
In the Converged SDN Transport 3.0 design we introduce validation for the co-existence of services using BGP Labeled Unicast transport for inter-domain forwarding and those using SR-TE. Many networks deployed today have an existing BGP-LU design which may not be easily migrated to SR, so a graceful introduction between the two transport methods is required. In the case of a multipoint service such as EVPN ELAN or L3VPN, an endpoint may utilize BGP-LU to one endpoint and SR-TE to another.

ABR BGP-LU design
In a BGP-LU design, each IGP domain or ASBR boundary node will exchange BGP labeled prefixes between domains while resetting the BGP next-hop to its own loopback address. The labeled unicast label will change at each domain boundary across the end-to-end network. Within each IGP domain, a label distribution protocol is used to supply MPLS connectivity between the domain boundary and interior nodes. In the Converged SDN Transport design, IS-IS with SR-MPLS extensions is used to provide intra-domain MPLS transport. This ensures that within each domain BGP-LU prefixes are protected using TI-LFA.
The BGP-LU design utilized in the Converged SDN Transport validation is based on Cisco's Unified MPLS design used in EPN 4.0. More information can be found at: https://www.cisco.com/c/dam/en/us/td/docs/solutions/Enterprise/Mobility/EPN/4_0/EPN_4_Transport_Infrastructure_DIG.pdf

Quality of Service and Assurance
Overview
Quality of Service is of utmost importance in today's multi-service converged networks. The Converged SDN Transport design has the ability to enforce end-to-end traffic path SLAs using Segment Routing Traffic Engineering. In addition to satisfying those path constraints, traditional QoS is used to make sure the PHB (Per-Hop Behavior) of each packet is enforced at each node across the converged network.

NCS 540, 560, and 5500 QoS Primer
Full details of the NCS 540 and 5500 QoS capabilities and configuration can be found at: https://www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/qos/66x/b-qos-cg-ncs5500-66x/b-qos-cg-ncs5500-66x_chapter_010.html
The NCS platforms utilize the same MQC configuration for QoS as other IOS-XR platforms, but based on their hardware architecture they use different elements for implementing end-to-end QoS. On these platforms, ingress traffic is:
- Matched using flexible criteria via class maps
- Assigned to a specific Traffic Class (TC) and/or QoS Group for further treatment on egress
- Has its header marked with a specific IPP, DSCP, or MPLS EXP value
Traffic Classes are used internally for determining fabric priority and as the match condition for egress queuing. QoS Groups are used internally as the match criteria for egress CoS header re-marking. IPP/DSCP marking and re-marking of ingress MPLS traffic is done using ingress QoS policies.
MPLS EXP for imposed labels can be done on ingress or egress, but if you wish to rewrite both the IPP/DSCP and set an explicit EXP for imposed labels, the MPLS EXP must be set on egress.The priority-level command used in an egress QoS policy specifies the egress transmit priority of the traffic vs. other priority traffic. Priority levels can be configured as 1-7 with 1 being the highest priority. Priority level 0 is reserved for best-effort traffic.Please note, multicast traffic does not follow the same constructs as unicast traffic for prioritization. All multicast traffic assigned to Traffic Classes 1-4 are treated as Low Priority and traffic assigned to 5-6 treated as high priority.Hierarchical Edge QoSHierarchical QoS enables a provider to set an overall traffic rate across all services, and then configure parameters per-service via a child QoS policy where the percentages of guaranteed bandwidth are derived from the parent rateH-QoS platform supportNCS platforms support 2-level and 3-level H-QoS. 3-level H-QoS applies a policer (ingress) or shaper (egress) to a physical interface, with each sub-interface having a 2-level H-QoS policy applied. Hierarchical QoS is not enabled by default on the NCS 540 and 5500 platforms. H-QoS is enabled using the hw-module profile qos hqos-enable command. Once H-QoS is enabled, the number of priority levels which can be assigned is reduced from 1-7 to 1-4. Additionally, any hierarchical QoS policy assigned to a L3 sub-interface using priority levels must include a “shape” command.The ASR9000 supports multi-level H-QoS at high scale for edge aggregation function. In the case of hierarchical services, H-QoS can be applied to PWHE L3 interfaces.CST Core QoS mapping with five classesQoS designs are typically tailored for each provider, but we introduce a 5-level QoS design which can fit most provider needs. The design covers transport of both unicast and multicast traffic. Traffic Type Core Marking Core Priority Comments Network Control EXP 6 Highest Underlay network control plane Low latency EXP 5 Highest Low latency service, consistent delay High Priority 1 EXP 3 Medium-High High priority service traffic Medium Priority / Multicast EXP 2 Medium priority and multicast   Best Effort EXP 0 General user traffic   Example Core QoS Class and Policy MapsThese are presented for reference only, please see the implementation guide for the full QoS configurationClass maps for ingress header matchingclass-map match-any match-ef-exp5 description High priority, EF match dscp 46 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 end-class-mapIngress QoS policypolicy-map ingress-classifier class match-ef-exp5 set traffic-class 2 set qos-group 2 ! class match-cs5-exp4 set traffic-class 3 set qos-group 3 ! class class-default set traffic-class 0 set dscp 0 set qos-group 0 ! end-policy-mapClass maps for egress queuing and marking policiesclass-map match-any match-traffic-class-2 description ~Match highest priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match high priority traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-mapEgress QoS queuing policypolicy-map egress-queuing class match-traffic-class-2 priority level 2 ! class match-traffic-class-3 priority level 3 ! class class-default ! 
 end-policy-map
Egress QoS marking policy
policy-map core-egress-exp-marking
 class match-qos-group-2
  set mpls experimental imposition 5
 !
 class match-qos-group-3
  set mpls experimental imposition 4
 !
 class class-default
  set mpls experimental imposition 0
 !
 end-policy-map

Converged SDN Transport Use Cases
Service Provider networks must adopt a very flexible design that satisfies any-to-any connectivity requirements without compromising stability and availability. Moreover, transport programmability is essential to bring SLA awareness into the network.
The goal of the Converged SDN Transport is to provide a flexible network blueprint that can be easily customized to meet customer-specific requirements. This blueprint must adapt to carry any service type, for example cable access, mobile, and business services, over the same converged network infrastructure. The following sections highlight some specific customer use cases and the components of the design used in building those solutions.

5G Mobile Networks
Summary and 5G Service Types
The Converged SDN Transport design introduces initial support for 5G networks and 5G services. There are a variety of new service use cases being defined by 3GPP for use on 5G networks, illustrated by the figure below. Networks must now be built to support the stringent SLA requirements of Ultra-Reliable Low-Latency services while also being able to cope with the massive bandwidth introduced by Enhanced Mobile Broadband services. The initial support for 5G in the Converged SDN Transport design focuses on the backhaul and midhaul portions of the network utilizing end-to-end Segment Routing. The design introduces no new service types; the existing scalable L3VPN and EVPN based services using BGP are sufficient for carrying 5G control-plane and user-plane traffic.

Key Validated Components
The following key features have been added to the CST validated design to support 5G deployments.

End to End Timing Validation
End-to-end timing using PTP with profiles G.8275.1 and G.8275.2 has been validated in the CST design. Best practice configurations are available in the online configurations and the CST Implementation Guide. It is recommended to use G.8275.1 when possible to maintain the highest level of accuracy across the network. In CST 4.0+ we include validation for G.8275.1 to G.8275.2 interworking, allowing the use of different profiles across the network. Synchronous Ethernet (SyncE) is also recommended across the network to maintain stability when timing to the PRC. All nodes used in the CST design support SyncE.
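For orientation only, the following is a heavily hedged sketch of what a G.8275.1 boundary-clock interface with SyncE could look like on an IOS-XR node. The profile names, PTP domain, target address, and interface are assumptions for illustration; the CST Implementation Guide and online configurations remain the authoritative source for validated timing configuration.
ptp
 clock
  domain 24
  profile g.8275.1 clock-type T-BC
 !
 profile SLAVE-8275-1
  transport ethernet
  port state slave-only
  multicast target-address ethernet 01-80-C2-00-00-0E
 !
!
frequency synchronization
 quality itu-t option 1
!
interface TenGigE0/0/0/10
 ptp
  profile SLAVE-8275-1
 !
 frequency synchronization
  selection input
  priority 10
  wait-to-restore 0
A corresponding master-only profile would be applied on interfaces facing downstream clocks such as RPD-serving leaf ports.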
Low latency SR-TE path computation
In this release of the CST design, we introduce a new validated constraint type for SR-TE paths used for carrying services across the network. The "latency" constraint, used either with a configured SR Policy or an ODN SR Policy, instructs the computation engine to look for the lowest latency path across the network. The latency computation algorithm can use different mechanisms for computing the end-to-end path. The first and preferred mechanism is to use the real-time measured per-link one-way delay across the network. This measured information is distributed via IGP extensions across the IGP domain and then onto external PCEs using BGP-LS extensions, for use in both intra-domain and inter-domain calculations. In version 3.0 of the CST this is supported on ASR9000 links using the Performance Measurement link delay feature. More detail on the configuration can be found at https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-0/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-70x/b-segment-routing-cg-asr9000-70x_chapter_010000.html#id_118505. In release 6.6.3, NCS 540 and NCS 5500 nodes support the configuration of static link-delay values, which are distributed using the same method as the dynamic values. Two other metric types can also be utilized as part of the "latency" path computation: the TE metric, which can be defined on all SR IS-IS links, and the regular IGP metric, which can be used in the absence of the link-delay metric.

Dynamic Link Performance Measurement
Starting in version 3.5 of the CST, dynamic measurement of one-way and two-way latency on logical links is fully supported across all devices. The delay measurement feature utilizes TWAMP Light as the transport mechanism for probes and responses. PTP is a requirement for accurate measurement of one-way latency across links and is recommended for all nodes. In the absence of PTP, a "two-way" delay mode is supported to calculate the one-way link delay. It is recommended to configure one-way delay on all IS-IS core links within the CST network. A sample configuration can be found below, and detailed configuration information can be found in the implementation guide.
One-way delay measurement is also available for SR-TE Policy paths to give the provider an accurate latency measurement for all services utilizing the SR-TE Policy. This information is available through SR Policy statistics using the CLI or model-driven telemetry. The latency measurement is done for all active candidate paths.
Dynamic one-way link delay measurements using PTP are not currently supported on unnumbered interfaces. In the case of unnumbered interfaces, static link delay values must be used.
Different metric types can be used in a single path computation, with the following order of preference:
1. Unidirectional link delay metric, either computed or statically defined
2. Statically defined TE metric
3. IGP metric

SR Policy latency constraint configuration on a configured policy
segment-routing
 traffic-eng
  policy LATENCY-POLICY
   color 20 end-point ipv4 1.1.1.3
   candidate-paths
    preference 100
     dynamic mpls
      metric
       type latency

SR Policy latency constraint configuration for ODN policies
segment-routing
 traffic-eng
  on-demand color 100
   dynamic
    pcep
    !
    metric
     type latency

Dynamic link delay metric configuration
performance-measurement
 interface TenGigE0/0/0/10
  delay-measurement
 interface TenGigE0/0/0/20
  delay-measurement
 !
 !
 protocol twamp-light
  measurement delay
   unauthenticated
    querier-dst-port 12345
   !
  !
 !
 delay-profile interfaces
  advertisement
   accelerated
    threshold 25
   !
   periodic
    interval 120
    threshold 10
   !
  !
  probe
   measurement-mode one-way
   protocol twamp-light
   computation-interval 60
  !
 !

Static defined link delay metric
Static delay is set by configuring the "advertise-delay" value, in microseconds, under each interface:
performance-measurement
 interface TenGigE0/0/0/10
  delay-measurement
   advertise-delay 15000
 interface TenGigE0/0/0/20
  delay-measurement
   advertise-delay 10000

TE metric definition
segment-routing
 traffic-eng
  interface TenGigE0/0/0/10
   metric 15
  !
  interface TenGigE0/0/0/20
   metric 10

The link-delay metrics are quantified in units of microseconds. On most networks this can be quite large and may be out of range of normal IGP metrics, so care must be taken to ensure proper compatibility when mixing metric types.
The largest possible IS-IS metric is 16777214, which is equivalent to 16.77 seconds.

SR Policy one-way delay measurement
In addition to the measurement of delay on physical links, the end-to-end one-way delay can also be measured across an SR Policy. This allows a provider to monitor the traffic path for increases in delay and log/alarm when thresholds are exceeded. Please note that SR Policy latency measurements are not supported for PCE-computed paths, only those using head-end computation or configured static segment lists. The basic configuration for SR Policy measurement follows:
performance-measurement
 delay-profile sr-policy
  advertisement
   accelerated
    threshold 25
   !
   periodic
    interval 120
    threshold 10
   !
   threshold-check
    average-delay
   !
  !
  probe
   tos dscp 46
   !
   measurement-mode one-way
   protocol twamp-light
   computation-interval 60
   burst-interval 60
  !
 !
 protocol twamp-light
  measurement delay
   unauthenticated
    querier-dst-port 12345
   !
  !
 !
!
segment-routing
 traffic-eng
  policy APE7-PM
   color 888 end-point ipv4 100.0.2.52
   candidate-paths
    preference 200
     dynamic
      metric
       type igp
      !
     !
    !
   !
   performance-measurement
    delay-measurement
     logging
      delay-exceeded

Segment Routing Flexible Algorithms for 5G Slicing
SR Flexible Algorithms, outlined earlier in the transport section, give providers a powerful mechanism to segment networks into topologies defined by SLA requirements. The SLA-driven topologies solve the constraints of specific 5G service types such as Ultra-Reliable Low-Latency services. Using SR with a packet data plane ensures the most efficient network possible, unlike slicing solutions using optical transport or OTN.

End to end network QoS with H-QoS on Access PE
QoS is of utmost importance for ensuring that the mobile control plane and critical user plane traffic meet SLA requirements. Overall network QoS is covered in the QoS section in this document; this section focuses on basic Hierarchical QoS to support 5G services.
H-QoS enables a provider to set an overall traffic rate across all services, and then configure parameters per-service via a child QoS policy where the percentages of guaranteed bandwidth are derived from the parent rate. NCS platforms support 2-level and 3-level H-QoS. 3-level H-QoS applies a policer (ingress) or shaper (egress) to a physical interface, with each sub-interface having a 2-level H-QoS policy applied. A hedged configuration sketch follows the QoS mapping table below.

CST QoS mapping with 5 classes
| Traffic Type | Ingress Marking | Core Marking | Comments |
| Low latency | IPP 5 | EXP 5 | URLLC, consistent delay, small buffer |
| 5G Control Plane | IPP 4 | EXP 4 | Mobile control and billing |
| High Priority Service | IPP 3 (in contract), 1 (out of contract) | EXP 1,3 | Business service |
| Best Effort | IPP 0 | EXP 0 | General user traffic |
| Network Control | IPP 6 | EXP 6 | Underlay network control plane |
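To illustrate the H-QoS model described above, the following is a hedged sketch of a parent shaper with a nested child policy on an access sub-interface. The class names, rates, and interface are illustrative assumptions; as noted earlier in the QoS primer, hw-module profile qos hqos-enable must be enabled on NCS platforms before hierarchical policies can be applied, and priority classes in such policies require a shape command.
policy-map CHILD-5G-SERVICES
 class match-traffic-class-2
  priority level 2
  shape average percent 20
 !
 class class-default
  bandwidth remaining percent 50
 !
 end-policy-map
!
policy-map PARENT-1G
 class class-default
  shape average 1 gbps
  service-policy CHILD-5G-SERVICES
 !
 end-policy-map
!
interface TenGigE0/0/0/5.100
 service-policy output PARENT-1G
The parent class-default shaper caps the aggregate service rate, while the child classes divide that parent rate between low-latency and best-effort traffic.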
FTTH Design using EVPN E-Tree
Summary
Many providers today are migrating from L2 access networks to more flexible L3 underlay networks using xVPN overlays to support a variety of network services. L3 networks offer more flexibility in terms of topology, resiliency, and support of both L2VPN and L3VPN services. Using a converged aggregation and access network simplifies networks and reduces both capex and opex spend by eliminating duplicate networks. Fiber-to-the-home networks using active Ethernet have typically used L2 designs with proprietary methods like Private VLANs for subscriber isolation. EVPN E-Tree gives us a modern alternative to provide these services across a converged L3 Segment Routing network.

E-Tree Diagram

E-Tree Operation
One of the strongest features of EVPN is its dynamic signaling of PE state across the entire EVPN virtual instance. E-Tree extends this paradigm by signaling between EVPN PEs which Ethernet Segments are considered root segments and which ones are considered leaf segments. Similar to hub-and-spoke L3VPN networks, traffic is allowed between root/leaf and root/root interfaces, but not between leaf interfaces either on the same node or on different nodes. EVPN signaling creates the forwarding state and entries to restrict traffic forwarding between endpoints connected to the same leaf Ethernet Segment.

Split-Horizon Groups
E-Tree enables split-horizon groups on access interfaces within the same Bridge Domain/EVI configured for E-Tree to prohibit direct L2 forwarding between these interfaces.

L3 IRB Support
In a fully distributed FTTH deployment, a provider may choose to put the L3 gateway for downstream access endpoints on the leaf device. The L3 BVI interface defined for the E-Tree BD/EVI is always considered a root endpoint. E-Tree operates at L2, so when an L3 interface is present, traffic will be forwarded at L3 between leaf endpoints. Note that L2 leaf devices using a centralized IRB L3 GW on an E-Tree root node are not currently supported. In this type of deployment, where the L3 GW is not located on the leaf, the upstream L3 GW node must be attached via an L2 interface into the E-Tree root node Bridge Domain/EVI. It is recommended to locate the L3 GW on the leaf device if possible.

Multicast Traffic
Multicast traffic across the E-Tree L2/L3 network is performed using ingress replication from the source to the receiver nodes. It is important to use IGMP or MLDv2 snooping in order to minimize the flooding of multicast traffic across the entire Ethernet VPN instance. When snooping is utilized, traffic is only sent to EVPN PE nodes with interested receivers instead of all PEs in the EVI.

Ease of Configuration
Configuring a node as a leaf in an E-Tree EVI requires only a single command, "etree", to be configured under the EVI in the global EVPN configuration. Please see the Implementation Guide for specific configuration examples.
l2vpn
 bridge group etree
  bridge-domain etree-ftth
   interface TenGigE0/0/0/23.1098
   !
   routed interface BVI100
   !
   evi 100
   !
  !
 !
!
evpn
 evi 100
  etree
   leaf
  !
  advertise-mac
 !
!

Cable Converged Interconnect Network (CIN)
Summary
The Converged SDN Transport design enables a multi-service CIN by adding support for the features and functions required to build a scalable next-generation Ethernet/IP cable access network. Differentiated from simple switch or L3 aggregation designs is the ability to support NG cable transport over the same common infrastructure already supporting other services like mobile backhaul and business VPN services. Cable Remote PHY is simply another service overlaid onto the existing Converged SDN Transport network architecture. We will cover all aspects of connectivity between the Cisco cBR-8 and the RPD device.

Distributed Access Architecture
The cable Converged Interconnect Network is part of a next-generation Distributed Access Architecture (DAA), an architecture unlocking higher subscriber bandwidth by moving traditional cable functions deeper into the network, closer to end users.
R-PHY or Remote PHY, places the analog to digital conversion much closer to users, reducing the cable distance and thus enabling denser and higher order modulation used to achieve Gbps speeds over existing cable infrastructure. This reference design will cover the CIN design to support Remote PHY deployments.Remote PHY Components and RequirementsThis section will list some of the components of an R-PHY network and the network requirements driven by those components. It is not considered to be an exhaustive list of all R-PHY components, please see the CableLabs specification document, the latest which can be access via the following URL# https#//specification-search.cablelabs.com/CM-SP-R-PHYRemote PHY Device (RPD)The RPD unlocks the benefits of DAA by integrating the physical analog to digital conversions in a device deployed either in the field or located in a shelf in a facility. The uplink side of the RPD or RPHY shelf is simply IP/Ethernet, allowing transport across widely deployed IP infrastructure. The RPD-enabled node puts the PHY function much closer to an end user, allowing higher end-user speeds. The shelf allows cable operators to terminate only the PHY function in a hub and place the CMTS/MAC function in a more centralized facility, driving efficiency in the hub and overall network. The following diagram shows various options for how RPDs or an RPD shelf can be deployed. Since the PHY function is split from the MAC it allows independent placement of those functions.RPD Network ConnectionsEach RPD is typically deployed with a single 10GE uplink connection. The compact RPD shelf uses a single 10GE uplink for each RPD.Cisco cBR-8 and cnBRThe Cisco Converged Broadband Router performs many functions as part of a Remote PHY solution. The cBR-8 provisions RPDs, originates L2TPv3 tunnels to RPDs, provisions cable modems, performs cable subscriber aggregation functions, and acts as the uplink L3 router to the rest of the service provider network. In the Remote PHY architecture the cBR-8 acts as the DOCSIS core and can also serve as a GCP server and video core. The cBR-8 runs IOS-XE. The cnBR, cloud native Broadband Router, provides DOCSIS core functionality in a server-based software platform deployable anywhere in the SP network. CST 3.0 has been validated using the cBR-8, the cnBR will be validated in an upcoming release.cBR-8 Network ConnectionsThe cBR-8 is best represented as having “upstream” and “downstream” connectivity.The upstream connections are from the cBR8 Supervisor module to the SP network. Subscriber data traffic and video ingress these uplink connections for delivery to the cable access network. The cBR-8 SUP-160 has 8x10GE SFP+ physical connections, the SUP-250 has 2xQSFP28/QSFP+ interfaces for 40G/100G upstream connections.In a remote PHY deployment the downstream connections to the CIN are via the Digital PIC (DPIC-8X10G) providing 40G of R-PHY throughput with 8 SFP+ network interfaces.cBR-8 RedundancyThe cBR-8 supports both upstream and downstream redundancy. Supervisor redundancy uses active/standby connections to the SP network. Downstream redundancy can be configured at both the line card and port level. Line card redundancy uses an active/active mechanism where each RPD connects to the DOCSIS core function on both the active and hot standby Digital PIC line card. Port redundancy uses the concept of “port pairs” on each Digital PIC, with ports 0/1, 2/3, 4/6, and 6/7 using either an active/active (L2) or active/standby (L3) mechanism. 
In the CST design we utilize a L3 design with the active/standby mechanism. The mechanism uses the same IP address on both ports, with the standby port kept in a physical down state until switchover occurs.Remote PHY CommunicationDHCPThe RPD is provisioned using ZTP (Zero Touch Provisioning). DHCPv4 and DHCPv6 are used along with CableLabs DHCP options in order to attach the RPD to the correct GCP server for further provisioning.Remote PHY Standard FlowsThe following diagram shows the different core functions of a Remote PHY solution and the communication between those elements.GCPGeneric Communications Protocol is used for the initial provisioning of the RPD. When the RPD boots and received its configuration via DHCP, one of the DHCP options will direct the RPD to a GCP server which can be the cBR-8 or Cisco Smart PHY. GCP runs over TCP typically on port 8190.UEPI and DEPI L2TPv3 TunnelsThe upstream output from an RPD is IP/Ethernet, enabling the simplification of the cable access network. Tunnels are used between the RPD PHY functions and DOCSIS core components to transport signals from the RPD to the core elements, whether it be a hardware device like the Cisco cBR-8 or a virtual network function provided by the Cisco cnBR (cloud native Broadband Router).DEPI (Downstream External PHY Interface) comes from the M-CMTS architecture, where a distributed architecture was used to scale CMTS functions. In the Remote PHY architecture DEPI represents a tunnel used to encapsulate and transport from the DOCSIS MAC function to the RPD. UEPI (Upstream External PHY Interface) is new to Remote PHY, and is used to encode and transport analog signals from the RPD to the MAC function.In Remote PHY both DEPI and UEPI tunnels use L2TPv3, defined in RFC 3931, to transport frames over an IP infrastructure. Please see the following Cisco white paper for more information on how tunnels are created specific to upstream/downstream channels and how data is encoded in the specific tunnel sessions. https#//www.cisco.com/c/en/us/solutions/collateral/service-provider/converged-cable-access-platform-ccap-solution/white-paper-c11-732260.html. In general there will be one or two (standby configuration) UEPI and DEPI L2TPv3 tunnels to each RPD, with each tunnel having many L2TPv3 sessions for individual RF channels identified by a unique session ID in the L2TPv3 header. Since L2TPv3 is its own protocol, no port number is used between endpoints, the endpoint IP addresses are used to identify each tunnel. Unicast DOCSIS data traffic can utilize either or multicast L2TPv3 tunnels. Multicast tunnels are used with downstream virtual splitting configurations. Multicast video is encoded and delivered using DEPI tunnels as well, using a multipoint L2TPv3 tunnel to multiple RPDs to optimize video delivery.CIN Network RequirementsIPv4/IPv6 Unicast and MulticastDue to the large number of elements and generally greenfield network builds, the CIN network must support all functions using both IPv4 and IPv6. IPv6 may be carried natively across the network or within an IPv6 VPN across an IPv4 MPLS underlay network. Similarly the network must support multicast traffic delivery for both IPv4 and IPv6 delivered via the global routing table or Multicast VPN. Scalable dynamic multicast requires the use of PIMv4, PIMv6, IGMPv3, and MLDv2 so these protocols are validated as part of the overall network design. 
IGMPv2 and MLDv2 snooping are also required for designs using access bridge domains and BVI interfaces for aggregation.Network TimingFrequency and phase synchronization is required between the cBR-8 and RPD to properly handle upstream scheduling and downstream transmission. Remote PHY uses PTP (Precision Timing Protocol) for timing synchronization with the ITU-T G.8275.2 timing profile. This profile carries PTP traffic over IP/UDP and supports a network with partial timing support, meaning multi-hop sessions between Grandmaster, Boundary Clocks, and clients as shown in the diagram below. The cBR-8 and its client RPD require timing alignment to the same Primary Reference Clock (PRC). In order to scale, the network itself must support PTP G.8275.2 as a T-BC (Boundary Clock). Synchronous Ethernet (SyncE) is also recommended across the CIN network to maintain stability when timing to the PRC.CST 4.0+ Update to CIN Timing DesignStarting in CST 4.0, NCS nodes support both G.8275.1 and G.8275.2 on the same node, and also support interworking between them. If the network path between the PTP GM and client RPDs can support G.8275.1 on each hop, it should be used. G.8275.1 runs on physical interfaces and does not have limitations such as running over Bundle Ethernet interfaces. The G.8275.1 to G.8275.2 interworking will take place on the RPD leaf node, with G.8275.2 being used to the RPDs. The following diagram depicts a recommended end-to-end timing design between the PTP GM and the RPD. Please review the CST 4.0 Implementation Guide for details on configuring G.8275.1 to G.8275.2 interworking. In addition to PTP interworking, CST 4.0 supports PTP timing on BVI interfaces.QoSControl plane functions of Remote PHY are critical to achieving proper operation and subscriber traffic throughput. QoS is required on all RPD-facing ports, the cBR-8 DPIC ports, and all core interfaces in between. Additional QoS may be necessary between the cBR-8, RPD, and any PTP timing elements. See the design section for further details on QoS components.DHCPv4 and DHCPv6 RelayAs a critical component of the initial boot and provisioning of RPDs, the network must support DHCP relay functionality on all RPD-facing interfaces, for both IPv4 and IPv6.Converged SDN Transport CIN DesignDeployment Topology OptionsThe Converged SDN Transport design is extremely flexible in how Remote PHY components are deployed. Depending on the size of the deployment, components can be deployed in a scalable leaf-spine fabric with dedicated routers for RPD and cBR-8 DPIC connections or collapsed into a single pair of routers for smaller deployments. If a smaller deployment needs to be expanded, the flexible L3 routed design makes it very easy to simply interconnect new devices and scale the design to a fabric supporting thousands of RPD and other access network connections.High Scale Design (Recommended)This option maximizes statistical multiplexing by aggregating Digital PIC downstream connections on a separate leaf device, allowing one to connect a number of cBR-8 interfaces to a fabric with minimal 100GE uplink capacity. The topology also supports the connectivity of remote shelves for hub consolidation. Another benefit is the fabric has optimal HA and the ability to easily scale with more leaf and spine nodes.High scale topologyCollapsed Digital PIC and SUP Uplink ConnectivityThis design for smaller deployments connects both the downstream Digital PIC connections and uplinks on the same CIN core device. 
If there is enough physical port availability and future growth does not dictate capacity beyond these nodes this design can be used. This design still provides full redundancy and the ability to connect RPDs to any cBR-8. Care should be taken to ensure traffic between the DPIC and RPD does not traverse the SUP uplink interfaces.Collapsed cBR-8 uplink and Digital PIC connectivityCollapsed RPD and cBR-8 DPIC ConnectivityThis design connects each cBR-8 Digital PIC connection to the RPD leaf connected to the RPDs it will serve. This design can also be considered a “pod” design where cBR-8 and RPD connectivity is pre-planned. Careful planning is needed since the number of ports on a single device may not scale efficiently with bandwidth in this configuration.Collapsed or Pod cBR-8 Digital PIC and RPD connectivityIn the collapsed desigs care must be taken to ensure traffic between each RPD can reach the appropriate DPIC interface. If a leaf is single-homed to the aggregation router its DPIC interface is on, RPDs may not be able to reach their DPIC IP. The options with the shortest convergence time are# Adding interconnects between the agg devices or multiple uplinks from the leaf to agg devices.Cisco HardwareThe following table highlights the Cisco hardware utilized within the Converged SDN Transport design for Remote PHY. This table is non-exhaustive. One highlight is all NCS platforms listed are built using the same NPU family and share most features across all platforms. See specific platforms for supported scale and feature support. Product Role 10GE SFP+ 25G SFP28 100G QSFP28 Timing Comments NCS-55A1-24Q6H-S RPD leaf 48 24 6 Class B   N540-ACC-SYS RPD leaf 24 8 2 Class B Smaller deployments NCS-55A1-48Q6H-S DPIC leaf 48 48 6 Class B   NCS-55A2-MOD Remote agg 40 24 upto 8 Class B CFP2-DCO support NCS-55A1-36H-S Spine 144 (breakout) 0 36 Class B   NCS-5502 Spine 192 (breakout) 0 48 None   NCS-5504 Multi Upto 576 x Upto 144 Class B 4-slot modular platform Scalable L3 Routed DesignThe Cisco validated design for cable CIN utilizes a L3 design with or without Segment Routing. Pure L2 networks are no longer used for most networks due to their inability to scale, troubleshooting difficulty, poor network efficiency, and poor resiliency. L2 bridging can be utilized on RPD aggregation routers to simplify RPD connectivity.L3 IP RoutingLike the overall CST design, we utilize IS-IS for IPv4 and IPv6 underlay routing and BGP to carry endpoint information across the network. The following diagram illustrates routing between network elements using a reference deployment. The table below describes the routing between different functions and interfaces. See the implementation guide for specific configuration. 
Interface Routing Comments cBR-8 Uplink IS-IS Used for BGP next-hop reachability to SP Core cBR-8 Uplink BGP Advertise subscriber and cable-modem routes to SP Core cBR-8 DPIC Static default in VRF Each DPIC interface should be in its own VRF on the cBR-8 so it has a single routing path to its connected RPDs RPD Leaf Main IS-IS Used for BGP next-hop reachability RPD Leaf Main BGP Advertise RPD L3 interfaces to CIN for cBR-8 to RPD connectivity RPD Leaf Timing BGP Advertise RPD upstream timing interface IP to rest of network DPIC Leaf IS-IS Used for BGP next-hop reachability DPIC Leaf BGP Advertise cBR-8 DPIC L3 interfaces to CIN for cBR-8 to RPD connectivity CIN Spine IS-IS Used for reachability between BGP endpoints, the CIN Spine does not participate in BGP in a SR-enabled network CIN Spine RPD Timing IS-IS Used to advertise RPD timing interface BGP next-hop information and advertise default CIN Spine BGP (optional) In a native IP design the spine must learn BGP routes for proper forwarding CIN Router to Router InterconnectionIt is recommended to use multiple L3 links when interconnecting adjacent routers, as opposed to using LAG, if possible. Bundles increase the possibility for timing inaccuracy due to asymmetric timing traffic flow between slave and master. If bundle interfaces are utilized, care should be taken to ensure the difference in paths between two member links is kept to a minimum. All router links will be configured according to the global CST design. Leaf devices will be considered CST access PE devices and utilize BGP for all services routing.Leaf Transit TrafficIn a single IGP network with equal IGP metrics, certain link failures may cause a leaf to become a transit node. Several options are available to keep transit traffic from transiting a leaf and potentially causing congestion. Using high metrics on all leaf to agg uplinks will prohibit this and is recommended in all configurations.cBR-8 DPIC to CIN InterconnectionThe cBR-8 supports two mechanisms for DPIC high availability outlined in the overview section. DPIC line card and link redundancy is recommended but not a requirement. In the CST reference design, if link redundancy is being used each port pair on the active and standby line cards is connected to a different router and the default active ports (even port number) is connected to a different router. In the example figure, port 0 from active DPIC card 0 is connected to R1 and port 0 from standby DPIC card 1 is connected to R2. DPIC link redundancy MUST be configured using the “cold” method since the design is using L3 to each DPIC interface and no intermediate L2 switching. This is done with the cable rphy link redundancy cold global command and will keep the standby link in a down/down state until switchover occurs.DPIC line card and link HADPIC Interface ConfigurationEach DPIC interface should be configured in its own L3 VRF. This ensures traffic from an RPD assigned to a specific DPIC interface takes the traffic path via the specific interface and does not traverse the SUP interface for either ingress or egress traffic. It’s recommended to use a static default route within each DPIC VRF towards the CIN network. 
Dynamic routing protocols could be utilized, however it will slow convergence during redundancy switchover.Router Interface ConfigurationIf no link redundancy is utilized each DPIC interface will connect to the router using a point to point L3 interface.If using cBR-8 link HA, failover time is reduced by utilizing the same gateway MAC address on each router. Link HA uses the same IP and MAC address on each port pair on the cBR-8, and retains routing and ARP information for the L3 gateway. If a different MAC address is used on each router, traffic will be dropped until an ARP occurs to populate the GW MAC address on the router after failover. On the NCS platforms, a static MAC address cannot be set on a physical L3 interface. The method used to set a static MAC address is to use a BVI (Bridged Virtual Interface), which allows one to set a static MAC address. In the case of DPIC interface connectivity, each DPIC interface should be placed into its own bridge domain with an associated BVI interface. Since each DPIC port is directly connected to the router interface, the same MAC address can be utilized on each BVI.If using IS-IS to distribute routes across the CIN, each DPIC physical interface or BVI should be configured as a passive IS-IS interface in the topology. If using BGP to distribute routing information the “redistribute connected” command should be used with an appropriate route policy to restrict connected routes to only DPIC interface. The BGP configuration is the same whether using L3VPN or the global routing table.It is recommended to use a /31 for IPv4 and /127 for IPv6 addresses for each DPIC port whether using a L3 physical interface or BVI on the CIN router.RPD to Router InterconnectionThe Converged SDN Transport design supports both P2P L3 interfaces for RPD and DPIC aggregation as well as using Bridge Virtual Interfaces. A BVI is a logical L3 interface within a L2 bridge domain. In the BVI deployment the DPIC and RPD physical interfaces connected to a single leaf device share a common IP subnet with the gateway residing on the leaf router.It is recommended to configure the RPD leaf using bridge-domains and BVI interfaces. This eases configuration on the leaf device as well as the DHCP configuration used for RPD provisioning.The following shows the P2P and BVI deployment options.Native IP or L3VPN/mVPN DeploymentTwo options are available and validated to carry Remote PHY traffic between the RPD and MAC function. Native IP means the end to end communication occurs as part of the global routing table. In a network with SR-MPLS deployed such as the CST design, unicast IP traffic is still carried across the network using an MPLS header. This allows for fast reconvergence in the network by using SR and enabled the network to carry other VPN services on the network even if they are not used to carry Remote PHY traffic. In then native IP deployment, multicast traffic uses either PIM signaling with IP multicast forwarding or mLDP in-band signaling for label-switched multicast. The multicast profile used is profile 7 (Global mLDP in-band signaling). L3VPN and mVPN can also be utilized to carry Remote PHY traffic within a VPN service end to end. This has the benefit of separating Remote PHY traffic from the network underlay, improving security and treating Remote PHY as another service on a converged access network. Multicast traffic in this use case uses mVPN profile 14. 
mLDP is used for label-switched multicast, and the NG-MVPN BGP control plane is used for all multicast discovery and signaling.

SR-TE
Segment Routing Traffic Engineering may be utilized to carry traffic end-to-end across the CIN network. Using On-Demand Networking simplifies the deployment of SR-TE Policies from ingress to egress by using specific color BGP communities to instruct head-end nodes to create policies satisfying specific user constraints. As an example, if RPD aggregation prefixes are advertised using BGP to the DPIC aggregation device, SR-TE tunnels following a user constraint can be built dynamically between those endpoints.

CIN Quality of Service (QoS)
QoS is a requirement for delivering trouble-free Remote PHY. This design uses sample QoS configurations for concept illustration, but QoS should be tailored for specific network deployments. New CIN builds can utilize the configurations in the implementation guide verbatim if no other services are being carried across the network. Please see the section in this document on QoS for general NCS QoS information and the implementation guide for specific details.

CST Network Traffic Classification
The following table lists specific traffic types which should be treated with specific priority, their default markings, and network classification points.
| Traffic Type | Ingress Interface | Priority | Default Marking | Comments |
| BGP | Routers, cBR-8 | Highest | CS6 (DSCP 48) | None |
| IS-IS | Routers, cBR-8 | Highest | CS6 | IS-IS is single-hop and uses the highest priority queue by default |
| BFD | Routers | Highest | CS6 | BFD is single-hop and uses the highest priority queue by default |
| DHCP | RPD | High | CS5 | DHCP CoS is set explicitly |
| PTP | All | High | DSCP 46 | Default on all routers, cBR-8, and RPD |
| DOCSIS MAP/UCD | RPD, cBR-8 DPIC | High | DSCP 46 | |
| DOCSIS BWR | RPD, cBR-8 DPIC | High | DSCP 46 | |
| GCP | RPD, cBR-8 DPIC | Low | DSCP 0 | |
| DOCSIS Data | RPD, cBR-8 DPIC | Low | DSCP 0 | |
| Video | cBR-8 | Medium | DSCP 32 | Video within multicast L2TPv3 tunnel when cBR-8 is video core |
| MDD | RPD, cBR-8 DPIC | Medium | DSCP 40 | |

CST and Remote-PHY Load Balancing
Unicast network traffic is load balanced based on MPLS labels and IP header criteria. The devices used in the CST design are capable of load balancing traffic based on the MPLS labels used in the SR underlay and the IP headers underneath any MPLS labels. In the higher-bandwidth downstream direction, where a series of L2TPv3 tunnels are created from the cBR-8 to the RPD, traffic is hashed based on the source and destination IP addresses of those tunnels. Downstream L2TPv3 tunnels from a single Digital PIC interface to a set of RPDs will be distributed across the fabric based on RPD destination IP address. The following illustrates unicast load balancing across the network.
Multicast traffic is not load balanced across the network. Whether the network is utilizing PIMv4, PIMv6, or mVPN, a multicast flow with two equal-cost downstream paths will utilize only a single path, and only a single member link will be utilized in a link bundle. If using multicast, ensure sufficient bandwidth is available on a single link between two adjacencies.

SmartPHY RPD Automation
SmartPHY is an automation solution for managing deployed RPDs across the SP network. In a non-SmartPHY deployment, providers must manually assign RPHY cores via DHCP and manually configure the cBR-8 via CLI. SmartPHY provides a flexible GUI- or API-driven way to eliminate manual configuration. SmartPHY is configured as the RPHY core in the DHCP server for all RPDs. When the RPD boots it will initiate a GCP session to SmartPHY.
SmartPHY identifies the RPD and, if the RPD is configured in SmartPHY, redirects it to the proper RPHY core instance. When provisioning a new RPD, SmartPHY will also deploy the proper configuration to the RPHY core cBR-8 node and verify the RPD is operational. The diagram below shows basic SmartPHY operation.

4G Transport and Services Modernization
While talk about deploying 5G services has reached a fever pitch, many providers are continuing to build and evolve their 4G networks. New services require more agile and scalable networks, satisfied by Cisco's Converged SDN Transport. The services modernization found in Converged SDN Transport 2.0 follows work done in EPN 4.0, located here: https://www.cisco.com/c/dam/en/us/td/docs/solutions/Enterprise/Mobility/EPN/4_0/EPN_4_Transport_Infrastructure_DIG.pdf. Transport modernization requires simplification and new abilities. We evolve the EPN 4.0 design based on LDP and hierarchical BGP-LU to one using Segment Routing with an MPLS data plane and the SR-PCE to add inter-domain path computation, scale, and programmability. L3VPN-based 4G services remain, but are modernized to utilize SR-TE On-Demand Next-Hop, reducing provisioning complexity, increasing scale, and adding advanced path computation constraints. 4G services utilizing L3VPN remain the same, but those utilizing L2VPN such as VPWS and VPLS transition to EVPN services. EVPN is the modern replacement for legacy LDP-signalled L2VPN services, reducing complexity and adding advanced multi-homing functionality. The following table highlights the legacy and new ways of delivering services for 4G.
| Element | EPN 4.0 | Converged SDN Transport |
| Intra-domain MPLS Transport | LDP | IS-IS w/Segment Routing |
| Inter-domain MPLS Transport | BGP Labeled Unicast | SR using SR-PCE for Computation |
| MPLS L3VPN (LTE S1,X2) | MPLS L3VPN | MPLS L3VPN w/ODN |
| L2VPN VPWS | LDP Pseudowire | EVPN VPWS w/ODN |
| eMBMS Multicast | Native / mLDP | Native / mLDP |
The CST 4G Transport modernization covers only MPLS-based access and not L2 access scenarios.

L3 IP Multicast and mVPN
IP multicast continues to be an optimization method for delivering content traffic to many endpoints, especially traditional broadcast video. Unicast content dominates the traffic patterns of most networks today, but multicast carries critical high-value services, so proper design and implementation is required. In Converged SDN Transport 2.0 we introduced multicast edge and core validation for native IPv4/IPv6 multicast using PIM, global multicast using in-band mLDP (profile 7), and mVPN using mLDP with in-band signaling (profile 6). Converged SDN Transport 3.0 extends this functionality by adding support for mLDP LSM with the NG-MVPN BGP control plane (profile 14). Using BGP signaling adds additional scale to the network over in-band mLDP signaling and fits with the overall design goals of CST. More information about deployment of profile 14 can be found in the Converged SDN Transport implementation guide. Converged SDN Transport 3.0 supports mLDP-based label-switched multicast within a single domain and across IGP domain boundaries. In the case of the Converged SDN Transport design, multicast has been tested with the source and receivers on both access and ABR PE devices.
| Supported Multicast Profiles | Description |
| Profile 6 | mLDP VRF using in-band signaling |
| Profile 7 | mLDP global routing table using in-band signaling |
| Profile 14 | Partitioned MDT using BGP-AD and BGP c-multicast signaling |

LDP Auto-configuration
LDP can automatically be enabled on all IS-IS interfaces with the following configuration in the IS-IS configuration:
router isis ACCESS
 address-family ipv4 unicast
  mpls ldp auto-config

LDP mLDP-only Session Capability (RFC 7473)
In Converged SDN Transport 3.0 we introduce the ability to advertise only mLDP state on each router adjacency, eliminating the need to filter LDP unicast FECs from advertisement into the network. This is done using the SAC (State Advertisement Control) TLV in the LDP initialization messages to advertise which LDP FEC classes to receive from an adjacent peer. We can restrict the capabilities to mLDP only using the following configuration. Please see the implementation guide and configurations for the full LDP configuration.
mpls ldp
 capabilities sac mldp-only

LDP Unicast FEC Filtering for SR Unicast with mLDP Multicast
The following is provided for historical context; please see the section above on disabling LDP unicast FECs using session capability advertisements.
The Converged SDN Transport design utilizes Segment Routing with the MPLS data plane for all unicast traffic. The first phase of multicast support in Converged SDN Transport 2.0 uses mLDP for use with existing mLDP-based networks and new networks wishing to utilize label-switched multicast across the core. LDP is enabled on an interface for both unicast and multicast by default. Since SR is being used for unicast, one must filter out all LDP unicast FECs to ensure they are not distributed across the network. SR is used for all unicast traffic in the presence of an LDP FEC for the same prefix, but filtering them reduces control-plane activity, may aid in re-convergence, and simplifies troubleshooting. The following should be applied to all interfaces which have mLDP enabled:
ipv4 access-list no-unicast-ldp
 10 deny ipv4 any any
!
RP/0/RSP0/CPU0:Node-6#show run mpls ldp
mpls ldp
 log neighbor
 address-family ipv4
  label local allocate for no-unicast-ldp

L3 Multicast using Segment Routing TreeSID w/Static S,G Mapping
TreeSID Diagram
TreeSID Overview
Converged SDN Transport 3.5 introduces Segment Routing TreeSID across all IOS-XR nodes. TreeSID utilizes the programmability of SR-PCE to create and maintain an optimized multicast tree from source to receiver across an SR-only IPv4 network. In CST 3.5, TreeSID utilizes MPLS labels at each hop in the network. Each node in the network maintains a session to the same set of SR-PCE controllers. The SR-PCE creates the tree using PCE-initiated segments. TreeSID supports advanced functionality such as TI-LFA for fast protection and disjoint trees.
Traffic is forwarded across the tree in CST 3.5 using static S,G mappings at the head-end source nodes and tail-end receiver nodes. Providers needing a solution where dynamic joins and leaves are not common, such as broadcast video deployments, can benefit from the simplicity static TreeSID brings, eliminating the need for distributed BGP mVPN signaling. TreeSID is supported for both the default VRF (Global Routing Table) and mVPN.
Please see the CST 3.5 Implementation Guide for TreeSID configuration guidelines and examples.

EVPN Multicast
Multicast within a L2VPN EVPN has been supported since Converged SDN Transport 1.0.
Multicast traffic within an EVPN is replicated to the endpoints interested in a specific group via EVPN signaling. EVPN utilizes ingress replication for all multicast traffic, meaning multicast is encapsulated with a specific EVPN label and unicast to each PE router with interested listeners for each multicast group. Ingress replication may add additional traffic to the network, but simplifies the core and data plane by eliminating multicast signaling, state, and hardware replication. EVPN multicast is also not subject to domain boundary restrictions.LDP to Converged SDN Transport MigrationVery few networks today are built as greenfield networks; most new designs are migrated from existing ones and must support some level of interop during migration. In the Converged SDN Transport design we tackle one of the most common migration scenarios, LDP to the Converged SDN Transport design. The following sections explain the configuration and best practices for performing the migration. The design is applicable to transport and services originating and terminating in the same LDP domain.Towards Converged SDN Transport DesignThe Converged SDN Transport design utilizes isolated IGP domains in different parts of the network, with each domain separated at a logical boundary by an ASBR router. SR-PCE is used to provide end to end paths across the inter-domain network. LDP does not support inter-domain transport; it operates only between LDP FECs in the same IGP domain. It is recommended to plan logical boundaries if necessary when doing a flat LDP migration to the Converged SDN Transport design, so that when migration is complete the future scale benefits can be realized.Segment Routing EnablementOne must define the global Segment Routing Block (SRGB) to be used across the network on every node participating in SR. A default block is enabled, but it may not be large enough to support an entire network, so it’s advised to right-size this value for your deployment. The current maximum SRGB size for SR-MPLS is 256K entries.Enabling SR in IS-IS requires only issuing the command “segment-routing mpls” under the IPv4 address-family and assigning a prefix-sid value to any loopback interface used to address the node as a service destination. Enabling TI-LFA is done on a per-interface basis in the IS-IS configuration.Enabling SR-Prefer within IS-IS aids in migration by preferring the SR prefix-SID for a prefix over its LDP label, allowing a seamless migration to SR without needing to enable SR completely within a domain.Segment Routing Mapping Server DesignOne component introduced with Segment Routing is the SR Mapping Server (SRMS), a control-plane element converting unicast LDP FECs to Segment Routing prefix-SIDs for advertisement throughout the Segment Routing domain. Each separate IGP domain requires a pair of SRMS nodes until full migration to SR is complete.AutomationZero Touch ProvisioningIn addition to model-driven configuration and operation, Converged SDN Transport 1.5 supports ZTP operation for automated device provisioning. ZTP is useful in both production and staging environments to automate initial device software installation, deploy an initial bootstrap configuration, and trigger advanced functionality via ZTP scripts. ZTP is supported on both out-of-band management interfaces and in-band data interfaces. 
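As a hedged illustration of the kind of day-zero bootstrap configuration a ZTP script might apply once the device is reachable, a minimal sketch is shown below; the hostname, addresses, and credentials are purely hypothetical.

hostname access-pe-1
!
username admin
 group root-lr
 group cisco-support
 secret <hashed-secret>
!
interface MgmtEth0/RP0/CPU0/0
 ipv4 address 192.0.2.10 255.255.255.0
!
router static
 address-family ipv4 unicast
  0.0.0.0/0 192.0.2.1
!
ssh server v2
ssh server vrf default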
When a device first boots, the IOS-XR ZTP process begins on the management interface of the device; if no response is received, or the interface is not active, the ZTP process will begin on the data ports. IOS-XR can be part of an ecosystem of automated device and service provisioning via Cisco NSO.Model-Driven TelemetryIn the 3.0 release the implementation guide includes a table of model-driven telemetry paths applicable to different components within the design. More information on Cisco model-driven telemetry can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/telemetry/66x/b-telemetry-cg-ncs5500-66x.html. Additional information about how to consume and visualize telemetry data can be found at https#//xrdocs.io/telemetry. We also introduce integration with Cisco Crosswork Health Insights, a telemetry and automated remediation platform, and sensor packs corresponding to Converged SDN Transport components. More information on Crosswork Health Insights can be found at https#//www.cisco.com/c/en/us/support/cloud-systems-management/crosswork-health-insights/model.html.Network Services Orchestrator (NSO)The NSO is a management and orchestration (MANO) solution for network services and Network Functions Virtualization (NFV). The NSO includes capabilities for describing, deploying, configuring, and managing network services and VNFs, as well as configuring the multi-vendor physical underlay network elements with the help of standard open APIs such as NETCONF/YANG or a vendor-specific CLI using Network Element Drivers (NED).In the Converged SDN Transport design, the NSO is used for Services Management, Service Provisioning, and Service Orchestration.The NSO provides several options for service design as shown in Figure 32 Service model with service template Service model with mapping logic Service model with mapping logic and service templates Figure 32# NSO – ComponentsA service model is a way of defining a service in a template format. Once the service is defined, the service model accepts user inputs for the actual provisioning of the service. For example, an E-Line service requires two endpoints and a unique virtual circuit ID to enable the service. The end devices, attachment circuit UNI interfaces, and a circuit ID are required parameters that should be provided by the user to bring up the E-Line service. The service model uses the YANG modeling language (RFC 6020) inside NSO to define a service.Once the service characteristics are defined based on the requirements, the next step is to build the mapping logic in NSO to extract the user inputs. The mapping logic can be implemented using Python or Java. The purpose of the mapping logic is to transform the service models to device models. It includes mechanisms for how service-related operations are reflected on the actual devices. This involves mapping a service operation to available operations on the devices.Finally, service templates need to be created in XML for each device type. In NSO, the service templates are required to translate the service logic into final device configuration through the CLI NED. The NSO can also directly use the device YANG models using NETCONF for device configuration. These service templates enable NSO to operate in a multi-vendor environment.Converged SDN Transport Supported Service ModelsConverged SDN Transport 1.5 and later supports the following NSO service models for provisioning both hierarchical and flat services across the fabric. 
All NSO service modules in 1.5 utilize the IOS-XR and IOS-XE CLI NEDs for configuration.Figure 33# Automation – End-to-End Service ModelsFigure 34# Automation – Hierarchical Service ModelsBase Services supporting Advanced Use CasesOverviewThe Converged SDN Transport Design aims to enable simplification across alllayers of a Service Provider network. Thus, the Converged SDN Transportservices layer focuses on a converged Control Plane based on BGP.BGP based Services include EVPNs and Traditional L3VPNs (VPNv4/VPNv6).EVPN is a technology initially designed for Ethernet multipoint servicesto provide advanced multi-homing capabilities. By using BGP fordistributing MAC address reachability information over the MPLS network,EVPN brought the same operational and scale characteristics of IP basedVPNs to L2VPNs. Today, beyond DCI and E-LAN applications, the EVPNsolution family provides a common foundation for all Ethernet servicetypes; including E-LINE, E-TREE, as well as data center routing andbridging scenarios. EVPN also provides options to combine L2 and L3services into the same instance.To simplify service deployment, provisioning of all services is fullyautomated using Cisco Network Services Orchestrator (NSO) using (YANG)models and NETCONF. Refer to Section# “Network Services Orchestrator (NSO)”.There are two types of services# End-To-End and Hierarchical. The nexttwo sections describe these two types of services in more detail.Ethernet VPN (EVPN)EVPNs solve two long standing limitations for Ethernet Services inService Provider Networks# Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or withData Center Ethernet VPN Hardware SupportIn CST 3.0 EVPN ELAN, ETREE, and VPWS services are supported on all IOS-XR devices. The ASR920 running IOS-XE does not support native EVPN services, but can integrate into an overall EVPN service by utilizing service hierarchy. Please see the tables under Flat and Hierarchical Services for supported service types. Please note ODN is NOT supported for EVPN ELAN services in IOS-XR 6.6.3.Multi-Homed & All-Active Ethernet AccessFigure 21 demonstrates the greatest limitation of traditional L2Multipoint solutions likeVPLS.Figure 21# EVPN All-Active AccessWhen VPLS runs in the core, loop avoidance requires that PE1/PE2 andPE3/PE4 only provide Single-Active redundancy toward their respectiveCEs. Traditionally, techniques such mLACP or Legacy L2 protocols likeMST, REP, G.8032, etc. were used to provide Single-Active accessredundancy.The same situation occurs with Hierarchical-VPLS (H-VPLS), where theaccess node is responsible for providing Single-Active H-VPLS access byactive and backup spoke pseudowire (PW).All-Active access redundancy models are not deployable as VPLStechnology lacks the capability of preventing L2 loops that derive fromthe forwarding mechanisms employed in the Core for certain categories oftraffic. Broadcast, Unknown-Unicast and Multicast (BUM) traffic sourcedfrom the CE is flooded throughout the VPLS Core and is received by allPEs, which in turn flood it to all attached CEs. 
In our example PE1would flood BUM traffic from CE1 to the Core, and PE2 would sends itback toward CE1 upon receiving it.EVPN uses BGP-based Control Plane techniques to address this issue andenables Active-Active access redundancy models for either Ethernet orH-EVPN access.Figure 22 shows another issue related to BUM traffic addressed byEVPN.Figure 22# EVPN BUM DuplicationIn the previous example, we described how BUM is flooded by PEs over theVPLS Core causing local L2 loops for traffic returning from the core.Another issue is related to BUM flooding over VPLS Core on remote PEs.In our example either PE3 or PE4 receive and send the BUM traffic totheir attached CEs, causing CE2 to receive duplicated BUM traffic.EVPN also addresses this second issue, since the BGP Control Planeallows just one PE to send BUM traffic to an All-Active EVPN access.Figure 23 describes the last important EVPNenhancement.Figure 23# EVPN MAC Flip-FloppingIn the case of All-Active access, traffic is load-balanced (per-flow)over the access PEs (CE uses LACP to bundle multiple physical ethernetports and uses hash algorithm to achieve per flow load-balancing).Remote PEs, PE3 and PE4, receive the same flow from different neighbors.With a VPLS core, PE3 and PE4 would rewrite the MAC address tablecontinuously, each time the same mac address is seen from a differentneighbor.EVPN solves this by mean of “Aliasing”, which is also signaled via theBGP Control Plane.Service Provider Network - Integration with Central Office or with Data CenterAnother very important EVPN benefit is the simple integration withCentral Office (CO) or with Data Center (DC). Note that Metro CentralOffice design is not covered by this document.The adoption of EVPNs provides huge benefits on how L2 Multipointtechnologies can be deployed in CO/DC. One such benefit is the convergedControl Plane (BGP) and converged data plane (SR MPLS/SRv6) over SP WANand CO/DC network.Moreover, EVPNs can replace existing proprietary EthernetMulti-Homed/All-Active solutions with a standard BGP-based ControlPlane.End-To-End (Flat) – ServicesThe End-To-End Services use cases are summarized in the table in Figure24 and shown in the network diagram in Figure 25.Figure 24# End-To-End – Services tableFigure 25# End-To-End – ServicesAll services use cases are based on BGP Control Plane.Refer also to Section# “Transport and Services Integration”.Hierarchical – ServicesHierarchical Services Use Cases are summarized in the table of Figure 26and shown in the network diagram of Figure 27.Figure 26# Hierarchical – Services tableFigure 27# Hierarchical - ServicesHierarchical services designs are critical for Service Providers lookingfor limiting requirements on the access platforms and deploying morecentralized provisioning models that leverage very rich features sets ona limited number of touch points.Hierarchical Services can also be required by Service Providers who wantto integrate their SP-WAN with the Central Office/Data Center networkusing well-established designs based on Data Central Interconnect (DCI).Figure 27 shows hierarchical services deployed on PE routers, but thesame design applies when services are deployed on AG or DCI routers.The Converged SDN Transport Design offers scalable hierarchical services withsimplified provisioning. 
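Tying the all-active multi-homing discussion above back to configuration, the following is a rough sketch of an EVPN Ethernet segment on one IOS-XR PE; the same ESI is configured on both PEs attached to the CE, which sees them as a single LACP partner, and all-active is the default load-balancing mode (the bundle and ESI values are hypothetical).

interface Bundle-Ether10
 description Dual-homed attachment circuit to CE1
!
interface TenGigE0/0/0/10
 bundle id 10 mode active
!
evpn
 interface Bundle-Ether10
  ethernet-segment
   identifier type 0 00.01.00.01.00.01.00.01.01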
The three most important use cases aredescribed in the following sections# Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service(H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) andPWHE Hierarchical L2 Multipoint Multi-Homed/All-ActiveFigure 28 shows a very elegant way to take advantage of the benefits ofSegment-Routing Anycast-SID and EVPN. This use case providesHierarchical L2 Multipoint Multi-Homed/All-Active (Single-Homed Ethernetaccess) service with traditional access routerintegration.Figure 28# Hierarchical – Services (Anycast-PW)Access Router A1 establishes a Single-Active static pseudowire(Anycast-Static-PW) to the Anycast IP address of PE1/PE2. PEs anycast IPaddress is represented by Anycast-SID.Access Router A1 doesn’t need to establish active/backup PWs as in atraditional H-VPLS design and doesn’t need any enhancement on top of theestablished spoke pseudowire design.PE1 and PE2 use BGP EVPN Control Plane to provide Multi-Homed/All-Activeaccess, protecting from L2 loop, and providing efficient per-flowload-balancing (with aliasing) toward the remote PEs (PE3/PE4).A3, PE3 and PE4 do the same, respectively.Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRBFigure 29 shows how EVPNs can completely replace the traditional H-VPLSsolution. This use case provides the greatest flexibility asHierarchical L2 Multi/Single-Home, All/Single-Active modes are availableat each layer of the servicehierarchy.Figure 29# Hierarchical – Services (H-EVPN)Optionally, Anycast-IRB can be used to enable Hierarchical L2/L3Multi/Single-Home, All/Single-Active service and to provide optimal L3routing.Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHEFigure 30 shows how the previous H-EVPN can be extended by takingadvantage of Pseudowire Headend (PWHE). PWHE with the combination ofMulti-Homed, Single-Active EVPN provides an Hierarchical L2/L3Multi-Homed/Single-Active (H-EVPN) solution that supports QoS.It completely replaces traditional H-VPLS based solutions. This use caseprovides Hierarchical L2 Multi/Single-Home, All/Single-Activeservice.Figure 30# Hierarchical – Services (H-EVPN and PWHE)Refer also to the section# “Transport and Services Integration”.Services – Route-Reflector (S-RR)Figure 31 shows the design of Services Router-Reflectors(S-RRs).Figure 31# Services – Route-ReflectorsThe Converged SDN Transport Design focuses mainly on BGP-based services,therefore it is important to provide a robust and scalable ServicesRoute-Reflector (S-RR) design.For Redundancy reasons, there are at least 2 S-RRs in any given IGPDomain, although Access and Aggregation are supported by the same pairof S-RRs.Each node participating in BGP-based service termination has two BGPsessions with Domain Specific S-RRs and supports multipleaddress-Families# VPNv4, VPNv6, EVPN.Core Domain S-RRs cover the core Domain. Aggregation Domain S-RRs coverAccess and Aggregation Domains. Aggregation Domain S-RRs and Core S-RRshave BGP sessions among each other.The described solution is very scalable and can be easily extended toscale to higher numbers of BGP sessions by adding another pair of S-RRsin the Access Domain.Ethernet Services OAM using Ethernet CFMEthernet CFM using 802.1ag/Y.1731 has been added in the CST 3.0 design. Ethernet CFM provides end-to-end continuity monitoring and alerting on a per-service basis. 
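A rough, hedged sketch of what per-service Ethernet CFM can look like on an IOS-XR PE is shown below; the domain, service, and MEP numbering are hypothetical and the service binding depends on the L2VPN service type, so refer to the CST Implementation Guide for the validated configuration.

ethernet cfm
 domain CST level 4
  service CUST-100 xconnect group EVPN-VPWS p2p CUST-100
   continuity-check interval 1s
   mep crosscheck
    mep-id 2
!
interface TenGigE0/0/0/5.100 l2transport
 encapsulation dot1q 100
 ethernet cfm
  mep domain CST service CUST-100 mep-id 1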
Maintenance End Points (MEPs) are configured on PE-CE interfaces with periodic Continuity Check Messages (CCMs) sent between them utilizing the same forwarding path as service traffic. Ethernet CFM also enables the transmission of Alarm Indication Signal (AIS) messages to alert remote endpoints of local faults. Additional information on Ethernet CFM can be found in the CST Implementation Guide at https#//xrdocs.io/design/blogs/latest-converged-sdn-transport-implementation-guideTransport and Services IntegrationSection# “Transport - Design” described how Segment Routing provides flexible End-To-End and Any-To-Any Highly-Available transport together with Fast Re-Route. A converged BGP Control Plane provides a scalable and flexible solution at the services layer as well.Figure 35 shows a consolidated view of the Converged SDN Transport network from a Control-Plane standpoint. Note that while network operators could use both PCEP and BGP-SR-TE at the same time, it is not typical.Figure 35# Converged SDN Transport – Control-PlaneAs mentioned, service provisioning is independent of the transport layer. However, transport is responsible for providing the path based on service requirements (SLA). The component that enables such integration is On-Demand Next Hop (ODN). ODN is the capability of requesting from a controller a path that satisfies specific constraints (such as low latency). This is achieved by associating an SLA tag/attribute with the path request. Upon receiving the request, the SR-PCE controller calculates the path based on the requested SLA and uses PCEP or BGP-SR-TE to dynamically program the Service End Point with a specific SR-TE Policy.The Converged SDN Transport design also uses MPLS Performance Management to monitor link delay/jitter/drop (RFC 6374) in order to create a Low Latency topology dynamically.Figure 36 shows a consolidated view of the Converged SDN Transport network from a Data Plane standpoint.Figure 36# Converged SDN Transport – Data-PlaneThe Converged SDN Transport DesignTransportThis section describes in detail the Converged SDN Transport design. This Converged SDN Transport design focuses on transport programmability using Segment Routing and BGP-based services adoption.Figure 35 and Figure 36 show the network topology and transport Data Plane details for Phase 1. Refer also to the Access domain extension use case in Section# “Use Cases”.The network is split into Access and Core IGP domains. Each IGP domain is represented by separate IGP processes. The Converged SDN Transport design uses the ISIS IGP protocol for validation.Validation will be done on two types of access platforms, IOS-XR and IOS-XE, to prove interoperability.Figure 37# Access Domain Extension – End-To-End TransportFor the End-To-End LSP shown in Figure 35, the Access Router imposes 3 transport labels (SID-list). An additional label, the TI-LFA label, can also be added for FRR (node and link protection). In the Core and in the remote Access IGP Domain, 2 additional TI-LFA labels can be used for FRR (node and link protection). In Phase 1 PE ABRs are represented by Prefix-SID. Refer also to Section# “Transport Programmability - Phase 1”.Figure 38# Access Domain Extension – Hierarchical TransportFigure 38 shows how the Access Router imposes a single transport label to reach local PE ABRs, where the hierarchical service is terminated. Similarly, in the Core and in the remote Access IGP domain, the transport LSP is contained within the same IGP domain (Intra-Domain LSP). 
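Returning briefly to the ODN integration described above, a hedged sketch of how a service endpoint can request low-latency paths on demand is shown here. The SLA is expressed as a color extended community attached to the service routes (for example via an export route-policy on the L3VPN or EVPN service), and the matching on-demand color template asks the SR-PCE for a latency-optimized path; the color value and constraints are hypothetical.

extcommunity-set opaque COLOR-LOW-LATENCY
  100
end-set
!
route-policy SET-SLA-COLOR
  set extcommunity color COLOR-LOW-LATENCY
  pass
end-policy
!
segment-routing
 traffic-eng
  on-demand color 100
   dynamic
    pcep
    !
    metric
     type latency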
Routers in each IGP domain can also impose two additional TI-LFAlabels for FRR (to provide node and link protection).In the Hierarchical transport use case, PE ABRs are represented byAnycast-SID or Prefix-SID. Depending on the type of service, Anycast-SIDor Prefix-SID is used for the transport LSP.Transport ProgrammabilityThe Converged SDN Transport employs a distributed and highly available SR-PCEdesign as described in Section# “Transport Programmability”. Transport programmability is basedon PCEP. Figure 39 shows the design when SR-PCE uses PCEP.Figure 39# SR-PCE – PCEPSR-PCE in the Access domain is responsible for Inter-Domain LSPs andprovides the SID-list. PE ABRs are represented by Prefix-SID.SR-PCE in the Core domain is responsible for On-Demand Nexthop (ODN) forhierarchical services. Refer to the table in Figure 41 to see whatservices use ODN. Refer to Section# “Transport Controller - Path Computation Engine (PCE)” to see more details about XRTransport Controller (SR-PCE). Note that Phase 1 uses the “DelegatedComputation to SR-PCE” mode described in Section# “Path Computation Engine - Workflow” without WAE as shownin Figure38.Figure 40# PCE Path Computation – Phase 1Delegated Computation to SR-PCE NSO provisions the service – Service can also be provisioned via CLI Access Router requests a path SR-PCE computes the path SR-PCE provides the path to Access Router Access Router confirms ServicesThis section describes the Services used in the Converged SDN TransportPhase 1.The table in Figure 41 describes the End-To-End services, while thenetwork diagram in Figure 42 shows how services are deployed in thenetwork. Refer also to Section# “Services - Design” of this document.Figure 41# End-To-End Services tableFigure 42# End-To-End ServicesThe table in Figure 43 describes the hierarchical services, while thenetwork diagram in Figure 44 shows how services are deployed in thenetwork. Refer also to Section# “Services - Design” of this document.In addition, the table in Figure 44 shows where PE ABRs Anycast-SID isrequired and where ODN in the Core IGP domain is used.Figure 43# Hierarchical Services tableFigure 44# Hierarchical ServicesThe Converged SDN Transport uses the hierarchical Services Route-Reflectors(S-RRs) design described in Section# “Services - Route-Reflector (S-RR)”. Figure 45 shows in detail the S-RRs design used for Phase 1.Figure 45# Services Route-Reflectors (S-RRs)Network Services Orchestrator (NSO) is used for service provisioning.Refer to Section# “Network Services Orchestrator (NSO)”.Transport and Services IntegrationTransport and Services integration is described in Section# “Transport and Services Integration” of this document. Figure 46 shows an example of End-To-End LSP and servicesintegration.Figure 46# Transport and Services Data-PlaneFigure 47 shows a consolidated view of the Transport and ServicesControl-Plane.Figure 47# Transport and Services Control-PlaneFigure 48 shows the detailed topology of the testbed used for validation.Figure 48# TestbedFigure 49 shows the detailed topology of the testbed used for CIN and Remote PHY validation.Figure 49# Remote PHY/CIN Validation TestbedThe Converged SDN Transport Design - SummaryThe Converged SDN Transport brings huge simplification at the Transport aswell as at the Services layers of a Service Provider network.Simplification is a key factor for real Software Defined Networking(SDN). 
Cisco continuously improves Service Provider network designs to satisfy market needs for scalability and flexibility.From a very well established and robust Unified MPLS design, Cisco has embarked on a journey toward transport simplification and programmability, which started with the Transport Control Plane unification in Evolved Programmable Network 5.0 (EPN 5.0). The Cisco Converged SDN Transport provides another huge leap forward in simplification and programmability, adding Services Control Plane unification and centralized path computation.Figure 50# Converged SDN Transport – EvolutionThe transport layer requires only IGP protocols with Segment Routing extensions for Intra and Inter Domain forwarding. Fast recovery for node and link failures leverages Fast Re-Route (FRR) by Topology Independent Loop Free Alternate (TI-LFA), which is a built-in function of Segment Routing. End to End LSPs are built using Traffic Engineering by Segment Routing, which does not require additional signaling protocols. Instead it relies solely on SDN controllers, thus increasing overall network scalability. The controller layer is based on standard industry protocols like BGP-LS, PCEP, BGP-SR-TE, etc., for path computation and NETCONF/YANG for service provisioning, thus providing an open, standards-based solution.For all those reasons, the Cisco Converged SDN Transport design really brings an exciting evolution in Service Provider Networking.", "url": "/blogs/2021-01-20-converged-sdn-transport-4_0-hld/", "author": "Phil Bedard", "tags": "iosxr, Metro, Design, 5G, Cable, CIN" } , "#": {} , "#": {} , "blogs-2022-01-07-cst-routed-optical-1-0": { "title": "Cisco Routed Optical Networking", "content": " On This Page Revision History Solution Component Software Versions What is Routed Optical Networking? 
Key Drivers Changing Networks Network Complexity Inefficiences Between Network Layers Operational Complexity Network Cost Routed Optical Networking Solution Overview Today’s Complex Multi-Layer Network Infrastructure DWDM OTN Ethernet/IP Enabling Technologies Pluggable Digital Coherent Optics QSFP-DD and 400ZR and OpenZR+ Standards Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S) Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S) Cisco Routers Cisco DWDM Network Hardware Routed Optical Networking Network Use Cases Where to use 400ZR and where to use OpenZR+ Supported DWDM Optical Topologies 64 Channel FOADM P2P Deployment Colorless Add/Drop Deployment Multi-Degree ROADM Deployment Long-Haul Deployment Core Networks Metro Aggregation Access DCI and 3rd Party Location Interconnect Routed Optical Networking Architecture Hardware Routed Optical Networking Validated Routers Cisco 8000 Series Cisco 5700 Systems and NCS 5500 Line Cards ASR 9000 Series NCS 500 Series Routed Optical Networking Optical Hardware Network Convergence System 2000 Network Convergence System 1000 Multiplexer Network Convergence System 1001 Routed Optical Networking Automation Overview IETF ACTN SDN Framework Cisco’s SDN Controller Automation Stack Cisco Open Automation Crosswork Hierarchical Controller Crosswork Network Controller Cisco Optical Network Controller Cisco Network Services Orchestrator and Routed Optical Networking ML Core Function Pack Routed Optical Networking Service Management Supported Provisioning Methods OpenZR+ and 400ZR Properties ZR/ZR+ Supported Frequencies Supported Line Side Rate and Modulation Crosswork Hierarchical Controller UI Provisioning Inter-Layer Link Definition IP Link Provisioning Operational Discovery NSO RON-ML CFP Provisioning Routed Optical Networking Inter-Layer Links RON-ML End to End Service RON-ML API Provisioning IOS-XR CLI Configuration IOS-XR NETCONF Configuration Routed Optical Networking Assurance Crosswork Hierarchical Controller Multi-Layer Path Trace Routed Optical Networking Link Assurance IOS-XR CLI Monitoring of ZR400/OpenZR+ Optics Optics Controller Coherent DSP Controller Cisco IOS-XR Model-Driven Telemetry for ZR/ZR+ Monitoring Open-source ZR/ZR+ Monitoring Additional Resources Cisco Routed Optical Networking Home Cisco Routed Optical Networking Tech Field Day Cisco Champion Podcasts Cisco Routed Optical Networking 1.0 Solution Guide Appendix A Acronyms DWDM Network Hardware Overview Optical Transmitters and Receivers Multiplexers/Demultiplexers Optical Amplifiers Optical add/drop multiplexers (OADMs) Reconfigurable optical add/drop multiplexers (ROADMs) Revision History Version Date Comments 1.0 01/10/2022 Initial Routed Optical Networking Publication Solution Component Software Versions Element Version IOS-XR 7.3.2 IOS-XR (NCS 540) 7.4.1 NCS 2000 SVO 12.2 Cisco Optical Network Controller 1.1 Crosswork Network Controller 3.0 Crosswork Hierarchical Controller 5.1 Cisco EPNM 5.1.3 What is Routed Optical Networking?Routed Optical Networking as part of Cisco’s Converged SDN Transportarchitecture brings network simplification to the physical networkinfrastructure, just as EVPN and Segment Routing simplify the service andtraffic engineering network layers. Routed Optical Networking collapses complextechnologies and network layers into a single cost efficient and easy to managenetwork infrastructure. 
Here we present the Cisco Routed Optical Networking architecture and validated design.Key DriversChanging NetworksInternet traffic has seen a compounded annual growth rate of 30% or higher over the last ten years, as more devices are connected, end user bandwidth speeds increase, and applications continue to move to the cloud. The introduction of 5G in mobile carriers and backhaul providers is also a disruptor; networks must be built to handle the advanced services and traffic increase associated with 5G. Networks must evolve so the infrastructure layer can keep up with the service layer. 400G Ethernet is the next evolution for SP IP network infrastructure, and we must make that as efficient as possible.Network ComplexityComputer networks at their base are a set of interconnected nodes to deliver data between two endpoints. In the very beginning, these networks were designed using a layered approach to separate functions. The OSI model is an example of how functional separation has led to innovation by allowing different standards bodies to work in parallel at each layer. In some cases even these OSI layers are further split into different layers. While these layers can bring some cost benefit, they also bring added complexity. Each layer has its own management, control plane, planning, and operational model.Inefficiencies Between Network LayersOTN and IP network traffic must be converted into wavelength signals to traverse the DWDM network. This has traditionally required dedicated external hardware, a transponder. All of these layers bring complexity, and today some of those layers, such as OTN, bring little to the table in terms of efficiency or additional value. OTN switching, like ATM previously, has not been able to keep up with traffic demands due to very complex hardware. Unlike Ethernet/IP, OTN also does not have a widely interoperable control plane, locking providers into a single vendor or solution long-term.Operational ComplexityNetworks involving opaque layers are difficult to plan, build, and operate. IP and optical networks often have duplicate teams covering similar tasks. Network protection and restoration is also often complicated by different schemes running independently across layers. The industry has tried over decades to solve some of these issues with complex control planes such as GMPLS, but we are now at an evolution point where simplifying the physical layers and reducing control plane complexity in the optical layer allows a natural progression to a single control-plane and protection/restoration layer.Network CostSimplifying networks reduces both capex and opex. As we move to 400G, the network cost is shifted away from routers and router ports to optics. Any way we can reduce the number of 400G interconnects on the network will greatly reduce cost. Modeling networks with 400ZR and OpenZR+ optics in place of traditional transponders and muxponders shows this in almost any network scenario. It also results in a reduced space and power footprint.Routed Optical Networking Solution OverviewAs part of the Converged SDN Transport architecture, Routed Optical Networking extends the key tenet of network simplification. Routed Optical Networking tackles the challenges of building and managing networks by simplifying both the infrastructure and operations.Today’s Complex Multi-Layer Network InfrastructureDWDMMost modern SP networks start at the physical fiber optic layer. 
Above thephysical fiber is technology to allow multiple photonic wavelengths to traversea single fiber and be switched at junction points, we will call that the DWDMlayer.OTNIn some networks, above this DWDM layer is an OTN layer, OTN being theevolution of traditional SONET/SDH networks. OTN grooms low speed TDM servicesinto higher speed containers, and if OTN switching is involved, allows switchingthese services at intermediate points in the network. OTN is primarily used in network to carry guaranteed bandwidth services.Ethernet/IPIn all high bandwidth networks today, there is an Ethernet layer on which IPservices traverse, since almost all data traffic today is IP. Ethernetand IP is used due to its ability to support statistical multiplexing, topologyflexibility, and widespread interoperability between different vendors based onwell-defined standards. In larger networks today carrying Internet traffic, theEthernet/IP layer does not typically traverse an OTN layer, the OTN layer isprimarily used only for business services.Enabling TechnologiesPluggable Digital Coherent OpticsSimple networks are easier to build and easier to operate. As networks scale tohandle traffic growth, the level of network complexity must decline or at leastremain flat.IPoDWDM has attempted to move the transponder function into the router to removethe transponder and add efficiency to networks. In lower bandwidth applications,it has been a very successful approach. CWDM, DWDM SFP/SFP+, and CFP2-DCOpluggable transceivers have been used for many years now to build access,aggregation, and lower speed core networks. The evolution to 400G andadvances in technology created an opportunity to unlock this potentialin higher speed networks.Transponder or muxponders have typically been used to aggregate multiple 10G or100G signals into a single wavelength. However, with reach limitations, and thefact transponders are still operating at 400G wavelength speeds, the transponderbecomes a 1#1 input to output stage in the network, adding no benefit.The Routed Optical Networking architecture unlocks this efficiency for networksof all sizes, due to advancements in coherent plugable technology.QSFP-DD and 400ZR and OpenZR+ StandardsAs mentioned, the industry saw a point to improve network efficiency by shiftingcoherent DWDM functions to router pluggables. Technology advancements haveshrunk the DCO components into the standard QSFP-DD form factor, meaning nospecialized hardware and the ability to use the highest capacity routersavailable today. ZR/OpenZR+ QSFP-DD optics can be used in the same ports as thehighest speed 400G non-DCO transceivers.Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S)Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S)Two industry optical standards have emerged to cover a variety of use cases. TheOIF created the 400ZR specification,https#//www.oiforum.com/technical-work/hot-topics/400zr-2 as a 400G interopablestandard for metro reach coherent optics. The industry saw the benefit of theapproach, but wanted to cover longer distances and have flexibility inwavelength rates, so the OpenZR+ MSA was created, https#//www.openzrplus.org.The following table outlines the specs of each standard. 
ZR400 and OpenZR+ transceivers are tunable across the ITU C-Band, 196.1 To 191.3 THz.The following part numbers are used for Cisco’s ZR400 and OpenZR+ MSA transceivers Standard Part 400ZR QDD-400G-ZR-S OpenZR+ QDD-400G-ZRP-S Cisco datasheet for these transceivers can be found at https#//www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/datasheet-c78-744377.htmlCisco RoutersWe are at a point in NPU development where the pace of NPU bandwidth growth hasoutpaced network traffic growth. Single NPUs such as Cisco’s Silicon One have acapacity exceeding 12.8Tbps in a single NPU package without sacrificingflexibility and rich feature support. This growth of NPU capacity also bringsreduction in cost, meaning forwarding traffic at the IP layer is moreadvantageous vs. a network where layer transitions happen often.Cisco supports 400ZR and OpenZR+ optics across the NCS 540, NCS 5500, NCS 5700,ASR 9000, and Cisco 8000 series routers. This enabled providers to utilize the architecture across their end to end infrastructure in a variety of router roles. SeeCisco DWDM Network HardwareRouted Optical Networking shifts an expensive and now often redundanttransponder function into a pluggable transceiver. However, to make the mostefficient use of a valuable resource, the underlying fiber optic network, westill need a DWDM layer. Routed Optical Networking is flexible enough to workacross point to point, ROADM based optical networks, or a mix of both. Ciscomultiplexers, amplifiers, and ROADMs can satisfy any network need. See thevalidated design hardware section for more information.Routed Optical Networking Network Use CasesCisco is embracing Routed Optical Networking in every SP router role. Access,aggregation, core, peering, DCI, and even PE routers can be enabled with highspeed DCO optics. Routed Optical Networking is also not limited to SP networks,there are applications across enterprise, government, and education networks.Where to use 400ZR and where to use OpenZR+The OIF 400ZR and OpenZR+ MSA standards have important differences.400ZR supports 400G rates only, and targets metro distance point to pointconnections up to 120km. 400ZR mandates a strict power consumption of 15W aswell. Networks requiring only 400G over distances less than 120km may benefitfrom using 400ZR optics. DCI and 3rd party peering interconnection are good usecases for 400ZR.If a provider needs flexibility in rates and distances and wants to standardizeon a single optics type, OpenZR+ can fulfill the need. In areas of the networkwhere 400G may not be needed, OpenZR+ optics can be run at 100G or 200G.Additionally, hardware with QSFP-DD 100G ports can utilize OpenZR+ optics in100G mode. This can be ideal for high density access and aggregation networks.Supported DWDM Optical TopologiesFor those unfamiliar with DWDM hardware, please see the overview of DWDM networkhardware in Appendix AThe future of networks may be a flat L3 network with simple point to pointinterconnection, but it will take time to migrate to this type of architecture.Routed Optical Network supports an evolution to the architecture by working overmost modern photonic DWDM networks. Below gives just a few of the supportedoptical topologies including both point to point and ROADM networks.64 Channel FOADM P2P DeploymentThis example provides up to 25.6Tb on a single network span, and highlights thesimplicity of the Routed Optical Networking solution. 
The “optical” portion ofthe network including the ZR/ZR+ configuration can be completed in a matter ofminutes from start to finish.Colorless Add/Drop DeploymentUsing the NCS2K-MF-6AD-CFS colorless NCS2K-MF-LC module along with the LC16 LCaggregation module, and SMR20-FS ROADM module, a scalable colorless add/dropcomplex can be deployed to support 400ZR and OpenZR+.Multi-Degree ROADM DeploymentIn this example a 3 degree ROADM node is shown with a local add/drop degree. TheRouted Optical Networking solution fully supports ROADM based networks withoptical bypass. The traffic demands of the network will dictate the mostefficient network build. In cases where an existing or new build requires DWDMswitching capability, ZR and ZR+ wavelengths are easily provisioned over theinfrastructure.Long-Haul DeploymentCisco has demonstrated in a physical lab 400G OpenZR+ services provisionedacross 1200km using NCS 2000 optical line systems. 300G, 200G, and 100G signalscan achieve even greater distances. OpenZR+ is not just for shorter reachapplications, it fulfills an ideal sweet spot in most provider networks in termsof bandwidth and reach.Core NetworksLong-haul core networks also benefit from the CapEx and OpEx savings of movingto Routed Optical Networking. Moving to a simpler IP enabled convergedinfrastructure makes networks easier to manage and operate vs. networks withcomplex underlying optical infrastructure. The easiest place to start in thejourney is replacing external transponders with OpenZR+ QSFP-DD transceivers. At400G connecting a 400G gray Ethernet port to a transponder with a 400G or 600Gline side is not cost or environmentally efficient. Cisco can assist in modeling your core network to determine the TCO of Routed Optical Networking compared to traditional approaches.Metro AggregationTiered regional or metro networks connecting hub locations to larger aggregation site or datacenters can also benefit from Routed Optical Networking. Whether deployed in a hub and spoke topology or hop by hop IP ring, Routed Optical Networking satisfied provider’s growth demands at a lower cost than traditional approaches.AccessAccess deployments in a ring or point-to-point topology are ideal for Routed Optical Networking. Shorter distances over dark fiber may not require active optical equipment, and with up to 400G per span may provide the bandwidthnecessary for growth over a number of years without the use of additional multiplexers.DCI and 3rd Party Location InterconnectIn this use case, Routed Optical Networking simplifies deployments by eliminating active transponders, reducing power, space, and cabling requirements between end locations. 25.6Tbps of bandwidth is available over a single fiber using 64 400G wavelengths and simple optical amplifiers and multiplexers requiring no additional configuration after initial turn-up.Routed Optical Networking Architecture HardwareAll Routed Optical Networking solution routers are powered by Cisco IOS-XR.Routed Optical Networking Validated RoutersBelow is a non-exhaustive snapshot of platforms validated for use with ZR andOpenZR+ transceivers. Cisco supports Routed Optical Networking in the NCS 540,NCS 5500/5700, ASR 9000, and Cisco 8000 router families. 
The breadth of coverage enables the solution across all areas of the network.Cisco 8000 SeriesThe Cisco 8000 and its Silicon One NPU represent the next generation in routers, providing unprecedented capacity at the lowest power consumption while supporting a rich feature set applicable to a number of network roles.See more information on Cisco 8000 at https#//www.cisco.com/c/en/us/products/collateral/routers/8000-series-routers/datasheet-c78-742571.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/cisco8000/Interfaces/73x/configuration/guide/b-interfaces-config-guide-cisco8k-r73x/m-zr-zrp-cisco-8000.htmlCisco 5700 Systems and NCS 5500 Line CardsThe Cisco 5700 family of fixed and modular systems and line cards is flexible enough to use at any location in the network. The platform has seen widespread use in peering, core, and aggregation networks.See more information on Cisco NCS 5500 and 5700 at https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-736270.html and https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-744698.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/interfaces/73x/configuration/guide/b-interfaces-hardware-component-cg-ncs5500-73x/m-zr-zrp.htmlASR 9000 SeriesThe ASR 9000 is the most widely deployed SP router in the industry. It has a rich heritage dating back almost 20 years, but Cisco continues to innovate on the ASR 9000 platform. The ASR 9000 series now supports 400G QSFP-DD on a variety of line cards and the ASR 9903 2.4Tbps 3RU platform.See more information on Cisco ASR 9000 at https#//www.cisco.com/c/en/us/products/collateral/routers/asr-9000-series-aggregation-services-routers/data_sheet_c78-501767.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-3/interfaces/configuration/guide/b-interfaces-hardware-component-cg-asr9000-73x/m-zr-zrp.html#Cisco_Concept.dita_59215d6f-1614-4633-a137-161ebe794673NCS 500 SeriesThe 1Tbps N540-24QL16DD-SYS high density router brings QSFP-DD and Routed Optical Networking ZR/OpenZR+ optics to a flexible access and aggregation platform. Using OpenZR+ optics it allows a migration path from 100G to 400G access rings or uplinks when used in an aggregation role.See more information on Cisco NCS 540 at https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-500-series-routers/ncs-540-large-density-router-ds.htmlRouted Optical Networking Optical HardwareBelow is an overview of some of the supported equipment used to build the DWDM layer of the Routed Optical Networking Solution.Network Convergence System 2000The NCS 2000 Optical Line System is a flexible platform supporting all modern optical topologies and deployment use cases. Simple point to point to multi-degree CDC deployments are all supported as part of Routed Optical Networking.See more information on the NCS 2000 series at https#//www.cisco.com/c/en/us/products/optical-networking/network-convergence-system-2000-series/index.htmlNetwork Convergence System 1000 MultiplexerThe NCS1K-MD-64-C is a new fixed multiplexer designed specifically for the 400G 75Ghz 400ZR and OpenZR+ wavelengths, allowing up to 25.6Tbps on a single fiber.Network Convergence System 1001The NCS 1001 is utilized in point to point network spans as an amplifier and optionally a protection switch. 
The NCS 1001 now has specific support for 75Ghzspaced 400ZR and OpenZR+ wavelengths, with the ability to monitor incomingwavelengths for power. The 1001 features the ability to determine the properamplifier gain setpoints based on the desired user power levels.See more information on the NCS 1001 at https#//www.cisco.com/c/en/us/products/collateral/optical-networking/network-convergence-system-1000-series/datasheet-c78-738782.htmlRouted Optical Networking AutomationOverviewRouted Optical Networking by definition is a disaggregated optical solution,creating efficiency by moving coherent endpoints in the router. The solutionrequires a new way of managing the network, one which unifies the IP and Opticallayers, replacing the traditional siloed tools used in the past. Realtransformation in operations comes from unifying teams and workflows, ratherthan trying to make an existing tool fit a role it was not originally designedfor. Cisco’s standards based hierarchical SDN solution allows providers tomanage a multi-vendor Routed Optical Networking solution using standardinterfaces and YANG models.IETF ACTN SDN FrameworkThe IETF Action and Control of Traffic Engineered Networks group (ACTN) hasdefined a hierarchical controller framework to allow vendors to plug componentsinto the framework as needed. The lowest level controller, the ProvisioningNetwork Controller (PNC), is responsible for managing physical devices. Thesecontroller expose their resources through standard models and interface to aHierarchical Controller (HCO), called a Multi-Domain Service Controller (MDSC)in the ACTN framework.Note that while Cisco is adhering to the IETF framework proposed in RFC8453 , Cisco is supporting the mostwidely supported industry standards for controller to controller communicationand service definition. In optical the de facto standard is Transport API fromthe ONF for the management of optical line system networks and optical services.In packet we are leveraging Openconfig device models where possible and IETFmodels for packet topology (RFC8345) and xVPN services (L2NM and L3NM)Cisco’s SDN Controller Automation StackAligning to the ACTN framework, Cisco’s automation stack includes amulti-vendor IP domain controller (PNC), optical domain controller (PNC), andmulti-vendor hierarchical controller (HCO/MDSC).Cisco Open AutomationCisco believes not all providers consume automation in the same way, so we arededicated to make sure we have open interfaces at each layer of the networkstack. At the device level, we utilize standard NETCONF, gRPC, and gNMIinterfaces along with native, standard, and public consortium YANG models. Thereis no aspect of a Cisco IOS-XR router today not covered by YANG models. At thedomain level we have Cisco’s network controllers, which use the same standardinterfaces to communicate with devices and expose standards based NBIs. Ourmulti-layer/multi-domain controller likewise uses the same standard interfaces.Crosswork Hierarchical ControllerResponsible for Multi-Layer Automation is the Crosswork Hierarchical Controller. Crosswork Hierarchical Controller is responsible for the following network functions# CW HCO unifies data from the IP and optical networks into a single networkmodel. HCO utilizes industry standard IETF topology models for IP and TAPI foroptical topology and service information. HCO can also leverage legacy EMS/NMSsystems or device interrogation. Responsible for managing multi-layer Routed Optical Networking links using asingle UI. 
Providing assurance at the IP and optical layers in a single tool. Thenetwork model allows users to quickly correlate faults and identify at whichlayer faults have occurred. Additional HCO applications include the Root Cause Analysis tool, able toquickly correlate upper layer faults to an underlying cause.Please see the following resources for more information on Crosswork HCO. https#//www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/solution-overview-c22-744695.htmlCrosswork Network ControllerCrosswork Network Controller is a multi-vendor IP domain controller. CrossworkNetwork Controller is responsible for the following IP network functions. Collecting Ethernet, IP, RSVP-TE, and SR network information for internalapplications and exposing northbound via IETF RFC 8345 topology models Collecting traffic information from the network for use with CNC’s trafficoptimization application, Crosswork Optimization Engine Perform provisioning of SR-TE, RSVP-TE, L2VPN, and L3VPN using standardindustry models (IETF TEAS-TE, L2NM, L3NM) via UI or northbound API Visualization and assurance of SR-TE, RSVP-TE, and xVPN services Use additional Crosswork applications to perform telemetry collection/alerting,zero-touch provisioning, and automated and assurance network changesMore information on Crosswork and Crosswork Network Controller can be found at https#//www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/datasheet-c78-743456.htmlCisco Optical Network ControllerCisco Optical Network Controller (Cisco ONC) is responsible for managing Cisco optical line systems and circuit services. Cisco ONC exposes a ONF TAPI northbound interface, the de facto industry standard for optical network management. Cisco ONC runs as an application on the same Crosswork Infrastructure as CNC.More information on Cisco ONC can be found at https#//www.cisco.com/c/en/us/support/optical-networking/optical-network-controller/series.htmlCisco Network Services Orchestrator and Routed Optical Networking ML Core Function PackCisco NSO is the industry standard for service orchestration and deviceconfiguration management. The RON-ML CFP can be used to fully configure an IPlink between routers utilizing 400ZR/OpenZR+ optics over a Cisco optical linesystem using Cisco ONC. This includes IP addressing and adding links to anexisting Ethernet LAG. The CFP can also support optical-only provisioning on therouter to fit into existing optical provisioning workflows.Routed Optical Networking Service ManagementSupported Provisioning MethodsWe support multiple ways to provision Routed Optical Networking services based on existing provider workflows. Unified IP and Optical using Crosswork Hierarchical Controller Unified IP and Optical using Cisco NSO Routed Optical Networking Multi-Layer Function Pack ZR/ZR+ Optics using IOS-XR CLI ZR/ZR+ Optics using IOS-XR NetconfOpenZR+ and 400ZR PropertiesZR/ZR+ Supported FrequenciesThe frequency on Cisco ZR/ZR+ transceivers may be set between 191.275Thz and196.125Thz in increments of 6.25Ghz, supporting flex spectrum applications. Tomaximize the available C-Band spectrum, these are the recommended 6475Ghz-spaced channels, also aligning to the NCS1K-MD-64-C fixed channel add/dropmultiplexer.                 
196.100 196.025 195.950 195.875 195.800 195.725 195.650 195.575 195.500 195.425 195.350 195.275 195.200 195.125 195.050 194.975 194.900 194.825 194.75 194.675 194.600 194.525 194.450 194.375 194.300 194.225 194.150 194.075 194.000 193.925 193.850 193.775 193.700 193.625 193.550 193.475 193.400 193.325 193.250 193.175 193.100 193.025 192.950 192.875 192.800 192.725 192.650 192.575 192.500 192.425 192.350 192.275 192.200 192.125 192.050 191.975 191.900 191.825 191.750 191.675 191.600 191.525 191.450 191.375 Supported Line Side Rate and ModulationOIF 400ZR transceivers support 400G only per the OIF specification. OpenZR+transceivers can support 100G, 200G, 300G, or 400G line side rate. See routerplatform documentation for supported rates. The modulation is determined by theline side rate. 400G will utilize 16QAM, 300G 8QAM, and 200G/100G rates willutilize QPSK.Crosswork Hierarchical Controller UI ProvisioningEnd-to-End IP+Optical provisioning can be done using Crosswork Hierarchical Controller’s GUI IP Linkprovisioning. Those familiar with traditional GUI EMS/NMS systems for servicemanagement will have a very familiar experience. Crosswork Hierarchical Controller provisioning will provisionboth the router optics as well as the underlying optical network to support theZR/ZR+ wavelength.Inter-Layer Link DefinitionEnd to end provisioning requires first defining the Inter-Layer link between therouter ZR/ZR+ optics and the optical line system add/drop ports. This is doneusing a GUI based NMC (Network Media Channel) Cross-Link application in Crosswork HCO.The below screenshot shows defined NMC cross-links.IP Link ProvisioningOnce the inter-layer links are created, the user can then proceed inprovisioning an end to end circuit. The provisioning UI takes as input the tworouter endpoints, the associated ZR/ZR+ ports, and the IP addressing or bundlemembership of the link. The optical line system provisioning is abstracted fromthe user, simplifying the end to end workflow. The frequency and power isautomatically derived by Cisco Optical Network Controller based on the add/dropport and returned as a parameter to be used in router optics provisioning.Operational DiscoveryThe Crosswork Hierarchical Controller provisioning process also performs a discovery phase to ensure theservice is operational before considering the provisioning complete. Ifoperational discovery fails, the end to end service will be rolled back.NSO RON-ML CFP ProvisioningProviders familiar with using Cisco Network Service Orchestrator have an optionto utilize NSO to perform IP+Optical provisioning of Routed Optical Networkingservices. Cisco has created the Routed Optical Network Multi-Layer Core FunctionPack, RON-ML CFP to perform end to end provisioning of services. Theaforementioned Crosswork HCO provisioning utilizes the RON-ML CFP to perform end deviceprovisioning.Please see the Cisco Routed Optical Networking RON-ML CFP documentation located atRouted Optical Networking Inter-Layer LinksSimilar to the use case with CW HCO provisioning, before end to end provisioningcan be performed, inter-layer links must be provisioned between the opticalZR/ZR+ port and the optical line system add/drop port. This is done using the“inter-layer-link” NSO service. The optical end point can be defined as either aTAPI SIP or by the TAPI equipment inventory identifier. Inter-layer links are not required for router-only provisioning.RON-ML End to End ServiceThe RON-ML service is responsible for end to end IP+optical provisioning. 
RON-MLsupports full end to end provisioning, router-only provisioning, or optical-onlyprovisioning where only the router ZR/ZR+ configuration is performed. Thefrequency and transmit power can be manually defined or optionally provided byCisco ONC when end to end provisioning is performed.RON-ML API ProvisioningUse the following URL for NSO provisioning# http#//<nso host>/restconf/dataInter-Layer Link Service{ ~data~# { ~cisco-ron-cfp#ron~# { ~inter-layer-link~# [ { ~end-point-device~# ~ron-8201-1~, ~line-port~# ~0/0/0/20~, ~ols-domain~# { ~network-element~# ~ron-ols-1~, ~optical-add-drop~# ~1/2008/1/13,14~, ~optical-controller~# ~onc-real-new~ } } ] } }}Provisioning ZR+ optics and adding interface to Bundle-Ether 100 interface{ ~cisco-ron-cfp#ron~# { ~ron-ml~# [ { ~name~# ~E2E_Bundle_ZRP_ONC57_2~, ~mode~# ~transponder~, ~bandwidth~# ~400~, ~circuit-id~# ~E2E Bundle ONC-57 S9|chan11 - S10|chan11~, ~grid-type~# ~100mhz-grid~, ~ols-domain~# { ~service-state~# ~UNLOCKED~ }, ~end-point~# [ { ~end-point-device~# ~ron-8201-1~, ~terminal-device-optical~# { ~line-port~# ~0/0/0/11~, ~transmit-power~# -100 }, ~ols-domain~# { ~end-point-state~# ~UNLOCKED~ }, ~terminal-device-packet~# { ~bundle~# [ { ~id~# 100 } ], ~interface~# [ { ~index~# 0, ~membership~# { ~bundle-id~# 100, ~mode~# ~active~ } } ] } }, { ~end-point-device~# ~ron-8201-2~, ~terminal-device-optical~# { ~line-port~# ~0/0/0/11~, ~transmit-power~# -100 }, ~ols-domain~# { ~end-point-state~# ~UNLOCKED~ }, ~terminal-device-packet~# { ~bundle~# [ { ~id~# 100 } ], ~interface~# [ { ~index~# 0, ~membership~# { ~bundle-id~# 100, ~mode~# ~active~ } } ] } } ] } ] } }IOS-XR CLI ConfigurationConfiguring the router portion of the Routed Optical Networking link is verysimple. All optical configuration related to the ZR/ZR+ optics configuration islocated under the optics controller relevent to the faceplate port. Defaultconfiguration the optics will be in an up/up state using a frequency of193.10Thz.The basic configuration with a specific frequency of 195.65 Thz is located below, the only required component is the bolded channel frequency setting.ZR/ZR+ Optics Configurationcontroller Optics0/0/0/20 transmit-power -100 dwdm-carrier 100MHz-grid frequency 1956500 logging events link-statusIOS-XR NETCONF ConfigurationAll configuration performed in IOS-XR today can also be done using NETCONF/YANG. The following payload exhibits the models and configuration used to perform router optics provisioning. 
This is a more complete example showing the FEC, power, modulation, and line side rate (200G) configuration.

<data xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
 <interface-configurations xmlns="http://cisco.com/ns/yang/Cisco-IOS-XR-ifmgr-cfg">
  <interface-configuration>
   <active>act</active>
   <interface-name>Optics0/0/0/20</interface-name>
   <description>Managed by NSO .58, do not change manually</description>
   <optics xmlns="http://cisco.com/ns/yang/Cisco-IOS-XR-controller-optics-cfg">
    <optics-transmit-power>-100</optics-transmit-power>
    <optics-performance-monitoring>true</optics-performance-monitoring>
    <optics-modulation>16qam</optics-modulation>
    <optics-fec>fec-ofec</optics-fec>
    <optics-dwdm-carrier>
     <grid-type>100Mhz-grid</grid-type>
     <param-type>frequency</param-type>
     <param-value>1956500</param-value>
    </optics-dwdm-carrier>
   </optics>
   <breakout xmlns="http://cisco.com/ns/yang/Cisco-IOS-XR-optics-driver-cfg">2x100</breakout>
  </interface-configuration>
 </interface-configurations>
</data>

Routed Optical Networking Assurance

Crosswork Hierarchical Controller

Multi-Layer Path Trace

Using topology and service data from both the IP and optical networks, CW HCO can display the full service from the IP services layer down to the physical fiber. Below is an example of the “waterfall” trace view from the OTS (Fiber) layer to the Segment Routing TE layer across all layers. CW HCO identifies specific Routed Optical Networking links using ZR/ZR+ optics as seen by the ZRC (ZR Channel) and ZRM (ZR Media) layers from the 400ZR specification. When faults occur at a specific layer, the faults are highlighted in red, quickly identifying the layer at which a fault has occurred.

Routed Optical Networking Link Assurance

The Link Assurance application isolates the multi-layer path of a single Routed Optical Networking service, showing both the router termination points as well as the optical layer. This information is further enhanced with telemetry data coming from both the ZR/ZR+ optics as well as the optical line system nodes. Optionally the user can view graphs of collected telemetry data to quickly identify trends or changes in specific operational data.

IOS-XR CLI Monitoring of ZR400/OpenZR+ Optics

Optics Controller

The optics controller represents the physical layer of the optics.
In the caseof ZR/ZR+ optics this includes the frequency information, RX/TX power, OSNR, andother associated physical layer information.RP/0/RP0/CPU0#ron-8201-1#show controllers optics 0/0/0/20Thu Jun 3 15#34#44.098 PDT Controller State# Up Transport Admin State# In Service Laser State# On LED State# Green FEC State# FEC ENABLED Optics Status Optics Type# QSFPDD 400G ZR DWDM carrier Info# C BAND, MSA ITU Channel=10, Frequency=195.65THz, Wavelength=1532.290nm Alarm Status# ------------- Detected Alarms# None LOS/LOL/Fault Status# Alarm Statistics# ------------- HIGH-RX-PWR = 0 LOW-RX-PWR = 0 HIGH-TX-PWR = 0 LOW-TX-PWR = 4 HIGH-LBC = 0 HIGH-DGD = 1 OOR-CD = 0 OSNR = 10 WVL-OOL = 0 MEA = 0 IMPROPER-REM = 0 TX-POWER-PROV-MISMATCH = 0 Actual TX Power = -7.17 dBm RX Power = -9.83 dBm RX Signal Power = -9.18 dBm Frequency Offset = 9 MHz Baud Rate = 59.8437500000 GBd Modulation Type# 16QAM Chromatic Dispersion 6 ps/nm Configured CD-MIN -2400 ps/nm CD-MAX 2400 ps/nm Second Order Polarization Mode Dispersion = 34.00 ps^2 Optical Signal to Noise Ratio = 35.50 dB Polarization Dependent Loss = 1.20 dB Polarization Change Rate = 0.00 rad/s Differential Group Delay = 2.00 psPerformance Measurement DataRP/0/RP0/CPU0#ron-8201-1#show controllers optics 0/0/0/20 pm current 30-sec optics 1Thu Jun 3 15#39#40.428 PDTOptics in the current interval [15#39#30 - 15#39#40 Thu Jun 3 2021]Optics current bucket type # Valid MIN AVG MAX Operational Configured TCA Operational Configured TCA Threshold(min) Threshold(min) (min) Threshold(max) Threshold(max) (max)LBC[% ] # 0.0 0.0 0.0 0.0 NA NO 100.0 NA NOOPT[dBm] # -7.17 -7.17 -7.17 -15.09 NA NO 0.00 NA NOOPR[dBm] # -9.86 -9.86 -9.85 -30.00 NA NO 8.00 NA NOCD[ps/nm] # -489 -488 -488 -80000 NA NO 80000 NA NODGD[ps ] # 1.00 1.50 2.00 0.00 NA NO 80.00 NA NOSOPMD[ps^2] # 28.00 38.80 49.00 0.00 NA NO 2000.00 NA NOOSNR[dB] # 34.90 35.12 35.40 0.00 NA NO 40.00 NA NOPDL[dB] # 0.70 0.71 0.80 0.00 NA NO 7.00 NA NOPCR[rad/s] # 0.00 0.00 0.00 0.00 NA NO 2500000.00 NA NORX_SIG[dBm] # -9.23 -9.22 -9.21 -30.00 NA NO 1.00 NA NOFREQ_OFF[Mhz]# -2 -1 4 -3600 NA NO 3600 NA NOSNR[dB] # 16.80 16.99 17.20 7.00 NA NO 100.00 NA NOCoherent DSP ControllerThe coherent DSP controller represents the framing layer of the optics. 
It includes Bit Error Rate, Q-Factor, and Q-Margin information.

RP/0/RP0/CPU0:ron-8201-1#show controllers coherentDSP 0/0/0/20
Sat Dec 4 17:24:38.245 PST

Port                       : CoherentDSP 0/0/0/20
Controller State           : Up
Inherited Secondary State  : Normal
Configured Secondary State : Normal
Derived State              : In Service
Loopback mode              : None
BER Thresholds             : SF = 1.0E-5  SD = 1.0E-7
Performance Monitoring     : Enable
Bandwidth                  : 400.0Gb/s

Alarm Information:
LOS = 10  LOF = 0  LOM = 0
OOF = 0  OOM = 0  AIS = 0
IAE = 0  BIAE = 0  SF_BER = 0
SD_BER = 0  BDI = 0  TIM = 0
FECMISMATCH = 0  FEC-UNC = 0  FLEXO_GIDM = 0
FLEXO-MM = 0  FLEXO-LOM = 3  FLEXO-RDI = 0
FLEXO-LOF = 5
Detected Alarms : None

Bit Error Rate Information
PREFEC BER : 1.7E-03
POSTFEC BER : 0.0E+00
Q-Factor : 9.30 dB
Q-Margin : 2.10 dB
FEC mode : C_FEC

Performance Measurement Data

RP/0/RP0/CPU0:ron-8201-1#show controllers coherentDSP 0/0/0/20 pm current 30-sec fec
Thu Jun 3 15:42:28.510 PDT

g709 FEC in the current interval [15:42:00 - 15:42:28 Thu Jun 3 2021]
FEC current bucket type : Valid
EC-BITS : 20221314973   Threshold : 83203400000   TCA(enable) : YES
UC-WORDS : 0            Threshold : 5             TCA(enable) : YES

                MIN       AVG       MAX       Threshold  TCA       Threshold  TCA
                                              (min)      (enable)  (max)      (enable)
PreFEC BER   : 1.5E-03   1.5E-03   1.6E-03   0E-15      NO        0E-15      NO
PostFEC BER  : 0E-15     0E-15     0E-15     0E-15      NO        0E-15      NO
Q[dB]        : 9.40      9.40      9.40      0.00       NO        0.00       NO
Q_Margin[dB] : 2.20      2.20      2.20      0.00       NO        0.00       NO

Cisco IOS-XR Model-Driven Telemetry for ZR/ZR+ Monitoring

All operational data on IOS-XR routers can be monitored using streaming telemetry based on YANG models. Routed Optical Networking is no different, so a wealth of information can be streamed from the routers in intervals as low as 5s. The following represents a list of validated sensor paths useful for monitoring the DCO optics in IOS-XR and the data fields available within these sensor paths. Note PM fields also support 15m and 24h paths in addition to the 30s paths shown in the table below.

Sensor Path: Cisco-IOS-XR-controller-optics-oper:optics-oper/optics-ports/optics-port/optics-info
Fields: alarm-detected, baud-rate, dwdm-carrier-frequency, controller-state, laser-state, optical-signal-to-noise-ratio, temperature, voltage

Sensor Path: Cisco-IOS-XR-controller-optics-oper:optics-oper/optics-ports/optics-port/optics-lanes/optics-lane
Fields: receive-power, receive-signal-power, transmit-power

Sensor Path: Cisco-IOS-XR-controller-otu-oper:otu/controllers/controller/info
Fields: bandwidth, ec-value, post-fec-ber, pre-fec-ber, qfactor, qmargin, uc

Sensor Path: Cisco-IOS-XR-pmengine-oper:performance-management/optics/optics-ports/optics-port/optics-current/optics-second30/optics-second30-optics/optics-second30-optic
Fields: dd__average, dgd__average, opr__average, opt__average, osnr__average, pcr__average, pmd__average, rx-sig-pow__average, snr__average, sopmd__average

Sensor Path: Cisco-IOS-XR-pmengine-oper:performance-management/otu/otu-ports/otu-port/otu-current/otu-second30/otu-second30fecs/otu-second30fec
Fields: ec-bits__data, post-fec-ber__average, pre-fec-ber__average, q__average, qmargin__average, uc-words__data
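As an illustration of how these sensor paths can be consumed, below is a minimal sketch of an IOS-XR model-driven telemetry dial-out configuration that streams the optics and FEC operational data every 30 seconds to an external collector. The collector address (192.0.2.10), port, and the group and subscription names are placeholders for this example and should be adapted to the actual deployment.

telemetry model-driven
 destination-group DG-ZR-COLLECTOR
  address-family ipv4 192.0.2.10 port 57000
   encoding self-describing-gpb
   protocol grpc no-tls
  !
 !
 sensor-group SG-ZR-OPTICS
  sensor-path Cisco-IOS-XR-controller-optics-oper:optics-oper/optics-ports/optics-port/optics-info
  sensor-path Cisco-IOS-XR-pmengine-oper:performance-management/optics/optics-ports/optics-port/optics-current/optics-second30/optics-second30-optics/optics-second30-optic
 !
 sensor-group SG-ZR-FEC
  sensor-path Cisco-IOS-XR-controller-otu-oper:otu/controllers/controller/info
 !
 subscription SUB-ZR-PERF
  sensor-group-id SG-ZR-OPTICS sample-interval 30000
  sensor-group-id SG-ZR-FEC sample-interval 30000
  destination-id DG-ZR-COLLECTOR
 !
!

A gRPC dial-out collector such as Telegraf, covered in the next section, can terminate this stream and store the data for dashboarding.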
Open-source ZR/ZR+ Monitoring

Cisco model-driven telemetry along with the open source collector Telegraf and the open source dashboard software Grafana can be used to quickly build powerful dashboards to monitor ZR/ZR+ performance.

Additional Resources

Cisco Routed Optical Networking Home: https://www.cisco.com/c/en/us/solutions/service-provider/routed-optical-networking.html
Cisco Routed Optical Networking Tech Field Day
 Solution Overview: https://techfieldday.com/video/build-your-network-with-cisco-routed-optical-networking-solution/
 Automation Demo: https://techfieldday.com/video/cisco-routed-optical-networking-solution-demo/
Cisco Champion Podcasts
 Cisco Routed Optical Networking Solution for the Next Decade: https://smarturl.it/CCRS8E24
 Simplify Network Operations with Crosswork Hierarchical Controller: https://smarturl.it/CCRS8E48
Cisco Routed Optical Networking 1.0 Solution Guide

Appendix A

Acronyms

DWDM - Dense Wavelength Division Multiplexing
OADM - Optical Add Drop Multiplexer
FOADM - Fixed Optical Add Drop Multiplexer
ROADM - Reconfigurable Optical Add Drop Multiplexer
DCO - Digital Coherent Optics
FEC - Forward Error Correction
OSNR - Optical Signal to Noise Ratio
BER - Bit Error Rate

DWDM Network Hardware Overview

Optical Transmitters and Receivers

Optical transmitters provide the source signals carried across the DWDM network. They convert digital electrical signals into a photonic light stream on a specific wavelength. Optical receivers detect pulses of light and convert the signals back to electrical signals. In Routed Optical Networking, digital coherent QSFP-DD OpenZR+ and 400ZR transceivers in routers are used as optical transmitters and receivers.

Multiplexers/Demultiplexers

Multiplexers take multiple wavelengths on separate fibers and combine them into a single fiber. The output of a multiplexer is a composite signal. Demultiplexers take composite signals that compatible multiplexers generate and separate the individual wavelengths into individual fibers.

Optical Amplifiers

Optical amplifiers amplify an optical signal, increasing the total power of the optical signal to enable signal transmission across longer distances. Without amplifiers, the signal attenuation over longer distances makes it impossible to coherently receive signals. We use different types of optical amplifiers in optical networks.
For example# preamplifiers,booster amplifiers, inline amplifiers, and optical line amplifiers.Optical add/drop multiplexers (OADMs)OADMs are devices capable of adding one or more DWDM channels into or droppingthem from a fiber carrying multiple channels.Reconfigurable optical add/drop multiplexers (ROADMs)ROADMs are programmable versions of OADMs. With ROADMs, you can change thewavelengths that are added or dropped. ROADMs make optical networks flexible andeasily modifiable.", "url": "/blogs/2022-01-07-cst-routed-optical-1_0/", "author": "Phil Bedard", "tags": "iosxr, design, optical, ron, routing" } , "#": {} , "blogs-latest-converged-sdn-transport-hld": { "title": "Converged SDN Transport High Level Design v5.0", "content": " On This Page Revision History Minimum supported IOS-XR Release Minimum supported IOS-XE Release Value Proposition Summary Technical Overview Hardware Components in Design Cisco 8000 ASR 9000 NCS-560 NCS 5504, 5508, 5516 Modular Chassis NCS 5500 / 5700 Fixed Chassis NCS 540 Small, Medium, Large Density, and Fronthaul routers NCS-55A2-MOD NCS-57C3-MOD ASR 920 NCS 520 Transport – Design Components Network Domain Structure Topology options and PE placement - Inline and non-inline PE Cisco Routed Optical Networking Connectivity using 100G/200G digital coherent optics w/MACSec Routed Optical Networking ring deployment without multiplexers Routed Optical Networking deployment with multiplexer Unnumbered Interface Support Intra-Domain Operation Intra-Domain Routing and Forwarding Intra-Domain Forwarding - Fast Re-Route using TI-LFA Inter-Domain Operation Inter-Domain Forwarding Area Border Routers – Prefix-SID and Anycast-SID Inter-Domain Forwarding - High Availability and Fast Re-Route Inter-Domain Open Ring Support Transport Programmability Traffic Engineering (Tactical Steering) – SR-TE Policy Traffic Engineering (Tactical Steering) - Per-Flow SR-TE Policy Traffic Engineering - Dynamic Anycast-SID Paths and Black Hole Avoidance Transport Controller Path Computation Engine (PCE) Segment Routing Path Computation Element (SR-PCE) PCE Controller Summary – SR-PCE Converged SDN Transport Path Computation Workflows Static SR-TE Policy Configuration On-Demand Next-Hop Driven Configuration Segment Routing Flexible Algorithms (Flex-Algo) Flex-Algo Node SID Assignment Flex-Algo IGP Definition Path Computation across SR Flex-Algo Network Flex-Algo Dual-Plane Example Segment Routing and Unified MPLS (BGP-LU) Co-existence Summary ABR BGP-LU design Quality of Service and Assurance Overview NCS 540, 560, 5500, and 5700 QoS Primer Cisco 8000 QoS Support for Time Sensitive Networking in N540-FH-CSR-SYS and N540-FH-AGG-SYS Hierarchical Edge QoS H-QoS platform support CST Core QoS mapping with five classes Example Core QoS Class and Policy Maps Class maps for ingress header matching Class maps for egress queuing and marking policies Egress QoS queuing policy Egress QoS marking policy L3 Multicast using Segment Routing Tree-SID Tree SID Diagram Tree-SID Overview Static Tree-SID Dynamic Tree-SID using BGP mVPN Control-Plane L3 IP Multicast and mVPN using mLDP LDP Auto-configuration LDP mLDP-only Session Capability (RFC 7473) LDP Unicast FEC Filtering for SR Unicast with mLDP Multicast Converged SDN Transport Use Cases 4G and 5G Mobile Networks Summary and 5G Service Types Key Validated Components End to End Timing Validation Low latency SR-TE path computation Dynamic Link Performance Measurement SR Policy latency constraint configuration on configured policy SR Policy latency constraint 
configuration for ODN policies Dynamic link delay metric configuration Static defined link delay metric TE metric definition SR Policy one-way delay measurement IP Endpoint Delay Measurement Global Routing Table IP Endpoint Delay Measurement VRF IP Endpoint Delay Measurement Segment Routing Flexible Algorithms for 5G Slicing End to end network QoS with H-QoS on Access PE CST QoS mapping with 5 classes FTTH Design using EVPN E-Tree Summary E-Tree Diagram E-Tree Operation Split-Horizon Groups L3 IRB Support Multicast Traffic Ease of Configuration Cisco Cloud Native Broadband Network Gateway Cisco cnBNG Architecture cnBNG Control Plane cnBNG User Plane Cable Converged Interconnect Network (CIN) Summary Distributed Access Architecture Remote PHY Components and Requirements Remote PHY Device (RPD) RPD Network Connections Cisco cBR-8 and cnBR cBR-8 Network Connections cBR-8 Redundancy Remote PHY Communication DHCP Remote PHY Standard Flows GCP UEPI and DEPI L2TPv3 Tunnels CIN Network Requirements IPv4/IPv6 Unicast and Multicast Network Timing CST 4.0+ Update to CIN Timing Design QoS DHCPv4 and DHCPv6 Relay Converged SDN Transport CIN Design Deployment Topology Options High Scale Design (Recommended) Collapsed Digital PIC and SUP Uplink Connectivity Collapsed RPD and cBR-8 DPIC Connectivity Cisco Hardware Scalable L3 Routed Design L3 IP Routing CIN Router to Router Interconnection Leaf Transit Traffic cBR-8 DPIC to CIN Interconnection DPIC Interface Configuration Router Interface Configuration RPD to Router Interconnection Native IP or L3VPN/mVPN Deployment SR-TE CIN Quality of Service (QoS) CST Network Traffic Classification CST and Remote-PHY Load Balancing SmartPHY RPD Automation 4G Transport and Services Modernization Business and Infrastructure Services using L3VPN and EVPN EVPN Multicast EVPN Centralized Gateway Multicast LDP to Converged SDN Transport Migration Towards Converged SDN Transport Design Segment Routing Enablement Segment Routing Mapping Server Design Automation Network Management using Cisco Crosswork Network Controller L2VPN Service Provisioning and Visualization L3VPN Service Provisioning and Visualization Crosswork Automated Assurance Zero Touch Provisioning Zero Touch Provisioning using Crosswork Network Controller Model-Driven Telemetry Transport and Service Management using Crosswork Network Controller Network Services Orchestrator Converged SDN Transport Supported Service Models Core Function Packs Example Function Packs Base Services Supporting Advanced Use Cases Overview Ethernet VPN (EVPN) Ethernet VPN Hardware Support Multi-Homed & All-Active Ethernet Access Service Provider Network - Integration with Central Office or with Data Center End-To-End (Flat) Services Hierarchical Services Hierarchical L2 Multipoint Multi-Homed/All-Active Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRB Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHE EVPN Centralized Gateway EVPN Head-End for L3 Services The Converged SDN Transport Design - Summary Revision History Version Date Comments 1.0 05/08/2018 Initial Converged SDN Transport publication 1.5 09/24/2018 NCS540 Access, ZTP, NSO Services 2.0 4/1/2019 Non-inline PE Topology, NCS-55A2-MOD, IPv4/IPv6/mLDP Multicast, LDP to SR Migration 3.0 1/20/2020 Converged Transport for Cable CIN, Multi-domain Multicast, Qos w/H-QoS access, MACSEC, Coherent Optic connectivity 3.5 10/15/2020 Unnumbered access rings, Anycast SID ABR Resiliency, E-Tree for FTTH deployments, SR Multicast 
using Tree-SID, NCS 560, SmartPHY for R-PHY, Performance Measurement 4.0 2/1/2020 SR Flexible Algorithms inc. Inter-Domain, PTP multi-profile inc. G.82751<>G.8275.2 interworking, G.8275.2 on BVI, ODN support for EVPN ELAN, TI-LFA Open Ring support, NCS 520, SR on cBR8 5.0 7/1/2022 Cisco 8000, Cloud Native BNG, EVPN-HE/EVPN-CGW, Dynamic Tree-SID, Routed Optical Networking, Crosswork Automation Minimum supported IOS-XR Release CST Version XR version 1.0 6.3.2 1.5 6.5.1 2.0 6.5.3 3.0 6.6.3 3.5 7.1.2 4.0 7.2.2 on NCS, 7.1.3 on ASR9K 5.0 7.5.2 on NCS, 8000, ASR 9000 (7.4.2 for cnBNG) Minimum supported IOS-XE Release CST Version XR version 4.0 16.12.03 on NCS 520, ASR920; 17.03.01w on cBR-8 5.0 16.12.03 on NCS 520, ASR920; 17.03.01w on cBR-8 Value PropositionService Providers are facing the challenge to provide next generationservices that can quickly adapt to market needs. New paradigms such as5G introduction, video traffic continuous growth, IoT proliferation andcloud services model require unprecedented flexibility, elasticity andscale from the network. Increasing bandwidth demands and decreasing ARPUput pressure on reducing network cost. At the same time, services needto be deployed faster and more cost effectively to stay competitive.Metro Access and Aggregation solutions have evolved from nativeEthernet/Layer 2 based, to Unified MPLS to address the above challenges.The Unified MPLS architecture provides a single converged networkinfrastructure with a common operational model. It has great advantagesin terms of network convergence, high scalability, high availability,and optimized forwarding. However, that architectural model is stillquite challenging to manage, especially on large-scale networks, becauseof the large number of distributed network protocols involved whichincreases operational complexity.Converged SDN Transport design introduces an SDN-ready architecturewhich evolves traditional Metro network design towards an SDN enabled,programmable network capable of delivering all services (Residential,Business, 4G/5G Mobile Backhaul, Video, IoT) on the premise ofsimplicity, full programmability, and cloud integration, with guaranteedservice level agreements (SLAs).The Converged SDN Transport design brings tremendous value to ServiceProviders# Fast service deployment and rapid time to market throughfully automated service provisioning and end-to-end networkprogrammability Operational simplicity with less protocols to operate and manage Smooth migration towards an SDN-ready architecture thanks tobackward-compatibility with existing network protocols and services Next generation service creation leveraging guaranteed SLAs Enhanced and optimized operations using telemetry/analytics inconjunction with automation tools The Converged SDN Transport design is targeted at Service Providercustomers who# Want to evolve their existing Unified MPLS Network Are looking for an SDN ready solution Need a simple, scalable design that can support future growth Want a future proof architecture built using industry-leading technology SummaryThe Converged SDN Transport design satisfies the following criteria for scalable next-generation networks# Simple# based on Segment Routing as unified forwarding plane andEVPN and L3VPN as a common BGP based services control plane Programmable# Using SR-PCE to program end-to-end multi-domain paths across thenetwork with guaranteed SLAs Automated # Service provisioning is fully automated using NSOand YANG models; Analytics with model driven telemetry inconjunction with 
Crosswork Network Controller toenhance operations and network visibility Technical OverviewThe Converged SDN Transport design evolves from the successful CiscoEvolved Programmable Network (EPN) 5.0 architecture framework, to bringgreater programmability and automation.In the Converged SDN Transport design, the transport and service are builton-demand when the customer service is requested. The end-to-endinter-domain network path is programmed through controllers and selectedbased on the customer SLA, such as the need for a low latency path.The Converged SDN Transport is made of the following main buildingblocks# IOS-XR as a common Operating System proven in Service ProviderNetworks Transport Layer based on Segment Routing as UnifiedForwarding Plane SDN - Segment Routing Path Computation Element (SR-PCE) as Cisco Path ComputationEngine (PCE) coupled with Segment Routing to provide simple andscalable inter-domain transport connectivity, TrafficEngineering, and advanced Path control with constraints Service Layer for Layer 2 (EVPN) and Layer 3 VPN services basedon BGP as Unified Control Plane Automation and Analytics NSO for service provisioning Netconf/YANG data models Telemetry to enhance and simplify operations Zero Touch Provisioning and Deployment (ZTP/ZTD) Hardware Components in DesignCisco 8000The Converged SDN Transport design now includes the Cisco 8000 family. Cisco 8000 routers provide the lowest power consumption in the industry, all while supporting systems over 200 Tbps and features service providers require. Starting in CST 5.0 the Cisco 8000 fulfills the role of core and aggregation router in the design. The 8000 provides transit for end to end unicast and multicast services including those using SR-TE and advanced capabilities such as SR Flexible Algorithms. Service termination is not supported on the 8000 in CST 5.0.ASR 9000The ASR 9000 is the router of choice for high scale edge services. The Converged SDN Transport utilizes the ASR 9000 in a PE function role, performing high scale L2VPN, L3VPN, and Pseudowire headend termination. All testing up to CST 3.0 has been performed using Tomahawk series line cards on the ASR 9000. Starting in CST 5.0 we introduce ASR 9000 Lightspeed+ high capacity line cards to the design. The ASR 9000 also serves as the user plane for Cisco’s distributed BNG architecture.NCS-560The NCS-560 with RSP4 is a next-generation platform with high scale and modularity to fit in many access, pre-aggregation, and aggregation roles. Available in 4-slot and 7-slot versions, the NCS 560 is fully redundant with a variety of 40GE/100GE, 10GE, and 1GE modular adapters. The NCS 560 RSP4 has built-in GNSS timing support along with a high scale (-E) version to support full Internet routing tables or large VPN routing tables with room to spare for 5+ years of growth. The NCS 560 provides all of this with a very low power and space footprint with a depth of 9.5”.NCS 5504, 5508, 5516 Modular ChassisThe modular chassis version of the NCS 5500 is available in 4, 8, and 16 slot versions for flexible interfaces at high scale with dual RP modules. A variety of line cards are available with 10G, 40G, 100G, and 400G interface support. The NCS 5500 fully supports timing distribution for applications needing high accuracy clocks like mobile backhaul.NCS 5500 / 5700 Fixed ChassisThe NCS 5500 / 5700 fixed series devices are validated in access, aggregation,and core role in the Converged SDN Transport design. 
All platforms listed belowsupport at least PTP class B timing and the full set of IOS-XR xVPN and SegmentRouting features.The NCS-55A1-48Q6H has 48x1GE/10GE/25GE interfaces and 6x40GE/100GE interfaces,supporting high density mobile and subscriber access aggregation applications.The NCS-55A1-24Q6H-S and NCS-55A1-24Q6H-SS have 24x1GE/10GE, 24x1GE/10GE/25GE,and 6x40GE/100GE interfaces. The 24Q6H-SS provides MACSEC support on allinterfaces. The NCS-55A1-24Q6H series also supports 10GE/25GE DWDM optics on allrelevant ports.NCS-55A1-48Q6HNCS-55A1-24Q6HThe NCS-57C1-48Q6D platform 32xSFP28 (1/10/25), 16xSFP56 (1/10/25/50), 4x400G QSFP-DD, and 2xQSFP-DD with 4x100G/2x100G support. ZR/RZ+ optics can be utilized on three of the 400G QSFP-DD interfaces.The NCS-55A1-36H and NCS-55A1-36H-SE provide 36x100GE in 1RU for denseaggregation and core needs. The NCS-57B1-6D24 and NCS-57B1-5DSE provide24xQSFP28 and either 5 or 6 QSFP-DD ports capable of 400GE with support for ZRand ZR+ optics.More information on the NCS 5500 fixed routers can be found at#https#//www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.htmlNCS 540 Small, Medium, Large Density, and Fronthaul routersThe NCS 540 family of routers supports mobile and business services across a widevariety of service provier and enterprise applications, including support forRouted Optical Networking in the QSFP-DD enabled NCS-540 Large Density router.More information on the NCS 540 router line can be found at#https#//www.cisco.com/c/en/us/products/routers/network-convergence-system-540-series-routers/index.htmlThe N540-FH-CSR-SYS and N540-FH-AGG-SYS Fronthaul routers introduced in CST 5.0 can beutilized for ultra low latency mobile fronthaul, midhaul, or backhaul networks.These fronthaul routers support native CPRI interfaces and special processingfor eCPRI and ROE (Radio over Ethernet) traffic guaranteeing low latency. Thesedevices also support stringent class C timing.NCS-55A2-MODThe Converged SDN Transport design now supports the NCS-55A2-MOD access and aggregation router. The 55A2-MOD is a modular 2RU router with 24 1G/10G SFP+, 16 1G/10G/25G SFP28 onboard interfaces, and two modular slots capable of 400G of throughput per slot using Cisco NCS Modular Port Adapters or MPAs. MPAs add additional 1G/10G SFP+, 100G QSFP28, or 100G/200G CFP2 interfaces. The 55A2-MOD is available in an extended temperature version with a conformal coating as well as a high scale configuration (NCS-55A2-MOD-SE-S) scaling to millions of IPv4 and IPv6 routes.NCS-57C3-MODThe NCS-57C3-MOD is the next-generation 300mm modular router supporting theConverged SDN Transport design. The NCS-57C3-MOD is a 3.2Tbps platform with thefollowing fixed interfaces# 8xQSFP28 100G, 48 SFP28 1/10/25G. The 57C3 alsoincludes two 800G MPA slots, and one 400G MPA slot for port expansion. Theseexpansion modules support additional 1/10/25G, 100G, and 400G interfaces. TheNCS-57C3 is available in both standard (NCS-57C3-MOD-SYS) and scale(NCS-57C3-MOD-SE-SYS) varieties.ASR 920The IOS-XE based ASR 920 is tested within the Converged SDN Transport as an access node. The Segment Routing data plane and supported service types are validated on the ASR 920 within the CST design. Please see the services support section for all service types supported on the ASR 920. NCS 520The IOS-XE based NCS 520 acts as an Ethernet demarcation device (NID) or carrier Ethernet switch in the Converged SDN Transport design. 
The MEF 3.0 certified device acts as a customer equipment termination point where QoS, OAM (Y.1731,802.3ah), and service validation/testing using Y.1564 can be performed. The NCS 520 is available in a variety of models covering different port requirements including industrial temp and conformal coated models for harsher environments.Transport – Design ComponentsNetwork Domain StructureTo provide unlimited network scale, the Converged SDN Transport isstructured into multiple IGP Domains# Access, Aggregation, and Core. However as we will illustrate in the next section, the number of domains is completely flexible based on provider need.Refer to the network topology in Figure 1.Figure 1# High scale fully distributedThe network diagram in Figure 2 shows how a Service Provider network canbe simplified by decreasing the number of IGP domains. In this scenariothe Core domain is extended over the Aggregation domain, thus increasingthe number of nodes in theCore.Figure 2# Distributed with expanded accessA similar approach is shown in Figure 3. In this scenario the Coredomain remains unaltered and the Access domain is extended over theAggregation domain, thus increasing the number of nodes in the Accessdomain.#%s/Figure 3# Distributed with expanded coreThe Converged SDN Transport transport design supports all three networkoptions, while remaining easily customizable.The first phase of the Converged SDN Transport, discussed later in thisdocument, will cover in depth the scenario described in Figure 3.Topology options and PE placement - Inline and non-inline PEThe non-inline PE topology, shown in the figure below, moves the services edgePE device from the forwarding path between the access/aggregation networks andthe core. There are several factors which can drive providers to this designvs. one with an in-line PE, some of which are outlined in the table below. Thecontrol-plane configuration of the Converged SDN Transport does not change, allexisting ABR configuration remains the same, but the device no longer acts as ahigh-scale PE.Figure# Non-Inline Aggregation TopologyCisco Routed Optical NetworkingStarting in CST 5.0, the CST design now supports and validates 400G ZR/ZR+tunable DWDM QSFP-DD transceivers. These transceivers are supported across theASR 9000, Cisco 8000, NCS 5500/5700, and NCS 540 routers with QSFP-DD ports.Routed Optical Network as part of the Coverged SDN Transport design addssimplification of provider IP and Optical infrastructure to the control and dataplane simplification introduced in previous CST designs. All CST capabilities are supported over Cisco ZR and ZR+ enabled interfaces.For more information on Cisco’s Routed Optical Networking design please see the following high-level design document#https#//xrdocs.io/design/blogs/latest-routed-optical-networking-hldNote class C timing is currently not supported over ZR/ZR+ optics, ZR/ZR+ optics in this release support class A or B timing depending on platform.Connectivity using 100G/200G digital coherent optics w/MACSecConverged SDN Transport 3.0+ adds support for the use of pluggable CFP2-DCOtransceivers to enable high speed aggregation and access network infrastructure.As endpoint bandwidth increases due to technology innovation such as 5G andRemote PHY, access and aggregation networks must grow from 1G and 10G to 100Gand beyond. Coherent router optics simplify this evolution by allowing anupgrade path to increase ring bandwidth up to 400Gbps without deploying costlyDWDM optical line systems. 
CFP2-DCO transceivers are supported using 400GModular Port Adapters for the NCS-55A2-MOD-S/SE, NCS-57C3-MOD-S/SE chassis andNC55-MOD-A-S/SE line cards. The NC55-MPA-1TH2H-S MPA has two QSFP28 ports andone CFP2-DCO port. The NC55-MPA-2TH-HX-S is a temperature hardened version ofthis MPA. The NC55-MPA-2TH-S has two CFP2-DCO ports.MACSec is an industry standard protocol running at L2 to provide encryptionacross Ethernet links. In CST 3.0 MACSec is enabled across CFP2-DCO access toaggregation links. MACSec support is hardware dependent, please consultindividual hardware data sheets for MACSec support.Routed Optical Networking ring deployment without multiplexersIn the simplest deployment access rings are deployed over dark fiber, enablingplug and play operation up to 80km without amplification.Routed Optical Networking DWDM ring deploymentRouted Optical Networking deployment with multiplexerIn this option the nodes are deployed with active or passive multiplexers to maximize fiber utilization rings needing more bandwidth per ring site. While this example shows each site on the ring having direct DWDM links back to the aggregation nodes, a hybrid approach could also be supported targeting only high-bandwidth locations with direct links while leaving other sites on a an aggregation ring.Routed Optical Networking DWDM hub and spoke or partial mesh deploymentUnnumbered Interface SupportIn CST 3.5, starting at IOS-XR 7.1.1 we have added support for unnumbered interfaces. Using unnumbered interfaces in the network eases the burden of deploying nodes by not requiring specific IPv4 or IPv6 interface addresses between adjacent node. When inserting a new node into an existing access ring the provideronly needs to configure each interface to use a Loopback address on the East and West interfaces of the nodes. IGP adjacencies will be formed over the unnumbered interfaces.IS-IS and Segment Routing/SR-TE utilized in the Converged SDN Transport design supports using unnumbered interfaces. SR-PCE used to compute inter-domain SR-TE paths also supports the use of unnumbered interfaces. In the topology database each interface is uniquely identified by a combination of router ID and SNMP IfIndex value.Unnumbered node insertionUnnumbered interface configuration#interface TenGigE0/0/0/2 description to-AG2 mtu 9216 ptp profile My-Slave port state slave-only local-priority 10 ! service-policy input core-ingress-classifier service-policy output core-egress-exp-marking ipv4 point-to-point ipv4 unnumbered Loopback0 frequency synchronization selection input priority 10 wait-to-restore 1 !!Intra-Domain OperationIntra-Domain Routing and ForwardingThe Converged SDN Transport is based on a fully programmable transport thatsatisfies the requirements described earlier. The foundation technologyused in the transport design is Segment Routing (SR) with a MPLS basedData Plane in Phase 1 and a IPv6 based Data Plane (SRv6) in future.Segment Routing dramatically reduces the amount of protocols needed in aService Provider Network. Simple extensions to traditional IGP protocolslike ISIS or OSPF provide full Intra-Domain Routing and ForwardingInformation over a label switched infrastructure, along with HighAvailability (HA) and Fast Re-Route (FRR) capabilities.Segment Routing defines the following routing related concepts# Prefix-SID – A node identifier that must be unique for each node ina IGP Domain. Prefix-SID is statically allocated by th3 networkoperator. 
Adjacency-SID – A node’s link identifier that must be unique foreach link belonging to the same node. Adjacency-SID is typicallydynamically allocated by the node, but can also be staticallyallocated. In the case of Segment Routing with a MPLS Data Plane, both Prefix-SIDand Adjacency-SID are represented by the MPLS label and both areadvertised by the IGP protocol. This IGP extension eliminates the needto use LDP or RSVP protocol to exchange MPLS labels.The Converged SDN Transport design uses IS-IS as the IGP protocol.Intra-Domain Forwarding - Fast Re-Route using TI-LFASegment-Routing embeds a simple Fast Re-Route (FRR) mechanism known asTopology Independent Loop Free Alternate (TI-LFA).TI-LFA provides sub 50ms convergence for link and node protection.TI-LFA is completely stateless and does not require any additionalsignaling mechanism as each node in the IGP Domain calculates a primaryand a backup path automatically and independently based on the IGPtopology. After the TI-LFA feature is enabled, no further care isexpected from the network operator to ensure fast network recovery fromfailures. This is in stark contrast with traditional MPLS-FRR, whichrequires RSVP and RSVP-TE and therefore adds complexity in the transportdesign.Please refer also to the Area Border Router Fast Re-Route covered inSection# “Inter-Domain Forwarding - High Availability and Fast Re-Route” for additional details.Inter-Domain OperationInter-Domain ForwardingThe Converged SDN Transport achieves network scale by IGP domainseparation. Each IGP domain is represented by separate IGP process onthe Area Border Routers (ABRs).Section# “Intra-Domain Routing and Forwarding” described basic Segment Routing concepts# Prefix-SID andAdjacency-SID. This section introduces the concept of Anycast SID.Segment Routing allows multiple nodes to share the same Prefix-SID,which is then called a “Anycast” Prefix-SID or Anycast-SID. Additionalsignaling protocols are not required, as the network operator simplyallocates the same Prefix SID (thus a Anycast-SID) to a pair of nodestypically acting as ABRs.Figure 4 shows two sets of ABRs# Aggregation ABRs – AG Provider Edge ABRs – PE Figure 4# IGP Domains - ABRs Anycast-SIDFigure 5 shows the End-To-End Stack of SIDs for packets traveling fromleft to right through thenetwork.Figure 5# Inter-Domain LSP – SR-TE PolicyThe End-To-End Inter-Domain Label Switched Path (LSP) was computed viaSegment Routing Traffic Engineering (SR-TE) Policies.On the Access router “A” the SR-TE Policy imposes# Local Aggregation Area Border Routers Anycast-SID# Local-AGAnycast-SID Local Provider Edge Area Border Routers Anycast-SID# Local-PEAnycast SID Remote Provider Edge Area Border Routers Anycast-SID# Remote-PEAnycast-SID Remote Aggregation Area Border Routers Anycast-SID# Remote-AGAnycast-SID Remote/Destination Access Router# Destination-A Prefix-SID#Destination-A Prefix-SID The SR-TE Policy is programmed on the Access device on-demand by anexternal Controller and does not require any state to be signaledthroughout the rest of the network. The SR-TE Policy provides, by simpleSID stacking (SID-List), an elegant and robust way to programInter-Domain LSPs without requiring additional protocols such as BGP-LU(RFC3107).Please refer to Section# “Transport Programmability” for additional details.Area Border Routers – Prefix-SID and Anycast-SIDSection# “Inter-Domain Forwarding” showed the use of Anycast-SID at the ABRs for theprovisioning of an Access to Access End-To-End LSP. 
When the LSP is setup between the Access Router and the AG/PE ABRs, there are two options# ABRs are represented by Anycast-SID; or Each ABR is represented by a unique Prefix-SID. Choosing between Anycast-SID or Prefix-SID depends on the requested service andinclusion of Anycast SIDs in the SR-TE Policy. If one is using the SR-PCE, suchas the case of ODN SR-TE paths, the inclusion of Anycast SIDs is done viaconfiguration.Note both options can be combined on the same network.Inter-Domain Forwarding - High Availability and Fast Re-RouteAG/PE ABRs redundancy enables high availability for Inter-DomainForwarding.Figure 7# IGP Domains - ABRs Anycast-SIDWhen Anycast-SID is used to represent AG or PE ABRs, no other mechanismis needed for Fast Re-Route (FRR). Each IGP Domain provides FRRindependently by TI-LFA as described in Section# “Intra-Domain Forwarding - Fast Re-Route”.Figure 8 shows how FRR is achieved for a Inter-DomainLSP.Figure 8# Inter-Domain - FRRThe access router on the left imposes the Anycast-SID of the ABRs andthe Prefix-SID of the destination access router. For FRR, any router inIGP1, including the Access router, looks at the top label# “ABRAnycast-SID”. For this label, each device maintains a primary and backuppath preprogrammed in the HW. In IGP2, the top label is “Destination-A”.For this label, each node in IGP2 has primary and backup pathspreprogrammed in the HW. The backup paths are computed by TI-LFA.As Inter-Domain forwarding is achieved via SR-TE Policies, FRR iscompletely self-contained and does not require any additional protocol.Note that when traditional BGP-LU is used for Inter-Domain forwarding,BGP-PIC is also required for FRR.Inter-Domain LSPs provisioned by SR-TE Policy are protected by FRR alsoin case of ABR failure (because of Anycast-SID). This is not possiblewith BGP-LU/BGP-PIC, since BGP-LU/BGP-PIC have to wait for the IGP toconverge first.SR Data Plane Monitoring provides proactive method to ensure reachability between all SR enabled nodes in an IGP domain. SR DPM utilizes well known MPLS OAM capabilities with crafted SID lists to ensure valid forwarding across the entire IGP domain. See the CST Implementation Guide for more details on SR Data Plane monitoring.Inter-Domain Open Ring SupportPrior to CST 4.0 and XR 7.2.1, the use of TI-LFA within a ring topology required the ring be closed within the IGP domain. This required an interconnect at the ASBR domain node for each IGP domain terminating on the ASBR. This type of connectivity was not always possible in an aggregation network due to fiber or geographic constraints. In CST 4.0 we have introduced support for open rings by utilizing MPLSoGRE tunnels between terminating boundary nodes across the upstream IGP domain. The following picture illustrates open ring support between an access and aggregation network.In the absence of a physical link between the boundary nodes PA1 and PA2, GRE tunnels can be created to interconnect each domain over its adjacent domain. During a protection event, such as the link failure between PA1 and GA1, traffic will enter the tunnel on the protection node, in this case PA1 towards PA2. Keep in mind traffic will loop back through the domain until re-convergence occurs. 
In the case of a core failure, bandwidth may not be available in an access ring to carry all core traffic, so care must be taken to determine traffic impact.Transport ProgrammabilityFigure 9 and Figure 10 show the design of Route-Reflectors (RR), Segment Routing Path Computation Element (SR-PCE) and WAN Automation Engines (WAE).High-Availability is achieved by device redundancy in the Aggregationand Core networks.Figure 9# Transport Programmability – PCEPTransport RRs collect network topology from ABRs through BGP Link State (BGP-LS).Each Transport ABR has a BGP-LS session with the two Domain RRs. Each domain is represented by a different BGP-LS instance ID.Aggregation Domain RRs collect network topology information from theAccess and the Aggregation IGP Domain (Aggregation ABRs are part of theAccess and the Aggregation IGP Domain). Core Domain RRs collect networktopology information from the Core IGP Domain.Aggregation Domain RRs have BGP-LS sessions with Core RRs.Through the Core RRs, the Aggregation Domains RRs advertise localAggregation and Access IGP topologies and receive the network topologiesof the remote Access and Aggregation IGP Domains as well as the networktopology of the Core IGP Domain. Hence, each RR maintains the overallnetwork topology in BGP-LS.Redundant Domain SR-PCEs have BGP-LS sessions with the local Domain RRsthrough which they receive the overall network topology. Refer toSection# “Segment Routing Path Computation Element (SR-PCE)” for more details about SR-PCE.SR-PCE is capable of computing the Inter-Domain LSP path on-demand. The computed path (Segment Routing SID List) is communicated to the Service End Points via a Path Computation Element Protocol (PCEP) response as shown in Figure 9.The Service End Points create a SR-TE Policy and use the SID list returned by SR-PCE as the primary path.Service End Points can be located on the Access Routers for End-to-End Services or at both the Access and domain PE routers for Hierarchical Services. The domain PE routers and ABRs may or may not be the same router. The SR-TE Policy DataPlane in the case of Service End Point co-located with the Access routerwas described in Figure 5.The proposed design is very scalable and can be easily extended tosupport even higher numbers of PCEP sessions by addingadditional RRs and SR-PCE elements into the Access Domain.Figure 11 shows the Converged SDN Transport physical topology with examplesof product placement.Figure 11# Converged SDN Transport – Physical Topology with transportprogrammabilityTraffic Engineering (Tactical Steering) – SR-TE PolicyOperators want to fully monetize their network infrastructure byoffering differentiated services. Traffic engineering is used to providedifferent paths (optimized based on diverse constraints, such aslow-latency or disjoined paths) for different applications. Thetraditional RSVP-TE mechanism requires signaling along the path fortunnel setup or tear down, and all nodes in the path need to maintainstates. This approach doesn’t work well for cloud applications, whichhave hyper scale and elasticity requirements.Segment Routing provides a simple and scalable way of defining anend-to-end application-aware traffic engineering path known as an SR-TE Policy. 
The SR-TE Policy expresses the intent of the application's constraints across the network. In the Converged SDN Transport design, the Service End Point uses PCEP along with the Segment Routing On-Demand Next-hop (SR-ODN) capability to request from the controller a path that satisfies specific constraints (such as low latency). This is done by associating SLA tags/attributes to the path request. Upon receiving the request, the SR-PCE controller calculates the path based on the requested SLA, and uses PCEP to dynamically program the ingress node with a specific SR-TE Policy.

Traffic Engineering (Tactical Steering) - Per-Flow SR-TE Policy

SR-TE and On-Demand Next-Hop have been enhanced to support per-flow traffic steering. Per-flow traffic steering is accomplished by using ingress QoS policies to mark traffic with a traffic class which is mapped to an SR-TE Policy supporting that traffic class. A variety of IP header match criteria can be used in the QoS policy to classify traffic, giving operators flexibility to carry a specific traffic flow in an SR-TE Policy matching the SLA of the traffic.

Traffic Engineering - Dynamic Anycast-SID Paths and Black Hole Avoidance

As shown in Figure 7, inter-domain resilience and load-balancing is satisfied by using the same Anycast SID on each boundary node. Starting in CST 3.5, Anycast SIDs are used by a centralized SR-PCE without having to define an explicit SID list. Anycast SIDs are learned via the topology information distributed to the SR-PCE using BGP-LS. Once the SR-PCE knows the location of a set of Anycast SIDs, it will utilize the SID in the path computation to an egress node. The SR-PCE will only utilize the Anycast SID if it has a valid path to the next SID in the computed path, meaning if one ABR loses its path to the adjacent domain, the SR-PCE will update the head-end path with one utilizing a normal node SID to ensure traffic is not dropped.

It is also possible to withdraw an anycast SID from the topology by using the conditional route advertisement feature for IS-IS, new in 3.5. Once the anycast SID Loopback has been withdrawn, it will no longer be used in an SR Policy path. Conditional route advertisement can be used for SR-TE Policies with Anycast SIDs in either dynamic or static SID candidate paths. Conditional route advertisement is implemented by supplying the router with a list of remote prefixes to monitor for reachability in the RIB. If those routes disappear from the RIB, the interface route will be withdrawn. Please see the CST Implementation Guide for instructions on configuring anycast SID inclusion and blackhole avoidance.

Transport Controller Path Computation Engine (PCE)

Segment Routing Path Computation Element (SR-PCE)

Segment Routing Path Computation Element, or SR-PCE, is a Cisco Path Computation Engine (PCE) implemented as a feature of the Cisco IOS-XR operating system. The function is typically deployed on a Cisco IOS-XR cloud appliance, XRv-9000, as it involves control plane operations only. The SR-PCE gains network topology awareness from BGP-LS advertisements received from the underlying network. This knowledge is leveraged by the embedded multi-domain computation engine to provide optimal path information to Path Computation Element Clients (PCCs) using the Path Computation Element Protocol (PCEP). The PCC is the device where the service originates (PE) and therefore it requires end-to-end connectivity over the segment routing enabled multi-domain network.
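For reference, a minimal sketch of the PCC-side configuration used to peer a router with redundant SR-PCE instances over PCEP is shown below. The source and PCE addresses are placeholders for this example; precedence is used to prefer one SR-PCE over the other, with the lower value preferred.

segment-routing
 traffic-eng
  pcc
   source-address ipv4 100.0.1.50
   pce address ipv4 100.0.0.100
    precedence 10
   !
   pce address ipv4 100.0.0.101
    precedence 20
   !
  !
 !
!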
The SR-PCE provides a path based on constraints such as:

Shortest path (IGP metrics)
Traffic-Engineering metrics
Disjoint paths starting on one or two nodes
Latency

Figure 12: XR Transport Controller – Components

PCE Controller Summary – SR-PCE

Segment Routing Path Computation Element (SR-PCE):

Runs as a feature on a physical or virtual IOS-XR node
Collects topology from BGP using BGP-LS, ISIS, or OSPF
Deploys SR Policies based on client requests
Computes Shortest, Disjoint, Low Latency, and Avoidance paths
North Bound interface with applications via REST API

Converged SDN Transport Path Computation Workflows

Static SR-TE Policy Configuration

NSO provisions the service. Alternatively, the service can be provisioned via CLI.
SR-TE Policy is configured via NSO or CLI on the access node to the other service end points, specifying pcep as the computation method.
Access Router requests a path from SR-PCE with metric type and constraints.
SR-PCE computes the path.
SR-PCE provides the path to the Access Router.
Access Router acknowledges and installs the SR Policy as the forwarding path for the service.

On-Demand Next-Hop Driven Configuration

NSO provisions the service. Alternatively, the service can be provisioned via CLI.
On-demand colors are configured on each node, specifying specific constraints and pcep as the dynamic computation method.
On reception of service routes with a specific ODN color community, the Access Router requests a path from SR-PCE to the BGP next-hop as the SR-TE endpoint.
SR-PCE computes the path.
SR-PCE provides the path to the Access Router.
Access Router acknowledges and installs the SR Policy as the forwarding path for the service.
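To illustrate the ODN workflow above, the following is a minimal sketch of an on-demand color definition on an access node requesting a PCEP-computed, latency-optimized path, along with a route policy that attaches the matching color extended community to service routes. The color value (100), community set name, and route-policy name are placeholders chosen for this example.

segment-routing
 traffic-eng
  on-demand color 100
   dynamic
    pcep
    !
    metric
     type latency
    !
   !
  !
 !
!
extcommunity-set opaque COLOR-100
  100
end-set
!
route-policy SET-ODN-COLOR-100
  set extcommunity color COLOR-100
  pass
end-policy

The route policy would typically be applied on export for the VRF or BGP address family carrying the service routes, so that the receiving end point sees the color community and instantiates the SR-TE Policy on demand.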
Segment Routing Flexible Algorithms (Flex-Algo)

A powerful tool used to create traffic engineered Segment Routing paths is SR Flexible Algorithms, better known as SR Flex-Algo. Flex-Algo assigns a specific set of “algorithms” to a Segment. The algorithm identifies a specific computation constraint the segment supports. There are standards based algorithm definitions such as least cost IGP path and latency, or providers can define their own algorithms to satisfy their business needs. CST 4.0 supports computation of Flex-Algo paths in intra-domain and inter-domain deployments. In CST 4.0 (IOS-XR 7.2.2) inter-domain Flex-Algo using SR-PCE is limited to IGP lowest metric path computation. CST 5.0 (IOS-XR 7.5.2) enhances the inter-domain capabilities and can now compute inter-domain paths using additional metric types such as latency.

Flex-Algo limits the computation of a path to only those nodes participating in that algorithm. This gives a powerful way to create multiple network domains within a single larger network, constraining an SR path computation to segments satisfying the metrics defined by the algorithm. As you will see, we can now use a single node SID to reach a node via a path satisfying an advanced constraint such as delay.

Flex-Algo Node SID Assignment

Nodes participating in a specific algorithm must have a unique node SID prefix assigned to the algorithm. In a typical deployment, the same Loopback address is used for multiple algorithms. IGP extensions advertise algorithm membership throughout the network. Below is an example of a node with multiple algorithms and node SID assignments. By default, the basic IGP path computation is assigned to algorithm “0”. Algorithm “1” is also reserved. Algorithms 128-255 are user-definable. All Flex-Algo SIDs belong to the same global SRGB, so providers deploying SR should take this into account. Each algorithm should be assigned its own block of SIDs within the SRGB; in the case below the SRGB is 16000-32000 and each algorithm is assigned 1000 SIDs.

interface Loopback0
 address-family ipv4 unicast
  prefix-sid index 150
  prefix-sid algorithm 128 absolute 18003
  prefix-sid algorithm 129 absolute 19003
  prefix-sid algorithm 130 absolute 20003

Flex-Algo IGP Definition

Flexible algorithms being used within a network must be defined in the IGP domains in the network. The configuration is typically done on at least one node under the IGP configuration for the domain. Under the definition, the metric type used for computation is defined along with any link affinities. Link affinities are used to constrain the algorithm to not only specific nodes, but also specific links. These affinities are the same as those previously used by RSVP-TE.

Note: Inter-domain Flex-Algo path computation requires synchronized Flex-Algo definitions across the end-to-end path.

 flex-algo 130
  metric-type delay
  advertise-definition
 !
 flex-algo 131
  advertise-definition
  affinity exclude-any red

Path Computation across SR Flex-Algo Network

Flex-Algo works by creating a separate topology for each algorithm. By default, all links interconnecting nodes participating in the same algorithm can be used for those paths. If the algorithm is defined to include or exclude specific link affinities, the topology will reflect it. An SR-TE path computation using a specific Flex-Algo will use the Algo's topology for the end to end path computation. It will also look at the metric type defined for the Algo and use it for the path computation. Even with a complex topology, a single SID is used for the end to end path, as opposed to using a series of node and adjacency SIDs to steer traffic across a shared topology. Each node participating in the algorithm has adjacencies to other nodes utilizing the same algorithm, so when an incoming MPLS label matching the algo SID enters, it will utilize the path specific to the algorithm. A Flex-Algo can also be used as a constraint in an ODN policy.

Flex-Algo Dual-Plane Example

A very simple use case for Flex-Algo is to easily define a dual-plane network topology where algorithm 129 is red and algorithm 130 is green. Nodes A1 and A6 participate in both algorithms. When a path request is made for algorithm 129, the head-end nodes A1 and A6 will only use paths specific to the algorithm. The SR-TE Policy does not need to reference a specific SID, only the Algo being used as the constraint. The local node or SR-PCE will utilize the Algo to compute the path dynamically. The following policy configuration is an example of constraining the path to the Algo 129 “Red” path.

segment-routing
 traffic-eng
  policy GREEN-PE8-128
   color 1128 end-point ipv4 100.0.2.53
   candidate-paths
    preference 1
     dynamic
      pcep
      !
      metric
       type igp
      !
     !
     constraints
      segments
       sid-algorithm 129

Segment Routing and Unified MPLS (BGP-LU) Co-existence

Summary

In the Converged SDN Transport 3.0 design we introduce validation for the co-existence of services using BGP Labeled Unicast transport for inter-domain forwarding and those using SR-TE. Many networks deployed today have an existing BGP-LU design which may not be easily migrated to SR, so graceful introduction between the two transport methods is required.
In the case of a multipoint service such as EVPN ELAN or L3VPN, an endpoint may utilize BGP-LU to one endpoint and SR-TE to another.ABR BGP-LU designIn a BGP-LU design each IGP domain or ASBR boundary node will exchange BGP labeled prefixes between domains while resetting the BGP next-hop to its own loopback address. The labeled unicast label will change at each domain boundary across the end to end network. Within each IGP domain, a label distribution protocol is used to supply MPLS connectivity between the domain boundary and interior nodes. In the Converged SDN Transport design, IS-IS with SR-MPLS extensions is used to provide intra-domain MPLS transport. This ensures within each domain BGP-LU prefixes are protected using TI-LFA.The BGP-LU design utilized in the Converged SDN Transport validation is based on Cisco’s Unified MPLS design used in EPN 4.0.Quality of Service and AssuranceOverviewQuality of Service is of utmost importance in today’s multi-service converged networks. The Converged SDN Transport design has the ability to enforce end to end traffic path SLAs using Segment Routing Traffic Engineering. In addition to satisfying those path constraints, traditional QoS is used to make sure the PHB (Per-Hop Behavior) of each packet is enforced at each node across the converged network.NCS 540, 560, 5500, and 5700 QoS PrimerFull details of the NCS 540 and 5500 QoS capabilities and configuration can be found at#https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/qos/75x/b-qos-cg-ncs5500-75x.htmlThe NCS platforms utilize the same MQC configuration for QoS as other IOS-XR platforms but based on their hardware architecture use different elements for implementing end to end QoS. On these platforms ingress traffic is# Matched using flexible criteria via Class Maps Assigned to a specific Traffic Class (TC) and/or QoS Group for further treatment on egress Has its header marked with a specific IPP, DSCP, or MPLS EXP valueTraffic Classes are used internally for determining fabric priority and as the match condition for egress queuing. QoS Groups are used internally as the match criteria for egress CoS header re-marking. IPP/DSCP marking and re-marking of ingress MPLS traffic is done using ingress QoS policies. MPLS EXP for imposed labels can be done on ingress or egress, but if you wish to rewrite both the IPP/DSCP and set an explicit EXP for imposed labels, the MPLS EXP must be set on egress.The priority-level command used in an egress QoS policy specifies the egress transmit priority of the traffic vs. other priority traffic. Priority levels can be configured as 1-7 with 1 being the highest priority. Priority level 0 is reserved for best-effort traffic.Please note, multicast traffic does not follow the same constructs as unicast traffic for prioritization. All multicast traffic assigned to Traffic Classes 1-4 are treated as Low Priority and traffic assigned to 5-6 treated as high priority.Cisco 8000 QoSThe QoS configuration of the Cisco 8000 follows similar configuration guidelinesas the NCS 540, 5500, and NCS 5700 series devices. 
Detailed documentation of8000 series QoS including platform dependencies can be found at#https#//www.cisco.com/c/en/us/td/docs/iosxr/cisco8000/qos/75x/b-qos-cg-8k-75x.htmlSupport for Time Sensitive Networking in N540-FH-CSR-SYS and N540-FH-AGG-SYSThe Fronthaul family of NCS 540 routers support frame preemption based on theIEEE 802.1Qbu-2016 and Time Sensitive Networking (TSN) standards.Time Sensitive Networking (TSN) is a set of IEEE standards that addresses thetiming-critical aspect of signal flow in a packet switched Ethernet network toensure deterministic operation. TSN operates at the Ethernet layer on physicalinterfaces. Frames are marked with a specific QoS class (typically 7 in a devicewith classes 0-7) qualify as express traffic, while other classes other thancontrol plane traffic are marked as preemptable traffic.This allows critical signaling traffic to traverse a device as quickly aspossible without having to wait for lower priority frames before beingtransmitted on the wire.Please see the TSN configuration guide for NCS 540 Fronthaul routers athttps#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5xx/fronthaul/b-fronthaul-config-guide-ncs540-fh/m-fh-tsn-ncs540.pdfHierarchical Edge QoSHierarchical QoS enables a provider to set an overall traffic rate across all services, and then configure parameters per-service via a child QoS policy where the percentages of guaranteed bandwidth are derived from the parent rateH-QoS platform supportNCS platforms support 2-level and 3-level H-QoS. 3-level H-QoS applies a policer (ingress) or shaper (egress) to a physical interface, with each sub-interface having a 2-level H-QoS policy applied. Hierarchical QoS is not enabled by default on the NCS 540 and 5500 platforms. H-QoS is enabled using the hw-module profile qos hqos-enable command. Once H-QoS is enabled, the number of priority levels which can be assigned is reduced from 1-7 to 1-4. Additionally, any hierarchical QoS policy assigned to a L3 sub-interface using priority levels must include a “shape” command.The ASR9000 supports multi-level H-QoS at high scale for edge aggregation function. In the case of hierarchical services, H-QoS can be applied to PWHE L3 interfaces.CST Core QoS mapping with five classesQoS designs are typically tailored for each provider, but we introduce a 5-level QoS design which can fit most provider needs. The design covers transport of both unicast and multicast traffic. Traffic Type Core Marking Core Priority Comments Network Control EXP 6 Highest Underlay network control plane Low latency EXP 5 Highest Low latency service, consistent delay High Priority 1 EXP 3 Medium-High High priority service traffic Medium Priority / Multicast EXP 2 Medium priority and multicast   Best Effort EXP 0 General user traffic   Example Core QoS Class and Policy MapsThese are presented for reference only, please see the implementation guide for the full QoS configurationClass maps for ingress header matchingclass-map match-any match-ef-exp5 description High priority, EF match dscp 46 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 end-class-mapIngress QoS policypolicy-map ingress-classifier class match-ef-exp5 set traffic-class 2 set qos-group 2 ! class match-cs5-exp4 set traffic-class 3 set qos-group 3 ! class class-default set traffic-class 0 set dscp 0 set qos-group 0 ! 
end-policy-mapClass maps for egress queuing and marking policiesclass-map match-any match-traffic-class-2 description ~Match highest priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match high priority traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-mapEgress QoS queuing policypolicy-map egress-queuing class match-traffic-class-2 priority level 2 ! class match-traffic-class-3 priority level 3 ! class class-default ! end-policy-mapEgress QoS marking policypolicy-map core-egress-exp-marking class match-qos-group-2 set mpls experimental imposition 5 ! class match-qos-group-3 set mpls experimental imposition 4 class class-default set mpls experimental imposition 0 ! end-policy-mapL3 Multicast using Segment Routing Tree-SIDTree SID DiagramTree-SID OverviewConverged SDN Transport 3.5 introduces Segment Routing Tree-SID across allIOS-XR nodes. TreeSID utilizes the programmability of SR-PCE to create andmaintain an optimized multicast tree from source to receiver across an SR-onlyIPv4 network. In CST 3.5 Tree-SID utilizes MPLS labels at each hop in thenetwork. Each node in the network maintains a session to the same set of SR-PCEcontrollers. The SR-PCE creates the tree using PCE-initiated segments. TreeSIDsupports advanced functionality such as TI-LFA for fast protection and disjointtrees.Static Tree-SIDMulticast traffic is forwarded across the tree using static S,G mappings at thehead-end source nodes and tail-end receiver nodes. Providers needing a solutionwhere dynamic joins and leaves are not common, such as broadcast videodeployments, can be benefit from the simplicity static Tree-SID brings,eliminating the need for distributed BGP mVPN signaling. Static Tree-SID issupported for both default VRF (Global Routing Table) and mVPN.Please see the CST 3.5+ Implementation Guide for static Tree-SID configurationguidelines and examples.Dynamic Tree-SID using BGP mVPN Control-PlaneIn CST 5.0+, we now support using fully dynamic signaling to create multicastdistribution trees using Tree-SID. Sources and receivers are discovered usingBGP auto-discovery (BGP-AD) and advertised throughout the mVPN using the IPv4 orIPv6 mVPN AFI/SAFI. Once the source head-end node learns of receivers, thehead-end will create a PCEP request to the configured primary PCE. The PCE thencomputes the optimal multicast distribution tree based on the metric-type andconstraints specified in the request. Once the Tree-SID policy is up, multicasttraffic will be forwarded using the tree by the head-end node. Tree-SIDoptionally supports TI-LFA for all segments, and the ability to create disjointtrees for high available applications.All routers across the network needing to participate in the tree, includingcore nodes, must be configured as a PCC to the primary PCE being used by thehead-end node.Please see the CST 5.0+ Implementation Guide for dynamic Tree-SID configuration guidelines and examples.L3 IP Multicast and mVPN using mLDPIP multicast continues to be an optimization method for delivering contenttraffic to many endpoints, especially traditional broadcast video. Unicastcontent dominates the traffic patterns of most networks today, but multicastcarries critical high value services, so proper design and implementation isrequired. 
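Dynamic Tree-SID and the BGP-signaled mVPN profile discussed below both rely on the BGP mVPN address family for auto-discovery and c-multicast signaling. The following is a minimal sketch of enabling the address family toward a route reflector; the AS number and neighbor address are illustrative and not taken from the validated configuration.

router bgp 65000
 address-family vpnv4 unicast
 !
 address-family ipv4 mvpn
 !
 neighbor 100.0.0.10
  remote-as 65000
  update-source Loopback0
  address-family vpnv4 unicast
  !
  address-family ipv4 mvpn
  !
 !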
In Converged SDN Transport 2.0 we introduced multicast edge and core validation for native IPv4/IPv6 multicast using PIM, global multicast using in-band mLDP (profile 7), and mVPN using mLDP with in-band signaling (profile 6). Converged SDN Transport 3.0 extends this functionality by adding support for mLDP LSM with the NG-MVPN BGP control plane (profile 14). Using BGP signaling adds additional scale to the network over in-band mLDP signaling and fits with the overall design goals of CST. More information about deployment of profile 14 can be found in the Converged SDN Transport implementation guide. Converged SDN Transport 3.0 supports mLDP-based label switched multicast within a single domain and across IGP domain boundaries. In the case of the Converged SDN Transport design, multicast has been tested with the source and receivers on both access and ABR PE devices.

Supported Multicast Profiles | Description
Profile 6 | mLDP VRF using in-band signaling
Profile 7 | mLDP global routing table using in-band signaling
Profile 14 | Partitioned MDT using BGP-AD and BGP c-multicast signaling

Profile 14 is recommended for all service use cases and supports both intra-domain and inter-domain transport use cases.

LDP Auto-configuration
LDP can automatically be enabled on all IS-IS interfaces with the following configuration in the IS-IS configuration:

router isis ACCESS
 address-family ipv4 unicast
  mpls ldp auto-config

LDP mLDP-only Session Capability (RFC 7473)
In Converged SDN Transport 3.0 we introduce the ability to advertise only mLDP state on each router adjacency, eliminating the need to filter LDP unicast FECs from advertisement into the network. This is done using the SAC (State Advertisement Control) TLV in the LDP initialization messages to advertise which LDP FEC classes to receive from an adjacent peer. We can restrict the capabilities to mLDP only using the following configuration. Please see the implementation guide and configurations for the full LDP configuration.

mpls ldp
 capabilities sac mldp-only

LDP Unicast FEC Filtering for SR Unicast with mLDP Multicast
The following is for historical context; please see the above section regarding disabling LDP unicast FECs using session capability advertisements.
The Converged SDN Transport design utilizes Segment Routing with the MPLS dataplane for all unicast traffic. The first phase of multicast support in Converged SDN Transport 2.0 uses mLDP for use with existing mLDP based networks and new networks wishing to utilize label switched multicast across the core. LDP is enabled on an interface for both unicast and multicast by default. Since SR is being used for unicast, one must filter out all LDP unicast FECs to ensure they are not distributed across the network. SR is used for all unicast traffic in the presence of an LDP FEC for the same prefix, but filtering them reduces control-plane activity, may aid in re-convergence, and simplifies troubleshooting. The following should be applied to all interfaces which have mLDP enabled:

ipv4 access-list no-unicast-ldp
 10 deny ipv4 any any
!
RP/0/RSP0/CPU0:Node-6#show run mpls ldp
mpls ldp
 log neighbor
 address-family ipv4
  label local allocate for no-unicast-ldp
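For reference, the following sketch shows mLDP enabled under LDP alongside the mLDP-only session capability described above; the interface name is illustrative and the full validated LDP configuration is in the implementation guide.

mpls ldp
 capabilities sac mldp-only
 mldp
  address-family ipv4
  !
 !
 address-family ipv4
 !
 interface TenGigE0/0/0/5
 !
!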
Converged SDN Transport Use Cases
Service Provider networks must adopt a very flexible design that satisfies any-to-any connectivity requirements without compromising stability and availability. Moreover, transport programmability is essential to bring SLA awareness into the network.
The goal of the Converged SDN Transport is to provide a flexible network blueprint that can be easily customized to meet customer specific requirements. This blueprint must adapt to carry any service type, for example cable access, mobile, and business services, over the same converged network infrastructure. The following sections highlight some specific customer use cases and the components of the design used in building those solutions.

4G and 5G Mobile Networks
Summary and 5G Service Types
The Converged SDN Transport design introduces support for 5G networks and 5G services. There are a variety of new service use cases being defined by 3GPP for use on 5G networks, illustrated by the figure below. Networks must now be built to support the stringent SLA requirements of Ultra-Reliable Low-Latency services while also being able to cope with the massive bandwidth introduced by Enhanced Mobile Broadband services. The initial support for 5G in the Converged SDN Transport design focuses on the backhaul and midhaul portions of the network utilizing end to end Segment Routing. The design introduces no new service types; the existing scalable L3VPN and EVPN based services using BGP are sufficient for carrying 5G control-plane and user-plane traffic.

Key Validated Components
The following key features have been added to the CST validated design to support 5G deployments.

End to End Timing Validation
End to end timing using PTP with profiles G.8275.1 and G.8275.2 has been validated in the CST design. Best practice configurations are available in the online configurations and CST Implementation Guide. It is recommended to use G.8275.1 when possible to maintain the highest level of accuracy across the network. In CST 4.0+ we include validation for G.8275.1 to G.8275.2 interworking, allowing the use of different profiles across the network. Synchronous Ethernet (SyncE) is also recommended across the network to maintain stability when timing to the PRC. All nodes used in the CST design support SyncE.

Low latency SR-TE path computation
The “latency” metric type, used with either a configured SR Policy or an ODN SR Policy, instructs the computation engine to compute the lowest latency path across the network. The latency computation algorithm can use different mechanisms for computing the end to end path. The first and preferred mechanism is to use the realtime measured per-link one-way delay across the network. This measured information is distributed via IGP extensions across the IGP domain and to external PCEs using BGP-LS extensions for use in both intra-domain and inter-domain calculations. Two other metric types can also be utilized as part of the “latency” path computation: the TE metric, which can be defined on all SR IS-IS links, and the regular IGP metric, either of which can be used in the absence of the link-delay metric. More information on Performance Measurement for link delay can be found at https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-5/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-75x/configure-performance-measurement.html. Performance Measurement is supported on all hardware used in the CST design.

Dynamic Link Performance Measurement
Starting in version 3.5 of the CST, dynamic measurement of one-way and two-way latency on logical links is fully supported across all devices. The delay measurement feature utilizes TWAMP-Light as the transport mechanism for probes and responses.
PTP is a requirement for accurate measurement of one-way latencyacross links and is recommended for all nodes. In the absence of PTP a“two-way” delay mode is supported to calculate the one-way link delay. It isrecommended to configure one-way delay on all IS-IS core links within the CSTnetwork. A sample configuration can be found below and detailed configurationinformation can be found in the implementation guide.One way delay measurement is also available for SR-TE Policy paths to give theprovider an accurate latency measurement for all services utilizing the SR-TEPolicy. This information is available through SR Policy statistics using the CLIor model-driven telemetry. The latency measurement is done for all activecandidate paths.Dynamic one-way link delay measurements using PTP are not currently supported on unnumbered interfaces. In the case of unnumbered interfaces, static link delay values must be used.Different metric types can be used in a single path computation, with the following order used# Unidirectional link delay metric either computed or statically defined Statically defined TE metric IGP metricSR Policy latency constraint configuration on configured policysegment-routing traffic-eng policy LATENCY-POLICY color 20 end-point ipv4 1.1.1.3 candidate-paths preference 100 dynamic mpls metric type latencySR Policy latency constraint configuration for ODN policiessegment-routing traffic-eng on-demand color 100 dynamic pcep ! metric type latencyDynamic link delay metric configurationperformance-measurement interface TenGigE0/0/0/10 delay-measurement interface TenGigE0/0/0/20 delay-measurement ! ! protocol twamp-light measurement delay unauthenticated querier-dst-port 12345 ! ! ! delay-profile interfaces advertisement accelerated threshold 25 ! periodic interval 120 threshold 10 ! ! probe measurement-mode one-way protocol twamp-light computation-interval 60 ! !Static defined link delay metricStatic delay is set by configuring the “advertise-delay” value in microseconds under each interfaceperformance-measurement interface TenGigE0/0/0/10 delay-measurement advertise-delay 15000 interface TenGigE0/0/0/20 delay-measurement advertise-delay 10000TE metric definitionsegment-routing traffic-eng interface TenGigE0/0/0/10 metric 15 ! interface TenGigE0/0/0/20 metric 10The link-delay metrics are quantified in the unit of microseconds. On mostnetworks this can be quite large and may be out of range from normal IGPmetrics, so care must be taken to ensure proper compatibility when mixing metrictypes. The largest possible IS-IS metric is 16777214 which is equivalent to16.77 seconds.SR Policy one-way delay measurementIn addition to the measurement of delay on physical links, the end to endone-way delay can also be measured across a SR Policy. This allows a provider tomonitor the traffic path for increases in delay and log/alarm when thresholdsare exceeded. Please note SR Policy latency measurements are not supported forPCE-computed paths, only those using head-end computation or configured staticsegment lists. The basic configuration for SR Policy measurement follows#performance-measurement delay-profile sr-policy advertisement accelerated threshold 25 ! periodic interval 120 threshold 10 ! threshold-check average-delay ! ! probe tos dscp 46 ! measurement-mode one-way protocol twamp-light computation-interval 60 burst-interval 60 ! ! protocol twamp-light measurement delay unauthenticated querier-dst-port 12345 ! ! 
!
!
segment-routing
 traffic-eng
  policy APE7-PM
   color 888 end-point ipv4 100.0.2.52
   candidate-paths
    preference 200
     dynamic
      metric
       type igp
      !
     !
    !
   !
   performance-measurement
    delay-measurement
     logging
      delay-exceeded

IP Endpoint Delay Measurement
In CST 5.0+ IOS-XR's Performance Measurement is extended to perform SLA measurements between IP endpoints across multi-hop paths. Delay measurements as well as liveness detection are supported. Model-driven telemetry as well as CLI commands can be used to monitor the path delay.

Global Routing Table IP Endpoint Delay Measurement

performance-measurement
 endpoint ipv4 1.1.1.5
  source-address ipv4 1.1.1.1
  delay-measurement
  !
 !
 delay-profile endpoint default
  probe
   measurement-mode one-way

VRF IP Endpoint Delay Measurement

performance-measurement
 endpoint ipv4 10.10.10.100 vrf green
  source-address ipv4 1.1.1.1
  delay-measurement
  !
 !
 delay-profile endpoint default
  probe
   measurement-mode one-way

Segment Routing Flexible Algorithms for 5G Slicing
SR Flexible Algorithms, outlined earlier in the transport section, give providers a powerful mechanism to segment networks into topologies defined by SLA requirements. The SLA-driven topologies solve the constraints of specific 5G service types such as Ultra-Reliable Low-Latency Services. Using SR with a packet dataplane ensures the most efficient network possible, unlike slicing solutions using optical transport or OTN.

End to end network QoS with H-QoS on Access PE
QoS is of utmost importance for ensuring the mobile control plane and critical user plane traffic meet SLA requirements. Overall network QoS is covered in the QoS section of this document; this section focuses on basic Hierarchical QoS to support 5G services.
H-QoS enables a provider to set an overall traffic rate across all services, and then configure parameters per-service via a child QoS policy where the percentages of guaranteed bandwidth are derived from the parent rate. NCS platforms support 2-level and 3-level H-QoS. 3-level H-QoS applies a policer (ingress) or shaper (egress) to a physical interface, with each sub-interface having a 2-level H-QoS policy applied.

CST QoS mapping with 5 classes
Traffic Type | Ingress Marking | Core Marking | Comments
Low latency | IPP 5 | EXP 5 | URLLC, consistent delay, small buffer
5G Control Plane | IPP 4 | EXP 4 | Mobile control and billing
High Priority Service | IPP 3 (in contract), 1 (out of contract) | EXP 1,3 | Business service
Best Effort | IPP 0 | EXP 0 | General user traffic
Network Control | IPP 6 | EXP 6 | Underlay network control plane

FTTH Design using EVPN E-Tree
Summary
Many providers today are migrating from L2 access networks to more flexible L3 underlay networks using xVPN overlays to support a variety of network services. L3 networks offer more flexibility in terms of topology, resiliency, and support of both L2VPN and L3VPN services. Using a converged aggregation and access network simplifies networks and reduces both capex and opex spend by eliminating duplicate networks. Fiber to the home networks using active Ethernet have typically used L2 designs with proprietary methods like Private VLANs for subscriber isolation. EVPN E-Tree gives us a modern alternative to provide these services across a converged L3 Segment Routing network. This use case highlights one specific use case for E-Tree; however, there are a number of other business and subscriber service use cases that benefit from EVPN E-Tree.

E-Tree Diagram
E-Tree Operation
One of the strongest features of EVPN is its dynamic signaling of PE state across the entire EVPN virtual instance.
E-Tree extends this paradigm by signaling between EVPN PEs which Ethernet Segments are considered root segments and which ones are considered leaf segments. Similar to hub and spoke L3VPN networks, traffic is allowed between root/leaf and root/root interfaces, but not between leaf interfaces either on the same node or on different nodes. EVPN signaling creates the forwarding state and entries to restrict traffic forwarding between endpoints connected to the same leaf Ethernet Segment.

Split-Horizon Groups
E-Tree enables split horizon groups on access interfaces within the same Bridge Domain/EVI configured for E-Tree to prohibit direct L2 forwarding between these interfaces.

L3 IRB Support
In a fully distributed FTTH deployment, a provider may choose to put the L3 gateway for downstream access endpoints on the leaf device. The L3 BVI interface defined for the E-Tree BD/EVI is always considered a root endpoint. E-Tree operates at L2, so when an L3 interface is present traffic will be forwarded at L3 between leaf endpoints. Note that L2 leaf devices using a centralized IRB L3 GW on an E-Tree root node are not currently supported. In this type of deployment, where the L3 GW is not located on the leaf, the upstream L3 GW node must be attached via an L2 interface into the E-Tree root node Bridge Domain/EVI. It is recommended to locate the L3 GW on the leaf device if possible.

Multicast Traffic
Multicast delivery across the E-Tree L2/L3 network is performed using ingress replication from the source to the receiver nodes. It is important to use IGMP or MLDv2 snooping in order to minimize the flooding of multicast traffic across the entire Ethernet VPN instance. When snooping is utilized, traffic is only sent to EVPN PE nodes with interested receivers instead of all PEs in the EVI.

Ease of Configuration
Configuring a node as a leaf in an E-Tree EVI requires only a single command, “etree”, to be configured under the EVI in the global EVPN configuration. Please see the Implementation Guide for specific configuration examples.

l2vpn
 bridge group etree
  bridge-domain etree-ftth
   interface TenGigE0/0/0/23.1098
   !
   routed interface BVI100
   !
   evi 100
  !
 !
!
evpn
 evi 100
  etree
   leaf
  !
  advertise-mac
 !
!

Cisco Cloud Native Broadband Network Gateway
cnBNG represents a fundamental shift in how providers build converged access networks by separating the subscriber BNG control-plane functions from user-plane functions. CUPS (Control/User-Plane Separation) allows the use of scale-out x86 compute for subscriber control-plane functions, allowing providers to place these network functions at an optimal place in the network, and also allows simplification of user-plane elements. This simplification enables providers to distribute user-plane elements closer to end users, optimizing traffic efficiency to and from subscribers. In the CST 5.0 design we include both the traditional physical BNG (pBNG) and the newer cnBNG architecture.

Cisco cnBNG Architecture
Cisco's cnBNG supports the BBF TR-459 standards for control and user plane communication. The State Control Interface (SCi) is used for programming and management of dynamic subscriber interfaces, including accounting information. The Control Packet Redirect Interface (CPRi), as its name implies, redirects user packets destined for control-plane functions from the user plane to the control plane. These include: DHCP DORA, DHCPv6, PPPoE, and L2TP.
More information on TR-459 can be found at: https://www.broadband-forum.org/marketing/download/TR-459.pdf

cnBNG Control Plane
The cloud native BNG control plane is a highly resilient scale-out architecture. Traditional physical BNGs embedded in router software often scale poorly, require complex HA mechanisms for resiliency, and are relatively painful to upgrade. Moving these network functions to a modern Kubernetes based cloud-native infrastructure reduces operator complexity, providing native scale-out capacity growth, in-service software upgrades, and faster feature delivery. The Cisco cnBNG control plane supports deployment on VMware ESXi.

cnBNG User Plane
The cnBNG user plane is provided by Cisco ASR 9000 routers. The routers are responsible for terminating subscriber sessions (IPoE/PPPoE), communicating with the cnBNG control plane for user authentication and policy, applying subscriber policy elements such as QoS and security policies, and performing subscriber routing.

Cable Converged Interconnect Network (CIN)
Summary
The Converged SDN Transport Design enables a multi-service CIN by adding support for the features and functions required to build a scalable next-generation Ethernet/IP cable access network. Differentiated from simple switch or L3 aggregation designs is the ability to support NG cable transport over the same common infrastructure already supporting other services like mobile backhaul and business VPN services. Cable Remote PHY is simply another service overlaid onto the existing Converged SDN Transport network architecture. We will cover all aspects of connectivity between the Cisco cBR-8 and the RPD device.

Distributed Access Architecture
The cable Converged Interconnect Network is part of a next-generation Distributed Access Architecture (DAA), an architecture unlocking higher subscriber bandwidth by moving traditional cable functions deeper into the network, closer to end users. R-PHY, or Remote PHY, places the analog-to-digital conversion much closer to users, reducing the cable distance and thus enabling denser and higher order modulation used to achieve Gbps speeds over existing cable infrastructure. This reference design will cover the CIN design to support Remote PHY deployments.

Remote PHY Components and Requirements
This section lists some of the components of an R-PHY network and the network requirements driven by those components. It is not considered to be an exhaustive list of all R-PHY components; please see the CableLabs specification document, the latest of which can be accessed via the following URL: https://specification-search.cablelabs.com/CM-SP-R-PHY

Remote PHY Device (RPD)
The RPD unlocks the benefits of DAA by integrating the physical analog-to-digital conversions in a device deployed either in the field or located in a shelf in a facility. The uplink side of the RPD or RPHY shelf is simply IP/Ethernet, allowing transport across widely deployed IP infrastructure. The RPD-enabled node puts the PHY function much closer to an end user, allowing higher end-user speeds. The shelf allows cable operators to terminate only the PHY function in a hub and place the CMTS/MAC function in a more centralized facility, driving efficiency in the hub and overall network. The following diagram shows various options for how RPDs or an RPD shelf can be deployed. Since the PHY function is split from the MAC, it allows independent placement of those functions.

RPD Network Connections
Each RPD is typically deployed with a single 10GE uplink connection.
The compactRPD shelf uses a single 10GE uplink for each RPD.Cisco cBR-8 and cnBRThe Cisco Converged Broadband Router performs many functions as part of a RemotePHY solution. The cBR-8 provisions RPDs, originates L2TPv3 tunnels to RPDs,provisions cable modems, performs cable subscriber aggregation functions, andacts as the uplink L3 router to the rest of the service provider network. In theRemote PHY architecture the cBR-8 acts as the DOCSIS core and can also serve asa GCP server and video core. The cBR-8 runs IOS-XE. The cnBR, cloud nativeBroadband Router, provides DOCSIS core functionality in a server-based softwareplatform deployable anywhere in the SP network. CST 3.0 has been validated usingthe cBR-8, the cnBR will be validated in an upcoming release.cBR-8 Network ConnectionsThe cBR-8 is best represented as having “upstream” and “downstream”connectivity.The upstream connections are from the cBR8 Supervisor module to the SP network.Subscriber data traffic and video ingress these uplink connections for deliveryto the cable access network. The cBR-8 SUP-160 has 8x10GE SFP+ physicalconnections, the SUP-250 has 2xQSFP28/QSFP+ interfaces for 40G/100G upstreamconnections.In a remote PHY deployment the downstream connections to the CIN are via theDigital PIC (DPIC-8X10G) providing 40G of R-PHY throughput with 8 SFP+ networkinterfaces.cBR-8 RedundancyThe cBR-8 supports both upstream and downstream redundancy. Supervisorredundancy uses active/standby connections to the SP network. Downstreamredundancy can be configured at both the line card and port level. Line cardredundancy uses an active/active mechanism where each RPD connects to the DOCSIScore function on both the active and hot standby Digital PIC line card. Portredundancy uses the concept of “port pairs” on each Digital PIC, with ports 0/1,2/3, 4/6, and 6/7 using either an active/active (L2) or active/standby (L3)mechanism. In the CST design we utilize a L3 design with the active/standbymechanism. The mechanism uses the same IP address on both ports, with thestandby port kept in a physical down state until switchover occurs.Remote PHY CommunicationDHCPThe RPD is provisioned using ZTP (Zero Touch Provisioning). DHCPv4 and DHCPv6are used along with CableLabs DHCP options in order to attach the RPD to thecorrect GCP server for further provisioning.Remote PHY Standard FlowsThe following diagram shows the different core functions of a Remote PHYsolution and the communication between those elements.GCPGeneric Communications Protocol is used for the initial provisioning of the RPD.When the RPD boots and received its configuration via DHCP, one of the DHCPoptions will direct the RPD to a GCP server which can be the cBR-8 or CiscoSmart PHY. GCP runs over TCP typically on port 8190.UEPI and DEPI L2TPv3 TunnelsThe upstream output from an RPD is IP/Ethernet, enabling the simplification ofthe cable access network. Tunnels are used between the RPD PHY functions andDOCSIS core components to transport signals from the RPD to the core elements,whether it be a hardware device like the Cisco cBR-8 or a virtual networkfunction provided by the Cisco cnBR (cloud native Broadband Router).DEPI (Downstream External PHY Interface) comes from the M-CMTS architecture,where a distributed architecture was used to scale CMTS functions. In the RemotePHY architecture DEPI represents a tunnel used to encapsulate and transport fromthe DOCSIS MAC function to the RPD. 
UEPI (Upstream External PHY Interface) is new to Remote PHY, and is used to encode and transport analog signals from the RPD to the MAC function.
In Remote PHY, both DEPI and UEPI tunnels use L2TPv3, defined in RFC 3931, to transport frames over an IP infrastructure. Please see the following Cisco white paper for more information on how tunnels are created specific to upstream/downstream channels and how data is encoded in the specific tunnel sessions: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/converged-cable-access-platform-ccap-solution/white-paper-c11-732260.html.
In general there will be one or two (standby configuration) UEPI and DEPI L2TPv3 tunnels to each RPD, with each tunnel having many L2TPv3 sessions for individual RF channels identified by a unique session ID in the L2TPv3 header. Since L2TPv3 is its own protocol, no port number is used between endpoints; the endpoint IP addresses are used to identify each tunnel. Unicast DOCSIS data traffic can utilize either unicast or multicast L2TPv3 tunnels. Multicast tunnels are used with downstream virtual splitting configurations. Multicast video is encoded and delivered using DEPI tunnels as well, using a multipoint L2TPv3 tunnel to multiple RPDs to optimize video delivery.

CIN Network Requirements
IPv4/IPv6 Unicast and Multicast
Due to the large number of elements and generally greenfield network builds, the CIN network must support all functions using both IPv4 and IPv6. IPv6 may be carried natively across the network or within an IPv6 VPN across an IPv4 MPLS underlay network. Similarly, the network must support multicast traffic delivery for both IPv4 and IPv6, delivered via the global routing table or Multicast VPN. Scalable dynamic multicast requires the use of PIMv4, PIMv6, IGMPv3, and MLDv2, so these protocols are validated as part of the overall network design. IGMPv2 and MLDv2 snooping are also required for designs using access bridge domains and BVI interfaces for aggregation.

Network Timing
Frequency and phase synchronization is required between the cBR-8 and RPD to properly handle upstream scheduling and downstream transmission. Remote PHY uses PTP (Precision Time Protocol) for timing synchronization with the ITU-T G.8275.2 timing profile. This profile carries PTP traffic over IP/UDP and supports a network with partial timing support, meaning multi-hop sessions between Grandmaster, Boundary Clocks, and clients as shown in the diagram below. The cBR-8 and its client RPD require timing alignment to the same Primary Reference Clock (PRC). In order to scale, the network itself must support PTP G.8275.2 as a T-BC (Boundary Clock). Synchronous Ethernet (SyncE) is also recommended across the CIN network to maintain stability when timing to the PRC.

CST 4.0+ Update to CIN Timing Design
Starting in CST 4.0, NCS nodes support both G.8275.1 and G.8275.2 on the same node, and also support interworking between them. If the network path between the PTP GM and client RPDs can support G.8275.1 on each hop, it should be used. G.8275.1 runs on physical interfaces and does not have limitations such as running over Bundle Ethernet interfaces. The G.8275.1 to G.8275.2 interworking takes place on the RPD leaf node, with G.8275.2 being used to the RPDs. The following diagram depicts a recommended end-to-end timing design between the PTP GM and the RPD. Please review the CST 4.0 Implementation Guide for details on configuring G.8275.1 to G.8275.2 interworking.
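The following is an illustrative sketch, not taken from the validated configuration, of enabling SyncE and a G.8275.2 T-BC clock profile on a CIN node. The domain number, profile name, and interface are assumptions for illustration only; the full timing configuration is covered in the implementation guide.

frequency synchronization
 quality itu-t option 1
!
ptp
 clock
  domain 44
  profile g.8275.2 clock-type T-BC
 !
 profile RPD-FACING
  transport ipv4
  port state master-only
 !
!
interface TenGigE0/0/0/10
 ptp
  profile RPD-FACING
 !
 frequency synchronization
 !
!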
In addition to PTP interworking, CST 4.0 supports PTP timing on BVI interfaces.QoSControl plane functions of Remote PHY are critical to achieving proper operation and subscriber traffic throughput. QoS is required on all RPD-facing ports, the cBR-8 DPIC ports, and all core interfaces in between. Additional QoS may be necessary between the cBR-8, RPD, and any PTP timing elements. See the design section for further details on QoS components.DHCPv4 and DHCPv6 RelayAs a critical component of the initial boot and provisioning of RPDs, the network must support DHCP relay functionality on all RPD-facing interfaces, for both IPv4 and IPv6.Converged SDN Transport CIN DesignDeployment Topology OptionsThe Converged SDN Transport design is extremely flexible in how Remote PHY components are deployed. Depending on the size of the deployment, components can be deployed in a scalable leaf-spine fabric with dedicated routers for RPD and cBR-8 DPIC connections or collapsed into a single pair of routers for smaller deployments. If a smaller deployment needs to be expanded, the flexible L3 routed design makes it very easy to simply interconnect new devices and scale the design to a fabric supporting thousands of RPD and other access network connections.High Scale Design (Recommended)This option maximizes statistical multiplexing by aggregating Digital PIC downstream connections on a separate leaf device, allowing one to connect a number of cBR-8 interfaces to a fabric with minimal 100GE uplink capacity. The topology also supports the connectivity of remote shelves for hub consolidation. Another benefit is the fabric has optimal HA and the ability to easily scale with more leaf and spine nodes.High scale topologyCollapsed Digital PIC and SUP Uplink ConnectivityThis design for smaller deployments connects both the downstream Digital PIC connections and uplinks on the same CIN core device. If there is enough physical port availability and future growth does not dictate capacity beyond these nodes this design can be used. This design still provides full redundancy and the ability to connect RPDs to any cBR-8. Care should be taken to ensure traffic between the DPIC and RPD does not traverse the SUP uplink interfaces.Collapsed cBR-8 uplink and Digital PIC connectivityCollapsed RPD and cBR-8 DPIC ConnectivityThis design connects each cBR-8 Digital PIC connection to the RPD leaf connected to the RPDs it will serve. This design can also be considered a “pod” design where cBR-8 and RPD connectivity is pre-planned. Careful planning is needed since the number of ports on a single device may not scale efficiently with bandwidth in this configuration.Collapsed or Pod cBR-8 Digital PIC and RPD connectivityIn the collapsed desigs care must be taken to ensure traffic between each RPD can reach the appropriate DPIC interface. If a leaf is single-homed to the aggregation router its DPIC interface is on, RPDs may not be able to reach their DPIC IP. The options with the shortest convergence time are# Adding interconnects between the agg devices or multiple uplinks from the leaf to agg devices.Cisco HardwareThe following table highlights the Cisco hardware utilized within the Converged SDN Transport design for Remote PHY. This table is non-exhaustive. One highlight is all NCS platforms listed are built using the same NPU family and share most features across all platforms. See specific platforms for supported scale and feature support. 
Product Role 10GE SFP+ 25G SFP28 100G QSFP28 Timing Comments NCS-55A1-24Q6H-S RPD leaf 48 24 6 Class B   N540-ACC-SYS RPD leaf 24 8 2 Class B Smaller deployments NCS-55A1-48Q6H-S DPIC leaf 48 48 6 Class B   NCS-55A2-MOD Remote agg 40 24 upto 8 Class B CFP2-DCO support NCS-55A1-36H-S Spine 144 (breakout) 0 36 Class B   NCS-5502 Spine 192 (breakout) 0 48 None   NCS-5504 Multi Upto 576 x Upto 144 Class B 4-slot modular platform Scalable L3 Routed DesignThe Cisco validated design for cable CIN utilizes a L3 design with or without Segment Routing. Pure L2 networks are no longer used for most networks due to their inability to scale, troubleshooting difficulty, poor network efficiency, and poor resiliency. L2 bridging can be utilized on RPD aggregation routers to simplify RPD connectivity.L3 IP RoutingLike the overall CST design, we utilize IS-IS for IPv4 and IPv6 underlay routing and BGP to carry endpoint information across the network. The following diagram illustrates routing between network elements using a reference deployment. The table below describes the routing between different functions and interfaces. See the implementation guide for specific configuration. Interface Routing Comments cBR-8 Uplink IS-IS Used for BGP next-hop reachability to SP Core cBR-8 Uplink BGP Advertise subscriber and cable-modem routes to SP Core cBR-8 DPIC Static default in VRF Each DPIC interface should be in its own VRF on the cBR-8 so it has a single routing path to its connected RPDs RPD Leaf Main IS-IS Used for BGP next-hop reachability RPD Leaf Main BGP Advertise RPD L3 interfaces to CIN for cBR-8 to RPD connectivity RPD Leaf Timing BGP Advertise RPD upstream timing interface IP to rest of network DPIC Leaf IS-IS Used for BGP next-hop reachability DPIC Leaf BGP Advertise cBR-8 DPIC L3 interfaces to CIN for cBR-8 to RPD connectivity CIN Spine IS-IS Used for reachability between BGP endpoints, the CIN Spine does not participate in BGP in a SR-enabled network CIN Spine RPD Timing IS-IS Used to advertise RPD timing interface BGP next-hop information and advertise default CIN Spine BGP (optional) In a native IP design the spine must learn BGP routes for proper forwarding CIN Router to Router InterconnectionIt is recommended to use multiple L3 links when interconnecting adjacent routers, as opposed to using LAG, if possible. Bundles increase the possibility for timing inaccuracy due to asymmetric timing traffic flow between slave and master. If bundle interfaces are utilized, care should be taken to ensure the difference in paths between two member links is kept to a minimum. All router links will be configured according to the global CST design. Leaf devices will be considered CST access PE devices and utilize BGP for all services routing.Leaf Transit TrafficIn a single IGP network with equal IGP metrics, certain link failures may cause a leaf to become a transit node. Several options are available to keep transit traffic from transiting a leaf and potentially causing congestion. Using high metrics on all leaf to agg uplinks will prohibit this and is recommended in all configurations.cBR-8 DPIC to CIN InterconnectionThe cBR-8 supports two mechanisms for DPIC high availability outlined in the overview section. DPIC line card and link redundancy is recommended but not a requirement. In the CST reference design, if link redundancy is being used each port pair on the active and standby line cards is connected to a different router and the default active ports (even port number) is connected to a different router. 
In the example figure, port 0 from active DPIC card 0 is connected to R1 and port 0 from standby DPIC card 1 is connected to R2. DPIC link redundancy MUST be configured using the “cold” method since the design is using L3 to each DPIC interface and no intermediate L2 switching. This is done with the cable rphy link redundancy cold global command and will keep the standby link in a down/down state until switchover occurs.DPIC line card and link HADPIC Interface ConfigurationEach DPIC interface should be configured in its own L3 VRF. This ensures traffic from an RPD assigned to a specific DPIC interface takes the traffic path via the specific interface and does not traverse the SUP interface for either ingress or egress traffic. It’s recommended to use a static default route within each DPIC VRF towards the CIN network. Dynamic routing protocols could be utilized, however it will slow convergence during redundancy switchover.Router Interface ConfigurationIf no link redundancy is utilized each DPIC interface will connect to the router using a point to point L3 interface.If using cBR-8 link HA, failover time is reduced by utilizing the same gateway MAC address on each router. Link HA uses the same IP and MAC address on each port pair on the cBR-8, and retains routing and ARP information for the L3 gateway. If a different MAC address is used on each router, traffic will be dropped until an ARP occurs to populate the GW MAC address on the router after failover. On the NCS platforms, a static MAC address cannot be set on a physical L3 interface. The method used to set a static MAC address is to use a BVI (Bridged Virtual Interface), which allows one to set a static MAC address. In the case of DPIC interface connectivity, each DPIC interface should be placed into its own bridge domain with an associated BVI interface. Since each DPIC port is directly connected to the router interface, the same MAC address can be utilized on each BVI.If using IS-IS to distribute routes across the CIN, each DPIC physical interface or BVI should be configured as a passive IS-IS interface in the topology. If using BGP to distribute routing information the “redistribute connected” command should be used with an appropriate route policy to restrict connected routes to only DPIC interface. The BGP configuration is the same whether using L3VPN or the global routing table.It is recommended to use a /31 for IPv4 and /127 for IPv6 addresses for each DPIC port whether using a L3 physical interface or BVI on the CIN router.RPD to Router InterconnectionThe Converged SDN Transport design supports both P2P L3 interfaces for RPD and DPIC aggregation as well as using Bridge Virtual Interfaces. A BVI is a logical L3 interface within a L2 bridge domain. In the BVI deployment the DPIC and RPD physical interfaces connected to a single leaf device share a common IP subnet with the gateway residing on the leaf router.It is recommended to configure the RPD leaf using bridge-domains and BVI interfaces. This eases configuration on the leaf device as well as the DHCP configuration used for RPD provisioning.The following shows the P2P and BVI deployment options.Native IP or L3VPN/mVPN DeploymentTwo options are available and validated to carry Remote PHY traffic between the RPD and MAC function. Native IP means the end to end communication occurs as part of the global routing table. In a network with SR-MPLS deployed such as the CST design, unicast IP traffic is still carried across the network using an MPLS header. 
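Before continuing, the following is a hedged sketch of the BVI-based RPD aggregation described above, combining the bridge domain, the BVI gateway with DHCP relay, and passive IS-IS advertisement of the BVI subnet. All interface names, addresses, and the DHCP helper address are illustrative assumptions rather than validated values.

l2vpn
 bridge group RPD
  bridge-domain RPD-BD-100
   interface TenGigE0/0/0/30
   !
   routed interface BVI100
   !
  !
 !
!
interface BVI100
 description RPD gateway
 ipv4 address 192.0.2.1 255.255.255.0
!
dhcp ipv4
 profile RPD-RELAY relay
  helper-address vrf default 10.0.10.5
 !
 interface BVI100 relay profile RPD-RELAY
!
router isis ACCESS
 interface BVI100
  passive
  address-family ipv4 unicast
  !
 !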
Carrying unicast traffic within an MPLS header allows for fast reconvergence in the network using SR and enables the network to carry other VPN services even if they are not used to carry Remote PHY traffic. In the native IP deployment, multicast traffic uses either PIM signaling with IP multicast forwarding or mLDP in-band signaling for label-switched multicast. The multicast profile used is profile 7 (Global mLDP in-band signaling). L3VPN and mVPN can also be utilized to carry Remote PHY traffic within a VPN service end to end. This has the benefit of separating Remote PHY traffic from the network underlay, improving security and treating Remote PHY as another service on a converged access network. Multicast traffic in this use case uses mVPN profile 14. mLDP is used for label-switched multicast, and the NG-MVPN BGP control plane is used for all multicast discovery and signaling.

SR-TE
Segment Routing Traffic Engineering may be utilized to carry traffic end to end across the CIN network. Using On-Demand Networking simplifies the deployment of SR-TE Policies from ingress to egress by using specific color BGP communities to instruct head-end nodes to create policies satisfying specific user constraints. As an example, if RPD aggregation prefixes are advertised using BGP to the DPIC aggregation device, SR-TE tunnels following a user constraint can be built dynamically between those endpoints.
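As described in the SR-TE paragraph above, ODN relies on a color extended community attached to the advertised service routes. A minimal sketch of coloring routes toward a peer is shown below; the color value, set and policy names, AS number, and neighbor address are illustrative assumptions.

extcommunity-set opaque COLOR-CIN
 100
end-set
!
route-policy SET-CIN-COLOR
 set extcommunity color COLOR-CIN
 pass
end-policy
!
router bgp 65000
 neighbor 100.0.1.50
  remote-as 65000
  address-family ipv4 unicast
   route-policy SET-CIN-COLOR out
  !
 !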
CIN Quality of Service (QoS)
QoS is a requirement for delivering trouble-free Remote PHY. This design uses sample QoS configurations for concept illustration, but QoS should be tailored for specific network deployments. New CIN builds can utilize the configurations in the implementation guide verbatim if no other services are being carried across the network. Please see the section in this document on QoS for general NCS QoS information and the implementation guide for specific details.

CST Network Traffic Classification
The following lists specific traffic types which should be treated with specific priority, default markings, and network classification points.
Traffic Type | Ingress Interface | Priority | Default Marking | Comments
BGP | Routers, cBR-8 | Highest | CS6 (DSCP 48) | None
IS-IS | Routers, cBR-8 | Highest | CS6 | IS-IS is single-hop and uses highest priority queue by default
BFD | Routers | Highest | CS6 | BFD is single-hop and uses highest priority queue by default
DHCP | RPD | High | CS5 | DHCP COS is set explicitly
PTP | All | High | DSCP 46 | Default on all routers, cBR-8, and RPD
DOCSIS MAP/UCD | RPD, cBR-8 DPIC | High | DSCP 46 |
DOCSIS BWR | RPD, cBR-8 DPIC | High | DSCP 46 |
GCP | RPD, cBR-8 DPIC | Low | DSCP 0 |
DOCSIS Data | RPD, cBR-8 DPIC | Low | DSCP 0 |
Video | cBR-8 | Medium | DSCP 32 | Video within multicast L2TPv3 tunnel when cBR-8 is video core
MDD | RPD, cBR-8 | Medium | DSCP 40 |

CST and Remote-PHY Load Balancing
Unicast network traffic is load balanced based on MPLS labels and IP header criteria. The devices used in the CST design are capable of load balancing traffic based on MPLS labels used in the SR underlay and IP headers underneath any MPLS labels. In the higher bandwidth downstream direction, where a series of L2TPv3 tunnels are created from the cBR-8 to the RPD, traffic is hashed based on the source and destination IP addresses of those tunnels. Downstream L2TPv3 tunnels from a single Digital PIC interface to a set of RPDs will be distributed across the fabric based on RPD destination IP address. The following illustrates unicast load balancing across the network.
Multicast traffic is not load balanced across the network. Whether the network is utilizing PIMv4, PIMv6, or mVPN, a multicast flow with two equal cost downstream paths will utilize only a single path, and only a single member link will be utilized in a link bundle. If using multicast, ensure sufficient bandwidth is available on a single link between two adjacencies.

SmartPHY RPD Automation
SmartPHY is an automation solution for managing deployed RPDs across the SP network. In a non-SmartPHY deployment, providers must manually assign RPHY cores via DHCP and manually configure the cBR-8 via CLI. SmartPHY provides a flexible GUI- or API-driven way to eliminate this manual configuration. SmartPHY is configured as the RPHY core in the DHCP server for all RPDs. When the RPD boots it will initiate a GCP session to SmartPHY. SmartPHY identifies the RPD and, if it is configured in SmartPHY, will redirect it to the proper RPHY core instance. When provisioning a new RPD, SmartPHY will also deploy the proper configuration to the RPHY core cBR-8 node and verify the RPD is operational. The diagram below shows basic SmartPHY operation.

4G Transport and Services Modernization
While talk about deploying 5G services has reached a fever pitch, many providers are continuing to build and evolve their 4G networks. New services require more agile and scalable networks, satisfied by Cisco's Converged SDN Transport. The services modernization found in Converged SDN Transport 2.0 follows work done in EPN 4.0. Transport modernization requires simplification and new abilities. We evolve the EPN 4.0 design based on LDP and hierarchical BGP-LU to one using Segment Routing with an MPLS data plane and the SR-PCE to add inter-domain path computation, scale, and programmability. L3VPN based 4G services remain, but are modernized to utilize SR-TE On-Demand Next-Hop, reducing provisioning complexity, increasing scale, and adding advanced path computation constraints. 4G services utilizing L3VPN remain the same, but those utilizing L2VPN such as VPWS and VPLS transition to EVPN services. EVPN is the modern replacement for legacy LDP signalled L2VPN services, reducing complexity and adding advanced multi-homing functionality. The following table highlights the legacy and new ways of delivering services for 4G.
Element | EPN 4.0 | Converged SDN Transport
Intra-domain MPLS Transport | LDP | IS-IS w/Segment Routing
Inter-domain MPLS Transport | BGP Labeled Unicast | SR using SR-PCE for Computation
MPLS L3VPN (LTE S1,X2) | MPLS L3VPN | MPLS L3VPN w/ODN
L2VPN | VPWS LDP Pseudowire | EVPN VPWS w/ODN
eMBMS Multicast | Native / mLDP | Native / mLDP
The CST 4G Transport modernization covers only MPLS-based access and not L2 access scenarios.

Business and Infrastructure Services using L3VPN and EVPN
EVPN Multicast
Multicast within an L2VPN EVPN has been supported since Converged SDN Transport 1.0. Multicast traffic within an EVPN is replicated to the endpoints interested in a specific group via EVPN signaling. EVPN utilizes ingress replication for all multicast traffic, meaning multicast is encapsulated with a specific EVPN label and unicast to each PE router with interested listeners for each multicast group. Ingress replication may add additional traffic to the network, but simplifies the core and data plane by eliminating multicast signaling, state, and hardware replication. EVPN multicast is also not subject to domain boundary restrictions.

EVPN Centralized Gateway Multicast
In CGW deployments, EVPN multicast is enhanced with support for EVPN Route Type 6 (RT-6), the Selective Multicast Ethernet Tag Route.
RT-6 or SMET routes are used to distribute a leaf node's interest in a specific multicast S,G. This allows the sender node to transmit the multicast traffic only to EVPN routers with an interested receiver, instead of sending unwanted traffic that is dropped on the remote router. In release 5.0 CGW is supported on ASR 9000 routers only. CGW selective multicast is supported for IPv4 and *,G multicast.

LDP to Converged SDN Transport Migration
Very few networks today are built as greenfield networks; most new designs are migrated from existing ones and must support some level of interop during migration. In the Converged SDN Transport design we tackle one of the most common migration scenarios, LDP to the Converged SDN Transport design. The following sections explain the configuration and best practices for performing the migration. The design is applicable to transport and services originating and terminating in the same LDP domain.

Towards Converged SDN Transport Design
The Converged SDN Transport design utilizes isolated IGP domains in different parts of the network, with each domain separated at a logical boundary by an ASBR router. SR-PCE is used to provide end to end paths across the inter-domain network. LDP does not support inter-domain transport, only transport between LDP FECs in the same IGP domain. It is recommended to plan logical boundaries if necessary when doing a flat LDP migration to the Converged SDN Transport design, so that when migration is complete the future scale benefits can be realized.

Segment Routing Enablement
One must define the global Segment Routing Block (SRGB) to be used across the network on every node participating in SR. There is a default block enabled by default, but it may not be large enough to support an entire network, so it is advised to right-size this value for your deployment. The current maximum SRGB size for SR-MPLS is 256K entries.
Enabling SR in IS-IS requires only issuing the command “segment-routing mpls” under the IPv4 address-family and assigning a prefix-SID value to any loopback interfaces you require the node be addressed towards as a service destination. Enabling TI-LFA is done on a per-interface basis in the IS-IS configuration for each interface.
Enabling SR-Prefer within IS-IS aids in migration by preferring an SR prefix-SID to a prefix over an LDP prefix, allowing a seamless migration to SR without needing to enable SR completely within a domain. A combined sketch of these commands is shown at the end of this section.

Segment Routing Mapping Server Design
One component introduced with Segment Routing is the SR Mapping Server (SRMS), a control-plane element converting unicast LDP FECs to Segment Routing prefix-SIDs for advertisement throughout the Segment Routing domain. Each separate IGP domain requires a pair of SRMS nodes until full migration to SR is complete.
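The sketch below pulls together the SR enablement steps described in this section: SRGB definition, enabling SR-MPLS and SR-prefer under IS-IS, a loopback prefix-SID, and per-interface TI-LFA. Values such as the SRGB range, IS-IS instance name, SID index, and interface are illustrative assumptions rather than the validated configuration.

segment-routing
 global-block 16000 32000
!
router isis ACCESS
 address-family ipv4 unicast
  segment-routing mpls sr-prefer
 !
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid index 150
  !
 !
 interface TenGigE0/0/0/0
  address-family ipv4 unicast
   fast-reroute per-prefix
   fast-reroute per-prefix ti-lfa
  !
 !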
Automation
Network Management using Cisco Crosswork Network Controller
Crosswork Network Controller provides a platform for UI and API based network management. CNC supports RSVP-TE, SR-TE Policy, L2VPN, and L3VPN provisioning using standards based IETF models. More information on Crosswork Network Controller can be found at: https://www.cisco.com/c/en/us/products/cloud-systems-management/crosswork-network-controller/index.html

L2VPN Service Provisioning and Visualization
Crosswork Network Controller supports UI and API based provisioning of EVPN-VPWS services using the IETF L2NM standard model. Once services are provisioned they are visualized using the CNC topology UI along with their underlying SR-TE policies, if applicable.

L3VPN Service Provisioning and Visualization
Crosswork Network Controller supports UI and API based provisioning of L3VPN services using the IETF L3NM standard model. Once services are provisioned they are visualized using the CNC service topology UI along with their underlying SR-TE policies, if applicable.

Crosswork Automated Assurance
In addition to provisioning, monitoring of all transport infrastructure is also supported, including advanced service assurance for xVPN services. Service assurance checks all aspects of the network making up the service along with realtime Y.1731 measurements to ensure the defined SLA for the service is met. The figure below shows an example of a degraded service where the measured one-way latency of 1680µs on the end to end path has exceeded the SLA of 500µs.

Zero Touch Provisioning
In addition to model-driven configuration and operation, Converged SDN Transport 1.5 supports ZTP operation for automated device provisioning. ZTP is useful both in production as well as staging environments to automate initial device software installation, deploy an initial bootstrap configuration, as well as advanced functionality triggered by ZTP scripts. ZTP is supported on both out of band management interfaces as well as in-band data interfaces. When a device first boots, the IOS-XR ZTP process begins on the management interface of the device and, if no response is received or the interface is not active, the ZTP process will begin on the data ports. IOS-XR can be part of an ecosystem of automated device and service provisioning via Cisco NSO.

Zero Touch Provisioning using Crosswork Network Controller
Crosswork Network Controller now includes a ZTP application used to onboard network devices with the proper IOS-XR software and base configuration. Crosswork ZTP supports both traditional unsecured as well as fully secure ZTP operation as outlined in RFC 8572. More information on Crosswork ZTP can be found at: https://www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/datasheet-c78-743677.html

Model-Driven Telemetry
In the 3.0 release the implementation guide includes a table of model-driven telemetry paths applicable to different components within the design. A minimal telemetry subscription sketch is included at the end of this section. More information on Cisco model-driven telemetry can be found at https://www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/telemetry/66x/b-telemetry-cg-ncs5500-66x.html. Additional information about how to consume and visualize telemetry data can be found at https://xrdocs.io/telemetry. We also introduce integration with Cisco Crosswork Health Insights, a telemetry and automated remediation platform, and sensor packs corresponding to Converged SDN Transport components. More information on Crosswork Health Insights can be found at https://www.cisco.com/c/en/us/support/cloud-systems-management/crosswork-health-insights/model.html

Transport and Service Management using Crosswork Network Controller
Crosswork Network Controller provides support for provisioning SR-TE and RSVP-TE traffic engineering paths as well as managing the VPN services utilizing those paths or standard IGP based Segment Routing paths.
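As referenced in the Model-Driven Telemetry paragraph above, a minimal telemetry subscription on an IOS-XR node looks roughly like the following. The collector address, port, and sensor path are illustrative assumptions; the full list of validated sensor paths is in the implementation guide.

telemetry model-driven
 destination-group CollectorGroup
  address-family ipv4 192.0.2.100 port 57000
   encoding self-describing-gpb
   protocol grpc no-tls
  !
 !
 sensor-group InterfaceCounters
  sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
 !
 subscription CSTSubscription
  sensor-group-id InterfaceCounters sample-interval 30000
  destination-id CollectorGroup
 !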
Network Services Orchestrator
NSO is a management and orchestration (MANO) solution for network services and Network Functions Virtualization (NFV). NSO includes capabilities for describing, deploying, configuring, and managing network services and VNFs, as well as configuring the multi-vendor physical underlay network elements with the help of standard open APIs such as NETCONF/YANG or a vendor-specific CLI using Network Element Drivers (NEDs). In the Converged SDN Transport design, NSO is used for Services Management, Service Provisioning, and Service Orchestration. Example or Core NSO Function Packs are used for end-to-end provisioning of CST services. NSO provides several options for service design, as shown in Figure 32# a service model with service templates, a service model with mapping logic, or a service model with mapping logic and service templates.

Figure 32# NSO – Components

A service model is a way of defining a service in a template format. Once the service is defined, the service model accepts user inputs for the actual provisioning of the service. For example, an E-Line service requires two endpoints and a unique virtual circuit ID to enable the service. The end devices, attachment circuit UNI interfaces, and a circuit ID are required parameters that should be provided by the user to bring up the E-Line service. The service model uses the YANG modeling language (RFC 6020) inside NSO to define a service.

Once the service characteristics are defined based on the requirements, the next step is to build the mapping logic in NSO to extract the user inputs. The mapping logic can be implemented using Python or Java. The purpose of the mapping logic is to transform the service models to device models. It includes mechanisms for how service related operations are reflected on the actual devices. This involves mapping a service operation to available operations on the devices.

Finally, service templates need to be created in XML for each device type. In NSO, the service templates are required to translate the service logic into final device configuration through a CLI NED. NSO can also directly use the device YANG models using NETCONF for device configuration. These service templates enable NSO to operate in a multi-vendor environment.

Converged SDN Transport Supported Service Models
CST 5.0+ supports using NSO Transport SDN Function Packs. The T-SDN function packs cover both Traffic Engineering and xVPN service provisioning. CST 5.0 is aligned with T-SDN FP Bundle version 3.0, which includes the following function packs.

Core Function Packs are supported function packs meant to be used as-is without modification. The SR-TE ODN CFP configures SR-TE On-Demand properties, and the SR-TE CFP configures Segment Routing Traffic Engineered Policies.

Example Function Packs are meant to be used as-is or modified to fit specific network use cases. The IETF-TE FP provisions RSVP-TE LSPs using the IETF TE model, the L2NM FP provisions EVPN-VPWS services using the IETF L2NM model, the L3NM FP provisions multi-point L3VPN services using the IETF L3NM model, and the Y1731 FP provisions Y.1731 CFM for L2VPN/L3VPN services.

https#//www.cisco.com/c/dam/en/us/td/docs/cloud-systems-management/crosswork-network-automation/NSO_Reference_Docs/Cisco_NSO_Transport_SDN_Function_Pack_Bundle_User_Guide_3_0_0.pdf
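As a rough illustration of what the SR-TE ODN function pack ultimately renders on a head-end router, an on-demand color template in IOS-XR can be sketched as follows. The color value, PCE delegation, and latency optimization objective are illustrative assumptions, not values mandated by the function pack or the validated design.

segment-routing
 traffic-eng
  on-demand color 100
   dynamic
    pcep
    !
    metric
     type latency
    !
   !
  !
 !
!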
Base Services Supporting Advanced Use Cases
Overview
The Converged SDN Transport design aims to enable simplification across all layers of a Service Provider network. Thus, the Converged SDN Transport services layer focuses on a converged Control Plane based on BGP. BGP based services include EVPNs and traditional L3VPNs (VPNv4/VPNv6).

EVPN is a technology initially designed for Ethernet multipoint services to provide advanced multi-homing capabilities. By using BGP for distributing MAC address reachability information over the MPLS network, EVPN brought the same operational and scale characteristics of IP based VPNs to L2VPNs. Today, beyond DCI and E-LAN applications, the EVPN solution family provides a common foundation for all Ethernet service types, including E-LINE, E-TREE, and data center routing and bridging scenarios. EVPN also provides options to combine L2 and L3 services into the same instance.

To simplify service deployment, provisioning of all services is fully automated using Cisco Network Services Orchestrator (NSO) with YANG models and NETCONF. Refer to Section# “Network Services Orchestrator (NSO)”. There are two types of services# End-To-End and Hierarchical. The next two sections describe these two types of services in more detail.

Ethernet VPN (EVPN)
EVPNs solve two long standing limitations for Ethernet services in Service Provider networks# Multi-Homed & All-Active Ethernet Access, and Service Provider Network integration with the Central Office or with the Data Center.

Ethernet VPN Hardware Support
In CST 3.0+ EVPN ELAN, ETREE, and VPWS services are supported on all IOS-XR devices. The ASR920 running IOS-XE does not support native EVPN services in the CST design, but can integrate into an overall EVPN service by utilizing service hierarchy. Please see the tables under End-to-End and Hierarchical Services for supported service types.

Multi-Homed & All-Active Ethernet Access
Figure 21 demonstrates the greatest limitation of traditional L2 multipoint solutions like VPLS.

Figure 21# EVPN All-Active Access

When VPLS runs in the core, loop avoidance requires that PE1/PE2 and PE3/PE4 only provide Single-Active redundancy toward their respective CEs. Traditionally, techniques such as mLACP or legacy L2 protocols like MST, REP, G.8032, etc. were used to provide Single-Active access redundancy. The same situation occurs with Hierarchical-VPLS (H-VPLS), where the access node is responsible for providing Single-Active H-VPLS access by active and backup spoke pseudowires (PWs).

All-Active access redundancy models are not deployable because VPLS technology lacks the capability of preventing the L2 loops that derive from the forwarding mechanisms employed in the core for certain categories of traffic. Broadcast, Unknown-Unicast and Multicast (BUM) traffic sourced from the CE is flooded throughout the VPLS core and is received by all PEs, which in turn flood it to all attached CEs.
In our example PE1 would flood BUM traffic from CE1 to the core, and PE2 would send it back toward CE1 upon receiving it. EVPN uses BGP-based Control Plane techniques to address this issue and enables Active-Active access redundancy models for either Ethernet or H-EVPN access.

Figure 22 shows another issue related to BUM traffic addressed by EVPN.

Figure 22# EVPN BUM Duplication

In the previous example, we described how BUM is flooded by PEs over the VPLS core, causing local L2 loops for traffic returning from the core. Another issue is related to BUM flooding over the VPLS core toward remote PEs. In our example PE3 and PE4 each receive the BUM traffic and send it to their attached CE, causing CE2 to receive duplicated BUM traffic. EVPN also addresses this second issue, since the BGP Control Plane allows just one PE to send BUM traffic to an All-Active EVPN access.

Figure 23 describes the last important EVPN enhancement.

Figure 23# EVPN MAC Flip-Flopping

In the case of All-Active access, traffic is load-balanced (per-flow) over the access PEs (the CE uses LACP to bundle multiple physical Ethernet ports and uses a hash algorithm to achieve per-flow load-balancing). The remote PEs, PE3 and PE4, receive the same flow from different neighbors. With a VPLS core, PE3 and PE4 would rewrite the MAC address table continuously, each time the same MAC address is seen from a different neighbor. EVPN solves this by means of “Aliasing”, which is also signaled via the BGP Control Plane.

Service Provider Network - Integration with Central Office or with Data Center
Another very important EVPN benefit is the simple integration with the Central Office (CO) or with the Data Center (DC). Note that Metro Central Office design is not covered by this document. The adoption of EVPNs provides huge benefits in how L2 multipoint technologies can be deployed in the CO/DC. One such benefit is the converged Control Plane (BGP) and converged data plane (SR MPLS/SRv6) over the SP WAN and CO/DC network. Moreover, EVPNs can replace existing proprietary Ethernet Multi-Homed/All-Active solutions with a standard BGP-based Control Plane.

End-To-End (Flat) Services
The End-To-End services use cases are summarized in the table in Figure 24 and shown in the network diagram in Figure 25.

Figure 24# End-To-End – Services table
Figure 25# End-To-End – Services

All services use cases are based on the BGP Control Plane. Refer also to Section# “Transport and Services Integration”.

Hierarchical Services
Hierarchical Services use cases are summarized in the table of Figure 26 and shown in the network diagram of Figure 27.

Figure 26# Supported Hierarchical Services
Figure 27# Hierarchical Services Control Plane

Hierarchical services designs are critical for Service Providers looking to limit requirements on the access platforms and to deploy more centralized provisioning models that leverage very rich feature sets on a limited number of touch points. Hierarchical services can also be required by Service Providers who want to integrate their SP-WAN with the Central Office/Data Center network using well-established designs based on Data Center Interconnect (DCI). Figure 27 shows hierarchical services deployed on PE routers, but the same design applies when services are deployed on AG or DCI routers. The Converged SDN Transport design offers scalable hierarchical services with simplified provisioning.
The three most important use cases are described in the following sections# Hierarchical L2 Multipoint Multi-Homed/All-Active; Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRB; and Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHE.

Hierarchical L2 Multipoint Multi-Homed/All-Active
Figure 28 shows a very elegant way to take advantage of the benefits of Segment Routing Anycast-SID and EVPN. This use case provides a Hierarchical L2 Multipoint Multi-Homed/All-Active (Single-Homed Ethernet access) service with traditional access router integration.

Figure 28# Hierarchical Services (Anycast-PW)

Access router A1 establishes a Single-Active static pseudowire (Anycast-Static-PW) to the Anycast IP address of PE1/PE2. The PEs' anycast IP address is represented by an Anycast-SID. Access router A1 doesn't need to establish active/backup PWs as in a traditional H-VPLS design and doesn't need any enhancement on top of the established spoke pseudowire design. PE1 and PE2 use the BGP EVPN Control Plane to provide Multi-Homed/All-Active access, protecting from L2 loops and providing efficient per-flow load-balancing (with aliasing) toward the remote PEs (PE3/PE4). A3, PE3 and PE4 do the same, respectively.

Hierarchical L2/L3 Multi/Single-Home, All/Single-Active Service (H-EVPN) and Anycast-IRB
Figure 29 shows how EVPNs can completely replace the traditional H-VPLS solution. This use case provides the greatest flexibility, as Hierarchical L2 Multi/Single-Home, All/Single-Active modes are available at each layer of the service hierarchy.

Figure 29# Hierarchical Services (H-EVPN)

Optionally, Anycast-IRB can be used to enable Hierarchical L2/L3 Multi/Single-Home, All/Single-Active service and to provide optimal L3 routing.

Hierarchical L2/L3 Multipoint Multi-Homed/Single-Active (H-EVPN) and PWHE
Figure 30 shows how the previous H-EVPN can be extended by taking advantage of Pseudowire Headend (PWHE). PWHE, in combination with Multi-Homed, Single-Active EVPN, provides a Hierarchical L2/L3 Multi-Homed/Single-Active (H-EVPN) solution that supports QoS. It completely replaces traditional H-VPLS based solutions. This use case provides Hierarchical L2 Multi/Single-Home, All/Single-Active service.

Figure 30# Hierarchical Services (H-EVPN and PWHE)

Refer also to the section# “Transport and Services Integration”.

EVPN Centralized Gateway
Similar to the Hierarchical L2/L3 service with Anycast-IRB, EVPN Centralized Gateway extends that service type by allowing the use of EVPN-ELAN services between the access site and the core location. In previous versions the Anycast-IRB L3 gateway needed to be part of the access L2 domain and could not be placed elsewhere in the EVPN across the core network. EVPN CGW relaxes this constraint and allows the L3 Anycast IRB gateway to be located at any point in the EVPN ELAN. The IRB can be placed in either the global routing table or within a VRF. In CST 5.0 EVPN Centralized GW is supported on the ASR 9000 platform.

The figure below shows an example EVPN CGW deployment. In this scenario A-PE3, A-PE4, A-PE5, PE1, and PE2 all belong to the same EVPN-ELAN EVI 100. The CE nodes connected to A-PE3, A-PE4, and A-PE5 can communicate at Layer 2 or Layer 3 with each other without having to traverse the core nodes. This is one fundamental difference between EVPN-CGW and EVPN-HE. Traffic destined to another subnet, such as the 10.0.0.2 address, is routed through the CGW core gateway. Also in this example, CE4 is an example of a multi-homed CE node, utilizing a LAG across A-PE3 and A-PE4. This multi-homed connection can be configured in an all-active, single-active, or port-active configuration.
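A minimal sketch of how such an EVPN multi-homed attachment circuit is typically expressed in IOS-XR is shown below. The bundle interface, ESI value, bridge names, and EVI are illustrative placeholders; single-active or port-active operation would be selected with the corresponding load-balancing mode under the Ethernet segment (all-active is the default).

evpn
 interface Bundle-Ether1
  ethernet-segment
   identifier type 0 00.01.00.01.00.01.00.01.01
   ! load-balancing-mode single-active would override the all-active default
  !
 !
!
l2vpn
 bridge group EVPN-CGW
  bridge-domain EVI-100
   interface Bundle-Ether1.100
   !
   evi 100
   !
  !
 !
!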
Figure 31# Hierarchical Services EVPN Centralized GW

EVPN Head-End for L3 Services
CST 5.0 also introduces Cisco's EVPN Head-End solution for hierarchical services. EVPN Head-End is similar to the existing hierarchical PWHE services, but allows the use of native EVPN-VPWS between the access PE node and the centralized PE node. This simplifies deployments by allowing providers to use the fully dynamic EVPN control plane for signaling, including the ability to signal active/backup state between access PE and core PE nodes. In CST 5.0 EVPN-HE is supported for L3 service termination, with the L3 gateway residing on either a PWHE P2P interface or a BVI interface. The L3 GW can reside in the global routing table or within a VRF. In CST 5.0 EVPN-HE for L3 services is supported on the ASR 9000 platform.

The figure below shows a typical EVPN Head-End deployment. A-PE3 is configured as an EVPN-VPWS endpoint, with PE1 and PE2 configured with the same EVPN-VPWS EVI, acting as All-Active or Single-Active gateways. PE1 and PE2 are configured with the same 10.1.0.1/24 address on the terminating L3 interface, providing a redundant gateway for the CE device with address 10.1.0.2/24. While not shown in this figure, the CE device could also be multi-homed to two separate A-PE nodes in an all-active, single-active, or port-active configuration.

Figure 32# Hierarchical Services EVPN Centralized GW

The Converged SDN Transport Design - Summary
The Converged SDN Transport brings huge simplification at the Transport as well as at the Services layers of a Service Provider network. Simplification is a key factor for real Software Defined Networking (SDN). Cisco continuously improves Service Provider network designs to satisfy market needs for scalability and flexibility.

From a very well established and robust Unified MPLS design, Cisco has embarked on a journey toward transport simplification and programmability, which started with the Transport Control Plane unification in Evolved Programmable Network 5.0 (EPN5.0). The Cisco Converged SDN Transport provides another huge leap forward in simplification and programmability, adding Services Control Plane unification and centralized path computation.

Figure 51# Converged SDN Transport – Evolution

The transport layer requires only IGP protocols with Segment Routing extensions for intra- and inter-domain forwarding. Fast recovery for node and link failures leverages Fast Re-Route (FRR) by Topology Independent Loop Free Alternate (TI-LFA), which is a built-in function of Segment Routing. End to end LSPs are built using Segment Routing Traffic Engineering, which does not require additional signaling protocols; instead it relies solely on SDN controllers, thus increasing overall network scalability.
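To make these transport-layer building blocks concrete, a minimal IS-IS Segment Routing and TI-LFA enablement on an IOS-XR node, along the lines of the Segment Routing Enablement section earlier, could be sketched as follows. The SRGB range, IS-IS instance name, prefix-SID value, and interface are illustrative and would be sized and assigned per deployment.

segment-routing
 global-block 16000 80000
!
router isis CORE
 address-family ipv4 unicast
  segment-routing mpls sr-prefer
 !
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid absolute 16041
  !
 !
 interface TenGigE0/0/0/0
  address-family ipv4 unicast
   fast-reroute per-prefix
   fast-reroute per-prefix ti-lfa
  !
 !
!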
The controller layer is based on standard industryprotocols like BGP-LS, PCEP, BGP-SR-TE, etc., for path computation andNETCONF/YANG for service provisioning, thus providing a on openstandards based solution.For all those reasons, the Cisco Converged SDN Transport design really brings anexciting evolution in Service Provider Networking.", "url": "/blogs/latest-converged-sdn-transport-hld", "author": "Phil Bedard", "tags": "iosxr, Metro, Design, 5G, Cable, CIN, RON" } , "#": {} , "blogs-latest-peering-fabric-hld": { "title": "Peering Fabric Design", "content": " On This Page Revision History Key Drivers Traffic Growth Network Simplification Network Efficiency Enhanced SLAs for Peering Traffic High-Level Design Peering Strategy Content Cache Aggregation Topology and Peer Distribution Platforms Cisco 8000 Cisco NCS 5500 / NCS 5700 Cisco NCS 540, ASR 9000 Control Plane Slow Peer Detection for BGP Telemetry Automation Zero Touch Provisioning Cisco Crosswork Health Insights KPI pack Advanced Security using BGP Flowspec and QPPB (1.5) Radware validated DDoS solution Radware DefensePro Radware DefenseFlow Solution description Solution diagram Router SPAN (monitor) to physical interface configuration Router SPAN (monitor) to PWE Netscout Arbor validated DDoS Solution Solution Diagram Netscout Arbor Sightline Sightline Appliance Roles Netscout Arbor Threat Management System (TMS) Solution description Edge Mitigation Options Traffic Redirection Options Netscout Arbor TMS Blacklist Offloading Mitigation Example Internet and Peering in a VRF RPKI and Route Origin Validation Next-Generation IXP Fabric Validated Design Peering Fabric Design Use Cases Traditional IXP Peering Migration to Peering Fabric Peering Fabric Extension Localized Metro Peering and Content Delivery Express Peering Fabric Datacenter Edge Peering Peer Traffic Engineering with Segment Routing ODN (On-Demand Next-Hop) for Peering DDoS Traffic Steering using SR-TE and EPE Low-Level Design Integrated Peering Fabric Reference Diagram Distributed Peering Fabric Reference Diagram Peering Fabric Hardware Detail NCS-5501-SE NCS-55A1-36H-SE NCS-55A1-24H NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card NCS-55A2-MOD-SE-S Peer Termination Strategy Distributed Fabric Device Roles PFL – Peering Fabric Leaf PFS – Peering Fabric Spine Device Interconnection Capacity Scaling Peering Fabric Control Plane PFL to Peer PFL to PFS PFS to Core SR Peer Traffic Engineering Summary Nodal EPE Peer Interface EPE Abstract Peering SR-TE On-Demand Next-Hop for Peering ODN Configuration SR-TE Per-Flow Traffic Steering Per-Flow Segment Routing Configuration (NCS Platforms) Per-Flow QoS Configuration Per-Flow Policy Configuration On-Demand Next-Hop Per-Flow Configuration SR-TE Egress Peer Engineering EPE SID Types EPE PeerSet Use Case IOS-XR Configuration IXP Fabric Low Level Design Segment Routing Underlay EVPN L2VPN Services Peering Fabric Telemetry Telemetry Diagram Model-Driven Telemetry BGP Monitoring Protocol Netflow / IPFIX Automation and Programmability Crosswork Cloud Network Insights Looking Glass and AS Path Trace AS Path and Prefix Alarm Capabilities Crosswork Cloud Traffic Analysis Looking Glass and AS Path Trace AS Path and Prefix Alarm Capabilities Crosswork Cloud Trust Insights Visualize Trust Track & Verify Inventory Utilize Trusted Data for Automation Cisco NSO Modules Netconf YANG Model Support 3rd Party Hosted Applications XR Service Layer API Recommended Device and Protocol Configuration Overview Common Node Configuration Enable 
LLDP Globally PFS Nodes IGP Configuration Segment Routing Traffic Engineering BGP Global Configuration Model-Driven Telemetry Configuration PFL Nodes Peer QoS Policy Peer Infrastructure ACL Peer Interface Configuration IS-IS IGP Configuration BGP Add-Path Route Policy BGP Global Configuration EBGP Peer Configuration PFL to PFS IBGP Configuration Netflow/IPFIX Configuration Model-Driven Telemetry Configuration Abstract Peering Configuration PFS Configuration BGP Flowspec Configuration and Operation Enabling BGP Flowspec Address Families on PFS and PFL Nodes BGP Flowspec Server Policy Definition BGP Flowspec Server Enablement BGP Flowspec Client Configuration QPPB Configuration and Operation Routing Policy Configuration Global BGP Configuration QoS Policy Definition Interface-Level Configuration BGP Graceful Shutdown Outbound graceful shutdown configuration Inbound graceful shutdown configuration Activating graceful shutdown Security Peering and Internet in a VRF VRF per Peer, default VRF for Internet Internet in a VRF Only VRF per Peer, Internet in a VRF Infrastructure ACLs BCP Implementation BGP Attribute and CoS Scrubbing BGP Control-Plane Type 6 Encryption Configuration TCP Authentication Option, MD5 Deprecation Per-Peer Control Plane Policers BGP Prefix Security RPKI Origin Validation BGP RPKI and ROV Confguration Create ROV Routing Policies Configure RPKI Server and ROV Options Enabling RPKI ROV on BGP Neighbors Communicating ROV Status via Well-Known BGP Community BGPSEC (Reference Only) DDoS traffic steering using SR-TE SR-TE Policy configuration Egress node BGP configuration Egress node MPLS static LSP configuration Appendix Applicable YANG Models NETCONF YANG Paths BGP Operational State Global BGP Protocol State BGP Neighbor State Example Usage BGP RIB Data Example Usage BGP Flowspec Device Resource YANG Paths Validated Model-Driven Telemetry Sensor Paths Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform LLDP Monitoring Interface statistics and state The following sub-paths can be used but it is recommended to use the base openconfig-interfaces model Aggregate bundle information (use interface models for interface counters) BGP Peering information IS-IS IGP information It is not recommended to monitor complete RIB tables using MDT but can be used for troubleshooting QoS and ACL monitoring BGP RIB information It is not recommended to monitor these paths using MDT with large tables Routing policy Information Revision History Version Date Comments 1.0 05/08/2018 Initial Peering Fabric publication 1.5 07/31/2018 BGP-FS, QPPB, ZTP, Internet/Peering in a VRF, NSO Services 2.0 04/01/2019 IXP Fabric, ODN and SR-PCE for Peering, RPKI 3.0 01/10/2020 SR-TE steering for DDoS, BGP graceful shutdown, Radware DDoS validation 3.5 11/01/2020 BGP slow peer detection, Type-6 Password Encryption, Arbor DDoS validation 4.0 03/01/2021 SR Per-Flow Traffic Steering 5.0 07/01/2022 Cisco 8000 as Peering Edge, Cisco Crosswork Cloud Network Insights, Traffic Analysis, and Trust Insights Key DriversTraffic GrowthInternet traffic has seen a compounded annual growth rate of 30% orhigher over the last five years, as more devices are connected and morecontent is consumed, fueled by the demand for video. Traffic willcontinue to grow as more content sources are added and Internetconnections speeds increase. 
Service and content providers must design their peering networks to scale for a future of more connected devices, with traffic sources and destinations spanning the globe. Efficient peering is required to deliver traffic to consumers.

Network Simplification
Simple networks are easier to build and easier to operate. As networks scale to handle traffic growth, the level of network complexity must remain flat. A prescriptive design using standard discrete components makes it easier for providers to scale from networks handling a small amount of traffic to 10s of Tbps without complete network forklifts. Fabrics with reduced control-plane elements and feature sets enhance stability and availability. Dedicating nodes to specific functions of the network also helps isolate the rest of the network from malicious behavior, defects, or instability.

Network Efficiency
Network efficiency refers not only to maximizing network resources but also to optimizing the environmental impact of the deployed network. Much of Internet peering today is done in 3rd party facilities where space, power, and cooling are at a premium. High-density, lower environmental footprint devices are critical to handling more traffic without exceeding the capabilities of a facility. In cases where multiple facilities must be connected, a simple and efficient way to extend networks must exist.

Enhanced SLAs for Peering Traffic
Networks and their users are increasingly reliant on 3rd party services or 3rd party cloud providers for both internal and external applications. These applications can be sensitive to jitter, latency, and bandwidth congestion. The Segment Routing enabled peering fabric allows providers to create end to end paths satisfying metrics based on latency, IGP cost, secondary TE cost, or overall hop count, and to use constraints such as SR flexible algorithms to ensure traffic always stays on the optimal path.

High-Level Design
The Peering design incorporates high-density, environmentally efficient edge routers, a prescriptive topology and peer termination strategy, and features delivered through IOS-XR to solve the needs of service and content providers. Also included as part of the Peering design are ways to monitor the health and operational status of the peering edge and, through Cisco NSO integration, to assist providers in automating peer configuration and validation. All designs are both feature tested and validated as a complete design to ensure stability once implemented.

Peering Strategy
The design proposes a localized peering strategy to reduce network cost for “eyeball” service providers by placing peering or content provider cache nodes closer to traffic consumers. This not only reduces capacity on long-haul backbone networks carrying traffic from IXPs to end users but also improves the quality of experience for users by reducing latency to content sources. The same design can also be used for content provider networks wishing to deploy a smaller footprint solution in an SP location or 3rd party peering facility.

Content Cache Aggregation
Traditional peering via EBGP at defined locations or over point to point circuits between routers is not sufficient today to optimize and efficiently deliver content between content providers and end consumers. Caching has been used for decades to perform traffic offload closer to eyeballs, and plays a critical role in today's networks.
The Peering Fabric design considers cache aggregation another role in “Peering”, creating a cost-optimized and scalable way to aggregate both provider and 3rd party caching servers such as those from Netflix, Google, or Akamai. The following diagram depicts a typical cache aggregation scenario at a metro aggregation facility. In larger high bandwidth facilities it is recommended to place caching nodes on a separate, scalable set of devices apart from functions such as PE edge functions. Deeper in the network, Peering Fabric devices have the flexibility to integrate other functions such as small edge PE and compute termination, for example in a 5G Mobile Edge Compute edge DC. Scale limitations are not a consideration, with the ability to support full routing tables in an environmentally optimized 1RU/2RU footprint.

Topology and Peer Distribution
The Cisco Peering Fabric introduces two options for fabric topology and peer termination. The first, similar to more traditional peering deployments, collapses the Peer Termination and Core Connectivity network functions into a single physical device, using the device's internal fabric to connect each function. The second option separates the network functions into distinct physical layers, connected via an external fabric running over standard Ethernet.

In many typical SP peering deployments, a traditional two-node setup is used where providers vertically upgrade nodes to support the higher capacity needs of the network. Some may employ technologies such as back to back or multi-chassis clusters in order to support more connections while keeping the perceived operational footprint low. However, failures and operational issues occurring in these types of systems are typically difficult to troubleshoot and repair. They also require lengthy planning and timeframes for performing system upgrades. We introduce a horizontally scalable distributed peering fabric, the end result being more deterministic interface or node failures.

Minimizing the loss of peering capacity is very important for both ingress-heavy SPs and egress-heavy content providers. The loss of local peering capacity means traffic must ingress or egress a sub-optimal network port. Making a conscious design decision to spread peer connections, even to the same peer, across multiple edge nodes helps increase resiliency and limit traffic-affecting network events.

Platforms
Cisco 8000
The Cisco 8000 series represents the next generation in router technology, featuring Cisco's Silicon One ASICs to deliver unmatched density and power efficiency while supporting the features and resiliency service providers require. In 5.0 the Q200 series routers are supported in the Peering Fabric design. This includes the 8201-32FH fixed system and the 88-LC0-36FH 36x400G line card for the 8804, 8808, 8812, and 8818 modular chassis. The 88-LC0-34H14FH (34x100G, 14x400G) is also ideal for deployments requiring a mix of 100G and 400G interfaces. All Cisco 8000 routers run the next-generation XR7 operating system with advanced management and programmability capabilities. The Peering Fabric design supports using the Cisco 8000 series as a peering fabric leaf, peering fabric spine, combined PFL/PFS, or core router.
The Peering Fabric IX design is not applicable to the Cisco 8000 in version 5.0. https#//www.cisco.com/c/en/us/products/routers/8000-series-routers/index.html

Cisco NCS 5500 / NCS 5700
The Cisco NCS 5500 and NCS 5700 platforms are ideal for edge peer termination, given their high port density, large RIB and FIB scale, buffering capability, and IOS-XR software feature set. The NCS 5500 and 5700 series are also space and power efficient without sacrificing capabilities. Using these components, a peering fabric can scale to support 100s of terabits of capacity in a single rack for large peering deployments. Fixed chassis are ideal for incrementally building a peering edge fabric. The NC5-55A1-36X100GE-A-SE, NC5-5A1-24H, and NCS-57B1-5D-SE are efficient high density building blocks which can be rapidly deployed as needed, without installing a large footprint of devices on day one. The next-generation NCS 5700 devices support a mix of 100G and 400G for high capacity deployments.

Deployments needing more capacity or interface flexibility can utilize the family of modular chassis available. The NCS 5504 4-slot and NCS 5508 8-slot modular chassis are ideal for high port density needs, supporting a variety of line cards such as the 36x100GE NC57-36H-SE and 18x400G (72x100GE) NC57-18DD-SE. Smaller deployments needing interface flexibility can utilize the NC5-57C3-MOD-S platform, with 48 1/10/25G + 8x100G onboard ports plus 3 modular port adapter slots supporting 1/10/25/100/400G interfaces, all in a 3RU platform with 300mm depth.

All NCS 5500 and 5700 routers also contain powerful Route Processors to unlock powerful telemetry and programmability. The Peering Fabric fixed chassis contain 1.6Ghz 8-core processors and 32GB of RAM. The latest NC55-RP-E and NC55-RP2-E (Class C timing) for the modular NCS 5500 chassis have a 1.9Ghz 6-core processor and 32G of RAM.

More information on the NCS 5500 and NCS 5700 platforms can be found at#
https#//www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html
https#//www.cisco.com/c/en/us/products/routers/network-convergence-system-5700-series/index.html

Cisco NCS 540, ASR 9000
Today's networks may consume or deliver traffic to external peers at many different places in the network; traditional IX peering locations, far edge data centers, or even a cell tower site are all locations where we find edge peering or CDN today. All Cisco routers running IOS-XR have the feature set to fulfill edge peering or content delivery at any point in the network. From the smallest NCS 540 routers to the high-scale service edge and BNG capable ASR 9000, these platforms support the security and traffic engineering capabilities to optimize edge traffic delivery for both enterprise and service provider networks.

Control Plane
The peering fabric design introduces a simplified control plane built upon IPv4/IPv6 with Segment Routing. In the collapsed design, each peering node is connected to EBGP peers and upstream to the core via standard IS-IS, OSPF, and TE protocols, acting as a PE or LER in a provider network.

In the distributed design, network functions are separated. Peer termination happens on Peering Fabric Leaf nodes. Peering Fabric Spine aggregation nodes are responsible for core connectivity and perform more advanced LER functions. The PFS routers use ECMP to balance traffic between PFL routers and are responsible for forwarding within the fabric and to the rest of the provider network. Each PFS acts as an LER, incorporated into the control-plane of the core network.
The PFS, or alternatively vRRs, reflect learned peer routes from the PFL to the rest of the network. The SR control-plane supports several traffic engineering capabilities. EPE to a specific peer interface, PFL node, or PFS is supported. We also introduce the abstract peering concept, where PFS nodes utilize a next-hop address bound to an anycast SR SID to allow traffic engineering on a per-peering-center basis.

Slow Peer Detection for BGP
In the Peering Fabric 3.5 design and IOS-XR 7.1.2, slow-peer detection is enabled by default. Slow peers are those that are slow to receive and process inbound BGP updates and acknowledge them to the sender. If the slow peer is participating in the same update group as other peers, this can slow down the update process for all peers. In this release, when IOS-XR detects a slow peer it creates a syslog message with information about the specific peer.

Telemetry
The Peering Fabric design uses the rich telemetry available in IOS-XR and all Cisco platforms to enable an unprecedented level of insight into network and device behavior. The Peering Fabric leverages Model-Driven Telemetry and NETCONF along with both standard and native YANG models for metric and statistics collection. Telemetry configuration and applicable sensor paths have been identified to assist providers in knowing what to monitor and how to monitor it. All configuration in Cisco routers can be represented by a YANG model, and all operational data can be consumed using network models over either Cisco MDT or standards-based gNMI.

Automation
NETCONF and YANG using OpenConfig and native IOS-XR models are used to help automate peer configuration and validation. Cisco has developed specific Peering Fabric NSO service models to help automate common tasks such as peer interface configuration, peer BGP configuration, and adding physical interfaces to an existing peer bundle. In addition to the device-level capabilities, the Cisco Crosswork family of automation provides deeper insights into network behavior.

Zero Touch Provisioning
In addition to model-driven configuration and operation, Peering Fabric 1.5 also supports ZTP operation for automated device provisioning. ZTP is useful both in production as well as staging environments to automate initial device software installation, deploy an initial bootstrap configuration, as well as advanced functionality triggered by ZTP scripts. ZTP is supported on both out of band management interfaces as well as in-band data interfaces.

Cisco Crosswork Health Insights KPI pack
To ease the monitoring of common peering telemetry using Crosswork Health Insights, a peering sensor pack is available containing common elements monitored for peering not included in the baseline CW HI KPI definitions. These include BGP session monitoring, RIB/FIB counts, and Flowspec statistics.

Advanced Security using BGP Flowspec and QPPB (1.5)
Release 1.5 of the Cisco Peering Fabric enhances the design by adding advanced security capabilities using BGP Flowspec and QoS Policy Propagation using BGP (QPPB). BGP Flowspec was standardized in RFC 5575 and defines additional BGP NLRI to inject packet filter information to receiving routers (a minimal client enablement is sketched below). BGP is the control-plane for disseminating the policy information, while it is up to the BGP Flowspec receiver to implement the dataplane rules specified in the NLRI.
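A minimal sketch of enabling a router as a BGP Flowspec client on IOS-XR is shown below, assuming the Flowspec rules are learned from a route reflector. The ASN and neighbor address are illustrative placeholders.

router bgp 65000
 address-family ipv4 flowspec
 !
 neighbor 192.0.2.1
  remote-as 65000
  update-source Loopback0
  address-family ipv4 flowspec
  !
 !
!
flowspec
 address-family ipv4
  local-install interface-all
 !
!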
At the Internet peering edge, DDoS protection has become extremely important, and automating the remediation of an incoming DDoS attack is equally critical. Automated DDoS protection is only one BGP Flowspec use case; any application needing a programmatic way to create interface packet filters can make use of its capabilities.

QPPB allows using BGP attributes as match criteria in dataplane packet filters. Matching packets based on attributes like BGP community and AS path allows service providers to create simplified edge QoS policies by not having to manage more cumbersome prefix lists or keep them up to date when new prefixes are added. QPPB is supported in the peering fabric for destination prefix BGP attribute matching and has a number of use cases when delivering traffic from external providers to specific internal destinations.

Radware validated DDoS solution
Radware, a Cisco partner, provides a robust and intelligent DDoS detection and mitigation solution covering both volumetric and application-layer DDoS attacks. The validated solution includes the following elements#

Radware DefensePro
DefensePro is used for attack detection and traffic scrubbing. DefensePro can be deployed at the edge of the network or centralized, as is the case with a centralized scrubbing center. DefensePro uses realtime traffic analysis through SPAN (monitor) sessions from the edge routers to the DefensePro virtual machine or hardware appliance.

Radware DefenseFlow
DefenseFlow can work in a variety of ways as part of a comprehensive DDoS mitigation solution. DefenseFlow performs anomaly detection by using advanced network behavioral analysis to first baseline a network during peacetime and then evaluate anomalies to determine when an attack is occurring. DefenseFlow can also incorporate third party data, such as flow data, to enhance its attack detection capability. DefenseFlow also coordinates the mitigation actions of other solution components such as DefensePro and initiates traffic redirection through the use of BGP and BGP Flowspec on edge routers.

Solution description
The following steps describe the analysis and mitigation of DDoS attacks using Radware components. Radware DefenseFlow is deployed to orchestrate DDoS attack detection and mitigation. A virtual or appliance version of Radware DefensePro is deployed to a peering fabric location or a centralized location. PFL nodes use interface monitoring sessions to mirror specific ingress traffic to an interface connected to the DefensePro element. The interface can be local to the PFL node, or SPAN over Pseudowire can be used to tunnel traffic to an interface attached to a centralized DefensePro.

Solution diagram

Router SPAN (monitor) to physical interface configuration
The following configuration is used to direct traffic to a DefensePro virtual machine or appliance.

monitor-session radware ethernet
 destination interface TenGigE0/0/2/2
!
interface TenGigE0/0/2/1
 description ~DefensePro clean interface~
 ipv4 address 182.10.1.1 255.255.255.252
!
interface TenGigE0/0/2/2
 description ~SPAN interface to DefensePro~
!
interface TenGigE0/0/2/3
 description ~Transit peer connection~
 ipv4 address 182.30.1.1 255.255.255.252
 monitor-session radware ethernet port-level
!
end

Router SPAN (monitor) to PWE
The following configuration is used to direct traffic to a DefensePro virtual machine or appliance at a remote location.

monitor-session radware ethernet
 destination pseudowire
!
l2vpn
 xconnect group defensepro-remote
  p2p dp1
   monitor-session radware
   neighbor ipv4 100.0.0.1 pw-id 1
!
interface TenGigE0/0/2/3
 description ~Transit peer connection~
 ipv4 address 182.30.1.1 255.255.255.252
 monitor-session radware ethernet port-level
!
end

Netscout Arbor validated DDoS Solution
Netscout, a Cisco partner, has deployed its Arbor solution at SPs around the world for advanced DDoS detection and mitigation. Using network analysis at the flow and packet level along with BGP and network statistics data, Arbor categorizes traffic based on user defined and learned criteria to quickly detect attacks. Once those attacks are detected, SPs can mitigate them using a combination of Route Triggered Blackhole, ACLs, and BGP Flowspec. Additionally, SPs can deploy the Arbor TMS or vTMS scrubbing appliances on-net to separate and block malicious traffic from legitimate traffic. Below we walk through the various solution components used in the Netscout Arbor solution. Information about all of Netscout's traffic visibility and security solutions can be found at https#//www.netscout.com

Solution Diagram

Netscout Arbor Sightline
Sightline Appliance Roles
Sightline comprises the scalable distributed services responsible for network data collection, attack analysis, and mitigation coordination across the network. Each Sightline virtual machine appliance can be configured in a specific solution role. One of the deployed appliances is configured as the leader appliance, maintaining the configuration for the entire Sightline cluster. In order to scale collection of network data, multiple collectors can be deployed to collect Netflow, SNMP, and BGP data. This data is then aggregated and used for traffic analysis and attack detection. Sightline elements can be configured via CLI or via the web UI once the UI appliance is operational. The roles for Sightline VM appliances are as follows#

UI# Provides the web UI and all API access to the Sightline system. Required (recommended as Leader).
Traffic and Routing Analysis# Provides Netflow, BGP, and SNMP collection from the network along with DDoS analytics. Required.
Data Storage# Separate data storage for Managed Object data, increasing scale of the overall solution for large deployments. Not required.
Flow Sensor# Generates flow data from the network when routers are not capable of exporting Netflow. Not required.

As shown above, the UI and Traffic and Routing Analysis appliances are required.

Netscout Arbor Threat Management System (TMS)
The TMS or vTMS appliances provide deeper visibility into network traffic and act as a scrubber as part of a holistic DDoS mitigation solution. The TMS performs deep packet inspection to identify application layer attacks at a packet and payload level, performs granular mitigation of the attack traffic, and provides reporting for the traffic.
The TMS is integrated with a Routing and Analytics appliance, so when attacks are detected by the R&A appliance the traffic can be redirected to a tethered TMS appliance for further inspection or mitigation.

Solution description
The following steps describe the analysis and mitigation of DDoS attacks using Netscout Arbor components#
Deploy a Netscout Arbor Sightline UI leader virtual appliance.
Deploy one or more Netscout Arbor Sightline Routing and Analytics appliances.
Deploy one or more Netscout Arbor TMS or vTMS appliances.
Configure all routers in the network to export Netflow data to the R&A appliances.
Configure Sightline and the network for SNMP and BGP collection from each network router, assigned to the proper roles (Edge, Core).
If necessary, configure Netscout Arbor Managed Objects to collect and analyze specific traffic for anomalies and attacks.
Configure mitigation components such as RTBH next-hops, TMS mitigation, and BGP Flowspec redirect and drop parameters.

Edge Mitigation Options
The methods to either police or drop traffic at the edge of the network are#

Route Triggered Blackhole# The RTBH IPv4 or IPv6 BGP prefix is advertised from the Routing and Analytics node, directing edge routers to null route a specific prefix being attacked. This will cause all traffic to the destination prefix to be dropped on the edge router.

Access Control Lists# ACLs are generated and deployed on edge interfaces to mitigate attacks. In addition to matching either source or destination prefixes, ACLs can also match additional packet header information such as protocol and port. ACLs can be created to either drop all traffic matching the specific defined rules or rate-limit traffic to a configured policing rate.

BGP Flowspec# BGP Flowspec mitigation allows the provider to distribute edge mitigation in a scalable way using BGP. In a typical BGP Flowspec deployment the Netscout Arbor R&A node will advertise the BGP Flowspec policy to a provider Route Reflector, which then distributes the BGP FS routes to all edge routers. BGP Flowspec rules can match a variety of header criteria and perform drop, police, or redirect actions.

Traffic Redirection Options
BGP Flowspec# It is recommended to use BGP Flowspec to redirect traffic on PFL nodes to TMS appliances. This can be done through traditional configuration with next-hop redirection in the global routing table, redirection into a “dirty” VRF, or using static next-hops into SR-TE tunnels in the case where the scrubbing appliances are not connected via a directly attached interface to the PFL or PFS nodes.

Netscout Arbor TMS Blacklist Offloading
Blacklist offloading is a combination of traffic scrubbing using the TMS along with filtering/dropping traffic on each edge router. The Netscout Arbor system identifies the top sources of attack traffic and automatically generates the BGP Flowspec rules to drop that traffic on the edge router before it is redirected to the TMS. This makes the most efficient use of the TMS mitigation resources.

Mitigation Example
The graphic below shows an example of traffic mitigation via RTBH. Netscout Arbor still receives flow information from the network edge for mitigated traffic, so Arbor is able to detect the amount of traffic which has been mitigated using the appropriate mitigation method.

Internet and Peering in a VRF
While Internet peering and carrying the Internet table in a provider network is typically done using the Global Routing Table (the default VRF in IOS-XR), many modern networks are being built to isolate the GRT from the underlying infrastructure.
In this case, the Internet global table is carried as a service just like any other VPN service, leaving the infrastructure layer protected from the global Internet. Another application using VRFs is to simply isolate peers to specific VRFs in order to isolate the forwarding plane of each peer from the others and to control which routes a peer sees by the use of VPN route target communities as opposed to outbound routing policy. In this simplified use case the global table is still carried in the default VRF, using IOS-XR capabilities to import and export routes to and from specific peer VRFs. Separating Internet and peering routes into specific VRFs also gives flexibility in creating custom routing tables for specific customers, giving a service provider the flexibility to offer separate regional or global reach on the same network.

Internet in a VRF and Peering in a VRF for IPv4 and IPv6 are compatible with most Peering Fabric features. Specific caveats are documented in the Appendix of the document.

RPKI and Route Origin Validation
RPKI stands for Resource Public Key Infrastructure and is a repository for attaching a trust anchor to Internet routing resources such as Autonomous Systems and IP prefixes. Each RIR (Regional Internet Registry) houses the signed resource records it is responsible for, giving a trust anchor to those resources. The RPKI contains Route Origin Authorization (ROA) objects, used to uniquely identify the ASN originating a prefix and, optionally, the longer sub-prefixes covered by it. RPKI records are published by each Regional Internet Registry (RIR) and consumed by offline RPKI validators. The RPKI validator is an on-premise application responsible for compiling a list of routes considered VALID. Keep in mind these are only the routes which are registered in the RPKI database; no information is gathered from the global routing table. Once resource records are validated, the validator uses the RTR protocol to communicate with client routers, which periodically make requests for an updated database. The router uses this database along with policy to validate incoming BGP prefixes against the database, a process called Route Origin Validation (ROV). ROV verifies that the origin ASN in the AS_PATH of the prefix NLRI matches the RPKI database. A communication flow diagram is given below. RPKI configuration examples are given in the implementation section.

The Peering Fabric design was validated using the Routinator RPKI validator. Please see the security section for configuration of RPKI ROV in IOS-XR. For more information on RPKI and RPKI deployment with IOS-XR please see# https#//xrdocs.io/design/blogs/routinator-hosted-on-xr

Next-Generation IXP Fabric
Introduced in Peering Fabric 2.0 is a modern design for IXP fabrics. The design creates a simplified fault-tolerant L2VPN fabric with point to point and multi-point peer connectivity. Segment Routing brings a simplified MPLS underlay with resilience using TI-LFA and traffic engineering capabilities using Segment Routing Traffic Engineering Policies. Today's IX fabrics utilize either traditional L2 networks or emulated L2 using VPLS and LDP/RSVP-TE underlays. The Cisco NG IX Fabric uses EVPN for all L2VPN services, replacing complicated LDP signaled services with a scalable BGP control-plane.
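As a simple illustration of the kind of EVPN ELAN service the IX fabric uses in place of VPLS, the following IOS-XR sketch bridges a peer-facing attachment circuit into an EVPN EVI. The interface, bridge group and domain names, and EVI value are illustrative placeholders.

l2vpn
 bridge group IX-FABRIC
  bridge-domain PEERING-LAN
   interface TenGigE0/0/0/1.10
   !
   evi 10
   !
  !
 !
!
evpn
 evi 10
  advertise-mac
  !
 !
!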
See the implementation section for more details on configuring the IX fabric underlay and EVPN services. The IX fabric can also utilize the NSO automation created in the Metro Fabric design for deploying EVPN-VPWS (point-to-point) and multi-point EVPN ELAN services.

Validated Design
The Peering Fabric design control, management, and forwarding planes have undergone validation testing to ensure individual design features work as intended and the peering fabric as a whole performs without fault. Validation is done exceeding real-world scaling requirements to ensure the design fulfills its role in existing networks with room for future growth.

Peering Fabric Design Use Cases
Traditional IXP Peering Migration to Peering Fabric
A traditional SP IXP design uses one or two large modular systems terminating all peering connections. In many cases, since providers are constrained on space and power, they use a collapsed design where the minimal set of peering nodes not only terminates peer connections but also provides services and core connectivity to the location. The Peering Fabric uses best of breed high density, low footprint hardware requiring much less space than older generation modular systems. Many older systems provide densities at approximately 4x100GE per rack unit, while Peering Fabric PFL nodes start at 24x100GE or 36x100GE per 1RU with high FIB capability. Due to the superior space efficiency, there is no longer a limitation of using just a pair of nodes for these functions. In either a collapsed function or distributed function design, peers can be distributed across a number of devices to increase resiliency and lessen collateral impact when failures occur. The diagram below shows a fully distributed fabric, where peers are now distributed across three PFL nodes, each with full connectivity to upstream PFS nodes.

Peering Fabric Extension
In some cases, there may be peering facilities within close geographic proximity which need to integrate into a single fabric. This may happen if there are multiple 3rd party facilities in a close geographic area, each with unique peers you want to connect to. There may also be multiple independent peering facilities within a small geographic area into which you do not wish to install a complete peering fabric. In those cases, connecting remote PFL nodes to a larger peering fabric can be done using optical transport or longer range gray optics.

Localized Metro Peering and Content Delivery
In order to drive greater network efficiency, content sources should be placed as close to the end destination as possible. Traditional wireline and wireless service providers have heavy inbound traffic from content providers delivering OTT video. Providers may also be delivering their own IP video services to on-net and off-net destinations via an SP CDN. Peering and internal CDN equipment can be placed within a localized peer or content delivery center, connected via a common peering fabric. In these cases the PFS nodes connect directly to the metro core to enable delivery across the region or metro.

Express Peering Fabric
An evolution to localized metro peering is to interconnect the PFS peering nodes directly or over a metro-wide peering core. The main driver for direct interconnection is minimizing the number of router and transport network interfaces traffic must pass through.
High density optical muxponders such as the NCS1002, along with flexible photonic ROADM architectures enabled by the NCS2000, can help make the most efficient use of metro fiber assets.

Datacenter Edge Peering
In order to serve traffic as close to consumer endpoints as possible, a provider may construct a peering edge attached to an edge or central datacenter. As gateway functions in the network become virtualized for applications such as vPE, vCPE, and mobile 5G, the need to attach Internet peering to the SP DC becomes more important. The Peering Fabric supports interconnection to the DC via the SP core or with the PFS nodes acting as leafs to the DC spine. These would act as traditional border routers in the DC design.

Peer Traffic Engineering with Segment Routing
Segment Routing performs efficient source routing of traffic across a provider network. Traffic engineering is particularly applicable to peering, as content providers look for ways to optimize egress network ports and eyeball providers work to reduce network hops between ingress and subscriber. There are also a number of advanced use cases based on using constraints to place traffic on optimal paths, such as latency. An SR-TE Policy represents a forwarding entity within the SR domain mapping traffic to a specific network path, defined statically on the node or computed by an external PCE. An additional benefit of SR is the ability to source route traffic based on a node SID or an anycast SID representing a set of nodes. ECMP behavior is preserved at each point in the network, redundancy is simplified, and traffic protection is supplied using TI-LFA. In the Low-Level Design we explore common peer engineering use cases. Much more information on Segment Routing technology and its future evolution can be found at http#//segment-routing.net

ODN (On-Demand Next-Hop) for Peering
The 2.0 release of Peering Fabric introduces ODN as a method for dynamically provisioning SR-TE Policies to nodes based on specific “color” extended communities attached to advertised BGP routes. The color represents a set of constraints used for the provisioned SR-TE Policy, applied to traffic automatically steered into the Policy once the SR-TE Policy is instantiated. An applicable example is the case where a provider has several types of peers on the same device sending traffic to destinations across the larger SP network. Some of this traffic may be best effort with no constraints, other traffic from cloud partners may be considered low-latency traffic, and traffic from a services partner may have additional constraints such as maintaining a path disjoint from the same peer on another router. Traffic in the reverse direction, egressing a peer from an SP location, can also utilize the same mechanisms to apply constraints to egress traffic.

DDoS Traffic Steering using SR-TE and EPE
SR-TE and Egress Peer Engineering can be utilized to direct DDoS traffic to a specific end node and specific DDoS destination interface without the complexities of using VRFs to separate dirty/clean traffic. On ingress, traffic is immediately steered into an SR-TE Policy and no IP lookup is performed between the ingress node and the egress DDoS “dirty” interface. In the 3.0 design using IOS-XR 6.6.3, BGP Flowspec redirects traffic to a next-hop IP pointing to a pre-configured “DDoS” SR Policy. An MPLS xconnect is used to map DDoS traffic carrying a specific EPE label on the egress node to a specific egress interface.
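A rough sketch of the kind of statically defined SR-TE policy such a Flowspec redirect could point at is shown below. The policy name, color, end-point, and label values are illustrative placeholders; the second label in the segment list stands in for an EPE peer SID identifying the scrubbing ("dirty") interface on the egress node.

segment-routing
 traffic-eng
  segment-list DDOS-PATH
   index 10 mpls label 16005
   index 20 mpls label 24008
  !
  policy DDOS-SCRUB
   color 999 end-point ipv4 192.0.2.100
   candidate-paths
    preference 100
     explicit segment-list DDOS-PATH
     !
    !
   !
  !
 !
!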
Low-Level Design
Integrated Peering Fabric Reference Diagram
Distributed Peering Fabric Reference Diagram

Peering Fabric Hardware Detail
The NCS5500 family of routers provides high density, high routing scale, ideal buffer sizes, and environmental efficiency to help providers satisfy any peering fabric use case. Due to high FIB scale, large buffers, and a broad IOS-XR feature set, all prescribed hardware can serve in either a collapsed or distributed fabric. Further detailed information on each platform can be found at https#//www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html.

NCS-5501-SE
The NCS 5501 is a 1RU fixed router with 40X10GE SFP+ and 4X100GE QSFP28 interfaces. The 5501 has an IPv4 FIB scale of at least 2M routes. The 5501-SE is ideal as a peering leaf node when providers need 10GE interface flexibility such as ER, ZR, or DWDM.

NCS-55A1-36H-SE
The 55A1-36H-SE is a second generation 1RU NCS5500 fixed platform with 36 100GE QSFP28 ports operating at line rate. The -SE model contains an external TCAM, increasing route scale to a minimum of 3M IPv4/512K IPv6 routes in its FIB. It also contains a powerful multi-core route processor with 64GB of RAM and an on-board 64GB SSD. Its high density, efficiency, and buffering capability make it ideal in 10GE or 100GE deployments. Peering fabrics can scale to much higher capacity 1RU at a time by simply adding additional 55A1-36H-SE spine nodes.

NCS-55A1-24H
The NCS-55A1-24H is a second generation 1RU NCS5500 fixed platform with 24 100GE QSFP28 ports. The device uses two 900GB NPUs, with 12X100GE ports connected to each NPU. The 55A1-24H uses a high scale NPU with a minimum of 1.3M IPv4/256K IPv6 routes. At just 675W it is ideal for 10GE peering fabric deployments with a migration path to 100GE connectivity. The 55A1-24H also has a powerful multi-core processor and 32GB of RAM.

NCS 5504 and 5508 Modular Chassis and NC55-36X100G-A-SE line card
Very large peering fabric deployments or those needing interface flexibility such as IPoDWDM connectivity can use the modular NCS5500 series chassis. Large deployments can utilize the second-generation 36X100G-A-SE line card with external TCAM, supporting a minimum of 3M IPv4 routes.

NCS-55A2-MOD-SE-S
The NCS-55A2-MOD router is a 2RU router with 24x10G SFP+ interfaces, 16x25G SFP28 interfaces, and two Modular Port Adapter (MPA) slots with 400Gbps of full-duplex bandwidth. A variety of MPAs are available, adding additional 10GE, 100GE QSFP28, and 100G/200G CFP2 interfaces. The CFP2 interfaces support CFP2-DCO Digital Coherent Optics, simplifying deployment for peering extensions connected over dark fiber or DWDM multiplexers. The 55A2-MOD-SE-S uses a next-generation external TCAM with a minimum route scale of 3M IPv4/512K IPv6. The 55A2-MOD-SE-S also supports advanced security using BGP Flowspec and QPPB.

Peer Termination Strategy
Often overlooked when connecting to Internet peers is determining a strategy to maximize efficiency and resiliency within a local peering instance. Oftentimes a peer is connected to a single peering node, even when two nodes exist, for ease of configuration and coordination with the peering or transit partner. However, with minimal additional configuration and administration assisted by automation, even single peers can be spread across multiple edge peering nodes. Ideally, within a peering fabric, a peer is connected to each leaf in the fabric.
In cases where this cannot be done, the provider should use capacity planning processes to balance peers and transit connections across multiple leafs in the fabric. The added resiliency leads to greater efficiency when failures do happen, with less reliance on peering capacity further away from the traffic destination.
Distributed Fabric Device Roles
PFL – Peering Fabric Leaf
The Peering Fabric Leaf is the node physically connected to external peers. Peers could be aggregation routers or 3rd party CDN nodes. In a deconstructed design the PFL is analogous to a line card in a modular chassis solution. PFL nodes can be added as capacity needs grow.
PFS – Peering Fabric Spine
The Peering Fabric Spine acts as an aggregation node for the PFLs and is also physically connected to the rest of the provider network. The provider network could refer to a metro core in the case of localized peering, a backbone core in relation to IXP peering, or a DC spine layer in the case of DC peering.
Device Interconnection
In order to maximize resiliency in the fabric, each PFL node is connected to each PFS. While the design shown includes three PFLs and two PFS nodes, there could be any number of PFL and PFS nodes, scaling horizontally to keep up with traffic and interface growth. PFL nodes are not connected to each other; the PFS nodes provide the capacity for any traffic between those nodes. The PFS nodes are also not interconnected to each other, as no end device should terminate on the PFS, only other routers.
Capacity Scaling
Capacity of the peering fabric is scaled horizontally. The uplink capacity from PFL to PFS will be determined by an appropriate oversubscription factor determined by the service provider's capacity planning exercises. The leaf/spine architecture of the fabric connects each PFL to each PFS with equal capacity. In steady-state operation traffic is balanced between the PFS and PFL in both directions, maximizing the total capacity. The entropy in peering traffic generally ensures equal distribution between either ECMP paths or bundle interface member links in the egress direction. More information can be found in the forwarding plane section of the document. An example deployment may have two NC55-36X100G-A-SE spine nodes and two NC55A1-24H leaf nodes. In a 100GE peer deployment scenario each leaf would support 14x100GE client connections and 5x100GE to each spine node. A 10GE deployment would support 72x10GE client ports and 3x100GE to each spine, at a 1.2:1 oversubscription ratio.
Peering Fabric Control Plane
PFL to Peer
The Peering Fabric Leaf is connected directly to peers via traditional EBGP. BFD may additionally be used for fault detection if agreed to by the peer. Each EBGP peer will utilize SR EPE to enable TE to the peer from elsewhere on the provider network.
PFL to PFS
PFL to Peering Fabric Spine uses widely deployed standard routing protocols. IS-IS is the prescribed IGP protocol within the peering fabric. Each PFS is configured with the same IS-IS L1 area. In the case where OSPF is being used as an IGP, the PFL nodes will reside in an OSPF NSSA area. The peering fabric IGP is SR-enabled, with the loopback of each PFL assigned a globally unique SR Node SID. Each PFL also has an IBGP session to each PFS to distribute its learned EBGP routes upstream and learn routes from elsewhere on the provider network.
If a provideris distributing routes from PFL to PFL or from another peering locationto local PFLs it is important to enable the BGP “best-path-external”feature to ensure the PFS has the routing information to acceleratere-convergence if it loses the more preferred path.Egress peer engineering will be enabled for EBGP peering connections, sothat each peer or peer interface connected to a PFL is directlyaddressable by its AdJ-Peer-SID from anywhere on the SP network.Adj-Peer-SID information is currently not carried in the IGP of thenetwork. If utilized it is recommended to distribute this informationusing BGP-LS to all controllers creating paths to the PFL EPEdestinations.Each PFS node will be configured with IBGP multipath so traffic is loadbalanced to PFL nodes and increase resiliency in the case of peerfailure. On reception of a BGP withdraw update for a multipath route,traffic loss is minimized as the existing valid route is stillprogrammed into the FIB.PFS to CoreThe PFS nodes will participate in the global Core control plane and actas the gateway between the peering fabric and the rest of the SPnetwork. In order to create a more scalable and programmatic fabric, itis prescribed to use Segment Routing across the core infrastructure.IS-IS is the preferred protocol for transmitting SR SID information fromthe peering fabric to the rest of the core network and beyond. Indeployments where it may be difficult to transition quickly to an all-SRinfrastructure, the PFS nodes will also support OSPF and RSVP-TE forinterconnection to the core. The PFS acts as an ABR or ASBR between thepeering fabric and the larger metro or backbone core network.SR Peer Traffic EngineeringSummarySR allows a provider to create engineered paths to egress peeringdestinations or egress traffic destinations within the SP network. Astack of globally addressable labels is created at the traffic entrypoint, requiring no additional protocol state at midpoints in thenetwork and preserving qualities of normal IGP routing such as ECMP ateach hop. The Peering Fabric proposes end-to-end visibility fromthe PFL nodes to the destinations and vice-versa. This will allow arange of TE capabilities targeting a peering location, peering exitnode, or as granular as a specific peering interface on a particularnode. The use of anycast SIDs within a group of PFS nodes increasesresiliency and load balancing capability.Nodal EPENode EPE directs traffic to a specific peering node within the fabric.The node is targeted using first the PFS cluster anycast IP along withthe specific PFL node SID.Peer Interface EPEThis example uses an Egress Peer Engineering peer-adj-SID value assignedto a single peer interface. The result is traffic sent along this SRpath will use only the prescribed interface for egress traffic.Abstract PeeringAbstract peering allows a provider to simply address a Peering Fabric bythe anycast SIDs of its cluster of PFS nodes. In this case PHP is usedfor the anycast SIDs and traffic is simply forwarded as IP to the finaldestination across the fabric.SR-TE On-Demand Next-Hop for PeeringSR-TE On-Demand Next-Hop is a method to dynamically create specific constraint-based tunnels across an SP network to/from edge peering nodes. 
ODN utilizes Cisco's Segment Routing Path Computation Element (SR-PCE) to compute paths on demand based on the BGP next-hop and associated "color" communities. When a node receives a route with a specific community, it builds an SR-TE Policy to the BGP next-hop based on policy. One provider example is the case where I have DIA (Direct Internet Access) customers with different levels of service. I can create a specific SLA for "Gold" customers so their traffic takes a lower latency path across the network. In B2B peering arrangements, I can ensure voice or video traffic I am ingesting from a partner network takes priority. I can do this without creating a number of static tunnels on the network.
ODN Configuration
ODN requires a few components to be configured. In this example we tag routes coming from a specific provider with the color "BLUE" with a numerical value of 100. In IOS-XR we first define an extended community set defining our color with a unique string identifier of BLUE. This configuration should be found on both the ingress and egress nodes of the SR Policy.
extcommunity-set opaque BLUE 100 end-set
The next step is to define an inbound routing policy on the PFL nodes tagging all inbound routes from PEER1 with the BLUE extended community.
route-policy PEER1-IN set community (65000:100) set local-preference 100 set extcommunity color BLUE pass end-policy
In order for the head-end node to process the color community and create an SR Policy with constraints, the color must be configured under SR Traffic Engineering. The following configuration defines a color value of 100, the same as our extended community BLUE, and instructs the router how to handle creating the SR-TE Policy to the BGP next-hop address of the prefix received with the community. In this instance it instructs the router to utilize an external PCE, SR-PCE, to compute the path and to use the lowest IGP metric path to reach the destination. Other options available are TE metric, latency, hop count, and others covered in the SR Traffic Engineering documentation found on cisco.com.
segment-routing traffic-eng on-demand color 100 dynamic pcep ! metric type igp
The head-end router will only create a single SR-TE Policy to the next-hop address; other prefixes matching the same next-hop and constraints will utilize the pre-existing tunnel. The tunnels are ephemeral, meaning they will not persist across router reboots.
SR-TE Per-Flow Traffic Steering
In the 4.0+ version of the Peering Fabric Design, starting with XR 7.2.2, Per-Flow Traffic Steering is supported. Per-Flow Traffic Steering extends the current Per-Destination Traffic Steering on the SR-TE head-end node. As a review, per-destination traffic steering matches the color and BGP next-hop of an incoming prefix and either matches it to a pre-configured SR-TE Policy with the same (color, endpoint), or creates one dynamically through the use of ODN. Per-Flow uses the same methodology as Per-Destination but includes another element, the Forwarding Class, when deciding which SR-TE Policy to forward traffic onto. The Forwarding Class is set by matching ingress traffic header criteria using the IOS-XR QoS framework. A new SR-TE Policy type, the Per-Flow Policy, is used as a parent policy, with each child policy corresponding to a specific Forwarding Class. When an incoming BGP prefix matches the color of the Per-Flow Policy, ingress traffic will utilize the Forward Class and child SR-TE Policies for forwarding.
A child policy can be any SR-TE Per-Destination Policy with the same endpoint as the Per-Flow Policy.
Per-Flow Segment Routing Configuration (NCS Platforms)
The following configuration is required on the NCS 5500 / 5700 platforms to allocate the PFP Binding SID (BSID) from a specific label block.
mpls label blocks block name sample-pfp-bsid-block type pfp start 40000 end 41000 client any
Per-Flow QoS Configuration
The Forward Class must be set in the ingress QoS policy so traffic is steered into the correct child Per-Destination Policy.
policy-map per-flow-steering class MatchIPP1 set forward-class 1 ! class MatchIPP2 set forward-class 2 ! class MatchIPv4_SRC set forward-class 3 ! class MatchIPv6_SRC set forward-class 4 end-policy-map
!
class-map match-any MatchIPP1 match precedence 1 end-class-map
!
class-map match-any MatchIPP2 match precedence 2 end-class-map
!
class-map match-any MatchIPv4_SRC match access-group ipv4 ipv4_sources end-class-map
!
class-map match-any MatchIPv6_SRC match access-group ipv6 ipv6_sources end-class-map
ipv4 access-list ipv4_sources 10 permit ipv4 100.0.0.0/24 any 20 permit ipv4 100.0.1.0/24 any
!
ipv6 access-list ipv6_sources 10 permit ipv6 2001:100::/64 any 20 permit ipv6 2001:200::/64 any
Per-Flow Policy Configuration
This example shows both the child Per-Destination Policies as well as the parent Per-Flow Policy. Each Forward Class is mapped to the color of the child policy. The default Forward Class is meant to catch traffic not matching a configured Forward Class.
segment-routing traffic-eng policy PERFLOW color 100 endpoint 1.1.1.4 candidate-paths preference 100 per-flow forward-class 0 color 10 forward-class 1 color 20 forward-class 2 color 30 forward-class 3 color 40 forward-class 4 color 50 forward-class default 0 ! policy pe1_fc0 color 10 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFL4-PE1-FC1 ! policy pe1_fc1 color 20 end-point ipv4 192.168.11.1 candidate-paths preference 150 dynamic ! policy pe1_fc2 color 30 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFL4-PE1-FC2 ! policy pe1_fc3 color 40 end-point ipv4 192.168.11.1 candidate-paths preference 150 dynamic
On-Demand Next-Hop Per-Flow Configuration
The creation of the SR-TE Policies can be fully automated using ODN. ODN is used to create the child Per-Destination Policies as well as the Per-Flow Policy.
segment-routing traffic-eng on-demand color 10 dynamic metric type igp ! ! ! on-demand color 20 dynamic sid-algorithm 128 ! ! on-demand color 30 dynamic metric type te ! ! on-demand color 40 dynamic metric type igp ! ! on-demand color 50 dynamic metric type latency ! ! on-demand color 100 per-flow forward-class 0 color 10 forward-class 1 color 20 forward-class 2 color 30 forward-class 3 color 40 forward-class 4 color 50
SR-TE Egress Peer Engineering
SR-TE EPE is used to steer traffic out a specific egress interface, set of interfaces, or set of neighbors. This path or set of paths will override the router's normal best path selection process.
EPE SID Types
The following EPE SID types were created to address different network use cases. The figure below highlights a set of specific links and nodes as an example of when each SID type is created. IOS-XR allows users to use either dynamic or explicit persistent definitions for PeerAdj SIDs corresponding to a logical interface. PeerNode and PeerSet SIDs are always defined explicitly by the persistent configuration.
EPE SID Name Purpose PeerAdj Used to steer traffic out a specific adjacent single interface PeerNode Used to steer traffic out multiple interfaces to the same EBGP peer (typically requires the use of eBGP Multi-Hop) PeerSet Used to steer traffic out a set of interfaces or nodes by grouping PeerAdj and PeerNode SIDs into a set addressable by the PeerSet SID EPE PeerSet Use CaseIn the following example we would like to balance traffic to 10.0.0.0/24 across three egress interface to three different ASNs. In typical networks, the BGP best path selection algorithm will select the best egress path based on its selection criteria. EBGP Multipath may select all three paths if the proper criteria is met, but it can not be guaranteed. EPE will override this process and balance traffic out the three interfaces despite the BGP path selection process.IOS-XR ConfigurationThe following configuration is used for the above example. Peer-set 1 groups the three external neighbors into a single peer-set, allowing traffic to balanced across all three neighbors by referencing the PeerSet SID 15001. In addition a second PeerSet 2 is used to balancetraffic across two specific logical interfaces.segment-routing   local-block 15000 15999router bgp 10  address-family ipv4 unicast   peer-set-id 1      peer-set-sid index 1  peer-set-id 2      peer-set-sid index 2   adjacencies 10.10.10.2    adjacency-sid index 500    peer-set 2 30.10.10.2 adjacency-sid index 501     peer-set 2    neighbor 10.10.10.2     remote-as 1001     egress-engineering     peer-node-sid index 600     peer-set 1   neighbor 20.10.10.2      remote-as 1002      egress-engineering      peer-node-sid index 700      peer-set 1neighbor 30.10.10.2      remote-as 1003       egress-engineering      peer-node-sid index 800      peer-set 1IXP Fabric Low Level DesignSegment Routing UnderlayThe underlay network used in the IXP Fabric design is the same as utilized with the regular Peering Fabric design. The validated IGP used for all iterations of the IXP Fabric is IS-IS, with all elements of the fabric belonging to the same Level 2 IS-IS domain.EVPN L2VPN ServicesComprehensive configuration for EVPN L2VPN services are outside the scope of this document, please consult the Converged SDN Transport design guide or associated Cisco documentation for low level details on configuring EVPN VPWS and EVPN ELAN services. The Converged SDN Transport design guide can be found at the following URL# https#//xrdocs.io/design/blogs/latest-converged-sdn-transport-hldPeering Fabric TelemetryOnce a peering fabric is deployed, it is extremely important to monitorthe health of the fabric as well as harness the wealth of data providedby the enhanced telemetry on the NCS5500 platform and IOS-XR. Throughstreaming data mechanisms such as Model-Driven Telemetry, BMP, andNetflow, providers can extract data useful for operations, capacityplanning, security, and many other use cases. In the diagram below, thetelemetry collection hosts could be a single system or distributedsystems used for collection. The distributed design of the peeringfabric enhances the ability to collect telemetry data from the fabric bydistributing resources across the fabric. Each PFL or PFS contains amodern multi-core CPU and at least 32GB of RAM (64GB in NC55A1-36H-SE)to support not only built in telemetry operation but also 3rdparty applications a service or content provider may want to deploy tothe node for additional telemetry. 
Examples of 3rd party telemetry applications include those storing temporary data for root-cause analysis if a node is isolated from the rest of the network, or performance measurement applications. The peering fabric also fully supports traditional collection methods such as SNMP, and NETCONF using YANG models, to integrate with legacy systems.
Telemetry Diagram
Model-Driven Telemetry
MDT uses standards-based or native IOS-XR YANG data models to stream operational state data from deployed devices. The ability to push statistics and state data from the device adds capabilities and efficiency not found using traditional SNMP. Sensors and collection hosts can be configured statically on the host (dial-out), or the set of sensors, collection hosts, and their attributes can be managed off-box using OpenConfig or native IOS-XR YANG models. Pipeline is Cisco's open source collector, which can take MDT data as an input and output it via a plugin architecture supporting scalable message buses such as Kafka, or directly to a TSDB such as InfluxDB or Prometheus. The appendix contains information about MDT YANG paths relevant to the peering fabric and their applicability to PFS and PFL nodes.
BGP Monitoring Protocol
BMP, defined in RFC7854, is a protocol to monitor BGP RIB information, updates, and protocol statistics. BMP was created to alleviate the burden of collecting BGP routing information using inefficient mechanisms like screen scraping. BMP has two primary modes, Route Monitoring mode and Route Mirroring mode. The monitoring mode will initially transmit the adj-rib-in contents per-peer to a monitoring station, and continue to send updates as they occur on the monitored device. Setting the L flag in the RM header to 1 conveys this is a post-policy route, while 0 indicates pre-policy. The mirroring mode simply reflects all received BGP messages to the monitoring host. IOS-XR supports sending pre- and post-policy routing information and updates to a station via the Route Monitoring mode. BMP can additionally send information on peer state change events, including why a peer went down in the case of a BGP event. There are drafts in the IETF process led by Cisco to extend BMP to report additional routing data, such as the loc-RIB and per-peer adj-RIB-out. Local-RIB is the full device RIB including received BGP routes, routes from other protocols, and locally originated routes. Adj-RIB-out will add the ability to monitor routes advertised to peers pre and post routing policy.
Netflow / IPFIX
Netflow was invented by Cisco due to requirements for traffic visibility and accounting. Netflow in its simplest form exports 5-tuple data for each flow traversing a Netflow-enabled interface. Netflow data is further enhanced with the inclusion of BGP information in the exported Netflow data, namely AS_PATH and destination prefix. This inclusion makes it possible to see where traffic originated by ASN and derive the destination for the traffic per BGP prefix. The latest iteration of Cisco Netflow is Netflow v9, with the next-generation IETF standardized version called IPFIX (IP Flow Information Export). IPFIX has expanded on Netflow's capabilities by introducing hundreds of entities. Netflow is traditionally partially processed telemetry data. The device itself keeps a running cache table of flow entries and counters associated with packets, bytes, and flow duration. At certain time intervals, or when triggered by an event, the flow entries are exported to a collector for further processing.
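As a complement to the BMP discussion above, the monitoring station itself is defined globally on the router and then referenced per neighbor with bmp-activate (shown later in the PFL BGP configuration). A minimal sketch, where the collector address, port, and timers are placeholder values to adjust for your environment:
bmp server 1
 host 192.0.2.50 port 5000 ;placeholder collector address and port
 description BMP collector for the peering fabric
 update-source Loopback0
 initial-delay 30
 stats-reporting-period 60
!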
The type 315 extension to IPFIX, supported on the NCS5500, does not process flow data on the device, but sends the raw sampled packet header to an external collector for all processing. Due to the high bandwidth, PPS rate, and large number of simultaneous flows on Internet routers, Netflow samples packets at a pre-configured rate for processing. Typical sampling values on peering routers are 1 in 8192 packets; however, customers implementing Netflow or IPFIX should work with Cisco to fine-tune parameters for optimal data fidelity and performance.
Automation and Programmability
Crosswork Cloud Network Insights
Crosswork Cloud Network Insights is a cloud-based service providing BGP network analytics for provider networks. CCNI uses an extensive set of worldwide BGP probes to continuously collect routing updates and check for prefix routing anomalies. CCNI also provides advanced BGP looking glass capabilities from the perspective of its worldwide visibility endpoints. Integration with the worldwide RPKI ROA infrastructure also gives users an instant way to view ROA validity for your own prefixes as well as other prefixes on the Internet. The PeerMon feature of CCNI also allows monitoring for prefixes coming into your network; any advertisement changes are historically logged and can be alerted upon. The tools provided detect BGP prefix hijack scenarios quickly so providers can remediate them as quickly as possible. Being a cloud-based application, there is no on-premise software or server resources to manage, and continuous updates are added to CCNI without disruption. More information on CCNI can be found at https://crosswork.cisco.com/
Looking Glass and AS Path Trace
The figure below shows the ability to graphically trace the AS path of a prefix from its origin through a specified ASN, in this case 3356. The visibility of the prefix at CCNI's end probes on the left shows the prefix is correctly being propagated through 3356.
AS Path and Prefix Alarm Capabilities
CCNI can alarm and send notifications on a wide variety of prefix anomaly behavior. The following anomalies can trigger alarms via policy:
AS Path Length Violation
New AS Path Edge
Parent Aggregate Change
ROA Expiry
ROA Failure
ROA Not Found
Subprefix Advertisement
Upstream AS Change
Valid AS Path Violation
Unexpected AS Prefix
Crosswork Cloud Traffic Analysis
Crosswork Cloud Traffic Analysis (CCTA) collects network traffic data and provides both historical statistics as well as advanced traffic analysis applications so providers can better understand their traffic patterns and make intelligent changes to optimize their networks. CCTA uses a lightweight on-premise collector (Crosswork Data Gateway) to collect SNMP, Netflow, and BGP information from the network routers and a secure tether to the Cisco cloud where data is ingested, processed, and analyzed. The on-premise component can also replicate Netflow data to other Netflow tools, eliminating the need to export flows to multiple destinations from the routers. The flexible tagging architecture also makes grouping sets of prefixes or devices very easy and allows users to view aggregate data across those tagged elements. CCTA allows users to drill down into per-prefix data and provides advanced applications such as Peer Prospecting and Traffic Balancing recommendations. The traffic comparison application allows one to quickly see the balance of traffic on a specific prefix or set of prefixes between different routers.
The example below shows a traffic imbalance for a set of prefixes on two separate edge peering routers. More information on CCTA can be found at https://crosswork.cisco.com/
Crosswork Cloud Trust Insights
Trust Insights is a cloud-based application allowing providers to easily monitor and report on the security of their network devices. Trust Insights utilizes the same on-premise Data Gateway to securely send network data to Cisco's secure cloud for additional analysis and reporting. Just some of the capabilities of Trust Insights are highlighted below.
Visualize Trust
Report on unique trust data from Cisco IOS XR devices
Verify HW/SW running on production systems with cryptographic proof
Review security capabilities for IOS XR routing devices
Track & Verify Inventory
Streamline tracking and traceability of hardware, software, and patches
Prove remediation of SW/HW issues for compliance & audit
Simplify forensics with extensive history of inventory changes
The figure below shows tracking of both inventory and SW/HW changes over the specified timeline.
Utilize Trusted Data for Automation
Use securely collected evidence of hardware and software inventory
Tie inventory change to trigger closed-loop automation workflow
Enable integration and data access via standards-based API
Cisco NSO Modules
Cisco Network Services Orchestrator is a widely deployed network automation and orchestration platform, performing intent-driven configuration and validation of networks from a single source of truth configuration database. The Peering design includes Cisco NSO modules to perform specific peering tasks such as peer turn-up, peer modification, and deploying routing policy and ACLs to multiple nodes, providing a jumpstart to peering automation. The following table highlights the currently available Peering NSO services. The current peering service models use the IOS-XR CLI NED and are validated with NSO 4.5.5.
Service: Description
peering-service: Manage full BGP and Interface Configuration for EBGP Peers
peering-acl: Manage infrastructure ACLs referenced by the peering service
prefix-set: Manage IOS-XR prefix-sets
as-path-set: Manage IOS-XR as-path sets
route-policy: Manage XR routing policies for deployment to multiple peering nodes
peering-common: A set of services to manage as-path sets, community sets, and static routing policies
drain-service: Service to automate draining traffic away from a node under maintenance
telemetry: Service to enable telemetry sensors and export to collector
bmp: Service to enable BMP on configured peers and export to monitoring station
netflow: Service to enable Netflow on configured peer interfaces and export to collector
PFL-to-PFS-Routing: Configures IGP and BGP routing between PFL and PFS nodes
PFS-Global-BGP: Configures global BGP parameters for PFS nodes
PFS-Global-ISIS: Configures global IS-IS parameters for PFS nodes
Netconf
Netconf is an industry standard method for configuring network devices. Standardized in RFC 6241, Netconf has standard Remote Procedure Calls (RPCs) to manipulate configuration data and retrieve state data. Netconf on IOS-XR supports the candidate datastore, meaning configuration must be explicitly committed for application to the running configuration.
YANG Model Support
While Netconf created standard RPCs for managing configuration on a device, it did not define a language for expressing configuration. The configuration syntax communicated by Netconf followed the typical CLI configuration: proprietary for each network vendor and XML-formatted without following any common semantics. YANG, or Yet Another Next Generation, is a modeling language to express configuration using standard elements such as containers, groups, lists, and endpoint data called leafs. YANG 1.0 was defined in RFC 6020 and updated to version 1.1 in RFC 7950. Vendors cover the majority of device configuration and state using native YANG models unique to each vendor, but the industry is headed towards standardized models where applicable. Groups such as OpenConfig and the IETF are developing standardized YANG models allowing operators to write a configuration once across all vendors. Cisco has implemented a number of standard OpenConfig network models relevant to peering, including the BGP protocol, BGP RIB, and Interfaces models. The appendix contains information about YANG paths relevant to configuring the peering fabric and their applicability to PFS and PFL nodes.
3rd Party Hosted Applications
IOS-XR starting in 6.0 runs on an x86 64-bit Linux foundation. The move to an open and well-supported operating system, with XR components running on top of it, allows network providers to run 3rd party applications directly on the router. There are a wide variety of applications which can run on the XR host, with fast path interfaces in and out of the application. Example applications are telemetry collection, custom network probes, or tools to manage other portions of the network within a location.
XR Service Layer API
The XR Service Layer API is a gRPC-based API to extract data from a device as well as provide a very fast programmatic path into the router's runtime state. One use case of SL API in the peering fabric is to directly program FIB entries on a device, overriding the default path selection. Using telemetry extracted from a peering fabric, an external controller can use the data and additional external constraints to programmatically direct traffic across the fabric.
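As a hedged sketch, using the Service Layer API generally requires the router's gRPC server and the service-layer feature to be enabled first; the port number below is an arbitrary placeholder:
grpc
 port 57400 ;placeholder gRPC listening port
 service-layer ;enable the Service Layer API over gRPC
!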
SL API alsosupports transmission of event data via subscriptions.Recommended Device and Protocol ConfigurationOverviewThe following configuration guidelines will step through the majorcomponents of the device and protocol configuration specific to thepeering fabric and highlight non-default configuration recommended foreach device role and the reasons behind those choices. Complete exampleconfigurations for each role can be found in the Appendix of thisdocument. Configuration specific to telemetry is covered in section 4.Common Node ConfigurationThe following configuration is common to both PFL and PFS NCS5500 seriesnodes.Enable LLDP GloballylldpPFS NodesAs the PFS nodes will integrate into the core control-plane, onlyrecommended configuration for connectivity to the PFL nodes is given.IGP Configurationrouter isis pf-internal-core set-overload-bit on-startup wait-for-bgp is-type level-1-2 net <L2 NET> net <L1 PF NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10! interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10Segment Routing Traffic EngineeringIn IOS-XR there are two mechanisms for configuring SR-TE. Prior to IOS-XR 6.3.2 SR-TE was configured using the MPLS traffic engineering tunnel interface configuration. Starting in 6.3.2 SR-TE can now be configured using the more flexible SR-TE Policy model. The following examples show how to define a static SR-TE path from PFS node to exit PE node using both the legacy tunnel configuration model as well as the new SR Policy model.Paths to PE exit node being load balanced across two static P routers using legacy tunnel configexplicit-path name PFS1-P1-PE1-1 index 1 next-address 192.168.12.1 index 2 next-address 192.168.11.1!explicit-path name PFS1-P2-PE1-1 index 1 next-label 16221 index 2 next-label 16511!interface tunnel-te1 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.1 path-option 1 explicit name PFS1-P1-PE1-1 segment-routing!interface tunnel-te2 bandwidth 1000 ipv4 unnumbered Loopback0 destination 192.168.11.2 path-option 1 explicit name PFS1-P2-PE1-1 segment-routingIOS-XR 6.3.2+ SR Policy Configurationsegment-routingtraffic-eng segment-list PFS1-P1-PE1-SR-1 index 1 mpls label 16211 index 2 mpls label 16511 ! segment-list PFS1-P2-PE1-SR-1 index 1 mpls label 16221 index 2 mpls label 16511 ! policy pfs1_pe1_via_p1 binding-sid mpls 900001 color 1 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! ! policy pfs1_pe1_via_p2 binding-sid mpls 900002 color 2 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFS1-P1-PE1-SR-1 weight 1 ! ! ! 
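Where paths are computed by SR-PCE rather than defined statically (as with the ODN examples earlier), the PFS nodes also need a PCEP client (PCC) configuration pointing at the SR-PCE. A minimal sketch, with both addresses as placeholder assumptions:
segment-routing
 traffic-eng
  pcc
   source-address ipv4 192.0.2.1 ;placeholder local PCEP source address
   pce address ipv4 192.0.2.100 ;placeholder SR-PCE address
    precedence 10
   !
  !
 !
!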
!BGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATH is longer address-family ipv4 unicast additional-paths receive maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths receive bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF Model-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1PFL NodesPeer QoS PolicyPolicy applied to edge of the network to rewrite any incoming DSCP valueto 0.policy-map peer-qos-in class class-default set dscp default ! 
end-policy-map!Peer Infrastructure ACLSee the Security section of the document for recommended best practicesfor ingress and egress infrastructure ACLs.access-group v4-infra-acl-in access-group v6-infra-acl-in access-group v4-infra-acl-out access-group v6-infra-acl-out Peer Interface Configurationinterface TenGigE0/0/0/0 description “external peer” service-policy input peer-qos-in ;Explicit policy to rewrite DSCP to 0 lldp transmit disable #Do not run LLDP on peer connected interfaces lldp receive disable #Do not run LLDP on peer connected interfaces ipv4 access-group v4-infra-acl-in #IPv4 Ingress infrastructure ACL ipv4 access-group v4-infra-acl-out #IPv4 Egress infrastructure ACL, BCP38 filtering ipv6 access-group v6-infra-acl-in #IPv6 Ingress infrastructure ACL ipv6 access-group v6-infra-acl-out #IPv6 Egress infrastructure ACL, BCP38 filtering IS-IS IGP Configurationrouter isis pf-internal set-overload-bit on-startup wait-for-bgp is-type level-1 net <L1 Area NET> log adjacency changes log pdu drops lsp-refresh-interval 65000 ;Maximum refresh interval to reduce IS-IS protocol traffic max-lsp-lifetime 65535 ;Maximum LSP lifetime to reduce IS-IS protocol traffic lsp-password hmac-md5 <password> ;Set LSP password, enhance security address-family ipv4 unicast metric-style wide segment-routing mpls ;Enable segment-routing for IS-IS maximum-paths 32 ;Set ECMP path limit address-family ipv6 unicast metric-style wide maximum-paths 32 !interface Loopback0 passive address-family ipv4 unicast metric 10 prefix-sid index <globally unique index> address-family ipv6 unicast metric 10 ! interface HundredGigE0/0/0 point-to-point circuit-type level-1 hello-password hmac-md5 <password> bfd minimum-interval 100 bfd multiplier 3 bfd fast-detect ipv4 bfd fast-detect ipv6 address-family ipv4 unicast metric 10 fast-reroute per-prefix ti-lfa ;Enable topology-independent loop-free-alternates on a per-prefix basis address-family ipv6 unicast metric 10BGP Add-Path Route Policyroute-policy advertise-all ;Create policy for add-path advertisements set path-selection all advertiseend-policyBGP Global Configurationbgp router-id <Lo0 IP> bgp bestpath aigp ignore ;Ignore AIGP community when sent by peer bgp bestpath med always ;Compare MED values even when AS_PATH doesn’t match bgp bestpath as-path multipath-relax ;Use multipath even if AS_PATh is longer address-family ipv4 unicast bgp attribute-download ;Enable BGP information for Netflow/IPFIX export additional-paths send additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv4 NLRI to PFS maximum-paths ibgp 32 ;set maximum retained IBGP paths to 32 maximum-paths ebgp 32 ;set maximum retained EBGP paths to 32 !address-family ipv6 unicast additional-paths send additional-paths receive additional-paths selection route-policy advertise-all ;Advertise all equal-cost IPv6 NLRI to PFS bgp attribute-download maximum-paths ibgp 32 maximum-paths ebgp 32!address-family link-state link-state ;Enable BGP-LS AF EBGP Peer Configurationsession-group peer-session ignore-connected-check #Allow loopback peering over ECMP w/o EBGP Multihop egress-engineering #Allocate adj-peer-SID ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1af-group v4-af-peer address-family ipv4 unicast soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 1000 80;Set maximum inbound prefixes, warning at 80% 
thresholdaf-group v6-af-peer soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor maximum-prefix 100 80 #Set maximum inbound prefixes, warning at 80% thresholdneighbor-group v4-peer use session-group peer-session dmz-link-bandwidth ;Propagate external link BW address-family ipv4 unicast af-group v4-af-peerneighbor-group v6-peer use session-group peer-session dmz-link-bandwidth address-family ipv6 unicast af-group v6-af-peer neighbor 1.1.1.1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v4-peer address-family ipv4 unicast route-policy v4-peer-in(12345) in route-policy v4-peer-out(12345) out neighbor 2001#dead#b33f#0#1#1#1#1 description ~ext-peer;12345~ remote-as 12345 use neighbor-group v6-peer address-family ipv6 unicast route-policy v6-peer-in(12345) in route-policy v6-peer-out(12345) out PFL to PFS IBGP Configurationsession-group pfs-session ttl-security #Enable gTTL security if neighbor supports it bmp-activate server 1 #Optional send BMP data to receiver 1 update-source Loopback0 #Set BGP session source address to Loopback0 address af-group v4-af-pfs address-family ipv4 unicast next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v4-pfs-in in route-policy v4-pfs-out out af-group v6-af-pfs next-hop-self #Set next-hop to Loopback0 address soft-reconfiguration inbound always #Store inbound routes for operational purposes multipath #Store multiple paths if using ECMP to neighbor route-policy v6-pfs-in in route-policy v6-pfs-out out neighbor-group v4-pfs ! use session-group pfs-session address-family ipv4 unicast af-group v4-af-pfsneighbor-group v6-pfs ! use session-group pfs-session address-family ipv6 unicast af-group v6-af-pfs neighbor <PFS IP> description ~PFS #1~ remote-as <local ASN> use neighbor-group v4-pfsNetflow/IPFIX Configurationflow exporter-map nf-export version v9 options interface-table timeout 60 options sampler-table timeout 60 template timeout 30 ! transport udp <port> source Loopback0 destination <dest>flow monitor-map flow-monitor-ipv4 record ipv4 option bgpattr exporter nf-export cache entries 50000 cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-ipv6 record ipv6 option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10!flow monitor-map flow-monitor-mpls record mpls ipv4-ipv6-fields option bgpattr exporter nf-export cache timeout active 60 cache timeout inactive 10 sampler-map nf-sample-8192 random 1 out-of 8192Peer Interfaceinterface Bundle-Ether100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressPFS Upstream Interfaceinterface HundredGigE0/0/0/100 flow ipv4 monitor flow-monitor-ipv4 sampler nf-sample-8192 ingress flow ipv6 monitor flow-monitor-ipv6 sampler nf-sample-8192 ingress flow mpls monitor flow-monitor-mpls sampler nf-sample-8192 ingressModel-Driven Telemetry ConfigurationThe configuration below creates two sensor groups, one for BGP data andone for Interface counters. Each is added to a separate subscription,with the BGP data sent every 60 seconds and the interface data sentevery 30 seconds. A single destination is used, however multipledestinations could be configured. 
The sensors and timers provided arefor illustration only.telemetry model-driven destination-group mdt-dest-1 vrf default address-family ipv4 <dest IP> <dest-port> encoding <gpb | self-describing-gbp> protocol <tcp | grpc> ! ! sensor-group peering-pfl-bgp sensor-path openconfig-bgp#bgp/neighbors ! sensor-group peering-pfl-interface sensor-path openconfig-platform#components sensor-path openconfig-interfaces#interfaces sensor-path Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface sensor-path Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info sensor-path Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters ! subscription peering-pfl-sub-bgp sensor-group-id peering-pfl-bgp sample-interval 60000 destination-id mdt-dest-1 ! subscription peering-pfl-sub-interface sensor-group-id peering-pfl-interface sample-interval 30000 destination-id mdt-dest-1Abstract Peering ConfigurationAbstract peering uses qualities of Segment Routing anycast addresses toallow a provider to steer traffic to a specific peering fabric by simplyaddressing a node SID assigned to all PFS members of the peeringcluster. All of the qualities of SR such as midpoint ECMP and TI-LFAfast protection are preserved for the end to end BGP path, improvingconvergence across the network to the peering fabric. Additionally,through the use of SR-TE Policy, source routed engineered paths can beconfigured to the peering fabric based on business logic and additionalpath constraints.PFS ConfigurationOnly the PFS nodes require specific configuration to perform abstractpeering. Configuration shown is for example only with IS-IS configuredas the IGP carrying SR information. The routing policy setting thenext-hop to the AP anycast SID should be incorporated into standard IBGPoutbound routing policy.interface Loopback1 ipv4 address x.x.x.x/32 ipv6 address x#x#x#x##x/128 router isis <ID> passive address-family ipv4 unicast prefix-sid absolute <Global IPv4 AP Node SID> address-family ipv6 unicast prefix-sid absolute <Global IPv6 AP Node SID> route-policy v4-abstract-ibgp-out set next-hop <Loopback1 IPv4 address> route-policy v6-abstract-ibgp-out set next-hop <Loopback1 IPv6 address> router bgp <ASN> ibgp policy out enforce-modifications ;Enables a PFS node to set a next-hop address on routes reflected to IBGP peersrouter bgp <ASN> neighbor x.x.x.x address-family ipv4 unicast route-policy v4-abstract-ibgp-out neighbor x#x#x#x##x address-family ipv6 unicast route-policy v6-abstract-ibgp-out BGP Flowspec Configuration and OperationBGP Flowspec consists of two different node types. The BGP Flowspec Server is where Flowspec policy is defined and sent to peers via BGP sessions with the BGP Flowspec IPv4 and IPv6 AFI/SAFI enabled. The BGP Flowspec Client receives Flowspec policy information and applies the proper dataplane match and action criteria via dynamic ACLs applied to each routerinterface. By default, IOS-XR applies the dynamic policy to all interfaces, with an interface-level configuration setting used to disable BGP Flowspec on specific interfaces.In the Peering Fabric, PFL nodes will act as Flowspec clients. The PFS nodes may act as Flowspec servers, but will never act as clients.Flowspec policies are typically defined on an external controller to be advertised to the rest of the network. The XRv-9000 virtual router works well in these instances. 
If one is using an external element to advertise Flowspec policies to the peering fabric, they should be advertised to the PFS nodes, which will reflect them to the PFL nodes. In the absence of an external policy injector, Flowspec policies can be defined on the Peering Fabric PFS nodes for advertisement to all PFL nodes. IPv6 Flowspec on the NCS5500 requires the use of the following global command, followed by a device reboot.
hw-module profile flowspec ipv6-enable
Enabling BGP Flowspec Address Families on PFS and PFL Nodes
Following the standard Peering Fabric BGP group definitions, the following new groups are augmented. The following configuration assumes the PFS node is the BGP Flowspec server.
PFS
router bgp <ASN> address-family ipv4 flowspec address-family ipv6 flowspec af-group v4-flowspec-af-pfl address-family ipv4 flowspec multipath route-reflector-client next-hop-self af-group v6-flowspec-af-pfl address-family ipv6 flowspec multipath route-reflector-client next-hop-self neighbor-group v4-pfl address-family ipv4 flowspec use af-group v4-flowspec-af-pfl neighbor-group v6-pfl address-family ipv6 flowspec use af-group v6-flowspec-af-pfl
PFL
router bgp <ASN> address-family ipv4 flowspec address-family ipv6 flowspec af-group v4-flowspec-af-pfs address-family ipv4 flowspec multipath af-group v6-flowspec-af-pfs address-family ipv6 flowspec multipath neighbor-group v4-pfs address-family ipv4 flowspec use af-group v4-flowspec-af-pfs neighbor-group v6-pfs address-family ipv6 flowspec use af-group v6-flowspec-af-pfs
BGP Flowspec Server Policy Definition
Policies are defined using the standard IOS-XR QoS configuration. The first example below matches the recent memcached DDoS attack and drops all traffic. Additional examples are given covering various packet matching criteria and actions.
class-map type traffic match-all memcached match destination-port 11211 match protocol udp tcp match destination-address ipv4 10.0.0.0 255.255.255.0 end-class-map
!
policy-map type pbr drop-memcached class type traffic memcached drop ! class type traffic class-default ! end-policy-map
class-map type traffic match-all icmp-echo-flood match protocol icmp match ipv4 icmp type 8 match destination-address ipv4 10.0.0.0 255.255.255.0 end-class-map
!
policy-map type pbr limit-icmp-echo class type traffic icmp-echo-flood police rate 100 kbps ! class type traffic class-default ! end-policy-map
class-map type traffic match-all dns match protocol udp match source-port 53 end-class-map
!
policy-map type pbr redirect-dns class type traffic dns police rate 100 kbps redirect nexthop 1.1.1.1 redirect nexthop route-target 1000:1 ! class type traffic class-default ! end-policy-map
BGP Flowspec Server Enablement
The following global configuration will enable the Flowspec server and advertise the policy via the BGP Flowspec NLRI.
flowspec address-family ipv4 service-policy type pbr drop-memcached
BGP Flowspec Client Configuration
The following global configuration enables the BGP Flowspec client function and installation of policies on all local interfaces. Flowspec can be disabled on individual interfaces using the [ipv4|ipv6] flowspec disable command in interface configuration mode.
flowspec address-family ipv4 local-install interface-all
QPPB Configuration and Operation
QoS Policy Propagation using BGP is described in more detail in the Security section. QPPB applies standard QoS policies to packets matching BGP prefix criteria such as BGP community or AS Path. QPPB is supported for both IPv4 and IPv6 address families and packets.
QPPB on the NCS5500 supports matching destination prefix attributes only. QPPB configuration starts with a standard RPL route policy that matches BGP attributes and sets a specific QoS group based on those criteria. This routing policy is applied to each address-family as a table-policy in the global BGP configuration. A standard MQC QoS policy is then defined using the specific QoS groups as match criteria to apply additional QoS behavior such as filtering, marking, or policing. This policy is applied to a logical interface, with a specific QPPB command used to enable the propagation of BGP data as part of the dataplane ACL packet match criteria. IPv6 QPPB on the NCS5500 requires the use of the following global command, followed by a device reboot.
hw-module profile qos ipv6 short
Routing Policy Configuration
route-policy qppb-test if community matches-every (1000:1) then set qos-group 1 endif if community matches-every (1000:2) then set qos-group 2 endif end-policy
Global BGP Configuration
router bgp <ASN> address-family ipv4 unicast table-policy qppb-test address-family ipv6 unicast table-policy qppb-test
QoS Policy Definition
class-map match-any qos-group-1 match qos-group 1 end-class-map class-map match-any qos-group-2 match qos-group 2 end-class-map policy-map remark-peer-traffic class qos-group-1 set precedence 5 set mpls experimental imposition 5 ! class qos-group-2 set precedence 3 set mpls experimental imposition 3 ! class class-default ! end-policy-map
Interface-Level Configuration
interface gigabitethernet0/0/0/1 service-policy input remark-peer-traffic ipv4 bgp policy propagation input qos-group destination ipv6 bgp policy propagation input qos-group destination
BGP Graceful Shutdown
BGP graceful shutdown is an IETF standard mechanism for notifying an IBGP or EBGP peer that the advertising router will be going offline. Graceful shutdown uses a well-known community, the GSHUT community (65535:0), on each prefix advertised to a peer so the peer can match the community and perform an action to move traffic gracefully away from the peer before it goes down. In the example in the peering design we will lower the local preference on the route.
Outbound graceful shutdown configuration
Graceful shutdown is part of the graceful maintenance configuration within BGP. Graceful maintenance can also perform an AS prepend operation when activated. Sending the GSHUT community is enabled using the send-community-gshut-ebgp command under each address family. Graceful maintenance is enabled using the "activate" keyword in the configuration for the neighbor, neighbor-group, or globally for the BGP process.
neighbor 1.1.1.1 graceful-maintenance as-prepends 3 address-family ipv4 unicast send-community-gshut-ebgp ! address-family ipv6 unicast send-community-gshut-ebgp
Inbound graceful shutdown configuration
Inbound prefixes tagged with the GSHUT community should be processed with a local-preference of 0 applied, so if there is another path for traffic it can be utilized prior to the peer going down. The following is a simple example of a community-set and routing policy to perform this. This could also be added to an existing peer routing policy.
community-set graceful-shutdown 65535:0 end-set ! route-policy gshut-inbound if community matches-any graceful-shutdown then set local-preference 0 endif end-policy
Activating graceful shutdown
Graceful maintenance can be activated globally or for a specific neighbor/neighbor-group. To enable graceful shutdown use the activate keyword under the "graceful-maintenance" configuration context.
Without the "all-neighbors" flag, maintenance will only be enabled for peers with their own graceful-maintenance configuration. The activate command is persistent.
Global
router bgp 100 graceful-maintenance activate [ all-neighbors ]
Individual neighbor
router bgp 100 neighbor 1.1.1.1 graceful-maintenance activate
Peers in specific neighbor-group
neighbor-group peer-group graceful-maintenance activate
Security
Peering by definition is at the edge of the network, where security is mandatory. While not exclusive to peering, there are a number of best practices and software features that, when implemented, will protect your own network as well as others from malicious sources within your network.
Peering and Internet in a VRF
Using VRFs to isolate peers and the Internet routing table from the infrastructure can enhance security by keeping internal infrastructure components separate from Internet and end user reachability. VRF separation can be done in one of three different ways:
Separate each peer into its own VRF, use the default VRF on the SP network
Single VRF for all "Internet" endpoints, including peers
Separate each peer into its own VRF, and use a separate "Internet" VRF
VRF per Peer, default VRF for Internet
In this method each peer, or group of peers, is configured under a separate VRF. The SP carries these and all other routes via the default VRF in IOS-XR, commonly known as the Global Routing Table. The VPNv4 and VPNv6 address families are NOT configured on the BGP peering sessions between the PFL and PFS nodes and the PFS nodes and the rest of the network. IOS-XR provides the commands import from default-vrf and export to default-vrf with a route-policy to match specific routes to be imported to/from each peer VRF to the default VRF. This provides dataplane isolation between peers and another mechanism to determine which SP routes are advertised to each peer.
Internet in a VRF Only
In this method all Internet endpoints are configured in the same "Internet" VRF. The security benefit is removing dataplane connectivity between the global Internet and your underlying infrastructure, which is using the default VRF for all internal connectivity. This method uses the VPNv4/VPNv6 address families on all BGP peers and requires the Internet VRF to be configured on all peering fabric nodes as well as SP PEs participating in the global routing table. If there are VPN customers or public-facing services in their own VRF needing Internet access, routes can be imported/exported from the Internet VRF on the PE devices they attach to.
VRF per Peer, Internet in a VRF
This method combines the properties and configuration of the previous two methods for a solution with dataplane isolation per peer and separation of all public Internet traffic from the SP infrastructure layer. The exchange of routes between the peer VRFs and Internet VRF takes place on the PFL nodes, with the rest of the network operating the same as the Internet in a VRF use case. The VPNv4 and VPNv6 address families must be configured across all routers in the network.
Infrastructure ACLs
Infrastructure ACLs and their associated ACEs (Access Control Entries) are the perimeter protection for a network. The recommended PFL device configuration uses IPv4 and IPv6 infrastructure ACLs on all edge interfaces. These ACLs are specific to each provider's security needs, but should include the following sections.
Filter IPv4 and IPv6 BOGON space ingress and egress.
Drop ingress packets with a source address matching your own aggregate IPv4/IPv6 prefixes.
Rate-limit ingress traffic to Unix services typically used in DDoS attacks, such as chargen (TCP/19).
On ingress and egress, allow specific ICMP types, rate-limit them to appropriate values, and filter out ones not needed on your network. ICMP ttl-exceeded, host unreachable, port unreachable, echo-reply, echo-request, and fragmentation needed should always be allowed in some capacity.
BCP Implementation
Best Current Practices are informational documents published by the IETF to give guidelines on operational practices. This document will not outline the contents of the recommended BCPs, but two in particular are of interest to Internet peering. BCP38 explains the need to filter unused address space at the edges of the network, minimizing the chances of spoofed traffic from DDoS sources reaching their intended target. BCP38 is applicable for ingress traffic and especially egress traffic, as it stops spoofed traffic before it reaches outside your network. BCP194, BGP Operations and Security, covers a number of BGP operational practices, many of which are used in Internet peering. IOS-XR supports all of the mechanisms recommended in BCP38, BCP84, and BCP194, including software features such as TTL security (GTSM), BGP dampening, and prefix limits.
BGP Attribute and CoS Scrubbing
Scrubbing of data on ingress and egress of your network is an important security measure. Scrubbing falls into two categories: control-plane and dataplane. The control-plane for Internet peering is BGP, and there are a few BGP transitive attributes one should take care to normalize. Your internal BGP communities should be deleted from outbound BGP NLRI via egress policy. Most often you are setting communities on inbound prefixes; make sure you are replacing existing communities from the peer and not adding communities. Unless you have an agreement with the peer, normalize the MED attribute to zero or another standard value on all inbound prefixes. In the dataplane, it's important to treat the peering edge as untrusted and clear any CoS markings on inbound packets, assuming a prior agreement hasn't been reached with the peer to carry them across the network boundary. It's an overlooked aspect which could lead to peer traffic being prioritized on your network, leading to unexpected network behavior. An example PFL QoS policy is given resetting incoming IPv4/IPv6 DSCP values to 0.
BGP Control-Plane
Type 6 Encryption Configuration
Type 6 encryption provides stronger on-box storage of control-plane secrets than legacy methods which use relatively weak encryption methods. Type 6 encryption uses the onboard Trust Anchor Module, or TAM, to store the encrypted key outside of the device configuration, meaning simply having access to the config does not expose the keys used in control-plane protocol security.
Create key (exec mode, not config mode)
key config-key password-encryption (enter key) password6 encryption aes
Key chain configuration
At the "key-string" command simply enter the unencrypted string. If Type-6 encryption is enabled, the key will automatically use the "password6" encryption type.
key chain bgp_type6 key 1 accept-lifetime 01:00:00 october 24 2005 infinite key-string password6 634d695d4848565e5a5d49604741465566496568575046455a6265414142 send-lifetime 01:00:00 october 24 2005 infinite cryptographic-algorithm HMAC-MD5
TCP Authentication Option, MD5 Deprecation
TCP Authentication Option, commonly known as TCP-AO, is a modern way to authenticate TCP sessions. TCP-AO is defined in RFC 5925.
BGP Control-Plane
Type 6 Encryption Configuration
Type 6 encryption provides stronger on-box storage of control-plane secrets than legacy methods, which use relatively weak encryption. Type 6 encryption uses the onboard Trust Anchor Module, or TAM, to store the encryption key outside of the device configuration, meaning simply having access to the config does not expose the keys used in control-plane protocol security.
Create key (exec mode, not config mode):
key config-key password-encryption
 (enter key)
password6 encryption aes
Key chain configuration
At the “key-string” command simply enter the unencrypted string. If Type 6 encryption is enabled, the key will automatically use the “password6” encryption type.
key chain bgp_type6
 key 1
  accept-lifetime 01:00:00 october 24 2005 infinite
  key-string password6 634d695d4848565e5a5d49604741465566496568575046455a6265414142
  send-lifetime 01:00:00 october 24 2005 infinite
  cryptographic-algorithm HMAC-MD5
TCP Authentication Option, MD5 Deprecation
TCP Authentication Option, commonly known as TCP-AO, is a modern way to authenticate TCP sessions and is defined in RFC 5925. TCP-AO replaces MD5 authentication, which has been deprecated for a number of years due to its weak security. TCP-AO does NOT encrypt BGP session traffic; it authenticates the TCP header to ensure the neighbor is the correct sender. TCP-AO should be used along with Type 6 encryption to best secure BGP sessions.
tcp ao
 keychain TCP-AO-KEY
  key 1 SendID 100 ReceiveID 100
 !
!
key chain TCP-AO-KEY
 key 1
  accept-lifetime 00:00:00 january 01 2018 infinite
  key-string password6 5d574a574d5b6657555c534c62485b51584b57655351495352564f55575060525a60504b
  send-lifetime 00:00:00 january 01 2018 infinite
  cryptographic-algorithm AES-128-CMAC-96
 !
!
BGP Neighbor Configuration
router bgp 100
 neighbor 1.2.3.4
  remote-as 101
  ao TCP-AO-KEY include-tcp-options enable
Per-Peer Control Plane Policers
BGP protocol packets are handled at the RP level, meaning each packet is handled by the router CPU, which has limited bandwidth and processing resources. In the case of a malicious or misconfigured peer this could exhaust the processing power of the CPU, impacting other important tasks. IOS-XR enforces protocol policers and BGP peer policers by default.
BGP Prefix Security
RPKI Origin Validation
Prefix hijacking has been prevalent throughout the last decade as the Internet became more integrated into our lives. This led to the creation of RPKI origin validation, a mechanism to validate that a prefix is being originated by its rightful owner by checking the originating ASN against a secure database. IOS-XR fully supports RPKI for origin validation.
BGP RPKI and ROV Configuration
The following section outlines an example configuration for RPKI and Route Origin Validation (ROV) within IOS-XR.
Create ROV Routing Policies
In order to apply specific attributes to routes tagged with an ROV status, one must use a routing policy. The “invalid”, “valid”, and “unconfigured” states can be matched upon and then used to set specific BGP attributes as well as accept or drop the route. In the following example a route’s local-preference attribute is set based on ROV status.
route-policy rpki
  if validation-state is invalid then
    set local-preference 50
  endif
  if validation-state is not-found then
    set local-preference 75
  endif
  if validation-state is valid then
    set local-preference 100
  endif
  pass
end-policy
Configure RPKI Server and ROV Options
An RPKI server is defined using the “rpki server” section under the global BGP hierarchy. Also configurable is whether or not the ROV status is taken into account as part of the BGP best-path selection process. A route with a “valid” status is preferred over a route with a “not-found” or “invalid” status. There is also a configuration option for whether or not to allow invalid routes at all as part of the selection process. It is recommended to include the ROV status in best-path selection, as shown in the following example.
router bgp 65536
 bgp router-id 192.168.0.1
 rpki server 172.16.0.254
  transport tcp port 32000
  refresh-time 120
 !
 bgp bestpath origin-as use validity
 bgp bestpath origin-as allow invalid
Enabling RPKI ROV on BGP Neighbors
ROV is done at the global BGP level, but the treatment of routes is done at the neighbor level. This requires applying the pre-defined ROV route-policy to the neighbors you wish to apply policy to based on ROV status.
neighbor 192.168.0.254
 remote-as 64555
 address-family ipv4 unicast
  route-policy rpki in
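Some operators prefer to reject RPKI-invalid prefixes outright rather than de-preference them with a lower local-preference as shown above. A minimal sketch of that alternative policy follows (the policy name is an illustrative placeholder); if invalid routes are dropped at the policy level they never enter the BGP table, so the bgp bestpath origin-as allow invalid option shown earlier is not needed in that case.
route-policy rpki-drop-invalid
  if validation-state is invalid then
    drop
  endif
  pass
end-policy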
Communicating ROV Status via Well-Known BGP Community
RPKI ROV is typically only done on the edges of the network, and in IOS-XR it is only performed on EBGP sessions. In a network with multiple ASNs under the same administrative control, one should configure the following to signal ROV validation status via a well-known community to peers within the same administrative domain. This way only the nodes connected to external peers have RTR sessions to the RPKI ROV validators and are responsible for applying ROV policy, adding efficiency to the process and reducing load on the validator.
address-family ipv4 unicast
 bgp origin-as validation signal ibgp
BGPSEC (Reference Only)
RPKI origin validation works to validate the source of a prefix, but does not validate the entire path of the prefix. Origin validation also does not use cryptographic signatures to ensure the originator is who they say they are, so spoofing the origin ASN still allows someone to hijack a prefix. BGPSEC is an evolution where a BGP prefix is cryptographically signed with the key of its valid originator, and each BGP router receiving the path checks to ensure the prefix originated from the valid owner. BGPSEC standards are being worked on in the SIDR working group. Cisco continues to monitor the standards related to BGPSEC and similar technologies to determine which to implement to best serve our customers.
DDoS traffic steering using SR-TE
See the overview design section for more details. This shows the configuration of a single SR-TE Policy which will balance traffic to two different egress DDoS “dirty” interfaces. If a BGP session is enabled between the DDoS mitigation appliance and the router, an EPE label can be assigned to the interface. In the absence of EPE, an MPLS static LSP can be created on the core-facing interfaces on the egress node, with the action set to “pop” towards the DDoS mitigation interface.
SR-TE Policy configuration
In this example the node SID is 16441. The EPE or manual xconnect SIDs for the two egress interfaces are 28000 and 28001. The weight of each path is 100, so traffic will be equally balanced across the paths.
segment-routing
 traffic-eng
  segment-list pr1-ddos-1
   index 1 mpls label 16441
   index 2 mpls label 28000
  segment-list pr1-ddos-2
   index 1 mpls label 16441
   index 2 mpls label 28001
  policy pr1_ddos1_epe
   color 999 end-point ipv4 192.168.14.4
   candidate-paths
    preference 100
     explicit segment-list pr1-ddos-1
      weight 100
     !
     explicit segment-list pr1-ddos-2
      weight 100
Egress node BGP configuration
On the egress BGP node, 192.168.14.4, prefixes are set with a specific “DDoS” color to enable the ingress node to steer traffic into the correct SR Policy. An example is given of injecting the 50.50.50.50/32 route with the “DDoS” color of 999.
extcommunity-set opaque DDOS
 999
end-set
!
route-policy SET-DDOS-COLOR
 set extcommunity color DDOS
 pass
end-policy
!
router static
 address-family ipv4 unicast
  50.50.50.50/32 null0
 !
!
router bgp 100
 address-family ipv4 unicast
  network 50.50.50.50/32 route-policy SET-DDOS-COLOR
 !
!
Egress node MPLS static LSP configuration
If EPE is not being utilized, the last label in the SR Policy path must be matched to a static LSP. The ingress label on the egress node is used to map traffic to a specific IP next-hop and interface. We will give an example using the label 28000 in the SR Policy path. The core-facing ingress interface is HundredGigE0/0/0/1, and the egress DDoS “dirty” interface is TenGigE0/0/0/1 with a next-hop address of 192.168.100.1.
mpls static
 interface HundredGigE0/0/0/1
 lsp ddos-interface-1
  in-label 28000 allocate
  forward
   path 1 nexthop TenGigE0/0/0/1 192.168.100.1 out-label pop
 !
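For reference, when the EPE option mentioned above is used instead of a static LSP, a per-peer EPE SID can be allocated by enabling egress-engineering on the BGP session towards the DDoS mitigation appliance; the dynamically allocated peer SID then takes the place of the manually assigned 28000/28001 labels in the segment lists. The neighbor address and ASN below are illustrative assumptions, not part of the validated configuration.
router bgp 100
 neighbor 192.168.100.2
  remote-as 64999
  egress-engineering
  address-family ipv4 unicast
  !
 !
!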
!!AppendixApplicable YANG ModelsModelDataopenconfig-interfacesCisco-IOS-XR-infra-statsd-operCisco-IOS-XR-pfi-im-cmd-operInterface config and state Common counters found in SNMP IF-MIB openconfig-if-ethernet Cisco-IOS-XR-drivers-media-eth-operEthernet layer config and stateXR native transceiver monitoringopenconfig-platformInventory, transceiver monitoring openconfig-bgpCisco-IOS-XR-ipv4-bgp-oper Cisco-IOS-XR-ipv6-bgp-operBGP config and state Includes neighbor session state, message counts, etc.openconfig-bgp-rib Cisco-IOS-XR-ip-rib-ipv4-oper Cisco-IOS-XR-ip-rib-ipv6-operBGP RIB information. Note# Cisco native includes all protocols openconfig-routing-policyConfigure routing policy elements and combined policyopenconfig-telemetryConfigure telemetry sensors and destinations Cisco-IOS-XR-ip-bfd-cfg Cisco-IOS-XR-ip-bfd-operBFD config and state Cisco-IOS-XR-ethernet-lldp-cfg Cisco-IOS-XR-ethernet-lldp-operLLDP config and state openconfig-mplsMPLS config and state, including Segment RoutingCisco-IOS-XR-clns-isis-cfgCisco-IOS-XR-clns-isis-operIS-IS config and state Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-operNCS 5500 HW resources NETCONF YANG PathsNote that while paths are given to retrieve data from a specific leafnode, it is sometimes more efficient to retrieve all the data under aspecific heading and let a management station filter unwanted data thanperform operations on the router. Additionally, Model Driven Telemetrymay not work at a leaf level, requiring retrieval of an entire subset ofdata.The data is also available via NETCONF, which does allow subtree filtersand retrieval of specific data. However, this is a more resourceintensive operation on the router. Metric Data Logical Interface Admin State Enum SNMP OID IF-MIB#ifAdminStatus OC YANG openconfig-interfaces#interfaces/interface/state/admin-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Interface Operational State Enum SNMP OID IF-MIB#ifOperStatus OC YANG openconfig-interfaces#interfaces/interface/state/oper-status (see OC model, not just up/down) Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/state     Logical Last State Change (seconds) Counter SNMP OID IF-MIB#ifLastChange OC YANG openconfig-interfaces#interfaces/interface/state/last-change Native YANG Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/last-state-transition-time     Logical Interface SNMP ifIndex Integer SNMP OID IF-MIB#ifIndex OC YANG openconfig-interfaces#interfaces/interface/state/if-index Native YANG Cisco-IOS-XR-snmp-agent-oper#snmp/interface-indexes/if-index     Logical Interface RX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCInOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-received     Logical Interface TX Bytes 64-bit Counter SNMP OID IF-MIB#ifHCOutOctets OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-octets Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/bytes-sent     Logical Interface RX Errors Counter SNMP OID IF-MIB#ifInErrors OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-errors MDT Native     Logical Interface TX Errors Counter SNMP OID IF-MIB#ifOutErrors OC YANG 
openconfig-interfaces#/interfaces/interface/state/counters/out-errors Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-errors     Logical Interface Unicast Packets RX Counter SNMP OID IF-MIB#ifHCInUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Unicast Packets TX Counter SNMP OID IF-MIB#ifHCOutUcastPkts OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-unicast-pkts Native YANG Not explicitly supported, subtract multicast/broadcast from total     Logical Interface Input Drops Counter SNMP OID IF-MIB#ifIntDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/in-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/input-drops     Logical Interface Output Drops Counter SNMP OID IF-MIB#ifOutDiscards OC YANG openconfig-interfaces#/interfaces/interface/state/counters/out-discards Native YANG Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters/output-drops     Ethernet Layer Stats – All Interfaces Counters SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics     Ethernet PHY State – All Interfaces Counters SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info     Ethernet Input CRC Errors Counter SNMP OID NA OC YANG openconfig-interfaces#interfaces/interface/oc-eth#ethernet/oc-eth#state/oc-eth#counters/oc-eth#in-crc-errors Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/statistics/statistic/dropped-packets-with-crc-align-errors The following transceiver paths retrieve the total power for thetransceiver, there are specific per-lane power levels which can beretrieved from both native and OC models, please refer to the model YANGfile for additionalinformation.     Ethernet Transceiver RX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-rx-power     Ethernet Transceiver TX Power Counter SNMP OID NA OC YANG oc-platform#components/component/oc-transceiver#transceiver/oc-transceiver#physical-channels/oc-transceiver#channel/oc-transceiver#state/oc-transceiver#input-power Native YANG Cisco-IOS-XR-drivers-media-eth-oper/ethernet-interface/interfaces/interface/phy-info/phy-details/transceiver-tx-power BGP Operational StateGlobal BGP Protocol StateIOS-XR native models do not store route information in the BGP Opermodel, they are stored in the IPv4/IPv6 RIB models. These models containRIB information based on protocol, with a numeric identifier for eachprotocol with the BGP ProtoID being 5. The protoid must be specified orthe YANG path will return data for all configured routingprotocols.     
BGP Total Paths (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-paths Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/num-active-paths MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native BGP Neighbor StateExample UsageDue the construction of the YANG model, the neighbor-address key must beincluded as a container in all OC BGP state RPCs. The following RPC getsthe session state for all configured peers#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address/> <state> <session-state/> </state> </neighbor> </neighbors> </bgp> </filter> </get></rpc>\t<nc#rpc-reply message-id=~urn#uuid#24db986f-de34-4c97-9b2f-ac99ab2501e3~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp xmlns=~http#//openconfig.net/yang/bgp~> <neighbors> <neighbor> <neighbor-address>172.16.0.2</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <state> <session-state>IDLE</session-state> </state> </neighbor> </neighbors> </bgp> </nc#data></nc#rpc-reply>     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Complete State for all BGP neighbors Mixed SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors     Session State for all BGP neighbors Enum SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/session-state Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/connection-state     Message counters for all BGP neighbors Counter SNMP OID NA OC YANG openconfig-bgp#bgp/neighbors/neighbor/state/messages Native YANG Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighbors/neighbor/message-statistics Current queue depth for all BGP neighborsCounterSNMP OIDNAOC YANG/openconfig-bgp#bgp/neighbors/neighbor/state/queuesNative YANGCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-outCisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/sessions/session/messages-queued-inBGP RIB DataRIB data is retrieved per AFI/SAFI. To retrieve IPv6 unicast routesusing OC models, replace “ipv4-unicast” with “ipv6-unicast”IOS-XR native models do not have a BGP specific RIB, only RIB dataper-AFI/SAFI for all protocols. 
Retrieving RIB information from thesepaths will include this data.While this data is available via both NETCONF and MDT, it is recommendedto use BMP as the mechanism to retrieve RIB table data.Example UsageThe following retrieves a list of best-path IPv4 prefixes withoutattributes from the loc-RIB#<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <loc-rib> <routes> <route> <prefix/> <best-path>true</best-path> </route> </routes> </loc-rib> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc>     IPv4 Local RIB – Prefix Count Counter OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/num-routes Native YANG       IPv4 Local RIB – IPv4 Prefixes w/o Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes/route/prefix     IPv4 Local RIB – IPv4 Prefixes w/Attributes List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/loc-rib/routes Native YANG   The following per-neighbor RIB paths can be qualified with a specificneighbor address to retrieve RIB data for a specific peer. Below is anexample of a NETCONF RPC to retrieve the number of post-policy routesfrom the 192.168.2.51 peer and the returned output.<rpc message-id=~101~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <get> <filter> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes/> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </filter> </get></rpc><nc#rpc-reply message-id=~urn#uuid#7d9a0468-4d8d-4008-972b-8e703241a8e9~ xmlns#nc=~urn#ietf#params#xml#ns#netconf#base#1.0~ xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <nc#data> <bgp-rib xmlns=~http#//openconfig.net/yang/rib/bgp~> <afi-safis> <afi-safi> <afi-safi-name xmlns#idx=~http#//openconfig.net/yang/rib/bgp-types~>idx#IPV4_UNICAST</afi-safi-name> <ipv4-unicast> <neighbors> <neighbor> <neighbor-address>192.168.2.51</neighbor-address> <adj-rib-in-post> <num-routes>3</num-routes> </adj-rib-in-post> </neighbor> </neighbors> </ipv4-unicast> </afi-safi> </afi-safis> </bgp-rib> </nc#data></nc#rpc-reply>     IPv4 Neighbor adj-rib-in pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-re     IPv4 Neighbor adj-rib-in post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-in-post     IPv4 Neighbor adj-rib-out pre-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre     IPv4 Neighbor adj-rib-out post-policy List OC YANG openconfig-bgp-rib#bgp-rib/afi-safis/afi-safi/ipv4-unicast/neighbors/neighbor/adj-rib-out-pre BGP Flowspec     BGP Flowspec Operational State Counters SNMP OID NA OC YANG NA Native YANG Cisco-IOS-XR-flowspec-oper MDT Native     BGP Total Prefixes (all AFI/SAFI) Counter SNMP OID NA OC YANG openconfig-bgp#bgp/global/state/total-prefixes Native YANG Cisco-IOS-XR-ip-rib-ipv4-oper/rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count/active-routes-count MDT Native Device Resource YANG Paths     Device Inventory List OC YANG oc-platform#components     NCS5500 Dataplane Resources List OC YANG NA Native YANG 
Cisco-IOS-XR-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Validated Model-Driven Telemetry Sensor PathsThe following represents a list of validated sensor paths useful formonitoring the Peering Fabric and the data which can be gathered byconfiguring these sensorpaths.Device inventory and monitoring, not transceiver monitoring is covered under openconfig-platform openconfig-platform#components cisco-ios-xr-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data cisco-ios-xr-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info cisco-ios-xr-shellutil-oper#system-time/uptime cisco-ios-xr-wdsysmon-fd-oper#system-monitoring/cpu-utilizationLLDP MonitoringCisco-IOS-XR-ethernet-lldp-oper#lldpCisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighborsInterface statistics and stateopenconfig-interfaces#interfacesCisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-countersCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interfaceCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statisticsCisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics/basic-interface-statsThe following sub-paths can be used but it is recommended to use the base openconfig-interfaces modelopenconfig-interfaces#interfaces/interfaceopenconfig-interfaces#interfaces/interface/stateopenconfig-interfaces#interfaces/interface/state/countersopenconfig-interfaces#interfaces/interface/subinterfaces/subinterface/state/countersAggregate bundle information (use interface models for interface counters)sensor-group openconfig-if-aggregate#aggregatesensor-group openconfig-if-aggregate#aggregate/statesensor-group openconfig-lacp#lacpsensor-group Cisco-IOS-XR-bundlemgr-oper#bundlessensor-group Cisco-IOS-XR-bundlemgr-oper#bundle-information/bfd-countersBGP Peering informationsensor-path openconfig-bgp#bgpsensor-path openconfig-bgp#bgp/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticssensor-path Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper/bgp/instances/instance/instance-active/default-vrf/neighborssensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/vrfsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/neighbors/neighborsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/globalsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/bmpsensor-path Cisco-IOS-XR-ipv6-bgp-oper#bgp/instances/instance/instance-active/default-vrf/process-info/performance-statisticsIS-IS IGP informationsensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighborssensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfacessensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/adjacenciesIt is not 
recommended to monitor complete RIB tables using MDT but can be used for troubleshootingCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sumCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-countQoS and ACL monitoringopenconfig-acl#aclCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/general-statsCisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/queue-stats-arrayBGP RIB informationIt is not recommended to monitor these paths using MDT with large tablesopenconfig-rib-bgp#bgp-ribCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-extCisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-intRouting policy InformationCisco-IOS-XR-policy-repository-oper#routing-policy/policies", "url": "/blogs/latest-peering-fabric-hld", "author": "Phil Bedard", "tags": "iosxr, design, peering, ddos, ixp" } , "#": {} , "#": {} , "blogs-latest-converged-sdn-transport-ig": { "title": "Converged SDN Transport Implementation Guide", "content": " On This Page Version Targets Testbed Overview Devices Key Resources to Allocate Role-Based Router Configuration IOS-XR Router Configuration Underlay Bundle interface configuration with BFD Underlay physical interface configuration Performance Measurement Interface delay metric dynamic configuration Interface delay metric static configuration SR Policy Delay Measurement Profile Enabling SR Policy Delay Measurement SR Policy Liveness Detection Profile SR Policy with Liveness Detection Enabled IOS-XR SR-MPLS Transport Segment Routing SRGB and SRLB Definition IGP protocol (ISIS) and Segment Routing MPLS configuration IS-IS router configuration IS-IS Loopback and node SID configuration IS-IS Physical and Bundle interface configuration with BFD MPLS-TE Configuration Unnumbered Interfaces Unnumbered Interface IS-IS Database Anycast SID ABR node configuration IS-IS logical interface configuration with TI-LFA Segment Routing Data Plane Monitoring MPLS Segment Routing Traffic Engineering (SR-TE) configuration MPLS Segment Routing Traffic Engineering (SR-TE) TE metric configuration IOS-XR SR Flexible Algorithm Configuration Flex-Algo IS-IS Definition Flex-Algo Node SID Configuration IOS-XE Nodes - SR-MPLS Transport Segment Routing MPLS configuration Prefix-SID assignment to loopback 0 configuration Basic IGP protocol (ISIS) with Segment Routing MPLS configuration TI-LFA FRR 
configuration IS-IS and MPLS interface configuration MPLS Segment Routing Traffic Engineering (SR-TE) Area Border Routers (ABRs) IPv4/IPv6 route distribution using BGP Core SR-PCE BGP Configuration ABR BGP Configuration Deprecated Area Border Routers (ABRs) IGP-ISIS Redistribution configuration (IOS-XR) Redistribute Core SvRR and TvRR loopback into Access domain Redistribute Access SR-PCE and SvRR loopbacks into CORE domain Multicast transport using mLDP Overview mLDP core configuration LDP base configuration with defined interfaces LDP auto-configuration G.8275.1 and G.8275.2 PTP (1588v2) timing configuration Summary Enable frequency synchronization Optional Synchronous Ethernet configuration (PTP hybrid mode) PTP G.8275.2 global timing configuration PTP G.8275.2 interface profile definitions IPv4 G.8275.2 master profile IPv6 G.8275.2 master profile IPv4 G.8275.2 slave profile IPv6 G.8275.2 slave profile PTP G.8275.1 global timing configuration IPv6 G.8275.1 slave profile IPv6 G.8275.1 master profile Application of PTP profile to physical interface G.8275.2 interface configuration G.8275.1 interface configuration G.8275.1 and G.8275.2 Multi-Profile and Interworking G.8275.1 Primary to G.8275.2 Configuration G.8275.2 Primary to G.8275.1 Configuration Segment Routing Path Computation Element (SR-PCE) configuration BGP - Services (sRR) and Transport (tRR) route reflector configuration Services Route Reflector (sRR) configuration Transport Route Reflector (tRR) configuration BGP – Provider Edge Routers (A-PEx and PEx) to service RR IOS-XR configuration IOS-XE configuration BGP-LU co-existence BGP configuration Segment Routing Global Block Configuration Boundary node configuration PE node configuration Area Border Routers (ABRs) IGP topology distribution Segment Routing Traffic Engineering (SR-TE) and Services Integration On Demand Next-Hop (ODN) configuration – IOS-XR On-Demand Route Policies On Demand Next-Hop (ODN) configuration – IOS-XE SR-PCE configuration – IOS-XR SR-PCE configuration – IOS-XE SR-TE Policy Configuration SR-TE Color and Endpoint SR-TE Candidate Paths Service to SR-TE Policy Forwarding - Per-Destination Service to SR-TE Policy Forwarding - Per-Flow SR-TE and ODN Configuration Examples SR Policy using IGP metric, head-end computation PCE delegated SR Policy using lowest IGP metric PCE delegated SR Policy using lowest latency metric PCE delegated SR Policy including Anycast SIDs PCE delegated SR Policy using specific Flexible Algorithm SR Policy using explicit segment list Per-Flow Segment Routing Configuration (NCS Platforms) Per-Flow QoS Configuration Per-Flow Policy Configuration On-Demand Next-Hop Per-Flow Configuration QoS Implementation Summary Core QoS configuration Class maps used in QoS policies Core ingress classifier policy Core egress queueing map Core egress MPLS EXP marking map H-QoS configuration Enabling H-QoS on NCS 540 and NCS 5500 Example H-QoS policy for 5G services Class maps used in ingress H-QoS policies Parent ingress QoS policy H-QoS ingress child policies Egress H-QoS parent policy (Priority levels) Egress H-QoS child using priority only Egress H-QoS child using reserved bandwidth Egress H-QoS child using shaping Support for Time Sensitive Networking in N540-FH-CSR-SYS and N540-FH-AGG-SYS Time Sensitive Networking Configuration Ingress Interface Egress Interface Services End-To-End VPN Services End-To-End VPN Services Data Plane L3VPN MP-BGP VPNv4 On-Demand Next-Hop Access Router Service Provisioning (IOS-XR) Access Router Service Provisioning 
(IOS-XE) L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Access Router Service Provisioning (IOS-XR)# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# L2VPN EVPN E-Tree IOS-XR Root Node Configuraiton IOS-XR Leaf Node Configuration Hierarchical Services L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Router Service Provisioning (IOS-XR)# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – EVPN Head-End Configuration Access Router Service Provisioning (IOS-XR)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – EVPN Centralized Gateway Access Router Service Provisioning (IOS-XR)# Provider Edge Routers Service Provisioning (IOS-XR)# Ethernet CFM for L2VPN service assurance Maintenance Domain configuration MEP configuration for EVPN-VPWS services Multicast Source Distribution using BGP Multicast AFI/SAFI Multicast BGP Configuration Multicast Profile 14 using mLDP and ODN L3VPN Multicast core configuration Unicast L3VPN PE configuration Multicast PE configuration Multicast distribution using Tree-SID with static S,G Mapping Tree-SID SR-PCE Configuration Endpoint Set Configuration P2MP Tree-SID SR Policy Configuration Tree-SID Common Config on All Nodes Segment Routing Local Block PCEP Configuration Static Tree-SID Source Node Multicast Configuration Static Tree-SID Receiver Node Multicast Configuration Global Routing Table Multicast mVPN Multicast Configuration Tree-SID Verification on PCE Multicast distribution using fully dynamic Tree-SID PE BGP Configuration PE Multicast Routing Configuration PE PIM Configuration Hierarchical Services Examples L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Router Service Provisioning (IOS-XR)# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRB Access Router Service Provisioning (IOS-XR)# Access Router Service Provisioning (IOS-XE)# Provider Edge Routers Service Provisioning (IOS-XR)# Remote PHY CIN Implementation Summary Sample QoS Policies Class maps RPD and DPIC interface policy maps Core QoS CIN Timing Configuration PTP Messaging Rates Example CBR-8 RPD DTI Profile Multicast configuration Summary Global multicast configuration - Native multicast Global multicast configuration - LSM using profile 14 PIM configuration - Native multicast PIM configuration - LSM using profile 14 IGMPv3/MLDv2 configuration - Native multicast IGMPv3/MLDv2 configuration - LSM profile 14 IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation) RPD DHCPv4/v6 relay configuration Native IP / Default VRF RPHY L3VPN cBR-8 DPIC interface configuration without Link HA cBR-8 DPIC interface 
configuration with Link HA cBR-8 Digital PIC Interface Configuration RPD interface configuration P2P L3 BVI RPD/DPIC agg device IS-IS configuration Additional configuration for L3VPN Design Global VRF Configuration BGP Configuration cBR-8 Segment Routing Configuration Cloud Native Broadband Network Gateway (cnBNG) Model-Driven Telemetry Configuration Summary Device inventory and monitoring Interface Data LLDP Monitoring Aggregate bundle information (use interface models for interface counters) PTP and SyncE Information BGP Information IS-IS Information Routing protocol RIB information BGP RIB information Routing policy Information Ethernet CFM EVPN Information Per-Interface QoS Statistics Information Per-Policy, Per-Interface, Per-Class statistics L2VPN Information L3VPN Information SR-PCE PCC and SR Policy Information MPLS performance measurement mLDP Information ACL Information VersionThe following aligns to and uses features from Converged SDN Transport 5.0, pleasesee the overview High Level Design document at https#//xrdocs.io/design/blogs/latest-converged-sdn-transport-hldTargets Hardware# ASR 9000 as Centralized Provider Edge (C-PE) router NCS 5500, NCS 560, and NCS 55A2 as Aggregation and Pre-Aggregation router NCS 5500 as P core router ASR 920, NCS 540, and NCS 5500 as Access Provider Edge (A-PE) cBR-8 CMTS with 8x10GE DPIC for Remote PHY Compact Remote PHY shelf with three 1x2 Remote PHY Devices (RPD) Software# IOS-XR 7.5.2 on Cisco 8000, NCS 560, NCS 540, NCS 5500, and NCS 55A2 routers IOS-XR 7.5.2 on ASR 9000 routers for non-cnBNG use IOS-XR 7.4.2 on ASR 9000 routers for cnBNG use IOS-XE 16.12.03 on ASR 920 IOS-XE 17.03.01w on cBR-8 Key technologies Transport# End-To-End Segment-Routing Network Programmability# SR-TE Inter-Domain LSPs with On-DemandNext Hop Network Availability# TI-LFA/Anycast-SID Services# BGP-based L2 and L3 Virtual Private Network services(EVPN and L3VPN/mVPN) Network Timing# G.8275.1 and G.8275.2 Network Assurance# 802.1ag Testbed OverviewDevicesAccess PE (A-PE) Routers Cisco NCS-5501-SE (IOS-XR) – A-PE7 Cisco N540-24Z8Q2C-M (IOS-XR) - A-PE1, A-PE2, A-PE3 Cisco N540-FH-CSR-SYS - A-PE8 Cisco ASR-920 (IOS-XE) – A-PE9Pre-Aggregation (PA) Routers Cisco NCS5501-SE (IOS-XR) – PA3, PA4Aggregation (AG) Routers Cisco NCS5501-SE (IOS-XR) – AG2, AG3, AG4 Cisco NCS 560-4 w/RSP-4E (IOS-XR) - AG1High-scale Provider Edge Routers Cisco ASR9000 w/Tomahawk Line Cards (IOS-XR) – PE1, PE2 Cisco ASR9000 w/Tomahawk and Lightspeed+ Line Cards (IOS-XR) – PE3, PE4Area Border Routers (ABRs) Cisco ASR9000 (IOS-XR) – PE3, PE4 Cisco 55A2-MOD-SE - PA2 Cisco NCS540 - PA1Core Routers Cisco 55A1-36H (36x100G) - P1,P2 Cisco 8201-32FH - P3,P4Service and Transport Route Reflectors (RRs) Cisco IOS XRv 9000 – tRR1-A, tRR1-B, sRR1-A, sRR1-B, sRR2-A, sRR2-B,sRR3-A, sRR3-BSegment Routing Path Computation Element (SR-PCE) Cisco IOS XRv 9000 – SRPCE-A1-A, SRPCE-A1-B, SRPCE-A2-A, SRPCE-A2-A, SRPCE-CORE-A, SRPCE-CORE-BKey Resources to Allocate IP Addressing IPv4 address plan IPv6 address plan, recommend dual plane day 1 Plan for SRv6 in the future Color communities for ODN Segment Routing Blocks SRGB (segment-routing address block) Keep in mind anycast SID for ABR node pairs Allocate 3 SIDs for potential future Flex-algo use SRLB (segment routing local block) Local significance only Can be quite small and re-used on each node IS-IS unique instance identifiers for each domainRole-Based Router ConfigurationIOS-XR Router ConfigurationUnderlay Bundle interface configuration with BFDinterface 
Bundle-Ether100 bfd mode ietf bfd address-family ipv4 timers start 180 bfd address-family ipv4 multiplier 3 bfd address-family ipv4 destination 10.1.2.1 bfd address-family ipv4 fast-detect bfd address-family ipv4 minimum-interval 50 mtu 9216 ipv4 address 10.15.150.1 255.255.255.254 ipv4 unreachables disable load-interval 30 dampeningUnderlay physical interface configurationinterface HundredGigE0/0/0/24 mtu 9216 ipv4 address 10.15.150.1 255.255.255.254 ipv4 unreachables disable load-interval 30 dampeningPerformance MeasurementInterface delay metric dynamic configurationStarting with CST 3.5 we now support end to end dynamic link delay measurements across all IOS-XR nodes. The feature in IOS-XR is called Performance Measurement and all configuration is found under the performance-measurement configuration hierarchy. There are a number of configuration options utilized when configuring performance measurement, but the below configuration will enable one-way delay measurements on physical links. The probe measurement-mode options are either one-way or two-way. One-way mode requires nodes be time synchronized to a common PTP clock, and should be used if available. In the absence of a common PTP clock source, two-way mode can be used which calculates the one-way delay using multiple timestamps at the querier and responder.The advertisement options specify when the advertisements are made into the IGP. The periodic interval sets the minimum interval, with the threshold setting the difference required to advertise a new delay value. The accelerated threshold option sets a percentage change required to trigger and advertisement prior to the periodic interval timer expiring. Performance measurement takes a series of measurements within each computation interval and uses this information to derive the min, max, and average link delay.Full documentation on Performance Measurement can be found at# https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-5/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-75x/configure-performance-measurement.htmlperformance-measurement interface TenGigE0/0/0/20 delay-measurement ! ! interface TenGigE0/0/0/21 delay-measurement ! ! protocol twamp-light measurement delay unauthenticated querier-dst-port 12345 ! ! ! delay-profile interfaces advertisement accelerated threshold 25 ! periodic interval 120 threshold 10 ! ! probe measurement-mode two-way protocol twamp-light computation-interval 60 ! !!endInterface delay metric static configurationIn the absence of dynamic realtime one-way latency monitoring for physical interfaces, the interface delay can be set manually. The one-way delay measurement value is used when computing SR Policy paths with the “latency” constraint type. The configured value is advertised in the IGP using extensions defined in RFC 7810, and advertised to the PCE using BGP-LS extensions. Keep in mind the delay metric value is defined in microseconds, so if you are mixing dynamic computation with static values they should be set appropriately.performance-measurement interface TenGigE0/0/0/10 delay-measurement advertise-delay 15000 interface TenGigE0/0/0/20 delay-measurement advertise-delay 10000SR Policy Delay Measurement ProfileProperties for SR Policy end to end measurement can be customized to set specific intervals, logging, delay thresholds, and protocol. 
The “default” profile will be used for all SR Policies with delay measurement enabled unless a specific profile is specified.delay-profile sr-policy default advertisement accelerated threshold 25 ! periodic interval 120 threshold 10 ! threshold-check average-delay ! ! probe tos dscp 46 ! measurement-mode two-way protocol twamp-light computation-interval 60 burst-interval 60 ! ! protocol twamp-light measurement delay unauthenticated querier-dst-port 12345Enabling SR Policy Delay Measurementpolicy srte_c_5227_ep_100.0.0.27 color 5227 end-point ipv4 100.0.0.27 candidate-paths preference 100 dynamic metric type igp ! ! ! ! performance-measurement delay-measurementSR Policy Liveness Detection ProfileNote on platforms with HW enabled probe generation, the minimum interval is 3.3ms, on platforms with CPU probe generation, the minimum interval is 30ms (30000us).performance-measurement liveness-profile name cst liveness-detection multiplier 3 ! probe tx-interval 30000SR Policy with Liveness Detection EnabledThis example uses the default liveness detection profile. In this case when three probes are missed, the SR Policy will transition to a “down” state due to the “invalidation-action down” command. If this is omitted, path changes will be logged but no action will be taken.segment-routing traffic-eng policy sr-policy-liveness color 5000 end-point ipv4 100.0.0.25 candidate-paths preference 200 dynamic pcep ! anycast-sid-inclusion ! ! constraints segments sid-algorithm 130 ! ! ! ! performance-measurement liveness-detection invalidation-action downIOS-XR SR-MPLS TransportSegment Routing SRGB and SRLB DefinitionIt’s recommended to first configure the Segment Routing Global Block (SRGB) across all nodes needing connectivity between each other. In most instances a single SRGB will be used across the entire network. In a SR MPLS deployment the SRGB and SRLB correspond to the label blocks allocated to SR. IOS-XR has a maximum configurable SRGB limit of 512,000 labels, however please consult platform-specific documentation for maximum values. The SRLB corresponds to the labels allocated for SIDs local to the node, such as Adjacency-SIDs. It is recommended to configure the same SRLB block across all nodes. The SRLB must not overlap with the SRGB. The SRGB and SRLB are configured in IOS-XR with the following configuration#segment-routing global-block 16000 23999 local-block 15000 15999 IGP protocol (ISIS) and Segment Routing MPLS configurationThe following section documents the configuration without Flex-Algo, Flex-Algo configuration is found in the Flex-Algo configuration section.Key chain global configuration for IS-IS authenticationkey chain ISIS-KEY key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 00071A150754 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5 IS-IS router configurationAll routers, except Area Border Routers (ABRs), are part of one IGPdomain and L2 area (ISIS-ACCESS or ISIS-CORE). Area border routersrun two IGP IS-IS processes (ISIS-ACCESS and ISIS-CORE). 
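As a point of reference, an ABR skeleton with its Loopback0 and node SID present in both instances might look like the following minimal sketch; the NET values and SID are illustrative assumptions, and the full ISIS-ACCESS instance configuration used in this design follows below.
router isis ISIS-ACCESS
 net 49.0001.0100.0000.0003.00
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid absolute 16003
  !
 !
!
router isis ISIS-CORE
 net 49.0002.0100.0000.0003.00
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid absolute 16003
  !
 !
!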
Note that Loopback0 is part of both IGP processes.router isis ISIS-ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0101.0000.0110.00 nsr distribute link-state nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 5 secondary-wait 100 lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password keychain ISIS-KEY address-family ipv4 unicast metric-style wide advertise link attributes spf-interval maximum-wait 1000 initial-wait 5 secondary-wait 100 segment-routing mpls spf prefix-priority high tag 1000 maximum-redistributed-prefixes 100 level 2 ! address-family ipv6 unicast metric-style wide spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 maximum-redistributed-prefixes 100 level 2Note# ABR Loopback 0 on domain boundary is part of both IGP processes together with same “prefix-sid absolute” valueNote# The prefix SID can be configured as either absolute or index. The index configuration is required for interop with nodes using a different SRGB.IS-IS Loopback and node SID configuration interface Loopback0 ipv4 address 100.0.1.50 255.255.255.255 address-family ipv4 unicast prefix-sid absolute 16150 tag 1000 IS-IS Physical and Bundle interface configuration with BFDinterface HundredGigE0/0/0/20/0 circuit-type level-2-only bfd minimum-interval 5 bfd multiplier 5 bfd fast-detect ipv4 point-to-point address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 10MPLS-TE ConfigurationEnabling the use of Segment Routing Traffic Engineering requires first configuring basic MPLS TE so the router Traffic Engineering Database (TED) is populated with the proper TE attributes. The configuration requires nompls traffic-eng Unnumbered InterfacesIS-IS and Segment Routing/SR-TE utilized in the Converged SDN Transport design supports using unnumbered interfaces. SR-PCE used to compute inter-domain SR-TE paths also supports the use of unnumbered interfaces. In the topology database each interface is uniquely identified by a combination of router ID and SNMP IfIndex value.Unnumbered interface configurationinterface TenGigE0/0/0/2 description to-AG2 mtu 9216 ptp profile My-Slave port state slave-only local-priority 10 ! service-policy input core-ingress-classifier service-policy output core-egress-exp-marking ipv4 point-to-point ipv4 unnumbered Loopback0 frequency synchronization selection input priority 10 wait-to-restore 1 !!Unnumbered Interface IS-IS DatabaseThe IS-IS database will reference the node SNMP IfIndex valueMetric# 10 IS-Extended A-PE1.00 Local Interface ID# 1075, Remote Interface ID# 40 Affinity# 0x00000000 Physical BW# 10000000 kbits/sec Reservable Global pool BW# 0 kbits/sec Global Pool BW Unreserved# [0]# 0 kbits/sec [1]# 0 kbits/sec [2]# 0 kbits/sec [3]# 0 kbits/sec [4]# 0 kbits/sec [5]# 0 kbits/sec [6]# 0 kbits/sec [7]# 0 kbits/sec Admin. Weight# 90 Ext Admin Group# Length# 32 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 Link Average Delay# 1 us Link Min/Max Delay# 1/1 us Link Delay Variation# 0 us Link Maximum SID Depth# Label Imposition# 12 ADJ-SID# F#0 B#1 V#1 L#1 S#0 P#0 weight#0 Adjacency-sid#24406 ADJ-SID# F#0 B#0 V#1 L#1 S#0 P#0 weight#0 Adjacency-sid#24407Anycast SID ABR node configurationAnycast SIDs are SIDs existing on two more ABR nodes to offer a redundant fault tolerant path for traffic between Access PEs and remote PE devices. In CST 3.5 and above, anycast SID paths can either be manually configured on the head-end or computed by the SR-PCE. 
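When the anycast path is configured manually on the head-end, the anycast SID is simply placed in an explicit segment list ahead of the SID of the final destination. The following is a minimal sketch; the segment-list name and the label values (17001 for the anycast SID shared by the ABR pair, 16250 for the remote PE node SID) are illustrative assumptions.
segment-routing
 traffic-eng
  segment-list ANYCAST-ABR-EXAMPLE
   index 10 mpls label 17001
   index 20 mpls label 16250
  !
 !
!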
When SR-PCE computes a path it will inspect the topology database to ensure the next SID in the computed segment list is reachable from all anycast nodes. If not, the anycast SID will not be used. The same IP address and prefix-sid must be configured on all shared anycast nodes, with the n-flag clear option set. Note when anycast SID path computation is used with SR-PCE, only IGP metrics are supported.IS-IS Configuration for Anycast SIDrouter isis ACCESS interface Loopback100 ipv4 address 100.100.100.1 255.255.255.255 address-family ipv4 unicast prefix-sid absolute 16150 n-flag clear tag 1000 Conditional IGP Loopback advertisement While not the only use case for conditional advertisement, it is a required component when using anycast SIDs with static segment list. Conditional advertisement will not advertise the Loopback interface if certain routes are not found in the RIB. If the anycast Loopback is withdrawn, the segment list will be considered invalid on the head-end node. The conditional prefixes should be all or a subset of prefixes from the adjacent IGP domain.route-policy check if rib-has-route in async remote-prefixes pass endif end-policyprefix-set remote-prefixes 100.0.2.52, 100.0.2.53router isis ACCESS interface Loopback100 address-family ipv4 unicast advertise prefix route-policy checkIS-IS logical interface configuration with TI-LFAIt is recommended to use manual adjacency SIDs. A protected SID is eligible for backup path computation, meaning if a packet ingresses the node with the label a backup path will be provided in case of a link failure. In the case of having multiple adjacencies between the same two nodes, use the same adjacency-sid on each link. Unnumbered interfaces are configured using the same configuration. interface TenGigE0/0/0/10 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa adjacency-sid absolute 15002 protected metric 100 ! address-family ipv6 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 Segment Routing Data Plane MonitoringIn CST 3.5 we introduce SR DPM across all IOS-XR platforms. SR DPM uses MPLS OAM mechanisms along with specific SID lists in order to exercise the dataplane of the originating node, detecting blackholes typically difficult to diagnose. SR DPM ensures the nodes SR-MPLS forwarding plane is valid without a drop in traffic towards adjacent nodes and other nodes in the same IGP domain. SR DPM is a proactive approach to blackhole detection and mitigation.SR DPM first performs interface adjacency checks by sending an MPLS OAM packet to adjacent nodes using the interface adjacency SID and its own node SID in the SID list. This ensures the adjacent node is sending traffic back to the node correctly.Once this connectivity is verified, SR DPM will then test forwarding to all other node SIDs in the IGP domain across each adjacency. This is done by crafting a MPLS OAM packet with SID list {Adj-SID, Target Node SID} with TTL=2. The packet is sent to the adjacent node, back to the SR DPM testing node, and then onto the target node via SR-MPLS forwarding. The downstream node towards the target node will receive the packet with TTL=0 and send an MPLS OAM response to the SR DPM originating node. 
This communicates valid forwarding across the originating node towards the target node.It is recommended to enable SR DPM on all CST IOS-XR nodes.SR Data Plane Monitoring Configurationmpls oam dpm pps 10 interval 60 (minutes) MPLS Segment Routing Traffic Engineering (SR-TE) configurationThe following configuration is done at the global ISIS configuration level and should be performed for all IOS-XR nodes.router isis ACCESS address-family ipv4 unicast mpls traffic-eng level-2-only mpls traffic-eng router-id Loopback0MPLS Segment Routing Traffic Engineering (SR-TE) TE metric configurationThe TE metric is used when computing SR Policy paths with the “te” or “latency” constraint type. The TE metric is carried as a TLV within the TE opaque LSA distributed across the IGP area and to the PCE via BGP-LS.The TE metric is used in the CST 5G Transport use case. If no TE metric is defined the local CSPF or PCE will utilize the IGP metric.segment-routing traffic-eng interface TenGigE0/0/0/6 metric 1000IOS-XR SR Flexible Algorithm ConfigurationSegment Routing Flexible Algorithm offers a way to to easily define multiple logical network topologies satisfying a specific network constraint. Flex-Algo definitions must first be configured in each IGP domain on all nodes participating in Flex-Algo. By default, all nodes participate in Algorithm 0, mapping to “use lowest IGP metric” path computation. In the CST design, ABR nodes must have Flex-Algo definitions in both IS-IS instances if an inter-domain path is required.Flex-Algo IS-IS DefinitionEach Flex-Algo is defined on the nodes participating in the Flex-Algo. In this configuration IS-IS is configured to advertise the definition network wide. This is not required on each node in the domain, only a single node needs to advertise the definition, but there is no downside to having each node advertise the definition. In this case we are also defining a link affinity to be used in the 131 Flex-Algo. The same affinity-map must be used on all nodes in the IGP domain. The link affinity is configured under specific interfaces in the IS-IS interface configuration as shown with interface TenGigE0/0/0/20 below. The configuration for 131 is set to exclude links matching the “red” affinity, so any path utilizing Flex-Algo 131 as a constraint will not utilize the TenGigE0/0/0/20 path. The Flex-Algo link affinity is applied to both local and remote interfaces matching the affinity.Also note non-Flex-Algo configuration can utilize link affinities, which are defined under segment-routing->traffic-engineering->interface->affinity.As of CST 4.0, delay is the only metric-type supported. Utilizing the delay metric-type for a Flex-Algo will ensure a path will utilize only the lowest delay path, even if a single destination SID is referenced in the SR-TE path.router isis ACCESS affinity-map red bit-position 0 flex-algo 128 advertise-definition ! flex-algo 129 advertise-definition ! flex-algo 130 metric-type delay advertise-definition ! flex-algo 131 advertise-definition affinity exclude-any red ! ! interface TenGigE0/0/0/20 affinity flex-algo redFlex-Algo Node SID ConfigurationFlex-Algo works by allocating a globally unique node SID referencing the algorithm on each node participating in the Flex-Algo topology. This requires additional Node SID configuration on the Loopback0 interface for each router. The following is an example for a node participating in four different Flex-Algo domains in addition to the default Algo 0 domain, covered by the base Node SID configuration. 
Each SID belongs to the same global SRGB.router isis ACCESS interface Loopback0 address-family ipv4 unicast prefix-sid index 150 prefix-sid algorithm 128 absolute 18003 prefix-sid algorithm 129 absolute 19003 prefix-sid algorithm 130 absolute 20003 prefix-sid algorithm 131 absolute 21003If one inspects the IS-IS database for the nodes, you will see the Flex-Algo SID entries. RP/0/RP0/CPU0#NCS540-A-PE3#show isis database NCS540-A-PE3.00-00 verbose Router Cap# 100.0.1.50 D#0 S#0 Segment Routing# I#1 V#0, SRGB Base# 16000 Range# 8000 SR Local Block# Base# 15000 Range# 1000 Node Maximum SID Depth# Label Imposition# 12 SR Algorithm# Algorithm# 0 Algorithm# 1 Algorithm# 128 Algorithm# 129 Algorithm# 130 Algorithm# 131 Flex-Algo Definition# Algorithm# 128 Metric-Type# 0 Alg-type# 0 Priority# 128 Flex-Algo Definition# Algorithm# 129 Metric-Type# 0 Alg-type# 0 Priority# 128 Flex-Algo Definition# Algorithm# 130 Metric-Type# 1 Alg-type# 0 Priority# 128 Flex-Algo Definition# Algorithm# 131 Metric-Type# 0 Alg-type# 0 Priority# 128 Flex-Algo Exclude-Any Ext Admin Group# 0x00000001IOS-XE Nodes - SR-MPLS TransportSegment Routing MPLS configurationmpls label range 6001 32767 static 16 6000segment-routing mpls ! set-attributes address-family ipv4 sr-label-preferred exit-address-family ! global-block 16000 24999 ! Prefix-SID assignment to loopback 0 configuration connected-prefix-sid-map address-family ipv4 100.0.1.51/32 index 151 range 1 exit-address-family ! Basic IGP protocol (ISIS) with Segment Routing MPLS configurationkey chain ISIS-KEY key 1 key-string cisco accept-lifetime 00#00#00 Jan 1 2018 infinite send-lifetime 00#00#00 Jan 1 2018 infinite!router isis ACCESS net 49.0001.0102.0000.0254.00 is-type level-2-only authentication mode md5 authentication key-chain ISIS-KEY metric-style wide fast-flood 10 set-overload-bit on-startup 120 max-lsp-lifetime 65535 lsp-refresh-interval 65000 spf-interval 5 50 200 prc-interval 5 50 200 lsp-gen-interval 5 5 200 log-adjacency-changes segment-routing mpls segment-routing prefix-sid-map advertise-local TI-LFA FRR configuration fast-reroute per-prefix level-2 all fast-reroute ti-lfa level-2 microloop avoidance protected!interface Loopback0 ip address 100.0.1.51 255.255.255.255 ip router isis ACCESS isis circuit-type level-2-onlyend IS-IS and MPLS interface configurationinterface TenGigabitEthernet0/0/12 mtu 9216 ip address 10.117.151.1 255.255.255.254 ip router isis ACCESS mpls ip isis circuit-type level-2-only isis network point-to-point isis metric 100end MPLS Segment Routing Traffic Engineering (SR-TE)router isis ACCESS mpls traffic-eng router-id Loopback0 mpls traffic-eng level-2 Area Border Routers (ABRs) IPv4/IPv6 route distribution using BGPThe ABR nodes must provide IP reachability for RRs, SR-PCEs and NSO between ISIS-ACCESS and ISIS-CORE IGP domains. One use case is SR Tree-SID, which requires all nodes have a PCEP session to a single SR-PCE.The recommended method to achieve reachability to nodes in the Coredomain from access domain routers is to utilize BGP to advertise the Loopbackaddresses of specific nodes to the ABR nodes, and use either BGP to IGPredistribution or IPv4/IPv6 Unicast BGP between ABR and access nodes todistribute those routes. If unicast BGP is used, the ABR nodes will act asinline Route Reflectors.Reachability to the access routers from the core routes is provided by advertising access domain aggregate routes from each access domain via BGP to core nodes requiring them. 
If the core element such as SR-PCE is a router, SR-MPLS and BGP can be enabled, with the ABRs advertising the aggregates directly. If the element is not a router, then the router it is connected to will receive and advertise BGP prefixes to establish end to end connectivity.The following is an example from one ABR node and one SR-PCE node.Core SR-PCE BGP ConfigurationThe following configuration is for SR-PCE with Loopback 101.0.0.100. 101.0.0.3 and 101.0.0.4 are ABRs for one access domain, 101.0.1.1 and 101.0.1.2 the other. The optional route policies are used as a strict check to make sure only the proper routes are being received.route-policy access1-in if destination in (101.0.1.0/24) then pass else drop endifend-policy!route-policy access2-in if destination in (101.0.2.0/24) then pass else drop endifend-policyrouter bgp 100 nsr bgp router-id 101.0.0.100 bgp redistribute-internal bgp graceful-restart nexthop validation color-extcomm sr-policy nexthop validation color-extcomm disable ibgp policy out enforce-modifications address-family ipv4 unicast network 101.0.0.100/32 ! neighbor 101.0.0.3 remote-as 100 update-source Loopback0 address-family ipv4 unicast route-policy access1-in in ! ! neighbor 101.0.0.4 remote-as 100 update-source Loopback0 address-family ipv4 unicast route-policy access1-in in ! ! neighbor 101.0.1.1 remote-as 100 update-source Loopback0 address-family ipv4 unicast route-policy access2-in in ! ! neighbor 101.0.1.2 remote-as 100 update-source Loopback0 address-family ipv4 unicast route-policy access2-in in ! ! ABR BGP ConfigurationIn this example the ABR node advertises the aggregate 100.0.1.0/24 covering A-PE loopback addresses in the Access-1 IGP domain to the core SR-PCE node. It uses the IPv4 unicast AFI to advertise the SR-PCE Loopback prefix to the A-PE nodes. Route policies are used to restrict the prefixes advertised in both directions.router staticaddress-family ipv4 unicast 100.0.1.0/24 Null0prefix-set ACCESS-PE-PREFIX 100.0.1.0/24end-setprefix-set SRPCE-PREFIX 100.0.0.100/32end-setroute-policy ABR-to-SRPCE if destination in ACCESS-PE-PREFIX then pass else drop endifend-policy!route-policy ABR-to-APE if destination in SRPCE-PREFIX then pass else drop endifend-policy!router bgp 100 nsr bgp router-id 101.0.0.3 bgp redistribute-internal bgp graceful-restart nexthop validation color-extcomm sr-policy nexthop validation color-extcomm disable ibgp policy out enforce-modifications address-family ipv4 unicast network 100.0.1.0/24 ! address-family ipv6 unicast ! neighbor-group BGP-APE remote-as 100 update-source Loopback0 ! address-family ipv4 unicast route-reflector-client route-policy ABR-to-APE out next-hop-self ! address-family ipv6 unicast route-reflector-client next-hop-self ! ! neighbor 101.0.2.52 use neighbor-group BGP-APE ! neighbor 101.0.2.53 use neighbor-group BGP-APE ! neighbor 101.0.0.100 description SR-PCE remote-as 100 update-source Loopback0 address-family ipv4 unicast route-policy ABR-to-SRPCE out next-hop-self ! ! Deprecated Area Border Routers (ABRs) IGP-ISIS Redistribution configuration (IOS-XR)Note the following is for historical reference and has been deprecated; IGP redistribution is not recommended for production deployments.The ABR nodes must provide IP reachability for RRs, SR-PCEs and NSO between ISIS-ACCESS and ISIS-CORE IGP domains. This is done by IP prefix redistribution.
The ABR nodes have static hold-down routes for the block of IP space used in each domain across the network; those static routes are then redistributed into the domains using the redistribute static command with a route-policy. The distance command is used to ensure redistributed routes are not preferred over local IS-IS routes on the opposite ABR. The distance command must be applied to both ABR nodes.router staticaddress-family ipv4 unicast 100.0.0.0/24 Null0 100.0.1.0/24 Null0 100.1.0.0/24 Null0 100.1.1.0/24 Null0prefix-set ACCESS-PCE_SvRR-LOOPBACKS 100.0.1.0/24, 100.1.1.0/24end-setprefix-set RR-LOOPBACKS 100.0.0.0/24, 100.1.0.0/24end-set Redistribute Core SvRR and TvRR loopback into Access domainroute-policy CORE-TO-ACCESS1 if destination in RR-LOOPBACKS then pass else drop endifend-policy!router isis ACCESS address-family ipv4 unicast distance 254 0.0.0.0/0 RR-LOOPBACKS redistribute static route-policy CORE-TO-ACCESS1 Redistribute Access SR-PCE and SvRR loopbacks into CORE domainroute-policy ACCESS1-TO-CORE if destination in ACCESS-PCE_SvRR-LOOPBACKS then pass else drop endif end-policy ! router isis CORE address-family ipv4 unicast distance 254 0.0.0.0/0 ACCESS-PCE_SvRR-LOOPBACKS redistribute static route-policy ACCESS1-TO-CORE Multicast transport using mLDPOverviewThis portion of the implementation guide instructs the user how to configure mLDP end to end across the multi-domain network. Multicast service examples are given in the “Services” section of the implementation guide.mLDP core configurationIn order to use mLDP across the Converged SDN Transport network LDP must first be enabled. There are two mechanisms to enable LDP on physical interfaces across the network: LDP auto-configuration, or manual configuration under the MPLS LDP configuration context. The capabilities statement will ensure LDP unicast FECs are not advertised, only mLDP FECs. Recursive forwarding is required in a multi-domain network. mLDP must be enabled on all participating A-PE, PE, AG, PA, and P routers.LDP base configuration with defined interfacesmpls ldp capabilities sac mldp-only mldp logging notifications address-family ipv4 make-before-break delay 30 forwarding recursive recursive-fec ! ! router-id 100.0.2.53 session protection address-family ipv4 ! interface TenGigE0/0/0/6 ! interface TenGigE0/0/0/7 LDP auto-configurationLDP can automatically be enabled on all IS-IS interfaces with the following configuration in the IS-IS configuration. It is recommended to do this only after configuring all MPLS LDP properties.router isis ACCESS address-family ipv4 unicast segment-routing mpls sr-prefer mpls ldp auto-config G.8275.1 and G.8275.2 PTP (1588v2) timing configurationSummaryThis section contains the base configurations used for both G.8275.1 and G.8275.2 timing. Please see the CST HLD for an overview on timing in general. G.8275.1 is the preferred method for end to end timing if possible since it provides the most accurate clock and has no limitations on interface type used for PTP peers. G.8275.1 to G.8275.2 interworking can be used on edge nodes to provide timing to devices requiring G.8275.2.Enable frequency synchronizationIn order to lock the internal oscillator to a PTP source, frequency synchronization must first be enabled globally.frequency synchronization quality itu-t option 1 clock-interface timing-mode system log selection changes! Optional Synchronous Ethernet configuration (PTP hybrid mode)If the end-to-end devices support SyncE it should be enabled.
SyncE will allow much faster frequency sync and maintain integrity for long periods of time during holdover events. Using SyncE for frequency and PTP for phase is known as “Hybrid” mode. A lower priority is used on the SyncE input (50 for SyncE vs. 100 for PTP).interface TenGigE0/0/0/10 frequency synchronization selection input priority 50 !! PTP G.8275.2 global timing configurationAs of CST 3.0, IOS-XR supports a single PTP timing profile and single clock type in the global PTP configuration. The clock domain should follow the ITU-T guidelines for specific profiles, using a domain of 44 or higher for G.8275.2 clocks.ptp clock domain 60 profile g.8275.2 clock-type T-BC ! frequency priority 100 time-of-day priority 50 log servo events best-master-clock changes ! PTP G.8275.2 interface profile definitionsIt is recommended to use “profiles” defined globally which are then applied to interfaces participating in timing. This helps minimize per-interface timing configuration. It is also recommended to define different profiles for “master” and “slave” interfaces.IPv4 G.8275.2 master profileThe master profile is assigned to interfaces for which the router is acting as a boundary clockptp profile g82752_master_v4 transport ipv4 port state master-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 5 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv6 G.8275.2 master profileThe master profile is assigned to interfaces for which the router is acting as a boundary clockptp profile g82752_master_v6 transport ipv6 port state master-only sync frequency 16 clock operation one-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv4 G.8275.2 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82752_slave_v4 transport ipv4 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! IPv6 G.8275.2 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82752_slave_v6 transport ipv6 port state slave-only sync frequency 16 clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 unicast-grant invalid-request deny delay-request frequency 16 !! PTP G.8275.1 global timing configurationAs of CST 3.0, IOS-XR supports a single PTP timing profile and single clock type in the global PTP configuration. The clock domain should follow the ITU-T guidelines for specific profiles, using a domain below 44 for G.8275.1 clocks.ptpclock domain 24 operation one-step Use one-step for NCS series, two-step for ASR 9000 physical-layer-frequency frequency priority 100 profile g.8275.1 clock-type T-BC log servo events best-master-clock changes IPv6 G.8275.1 slave profileThe slave profile is assigned to interfaces for which the router is acting as a slave to another master clockptp profile g82751_slave port state slave-only clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step announce timeout 10 announce interval 1 delay-request frequency 16 multicast transport ethernet !!
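Before the profiles are applied to interfaces later in this section, it can be useful to confirm which frequency and phase sources the node has selected. The following is a minimal verification sketch only, not part of the validated CST configuration; the exact show commands and their output vary by IOS-XR release and platform.
show frequency synchronization selection
show ptp interfaces brief
show ptp foreign-masters
In hybrid mode the frequency source is expected to be the SyncE input (priority 50 above) and the phase source the PTP master learned on the slave interface.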
IPv6 G.8275.1 master profileThe master profile is assigned to interfaces for which the router is acting as a master to slave devicesptp profile g82751_master port state master-only clock operation one-step <-- Note the NCS series should be configured with one-step, ASR9000 with two-step sync frequency 16 announce timeout 10 announce interval 1 delay-request frequency 16 multicast transport ethernet !! Application of PTP profile to physical interfaceNote: In CST 3.0 PTP may only be enabled on physical interfaces. G.8275.1 operates at L2 and supports PTP across Bundle member links and interfaces part of a bridge domain. G.8275.2 operates at L3 and does not support Bundle interfaces.G.8275.2 interface configurationThis example is of a slave device using a master of 2405:10:23:253::.interface TenGigE0/0/0/6 ptp profile g82752_slave_v6 master ipv6 2405:10:23:253:: ! ! G.8275.1 interface configurationinterface TenGigE0/0/0/6 ptp profile g82751_slave ! ! G.8275.1 and G.8275.2 Multi-Profile and InterworkingIn CST 4.0 and IOS-XR 7.2.2 PTP Multi-Profile is supported, along with the ability to interwork between G.8275.1 and G.8275.2 on the same router. This allows a node to run one timing profile to its upstream GM peer and supply a timing reference to downstream peers using different profiles. It is recommended to use G.8275.1 as the primary profile across the network, and G.8275.2 to peers that only support the G.8275.2 profile, such as Remote PHY Devices.The interworking feature is enabled on the client interface which has a different profile from the primary node profile. The domain must be specified along with the interop mode.G.8275.1 Primary to G.8275.2 Configurationinterface TenGigE0/0/0/5 ptp interop g.8275.2 domain 60 ! transport ipv4 port state master-only G.8275.2 Primary to G.8275.1 Configurationinterface TenGigE0/0/0/5 ptp interop g.8275.1 domain 24 ! transport ethernet port state master-only Segment Routing Path Computation Element (SR-PCE) configurationrouter static address-family ipv4 unicast 0.0.0.0/1 Null0router bgp 100 nsr bgp router-id 100.0.0.100 bgp graceful-restart graceful-reset bgp graceful-restart ibgp policy out enforce-modifications address-family link-state link-state ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR !!pce address ipv4 100.100.100.1 rest user rest_user password encrypted 00141215174C04140B ! authentication basic ! state-sync ipv4 100.100.100.2 peer-filter ipv4 access-list pe-routers! BGP - Services (sRR) and Transport (tRR) route reflector configurationServices Route Reflector (sRR) configurationIn the CST validation an sRR is used to reflect all service routes. In a production network each service could be allocated its own sRR based on resiliency and scale demands. In CST 5.0 (XR 7.5.2) and higher versions we will utilize the BGP soft next-hop validation feature to accept service prefixes without a BGP next-hop residing in the RIB.router bgp 100 nsr bgp router-id 100.0.0.200 bgp graceful-restart nexthop validation color-extcomm disable ibgp policy out enforce-modifications address-family vpnv4 unicast nexthop trigger-delay critical 10 additional-paths receive additional-paths send ! address-family vpnv6 unicast nexthop trigger-delay critical 10 additional-paths receive additional-paths send retain route-target all ! address-family l2vpn evpn additional-paths receive additional-paths send !
address-family ipv4 mvpn nexthop trigger-delay critical 10 soft-reconfiguration inbound always ! address-family ipv6 mvpn nexthop trigger-delay critical 10 soft-reconfiguration inbound always ! neighbor-group SvRR-Client remote-as 100 bfd fast-detect bfd minimum-interval 3 update-source Loopback0 address-family l2vpn evpn route-reflector-client ! address-family vpnv4 unicast route-reflector-client ! address-family vpnv6 unicast route-reflector-client ! address-family ipv4 mvpn route-reflector-client ! address-family ipv6 mvpn route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group SvRR-Client !! Transport Route Reflector (tRR) configurationIn CST 5.0 (XR 7.5.2) and higher versions we will utilize the BGP soft next-hopvalidation feature to accept BGP-LS prefixes without a BGPnext-hop residing in the RIB.router bgp 100 nsr bgp router-id 100.0.0.10 bgp graceful-restart nexthop validation color-extcomm disable ibgp policy out enforce-modifications address-family link-state link-state additional-paths receive additional-paths send ! neighbor-group RRC remote-as 100 update-source Loopback0 address-family link-state link-state route-reflector-client ! ! neighbor 100.0.0.1 use neighbor-group RRC ! neighbor 100.0.0.2 use neighbor-group RRC! BGP – Provider Edge Routers (A-PEx and PEx) to service RREach PE router is configured with BGP sessions to service route-reflectors for advertising VPN service routes across the inter-domain network.IOS-XR configurationIn CST 5.0 (XR 7.5.2) and higher versions we will utilize the BGP soft next-hopvalidation feature. PE nodes will use the computed ODN SR-TE Policy as avalidation criteria for the BGP path. If a SR-TE Policy can be computed eitherlocally or by SR-PCE, the path will be active, otherwise the path will not be installed.router bgp 100 nsr bgp router-id 100.0.1.50 bgp graceful-restart graceful-reset bgp graceful-restart nexthop validation color-extcomm sr-policy ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! address-family l2vpn evpn ! neighbor-group SvRR remote-as 100 bfd fast-detect bfd minimum-interval 3 update-source Loopback0 address-family vpnv4 unicast soft-reconfiguration inbound always ! address-family vpnv6 unicast soft-reconfiguration inbound always ! address-family ipv4 mvpn soft-reconfiguration inbound always ! address-family ipv6 mvpn soft-reconfiguration inbound always ! address-family l2vpn evpn soft-reconfiguration inbound always ! ! neighbor 100.0.1.201 use neighbor-group SvRR ! ! IOS-XE configurationrouter bgp 100 bgp router-id 100.0.1.51 bgp log-neighbor-changes no bgp default ipv4-unicast neighbor SvRR peer-group neighbor SvRR remote-as 100 neighbor SvRR update-source Loopback0 neighbor 100.0.1.201 peer-group SvRR ! address-family ipv4 exit-address-family ! address-family vpnv4 neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! address-family l2vpn evpn neighbor SvRR send-community both neighbor SvRR next-hop-self neighbor 100.0.1.201 activate exit-address-family ! BGP-LU co-existence BGP configurationCST 3.0 introduced co-existence between services using BGP-LU and SR endpoints. If you are using SR and BGP-LU within the same domain it requires using BGP-SR in order to resolve prefixes correctly on the each ABR. BGP-SR uses a new BGP attribute attached to the BGP-LU prefix to convey the SR prefix-sid index end to end across the network. 
Using the same prefix-sid index both within the SR-MPLS IGP domain and across the BGP-LU network simplifies the network from an operational perspective since the path to an end node can always be identified by that SID.It is recommended to enable the BGP-SR configuration when enabling SR on the PE node. See the PE configuration below for an example of this configuration.Segment Routing Global Block ConfigurationThe BGP process must know about the SRGB in order to properly allocate local BGP-SR labels when receiving a BGP-LU prefix with a BGP-SR index community. This is done via the following configuration. If a SRGB is defined under the IGP it must match the global SRGB value. The IGP will inherit this SRGB value if none is previously defined.segment-routing global-block 32000 64000 !! Boundary node configurationThe following configuration is necessary on all domain boundary nodes. Note the ibgp policy out enforce-modifications command is required to change the next-hop on reflected IBGP routes.router bgp 100 ibgp policy out enforce-modifications neighbor-group BGP-LU-PE remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast soft-reconfiguration inbound always route-reflector-client next-hop-self ! ! neighbor-group BGP-LU-BORDER remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast soft-reconfiguration inbound always route-reflector-client next-hop-self ! ! neighbor 100.0.2.53 use neighbor-group BGP-LU-PE ! neighbor 100.0.2.52 use neighbor-group BGP-LU-PE ! neighbor 100.0.0.1 use neighbor-group BGP-LU-BORDER ! neighbor 100.0.0.2 use neighbor-group BGP-LU-BORDER ! ! PE node configurationThe following configuration is necessary on all domain PE nodes participating in BGP-LU/BGP-SR. The label-index set must match the index of the Loopback addresses being advertised into BGP. This example shows a single Loopback address being advertised into BGP.route-policy LOOPBACK-INTO-BGP-LU($SID-LOOPBACK0) set label-index $SID-LOOPBACK0 set aigp-metric igp-costend-policy!router bgp 100 address-family ipv4 unicast network 100.0.2.53/32 route-policy LOOPBACK-INTO-BGP-LU(153) ! neighbor-group BGP-LU-BORDER remote-as 100 update-source Loopback0 address-family ipv4 labeled-unicast ! ! neighbor 100.0.0.3 use neighbor-group BGP-LU-BORDER ! neighbor 100.0.0.4 use neighbor-group BGP-LU-BORDER ! Area Border Routers (ABRs) IGP topology distributionNext network diagram# “BGP-LS Topology Distribution” shows how AreaBorder Routers (ABRs) distribute IGP network topology from ISIS ACCESSand ISIS CORE to Transport Route-Reflectors (tRRs). tRRs then reflecttopology to Segment Routing Path Computation Element (SR-PCEs). Each SR-PCE has full visibility of the entire inter-domain network.Note# Each IS-IS process in the network requires a unique instance-id to identify itself to the PCE.Figure 5# BGP-LS Topology Distributionrouter isis ACCESS **distribute link-state instance-id 101** net 49.0001.0101.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0 !! router isis CORE **distribute link-state instance-id 100** net 49.0001.0100.0000.0001.00 address-family ipv4 unicast mpls traffic-eng router-id Loopback0 !! router bgp 100 **address-family link-state link-state** ! neighbor-group TvRR remote-as 100 update-source Loopback0 address-family link-state link-state ! neighbor 100.0.0.10 use neighbor-group TvRR ! neighbor 100.1.0.10 use neighbor-group TvRR ! 
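With the distribute link-state configuration above on the ABRs and the link-state address-family on the tRR sessions, each SR-PCE should learn both the ACCESS (instance-id 101) and CORE (instance-id 100) topologies. As a hedged verification sketch (not part of the validated configuration; command output varies by release):
show bgp link-state link-state summary
show pce ipv4 peer
show pce ipv4 topology
The first command, run on an ABR or tRR, should show the BGP-LS sessions established with prefixes exchanged; the last two, run on the SR-PCE, should show its PCEP peers and nodes/links from both IGP instances.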
Segment Routing Traffic Engineering (SR-TE) and Services IntegrationThis section shows how to integrate Traffic Engineering (SR-TE) with services.ODN is configured by first defining a global ODN color associated with specific SR Policy constraints. The color and BGP next-hop address on the service route will be used to dynamically instantiate an SR Policy to the remote VPN endpoint.On Demand Next-Hop (ODN) configuration – IOS-XRThe following is an example of the elements needed in addition to the base SR configuration. An on-demand policy must be created matching a color to set the attributes of the SR-TE Policy. ODN does not require PCEP, but PCEP is required for inter-domain path computation.On-Demand Route PoliciesColoring service routes requires routing policies to set a specific extended community on those routes and apply the policy during the import and export of the routes. Coloring can be performed in a policy at the BGP neighbor level or at the individual service level. The following example shows global level coloring; however, it is recommended for granularity and ease of management to color routes at the service level.In CST 5.0 (XR 7.5.2) or higher, ODN policies can be applied on import or export for L3VPN prefixes. For EVPN Type 1/3 prefixes, the policy must be applied on export.Community Set and Routing Policy Definitionextcommunity-set opaque BLUE 100end-setroute-policy ODN set extcommunity color BLUEend-policyroute-policy c2001 if evpn-route-type is 1 then set extcommunity color c2001 elseif evpn-route-type is 3 then set extcommunity color c2002 endifend-policy Neighbor level applicationrouter bgp 100 neighbor-group SVRR-EVPN address-family l2vpn evpn route-policy ODN_EVPN out Service level applicationvrf ODN-L3VPN rd 100:1 address-family ipv4 unicast import route-target 100:1 ! export route-target export route-policy ODN-L3VPN-OUT 100:1!evpn evi 2001 bgp route-policy export c2001 segment-routing traffic-eng logging policy status ! on-demand color 100 dynamic pce ! metric type igp ! ! ! pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 ! pce address ipv4 100.1.1.101 On Demand Next-Hop (ODN) configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-allmpls traffic-eng auto-tunnel p2p config unnumbered-interface Loopback0mpls traffic-eng auto-tunnel p2p tunnel-num min 1000 max 5000!mpls traffic-eng lsp attributes L3VPN-SRTE path-selection metric igp pce!ip community-list 1 permit 9999!route-map L3VPN-ODN-TE-INIT permit 10 match community 1 set attribute-set L3VPN-SRTE!route-map L3VPN-SR-ODN-Mark-Comm permit 10 match ip address L3VPN-ODN-Prefixes set community 9999 !!router bgp 100 address-family vpnv4 neighbor SvRR send-community both neighbor SvRR route-map L3VPN-ODN-TE-INIT in neighbor SvRR route-map L3VPN-SR-ODN-Mark-Comm out SR-PCE configuration – IOS-XRsegment-routing traffic-eng pcc source-address ipv4 100.0.1.50 pce address ipv4 100.0.1.101 precedence 100 ! pce address ipv4 100.1.1.101 precedence 200 ! report-all timers delegation-timeout 10 timers deadtimer 60 timers initiated state 15 timers initiated orphan 10 ! ! SR-PCE configuration – IOS-XEmpls traffic-eng tunnelsmpls traffic-eng pcc peer 100.0.1.101 source 100.0.1.51mpls traffic-eng pcc peer 100.0.1.111 source 100.0.1.51mpls traffic-eng pcc report-all SR-TE Policy ConfigurationAt the foundation of CST is the use of Segment Routing Traffic Engineering Policies.
SR-TE allows providers to create end-to-end traffic paths with engineered constraints to achieve an SLA objective. SR-TE Policies are either dynamically created by ODN (see the ODN section) or configured by the user on the head-end node.SR-TE Color and EndpointThe components uniquely identifying an SR-TE Policy to a destination PE node are its endpoint and color. The endpoint is the destination node loopback address. Note the endpoint address should not be an anycast address. The color is a 32-bit value which should have an SLA meaning to the network. The color allows for multiple SR-TE Policies to exist between a pair of nodes, each one with its own set of metrics and constraints.SR-TE Candidate Paths Each SR-TE Policy configured on a node must have at least one candidate path defined. If multiple candidate paths are defined, only one is active at any one time. The candidate path with the higher preference value is preferred over candidate paths with a lower preference value. The candidate path configuration specifies whether the path is dynamic or uses an explicit segment list. Within the dynamic configuration one can specify whether to use a PCE or not, the metric type used in the path computation (IGP metric, latency, TE metric, hop count), and the additional constraints placed on the path (link affinities, flex-algo constraints, or a cumulative metric of type IGP metric, latency, TE metric, or hop count). There is a default candidate path with a preference of 200 using head-end IGP path computation. Each candidate path can have multiple explicit segment lists defined with a bandwidth weight value to load-balance traffic across multiple explicit paths.Service to SR-TE Policy Forwarding - Per-DestinationService traffic can be forwarded over SR-TE Policies in the CST design using per-destination automated steering. Per-destination steering utilizes two BGP components of the service route to forward traffic to a matching SR Policy: a color extended community attached to the service route matching the SR Policy color, and the BGP next-hop address of the service route matching the endpoint of the SR Policy.Service to SR-TE Policy Forwarding - Per-FlowService traffic can also be forwarded over SR-TE Policies in the CST design using per-flow automated steering.Per-flow automated steering uses the same BGP criteria as per-destination steering but also uses the CoS of the ingress packet to determine the proper SR Policy to steer traffic over.SR-TE and ODN Configuration ExamplesThe following examples show SR-TE policies using persistent device configuration and the ODN policies to dynamically create the same SR Policies.SR Policy using IGP metric, head-end computationThe local PE device will compute a path using the lowest cumulative IGP metric path to 100.0.1.50. Note in the multi-domain CST design, this computation will fail to nodes not found within the same IS-IS domain as the PE.segment-routing traffic-eng policy GREEN-PE3-24 color 1024 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic ! metric type igpsegment-routing traffic-eng on-demand color 1024 dynamic pcep ! anycast-sid-inclusion ! sid-algorithm 128 !PCE delegated SR Policy using lowest IGP metricThis policy will request a path from the configured primary PCE with the lowest cumulative IGP metric to the endpoint 100.0.1.50segment-routing traffic-eng policy GREEN-PE3-24 color 1024 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep !
metric type igpPCE delegated SR Policy using lowest latency metricThis policy will request a path from the configured primary PCE with the lowest cumulative latency to the endpoint 100.0.1.50. As covered in the performance-measurement section, the per-link latency metric value used will be the dynamic/static PM value, a configured TE metric value, or the IGP metric.segment-routing traffic-eng policy GREEN-PE3-24 color 1024 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type latency segment-routing traffic-eng on-demand color 1024 dynamic pcep ! metric type latency !PCE delegated SR Policy including Anycast SIDsAnycast SIDs provide redundancy to hops in the SR-TE path. 1+N nodes share thesame Loopback address and Node-SID. Traffic with the Anycast SID in the SID listwill route to the closest node with the SID assigned based on IGP cost. The“anycast-sid-inclusion” command is required for the PCE or local computation toprefer Anycast SIDs when computing the end to end path.policy Anycast-APE3-1 color 30001 end-point ipv4 101.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type igp ! anycast-sid-inclusionsegment-routing traffic-eng on-demand color 30001 dynamic pcep ! anycast-sid-inclusion !PCE delegated SR Policy using specific Flexible AlgorithmPlease see the Flex-Algo section for more details on SR Flexible Algorithms. Thefollowing SR-TE policy will restrict path computation to links and nodes belonging to algo128, using the lowest IGP metric to compute the path.policy FA128-APE3-1 color 77801 end-point ipv4 101.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type igp ! ! constraints segments sid-algorithm 128on-demand color 77801 dynamic pcep ! metric type igp ! ! constraints segments sid-algorithm 128SR Policy using explicit segment listThis policy does not perform any path computation, it will utilize thestatically defined segment lists as the forwarding path across the network. Thenode does however check the validity of the node segments in the list. Each nodeSID in the segment list can be defined by either IP address or SID. The fullpath to the egress node must be defined in the list, but you do not need todefine every node explicitly in the path. If you want the path to take aspecific link the correct node and adjacency SID must be defined in the list.Multiple explicit paths can be defined with a weight assigned, the ratio ofweights is used to balance traffic across each explicit path.segment-routing traffic-eng segment-list anycast-path index 1 mpls label 17034 index 2 mpls label 16150 ! policy anycast-path-ape3 color 9999 end-point ipv4 100.0.1.50 candidate-paths preference 1 explicit segment-list anycast-pathPer-Flow Segment Routing Configuration (NCS Platforms)The following configuration is required on the NCS 5500 / 5700 platforms toallocate the PFP Binding SID (BSID) from a specific label block.mpls label blocks\u000b block name sample-pfp-bsid-block type pfp start 40000 end 41000 client any Per-Flow QoS ConfigurationThe Forward Class must be set in the ingress QoS policy so traffic is steeredinto the correct child Per-Destination Policy.policy-map per-flow-steering class MatchIPP1 set forward-class 1! class MatchIPP2 set forward-class 2! class MatchIPv4_SRC set forward-class 3 ! 
class MatchIPv6_SRC set forward-class 4 end-policy-map!class-map match-any MatchIPP1 match precedence 1end-class-map!class-map match-any MatchIPP2 match precedence 2end-class-map!class-map match-any MatchIPv4_SRC match access-group ipv4 ipv4_sourcesend-class-map!class-map match-any MatchIPv6_SRC match access-group ipv6 ipv6_sourcesend-class-mapipv4 access-list ipv4_sources 10 permit ipv4 100.0.0.0/24 any 20 permit ipv4 100.0.1.0/24 any !ipv6 access-list ipv6_sources 10 permit ipv6 2001:100::/64 any 20 permit ipv6 2001:200::/64 any Per-Flow Policy ConfigurationThis example shows both the child Per-Destination Policies as well as the parent Per-Flow Policy. Each Forward-Class is mapped to the color of the child policy. The default Forward Class is meant to catch traffic not matching a configured Forward Class.segment-routing traffic-eng policy PERFLOW color 100 end-point ipv4 1.1.1.4 candidate-paths preference 100 per-flow forward-class 0 color 10 forward-class 1 color 20 forward-class 2 color 30 forward-class 3 color 40 forward-class 4 color 50 forward-class default 0 ! policy pe1_fc0 color 10 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFL4-PE1-FC1 ! policy pe1_fc1 color 20 end-point ipv4 192.168.11.1 candidate-paths preference 150 dynamic ! policy pe1_fc2 color 30 end-point ipv4 192.168.11.1 candidate-paths preference 150 explicit segment-list PFL4-PE1-FC2 ! policy pe1_fc3 color 40 end-point ipv4 192.168.11.1 candidate-paths preference 150 dynamic On-Demand Next-Hop Per-Flow ConfigurationThe creation of the SR-TE Policies can be fully automated using ODN. ODN is used to create the child Per-Destination Policies as well as the Per-Flow Policy.segment-routing traffic-eng on-demand color 10 dynamic metric type igp ! ! ! on-demand color 20 dynamic sid-algorithm 128 ! ! on-demand color 30 dynamic metric type te ! ! on-demand color 40 dynamic metric type igp ! ! on-demand color 50 dynamic metric type latency ! ! on-demand color 100 per-flow forward-class 0 color 10 forward-class 1 color 20 forward-class 2 color 30 forward-class 3 color 40 forward-class 4 color 50 QoS ImplementationSummaryPlease see the CST 3.0 HLD for in-depth information on design choices.Core QoS configurationThe core QoS policies defined for CST 3.0 utilize priority levels, with no bandwidth guarantees per traffic class. In a production network it is recommended to analyze traffic flows and determine an appropriate BW guarantee per traffic class. The core QoS uses four classes. Note the “video” class uses priority level 6 since only levels 6 and 7 are supported for high priority multicast.
| Traffic Type | Priority Level | Core EXP Marking |
| --- | --- | --- |
| Network Control | 1 | 6 |
| Voice | 2 | 5 |
| High Priority | 3 | 4 |
| Video | 6 | 2 |
| Default | 0 | 0 |
Class maps used in QoS policiesClass maps are used within a policy map to match packet criteria or internal QoS markings like traffic-class or qos-group.class-map match-any match-ef-exp5 description High priority, EF match dscp 46 match mpls experimental topmost 5 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 match mpls experimental topmost 4 end-class-map!class-map match-any match-video-cs4-exp2 description Video match dscp 32 match mpls experimental topmost 2 end-class-map!class-map match-any match-cs6-exp6 description Highest priority control-plane traffic match dscp cs6 match mpls experimental topmost 6 end-class-map!class-map match-any match-qos-group-1 match qos-group 1 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-map!class-map match-any match-qos-group-6 match qos-group 6 end-class-map!class-map match-any match-traffic-class-1 description "Match highest priority traffic-class 1" match traffic-class 1 end-class-map!class-map match-any match-traffic-class-2 description "Match high priority traffic-class 2" match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description "Match medium traffic-class 3" match traffic-class 3 end-class-map!class-map match-any match-traffic-class-6 description "Match video traffic-class 6" match traffic-class 6 end-class-map Core ingress classifier policypolicy-map core-ingress-classifier class match-cs6-exp6 set traffic-class 1 ! class match-ef-exp5 set traffic-class 2 ! class match-cs5-exp4 set traffic-class 3 ! class match-video-cs4-exp2 set traffic-class 6 ! class class-default set mpls experimental topmost 0 set traffic-class 0 set dscp 0 ! end-policy-map! Core egress queueing mappolicy-map core-egress-queuing class match-traffic-class-2 priority level 2 queue-limit 100 us ! class match-traffic-class-3 priority level 3 queue-limit 500 us ! class match-traffic-class-6 priority level 6 queue-limit 500 us ! class match-traffic-class-1 priority level 1 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map! Core egress MPLS EXP marking mapThe following policy must be applied for PE devices with MPLS-based VPN services in order for service traffic classified in a specific QoS Group to be marked. VLAN-based P2P L2VPN services will by default inspect the incoming 802.1p bits and copy those to the egress MPLS EXP if no specific ingress policy overrides that behavior. Note the EXP can be set in either an ingress or egress QoS policy. This QoS example sets the EXP via the egress map.policy-map core-egress-exp-marking class match-qos-group-1 set mpls experimental imposition 6 ! class match-qos-group-2 set mpls experimental imposition 5 ! class match-qos-group-3 set mpls experimental imposition 4 ! class match-qos-group-6 set mpls experimental imposition 2 ! class class-default set mpls experimental imposition 0 ! end-policy-map! H-QoS configurationEnabling H-QoS on NCS 540 and NCS 5500Enabling H-QoS on the NCS platforms requires the following global command and requires a reload of the device.hw-module profile qos hqos-enable Example H-QoS policy for 5G servicesThe following H-QoS policy represents an example QoS policy reserving 5Gbps on a sub-interface. On ingress each child class is policed to a certain percentage of the 5Gbps policer.
In the egress queuing policy, shaping is used with guaranteed each class a certain amount of egress bandwidth, with high priority traffic being serviced in a low-latency queue (LLQ).Class maps used in ingress H-QoS policiesclass-map match-any edge-hqos-2-in match dscp 46 end-class-map!class-map match-any edge-hqos-3-in match dscp 40 end-class-map!class-map match-any edge-hqos-6-in match dscp 32 end-class-map Parent ingress QoS policypolicy-map hqos-ingress-parent-5g class class-default service-policy hqos-ingress-child-policer police rate 5 gbps ! ! end-policy-map H-QoS ingress child policiespolicy-map hqos-ingress-child-policer class edge-hqos-2-in set traffic-class 2 police rate percent 10 ! ! class edge-hqos-3-in set traffic-class 3 police rate percent 30 ! ! class edge-hqos-6-in set traffic-class 6 police rate percent 30 ! ! class class-default set traffic-class 0 set dscp 0 police rate percent 100 ! ! end-policy-map Egress H-QoS parent policy (Priority levels)policy-map hqos-egress-parent-4g-priority class class-default service-policy hqos-egress-child-priority shape average 4 gbps ! end-policy-map! Egress H-QoS child using priority onlyIn this policy all classes can access 100% of the bandwidth, queues are services based on priority level. The lower priority level has preference.policy-map hqos-egress-child-priority class match-traffic-class-2 shape average percent 100 priority level 2 ! class match-traffic-class-3 shape average percent 100 priority level 3 ! class match-traffic-class-6 priority level 4 shape average percent 100 ! class class-default ! end-policy-map Egress H-QoS child using reserved bandwidthIn this policy each class is reserved a certain percentage of bandwidth. Each class may utilize up to 100% of the bandwidth, if traffic exceeds the guaranteed bandwidth it is eligible for drop.policy-map hqos-egress-child-bw class match-traffic-class-2 bandwidth remaining percent 30 ! class match-traffic-class-3 bandwidth remaining percent 30 ! class match-traffic-class-6 bandwidth remaining percent 30 ! class class-default bandwidth remaining percent 10 ! end-policy-map Egress H-QoS child using shapingIn this policy each class is shaped to a defined amount and cannot exceed the defined bandwidth.policy-map hqos-egress-child-shaping class match-traffic-class-2 shape average percent 30 ! class match-traffic-class-3 shape average percent 30 ! class match-traffic-class-6 shape average percent 30 ! class class-default shape average percent 10 ! end-policy-map! Support for Time Sensitive Networking in N540-FH-CSR-SYS and N540-FH-AGG-SYSThe Fronthaul family of NCS 540 routers support frame preemption based on theIEEE 802.1Qbu-2016 and Time Sensitive Networking (TSN) standards.Time Sensitive Networking (TSN) is a set of IEEE standards that addresses thetiming-critical aspect of signal flow in a packet switched Ethernet network toensure deterministic operation. TSN operates at the Ethernet layer on physicalinterfaces. 
Frames marked with a specific QoS class (typically 7 in a device with classes 0-7) qualify as express traffic, while frames in all other classes except control plane traffic are marked as preemptable traffic.This allows critical signaling traffic to traverse a device as quickly as possible without having to wait for lower priority frames before being transmitted on the wire.Please see the TSN configuration guide for NCS 540 Fronthaul routers at https://www.cisco.com/c/en/us/td/docs/iosxr/ncs5xx/fronthaul/b-fronthaul-config-guide-ncs540-fh/m-fh-tsn-ncs540.pdfTime Sensitive Networking Configurationclass-map match-any express-traffic match cos 7class-map match-any preemptable-traffic match cos 2class-map match-any express-class match traffic-class 7 class-map match-any preemptable-class match traffic-class 2 policy-map mark-traffic class express-traffic set traffic-class 7 class preemptable-traffic set traffic-class 2 policy-map tsn-policy class express-class priority level 1 class preemptable-class priority level 2 class best-effort bandwidth percent 50 Ingress Interfaceinterface TenGigabitEthernet0/0/0/1 ip address 14.0.0.1 255.255.255.0 service-policy input mark-traffic Egress Interfaceinterface TenGigabitEthernet0/0/0/0 ip address 12.0.0.1 255.255.255.0 service-policy output tsn-policy frame-preemption ServicesEnd-To-End VPN ServicesFigure 6: End-To-End Services TableEnd-To-End VPN Services Data PlaneFigure 10: End-To-End Services Data PlaneL3VPN MP-BGP VPNv4 On-Demand Next-HopFigure 7: L3VPN MP-BGP VPNv4 On-Demand Next-Hop Control PlaneAccess Routers: Cisco ASR920 IOS-XE and NCS540 IOS-XR Operator: New VPNv4 instance via CLI or NSO Access Router: Advertises/receives VPNv4 routes to/from Services Route-Reflector (sRR) Access Router: Request SR-PCE to provide path (shortest IGP metric) to remote access router SR-PCE: Computes and provides the path to remote router(s) Access Router: Programs Segment Routing Traffic Engineering (SRTE) Policy to reach remote access router Please refer to “On Demand Next-Hop (ODN)” sections for initial ODN configuration.Access Router Service Provisioning (IOS-XR)ODN route-policy configurationextcommunity-set opaque ODN-GREEN 100end-setroute-policy ODN-L3VPN-OUT set extcommunity color ODN-GREEN passend-policy VRF definition configurationvrf ODN-L3VPN rd 100:1 address-family ipv4 unicast import route-target 100:1 ! export route-target export route-policy ODN-L3VPN-OUT 100:1 ! ! address-family ipv6 unicast import route-target 100:1 ! export route-target export route-policy ODN-L3VPN-OUT 100:1 ! ! VRF Interface configurationinterface TenGigE0/0/0/23.2000 mtu 9216 vrf ODN-L3VPN ipv4 address 172.106.1.1 255.255.255.0 encapsulation dot1q 2000 BGP VRF configuration with static/connected onlyrouter bgp 100 vrf VRF-MLDP rd auto address-family ipv4 unicast redistribute connected redistribute static ! address-family ipv6 unicast redistribute connected redistribute static !
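As a hedged verification sketch for this service (not part of the validated configuration above; output varies by release), once VPNv4 routes carrying the ODN-GREEN color (100) are received from the sRR, the access router should instantiate an on-demand SR-TE policy toward the remote PE and steer the VRF routes over it:
show segment-routing traffic-eng policy color 100
show bgp vrf ODN-L3VPN ipv4 unicast
show route vrf ODN-L3VPN ipv4
The policy should appear with color 100 and the remote PE loopback as its endpoint, and the VRF routes should resolve via the automatically created policy rather than the plain IGP/SR-MPLS path.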
Access Router Service Provisioning (IOS-XE)VRF definition configurationvrf definition L3VPN-SRODN-1 rd 100#100 route-target export 100#100 route-target import 100#100 address-family ipv4 exit-address-family VRF Interface configurationinterface GigabitEthernet0/0/2 mtu 9216 vrf forwarding L3VPN-SRODN-1 ip address 10.5.1.1 255.255.255.0 negotiation autoend BGP VRF configuration Static & BGP neighborStatic routing configurationrouter bgp 100 address-family ipv4 vrf L3VPN-SRODN-1 redistribute connected exit-address-family BGP neighbor configurationrouter bgp 100 neighbor Customer-1 peer-group neighbor Customer-1 remote-as 200 neighbor 10.10.10.1 peer-group Customer-1 address-family ipv4 vrf L3VPN-SRODN-2 neighbor 10.10.10.1 activate exit-address-family L2VPN Single-Homed EVPN-VPWS On-Demand Next-HopFigure 8# L2VPN Single-Homed EVPN-VPWS On-Demand Next-Hop Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR Operator# New EVPN-VPWS instance via CLI or NSO Access Router# Advertises/receives EVPN-VPWS instance to/fromServices Route-Reflector (sRR) Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Note# Please refer to On Demand Next-Hop (ODN) – IOS-XR section for initial ODN configuration. The correct EVPN L2VPN routes must be advertised with a specific color ext-community to trigger dynamic SR Policy instantiation.Access Router Service Provisioning (IOS-XR)#Port based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 1000 target 1 source 1 interface TenGigE0/0/0/5 l2transport VLAN Based service configurationl2vpn xconnect group evpn_vpws p2p odn-1 neighbor evpn evi 1000 target 1 source 1 !! interface TenGigE0/0/0/5.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric! L2VPN Static Pseudowire (PW) – Preferred Path (PCEP)Figure 9# L2VPN Static Pseudowire (PW) – Preferred Path (PCEP) ControlPlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Request SR-PCE to provide path (shortest IGP metric)to remote access router SR-PCE# Computes and provides the path to remote router(s) Access Router# Programs Segment Routing Traffic Engineering(SRTE) Policy to reach remote access router Access Router Service Provisioning (IOS-XR)#Note# EVPN VPWS dual homing is not supported when using an SR-TE preferred path.Note# In IOS-XR 6.6.3 the SR Policy used as the preferred path must be referenced by its generated name and not the configured policy name. This requires first issuing the commandDefine SR Policy traffic-eng policy GREEN-PE3-1 color 1001 end-point ipv4 100.0.1.50 candidate-paths preference 1 dynamic pcep ! metric type igp Determine auto-configured policy name The auto-configured policy name will be persistant and must be used as a reference in the L2VPN preferred-path configuration.RP/0/RP0/CPU0#A-PE8#show segment-routing traffic-eng policy candidate-path name GREEN-PE3-1   SR-TE policy database Color# 1001, End-point# 100.0.1.50 Name# srte_c_1001_ep_100.0.1.50 Port Based Service configurationinterface TenGigE0/0/0/15 l2transport ! ! l2vpn pw-class static-pw-class-PE3 encapsulation mpls control-word preferred-path sr-te policy srte_c_1001_ep_100.0.1.50 ! ! ! 
p2p Static-PW-to-PE3-1 interface TenGigE0/0/0/15 neighbor ipv4 100.0.0.3 pw-id 1000 mpls static label local 1000 remote 1000 pw-class static-pw-class-PE3 VLAN Based Service configurationinterface TenGigE0/0/0/5.1001 l2transport encapsulation dot1q 1001 rewrite ingress tag pop 1 symmetric ! ! l2vpn pw-class static-pw-class-PE3 encapsulation mpls control-word preferred-path sr-te policy srte_c_1001_ep_100.0.1.50 p2p Static-PW-to-PE7-2 interface TenGigE0/0/0/5.1001 neighbor ipv4 100.0.0.3 pw-id 1001 mpls static label local 1001 remote 1001 pw-class static-pw-class-PE3 Access Router Service Provisioning (IOS-XE):Port Based service with Static OAM configurationinterface GigabitEthernet0/0/1 mtu 9216 no ip address negotiation auto no keepalive service instance 10 ethernet encapsulation default xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! pseudowire-static-oam class static-oam timeout refresh send 10 ttl 255 ! ! ! pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 status protocol notification static static-oam ! VLAN Based Service configurationinterface GigabitEthernet0/0/1 no ip address negotiation auto service instance 1 ethernet Static-VPWS-EVC encapsulation dot1q 10 rewrite ingress tag pop 1 symmetric xconnect 100.0.2.54 100 encapsulation mpls manual pw-class mpls mpls label 100 100 no mpls control-word ! ! ! pseudowire-class mpls encapsulation mpls no control-word protocol none preferred-path interface Tunnel1 L2VPN EVPN E-TreeNote: ODN support for EVPN E-Tree is supported on ASR9K only in CST 3.5. Support for E-Tree across all CST IOS-XR nodes will be covered in CST 4.0 based on IOS-XR 7.2.2. In CST 3.5, if using E-Tree across multiple IGP domains, SR-TE Policies must be configured between all Root nodes and between all Root and Leaf nodes.IOS-XR Root Node Configurationevpn evi 100 advertise-mac ! ! l2vpn bridge group etree bridge-domain etree-ftth interface TenGigE0/0/0/14.100 routed interface BVI100 ! evi 100 IOS-XR Leaf Node ConfigurationA single command is needed to enable leaf function for an EVI. Configuring “etree leaf” will signal to other nodes this is a leaf node. In this case we also have an L3 IRB configured within the EVI. In order to isolate the two ACs, each AC is configured with the “split-horizon group” configuration command. The BVI interface is configured with “local-proxy-arp” to intercept ARP requests between hosts on each AC. This is needed if hosts in two different ACs are using the same IP address subnet, since ARP traffic will be suppressed across the ACs.evpn evi 100 etree leaf ! advertise-mac ! ! l2vpn bridge group etree bridge-domain etree-ftth interface TenGigE0/0/0/23.1098 split-horizon group interface TenGigE0/0/0/24.1098 split-horizon group routed interface BVI100 ! evi 100 interface BVI11011 local-proxy-arpHierarchical ServicesFigure 11: Hierarchical Services TableL3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE)Figure 12: L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 with Pseudowire-Headend (PWHE) Control PlaneAccess Routers: Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator: New EVPN-VPWS instance via CLI or NSO Access Router: Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers: Cisco ASR9000 IOS-XR Operator: New EVPN-VPWS instance via CLI or NSO Provider Edge Router: Path to Access Router is known via ACCESS-ISIS IGP.
Operator# New L3VPN instance (VPNv4/6) together withPseudowire-Headend (PWHE) via CLI or NSO Provider Edge Router# Path to remote PE is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p L3VPN-VRF1 interface TenGigE0/0/0/5.501 neighbor evpn evi 13 target 501 source 501 ! ! !interface TenGigE0/0/0/5.501 l2transport encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric Port based service configurationl2vpn xconnect group evpn-vpws-l3vpn-PE1 p2p odn-1 interface TenGigE0/0/0/5 neighbor evpn evi 13 target 502 source 502 ! ! !! interface TenGigE0/0/0/5 l2transport Access Router Service Provisioning (IOS-XE)#VLAN based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation dot1q 501 rewrite ingress tag pop 1 symmetric ! Port based service configurationl2vpn evpn instance 14 point-to-point vpws context evpn-pe4-pe1 service target 501 source 501 member GigabitEthernet0/0/1 service-instance 501 !interface GigabitEthernet0/0/1 service instance 501 ethernet encapsulation default Provider Edge Router Service Provisioning (IOS-XR)#VRF configurationvrf L3VPN-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#501 ! export route-target 100#501 ! ! address-family ipv6 unicast import route-target 100#501 ! export route-target 100#501 ! ! BGP configurationrouter bgp 100 vrf L3VPN-ODNTE-VRF1 rd 100#501 address-family ipv4 unicast redistribute connected ! address-family ipv6 unicast redistribute connected ! ! PWHE configurationinterface PW-Ether1 vrf L3VPN-ODNTE-VRF1 ipv4 address 10.13.1.1 255.255.255.0 ipv6 address 1000#10#13##1/126 attach generic-interface-list PWHE! EVPN VPWS configuration towards Access PEl2vpn xconnect group evpn-vpws-l3vpn-A-PE3 p2p L3VPN-ODNTE-VRF1 interface PW-Ether1 neighbor evpn evi 13 target 501 source 501 ! Figure 13# L3VPN – Single-Homed EVPN-VPWS, MP-BGP VPNv4/6 withPseudowire-Headend (PWHE) Data PlaneL3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 with Anycast IRBFigure 14# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4 withAnycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN instance (VPNv4/6) together with Anycast IRBvia CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! 
Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2 l2transport !! l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 n-flag-clear L2VPN configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30 VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 15# L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB Datal PlaneL2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPN with Anycast IRBFigure 16# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Control PlaneAccess Routers# Cisco NCS5501-SE IOS-XR or Cisco ASR920 IOS-XE Operator# New Static Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New Static Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L2VPN Multipoint EVPN instance together withAnycast IRB via CLI or NSO (Anycast IRB is optional when L2 and L3is required in same service instance) Provider Edge Routers# Path to remote PEs is known via CORE-ISISIGP. Please note that provisioning on Access and Provider Edge routers issame as in “L3VPN – Anycast Static Pseudowire (PW), MP-BGP VPNv4/6 withAnycast IRB”. 
In this use case there is BGP EVPN instead of MP-BGPVPNv4/6 in the core.Access Router Service Provisioning (IOS-XR)#VLAN based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2.1 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !interface TenGigE0/0/0/2.1 l2transport encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word ! Port based service configurationl2vpn xconnect group Static-VPWS-PE12-H-L3VPN-AnyCast p2p L3VPN-VRF1 interface TenGigE0/0/0/2 neighbor ipv4 100.100.100.12 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! !!interface TenGigE0/0/0/2 l2transport!l2vpn pw-class static-pw-h-l3vpn-class encapsulation mpls control-word Access Router Service Provisioning (IOS-XE)#VLAN based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation dot1q 1 rewrite ingress tag pop 1 symmetric xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Port based service configurationinterface GigabitEthernet0/0/5 no ip address media-type auto-select negotiation auto service instance 1 ethernet encapsulation default xconnect 100.100.100.12 4001 encapsulation mpls manual mpls label 4001 4001 mpls control-word ! Provider Edge Routers Service Provisioning (IOS-XR)#cef adjacency route override rib AnyCast Loopback configurationinterface Loopback100 description Anycast ipv4 address 100.100.100.12 255.255.255.255!router isis ACCESS interface Loopback100 address-family ipv4 unicast prefix-sid index 1012 L2VPN Configurationl2vpn bridge group Static-VPWS-H-L3VPN-IRB bridge-domain VRF1 neighbor 100.0.1.50 pw-id 5001 mpls static label local 5001 remote 5001 pw-class static-pw-h-l3vpn-class ! neighbor 100.0.1.51 pw-id 4001 mpls static label local 4001 remote 4001 pw-class static-pw-h-l3vpn-class ! routed interface BVI1 split-horizon group core ! evi 12001 ! ! EVPN configurationevpn evi 12001 ! advertise-mac ! ! virtual neighbor 100.0.1.50 pw-id 5001 ethernet-segment identifier type 0 12.00.00.00.00.00.50.00.01 Anycast IRB configurationinterface BVI1 host-routing vrf L3VPN-AnyCast-ODNTE-VRF1 ipv4 address 12.0.1.1 255.255.255.0 mac-address 12.0.1 load-interval 30! VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! !! BGP configurationrouter bgp 100 vrf L3VPN-AnyCast-ODNTE-VRF1 rd auto address-family ipv4 unicast redistribute connected ! ! Figure 17# L2/L3VPN – Anycast Static Pseudowire (PW), Multipoint EVPNwith Anycast IRB Data PlaneL2/L3VPN – EVPN Head-End ConfigurationFigure 16# L2/L3VPN – EVPN Head-EndAccess Routers# Cisco NCS 540, 5500, 560 IOS-XR Operator# New EVPN-VPWS Pseudowire (PW) instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New EVPN-VPWS Pseudowire (PW) instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L2VPN Multipoint EVPN instance together withL3 PWHE interface via CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISIS IGP. 
Access Router Service Provisioning (IOS-XR)#Interface Configurationinterface TenGigE0/0/0/5.2002 l2transport description EVPN-VPWS-PWHE-HEADEND encapsulation dot1q 2002!interface TenGigE0/0/0/5.2003 l2transport description EVPN-VPWS-PWHE-HEADEND encapsulation dot1q 2003!interface TenGigE0/0/0/5.2022 l2transport description EVPN-VPWS-PWHE-HEADEND encapsulation dot1q 2022 L2VPN Configuration In this example we use the NCS 540/5500 flexible xconnect service type to bundlemultiple downstream interfaces into a single EVPN-VPWS to the EVPN Head End as atrunk. FXC can bundle VLANs from the same physical interface or differentphysical interfaces.l2vpn flexible-xconnect-service vlan-unaware PWHE-Headend interface TenGigE0/0/0/5.2002 interface TenGigE0/0/0/5.2003 interface TenGigE0/0/0/5.2022 neighbor evpn evi 2002 target 2002 Provider Edge Routers Service Provisioning (IOS-XR)#A similar configuration is found on all PE routers. Each pair of EVPN-HE routers share the same IP addresses and EVPN ESI on their PW-Ether2002 interfaces.cef adjacency route override rib EVPN-HE L3 Interface ConfigurationThe following shows an example of both untagged and taggged interfaces. The same EVPN-VPWS is used as a trunk to carry traffic between Access and Head-End PE.interface PW-Ether2002 mtu 1518 ipv4 address 100.9.2.1 255.255.255.252 vrf L3VPN-AnyCast-ODNTE-VRF1 mac-address 0.1111.1 load-interval 30 attach generic-interface-list PWHE!interface PW-Ether2002.2002 vrf L3VPN-ODNTE-VRF1 ipv4 address 11.4.1.1 255.255.255.0 encapsulation dot1q 2002!interface PW-Ether2002.2003 ipv4 address 11.5.1.1 255.255.255.0 encapsulation dot1q 2003 EVPN Configurationevpn interface PW-Ether2002 ethernet-segment identifier type 0 99.99.99.99.99.01.00.00.00 convergence nexthop-tracking EVPN-VPWS to Access PExconnect group EVPN-HeadEnd p2p L3VPN-ODNTE-VRF11 interface PW-Ether2002 neighbor evpn evi 2002 target 2002 source 2002 VRF configurationvrf L3VPN-AnyCast-ODNTE-VRF1 address-family ipv4 unicast import route-target 100#10001 ! export route-target 100#10001 ! L2/L3VPN – EVPN Centralized GatewayFigure 16# L2/L3VPN – EVPN Centralized GatewayAccess Routers# Cisco NCS 540, 5500, 560 IOS-XR Operator# New EVPN-ELAN or ETREE instance via CLI or NSO Access Router# Path to PE Router is known via ACCESS-ISIS IGP. Provider Edge Routers# Cisco ASR9000 IOS-XR (Same on both PErouters in same location PE1/2 and PE3/4) Operator# New EVPN-ELAN or ETREE instance via CLI or NSO Provider Edge Routers# Path to Access Router is known viaACCESS-ISIS IGP. Operator# New L3VPN Multipoint EVPN instance together withAnycast IRB interface via CLI or NSO Provider Edge Routers# Path to remote PEs is known via CORE-ISIS IGP. Access Router Service Provisioning (IOS-XR)#Interface Configurationinterface TenGigE0/0/0/5.2300 l2transport description EVPN-ELAN-CGW1-PE1/PE2 encapsulation dot1q 2300 rewrite ingress tag pop 1 symmetric L2VPN and EVPN Configurationl2vpn bridge group EVPN-ELAN-CGW1 bridge-domain ELAN-CGW1 interface TenGigE0/0/0/5.2300 ! evi 2400 ! !evpn evi 2400 advertise-mac Provider Edge Routers Service Provisioning (IOS-XR)#A similar configuration is found on all PE routers. 
Each pair of EVPN-HE routers share the same IP addresses and MAC address providing a redundant Anycast IRB L3 gateway to L2 connected access devices.EVPN CGW L3 BVI Interface ConfigurationIn this example the interface is part of a core L3VPN, but the interface could reside in the global routing table (default VRF).interface BVI100 vrf cgw ipv4 address 100.10.1.1 255.255.0.0 ipv6 address 100#10##1/64 mac-address 0.dc1.dc2 L2VPN Configuration Note the access-evi configuration used for the EVI connected to the A-PE accessrouters.l2vpn bridge group EVPN-ELAN-CGW1 bridge-domain ELAN-CGW1 access-evi 2400 routed interface BVI100 EVPN Configuration In this case we are using ODN to create on-demand SR-TE policies between the core CGW PEs and access PEs.evpn evi 2400 bgp route-policy export cgw_srte_odn route-policy import cgw_srte_odn ! advertise-mac bvi-mac ! virtual access-evi ethernet-segment identifier type 0 00.00.ac.ce.55.00.e1.00.00 ! core-isolation-group 1 !! VRF configurationvrf cgw address-family ipv4 unicast import route-target 10#10 ! export route-policy C1234 export route-target 10#10 ! ! address-family ipv6 unicast export route-policy C1234 Ethernet CFM for L2VPN service assuranceEthernet Connectivity Fault Management is an Ethernet OAM component used to validate end-to-end connectivity between service endpoints. Ethernet CFM is defined by two standards, 802.1ag and Y.1731. Within an SP network, Maintenance Domains are created based on service scope. Domains are typically separated by operator boundaries and may be nested but cannot overlap. Within each service, maintenance points can be created to verify bi-directional end to end connectivity. These are known as MEPs (Maintenance End-Point) and MIPs (Maintenance Intermediate Points). These maintenance points process CFM messages. A MEP is configured at service endpoints and has directionality where an “up” MEP faces the core of the network and a “down” MEP faces a CE device or NNI port. MIPs are optional and are created dynamically. Detailed information on Ethernet CFM configuration and operation can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/interfaces/75x/configuration/guide/b-interfaces-hardware-component-cg-ncs5500-75x/m-configuring-ethernet-oam.htmlMaintenance Domain configurationA Maintenance Domain is defined by a unique name and associated level. The level can be 0-7. The numerical identifier usually corresponds to the scope of the MD, where 7 is associated with CE endpoints, 6 associated with PE devices connected to a CE. Additional levels may be required based on the topology and service boundaries which occur along the end-to-end service. In this example we only a single domain and utilize level 0 for all MEPs.ethernet cfm domain EVPN-VPWS-PE3-PE8 level 0 MEP configuration for EVPN-VPWS servicesFor L2VPN xconnect services, each service must have a MEP created on the end PE device. There are two components to defining a MEP, first defining the Ethernet CFM “service” and then defining the MEP on the physical or logical interface participating in the L2VPN xconnect service. In the following configuration the xconnect group “EVPN-VPWS-ODN-PE3” and P2P EVPN VPWS service odn-8 are already defined. The Ethernet CFM service of “odn-8” does NOT have to match the xconnect service name. The MEP crosscheck defines a remote MEP to listen for Continuity Check messages from. 
It does not have to be the same as the local MEP defined on the physical sub-interface (103), but for P2P services it is best practice to make them identical. This configuration will send Ethernet CFM Continuity Check (CC) messages every 1 minute to verify end to end reachability.L2VPN configurationl2vpn xconnect group EVPN-VPWS-ODN-PE3 p2p odn-8 interface TenGigE0/0/0/23.8 neighbor evpn evi 1318 target 8 source 8 ! ! !! Physical sub-interface configurationinterface TenGigE0/0/0/23.8 l2transport encapsulation dot1q 8 rewrite ingress tag pop 1 symmetric ethernet cfm mep domain EVPN-VPWS-PE3-PE8 service odn-8 mep-id 103 ! !! Ethernet CFM service configurationethernet cfm domain EVPN-VPWS-PE3-PE8 service odn-8 xconnect group EVPN-VPWS-ODN-PE3 p2p odn-8 mip auto-create all continuity-check interval 1m mep crosscheck mep-id 103 ! log crosscheck errors log continuity-check errors log continuity-check mep changes ! !! Multicast Source Distribution using BGP Multicast AFI/SAFIThe Converged SDN Transport is inherently multi-domain to increase scalability.Multicast distribution trees built across the network using either native PIM,mLDP, or SR Tree-SID require the source be known to the receiver nodes tosatisfy multicast’s RPF (Reverse Path Forwarding) check. The recommended way todistribute source addresses across the network is use the BGP IPv4/IPv6multicast address family, utilizing the ABR nodes as inline RRs.In the case of MVPN the sources are distributed inside the L3VPN as VPNV4 andVPNV6 prefixes.Multicast BGP Configurationrouter bgp 100 ! address-family ipv4 multicast redistribute connected route-policy mcast-sources ! address-family ipv6 multicast redistribute connected route-policy mcast-sources Multicast Profile 14 using mLDP and ODN L3VPNIn ths service example we will implement multicast delivery across the CST network using mLDP transport for multicast and SR-MPLS for unicast traffic. L3VPN SR paths will be dynamically created using ODN. Multicast profile 14 is the “Partitioned MDT - MLDP P2MP - BGP-AD - BGP C-Mcast Signaling” Using this profile each mVPN will use a dedicated P2MP tree, endpoints will be auto-discovered using NG-MVPN BGP NLRI, and customer multicast state such as source streams, PIM, and IGMP membership data will be signaled using BGP. Profile 14 is the recommended profile for high scale and utilizing label-switched multicast (LSM) across the core.Please note that mLDP requires an IGP path to the source PE loopback address. The CST design utilizes a multi-domain approach which normally does not advertise IGP routes across domain boundaries. If mLDP is being utilized across domains, controlled redistribution should be used to advertise the source PE loopback addresses to receiver PEsMulticast core configurationThe multicast “core” includes transit endpoints participating in mLDP only. See the mLDP core configuration section for details on end-to-end mLDP configuration.Unicast L3VPN PE configurationIn order to complete an RPF check for SSM sources, unicast L3VPN configuration is required. Additionally the VRF must be defined under the BGP configuration with the NG-MVPN address families configured. 
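The NG-MVPN address families are also typically enabled on the BGP sessions toward the service route reflectors so that the BGP auto-discovery and C-multicast routes are exchanged. A minimal sketch, reusing the SvRR neighbor-group convention shown later in this guide (adjust to the peer groups actually deployed).
router bgp 100
 address-family ipv4 mvpn
 !
 address-family ipv6 mvpn
 !
 neighbor-group SvRR
  address-family ipv4 mvpn
  !
  address-family ipv6 mvpn
  !
 !
!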
In our use case we are utilizing ODN for creating the paths between L3VPN endpoints, with a route-policy attached to the mVPN VRF to set a specific color on advertised routes. ODN opaque ext-community set extcommunity-set opaque MLDP 1000 end-set ODN route-policy route-policy ODN-MVPN set extcommunity color MLDP pass end-policy Global L3VPN VRF definition vrf VRF-MLDP address-family ipv4 unicast import route-target 100#38 ! export route-policy ODN-MVPN export route-target 100#38 ! ! address-family ipv6 unicast import route-target 100#38 ! export route-policy ODN-MVPN export route-target 100#38 ! !! BGP configuration router bgp 100 vrf VRF-MLDP rd auto address-family ipv4 unicast redistribute connected redistribute static ! address-family ipv6 unicast redistribute connected redistribute static ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! !! Multicast PE configuration The multicast “edge” includes all endpoints connected to native multicast sources or receivers. Define RPF policy route-policy mldp-partitioned-p2mp set core-tree mldp-partitioned-p2mp end-policy ! Enable Multicast and define mVPN VRF multicast-routing address-family ipv4 interface Loopback0 enable ! ! vrf VRF-MLDP address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix bgp auto-discovery mldp ! mdt partitioned mldp ipv4 p2mp mdt data 100 ! !! Enable PIM for mVPN VRF In this instance, interface TenGigE0/0/0/23.2000 is running PIM within the VRF. router pim address-family ipv4 rp-address 100.0.1.50 ! vrf VRF-MLDP address-family ipv4 rpf topology route-policy mldp-partitioned-p2mp mdt c-multicast-routing bgp ! interface TenGigE0/0/0/23.2000 enable ! ! Enable IGMP for mVPN VRF interface To discover listeners for a specific group, enable IGMP on interfaces within the VRF. These interested receivers will be advertised via BGP to establish end to end P2MP trees from the source. router igmp vrf VRF-MLDP interface TenGigE0/0/0/23.2001 ! version 3 !! Multicast distribution using Tree-SID with static S,G Mapping Tree-SID utilizes only Segment Routing to create and forward multicast traffic across an optimized tree. The Tree-SID tree is configured on the SR-PCE for deployment to the network. PCEP is used to instantiate the correct computed segments end to end. On the head-end source node, the multicast S,G is statically mapped to the Tree-SID policy, as shown in the source node configuration below. Note# Tree-SID requires all nodes in the multicast distribution network to have connections to the same SR-PCE instances; please see the PCEP configuration section of the Implementation Guide. Tree-SID SR-PCE Configuration Endpoint Set Configuration The P2MP endpoint sets are defined outside of the SR Tree-SID Policy configuration in order to be reusable across multiple trees. This is a required step in the configuration of Tree-SID. pce address ipv4 100.0.1.101 timers reoptimization 600 ! segment-routing traffic-eng p2mp endpoint-set APE7-APE8 ipv4 100.0.2.57 ipv4 100.0.2.58 ! timers reoptimization 120 timers cleanup 30 P2MP Tree-SID SR Policy Configuration This configuration defines the Tree-SID P2MP SR Policy to be used across the network. Note the name of the Tree-SID must be unique across the network and referenced explicitly on all source and receiver nodes. Within the policy configuration, supported constraints can be applied during path computation of the optimized P2MP tree. Note the source address must be specified and the MPLS label used must be within the SRLB for all nodes across the network. pce segment-routing traffic-eng policy treesid-1 source ipv4 100.0.0.1 color 100 endpoint-set APE7-APE8 treesid mpls 18600 candidate-paths constraints affinity include-any color1 ! ! ! preference 100 dynamic metric type igp ! ! ! Tree-SID Common Config on All Nodes Segment Routing Local Block While the SRLB config is covered elsewhere in this guide, it is recommended to set the values the same across the Tree-SID domain. The values shown are for demonstration only. segment-routing local-block 18000 19000 !! PCEP Configuration Tree-SID relies on PCE initiated segments to the node, so a session to the PCE is required for all nodes in the domain. segment-routing traffic-eng pcc source-address ipv4 100.0.2.53 pce address ipv4 100.0.1.101 precedence 200 ! pce address ipv4 100.0.2.101 precedence 100 ! pce address ipv4 100.0.2.102 precedence 100 ! report-all timers delegation-timeout 10 timers deadtimer 60 timers initiated state 15 timers initiated orphan 10 ! !! Static Tree-SID Source Node Multicast Configuration PIM Configuration In this configuration a single S,G of 232.0.0.20 with a source of 104.14.1.2 is mapped to Tree-SID treesid-1 for distribution across the network. router pim address-family ipv4 interface Loopback0 enable ! interface Bundle-Ether111 enable ! interface Bundle-Ether112 enable ! interface TenGigE0/0/0/16 enable ! sr-p2mp-policy treesid-1 static-group 232.0.0.20 104.14.1.2 !! Multicast Routing Configuration multicast-routing address-family ipv4 interface all enable mdt static segment-routing ! address-family ipv6 mdt static segment-routing ! ! Static Tree-SID Receiver Node Multicast Configuration Global Routing Table Multicast PIM Configuration router pim address-family ipv4 rp-address 100.0.0.1 ! !! On the router connected to the receivers, configure the address family to use the Tree-SID for static S,G mapping. multicast-routing address-family ipv4 mdt source Loopback0 rate-per-route interface all enable static sr-policy Tree-SID-GRT mdt static segment-routing accounting per-prefix address-family ipv6 mdt source Loopback0 rate-per-route interface all enable static sr-policy Tree-SID-GRT mdt static segment-routing account per-prefix !! Multicast Routing Configuration multicast-routing address-family ipv4 interface all enable static sr-policy treesid-1 ! address-family ipv6 static sr-policy treesid-1 ! ! mVPN Multicast Configuration PIM Configuration In this configuration, we are mapping the PIM RP to the TREESID source. router pim vrf TREESID address-family ipv4 rp-address 100.0.0.1 ! !! Multicast Routing Configuration On the PE connected to the receivers, within the VRF associated with the Tree-SID SR Policy, enable the Tree-SID for static mapping of S,G multicast. multicast-routing vrf TREESID address-family ipv4 interface all enable static sr-policy treesid-1 ! address-family ipv6 static sr-policy treesid-1 ! !
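Before checking the PCE, the static Tree-SID state can also be spot-checked on the source and receiver routers themselves. A brief verification sketch, using the group, VRF, and label values from the examples above (exact output varies by release).
show mrib route 232.0.0.20
show mrib vrf TREESID route
show mpls forwarding labels 18600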
Tree-SID Verification on PCE You can view the end to end path using the “show pce lsp p2mp” command. RP/0/RP0/CPU0#XTC-ACCESS1-PHY#show pce lsp p2mp Wed Sep 2 19#31#50.745 UTC Tree# treesid-1 Label# 18600 Operational# up Admin# up Transition count# 1 Uptime# 00#06#39 (since Wed Sep 02 19#25#11 UTC 2020) Source# 100.0.0.1 Destinations# 100.0.2.53, 100.0.2.52 Nodes# Node[0]# 100.0.2.3 (AG3) Role# Transit Hops# Incoming# 18600 CC-ID# 1 Outgoing# 18600 CC-ID# 1 (10.23.253.1) Outgoing# 18600 CC-ID# 1 (10.23.252.0) Node[1]# 100.0.2.1 (PA3) Role# Transit Hops# Incoming# 18600 CC-ID# 2 Outgoing# 18600 CC-ID# 2 (10.21.23.1) Node[2]# 100.0.0.3 (PE3) Role# Transit Hops# Incoming# 18600 CC-ID# 3 Outgoing# 18600 CC-ID# 3 (10.3.21.1) Node[3]# 100.0.0.5 (P1) Role# Transit Hops# Incoming# 18600 CC-ID# 4 Outgoing# 18600 CC-ID# 4 (10.3.5.0) Node[4]# 100.0.0.7 (P3) Role# Transit Hops# Incoming# 18600 CC-ID# 5 Outgoing# 18600 CC-ID# 5 (10.5.7.0) Node[5]# 100.0.1.1 (NCS540-PA1) Role# Transit Hops# Incoming# 18600 CC-ID# 6 Outgoing# 18600 CC-ID# 6 (10.1.7.1) Node[6]# 100.0.0.1 (PE1) Role# Ingress Hops# Incoming# 18600 CC-ID# 7 Outgoing# 18600 CC-ID# 7 (10.1.11.1) Node[7]# 100.0.2.53 (A-PE8) Role# Egress Hops# Incoming# 18600 CC-ID# 8 Node[8]# 100.0.2.52 (A-PE7) Role# Egress Hops# Incoming# 18600 CC-ID# 9 Multicast distribution using fully dynamic Tree-SID In this example we will use dynamic source/receiver discovery using BGP and PCEP signaling to create the SR Tree-SID multicast distribution trees. Please see https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-5/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-75x/configure-sr-tree-sid.html for full descriptions of configuration and optional parameters. Note# There MUST be a BGP route to the source PE to satisfy the Tree-SID RPF check on receiver nodes. It is recommended to use the IPv4/IPv6 Multicast address family to distribute source information. Please see the section Multicast Source Distribution using BGP Multicast AFI/SAFI. PE BGP Configuration The following is used to enable the IPv4/IPv6 MVPN AFI/SAFI globally. These address families are also added to the SvRR neighbor group. router bgp 100 address-family ipv4 mvpn ! address-family ipv6 mvpn ! neighbor-group SvRR remote-as 100 update-source Loopback0 address-family ipv4 unicast ! address-family vpnv4 unicast soft-reconfiguration inbound always ! address-family vpnv6 unicast soft-reconfiguration inbound always ! address-family ipv4 mvpn soft-reconfiguration inbound always ! address-family ipv6 mvpn soft-reconfiguration inbound always ! address-family l2vpn evpn soft-reconfiguration inbound always ! ! PE Multicast Routing Configuration Note the new configuration specific to SR auto-discovery and the color specified for the default MDT. The same configuration is used on both source and receiver PE routers. multicast-routing address-family ipv4 interface Loopback0 enable ! mdt source Loopback0 interface all enable ! address-family ipv6 interface all enable ! vrf tree-sid address-family ipv4 mdt source Loopback0 interface all enable bgp auto-discovery segment-routing ! mdt default segment-routing mpls color 80 ! ! PE PIM Configuration The PIM configuration requires the following route-policy be defined. route-policy sr-p2mp-core-tree set core-tree sr-p2mp end-policy router pim address-family ipv4 interface Loopback0 enable ! ! vrf tree-sid address-family ipv4 rpf topology route-policy sr-p2mp-core-tree mdt c-multicast-routing bgp !
multipath ssm range ssm
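With BGP auto-discovery in place, the dynamically signaled trees can be confirmed from the PE routers as well as from the SR-PCE. A minimal verification sketch, assuming the tree-sid VRF from the example above (output varies by release); the PCE-side view uses the same show pce lsp p2mp command shown in the previous section.
show bgp ipv4 mvpn summary
show mrib vrf tree-sid route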
Remote PHY CIN Implementation Summary Details on the design decisions can be found in the CST high-level design guide; this section provides sample configurations. Sample QoS Policies The following policies are usable as-is but should be tailored to specific network deployments. Class maps Class maps are used within a policy map to match packet criteria for further treatment. class-map match-any match-ef-exp5 description High priority, EF match dscp 46 match mpls experimental topmost 5 end-class-map!class-map match-any match-cs5-exp4 description Second highest priority match dscp 40 match mpls experimental topmost 4 end-class-map!class-map match-any match-video-cs4-exp2 description Video match dscp 32 match mpls experimental topmost 2 end-class-map!class-map match-any match-cs6-exp6 description Highest priority control-plane traffic match dscp cs6 match mpls experimental topmost 6 end-class-map!class-map match-any match-qos-group-1 match qos-group 1 end-class-map!class-map match-any match-qos-group-2 match qos-group 2 end-class-map!class-map match-any match-qos-group-3 match qos-group 3 end-class-map!class-map match-any match-qos-group-6 match qos-group 6 end-class-map!class-map match-any match-traffic-class-1 description ~Match highest priority traffic-class 1~ match traffic-class 1 end-class-map!class-map match-any match-traffic-class-2 description ~Match high priority traffic-class 2~ match traffic-class 2 end-class-map!class-map match-any match-traffic-class-3 description ~Match medium traffic-class 3~ match traffic-class 3 end-class-map!class-map match-any match-traffic-class-6 description ~Match video traffic-class 6~ match traffic-class 6 end-class-map RPD and DPIC interface policy maps These are applied to all interfaces connected to cBR-8 DPIC and RPD devices. Note# Egress queueing maps are not supported on L3 BVI interfaces. RPD/DPIC ingress classifier policy map policy-map rpd-dpic-ingress-classifier class match-cs6-exp6 set traffic-class 1 set qos-group 1 ! class match-ef-exp5 set traffic-class 2 set qos-group 2 ! class match-cs5-exp4 set traffic-class 3 set qos-group 3 ! class match-video-cs4-exp2 set traffic-class 6 set qos-group 6 ! class class-default set traffic-class 0 set dscp 0 set qos-group 0 ! end-policy-map! P2P RPD and DPIC egress queueing policy map policy-map rpd-dpic-egress-queuing class match-traffic-class-1 priority level 1 queue-limit 500 us ! class match-traffic-class-2 priority level 2 queue-limit 100 us ! class match-traffic-class-3 priority level 3 queue-limit 500 us ! class match-traffic-class-6 priority level 6 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map! Core QoS Please see the general QoS section for core-facing QoS configuration. CIN Timing Configuration Please see the G.8275.1 and G.8275.2 timing configuration guides in this document for configuring G.8275.2 on downstream RPD interfaces. Starting in CST 4.0, PTP can be enabled on either physical L3 interfaces or BVI interfaces. PTP is not supported on Bundle Ethernet interfaces. Starting in CST 4.0 it is recommended to use G.8275.1 end to end across the timing domain, and utilize G.8275.2 on specific interfaces using the PTP Multi-Profile configuration outlined in this document. G.8275.1 allows the use of Bundle Ethernet interfaces within the CIN network. PTP Messaging Rates The following are recommended rate values to be used for PTP messaging.
PTP variable IOS-XR configuration value IOS-XE value Announce Interval 1 1 Announce Timeout 5 5 Sync Frequency 16 -4 Delay Request Frequency 16 -4 Example CBR-8 RPD DTI Profileptp r-dti 4 profile G.8275.2 ptp-domain 60 clock-port 1 clock source ip 192.168.3.1 sync interval -4 announce timeout 5 delay-req interval -4 Multicast configurationSummaryWe present two different configuration options based on either native multicast deployment or the use of a L3VPN to carry Remote PHY traffic. The L3VPN option shown uses Label Switched Multicast profile 14 (partitioned mLDP) however profile 6 could also be utilized.Global multicast configuration - Native multicastOn CIN aggregation nodes all interfaces should have multicast enabled.multicast-routing address-family ipv4 interface all enable ! address-family ipv6 interface all enable enable ! Global multicast configuration - LSM using profile 14On CIN aggregation nodes all interfaces should have multicast enabled.vrf VRF-MLDP address-family ipv4 mdt source Loopback0 rate-per-route interface all enable accounting per-prefix bgp auto-discovery mldp ! mdt partitioned mldp ipv4 p2mp mdt data 100 ! ! PIM configuration - Native multicastPIM should be enabled for IPv4/IPv6 on all core facing interfacesrouter pim address-family ipv4 interface Loopback0 enable ! interface TenGigE0/0/0/6 enable ! interface TenGigE0/0/0/7 enable ! ! PIM configuration - LSM using profile 14The PIM configuration is utilized even though no PIM neighbors may be connected.route-policy mldp-partitioned-p2mp set core-tree mldp-partitioned-p2mpend-policy!router pim address-family ipv4 interface Loopback0 enable vrf rphy-vrf address-family ipv4 rpf topology route-policy mldp-partitioned-p2mp mdt c-multicast-routing bgp ! ! IGMPv3/MLDv2 configuration - Native multicastInterfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabledrouter igmp interface BVI100 version 3 ! interface TenGigE0/0/0/25 version 3 !!router mld interface BVI100 version 2 interface TenGigE0/0/0/25 version 3 ! ! IGMPv3/MLDv2 configuration - LSM profile 14Interfaces connected to RPD and DPIC interfaces should have IGMPv3 and MLDv2 enabled as neededrouter igmp vrf rphy-vrf interface BVI101 version 3 ! interface TenGigE0/0/0/15 ! !!router mld vrf rphy-vrf interface TenGigE0/0/0/15 version 2 ! !! IGMPv3 / MLDv2 snooping profile configuration (BVI aggregation)In order to limit L2 multicast replication for specific groups to only interfaces with interested receivers, IGMP and MLD snooping must be enabled.igmp snooping profile igmp-snoop-1!mld snooping profile mld-snoop-1! RPD DHCPv4/v6 relay configurationIn order for RPDs to self-provision DHCP relay must be enabled on all RPD-facing L3 interfaces. In IOS-XR the DHCP relay configuration is done in its own configuration context without any configuration on the interface itself.Native IP / Default VRFdhcp ipv4 profile rpd-dhcpv4 relay helper-address vrf default 10.0.2.3 ! interface BVI100 relay profile rpd-dhcpv4!dhcp ipv6 profile rpd-dhcpv6 relay helper-address vrf default 2001#10#0#2##3 iana-route-add source-interface BVI100 ! interface BVI100 relay profile rpd-dhcpv6 RPHY L3VPNIn this example it is assumed the DHCP server exists within the rphy-vrf VRF, if it does not then additional routing may be necessary to forward packets between VRFs.dhcp ipv4 vrf rphy-vrf relay profile rpd-dhcpv4-vrf profile rpd-dhcpv4-vrf relay helper-address vrf rphy-vrf 10.0.2.3 relay information option allow-untrusted ! 
inner-cos 5 outer-cos 5 interface BVI101 relay profile rpd-dhcpv4-vrf interface TenGigE0/0/0/15 relay profile rpd-dhcpv4-vrf! cBR-8 DPIC interface configuration without Link HAWithout link HA the DPIC port is configured as a normal physical interfaceinterface TenGigE0/0/0/25 description .. Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 cBR-8 DPIC interface configuration with Link HAWhen using Link HA faster convergence is achieved when each DPIC interface is placed into a BVI with a statically assigned MAC address. Each DPIC interface is placed into a separate bridge-domain with a unique BVI L3 interface. The same MAC address should be utilized on all BVI interfaces. Convergence using BVI interfaces is <50ms, L3 physical interfaces is 1-2s.Even DPIC port CIN interface configurationinterface TenGigE0/0/0/25 description ~Connected to cBR8 port Te1/1/0~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-0 interface TenGigE0/0/0/25 ! routed interface BVI500 ! ! ! interface BVI500 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! Odd DPIC port CIN interface configurationinterface TenGigE0/0/0/26 description ~Connected to cBR8 port Te1/1/1~ lldp ! carrier-delay up 0 down 0 load-interval 30 l2transport !!l2vpn bridge group cbr8 bridge-domain port-ha-1 interface TenGigE0/0/0/26 ! routed interface BVI501 ! ! ! interface BVI501 description ~BVI for cBR8 port HA, requires static MAC~ service-policy input rpd-dpic-ingress-classifier ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 mac-address 8a.9698.64 load-interval 30! cBR-8 Digital PIC Interface Configurationinterface TenGigE0/0/0/25 description .. Connected to cbr8 port te1/1/0 service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 4.4.9.101 255.255.255.0 ipv6 address 2001#4#4#9##101/64 carrier-delay up 0 down 0 load-interval 30 RPD interface configurationP2P L3In this example the interface has PTP enabled towards the RPDinterface TeGigE0/0/0/15 description To RPD-1 mtu 9200 ptp profile g82752_master_v4 ! service-policy input rpd-dpic-ingress-classifier service-policy output rpd-dpic-egress-queuing ipv4 address 192.168.2.0 255.255.255.254 ipv6 address 2001#192#168#2##0/127 ipv6 enable ! BVIl2vpn bridge group rpd bridge-domain rpd-1 mld snooping profile mld-snoop-1 igmp snooping profile igmp-snoop-1 interface TenGigE0/0/0/15 ! interface TenGigE0/0/0/16 ! interface TenGigE0/0/0/17 ! routed interface BVI100 ! ! ! !!interface BVI100 description ... to downstream RPD hosts ptp profile g82752_master_v4 ! service-policy input rpd-dpic-ingress-classifier ipv4 address 192.168.2.1 255.255.255.0 ipv6 address 2001#192#168#2##1/64 ipv6 enable ! RPD/DPIC agg device IS-IS configurationThe standard IS-IS configuration should be used on all core interfaces with the addition of specifying all DPIC and RPD connected as IS-IS passive interfaces. Using passive interfaces is preferred over redistributing connected routes. 
This configuration is needed for reachability between DPIC and RPDs across the CIN network. router isis ACCESS interface TenGigE0/0/0/25 passive address-family ipv4 unicast ! address-family ipv6 unicast Additional configuration for L3VPN Design Global VRF Configuration This configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elements. vrf rphy-vrf address-family ipv4 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! address-family ipv6 unicast import route-target 100#5000 ! export route-target 100#5000 ! ! BGP Configuration This configuration is required on all DPIC and RPD connected routers as well as ancillary elements communicating with Remote PHY elements. router bgp 100 vrf rphy-vrf rd auto address-family ipv4 unicast label mode per-vrf redistribute connected ! address-family ipv6 unicast label mode per-vrf redistribute connected ! address-family ipv4 mvpn ! address-family ipv6 mvpn ! ! cBR-8 Segment Routing Configuration In the CST 4.0 design we introduce Segment Routing on the cBR-8. Configuration of SR on the cBR-8 follows the configuration on other IOS-XE devices. This configuration guide covers only IGP SR-MPLS, and not SR-TE configuration. This allows the cBR-8 to send/receive traffic from other SR-MPLS nodes within the same IGP domain. The cBR-8 can also utilize these paths for BGP next-hop resolution for Global Routing Table (GRT) and BSOD L2VPN/L3VPN services. The following example configuration is for the SUP connection via IS-IS to the provider network; SR is not supported on DPIC interfaces. IS-IS Configuration router isis access net 49.0001.0010.0000.0013.00 is-type level-2-only router-id Loopback0 authentication mode md5 level-1 authentication mode md5 level-2 authentication key-chain ISIS-KEY level-1 authentication key-chain ISIS-KEY level-2 metric-style wide fast-flood 10 set-overload-bit on-startup 120 max-lsp-lifetime 65535 lsp-refresh-interval 65000 spf-interval 5 50 200 prc-interval 5 50 200 lsp-gen-interval 5 5 200 log-adjacency-changes segment-routing mpls segment-routing prefix-sid-map advertise-local fast-reroute per-prefix level-2 all fast-reroute ti-lfa level-2 passive-interface Bundle1 passive-interface Loopback0 ! address-family ipv6 multi-topology exit-address-family mpls traffic-eng router-id Loopback0 mpls traffic-eng level-2 Segment Routing Configuration segment-routing mpls ! set-attributes address-family ipv4 sr-label-preferred exit-address-family ! global-block 16000 32000 ! connected-prefix-sid-map address-family ipv4 1.0.0.13/32 index 213 range 1 exit-address-family !! Interface Configuration The connected prefix map is used to advertise the Loopback0 interface as an SR Node SID. interface TenGigabitEthernet4/1/6 description ~Connected to PE4 TenGigE 0/0/0/19~ ip address 4.1.6.1 255.255.255.0 ip router isis access load-interval 30 cdp enable ipv6 address 2001#4#1#6##1/64 ipv6 router isis access mpls ip mpls traffic-eng tunnels isis circuit-type level-2-only isis network point-to-point isis authentication mode md5 isis authentication key-chain ISIS-NCS isis csnp-interval 10 level-1 isis csnp-interval 10 level-2 hold-queue 400 in Cloud Native Broadband Network Gateway (cnBNG) See the high level design for more information on the Cisco cnBNG solution. The following covers the configuration of the User Plane router, in this case an ASR9000 router. The following configuration is used for a deployment using IPoE subscriber sessions.
The configuration of some external elements such as the RADIUSauthentication server are outside the scope of this document. The cnBNG control plane software deployment is also out of scope for this document, please see the cnBNG documentation located at#interface Loopback10 ipv6 enable !cnbng-nal location 0/RSP0/CPU0 hostidentifier ASR9k-1 !! up-server ip should be the ip of UP interface which will be used as source for SCi communication up-server ipv4 113.1.1.1 vrf default !! cp-server ip is the IP of UDP Proxy configuration cp-server primary ipv4 113.1.1.2 auto-loopback vrf default interface Loopback10 primary-address 1.1.1.1 ! ! !! retry-count specifies how many times UP should retry the connection with CP before declaring CP as dead cp-association retry-count 10 secondary-address-update enable!dhcp ipv4 profile cnbng_v4 cnbng ! interface Bundle-Ether12.101 cnbng profile cnbng_v4!dhcp ipv6 profile cnbng_v6 cnbng ! interface Bundle-Ether12.101 cnbng profile cnbng_v6! interface Bundle-Ether12.101 ipv4 point-to-point ipv4 unnumbered Loopback10 ipv6 address 2001##1/64 ipv6 enable load-interval 30 encapsulation dot1q 101 ipsubscriber ipv4 l2-connected initiator dhcp ! ipv6 l2-connected initiator dhcpPseudowire Headend Configuration In this use case subscribers are tunneled to the User Plane using EVPN-VPWS from a remote access node.interface PW-Ether2000 mtu 1518 ipv4 address 17.1.1.1 255.255.255.0 attach generic-interface-list PWHE!interface PW-Ether2000.2000 ipv4 address 182.168.10.1 255.255.255.252 ipv6 address 2000#111#1##1#1/64 ipv6 enable service-policy type control subscriber IPoE_PWHE1 encapsulation dot1q 2000 ipsubscriber ipv4 l2-connected initiator dhcp ! ipsubscriber ipv6 l2-connected initiator dhcp !!dhcp ipv6 profile ipoev6_proxy proxy helper-address vrf default 2001#12#3##2 source-interface Loopback0 ! interface PW-Ether2000.2000 proxy profile ipoev6_proxy !!l2vpn logging pseudowire ! xconnect group pwhe-bng p2p pwhe-bng1 interface PW-Ether2000 neighbor evpn evi 40 target 900 source 900Model-Driven Telemetry ConfigurationSummaryThis is not an exhaustive list of IOS-XR model-driven telemetry sensor paths, but gives some basic paths used to monitor a Converged SDN Transport deployment. 
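As an illustration, any sensor path listed in the tables below can be streamed using the standard IOS-XR model-driven telemetry configuration. The following dial-out sketch uses a placeholder collector address, port, and group names, with a 60 second sample interval; adjust these to the deployed collector.
telemetry model-driven
 destination-group COLLECTORS
  address-family ipv4 192.0.2.10 port 57500
   encoding self-describing-gpb
   protocol grpc no-tls
  !
 !
 sensor-group CST-BASIC
  sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
 !
 subscription CST-SUB
  sensor-group-id CST-BASIC sample-interval 60000
  destination-id COLLECTORS
 !
!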
Each sensor path may have its own cadence of collection and transmission, but it’s recommended to not use values less than 60s when using many sensor paths.Device inventory and monitoring Metric Sensor path Full inventory via OpenConfig model openconfig-platform#components NCS 540/5500 NPU resources cisco-ios-xr-fretta-bcm-dpa-hw-resources-oper/dpa/stats/nodes/node/hw-resources-datas/hw-resources-data Optics information cisco-ios-xr-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info System uptime cisco-ios-xr-shellutil-oper#system-time/uptime System CPU utilization cisco-ios-xr-wdsysmon-fd-oper#system-monitoring/cpu-utilization Interface Data Metric Sensor path Interface optics state Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info/transport-admin-state OpenConfig interface stats openconfig-interfaces#interfaces Interface data rates, based on load-interval Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/data-rate Interface counters similar to “show int” Cisco-IOS-XR-infra-statsd-oper#infra-statistics/interfaces/interface/latest/generic-counters Full interface information Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface Interface stats Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics Subset of interface stats Cisco-IOS-XR-pfi-im-cmd-oper#interfaces/interface-xr/interface/interface-statistics/basic-interface-stats LLDP Monitoring Metric Sensor path All LLDP Info Cisco-IOS-XR-ethernet-lldp-oper#lldp LLDP neighbor info Cisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/neighbors LLDP statistics Cisco-IOS-XR-ethernet-lldp-oper#lldp/nodes/node/statistics Aggregate bundle information (use interface models for interface counters) Metric Sensor path OpenConfig LAG information openconfig-if-aggregate#aggregate OpenConfig LAG state only openconfig-if-aggregate#aggregate/state OpenConfig LACP information openconfig-lacp#lacp Cisco full bundle information Cisco-IOS-XR-bundlemgr-oper#bundles Cisco BFD over Bundle stats Cisco-IOS-XR-bundlemgr-oper#bundle-information/bfd-counters Cisco Bundle data Cisco-IOS-XR-bundlemgr-oper#lacp-bundles/bundles/bundle/data Cisco Bundle member data Cisco-IOS-XR-bundlemgr-oper#lacp-bundles/bundles/bundle/members PTP and SyncE Information Metric Sensor path PTP servo status Cisco-IOS-XR-ptp-oper#ptp/platform/servo/device-status PTP servo statistics Cisco-IOS-XR-ptp-oper#ptp/platform/servo PTP foreign master information Cisco-IOS-XR-ptp-oper#ptp/interface-foreign-masters PTP interface counters, key is interface name Cisco-IOS-XR-ptp-oper#ptp/interface-packet-counters Frequency sync info Cisco-IOS-XR-freqsync-oper#frequency-synchronization/summary/frequency-summary SyncE interface information, key is interface name Cisco-IOS-XR-freqsync-oper#frequency-synchronization/interface-datas/interface-data BGP Information Metric Sensor path   BGP established neighbor count across all AF Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/vrfs/vrf/process-info/global/established-neighbors-count-total   BGP total neighbor count Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/vrfs/vrf/process-info/global/neighbors-count-total   BGP prefix SID count Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/vrfs/vrf/process-info/global/prefix-sid-label-index-count   BGP total VRF count including default VRF Cisco-IOS-XR-ipv4-bgp-oper#process-info/ipv4-bgp-oper#global/ipv4-bgp-oper#total-vrf-count   BGP convergence 
Cisco-IOS-XR-ipv4-bgp-oper#bgp/instances/instance/instance-active/default-vrf/afs/af/af-process-info/performance-statistics/global/ has-converged BGP IPv4 route count Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-ext/active-routes-count   OpenConfig BGP information openconfig-bgp#bgp   OpenConfig BGP neighbor info only openconfig-bgp#bgp/neighbors   IS-IS Information Metric Sensor path IS-IS neighbor info sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighbors IS-IS interface info sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/interfaces IS-IS adj information sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/adjacencies IS-IS neighbor summary sensor-path Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighbor-summaries IS-IS node count Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/topologies/topology/topology-levels/topology-level/topology-summary/router-node-count/reachable-node-count IS-IS adj state Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/levels/level/adjacencies/adjacency/adjacency-state IS-IS neighbor count Cisco-IOS-XR-clns-isis-oper#isis/instances/instance/neighbor-summaries/neighbor-summary/level2-neighbors/neighbor-up-count IS-IS total route count Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2/active-routes-count Routing protocol RIB information Metric Sensor path IS-IS L1 Info Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1 IS-IS L2 Info Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2 IS-IS Summary Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sum Total route count per protocol Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count IPv6 IS-IS L1 info Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l1 IPv6 IS-IS L2 info Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-l2 IPv6 IS-IS summary Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-isis-sum IPv6 total route count per protocol Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/proto-route-count BGP RIB informationIt is not recommended to monitor these paths using MDT with large tables Metric Sensor path OC BGP RIB openconfig-rib-bgp#bgp-rib IPv4 BGP RIB Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-ext IPv4 BGP RIB Cisco-IOS-XR-ip-rib-ipv4-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-int IPv6 BGP RIB Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-ext IPv6 BGP RIB Cisco-IOS-XR-ip-rib-ipv6-oper#rib/rib-table-ids/rib-table-id/summary-protos/summary-proto/rtype-bgp-int Routing policy Information Metric Sensor path Routing policy information Cisco-IOS-XR-policy-repository-oper#routing-policy/policies Ethernet CFM Metric Sensor path Ethernet CFM MA/MEP information Cisco-IOS-XR-ethernet-cfm-oper#cfm/global/maintenance-points/maintenance-point EVPN Information Metric Sensor path EVPN information Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/evpn-summary Total EVPN 
Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/evpn-summary/total-count EVPN total ES entries Cisco-IOS-XR-evpn-oper#evpn/active/summary/es-entries EVPN local Eth Auto Discovery routes Cisco-IOS-XR-evpn-oper#evpn/active/summary/local-ead-routes EVPN remote Eth Auto Discovery routes Cisco-IOS-XR-evpn-oper#evpn/active/summary/remote-ead-routes EVPN summary Cisco-IOS-XR-evpn-oper#evpn/nodes/node/summary EVPN neighbor information Cisco-IOS-XR-evpn-oper#evpn/nodes/node/evi-detail/evi-children/neighbors/neighbor EVPN EAD information Cisco-IOS-XR-evpn-oper#evpn/nodes/node/evi-detail/evi-children/ethernet-auto-discoveries/ethernet-auto-discovery Per-Interface QoS Statistics Information Metric Sensor path Input stats Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/ General QoS Stats Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/general-stats Per-queue stats Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/statistics/class-stats/queue-stats-array General service policy information, keys are policy name and interface applied Cisco-IOS-XR-qos-ma-oper#qos/interface-table/interface/input/service-policy-names Per-Policy, Per-Interface, Per-Class statisticsSee sensor path name for detailed information on data leafs Metric Sensor path Per-class matched data rate Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/match-data-rate Pre-policy Matched Bytes Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/pre-policy-matched-bytes Pre-policy Matched Packets Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/pre-policy-matched-packets Dropped bytes per class Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-drop-bytes Total dropped packets Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-drop-packets Drop rate Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-drop-rate Transmit rate Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/total-transmit-rate Per-class transmitted bytes Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/general-stats/transmit-bytes Queue current length Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/queue-instance-length/value Queue max length units Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/queue-max-length/unit Queue max length value 
Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/queue-max-length/value WRED dropped bytes Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/random-drop-bytes WRED dropped packets Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/random-drop-packets Tail dropped packets per class Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/tail-drop-bytes Tail dropped bytes per class Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/queue-stats-array/tail-drop-packets State per policy instance Cisco-IOS-XR-qos-ma-oper#qos/nodes/node/policy-map/interface-table/interface/input/service-policy-names/service-policy-instance/statistics/class-stats/shared-queue-id L2VPN Information Metric Sensor path L2VPN general forwarding information including EVPN and Bridge Domains Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary Bridge domain information Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/bridge-domain-summary Total BDs active Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/bridge-domain-summary/bridge-domain-count Total BDs using EVPN Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/bridge-domain-summary/bridge-domain-with-evpn-enabled Total MAC count (Local+remote) Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/mac-summary/mac-count L2VPN xconnect Forwarding information Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/xconnect-summary Xconnect total count Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects Xconnect down count Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects-down Xconnect up count Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects-up Xconnect unresolved Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnect-summary/number-xconnects-unresolved Xconnect with down attachment circuits Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-summary/xconnect-summary/ac-down-count-l2vpn Per-xconnect detailed information including state xconnect group and name are keys# Cisco-IOS-XR-l2vpn-oper#l2vpnv2/active/xconnects/xconnect L2VPN bridge domain specific information, will have the BD name as a key Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-bridge-domains/l2fib-bridge-domain L2VPN EVPN IPv4 MAC/IP information Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip4macs L2VPN EVPN IPv6 MAC/IP information Cisco-IOS-XR-l2vpn-oper#l2vpn-forwarding/nodes/node/l2fib-evpn-ip6macs L3VPN Information Metric Sensor path Per-VRF detailed information Cisco-IOS-XR-mpls-vpn-oper#l3vpn/vrfs/vrf SR-PCE PCC and SR Policy Information Metric Sensor path PCC to PCE peer information Cisco-IOS-XR-infra-xtc-agent-oper#pcc/peers SR policy summary info Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary Specific SR policy information Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary/configured-down-policy-count Specific SR policy information 
Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary/configured-total-policy-count Specific SR policy information Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-summary/configured-up-policy-count SR policy information, key is SR policy name Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policies/policy SR policy forwarding info including packet and byte stats per candidate path, key is policy name and candidate path Cisco-IOS-XR-infra-xtc-agent-oper#xtc/policy-forwardings MPLS performance measurement Metric Sensor path Summary info Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/summary Interface stats for delay measurements Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/summary/delay-summary/interface-delay-summary/delay-transport-counters/generic-counters Interface stats for loss measurement Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/summary/loss-summary/interface-loss-summary Parent interface oper data sensor path Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces Delay values for each probe measurement Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces/delay/interface-last-probes Delay values aggregated at computation interval Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces/delay/interface-last-aggregations Delay values aggregated at advertisement interval Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/interfaces/delay/interface-last-advertisements SR Policy measurement for delay and liveness Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/sr-policies SR Policy delay Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/sr-policies/sr-policy-delay SR Policy liveness detection Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/sr-policies/sr-policy-liveness SR Policy PM Details Cisco-IOS-XR-perf-meas-oper#performance-measurement/nodes/node/sr-policies/sr-policy-details mLDP Information Metric Sensor path mLDP LSP count Cisco-IOS-XR-mpls-ldp-mldp-oper#mpls-mldp/active/default-context/context/lsp-count mLDP peer count Cisco-IOS-XR-mpls-ldp-mldp-oper#mpls-mldp/active/default-context/context/peer-count mLDP database info, where specific LSP information is stored Cisco-IOS-XR-mpls-ldp-mldp-oper#mpls-mldp/active/default-context/databases/database ACL Information Metric Sensor path Details on ACL resource consumption Cisco-IOS-XR-ipv4-acl-oper#ipv4-acl-and-prefix-list/oor/access-list-summary/details/current-configured-ac-es OpenConfig full ACL information openconfig-acl#acl ", "url": "/blogs/latest-converged-sdn-transport-ig", "author": "Phil Bedard", "tags": "iosxr, cisco, 5G, cin, rphy, Metro, Design" } , "#": {} , "blogs-zr-openconfig-mgmt": { "title": "Managing OpenZR+ and OIF ZR transceivers on Cisco routers using OpenConfig", "content": " On This Page Revision History Routed Optical Networking Pluggable Digital Coherent Optics OIF 400ZR and OpenZR+ Standards using QSFP-DD Transceivers Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S) Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S) Cisco Hardware Support for 400G ZR/ZR+ DCO Transceivers Cisco 8000 NCS 500 Cisco ASR 9000 NCS 5500 and NCS 5700 Optical Provisioning Parameters Operational Mode Details OpenConfig OpenConfig Models for DCO provisioning Note on Operational Mode Discovery OpenConfig Platform Component Component Optical Provisioning Parameters OpticalChannel Component Example Openconfig Terminal Device Logical Channel Configuration Traditional Muxponder Use Case Pluggable in 
Router Use Case OpenConfig Provisioning Examples Standard OIF 400G ZR Example 300G Line Rate Configuration OpenConfig Monitoring Examples Using NETCONF Optical Channel Information Physical Channel Information Using gNMI gNMI GET for OpticalChannel Data gNMI Subscription for OpticalChannel Data Appendix Example XML NETCONF config for other ZR+ configuration modes ZR+ 1x400G 16QAM ZR+ 1x100G QPSK ZR+ 2x100G QPSK ZR+ 4x100G 16QAM Verification of OpenConfig in XR CLI IOS-XR CLI Operational Data Revision History Version Date Comments 1.0 10/10/2022 Initial Publication Routed Optical NetworkingRouted Optical Networking, introduced by Cisco in 2020, represents a fundamental shift in how IP+Optical networks are built. Collapsing previously disparate network layers and services into a single unified domain, Routed Optical Networking simplifies operations and lowers overall network TCO. More information on Routed Optical Networking can be found at the following locations: https://www.cisco.com/c/en/us/solutions/service-provider/routed-optical-networking.html https://xrdocs.io/latest-routed-optical-networking-hld In this blog we will discuss one major component of Routed Optical Networking, the pluggable digital coherent optics, and how they are managed using open models from the OpenConfig consortium. Management includes both provisioning the transceivers as well as monitoring them via telemetry. OpenConfig support is found in IOS-XR 7.7.1 or later across all IOS-XR routers supporting ZR/ZR+ DCO transceivers. As you will see, the optical provisioning is distinct from the IP interface configuration and can be configured independently.We will focus primarily on constructs such as OpenConfig YANG models and provisioning via NETCONF or gNMI. For users looking for a more UI-driven approach to managing Routed Optical Networking services, the Crosswork Hierarchical Controller application provides a point-and-click user interface while still using open models to interface with Cisco routers. More information on the Crosswork family of products can be found at https://www.cisco.com/c/en/us/products/cloud-systems-management/crosswork-network-automation/index.html Pluggable Digital Coherent OpticsOne of the foundations of Routed Optical Networking is the use of small form factor pluggable digital coherent optics. These optics can be used in a wide variety of network applications, reducing CapEx/OpEx cost and reducing complexity vs. using traditional external transponder equipment.OIF 400ZR and OpenZR+ Standards using QSFP-DD TransceiversThe networking industry saw an opportunity to improve network efficiency by shifting coherent DWDM functions to router pluggables. Technology advancements have shrunk the DCO components into the standard QSFP-DD form factor, meaning no specialized hardware and the ability to use the highest capacity routers available today. ZR/OpenZR+ QSFP-DD optics can be used in the same ports as the highest speed 400G non-DCO transceivers.Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S)Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S)Two industry optical standards have emerged to cover a variety of use cases. The OIF created the 400ZR specification, https://www.oiforum.com/technical-work/hot-topics/400zr-2, as a 400G interoperable standard for metro-reach coherent optics. The industry saw the benefit of the approach, but wanted to cover longer distances and have flexibility in wavelength rates, so the OpenZR+ MSA was created: https://www.openzrplus.org. The following table outlines the specs of each standard.
ZR400 and OpenZR+ transceivers are tunable across the ITU C-Band, 196.1 to 191.3 THz. The following part numbers are used for Cisco’s ZR400 and OpenZR+ MSA transceivers: Standard Part 400ZR QDD-400G-ZR-S OpenZR+ QDD-400G-ZRP-S The Cisco datasheet for these transceivers can be found at https://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/datasheet-c78-744377.html Cisco Hardware Support for 400G ZR/ZR+ DCO TransceiversCisco supports the OpenZR+ and OIF ZR transceivers across all IOS-XR product lines with 400G QSFP-DD ports, including the ASR 9000, NCS 540, NCS 5500, NCS 5700, and Cisco 8000. Please see the Routed Optical Networking Design or the individual product pages below for more information on each platform.Cisco 8000 https://www.cisco.com/c/en/us/products/collateral/routers/8000-series-routers/datasheet-c78-742571.html NCS 500 https://www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-500-series-routers/ncs-540-large-density-router-ds.html Cisco ASR 9000 https://www.cisco.com/c/en/us/products/collateral/routers/asr-9000-series-aggregation-services-routers/data_sheet_c78-501767.html NCS 5500 and NCS 5700 https://www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-736270.html https://www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-744698.html Optical Provisioning ParametersOptical transceivers are responsible for taking information on their electrical “host” interface and translating it into a format suitable for transmission across an analog medium, and vice versa; thus the name “transceiver”. The aforementioned standards bodies have defined the electrical host interface and optical line interface specifications. The resulting configuration of those internal transceiver interfaces and parameters is driven by user configuration. The following represents the user-configurable attributes for Cisco ZR/ZR+ DCO transceivers. Parameter Units Meaning Output Frequency Hz Frequency is another method to define the DWDM wavelength being used on the line side of the transceiver Transmit Power dBm The transmit power defines the signal power level. dBm is the power ratio in dB referenced to 1mW using the expression dBm = 10log(mW). As an example 0dBm = 1mW, -3dBm = 0.50mW, +3dBm = 2mW Line Rate Gbps This is the output trunk rate of the signal, and may be determined by configuration or implicitly by the number of channels assigned Operational Mode Integer The operational mode is an integer representing optical parameters specific to the transceiver. This includes settings such as the line rate, modulation, FEC type, and other vendor-specific settings. The Frequency, Line Rate, and Operational Mode are required components. The Transmit Power is optional; a default power will be used based on the operational mode if none is supplied.Operational Mode DetailsIt’s worth expanding on the role of the “Operational Mode” used in provisioning the transceivers. Cisco has defined a set of integer values used to provision the QDD-400G-ZRP-S and QDD-400G-ZR-S optics based on standard parameters and Cisco Acacia-specific parameters. The following table lists these modes.
PID Operational Mode Line Rate (Gbps) FEC Type Modulation Baud Rate (GBd) Pulse Shaping QDD-400G-ZR-S 5003 400 cFEC 16QAM 59.84 No QDD-400G-ZRP-S 5004 400 cFEC 16QAM 59.84 No QDD-400G-ZRP-S 5005 400 oFEC 16QAM 60.14 Yes QDD-400G-ZRP-S 5006 400 oFEC 16QAM 60.14 No QDD-400G-ZRP-S 5007 300 oFEC 8QAM 60.14 Yes QDD-400G-ZRP-S 5008 300 oFEC 8QAM 60.14 No QDD-400G-ZRP-S 5009 200 oFEC QPSK 60.14 Yes QDD-400G-ZRP-S 5010 200 oFEC QPSK 60.14 No QDD-400G-ZRP-S 5011 200 oFEC 8QAM 40.10 Yes QDD-400G-ZRP-S 5012 200 oFEC 16QAM 30.08 Yes QDD-400G-ZRP-S 5013 100 oFEC QPSK 30.08 No OpenConfigTaken from https://www.openconfig.net: OpenConfig defines and implements a common, vendor-independent software layer for managing network devices. OpenConfig operates as an open source project with contributions from network operators, equipment vendors, and the wider community. OpenConfig is led by an Operator Working Group consisting of network operators from multiple segments of the industry.OpenConfig is advancing the paradigm of an abstract set of YANG models used to perform device configuration and monitoring regardless of vendor. Cisco has worked with the OpenConfig consortium since its inception to implement these open community models across IOS-XR, IOS-XE, and NX-OS devices. In IOS-XR 7.7.1, more than 100 OpenConfig models and sub-models are implemented covering a wide variety of network configuration including device management, routing protocols, and optical transceiver configuration. We will focus on the models specific to configuring the ZR/ZR+ DCO transceivers. The official repository for all OpenConfig models can be found at https://github.com/openconfig/public/ OpenConfig Models for DCO provisioningThis list includes only the parent models utilized and does not include imported models. Model Use openconfig-terminal-device Primary model used to configure input interface to output line port structure and add optical parameters to oc-platform openconfig-platform Used to provision optical channel parameters and for monitoring optical channel state openconfig-platform-transceiver Used for monitoring physical channel state data such as RX/TX power, and output frequency Note on Operational Mode DiscoveryThere is new work in OpenConfig to enable the discovery of the operational modes dynamically from the device/transceiver. As of this writing it is still a relatively new concept and has not been implemented in IOS-XR. This is implemented through the openconfig-terminal-device-properties model. Once implemented, a management application can learn the supported optical parameters and constraints to be used in path calculation and provisioning.OpenConfig Platform ComponentThe optical parameters used to provision the parent optical-channel and subsequent physical channel are applied at the component level of the openconfig-platform model. The OpticalChannel component type is a logical component with a 1:1 correlation to a physical port. In Cisco routers, the OpticalChannel component is populated when a transceiver capable of supporting it is inserted. The OpticalChannel will always be represented as [Rack]/[Slot]-OpticalChannel[Rack]/[Slot]/[Instance]/[Port]. The rack component will always be 0. As an example, on the 8201-32FH the OpticalChannel for port 20 is represented as 0/0-OpticalChannel0/0/0/20. On the NCS-57C3-MOD router with a QSFP-DD MPA in MPA slot 3 and DCO transceiver in Port 3 the OpticalChannel is 0/0-OpticalChannel0/0/3/2.
On a Cisco 9904 modular router with a A9K-8HG-FLEX-TRline card in slot 1 and DCO transceiver in port 0, the OpticalChannel is0/1-OpticalChannel-0/1/0/0.Component Optical Provisioning Parameters Parameter Units frequency Mhz target-output-power dBm to two decimal places expressed in increments of .01dBm (+1dBM=100) operational-mode Integer OpticalChannel Component ExampleQDD-400G-ZRP-S in port 0/0/0/10 on Cisco 8201<components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/10</name> <config> <name>0/0-OpticalChannel0/0/0/10</name> </config> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-10.00</target-output-power> <frequency>196100000</frequency> <operational-mode>5005</operational-mode> </config> </optical-channel> </component></components>Openconfig Terminal DeviceIn the context of optical device provisioning, one OpenConfig model used is theTerminal Device model. The original intent of the model was to provisionexternal optical transponders, and has been implemented by Cisco for use withthe Cisco NCS 1004 muxponder. The model has been recently enhanced tocover the router pluggable DCO use cases where the “clients” are not physicalexternal facing ports, but internal to the host router and always associatedwith a single external line facing interface. The Terminal Device modelaugments the Platform model to add the additional optical provisioningconfiguration parameters to the OpticalChannel component type.Logical Channel ConfigurationThe logical channel has several configuration components, which will be the same across all similar configurations.Each logical-channel created must be assigned an integer value. It is up to the user to determine the best overall values to use, but the values should not overlap between configuration on two different ports.Traditional Muxponder Use CaseA traditional muxponder maps client physical interfaces to framed outputtimeslots, which can then be further aggregated or mapped to a physical outputchannel on the DWDM line side. There is no connection between the client portand line port until the mapping is created. The Terminal Device model followsthis structure by using a hierarchical structure of channels from client toeventually output line port. Physical client channels are mapped to intermediatelogical channels, which are ultimately mapped to a physical line output channel.The model is flexible based on the multiplexing/aggregation required.The example below shows the mapping for a 2x100G muxponder application where thetwo client ports each map to a 100G logical channel, those map to a 200G logicalchannel, and ultimately to a 200G line port associated with the output opticalchannel. Note the numbers assigned to the logical channels are arbitraryintegers.Pluggable in Router Use CaseIn the case where a pluggable coherent optic is inserted into a router, thehierarchical model can be simplified. In the traditional muxponder use caseabove, there is a physical client transceiver with its own properties which mustbe mapped into an intermediate logical channel. In the case of a routerpluggable, there is no physical client component, only the logical componentsassociated with the host side of the DCO transceiver. 
In Cisco routers, it isrepresented as one more Ethernet interfaces depending on the configuration.Looking at a picture of the two optics is helpful in showing how the hierarchical structure is configured for the DCO optics.The example below shows a similar 200G application, but instead of two clientphysical ports, there are two HundredGigE interfaces created which areimplicitly connected to the line port since they are integrated into the sametransceiver. This is a fundamental difference from the muxponder use case wherethere is no implicit mapping between client and output port. The host side Ethernet interfaces of the DCO cannot be mapped to another line port.Note this example is only possible with the OpenZR+ transceiver since itsupports line rates of 100G, 200G, 300G, and 400G where the OIF 400ZR onlysupports 400G.OpenConfig Provisioning ExamplesThe following examples are used to illustrate the complete provisioning payloads used. The payloads are given in XML for use with NETCONF, supported by all IOS-XR routers. We will go through two examples in detail and then the rest for the standard modes will provided in the appendix.Standard OIF 400G ZR ExampleOIF 400ZR transceivers, Cisco PID QDD-400G-ZR-S, can be configured in either 1x400G or 4x100G mode. In this example we will show the 1x400G mode, which is the most common configuration. The details of the configuration are# Router Type Port Used Operational Mode Frequency TX Power 8201-32FH 0/0/0/20 5003 194300000 -100 Note the 5003 operational mode code which can be expanded as# PID Rate FEC Modulation Baud Rate Pulse Shaping QDD-400G-ZR-S 400G cFEC 16QAM 60.14 No <config> <terminal-device xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>100</index> <config> <index>100</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_400G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_400GE</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>400</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>200</index> <config> <index>200</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>400</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments> </channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-100</target-output-power> <operational-mode>5003</operational-mode> <frequency>194300000</frequency> </config> </optical-channel> </component> 
</components> </config>Let’s examine some specific portions of the config in more detail. <logical-channels> <channel> <index>100</index> <config> <index>100</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_400G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_400GE</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config>Here we create the first logical channel, associated with the host Ethernet interface, FourHundredGigE0/0/0/20. Since our application has a single 400G interface, the following is configured. The types following the idx# YANG component are defined in openconfig-transport-types.yang. User-Defined Index Tributary Rate Class Tributary Protocol Channel Type 100 400G 400GE ETHERNET Next we must map this logical-channel to either a parent logical-channel or output OpticalChannel. Cisco uses a specific “CoherentDSP” interface to represent the framing layer of the DCO transceiver, so there is a parent logical channel representing that layer of the connection. In this case I have a single 400G child interface, so all 400G is mapped to the parent logical channel.<logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>400</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment></logical-channel-assignments></channel>Next we must define the parent logical-channel associated with the internal interface CoherentDSP0/0/0/20, and map that to the output OpticalChannel associated with the physical port. The rate is configured as 400 to represent 400G.<channel> <index>200</index> <config> <index>200</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>400</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments></channel>We will use the PROT_OTN encapsulation type for the channel, even though it’s not technically a traditional G.709 OTN frame. User-Defined Index Channel Type 200 PROT_OTN This completes the configuration of the mappings between logical Ethernet and physical output port. Now we must configure the optical parameters using the openconfig-platform model.<components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-100</target-output-power> <operational-mode>5003</operational-mode> <frequency>194300000</frequency> </config> </optical-channel> </component></components>As you can see the configuration is relatively straightforward, applying the target-output-power, operational-mode, and frequency configuration.300G Line Rate ConfigurationWhen we configure ZR+ optics in a 300G line rate configuration, we must map individual 100G channels and Ethernet interfaces to a parent 300G container. 
Thereis no 300G Ethernet interface type defined, and based on how modern router NPUs are designed they are not typically well suited for creating intermediate containers of arbitrary size. The same is true of the 200G line rate as well.Parameters of configuration are# Router Type Port Used Operational Mode Frequency TX Power 8201-32FH 0/0/0/20 5007 195200000 Default Note the 5007 operational mode code which can be expanded as# PID Rate FEC Modulation Baud Rate Pulse Shaping QDD-400G-ZRP-S 300G oFEC 8QAM 60.14 Yes The full XML payload is#<config> <terminal-device xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>100</index> <config> <index>200</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>101</index> <config> <index>101</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>102</index> <config> <index>102</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>200</index> <config> <index>200</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>300</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> 
<optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments> </channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <operational-mode>5007</operational-mode> <frequency>195200000</frequency> </config> </optical-channel> </component> </components> </config>First we will inspect the channel configuration for the child logical channel associated with the router Ethernet interface HundredGigE0/0/0/20/0. Inspecting the first one we see the following#<channel> <index>201</index> <config> <index>201</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>100</logical-channel> </config> </assignment> </logical-channel-assignments></channel>Here are the attributes used for the logical channel configuration# User-Defined Index Tributary Rate Class Tributary Protocol Channel Type     100 100G 100G_MLG ETHERNET   101 100G 100G_MLG ETHERNET   102 100G 100G_MLG ETHERNET Note the Tributary Protocol type is now 100G_MLG. MLG stands for Multi-Link Group meaning this logical-channel is part of a larger MLG. The logical channel is still mapped to the upstream CoherentDSP0/0/0/20 logical-channel representing the channel responsible for multiplexing the child signals into a single output frame. Details on how this is done in OpenZR+ can be found in the OpenZR+ specifications athttps#//openzrplus.org. 
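Since the 100G child channels in the payload differ only in their index values (the index numbers themselves are arbitrary, which is why the walkthrough snippets and the full payload use different values for the same structure), the stanzas lend themselves to simple templating. The following Python sketch is not part of the original workflow; it just generates the repetitive child-channel XML for an N x 100G configuration under assumed, hypothetical index choices.

# Minimal sketch (assumed helper, not from the original post): generate the
# repetitive 100G child logical-channel stanzas for an N x 100G line rate.
# Index values are arbitrary user choices and only need to be unique per port.
CHILD_TEMPLATE = """<channel>
  <index>{idx}</index>
  <config>
    <index>{idx}</index>
    <rate-class xmlns:idx="http://openconfig.net/yang/transport-types">idx:TRIB_RATE_100G</rate-class>
    <admin-state>ENABLED</admin-state>
    <description>ETH Logical Channel</description>
    <trib-protocol xmlns:idx="http://openconfig.net/yang/transport-types">idx:PROT_100G_MLG</trib-protocol>
    <logical-channel-type xmlns:idx="http://openconfig.net/yang/transport-types">idx:PROT_ETHERNET</logical-channel-type>
  </config>
  <logical-channel-assignments>
    <assignment>
      <index>1</index>
      <config>
        <index>1</index>
        <allocation>100</allocation>
        <assignment-type>LOGICAL_CHANNEL</assignment-type>
        <description>ETH to Coherent assignment</description>
        <logical-channel>{parent}</logical-channel>
      </config>
    </assignment>
  </logical-channel-assignments>
</channel>"""

def child_channels(first_index, count, parent):
    """Return XML for 'count' 100G child channels, all assigned to 'parent'."""
    return "\n".join(
        CHILD_TEMPLATE.format(idx=first_index + i, parent=parent) for i in range(count)
    )

# Three 100G children feeding one coherent logical channel gives a 300G line rate.
print(child_channels(100, 3, 200))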
When IOS-XR receives the payload it will use the structured channel assignment information to properly allocate the child Ethernet interfaces.The second logical channel associated with HundredGigE0/0/0/20/1 is similar with the only difference being the index of 101 instead of 100.<channel> <index>202</index> <config> <index>202</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>100</logical-channel> </config> </assignment> </logical-channel-assignments></channel>The Coherent DSP level logical-channel configuration is similar to the first example, except the allocation is now configured as 300 instead of 400 to reflect the 300G line rate.<channel> <index>100</index> <config> <index>100</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>300</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments></channel>The OpticalChannel configuration is also similar with the exception the target-output-power setting has been omitted. In this case the device default power of -1000 (-10dBM) will be used.<components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <operational-mode>5007</operational-mode> <frequency>195200000</frequency> </config> </optical-channel> </component></components>OpenConfig Monitoring ExamplesThe optics may also be monitored using the same OpenConfig models used for provisioning, as in OpenConfig models both config and state have leafs in the same model. 
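As a small illustration of that config/state split (an added sketch, not from the original post), the same frequency leaf on the optical channel used throughout this blog can be addressed under either container; which one you read determines whether you get the provisioned intent or the operational value.

# Hypothetical path constants showing the OpenConfig config/state pairing for
# the optical-channel component used in this blog: /config carries intent,
# /state carries operationally reported data for the same leaf names.
BASE = "openconfig-platform:components/component[name=0/0-OpticalChannel0/0/0/8]/optical-channel"
PROVISIONED_FREQUENCY = BASE + "/config/frequency"  # value pushed at provisioning time
OPERATIONAL_FREQUENCY = BASE + "/state/frequency"   # value reported by the transceiver
print(PROVISIONED_FREQUENCY)
print(OPERATIONAL_FREQUENCY)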
We will look at two methods for retrieving operational state data, using a NETCONF GET and using GNMi which can be used in different ways to retrieve operational state data.Using NETCONFOptical Channel InformationRequest from openconfig-platform for OpticalChannel 0/0-OpticalChannel0/0/0/8 associated with port 0/0/0/8.<get xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <filter> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name/> <name>0/0-OpticalChannel0/0/0/8</name> </component> </components> </filter></get>Response<component> <name>0/0-OpticalChannel0/0/0/1</name> <config> <name>0/0-OpticalChannel0/0/0/1</name> </config> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <frequency>1600</frequency> <target-output-power>0.00</target-output-power> </config> <state> <target-output-power>0.00</target-output-power> </state> <extended xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-openconfig-terminal-device-ext~> <state> <optics-cd-low-threshold>0</optics-cd-low-threshold> <optics-cd-high-threshold>0</optics-cd-high-threshold> </state> </extended> </optical-channel> </component> <component> <name>0/0-OpticalChannel0/0/0/8</name> <config> <name>0/0-OpticalChannel0/0/0/8</name> </config> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-115</target-output-power> <frequency>193700000</frequency> <operational-mode>5005</operational-mode> </config> <state> <target-output-power>-10.64</target-output-power> <frequency>193700000</frequency> <chromatic-dispersion> <instant>-2</instant> <interval>30000000000</interval> <min>-4</min> <avg>-2</avg> <max>0</max> <min-time>1664642812995785263</min-time> <max-time>1664642792995781327</max-time> </chromatic-dispersion> <second-order-polarization-mode-dispersion> <instant>37.00</instant> <interval>30000000000</interval> <min>35.00</min> <avg>39.00</avg> <max>42.00</max> <min-time>1664642817995790775</min-time> <max-time>1664642802995783064</max-time> </second-order-polarization-mode-dispersion> <polarization-dependent-loss> <instant>1.10</instant> <interval>30000000000</interval> <min>1.10</min> <avg>1.12</avg> <max>1.20</max> <min-time>1664642790996011456</min-time> <max-time>1664642812995785263</max-time> </polarization-dependent-loss> <operational-mode>5005</operational-mode> </state> <extended xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-openconfig-terminal-device-ext~> <state> <optics-cd-min>-13000</optics-cd-min> <optics-cd-max>13000</optics-cd-max> <optics-cd-low-threshold>-160000</optics-cd-low-threshold> <optics-cd-high-threshold>160000</optics-cd-high-threshold> </state> </extended> </optical-channel> </component>Physical Channel InformationAdditional data from physical channel located as part of the openconfig-platform transceiver data. This is associated with the physical optics port referenced by 0/0-Optics0/0/0/8. 
ZR/ZR+ optics will always have a single physical channel.Request <get xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~> <filter> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name/> <name>0/0-Optics0/0/0/8</name> </component> </components> </filter></get>Response <component> <name>0/0-Optics0/0/0/8</name> <transceiver xmlns=~http#//openconfig.net/yang/platform/transceiver~> <physical-channels> <channel> <index>1</index> <config> <index>1</index> </config> <state> <index>1</index> <laser-bias-current> <instant>65.67</instant> <interval>30000000000</interval> <min>0.07</min> <avg>0.07</avg> <max>0.07</max> <min-time>1664642790996011456</min-time> <max-time>1664642790996011456</max-time> </laser-bias-current> <output-power> <instant>-10.64</instant> <interval>30000000000</interval> <min>-10.76</min> <avg>-10.74</avg> <max>-10.69</max> <min-time>1664642790996011456</min-time> <max-time>1664642817995790775</max-time> </output-power> <input-power> <instant>-6.25</instant> <interval>30000000000</interval> <min>-6.34</min> <avg>-6.27</avg> <max>-6.21</max> <min-time>1664642812995785263</min-time> <max-time>1664642790996011456</max-time> </input-power> <output-frequency>193700000</output-frequency> </state> </channel> </physical-channels> <state> <present>PRESENT</present> <form-factor xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#OTHER</form-factor> <date-code>2021-01-09T00#00#00Z+00#00</date-code> <vendor-rev>01</vendor-rev> <serial-no>ACA2501003X</serial-no> <vendor-part>DP04QSDD-E30-19E</vendor-part> <vendor>CISCO-ACACIA</vendor> <connector-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#LC_CONNECTOR</connector-type> <otn-compliance-code xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#OTN_UNDEFINED</otn-compliance-code> <sonet-sdh-compliance-code xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#SONET_UNDEFINED</sonet-sdh-compliance-code> <fault-condition>false</fault-condition> </state> </transceiver> </component>Using gNMIgNMI represents a modern method to manage configuration as well as retrievestate data. gNMI data can be retrieved using different methods including asingle GET request or through a subscription. The subscription type can be oftypes ONCE, STREAM, or POLL. The subsequent mode of the stream can be be SAMPLE,ON_CHANGE, or TARGET_DEFINED. The subscription type and mode commonly used forcontinuous monitoring is STREAM and SAMPLE. SAMPLE also includes a period valuespecifying the interval at which the device sends data. 
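The examples below use the gnmic CLI, but the same GET can also be issued from a script. The short sketch that follows is an added illustration, assuming the open-source pygnmi client library and reusing the lab address, port, and credentials from the gnmic examples; it is not part of the original post.

# Minimal sketch, assuming the pygnmi library (pip install pygnmi): a gNMI GET
# against the same OpticalChannel component queried with gnmic below.
from pygnmi.client import gNMIclient

TARGET = ("172.29.11.20", 57733)  # router gNMI endpoint from the gnmic examples
PATH = "openconfig-platform:components/component[name=0/0-OpticalChannel0/0/0/8]"

with gNMIclient(target=TARGET, username="admin", password="password", insecure=True) as gc:
    reply = gc.get(path=[PATH], encoding="json_ietf")
    # The reply mirrors the gnmic JSON output: notification -> update -> val.
    for notification in reply.get("notification", []):
        for update in notification.get("update", []):
            print(update["path"], update["val"])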
Note since we are using the same models, the data will be identical to the NETCONF example.We will utilize the gNMIc utility found at https#//gnmic.kmrd.dev/ for gNMI examples.gNMI GET for OpticalChannel DataRequestgnmic -a 172.29.11.20#57733 -u admin -p password --insecure --timeout 1m --encoding JSON_IETF get --path 'openconfig-platform#components/component[name='0/0-OpticalChannel0/0/0/8']'Response[ { ~source~# ~172.29.11.20#57733~, ~timestamp~# 1664644343717885105, ~time~# ~2022-10-01T13#12#23.717885105-04#00~, ~updates~# [ { ~Path~# ~openconfig-platform#components/component[name=0/0-OpticalChannel0/0/0/8]~, ~values~# { ~components/component~# { ~config~# { ~name~# ~0/0-OpticalChannel0/0/0/8~ }, ~openconfig-terminal-device#optical-channel~# { ~Cisco-IOS-XR-openconfig-terminal-device-ext#extended~# { ~state~# { ~optics-cd-high-threshold~# 160000, ~optics-cd-low-threshold~# -160000, ~optics-cd-max~# 13000, ~optics-cd-min~# -13000 } }, ~config~# { ~frequency~# 193700000, ~operational-mode~# 5005, ~target-output-power~# ~-115~ }, ~state~# { ~chromatic-dispersion~# { ~avg~# ~-1~, ~instant~# ~-2~, ~interval~# ~30000000000~, ~max~# ~0~, ~max-time~# ~1664644292995782101~, ~min~# ~-2~, ~min-time~# ~1664644307995794039~ }, ~frequency~# 193700000, ~operational-mode~# 5005, ~polarization-dependent-loss~# { ~avg~# ~1.09~, ~instant~# ~1.10~, ~interval~# ~30000000000~, ~max~# ~1.10~, ~max-time~# ~1664644290995999741~, ~min~# ~1.00~, ~min-time~# ~1664644317995786022~ }, ~second-order-polarization-mode-dispersion~# { ~avg~# ~42.37~, ~instant~# ~51.00~, ~interval~# ~30000000000~, ~max~# ~52.00~, ~max-time~# ~1664644297995784225~, ~min~# ~39.00~, ~min-time~# ~1664644307995794039~ }, ~target-output-power~# ~-10.75~ } } } } } ] }]gNMI Subscription for OpticalChannel Datagnmic -a 172.29.11.20#57733 -u cisco -p cisco --insecure --timeout 1h --encoding JSON_IETF subscribe --path 'openconfig-platform#components/component[name='0/0-OpticalChannel0/0/0/8']' --mode stream --stream-mode sample --sample-interval 30s [ { ~source~# ~172.29.11.20#57733~, ~subscription-name~# ~default-1664645644~, ~timestamp~# 1664645651219000000, ~time~# ~2022-10-01T13#34#11.219-04#00~, ~prefix~# ~openconfig-platform#~, ~updates~# [ { ~Path~# ~openconfig-platform#components/component[name=0/0-OpticalChannel0/0/0/8]~, ~values~# { ~components/component~# { ~config~# { ~name~# ~0/0-OpticalChannel0/0/0/8~ }, ~openconfig-terminal-device#optical-channel~# { ~Cisco-IOS-XR-openconfig-terminal-device-ext#extended~# { ~state~# { ~optics-cd-high-threshold~# 160000, ~optics-cd-low-threshold~# -160000, ~optics-cd-max~# 13000, ~optics-cd-min~# -13000 } }, ~config~# { ~frequency~# 193700000, ~operational-mode~# 5005, ~target-output-power~# ~-115~ }, ~state~# { ~chromatic-dispersion~# { ~avg~# ~-1~, ~instant~# ~-2~, ~interval~# ~30000000000~, ~max~# ~0~, ~max-time~# ~1664644292995782101~, ~min~# ~-2~, ~min-time~# ~1664644307995794039~ }, ~frequency~# 193700000, ~operational-mode~# 5005, ~polarization-dependent-loss~# { ~avg~# ~1.09~, ~instant~# ~1.10~, ~interval~# ~30000000000~, ~max~# ~1.10~, ~max-time~# ~1664644290995999741~, ~min~# ~1.00~, ~min-time~# ~1664644317995786022~ }, ~second-order-polarization-mode-dispersion~# { ~avg~# ~42.37~, ~instant~# ~51.00~, ~interval~# ~30000000000~, ~max~# ~52.00~, ~max-time~# ~1664644297995784225~, ~min~# ~39.00~, ~min-time~# ~1664644307995794039~ }, ~target-output-power~# ~-10.75~ } } } } } ] }]AppendixExample XML NETCONF config for other ZR+ configuration modesZR+ 1x400G 16QAM<config> <terminal-device 
xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>30001</index> <config> <index>30001</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_400G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_400GE</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>400</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30000</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30000</index> <config> <index>30000</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>400</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments> </channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-115</target-output-power> <operational-mode>5005</operational-mode> <frequency>194300000</frequency> </config> </optical-channel> </component> </components> </config>ZR+ 1x100G QPSK <config> <terminal-device xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>30001</index> <config> <index>30001</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30000</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30000</index> <config> <index>30000</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments> </channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> 
<name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-115</target-output-power> <operational-mode>5013</operational-mode> </config> </optical-channel> </component> </components> </config>ZR+ 2x100G QPSK<config> <terminal-device xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>30012</index> <config> <index>30012</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30010</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30013</index> <config> <index>30013</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30010</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30010</index> <config> <index>30010</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>200</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments> </channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-100</target-output-power> <operational-mode>5009</operational-mode> <frequency>191300000</frequency> </config> </optical-channel> </component> </components> </config>ZR+ 4x100G 16QAMThis mode does not have widespread applicability in routing applications but is included for completeness.<config> <terminal-device xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>30009</index> <config> <index>30009</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol 
xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30013</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30010</index> <config> <index>30010</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30013</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30011</index> <config> <index>30011</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30013</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30012</index> <config> <index>30012</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>30013</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>30013</index> <config> <index>30013</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>400</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> 
</logical-channel-assignments> </channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <target-output-power>-115</target-output-power> <operational-mode>5005</operational-mode> <frequency>191300000</frequency> </config> </optical-channel> </component> </components> </config>Verification of OpenConfig in XR CLIThe IOS-XR CLI does contain configuration commands to either configure or verify OpenConfig configuration. The example below is for a 300G line rate application.terminal-device logical-channel 30000 admin-state enable description Coherent Logical Channel logical-channel-type Otn assignment-id 1 allocation 300 assignment-type optical description Coherent to optical assignment assigned-optical-channel 0_0-OpticalChannel0_0_0_8 ! ! logical-channel 30001 rate-class 100G admin-state enable description ETH Logical Channel trib-protocol 400GE logical-channel-type Ethernet assignment-id 1 allocation 400 assignment-type logical description ETH to Coherent assignment assigned-logical-channel 30000 ! ! logical-channel 30002 rate-class 100G admin-state enable description ETH Logical Channel trib-protocol 100G-MLG logical-channel-type Ethernet assignment-id 1 allocation 100 assignment-type logical description ETH to Coherent assignment assigned-logical-channel 30000 ! ! logical-channel 30003 rate-class 100G admin-state enable description ETH Logical Channel trib-protocol 100G-MLG logical-channel-type Ethernet assignment-id 1 allocation 100 assignment-type logical description ETH to Coherent assignment assigned-logical-channel 30000 ! ! optical-channel 0_0-OpticalChannel0_0_0_8 power -115 frequency 194300000 line-port Optics0/0/0/8 operational-mode 5007 !!IOS-XR CLI Operational DataThe main commands used to monitor optical information for the ZR/ZR+ optics is theshow controller optics and show controller coherentdsp commands.Example for QDD-400G-ZR-SRP/0/RP0/CPU0#ron-8201-1#show controllers optics 0/0/0/20Thu Oct 6 14#31#25.413 PDT Controller State# Down Transport Admin State# In Service Laser State# On LED State# Yellow FEC State# FEC ENABLED Optics Status Optics Type# QSFPDD 400G ZR DWDM carrier Info# C BAND, MSA ITU Channel=61, Frequency=193.10THz, Wavelength=1552.524nm Alarm Status# ------------- Detected Alarms# None LOS/LOL/Fault Status# Alarm Statistics# ------------- HIGH-RX-PWR = 0 LOW-RX-PWR = 0 HIGH-TX-PWR = 0 LOW-TX-PWR = 5 HIGH-LBC = 0 HIGH-DGD = 0 OOR-CD = 0 OSNR = 9 WVL-OOL = 0 MEA = 0 IMPROPER-REM = 0 TX-POWER-PROV-MISMATCH = 0 Laser Bias Current = 52.5 mA Actual TX Power = -10.06 dBm RX Power = -40.00 dBm RX Signal Power = -40.00 dBm Frequency Offset = 0 MHz Laser Temperature = 40.40 Celsius Laser Age = 0 % DAC Rate = 1x1 Performance Monitoring# Enable THRESHOLD VALUES ---------------- Parameter High Alarm Low Alarm High Warning Low Warning ------------------------ ---------- --------- ------------ ----------- Rx Power Threshold(dBm) 13.0 -23.0 10.0 -21.0 Rx Power Threshold(mW) 19.9 0.0 10.0 0.0 Tx Power Threshold(dBm) 0.0 -18.0 -2.0 -16.0 Tx Power Threshold(mW) 1.0 0.0 0.6 0.0 LBC Threshold(mA) 0.00 0.00 0.00 0.00 Temp. 
Threshold(celsius) 80.00 -5.00 75.00 15.00 Voltage Threshold(volt) 3.46 3.13 3.43 3.16 LBC High Threshold = 98 % Configured Tx Power = -10.00 dBm Configured Tx Power(mW) = 0.10 mW Configured CD High Threshold = 160000 ps/nm Configured CD lower Threshold = -160000 ps/nm Configured OSNR lower Threshold = 9.00 dB Configured DGD Higher Threshold = 80.00 ps Baud Rate = 59.8437500000 GBd Modulation Type# 16QAM Chromatic Dispersion 0 ps/nm Configured CD-MIN -2400 ps/nm CD-MAX 2400 ps/nm Second Order Polarization Mode Dispersion = 0.00 ps^2 Optical Signal to Noise Ratio = 0.00 dB SNR = 0.00 dB Polarization Dependent Loss = 0.00 dB Polarization Change Rate = 0.00 rad/s Differential Group Delay = 0.00 ps Temperature = 42.00 Celsius Voltage = 3.34 V Transceiver Vendor Details Form Factor # QSFP-DD Optics type # QSFPDD 400G ZR Name # CISCO-ACACIA OUI Number # 7c.b2.5c Part Number # DP04QSDD-E20-19E Rev Number # 10 Serial Number # ACA245100ET PID # QDD-400G-ZR-S VID # ES03 Firmware Version # Major.Minor.Build Active # 61.20.13 Inactive # 61.10.12 Date Code(yy/mm/dd) # 20/12/28 Fiber Connector Type# LC Otn Application Code# Undefined Sonet Application Code# UndefinedRP/0/RP0/CPU0#ron-8201-1#show controllers coherentDSP 0/0/0/10Thu Oct 6 14#31#55.222 PDTPort # CoherentDSP 0/0/0/10Controller State # DownInherited Secondary State # NormalConfigured Secondary State # NormalDerived State # In ServiceLoopback mode # NoneBER Thresholds # SF = 1.0E-5 SD = 1.0E-7Performance Monitoring # EnableBandwidth # 400.0Gb/sAlarm Information#LOS = 1 LOF = 0 LOM = 0OOF = 0 OOM = 0 AIS = 0IAE = 0 BIAE = 0 SF_BER = 0SD_BER = 0 BDI = 0 TIM = 0FECMISMATCH = 0 FEC-UNC = 0 FLEXO_GIDM = 0FLEXO-MM = 0 FLEXO-LOM = 0 FLEXO-RDI = 0FLEXO-LOF = 0Detected Alarms # LOSBit Error Rate InformationPREFEC BER # 5.0E-01POSTFEC BER # 0.0E+00Q-Factor # 0.00 dBQ-Margin # 0.00dBOTU TTI ReceivedFEC mode # C_FEC", "url": "/blogs/zr-openconfig-mgmt", "author": "Phil Bedard", "tags": "iosxr, design, optical, ron, routing, sdn, controller" } , "blogs-2022-12-01-cst-routed-optical-2-0": { "title": "Cisco Routed Optical Networking", "content": " On This Page PDF Download Revision History Solution Component Software Versions What is Routed Optical Networking? 
Key Drivers Changing Networks Network Complexity Inefficiences Between Network Layers Operational Complexity Network Cost Routed Optical Networking Solution Overview Today’s Complex Multi-Layer Network Infrastructure DWDM OTN Ethernet/IP Enabling Technologies Pluggable Digital Coherent Optics QSFP-DD and 400ZR and OpenZR+ Standards Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S) Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S) Cisco Routers Cisco Private Line Emulation Circuit Style Segment Routing Cisco DWDM Network Hardware Routed Optical Networking Network Use Cases Where to use 400ZR and where to use OpenZR+ Supported DWDM Optical Topologies NCS 2000 64 Channel FOADM P2P Deployment NCS 1010 64 Channel FOADM P2P Deployment NCS 2000 Colorless Add/Drop Deployment NCS 2000 Multi-Degree ROADM Deployment NCS 1010 Multi-Degree Deployment Long-Haul Deployment Core Networks Metro Aggregation Access DCI and 3rd Party Location Interconnect Routed Optical Networking Private Line Services Circuit Style Segment Routing CS SR-TE paths characteristics CS SR-TE path liveness detection CS SR-TE path failover behavior CS SR-TE Policy operational details Private Line Emulation Hardware Supported Client Transceivers Private Line Emulation Pseudowire Signaling Private Line Emulation EVPN-VPWS Configuration PLE Monitoring and Telemetry Client Optics Port State PLE CEM Controller Stats PLE CEM PM Statistics PLE Client PM Statistics Routed Optical Networking Architecture Hardware Routed Optical Networking Validated Routers Cisco 8000 Series Cisco 5700 Systems and NCS 5500 Line Cards ASR 9000 Series NCS 500 Series Routed Optical Networking Optical Hardware Network Convergence System 1010 Network Convergence System 2000 Network Convergence System 1000 Multiplexer Network Convergence System 1001 NCS 2000 and NCS 1001 Hardware Routed Optical Networking Automation Overview IETF ACTN SDN Framework Cisco’s SDN Controller Automation Stack Cisco Open Automation Crosswork Hierarchical Controller Crosswork Network Controller Cisco Optical Network Controller Cisco Network Services Orchestrator and Routed Optical Networking ML Core Function Pack Routed Optical Networking Service Management Supported Provisioning Methods OpenZR+ and 400ZR Properties ZR/ZR+ Supported Frequencies Supported Line Side Rate and Modulation Crosswork Hierarchical Controller UI Provisioning Inter-Layer Link Definition IP Link Provisioning Operational Discovery NSO RON-ML CFP Provisioning Routed Optical Networking Inter-Layer Links RON-ML End to End Service RON-ML API Provisioning IOS-XR CLI Configuration Model-Driven Configuration using IOS-XR Native Models using NETCONF or gNMI Model-Driven Configuration using OpenConfig Models Routed Optical Networking Assurance Crosswork Hierarchical Controller Multi-Layer Path Trace Routed Optical Networking Link Assurance ZRM Layer TX/RX Power ZRC Layer BER and Q-Factor / Q-Margin OTS Layer RX/TX Power Graph Event Monitoring IOS-XR CLI Monitoring of ZR400/OpenZR+ Optics Optics Controller Coherent DSP Controller EPNM Monitoring of Routed Optical Networking EPNM Chassis View of DCO Transceivers Chassis View Interface/Port View EPNM DCO Performance Measurement DCO Physical Layer PM KPIs Cisco IOS-XR Model-Driven Telemetry for Routed Optical Networking Monitoring ZR/ZR+ DCO Telemetry NCS 1010 Optical Line System Monitoring Open-source Monitoring Additional Resources Cisco Routed Optical Networking 2.0 Solution Guide Cisco Routed Optical Networking Home Cisco Routed Optical Networking Tech Field Day Cisco Champion Podcasts 
Appendix A Acronyms DWDM Network Hardware Overview Optical Transmitters and Receivers Multiplexers/Demultiplexers Optical Amplifiers Optical add/drop multiplexers (OADMs) Reconfigurable optical add/drop multiplexers (ROADMs) PDF Downloadhttps#//github.com/ios-xr/design/blob/master/Routed-Optical-Networking/2022-12-01-cst-routed-optical-2_0.pdfRevision History Version Date Comments 1.0 01/10/2022 Initial Routed Optical Networking Publication 2.0 12/01/2022 Private Line Services, NCS 1010, CW HCO updates Solution Component Software Versions Element Version Router IOS-XR 7.7.1 NCS 2000 SVO 12.3.1 NCS 1010 IOS-XR 7.7.1 Cisco Optical Network Controller 2.0 Crosswork Network Controller 4.1 Crosswork Hierarchical Controller 5.3 Cisco EPNM 6.1.0 What is Routed Optical Networking?Routed Optical Networking as part of Cisco’s Converged SDN Transportarchitecture brings network simplification to the physical networkinfrastructure, just as EVPN and Segment Routing simplify the service andtraffic engineering network layers. Routed Optical Networking collapses complextechnologies and network layers into a single cost efficient and easy to managenetwork infrastructure. Here we present the Cisco Routed Optical Networkingarchitecture and validated design.Key DriversChanging NetworksInternet traffic has seen a compounded annual growth rate of 30% or higher overthe last ten years, as more devices are connected, end user bandwidth speedsincrease, and applications continue to move to the cloud. The introduction of 5Gin mobile carriers and backhaul providers is also a disruptor, networks must bebuilt to handle the advanced services and traffic increase associated with 5G.Networks must evolve so the infrastructure layer can keep up with the servicelayer. 400G Ethernet is the next evolution for SP IP network infrastructure, andwe must make that as efficient as possible.Network ComplexityComputer networks at their base are a set of interconnected nodes to deliverdata between two endpoints. In the very beginning, these networks were designedusing a layered approach to separate functions. The OSI model is an example ofhow functional separation has led to innovation by allowing different standardsbodies to work in parallel at each layer. In some cases even these OSI layersare further split into different layers. While these layers can bring some costbenefit, it also brings added complexity. Each layer has its own management,control plane, planning, and operational model.Inefficiences Between Network LayersOTN and IP network traffic must be converted into wavelengthsignals to traverse the DWDM network. This has traditionally required dedicatedexternal hardware, a transponder. All of these layers bring complexity, andtoday some of those layers, such as OTN, bring little to the table in terms ofefficiency or additional value. OTN switching, like ATM previously, has not beenable to keep up with traffic demands due to very complex hardware. UnlikeEthernet/IP, OTN also does not have a widely interoperable control plane, locking providers into a single vendor or solution long-term.Operational ComplexityNetworks involving opaque layers are difficult to plan, build, and operate. IPand optical networks often have duplicate teams covering similar tasks. Networkprotection and restoration is also often complicated by different schemesrunning independently across layers. 
The industry has tried over decades to solve some of these issues with complex control planes such as GMPLS, but we are now at an evolution point where simplifying the physical layers and reducing control plane complexity in the optical layer allows a natural progression to a single control-plane and protection/restoration layer.Network CostSimplifying networks reduces both capex and opex. As we move to 400G, the network cost is shifted away from routers and router ports to optics. Any way we can reduce the number of 400G interconnects on the network will greatly reduce cost. Modeling networks with 400ZR and OpenZR+ optics in place of traditional transponders and muxponders shows this in almost any network scenario. It also results in a reduced space and power footprint.Routed Optical Networking Solution OverviewAs part of the Converged SDN Transport architecture, Routed Optical Networking extends the key tenet of network simplification. Routed Optical Networking tackles the challenges of building and managing networks by simplifying both the infrastructure and operations.Today’s Complex Multi-Layer Network InfrastructureDWDMMost modern SP networks start at the physical fiber optic layer. Above the physical fiber is technology that allows multiple photonic wavelengths to traverse a single fiber and be switched at junction points; we will call that the DWDM layer.OTNIn some networks, above this DWDM layer is an OTN layer, OTN being the evolution of traditional SONET/SDH networks. OTN grooms low speed TDM services into higher speed containers, and if OTN switching is involved, allows switching these services at intermediate points in the network. OTN is primarily used in networks to carry guaranteed bandwidth services.Ethernet/IPIn all high bandwidth networks today, there is an Ethernet layer on which IP services traverse, since almost all data traffic today is IP. Ethernet and IP are used due to their ability to support statistical multiplexing, topology flexibility, and widespread interoperability between different vendors based on well-defined standards. In larger networks today carrying Internet traffic, the Ethernet/IP layer does not typically traverse an OTN layer; the OTN layer is primarily used only for business services.Enabling TechnologiesPluggable Digital Coherent OpticsSimple networks are easier to build and easier to operate. As networks scale to handle traffic growth, the level of network complexity must decline or at least remain flat. IPoDWDM has attempted to move the transponder function into the router to remove the transponder and add efficiency to networks. In lower bandwidth applications, it has been a very successful approach. CWDM, DWDM SFP/SFP+, and CFP2-DCO pluggable transceivers have been used for many years now to build access, aggregation, and lower speed core networks. The evolution to 400G and advances in technology created an opportunity to unlock this potential in higher speed networks.Transponders or muxponders have typically been used to aggregate multiple 10G or 100G signals into a single wavelength. However, with reach limitations, and the fact that transponders are still operating at 400G wavelength speeds, the transponder becomes a 1#1 input to output stage in the network, adding no benefit.The Routed Optical Networking architecture unlocks this efficiency for networks of all sizes, due to advancements in coherent pluggable technology.QSFP-DD and 400ZR and OpenZR+ StandardsAs mentioned, the industry saw a point to improve network efficiency by shifting coherent DWDM functions to router pluggables.
Technology advancements have shrunk the DCO components into the standard QSFP-DD form factor, meaning no specialized hardware and the ability to use the highest capacity routers available today. ZR/OpenZR+ QSFP-DD optics can be used in the same ports as the highest speed 400G non-DCO transceivers.Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S)Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S)Two industry optical standards have emerged to cover a variety of use cases. The OIF created the 400ZR specification, https#//www.oiforum.com/technical-work/hot-topics/400zr-2 as a 400G interoperable standard for metro reach coherent optics. The industry saw the benefit of the approach, but wanted to cover longer distances and have flexibility in wavelength rates, so the OpenZR+ MSA was created, https#//www.openzrplus.org. The following table outlines the specs of each standard. ZR400 and OpenZR+ transceivers are tunable across the ITU C-Band, 196.1 to 191.3 THz. The following part numbers are used for Cisco’s ZR400 and OpenZR+ MSA transceivers Standard Part 400ZR QDD-400G-ZR-S OpenZR+ QDD-400G-ZRP-S The Cisco datasheet for these transceivers can be found at https#//www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/datasheet-c78-744377.htmlCisco RoutersWe are at a point in NPU development where the pace of NPU bandwidth growth has outpaced network traffic growth. Single NPUs such as Cisco’s Silicon One have a capacity exceeding 12.8Tbps in a single NPU package without sacrificing flexibility and rich feature support. This growth of NPU capacity also brings a reduction in cost, meaning forwarding traffic at the IP layer is more advantageous vs. a network where layer transitions happen often.Cisco supports 400ZR and OpenZR+ optics across the NCS 540, NCS 5500, NCS 5700, ASR 9000, and Cisco 8000 series routers. This enables providers to utilize the architecture across their end to end infrastructure in a variety of router roles. See Cisco Private Line EmulationStarting in Routed Optical Networking 2.0, Cisco provides Private Line Emulation (PLE) hardware and IOS-XR support to deliver bit-transparent private line services over the converged packet network. Private Line Emulation supports the transport of Ethernet, SONET/SDH, OTN, and Fiber Channel services. See the PLE section of the document for in-depth information on PLE.Circuit Style Segment RoutingCircuit Style Segment Routing (CS-SR) is another Cisco advancement bringing TDM circuit-like behavior to SR-TE Policies. These policies use deterministic hop by hop routing, co-routed bi-directional paths, hot standby protect paths, and end to end liveness detection. Standard Ethernet services not requiring bit transparency can be transported over a Segment Routing network similar to OTN networks without the additional cost, complexity, and inefficiency of an OTN network layer.Cisco DWDM Network HardwareRouted Optical Networking shifts an expensive and now often redundant transponder function into a pluggable transceiver. However, to make the most efficient use of a valuable resource, the underlying fiber optic network, we still need a DWDM layer. Routed Optical Networking is flexible enough to work across point to point or ROADM based optical networks, or a mix of both. Cisco multiplexers, amplifiers, and ROADMs can satisfy any network need.Cisco NCS 1010Routed Optical Networking 2.0 introduces the new Cisco NCS 1010 open optical line system.
The NCS 1010 represents an evolution in open optical line systems, utilizing the same IOS-XR software as Cisco routers and NCS 1004 series transponders. This enables the rich XR automation and telemetry support to extend to the DWDM photonic line system. The NCS 1010 also simplifies how operators build DWDM networks with advanced integrated functions and a flexible twin 1x33 WSS.See the validated design hardware section for more information.Routed Optical Networking Network Use CasesCisco is embracing Routed Optical Networking in every SP router role. Access,aggregation, core, peering, DCI, and even PE routers can be enabled with highspeed DCO optics. Routed Optical Networking is also not limited to SP networks,there are applications across enterprise, government, and education networks.Where to use 400ZR and where to use OpenZR+The OIF 400ZR and OpenZR+ MSA standards have important differences.400ZR supports 400G rates only, and targets metro distance point to pointconnections up to 120km. 400ZR mandates a strict power consumption of 15W aswell. Networks requiring only 400G over distances less than 120km may benefitfrom using 400ZR optics. DCI and 3rd party peering interconnection are good usecases for 400ZR.If a provider needs flexibility in rates and distances and wants to standardizeon a single optics type, OpenZR+ can fulfill the need. In areas of the networkwhere 400G may not be needed, OpenZR+ optics can be run at 100G or 200G.Additionally, hardware with QSFP-DD 100G ports can utilize OpenZR+ optics in100G mode. This can be ideal for high density access and aggregation networks.Supported DWDM Optical TopologiesFor those unfamiliar with DWDM hardware, please see the overview of DWDM networkhardware in Appendix AThe future of networks may be a flat L3 network with simple point to pointinterconnection, but it will take time to migrate to this type of architecture.Routed Optical Network supports an evolution to the architecture by working overmost modern photonic DWDM networks. Below gives just a few of the supportedoptical topologies including both point to point and ROADM based DWDM networks.NCS 2000 64 Channel FOADM P2P DeploymentThis example provides up to 25.6Tb on a single network span, and highlights thesimplicity of the Routed Optical Networking solution. The “optical” portion ofthe network including the ZR/ZR+ configuration can be completed in a matter ofminutes from start to finish.NCS 1010 64 Channel FOADM P2P DeploymentThe NCS 1010 includes two add/drop ports with embedded bi-directional EDFAamplifiers, ideal for connecting the new MD-32-E/O 32 channel, 150Ghz spacedpassive multiplexer. Connecting both even and odd multiplexers allows the use of 64 total channels.NCS 2000 Colorless Add/Drop DeploymentUsing the NCS2K-MF-6AD-CFS colorless NCS2K-MF-LC module along with the LC16 LCaggregation module, and SMR20-FS ROADM module, a scalable colorless add/dropcomplex can be deployed to support 400ZR and OpenZR+.NCS 2000 Multi-Degree ROADM DeploymentIn this example a 3 degree ROADM node is shown with a local add/drop degree. TheRouted Optical Networking solution fully supports ROADM based networks withoptical bypass. The traffic demands of the network will dictate the mostefficient network build. In cases where an existing or new build requires DWDMswitching capability, ZR and ZR+ wavelengths are easily provisioned over theinfrastructure.NCS 1010 Multi-Degree DeploymentA multi-degree NCS 1010 site utilizes a separate NCS 1010 OLT device for each degree. 
The degree may be an add/drop or bypass degree. In our example Site 3 can support the add/drop of wavelengths via its A/D ports on the upper node, or express those wavelengths through the interconnect to site 4 via the additional 1010 OLT unit connected to site 4. In this example the wavelength originating at sites 1 and 4 using ZR+ optics is expressed through site 3.Long-Haul DeploymentCisco has demonstrated in a physical lab 400G OpenZR+ services provisioned across 1200km using NCS 2000 and NCS 1010 optical line systems. 300G, 200G, and 100G signals can achieve even greater distances. OpenZR+ is not just for shorter reach applications; it fulfills an ideal sweet spot in most provider networks in terms of bandwidth and reach.Core NetworksLong-haul core networks also benefit from the CapEx and OpEx savings of moving to Routed Optical Networking. Moving to a simpler IP enabled converged infrastructure makes networks easier to manage and operate vs. networks with complex underlying optical infrastructure. The easiest place to start in the journey is replacing external transponders with OpenZR+ QSFP-DD transceivers. At 400G, connecting a 400G gray Ethernet port to a transponder with a 400G or 600G line side is not cost or environmentally efficient. Cisco can assist in modeling your core network to determine the TCO of Routed Optical Networking compared to traditional approaches.Metro AggregationTiered regional or metro networks connecting hub locations to larger aggregation sites or datacenters can also benefit from Routed Optical Networking. Whether deployed in a hub and spoke topology or a hop by hop IP ring, Routed Optical Networking satisfies providers’ growth demands at a lower cost than traditional approaches.AccessAccess deployments in a ring or point-to-point topology are ideal for Routed Optical Networking. Shorter distances over dark fiber may not require active optical equipment, and with up to 400G per span may provide the bandwidth necessary for growth over a number of years without the use of additional multiplexers.DCI and 3rd Party Location InterconnectIn this use case, Routed Optical Networking simplifies deployments by eliminating active transponders, reducing power, space, and cabling requirements between end locations. 25.6Tbps of bandwidth is available over a single fiber using 64 400G wavelengths and simple optical amplifiers and multiplexers requiring no additional configuration after initial turn-up.Routed Optical Networking Private Line ServicesRelease 2.0 introduces Circuit Style Segment Routing TE Policies and Private Line Emulation hardware to enable traditional TDM-like private line services over the converged Segment Routing packet network. The following provides an overview of the hardware and software involved in supporting PL services.
The figure below gives an overview of PLE service signaling and transport.Circuit Style Segment RoutingCS-SR provides the underlying TDM-like transport to support traditional private line Ethernet services without additional hardware and emulated bit-transparent services using Private Line Emulation hardware.CS SR-TE path characteristics Co-routed Bidirectional - Meaning the paths between two client ports are symmetric Deterministic without ECMP - Meaning the path does not vary based on any load balancing criteria Persistent - Paths are routed on a hop by hop basis, so they are not subject to path changes induced by network changes End-to-end path protection - Entire paths are switched from working to protect with the protect path in a hot standby state for fast transition. CS SR-TE policies are built using link adjacency SIDs without protection to ensure the paths do not take a TI-LFA path during path failover and instead fail over to the pre-determined protect path.CS SR-TE path liveness detectionPaths can be configured with end to end liveness detection. Liveness detection uses TWAMP-lite probes which are looped at the far end to determine if the end to end path is up bi-directionally. If more than the set number of probes is missed (set by the multiplier), the path will be considered down. Once liveness detection is enabled, probes will be sent on all candidate paths. Either the default liveness probe profile can be used, or a customized profile can be created to modify the default parameters (a sketch of such a profile is included as comments in the configuration below).CS SR-TE path failover behaviorCS SR-TE policies contain multiple candidate paths. The highest preference candidate path is considered the working path, the second highest preference path is the protect path, and a third, lower preference path, if configured, acts as a dynamic restoration path. This provides 1#1+R protection for CS SR-TE policies. The following shows the configuration of a CS SR-TE Policy with a working, protect, and restoration path. segment-routing traffic-eng policy to-55a2-1 color 1001 end-point ipv4 100.0.0.44 path-protection ! candidate-paths preference 25 dynamic metric-type igp ! ! preference 50 explicit segment-list protect-forward-path reverse-path segment-list protect-reverse-path ! ! preference 100 explicit segment-list working-forward-path reverse-path segment-list working-reverse-path ! ! !
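! The policy-level performance-measurement / liveness-detection block that follows
! references the liveness-profile named liveness-check. The profile itself lives in
! global configuration, outside the policy; a minimal sketch (the tx-interval and
! multiplier values here are illustrative assumptions, not values from this design)
! would look like:
!   performance-measurement
!    liveness-profile name liveness-check
!     probe
!      tx-interval 30000
!     liveness-detection
!      multiplier 3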
performance-measurement liveness-detection liveness-profile name liveness-checkCS SR-TE Policy operational detailsRP/0/RP0/CPU0#ron-ncs55a2-1#show segment-routing traffic-eng policy color 1001Sat Dec 3 13#32#38.356 PSTSR-TE policy database---------------------Color# 1001, End-point# 100.0.0.42 Name# srte_c_1001_ep_100.0.0.42 Status# adjmin# up Operational# up for 2d09h (since Dec 1 04#08#12.648) Candidate-paths# Preference# 100 (configuration) (active) Name# to-100.0.0.42 Requested BSID# dynamic PCC info# Symbolic name# cfg_to-100.0.0.42_discr_100 PLSP-ID# 1 Constraints# Protection Type# protected-preferred Maximum SID Depth# 12 Explicit# segment-list forward-adj-path-working (valid) Reverse# segment-list reverse-adj-path-working Weight# 1, Metric Type# TE SID[0]# 15101 [adjacency-SID, 100.1.1.21 - 100.1.1.20] SID[1]# 15102 SID[2]# 15103 SID[3]# 15104 Reverse path# SID[0]# 15001 SID[1]# 15002 SID[2]# 15003 SID[3]# 15004 Protection Information# Role# WORKING Path Lock# Timed Lock Duration# 300(s) State# ACTIVE Preference# 50 (configuration) (protect) Name# to-100.0.0.42 Requested BSID# dynamic PCC info# Symbolic name# cfg_to-100.0.0.42_discr_50 PLSP-ID# 2 Constraints# Protection Type# protected-preferred Maximum SID Depth# 12 Explicit# segment-list forward-adj-path-protect(valid) Reverse# segment-list reverse-adj-path-protect Weight# 1, Metric Type# TE SID[0]# 15119 [adjacency-SID, 100.1.42.1 - 100.1.42.0] Reverse path# SID[0]# 15191 Protection Information# Role# PROTECT Path Lock# Timed Lock Duration# 300(s) State# STANDBY Attributes# Binding SID# 24017 Forward Class# Not Configured Steering labeled-services disabled# no Steering BGP disabled# no IPv6 caps enable# yes Invalidation drop enabled# no Max Install Standby Candidate Paths# 0Private Line Emulation HardwareStarting in IOS-XR 7.7.1 the NC55-OIP-02 Modular Port Adapter (MPA) is supportedon the NCS-55A2-MOD and NCS-57C3-MOD platforms. The NC55-OIP-02 has 8 SFP+ portsEach port on the PLE MPA can be configured independently. The PLE MPA is responsible for receiving data frames from the native PLE client and packaging those into fixed frames for transport over the packet network.More information on the NC55-OIP-02 can be found in its datasheet located athttps#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/network-con-5500-series-ds.pdf. A full detailed to end to end configuration for PLE can be found in the Routed Optical Networking 2.0 Solution Guide found at https#//www.cisco.com/c/en/us/td/docs/optical/ron/2-0/solution/guide/b-ron-solution-20/m-ron.pdfSupported Client Transceivers Transport Type Supported Transceivers Ethernet SFP-10G-SR/LR/ER, GLC-LH/EX/ZX-SMD, 1G/10G CWDM OTN (OTU2e) SFP-10G-LR-X, SFP-10G-ER-I, SFP-10G-Z SONET/SDH ONS-SC+-10G-LR/ER/SR (OC-192/STM-64), ONS-SI-2G-L1/L2/S1 (OC-48/STM-16) Fiber Channel DS-SFP-FCGE, DS-SFP-FC8G, DS-SFP-FC16G, DS-SFP-FC32G, 1/2/4/8G FC CWDM Note FC32G transceivers are supported in the even ports only and will disable the adjacent odd SFP+ port.Private Line Emulation Pseudowire SignalingPLE utilizes IETF SAToP pseudowire encoding carried over dynamically signalled EVPN-VPWS circuits. Enhancements to the EVPN VPWS service type have been introduced to the IETF viahttps#//datatracker.ietf.org/doc/draft-schmutzer-bess-ple.PLE services use Differential Clock Recovery (DCR) to ensure proper frame timing between the two PLE clients. 
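DCR depends on each PLE endpoint router having a stable, traceable frequency reference. As a minimal sketch (the interface name and priority are illustrative assumptions, not values from this design), synchronous Ethernet is one common way that reference is recovered on an IOS-XR router:

frequency synchronization
 quality itu-t option 1
 log selection changes
!
interface HundredGigE0/0/0/0
 frequency synchronization
  selection input
  priority 10
  wait-to-restore 0
 !
!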
In order to mmaintain accuracy of the clock each PLE endpoint router must have its frequency source traceable to a common primary reference clock (PRC).Private Line Emulation EVPN-VPWS ConfigurationPLE services can be configured to utilize a CS SR-TE Policy or use dynamic MPLS protocols. The example belows shows the use of CS SR-TE Policy as transport for the PLE EVPN-VPWS service. Note the name of the sr-te policy in the preferred path command is the persistent generated name and not the name used in the CLI configuration. This can be determined using the “show segment-routing traffic-engineering policies” command.l2vpn pw-class circuit-style-srte encapsulation mpls preferred-path sr-te policy srte_c_1001_ep_100.0.0.42 ! ! xconnect group ple p2p ple-cs-1 interface CEM0/0/2/1 neighbor evpn evi 100 target 4201 source 4401 pw-class circuit-style-srte ! !PLE Monitoring and TelemetryThe following “show” command can be used to monitor the state of PLE ports and services.Client Optics Port StateRP/0/RP0/CPU0#ron-ncs55a2-1#show controllers optics 0/0/2/1Sat Dec 3 14#00#10.873 PST Controller State# Up Transport Admin State# In Service Laser State# On LED State# Not Applicable Optics Status Optics Type# SFP+ 10G SR Wavelength = 850.00 nm Alarm Status# ------------- Detected Alarms# None LOS/LOL/Fault Status# Laser Bias Current = 8.8 mA Actual TX Power = -2.60 dBm RX Power = -2.33 dBm Performance Monitoring# Disable THRESHOLD VALUES ---------------- Parameter High Alarm Low Alarm High Warning Low Warning ------------------------ ---------- --------- ------------ ----------- Rx Power Threshold(dBm) 2.0 -13.9 -1.0 -9.9 Tx Power Threshold(dBm) 1.6 -11.3 -1.3 -7.3 LBC Threshold(mA) 13.00 4.00 12.50 5.00 Temp. Threshold(celsius) 75.00 -5.00 70.00 0.00 Voltage Threshold(volt) 3.63 2.97 3.46 3.13 Polarization parameters not supported by optics Temperature = 33.00 Celsius Voltage = 3.30 V Transceiver Vendor Details Form Factor # SFP+ Optics type # SFP+ 10G SR Name # CISCO-FINISAR OUI Number # 00.90.65 Part Number # FTLX8574D3BCL-CS Rev Number # A Serial Number # FNS23300J42 PID # SFP-10G-SR VID # V03 Date Code(yy/mm/dd) # 19/07/25PLE CEM Controller StatsRP/0/RP0/CPU0#ron-ncs57c3-1#show controllers CEM 0/0/3/1Sat Sep 24 11#34#22.533 PDTInterface # CEM0/0/3/1Admin state # UpOper state # UpPort bandwidth # 10312500 kbpsDejitter buffer (cfg/oper/in-use) # 0/813/3432 usecPayload size (cfg/oper) # 1280/1024 bytesPDV (min/max/avg) # 980/2710/1845 usecDummy mode # last-frameDummy pattern # 0xaaIdle pattern # 0xffSignalling # No CASRTP # EnabledClock type # DifferentialDetected Alarms # NoneStatistics Info---------------Ingress packets # 517617426962, Ingress packets drop # 0Egress packets # 517277124278, Egress packets drop # 0Total error # 0 Missing packets # 0, Malformed packets # 0 Jitter buffer underrun # 0, Jitter buffer overrun # 0 Misorder drops # 0Reordered packets # 0, Frames fragmented # 0Error seconds # 0, Severely error seconds # 0Unavailable seconds # 0, Failure counts # 0Generated L bits # 0, Received L bits # 0Generated R bits # 339885178, Received R bits # 17Endpoint Info-------------Passthrough # NoPLE CEM PM StatisticsRP/0/RP0/CPU0#ron-ncs57c3-1#show controllers CEM 0/0/3/1 pm current 30-sec cemSat Sep 24 11#37#02.374 PDTCEM in the current interval [11#37#00 - 11#37#02 Sat Sep 24 2022]CEM current bucket type # ValidINGRESS-PKTS # 2521591 Threshold # 0 TCA(enable) # NOEGRESS-PKTS # 2521595 Threshold # 0 TCA(enable) # NOINGRESS-PKTS-DROPPED # 0 Threshold # 0 TCA(enable) # NOEGRESS-PKTS-DROPPED # 0 
Threshold # 0 TCA(enable) # NO INPUT-ERRORS # 0 Threshold # 0 TCA(enable) # NO OUTPUT-ERRORS # 0 Threshold # 0 TCA(enable) # NO MISSING-PKTS # 0 Threshold # 0 TCA(enable) # NO PKTS-REORDER # 0 Threshold # 0 TCA(enable) # NO JTR-BFR-UNDERRUNS # 0 Threshold # 0 TCA(enable) # NO JTR-BFR-OVERRUNS # 0 Threshold # 0 TCA(enable) # NO MIS-ORDER-DROPPED # 0 Threshold # 0 TCA(enable) # NO MALFORMED-PKT # 0 Threshold # 0 TCA(enable) # NO ES # 0 Threshold # 0 TCA(enable) # NO SES # 0 Threshold # 0 TCA(enable) # NO UAS # 0 Threshold # 0 TCA(enable) # NO FC # 0 Threshold # 0 TCA(enable) # NO TX-LBITS # 0 Threshold # 0 TCA(enable) # NO TX-RBITS # 0 Threshold # 0 TCA(enable) # NO RX-LBITS # 0 Threshold # 0 TCA(enable) # NO RX-RBITS # 0 Threshold # 0 TCA(enable) # NO PLE Client PM StatisticsRP/0/RP0/CPU0#ron-ncs57c3-1#show controllers EightGigFibreChanCtrlr0/0/3/4 pm current 30-sec fcSat Sep 24 11#51#55.168 PDTFC in the current interval [11#51#30 - 11#51#55 Sat Sep 24 2022]FC current bucket type # Valid IFIN-OCTETS # 16527749196 Threshold # 0 TCA(enable) # NO RX-PKT # 196758919 Threshold # 0 TCA(enable) # NO IFIN-ERRORS # 0 Threshold # 0 TCA(enable) # NO RX-BAD-FCS # 0 Threshold # 0 TCA(enable) # NO IFOUT-OCTETS # 0 Threshold # 0 TCA(enable) # NO TX-PKT # 0 Threshold # 0 TCA(enable) # NO TX-BAD-FCS # 0 Threshold # 0 TCA(enable) # NO RX-FRAMES-TOO-LONG # 0 Threshold # 0 TCA(enable) # NO RX-FRAMES-TRUNC # 0 Threshold # 0 TCA(enable) # NO TX-FRAMES-TOO-LONG # 0 Threshold # 0 TCA(enable) # NO TX-FRAMES-TRUNC # 0 Threshold # 0 TCA(enable) # NO Routed Optical Networking Architecture HardwareAll Routed Optical Networking solution routers are powered by Cisco IOS-XR.Routed Optical Networking Validated RoutersBelow is a non-exhaustive snapshot of platforms validated for use with ZR and OpenZR+ transceivers. Cisco supports Routed Optical Networking in the NCS 540, NCS 5500/5700, ASR 9000, and Cisco 8000 router families. The breadth of coverage enables the solution across all areas of the network.Cisco 8000 SeriesThe Cisco 8000 and its Silicon One NPU represent the next generation in routers, offering unprecedented capacity at the lowest power consumption while supporting a rich feature set applicable to a number of network roles.See more information on Cisco 8000 at https#//www.cisco.com/c/en/us/products/collateral/routers/8000-series-routers/datasheet-c78-742571.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/cisco8000/Interfaces/73x/configuration/guide/b-interfaces-config-guide-cisco8k-r73x/m-zr-zrp-cisco-8000.htmlCisco 5700 Systems and NCS 5500 Line CardsThe Cisco 5700 family of fixed and modular systems and line cards is flexible enough to use at any location in the network. The platform has seen widespread use in peering, core, and aggregation networks.See more information on Cisco NCS 5500 and 5700 at https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-736270.html and https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-744698.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/interfaces/73x/configuration/guide/b-interfaces-hardware-component-cg-ncs5500-73x/m-zr-zrp.htmlASR 9000 SeriesThe ASR 9000 is the most widely deployed SP router in the industry. It has a rich heritage dating back almost 20 years, but Cisco continues to innovate on the ASR 9000 platform.
The ASR 9000 series now supports 400G QSFP-DD on avariety of line cards and the ASR 9903 2.4Tbps 3RU platform.See more information on Cisco ASR 9000 at https#//www.cisco.com/c/en/us/products/collateral/routers/asr-9000-series-aggregation-services-routers/data_sheet_c78-501767.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-3/interfaces/configuration/guide/b-interfaces-hardware-component-cg-asr9000-73x/m-zr-zrp.html#Cisco_Concept.dita_59215d6f-1614-4633-a137-161ebe794673NCS 500 SeriesThe 1Tbps N540-24QL16DD-SYS high density router brings QSFP-DD and Routed Optical NetworkingZR/OpenZR+ optics to a flexible access and aggregation platform. Using OpenZR+ optics it allows a migration path from 100G to 400G access rings or uplinks when used in an aggregation role.See more information on Cisco NCS 540 at https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-500-series-routers/ncs-540-large-density-router-ds.htmlRouted Optical Networking Optical HardwareNetwork Convergence System 1010The NCS 1010 Open Optical Line System (O-OLS) is a next-generation DWDM platform available in fixed variants to satisfy building a modern flexible DWDM photonic network.The NCS 1010 Optical Line Terminal (OLT) uses a twin 33-port WSS architectureallowing higher scale for either add/drop or express wavelengths. The OLT alsohas two LC add/drop ports with integrated fixed gain EDFA to support theadd/drop of lower power optical signals. OLTs are available in models with orwithout RAMAN amplification. NCS 1010 Inline Amplifier nodes are available asbi-directional EDFA, EDFA with RAMAN in one direction, or bi-directional RAMAN.Each model of NCS 1010 is also available to support both C and L bands. In Routed Optical Networking 2.0 ZR and ZR+ optics utilize the C band, but may be used on the same fiber withL band signals using the NCS 1010 C+L combiner.The NCS 1010 utilizes IOS-XR, inheriting the advanced automation and telemetryfeatures similar to IOS-XR routers.NCS 1010 OLT with RAMAN NCS 1010 ILA with RAMAN The NCS1K-MD32-E/O-C 32-port 150Ghz spaced passive multiplexer is used with the NCS 1010, supporting the 75Ghz ZR/ZR+ signals and future higher baud rate signals. The MD-32 contains photodiodes to monitor RX power levels on each add/drop port.NCS 1010 MD-32 Passive Filter The NCS 1010 supports point to point and express DWDM optical topologies in Routed Optical Networking 2.0. All NCS 1010 services in Routed Optical Networking are managed using Cisco Optical NetworkController.See more information on the NCS 1010 series at https#//www.cisco.com/c/en/us/products/collateral/optical-networking/network-convergence-system-1000-series/network-conver-system-1010-ds.htmlNetwork Convergence System 2000The NCS 2000 Optical Line System is a flexible platform supporting all modernoptical topologies and deployment use cases. Simple point to point tomulti-degree CDC deployments are all supported as part of Routed OpticalNetworking.See more information on the NCS 2000 series at https#//www.cisco.com/c/en/us/products/optical-networking/network-convergence-system-2000-series/index.htmlNetwork Convergence System 1000 MultiplexerThe NCS1K-MD-64-C is a new fixed multiplexer designed specificallyfor the 400G 75Ghz 400ZR and OpenZR+ wavelengths, allowing up to 25.6Tbps on asingle fiber.Network Convergence System 1001The NCS 1001 is utiized in point to point network spans as an amplifier andoptionally protection switch. 
The NCS 1001 now has specific support for 75GHz-spaced 400ZR and OpenZR+ wavelengths, with the ability to monitor incoming wavelengths for power. The 1001 features the ability to determine the proper amplifier gain setpoints based on the desired user power levels.See more information on the NCS 1001 at https#//www.cisco.com/c/en/us/products/collateral/optical-networking/network-convergence-system-1000-series/datasheet-c78-738782.htmlNCS 2000 and NCS 1001 HardwareThe picture below does not represent all available hardware on the NCS 2000; however, it does capture the modules typically used in Routed Optical Networking deployments.Routed Optical Networking AutomationOverviewRouted Optical Networking by definition is a disaggregated optical solution, creating efficiency by moving coherent endpoints into the router. The solution requires a new way of managing the network, one which unifies the IP and Optical layers, replacing the traditional siloed tools used in the past. Real transformation in operations comes from unifying teams and workflows, rather than trying to make an existing tool fit a role it was not originally designed for. Cisco’s standards based hierarchical SDN solution allows providers to manage a multi-vendor Routed Optical Networking solution using standard interfaces and YANG models.IETF ACTN SDN FrameworkThe IETF Abstraction and Control of Traffic Engineered Networks (ACTN) working group has defined a hierarchical controller framework to allow vendors to plug components into the framework as needed. The lowest level controller, the Provisioning Network Controller (PNC), is responsible for managing physical devices. These controllers expose their resources through standard models and interface to a Hierarchical Controller (HCO), called a Multi-Domain Service Controller (MDSC) in the ACTN framework.Note that while Cisco is adhering to the IETF framework proposed in RFC 8453, Cisco is supporting the most widely adopted industry standards for controller to controller communication and service definition. In optical, the de facto standard is Transport API from the ONF for the management of optical line system networks and optical services. In packet, we are leveraging OpenConfig device models where possible and IETF models for packet topology (RFC 8345) and xVPN services (L2NM and L3NM).Cisco’s SDN Controller Automation StackAligning to the ACTN framework, Cisco’s automation stack includes a multi-vendor IP domain controller (PNC), optical domain controller (PNC), and multi-vendor hierarchical controller (HCO/MDSC).Cisco Open AutomationCisco believes not all providers consume automation in the same way, so we are dedicated to making sure we have open interfaces at each layer of the network stack. At the device level, we utilize standard NETCONF, gRPC, and gNMI interfaces along with native, standard, and public consortium YANG models. There is no aspect of a Cisco IOS-XR router today not covered by YANG models. At the domain level we have Cisco’s network controllers, which use the same standard interfaces to communicate with devices and expose standards based NBIs. Our multi-layer/multi-domain controller likewise uses the same standard interfaces.Crosswork Hierarchical ControllerResponsible for Multi-Layer Automation is the Crosswork Hierarchical Controller. Crosswork Hierarchical Controller is responsible for the following network functions# CW HCO unifies data from the IP and optical networks into a single network model. HCO utilizes industry standard IETF topology models for IP and TAPI for optical topology and service information.
HCO can also leverage legacy EMS/NMSsystems or device interrogation. Responsible for managing multi-layer Routed Optical Networking links using asingle UI. Providing assurance at the IP and optical layers in a single tool. Thenetwork model allows users to quickly correlate faults and identify at whichlayer faults have occurred. Additional HCO applications include the following Root Cause Analysis# Quickly correlate upper layer faults to an underlying cause. Layer Relations# Quickly identify the lower layer resources supporting higher layer network resource or all network resources reliant on a selected lower layer network resource. Network Inventory# View IP and optical node hardware inventory along with with network resources such as logical links, optical services, and traffic engineering tunnels Network History# View state changes across all network resources at any point in time Performance# View historical link utilization Please see the following resources for more information on Crosswork HCO. https#//www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/solution-overview-c22-744695.htmlCrosswork Network ControllerCrosswork Network Controller is a multi-vendor IP domain controller. CrossworkNetwork Controller is responsible for the following IP network functions. Collecting Ethernet, IP, RSVP-TE, and SR network information for internalapplications and exposing northbound via IETF RFC 8345 topology models Collecting traffic information from the network for use with CNC’s trafficoptimization application, Crosswork Optimization Engine Perform provisioning of SR-TE, RSVP-TE, L2VPN, and L3VPN using standardindustry models (IETF TEAS-TE, L2NM, L3NM) via UI or northbound API Visualization and assurance of SR-TE, RSVP-TE, and xVPN services Use additional Crosswork applications to perform telemetry collection/alerting,zero-touch provisioning, and automated and assurance network changesMore information on Crosswork and Crosswork Network Controller can be found at https#//www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/datasheet-c78-743456.htmlCisco Optical Network ControllerCisco Optical Network Controller (Cisco ONC) is responsible for managing Cisco optical line systems and circuit services. Cisco ONC exposes a ONF TAPI northbound interface, the de facto industry standard for optical network management. Cisco ONC runs as an application on the same Crosswork Infrastructure as CNC.More information on Cisco ONC can be found at https#//www.cisco.com/c/en/us/support/optical-networking/optical-network-controller/series.htmlCisco Network Services Orchestrator and Routed Optical Networking ML Core Function PackCisco NSO is the industry standard for service orchestration and deviceconfiguration management. The RON-ML CFP can be used to fully configure an IPlink between routers utilizing 400ZR/OpenZR+ optics over a Cisco optical linesystem using Cisco ONC. This includes IP addressing and adding links to anexisting Ethernet LAG. The CFP can also support optical-only provisioning on therouter to fit into existing optical provisioning workflows.Routed Optical Networking Service ManagementSupported Provisioning MethodsWe support multiple ways to provision Routed Optical Networking services based on existing provider workflows. 
Unified IP and Optical using Crosswork Hierarchical Controller Unified IP and Optical using Cisco NSO Routed Optical Networking Multi-Layer Function Pack ZR/ZR+ Optics using IOS-XR CLI Model-driven ZR/ZR+ Optics configuration using NETCONF or gNMI OpenConfig ZR/ZR+ Optics configuration using NETCONF or gNMIOpenZR+ and 400ZR PropertiesZR/ZR+ Supported FrequenciesThe frequency on Cisco ZR/ZR+ transceivers may be set between 191.275THz and 196.125THz in increments of 6.25GHz, supporting flex spectrum applications. To maximize the available C-Band spectrum, these are the recommended 64 75GHz-spaced channels, also aligning to the NCS1K-MD-64-C fixed channel add/drop multiplexer. 196.100 196.025 195.950 195.875 195.800 195.725 195.650 195.575 195.500 195.425 195.350 195.275 195.200 195.125 195.050 194.975 194.900 194.825 194.750 194.675 194.600 194.525 194.450 194.375 194.300 194.225 194.150 194.075 194.000 193.925 193.850 193.775 193.700 193.625 193.550 193.475 193.400 193.325 193.250 193.175 193.100 193.025 192.950 192.875 192.800 192.725 192.650 192.575 192.500 192.425 192.350 192.275 192.200 192.125 192.050 191.975 191.900 191.825 191.750 191.675 191.600 191.525 191.450 191.375 Supported Line Side Rate and ModulationOIF 400ZR transceivers support 400G only per the OIF specification. OpenZR+ transceivers can support 100G, 200G, 300G, or 400G line side rates. See router platform documentation for supported rates. The modulation is determined by the line side rate. 400G will utilize 16QAM, 300G 8QAM, and 200G/100G rates will utilize QPSK.Crosswork Hierarchical Controller UI ProvisioningEnd-to-End IP+Optical provisioning can be done using Crosswork Hierarchical Controller’s GUI IP Link provisioning. Those familiar with traditional GUI EMS/NMS systems for service management will have a very familiar experience. Crosswork Hierarchical Controller provisioning will provision both the router optics as well as the underlying optical network to support the ZR/ZR+ wavelength.Inter-Layer Link DefinitionEnd to end provisioning requires first defining the Inter-Layer link between the router ZR/ZR+ optics and the optical line system add/drop ports. This is done using a GUI based NMC (Network Media Channel) Cross-Link application in Crosswork HCO. The below screenshot shows defined NMC cross-links.IP Link ProvisioningOnce the inter-layer links are created, the user can then proceed with provisioning an end to end circuit. The provisioning UI takes as input the two router endpoints, the associated ZR/ZR+ ports, and the IP addressing or bundle membership of the link. The optical line system provisioning is abstracted from the user, simplifying the end to end workflow. The frequency and power are automatically derived by Cisco Optical Network Controller based on the add/drop port and returned as parameters to be used in router optics provisioning.Operational DiscoveryThe Crosswork Hierarchical Controller provisioning process also performs a discovery phase to ensure the service is operational before considering the provisioning complete. If operational discovery fails, the end to end service will be rolled back.NSO RON-ML CFP ProvisioningProviders familiar with using Cisco Network Services Orchestrator have an option to utilize NSO to perform IP+Optical provisioning of Routed Optical Networking services. Cisco has created the Routed Optical Networking Multi-Layer Core Function Pack (RON-ML CFP) to perform end to end provisioning of services.
Theaforementioned Crosswork HCO provisioning utilizes the RON-ML CFP to perform end deviceprovisioning.Please see the Cisco Routed Optical Networking RON-ML CFP documentation located atRouted Optical Networking Inter-Layer LinksSimilar to the use case with CW HCO provisioning, before end to end provisioningcan be performed, inter-layer links must be provisioned between the opticalZR/ZR+ port and the optical line system add/drop port. This is done using the“inter-layer-link” NSO service. The optical end point can be defined as either aTAPI SIP or by the TAPI equipment inventory identifier. Inter-layer links are not required for router-only provisioning.RON-ML End to End ServiceThe RON-ML service is responsible for end to end IP+optical provisioning. RON-MLsupports full end to end provisioning, router-only provisioning, or optical-onlyprovisioning where only the router ZR/ZR+ configuration is performed. Thefrequency and transmit power can be manually defined or optionally provided byCisco ONC when end to end provisioning is performed.RON-ML API ProvisioningUse the following URL for NSO provisioning# http#//<nso host>/restconf/dataInter-Layer Link Service{ ~data~# { ~cisco-ron-cfp#ron~# { ~inter-layer-link~# [ { ~end-point-device~# ~ron-8201-1~, ~line-port~# ~0/0/0/20~, ~ols-domain~# { ~network-element~# ~ron-ols-1~, ~optical-add-drop~# ~1/2008/1/13,14~, ~optical-controller~# ~onc-real-new~ } } ] } }}Provisioning ZR+ optics and adding interface to Bundle-Ether 100 interface{ ~cisco-ron-cfp#ron~# { ~ron-ml~# [ { ~name~# ~E2E_Bundle_ZRP_ONC57_2~, ~mode~# ~transponder~, ~bandwidth~# ~400~, ~circuit-id~# ~E2E Bundle ONC-57 S9|chan11 - S10|chan11~, ~grid-type~# ~100mhz-grid~, ~ols-domain~# { ~service-state~# ~UNLOCKED~ }, ~end-point~# [ { ~end-point-device~# ~ron-8201-1~, ~terminal-device-optical~# { ~line-port~# ~0/0/0/11~, ~transmit-power~# -100 }, ~ols-domain~# { ~end-point-state~# ~UNLOCKED~ }, ~terminal-device-packet~# { ~bundle~# [ { ~id~# 100 } ], ~interface~# [ { ~index~# 0, ~membership~# { ~bundle-id~# 100, ~mode~# ~active~ } } ] } }, { ~end-point-device~# ~ron-8201-2~, ~terminal-device-optical~# { ~line-port~# ~0/0/0/11~, ~transmit-power~# -100 }, ~ols-domain~# { ~end-point-state~# ~UNLOCKED~ }, ~terminal-device-packet~# { ~bundle~# [ { ~id~# 100 } ], ~interface~# [ { ~index~# 0, ~membership~# { ~bundle-id~# 100, ~mode~# ~active~ } } ] } } ] } ] } }IOS-XR CLI ConfigurationConfiguring the router portion of the Routed Optical Networking link is verysimple. All optical configuration related to the ZR/ZR+ optics configuration islocated under the optics controller relevent to the faceplate port. Defaultconfiguration the optics will be in an up/up state using a frequency of193.10Thz.The basic configuration with a specific frequency of 195.65 Thz is located below, the only required component is the bolded channel frequency setting.ZR/ZR+ Optics Configurationcontroller Optics0/0/0/20 transmit-power -100 dwdm-carrier 100MHz-grid frequency 1956500 logging events link-statusModel-Driven Configuration using IOS-XR Native Models using NETCONF or gNMIAll configuration performed in IOS-XR today can also be done using NETCONF/YANG. The following payload exhibits the models and configuration used to perform router optics provisioning. This is a more complete example showing the FEC, power, and frequency configuration. 
.Note in Release 2.0 using IOS-XR 7.7.1 the newer IOS-XR Unified Models are utilized for provisioning<data xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~><controllers xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-interface-cfg~>    <controller>        <controller-name>Optics0/0/0/0</controller-name>        <transmit-power xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-cont-optics-cfg~>-115</transmit-power>        <fec xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-cont-optics-cfg~>OFEC</fec>        <dwdm-carrier xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-cont-optics-cfg~>          <grid-100mhz>            <frequency>1913625</frequency>          </grid-100mhz>        </dwdm-carrier>        <dac-rate xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-dac-rate-cfg~>1x1.25</dac-rate>      </controller></controllers> </data>Model-Driven Configuration using OpenConfig ModelsStarting on Release 2.0 all IOS-XR 7.7.1+ routers supporting ZR/ZR+ optics can be configured using OpenConfig models. Provisioning utilizes the openconfig-terminal-device model and its extensions to the openconfig-platform model to support DWDM configuration parameters.Below is an example of an OpenConfig payload to configure ZR/ZR+ optics port 0/0/0/20 with a 300G trunk rate with frequency 195.20 THz.Please visit the blog at https#//xrdocs.io/design/blogs/zr-openconfig-mgmt for in depth information about configuring and monitoring ZR/ZR+ optics using OpenConfig models.<config> <terminal-device xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>100</index> <config> <index>200</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>101</index> <config> <index>101</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>102</index> <config> <index>102</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type 
xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>200</index> <config> <index>200</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>300</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments> </channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <operational-mode>5007</operational-mode> <frequency>195200000</frequency> </config> </optical-channel> </component> </components> </config>Routed Optical Networking AssuranceCrosswork Hierarchical ControllerMulti-Layer Path TraceUsing topology and service data from both the IP and Optical network CW HCO candisplay the full service from IP services layer to the physical fiber. Below isan example of the “waterfall” trace view from the OTS (Fiber) layer to theSegment Routing TE layer across all layers. CW HCO identifies specific RoutedOptical Networking links using ZR/ZR+ optics as seen by the ZRC (ZR Channel) andZRM (ZR Media) layers from the 400ZR specification.When faults occur at a specific layer, faults will be highlighted in red,quickly identifying the layer a fault has occurred. In this case we can see thefault has occurred at an optical layer, but is not a fiber fault. Having theability to pinpoint the fault layer even within a specific domain is a powerfulway to quickly determine the root cause of the fault.Routed Optical Networking Link AssuranceThe Link Assurance application allows users to view a network link and all of its dependent layers. This includes Routed Optical Networking multi-layer services. In addition to viewing layer information, fault and telemetry information is also available by simply selecting a link or port.ZRM Layer TX/RX PowerZRC Layer BER and Q-Factor / Q-MarginOptionally the user can see graphs of collected telemetry data to quickly identify trends or changes in specific operational data. Graphs of collected performance data is accessed using the “Performance” tab when a link or port is selected.OTS Layer RX/TX Power GraphEvent MonitoringCrosswork HCO records any transition of a network resource between up/down operational states. This is reflected in the Link Assurance tool under the “Events” tab.IOS-XR CLI Monitoring of ZR400/OpenZR+ OpticsOptics ControllerThe optics controller represents the physical layer of the optics. 
In the caseof ZR/ZR+ optics this includes the frequency information, RX/TX power, OSNR, andother associated physical layer information.RP/0/RP0/CPU0#ron-8201-1#show controllers optics 0/0/0/20Thu Jun 3 15#34#44.098 PDT Controller State# Up Transport Admin State# In Service Laser State# On LED State# Green FEC State# FEC ENABLED Optics Status Optics Type# QSFPDD 400G ZR DWDM carrier Info# C BAND, MSA ITU Channel=10, Frequency=195.65THz, Wavelength=1532.290nm Alarm Status# ------------- Detected Alarms# None LOS/LOL/Fault Status# Alarm Statistics# ------------- HIGH-RX-PWR = 0 LOW-RX-PWR = 0 HIGH-TX-PWR = 0 LOW-TX-PWR = 4 HIGH-LBC = 0 HIGH-DGD = 1 OOR-CD = 0 OSNR = 10 WVL-OOL = 0 MEA = 0 IMPROPER-REM = 0 TX-POWER-PROV-MISMATCH = 0 Actual TX Power = -7.17 dBm RX Power = -9.83 dBm RX Signal Power = -9.18 dBm Frequency Offset = 9 MHz Baud Rate = 59.8437500000 GBd Modulation Type# 16QAM Chromatic Dispersion 6 ps/nm Configured CD-MIN -2400 ps/nm CD-MAX 2400 ps/nm Second Order Polarization Mode Dispersion = 34.00 ps^2 Optical Signal to Noise Ratio = 35.50 dB Polarization Dependent Loss = 1.20 dB Polarization Change Rate = 0.00 rad/s Differential Group Delay = 2.00 psPerformance Measurement DataRP/0/RP0/CPU0#ron-8201-1#show controllers optics 0/0/0/20 pm current 30-sec optics 1Thu Jun 3 15#39#40.428 PDTOptics in the current interval [15#39#30 - 15#39#40 Thu Jun 3 2021]Optics current bucket type # Valid MIN AVG MAX Operational Configured TCA Operational Configured TCA Threshold(min) Threshold(min) (min) Threshold(max) Threshold(max) (max)LBC[% ] # 0.0 0.0 0.0 0.0 NA NO 100.0 NA NOOPT[dBm] # -7.17 -7.17 -7.17 -15.09 NA NO 0.00 NA NOOPR[dBm] # -9.86 -9.86 -9.85 -30.00 NA NO 8.00 NA NOCD[ps/nm] # -489 -488 -488 -80000 NA NO 80000 NA NODGD[ps ] # 1.00 1.50 2.00 0.00 NA NO 80.00 NA NOSOPMD[ps^2] # 28.00 38.80 49.00 0.00 NA NO 2000.00 NA NOOSNR[dB] # 34.90 35.12 35.40 0.00 NA NO 40.00 NA NOPDL[dB] # 0.70 0.71 0.80 0.00 NA NO 7.00 NA NOPCR[rad/s] # 0.00 0.00 0.00 0.00 NA NO 2500000.00 NA NORX_SIG[dBm] # -9.23 -9.22 -9.21 -30.00 NA NO 1.00 NA NOFREQ_OFF[Mhz]# -2 -1 4 -3600 NA NO 3600 NA NOSNR[dB] # 16.80 16.99 17.20 7.00 NA NO 100.00 NA NOCoherent DSP ControllerThe coherent DSP controller represents the framing layer of the optics. 
It includes Bit Error Rate, Q-Factor, and Q-Margin information.RP/0/RP0/CPU0#ron-8201-1#show controllers coherentDSP 0/0/0/20Sat Dec 4 17#24#38.245 PSTPort # CoherentDSP 0/0/0/20Controller State # UpInherited Secondary State # NormalConfigured Secondary State # NormalDerived State # In ServiceLoopback mode # NoneBER Thresholds # SF = 1.0E-5 SD = 1.0E-7Performance Monitoring # EnableBandwidth # 400.0Gb/sAlarm Information#LOS = 10 LOF = 0 LOM = 0OOF = 0 OOM = 0 AIS = 0IAE = 0 BIAE = 0 SF_BER = 0SD_BER = 0 BDI = 0 TIM = 0FECMISMATCH = 0 FEC-UNC = 0 FLEXO_GIDM = 0FLEXO-MM = 0 FLEXO-LOM = 3 FLEXO-RDI = 0FLEXO-LOF = 5Detected Alarms # NoneBit Error Rate InformationPREFEC BER # 1.7E-03POSTFEC BER # 0.0E+00Q-Factor # 9.30 dBQ-Margin # 2.10dBFEC mode # C_FECPerformance Measurement DataRP/0/RP0/CPU0#ron-8201-1#show controllers coherentDSP 0/0/0/20 pm current 30-sec fecThu Jun 3 15#42#28.510 PDTg709 FEC in the current interval [15#42#00 - 15#42#28 Thu Jun 3 2021]FEC current bucket type # Valid EC-BITS # 20221314973 Threshold # 83203400000 TCA(enable) # YES UC-WORDS # 0 Threshold # 5 TCA(enable) # YES MIN AVG MAX Threshold TCA Threshold TCA (min) (enable) (max) (enable)PreFEC BER # 1.5E-03 1.5E-03 1.6E-03 0E-15 NO 0E-15 NOPostFEC BER # 0E-15 0E-15 0E-15 0E-15 NO 0E-15 NOQ[dB] # 9.40 9.40 9.40 0.00 NO 0.00 NOQ_Margin[dB] # 2.20 2.20 2.20 0.00 NO 0.00 NOEPNM Monitoring of Routed Optical NetworkingEvolved Programmable Network Manager, or EPNM, can also be used to monitor router ZR/ZR+ performance measurement data and display device level alarms when faults occur. EPNM stores PM and alarm data for historical analysis.EPNM Chassis View of DCO TransceiversThe following shows a chassis view of a Cisco 8201 router. The default view is to show all active alarms on the device and its components. Clicking on a specific component will give information on the component and narrow the scope of alarms and data.Chassis ViewInterface/Port ViewEPNM DCO Performance MeasurementEPNM continuously monitors and stores PM data for DCO optics for important KPIs such as TX/RX power, BER, and Q values. The screenshots below highlight monitoring. While EPNM stores historical data, clicking on a speciic KPI will enable realtime monitoring by polling for data every 20 seconds.DCO Physical Layer PM KPIsThe following shows common physical layer KPIs such as OSNR and RX/TX power. This is exposed by monitoring the Optics layer of the interface. DCO.The following shows common framing layer KPIs such as number of corrected words per interval and (BIEC) Bit Error Rate. This is exposed by monitoring the CoherentDSP layer of the interface.Cisco IOS-XR Model-Driven Telemetry for Routed Optical Networking MonitoringAll operational data on IOS-XR routers and optical line systems can be monitored using streaming telemetry based on YANG models. Routed Optical Networking is no different, so a wealth of information can be streamed from the routers in intervals as low as 5s.ZR/ZR+ DCO TelemetryThe following represents a list of validated sensor paths useful for monitoringthe DCO optics in IOS-XR and the data fields available within thesesensor paths. Note PM fields also support 15m and 24h paths in addition to the 30s paths shown in the table below. 
Sensor Path Fields Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info alarm-detected, baud-rate, dwdm-carrier-frequency, controller-state, laser-state, optical-signal-to-noise-ratio, temperature, voltage Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-lanes/optics-lane receive-power, receive-signal-power, transmit-power Cisco-IOS-XR-controller-otu-oper#otu/controllers/controller/info bandwidth, ec-value, post-fec-ber, pre-fec-ber, qfactor, qmargin, uc Cisco-IOS-XR-pmengine-oper#performance-management/optics/optics-ports/optics-port/optics-current/optics-second30/optics-second30-optics/optics-second30-optic dd__average, dgd__average, opr__average, opt__average, osnr__average, pcr__average, pmd__average, rx-sig-pow__average, snr__average, sopmd__average Cisco-IOS-XR-pmengine-oper#performance-management/otu/otu-ports/otu-port/otu-current/otu-second30/otu-second30fecs/otu-second30fec ec-bits__data, post-fec-ber__average, pre-fec-ber__average, q__average, qmargin__average, uc-words__data NCS 1010 Optical Line System MonitoringThe following represents a list of validated sensor paths useful for monitoringthe different optical resources on the NCS 1010 OLS. The OTS controller represents the lowest layer port interconnecting optical elements. The NCS 1010 supports per-channel monitoring, exposed as the OTS-OCH Sensor Path Fields Cisco-IOS-XR-controller-ots-oper#ots-oper/ots-ports/ots-port/ots-info total-tx-power, total-rx-power, transmit-signal-power, receive-signal-power, agress-ampi-gain, ingress-ampli-gain, controller-state Cisco-IOS-XR-controller-ots-och-oper#ots-och-oper/ots-och-ports/ots-och-port/ots-och-info total-tx-power, total-rx-power, transport-admin-state, line-channel, add-drop-channel Cisco-IOS-XR-controller-oms-oper rx-power, tx-power, controller-state, led-state Cisco-IOS-XR-controller-och-oper#och-oper/och-ports/och-port/och-info channel-frequency, channel-wavelength, controller-state, rx-power, tx-power, channel-width, led-state Cisco-IOS-XR-pmengine-oper#performance-management/optics/optics-ports opr, opt, opr-s, opt-s Cisco-IOS-XR-olc-oper#olc/span-loss-ctrlr-tables/span-loss-ctrlr-table neighbor-rid, rx-span-loss, tx-span-loss, name Open-source MonitoringCisco model-driven telemetry along with the open source collector Telegraf and the open source dashboard software Grafana can be used to quickly build powerful dashboards to monitor ZR/ZR+ and NCS 1010 OLS performance.Additional ResourcesCisco Routed Optical Networking 2.0 Solution Guidehttps#//www.cisco.com/content/en/us/td/docs/optical/ron/2-0/solution/guide/b-ron-solution-20.htmlCisco Routed Optical Networking Home https#//www.cisco.com/c/en/us/solutions/service-provider/routed-optical-networking.html Cisco Routed Optical Networking Tech Field Day Solution Overview# https#//techfieldday.com/video/build-your-network-with-cisco-routed-optical-networking-solution/ Automation Demo# https#//techfieldday.com/video/cisco-routed-optical-networking-solution-demo/Cisco Champion Podcasts Cisco Routed Optical Networking Solution for the Next Decade https#//smarturl.it/CCRS8E24 Simplify Network Operations with Crosswork Hierarchical Controller# https#//smarturl.it/CCRS8E48 Appendix AAcronyms     DWDM Dense Waveform Division Multiplexing OADM Optical Add Drop Multiplexer FOADM Fixed Optical Add Drop Multiplexer ROADM Reconfigurable Optical Add Drop Multiplexer DCO Digital Coherent Optics FEC Forward Error Correction OSNR Optical Signal to Noise Ratio BER Bit Error Rate 
DWDM Network Hardware OverviewOptical Transmitters and ReceiversOptical transmitters provide the source signals carried across the DWDM network.They convert digital electrical signals into a photonic light stream on aspecific wavelength. Optical receivers detect pulses of light and and convertsignals back to electrical signals. In Routed Optical Networking, digital coherent QSFP-DD OpenZR+ and 400ZR transceivers in routers are used as optical transmitters and receivers.Multiplexers/DemultiplexersMultiplexers take multiple wavelengths on separate fibers and combine them intoa single fiber. The output of a multiplexer is a composite signal.Demultiplexers take composite signals that compatible multiplexers generate andseparate the individual wavelengths into individual fibers.Optical AmplifiersOptical amplifiers amplify an optical signal. Optical amplifiers increase thetotal power of the optical signal to enable the signal transmission acrosslonger distances. Without amplifiers, the signal attenuation over longerdistances makes it impossible to coherently receive signals. We use differenttypes of optical amplifiers in optical networks. For example# preamplifiers,booster amplifiers, inline amplifiers, and optical line amplifiers.Optical add/drop multiplexers (OADMs)OADMs are devices capable of adding one or more DWDM channels into or droppingthem from a fiber carrying multiple channels.Reconfigurable optical add/drop multiplexers (ROADMs)ROADMs are programmable versions of OADMs. With ROADMs, you can change thewavelengths that are added or dropped. ROADMs make optical networks flexible andeasily modifiable.", "url": "/blogs/2022-12-01-cst-routed-optical-2_0/", "author": "Phil Bedard", "tags": "iosxr, design, optical, ron, routing" } , "blogs-latest-converged-sdn-transport-srv6": { "title": "Cisco Converged SDN Transport SRv6 Transport High Level Design", "content": " On This Page Revision History Solution Component Software Versions Summary SRv6 Technology Overview SRv6 Benefits Scale Simple Forwarding Forwarding and Service Congruency Segment Routing v6 IETF Standards and Drafts IPv6 Segment Routing Header (SRH) SRv6 Locator SRv6 Compressed SID (micro-SID / uSID) SRv6 micro-SID Terminology SRv6 Addressing with Compressed SID SRv6 uSID Carrier Format Global C-SID Block (GIB) Local C-SID Block (LIB) Baseline SRv6 Forwarding Behavior Compressed SID without SRH Compressed SID with additional SRv6 Headers TI-LFA Mid-Point Protection SRv6 Deployment Overview Scalable Deployment using Domain Summarization and Redistribution Mutual Redistribution with Redundant Connectivity Unreachable Prefix Announcement SR Flexible Algorithms SRv6-TE Policies for Advanced TE SRv6 Network Functions and Endpoint Behaviors SRv6 Compressed SID Behavior SRv6 Network Implementation Network Domain Planning SRv6 Network Address Planning Locator Planning and Formatting Router Configuration SRv6 Micro-SID Hardware Enablement Loopback Interface Configuration PE Locator Configuration PE IS-IS Router Configuration Domain Boundary IS-IS Configuration Unreachable Prefix Advertisement Configuration Summary Prefix Configuration Core and Access Mutual Redistribution SRv6-TE Policies On-Demand SRv6-TE Policy Enabling Services over SRv6 Services Route Reflector Design SRv6 Service Forwarding L3VPN Forwarding Example L3VPN Configuration Example Egress PE Configuration L3VPN Route on Ingress Node Forwarding Entry on Ingress Node L2VPN EVPN-VPWS L2VPN EVPN-VPWS State EVPN ELAN Egres PE EVPN Configuration Egress PE BVI Configuration 
Egress PE SRv6 SID Allocation MPLS and SRv6 Migration SRv6 and MPLS Service Interworking Gateway Dual-Connected PE SRv6 Automation Crosswork Network Controller 4.1 Additional Resources Revision History Version Date Comments 1.0 03/01/2023 Initial version Solution Component Software Versions Element Version IOS-XR Routers (ASR 9000, NCS 5500, NCS 540) 7.8.1 Crosswork Network Controller 4.1 SummarySegment Routing has become the de facto underlay transport architecture for next-generation IP networks. SR simplifies the underlay network while also enhancing it with additional capabilities to carry differentiated services and maintain end-to-end SLAs for network deployments such as 5G mobile services.Segment Routing is an architecture, with a base set of functions achievable using standards based technology. The architecture gives SR the flexibility to adopt the best technology for a specific network and the use cases it needs to support.One component where SR supports this flexibility is in the data-plane transport layer. At its core SR is an underlay transport technology allowing a network to carry overlay services, such as L2VPN, EVPN, and simple IP services. The most ubiquitous data-plane transport used today in packet networks is MPLS. MPLS originated as Cisco tag switching almost 25 years ago (1998) and powers both service provider and enterprise networks worldwide.Segment Routing supports MPLS using the SR-MPLS data-plane, where SR SIDs are allocated as MPLS labels and is widely deployed today.IPv6 is the next-generation of IP addressing, also available for more than two decades.IPv6 promised simplified networks and services by utilizing the large amount of address space to ues IPv6 addressing to easily correlate packet to service. The vision has been there but the proper technology did not fully enable it. Segment Routing v6 is the technology which not only provides an IPv6-only data-plane across the network, but also creates a symmetry between data-plane, overlay services, and performance monitoring.SRv6 is the technology for enabling next-generation IPv6 based networks tosupport complex user and infrastructure services.SRv6 Technology OverviewSRv6 BenefitsScaleOne of the main benefits of SRv6 is the ability to build networks at huge scalethrough address summarization. MPLS networks require a unique label per node be distributed to all nodes requiring mutual reachability. In most cases there is also the requirement of distributing a /32 IP prefix as well. In the MPLS CST design we have the option to eliminate the IP and MPLS label distribution by utilizing a PCE to compute end to end paths. While this method works well and is also available for SRv6 it can lead to a large number of SR-TE or SRv6-TE policies.SRv6 allows us to summarize domain loopback addresses at IGP or BGP boundaries.This means a domain of 1000 nodes no longer requires advertising 1000 IPprefixes and associated labels, but can be summarized into a single IPadvertisement reachable via a simple longest prefix match (LPM) lookup.Networks of tens of thousands of nodes can now provide full reachability withvery few IPv6 routes. See the deployment options section for more information.Simple ForwardingIn SRv6, if a node is not a terminating node it simply forwards the traffic using IPv6 IP forwarding. This means nodes which are not SRv6 aware can also participate in a SRv6 network.Forwarding and Service CongruencyAs you will see in the services section, the destination IPv6 address in SRv6 is the service endpoint. 
Coupled with the simple forwarding this aids in troubleshooting and is much easier to understand than the MPLS layered service and data plane.Segment Routing v6 IETF Standards and Drafts IETF Draft or RFC Description RFC 8754 IPv6 Segment Routing Header RFC 8986 SRv6 Network Programming RFC 9252 BGP Overlay Services Based on Segment Routing over IPv6 RFC 9259 SRv6 OAM draft-ietf-spring-srv6-srh-compression SRv6 compressed SIDs (uSID) ietf-lsr-isis-srv6-extensions IS-IS extensions to support SRv6 ietf-lsr-ospfv3-srv6-extensions OSPFv3 extensions to support SRv6 draft-ppsenak-lsr-igp-pfx-reach-loss Unreachable Prefix Announcement IPv6 Segment Routing Header (SRH)Defined in RFC 8754, the SRv6 header includes the SRv6 SID list along withadditional attributes to program the SRv6 IPv6 data plane path. The SRH may beinserted by the source node or a SRv6 termination point. The SRH is not requiredin all SRv6 use cases such as a simple L3VPN or L2VPN with no trafficengineering requirements. Unlike MPLS, the SRv6 end IPv6 address can be used toidentify the endpoint node and the service.SRv6 LocatorThe SRv6 locator is part of the SRv6 SID structure of Locator#Function#Argument. The locator is allocated to nodes acting as SRv6 endpoints. The locator is defined asa specific amount of bits of the IPv6 address steering traffic to the endpoint node.SRv6 Compressed SID (micro-SID / uSID)SRv6 is made more efficient with the use compressed SIDs. Compressed SIDs are also known as micro-SIDs (uSID). In the case of micro-SID, multiple SRv6 SIDs can be encoded in a single 128-bit SRv6 SID. Additional SRv6 SIDs can be included in the path by adding an additional SRH if necessary.SRv6 micro-SID Terminology Item Definition Compressed-SID (C-SID) A short encoding of a SID in an SRv6 packet that does not include the SID locator block bits uSID A Micro SID. A type of Compressed-SID referred as NEXT-C-SID uSID Locator Block A block of uSIDs uSID Containe A 128-bit SRv6 SID that contains a sequence of uSIDs. It can be encoded in the DA of an IPv6 header or at any position in the Segment List of an SRH Active uSID First uSID after the uSID locator block Next uSID Next uSID after the Active uSID Last uSID From left to right, the last uSID before the first End-of-Container uSID End-of-Container (EoC) uSID Reserved uSID (all-zero ID) used to mark the end of a uSID container. All the empty uSID container positions must be filled with the End-of-Container ID SRv6 Addressing with Compressed SIDUsing the compressed SID or micro-SID format requires defining the IPv6 addressstructure segmenting the IPv6 address into a C-SID portion and Locator block portion. A dedicated IPv6 prefix should be used for SRv6 and SRv6 micro-SID allocation. RIR assigned public prefixes can be utilized or private ULA space defined in RFC4193.SRv6 uSID Carrier FormatMicro-SID requires defining a carrier format used globally across the network. A parent block size is dedicated to micro-SID allocations and the length of each micro-SID. IOS-XR supports the F3216 format, defining a 32-bit micro-SID block and 16-bit ID format.Global C-SID Block (GIB)The global ID block defines the block of SIDs used to identify nodes either uniquely or as part of an anycast group. These addresses are used by other nodes on the network to send SRv6 service traffic to the endpoint with the Global C-SID assigned.Local C-SID Block (LIB)The local ID block is used to define SIDs which are local to a node. 
These are typically used to identify services terminating on a node.Baseline SRv6 Forwarding BehaviorForwarding in SRv6 follows the semantics of simple IPv6 routing. The destination is always identified as an IPv6 address. In the case of an SRv6 packet without an additional SRH, traffic is routed to the endpoint destination node hop by hop based on normal destination-based prefix lookups. In the case of an SRH, the SRH is only processed by a node if it is the destination address in the outer IPv6 header. If the node is the last SID in the SID list it will pop the SRH and process the packet further. If the node is not the last SID in the SID list it will replace the outer IPv6 destination address with the next IPv6 address in the SID list.Compressed SID without SRHCompressed SIDs have the ability to instantiate a multi-hop SRv6 path using a single 128-bit IPv6 address. Each micro-SID in the F3216 format uses 16 bits to identify the next node. If the node receives the packet with its own address as the IPv6 destination address it will further process the packet. It will either shift the micro-SID component of the address 16 bits to the left and copy the new address into the IPv6 destination address, or, if the locator is local to the node, further process the service packet. Using the F3216 carrier format in IOS-XR, up to 6 micro-SIDs can be encoded in a single 128-bit IPv6 address.The example below illustrates the forwarding behavior with no additional SRH.Compressed SID with additional SRv6 HeadersUsing additional SRv6 headers increases the depth of the micro-SID list to support use cases with longer traffic engineered paths. This allows SRv6 with micro-SID to enable path hop counts greater than SR-MPLS.TI-LFA Mid-Point ProtectionSRv6 fully supports Topology Independent Loop-Free Alternates, ensuring fast traffic protection in the case of link and node failures. A per-prefix pre-computed loop-free backup path is created. If the backup path requires traversing links which may end up in a loop, the protecting node will insert an SRH with the appropriate SIDs to reach the Q node with loop-free reachability to the destination prefix.SRv6 Deployment OverviewScalable Deployment using Domain Summarization and RedistributionIn the CST design we utilize separate IGP instances to segment the network. These segments can be based on scale, place in the network, or for other administrative reasons. We do not recommend exceeding 2000 routers in a single IGP domain.SRv6 and its summarization capabilities are ideal for building high-scale networks based on the CST design. At each domain boundary the IPv6 locator blocks are summarized and redistributed across adjacent domains. The end-to-end redistribution of summary prefixes enables reachability between any two nodes on the network by simply doing a longest-prefix match on the destination address.Mutual Redistribution with Redundant ConnectivityRedistribution between IGP domains interconnected by multiple links requires additional consideration. While summary prefixes may not affect intra-domain connectivity, if they are redistributed back into the domain that initially distributed them, routing loops may occur. It’s important to make sure the router is properly configured to not violate the split-horizon rule of advertising routes back to the same domain it received them from. See the IGP implementation section later in this document for more details. 
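To make the summarization model concrete, the short Python sketch below checks that per-node /48 locators fall under a single per-domain /40 summary, so a remote domain needs only the summary route to reach any node (and any service SID allocated beneath that node's locator). The prefixes are illustrative values consistent with the F3216 addressing plan described later in this document, not addresses taken from a specific deployment.

```python
import ipaddress

# Illustrative values following the addressing plan described later in this
# document: one /40 summary per access domain, one /48 locator per node.
domain2_summary = ipaddress.ip_network("fccc:0:200::/40")
node_locators = [
    ipaddress.ip_network("fccc:0:214::/48"),  # hypothetical PE in access domain 2
    ipaddress.ip_network("fccc:0:215::/48"),  # hypothetical PE in access domain 2
]

# Every node locator is covered by the single /40 summary, so remote domains
# need only the summary advertisement for full reachability.
assert all(loc.subnet_of(domain2_summary) for loc in node_locators)

# A service SID allocated under a node locator (for example a /64 used by an
# L3VPN) matches the same summary, so plain longest-prefix-match forwarding
# delivers service traffic end to end without any per-node routes.
service_sid = ipaddress.ip_address("fccc:0:215:e004::")
print(service_sid in domain2_summary)  # True
```

Inside the originating domain the more specific /48 locators are still present, and longest-prefix match naturally prefers them over the summary.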
Unreachable Prefix AnnouncementSummarization hides the state of longer prefixes within the aggregate summary, leading to traffic loss or slower failover when an egress PE is unreachable. UPA is an IGP function to quickly poison a prefix which has become unreachable to an upstream node. It enables the notification of an individual prefix becoming unreachable, outside of the local area/domain and across the network, in a manner that does not leave behind any persistent state in the link-state database. When an ingress PE receives the UPA for an egress PE it can trigger fast switchover to an alternate path, such as a BGP PIC pre-programmed backup path.SR Flexible AlgorithmsFlexible Algorithms, or Flex-Algo, is an important component in SRv6 networks. It enables advanced forwarding behavior without utilizing SR-TE Policies, increasing scale and simplifying network deployments. Flex-Algo is used in the CST SRv6 design to differentiate traffic based on latency or path constraints.SRv6-TE Policies for Advanced TESRv6 supports the same TE Policy functionality as SR-MPLS. In cases where more advanced TE is required than Flex-Algo provides, such as defining explicit paths or requiring a path be disjoint from another path, SRv6-TE can be utilized. In CST SRv6 1.0, on-demand networking can be used for supported services with SR-PCE to compute both intra-domain and inter-domain paths. Provisioning, visualization, and monitoring of SRv6-TE paths is available in Crosswork Network Controller 4.1.SRv6 Network Functions and Endpoint BehaviorsRFC 8986 defines a set of SRv6 endpoint behaviors satisfying specific network service functions. The table below defines a base set of functions and the identifiers used. See RFC 8986 for details on each behavior. Behavior Identifier Behavior Description End SRv6 version of prefix-SID End.X L3 cross-connect, SRv6 version of Adj-SID End.T IPv6 table lookup End.DX6 Decapsulate and perform IPv6 cross-connect, per-CE IPv6 L3VPN use case End.DX4 Decapsulate and perform IPv4 cross-connect, per-CE IPv4 L3VPN use case End.DT6 Decapsulate and perform IPv6 route lookup, per-VRF IPv6 L3VPN use case End.DT4 Decapsulate and perform IPv4 route lookup, per-VRF IPv4 L3VPN use case End.DT46 Decapsulate and perform IPv4 or IPv6 route lookup, per-VRF IP L3VPN use case End.DX2 Decapsulate and perform L2 cross-connect, P2P L2VPN use case End.DX2V Decapsulate and perform L2 VLAN lookup, P2P L2VPN using VLANs use case End.DT2U Decapsulate and perform unicast MAC lookup, L2VPN ELAN use case End.DT2M Decapsulate and perform L2 flooding, L2VPN ELAN use case End.B6.Encaps Identifies SRv6 SID bound to an SRv6 Policy, binding SID End.B6.Encaps.Red End.B6 with reduced SRH End.BM Endpoint bound to an SR-MPLS Policy SRv6 Compressed SID BehaviorWhen SRv6 SID compression is used, enhanced methods apply when processing End SRv6 packets. Since multiple SIDs are encoded in a single IPv6 address as the argument component of the SRv6 address, a portion of the argument equal to the SID length is copied into a specific portion of the IPv6 destination address matching the next node in the path; the short sketch below illustrates this shift. 
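Here is a minimal Python model of that uN shift-and-lookup step for the F3216 carrier format. It is only a sketch of the container arithmetic under the stated block and SID sizes; the sample container value is hypothetical, and the real behavior is implemented in the router data plane.

```python
import ipaddress

BLOCK_BITS = 32  # F3216 carrier format: 32-bit uSID locator block
SID_BITS = 16    # F3216 carrier format: 16-bit uSIDs

def usid_shift(dest_addr: str) -> str:
    """Model the uN 'shift and lookup' step on an F3216 uSID container.

    The 32-bit locator block is preserved, the remaining 96 bits shift left
    by 16 so the next uSID becomes active, and the vacated low-order bits are
    filled with the End-of-Container value (0x0000).
    """
    da = int(ipaddress.IPv6Address(dest_addr))
    block_mask = ((1 << BLOCK_BITS) - 1) << (128 - BLOCK_BITS)
    usid_list = da & ~block_mask                     # the 96-bit uSID list
    shifted = (usid_list << SID_BITS) & ~block_mask  # drop the active uSID
    return str(ipaddress.IPv6Address((da & block_mask) | shifted))

# Hypothetical container steering a packet through nodes 0x0100, 0x0200, 0x0300.
# After node 0x0100 processes its uN SID, the next uSID becomes active:
print(usid_shift("fccc:0:100:200:300::"))  # fccc:0:200:300::
```

Only the node that owns the active uSID performs this shift; every other hop simply forwards on the destination address with a normal longest-prefix-match lookup.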
Micro-SID Behavior Identifier Behavior Description uN NEXT-CSID End behavior with shift and lookup (Prefix-SID) uA NEXT-CSID End.X behavior with shift and xconnect (Adj-SID) uDT NEXT-CSID End.DT behavior (End.DT4/End.DT6/End.DT2U/End.DT2M) uDX NEXT-CSID End.DX behavior (End.DX4/End.DX6/End.DX2) SRv6 Network ImplementationImplementing SRv6 in the network requires the following steps# Network Domain Planning SRv6 Network Address Planning SRv6 Router Configuration SRv6 Enabled Services ConfigurationNetwork Domain PlanningNetworks require segmentation to scale. The initial step in designing scalable networks is to determine where network boundaries. This leads directly to IGP segmentation of the network utilized in the CST SRv6 design. In our example network we have three domains, two access domains and one core domain. Each of these IGP domains is assigned a unique instance identifier.SRv6 Network Address PlanningSRv6 using micro-SID requires allocating the appropriate parent IPv6 prefix and then further delegation based on the micro-SID carrier format. We will use the F3216 format, using a 32 bit block and 16 bits for each SID. It is recommended the parent prefix, such as a /24 be allocated ONLY for SRv6 use. The block can be from public IPv6 resources or utilize a ULA private block.Specific segments of the IPv6 address can be used to represent areas of the network, flexible algorithms, or other network data. It is important to use the specific bits or bytes of the address used as identifiers to also aid in summarization.Locator Planning and FormattingThe SRv6 locator identifies a node and its specific services. SRv6 using micro-SID should use a specific locator format that adheres to the micro-SID carrier format and lends itself to summarization at network boundaries. The following is a recommended way to define the locator format which allows for efficient summarization. It is required to use a locator prefix length of /48 on all nodes when using the F3216 carrier format.The example locator above encodes the following information# Identifier Bit Length Usage fccc#00 24 Base SRv6 locator prefix used network wide XY 8 Together identifies the uSID block X 4 General-use identifier, NCS platforms require this byte be set to 0 Y 4 In our case the X portion identifies Flexible Algorithm, 0-3F usable for global SIDs ZZ 8 Identifies domain NN 8 Identifies node In our example network, we have a /32 micro-SID prefix allocated network wide for each Flex-Algo. This is recommended as it quickly allows operators to identify the FA being used and promotes more efficient summarization. However, if FA is not being used these bits could be used for a different identifier.The 8-bit domain identifier allows 255 domains, the 8-bit node identifier allows 255 nodes per domain. This is flexible however, the structure could be shifted to allow less domains and more nodes per domain.Using the schema above our example address is as follows# Identifier Value Meaning fccc#00 24 Base SRv6 locator prefix used network wide X 0 General-use identifier, NCS platforms require this byte be set to 0 Y 1 Flexible Algorithm 128 ZZ 02 Domain 102 NN 15 Node assigned identifier 15 This allows each domain’s SRv6 SIDs to be summarized per flex-algo at the /40 prefix length.Router ConfigurationThe CST design supports using SRv6 micro-SID only. Legacy SRv6 using the 128-bit carrier format is not supported.SRv6 Micro-SID Hardware EnablementOn NCS 540 and NCS 5500 platforms the following command enables SRv6 with micro-SID carrier. 
hw-module profile segment-routing srv6 mode micro-segment format f3216Loopback Interface ConfigurationWhile not required, it is recommended to use an IPv6 address from the Algo 0 (base IGP)locator address block for the Loopback interface. interface Loopback0 ipv4 address 101.0.2.53 255.255.255.255 ipv6 address fccc#0#214##1/128!PE Locator ConfigurationSRv6 enabled routers terminating services must have SRv6 locators configured. In our Flex-Algo use case we will have a single locator configured for each Flex-Algo, although more locators can be configured as needed. The unode behavior is set to psp-usd which performs penultimate-segment-popping and ultimate-segment-decapsulation. See RFC 8986 for more information on these behaviors.Locators are used to allocate both static and dynamic /64 SIDs used for services and link adjacency SIDs. The SIDs used for dynamic allocation are in the e0000-ffff range in bits 48-63 of the IPv6 address.In this case the locator value is assigned based on the following as identified in the addressing section# Identifier Value Domain Access 2 Global Base SRv6 Block FCCC#00##/24 Global FA 0 Block FCCC#0000##/32 Global FA 128 Block FCCC#0001 ##/32 Global FA 129 Block FCCC#0002##/32 Global FA 130 Block FCCC#0003##/32 Global FA 131 Block FCCC#0004##/32 Access Domain 2 FCCC#00XX#02##/40 Unique node in Access Domain 2 FCCC#00XX#0214##/48 PE Locator Router Configurationsegment-routing srv6 encapsulation source-address fccc#0#214##1 ! locators locator LocAlgo0 micro-segment behavior unode psp-usd prefix fccc#0#214##/48 ! locator LocAlgo128 micro-segment behavior unode psp-usd prefix fccc#1#214##/48 algorithm 128 ! locator LocAlgo129 micro-segment behavior unode psp-usd prefix fccc#2#214##/48 algorithm 129 ! locator LocAlgo130 micro-segment behavior unode psp-usd prefix fccc#3#214##/48 algorithm 130 ! locator LocAlgo131 micro-segment behavior unode psp-usd prefix fccc#4#214##/48 algorithm 131 ! ! !!Configured LocatorsRP/0/RP0/CPU0#cst-a-pe3#show segment-routing srv6 locatorThu Jan 5 17#27#30.127 UTCName ID Algo Prefix Status Flags-------------------- ------- ---- ------------------------ ------- --------LocAlgo0 1 0 fccc#0#214##/48 Up ULocAlgo128 2 128 fccc#1#214##/48 Up ULocAlgo129 3 129 fccc#2#214##/48 Up ULocAlgo130 4 130 fccc#3#214##/48 Up ULocAlgo131 5 131 fccc#4#214##/48 Up UDynamic Micro-SID Service SIDs based on LocatorAs shown below the primary locator for Algo 0 is identified as fccc#0#103##/48. Each service has one or more SIDs allocated starting at fccc#0#103#e000##/64.RP/0/RP0/CPU0#cst-a-pe3#show segment-routing srv6 sidThu Jan 5 17#30#55.250 UTC*** Locator# 'LocAlgo0' ***SID Behavior Context Owner State RW-------------------------- ---------------- -------------------------------- ------------------ ----- --fccc#0#103## uN (PSP/USD) 'default'#259 sidmgr InUse Yfccc#0#103#e000## uDT2U 4550#0 l2vpn_srv6 InUse Yfccc#0#103#e001## uDT2M 4550#0 l2vpn_srv6 InUse Yfccc#0#103#e002## uDX2 4600#600 l2vpn_srv6 InUse Yfccc#0#103#e003## uDX2 650#650 l2vpn_srv6 InUse Yfccc#0#103#e004## uDT4 'l3vpn-v4-srv6' bgp-100 InUse YPE IS-IS Router ConfigurationSRv6 is the IPv6 data plane for Segment Routing and utilizes the same SID distribution semantics as SR-MPLS. This is achieved through IGP extensions responsible for distributing SID reachability within an IGP domain or even across IGP domains. The CST design utilizes IS-IS.It is recommended to utilize IS-IS as an IPv6 IGP. 
OSPFv3 is not widely deployed in networks today and typically lags behind IS-IS in feature development.In this example we are utilizing the base Algo 0 and four additional Algos, 128,129,130,131. Each requires a separate Locator be applied to a Loopback interfaceon the router. The locator names are those defined in the earlier SRv6 base configuration section.router isis ACCESS address-family ipv6 unicast metric-style wide microloop avoidance segment-routing router-id Loopback0 segment-routing srv6 locator LocAlgo0 ! locator LocAlgo128 ! locator LocAlgo129 ! locator LocAlgo130 ! locator LocAlgo131 ! ! maximum-redistributed-prefixes 10000 level 2 ! interface Loopback0 address-family ipv6 unicast ! ! interface TenGigE0/0/0/9 bfd fast-detect ipv6 point-to-point hello-password keychain ISIS-KEY address-family ipv6 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa ! !!Domain Boundary IS-IS ConfigurationThe multiple instance IS-IS configuration is similar to the SR-MPLS CST design. The primary differences are the use of summarization, redistribution between instances, and the use of UPA.Unreachable Prefix Advertisement ConfigurationThe UPA configuration is enabled by configuring the UPA parameters under prefix-unreachable and enabling the “adv-unreachable” for the summary prefix. The adv-metric sets the metric of the unreachable prefix and the adv-lifetime sets the amount of time it should be advertised in milliseconds. Prefixes generated by UPA can also be limited to those with a specific IGP tag by using the unreachable-component-tag option. As an example Loopback and SRv6 Locator addresses can be tagged so they generate a UPA but infrastructure links are omitted.Summary Prefix ConfigurationA flex-algo algorithm can be attached to the summary prefix and used in pathcomputation. Using the explicit keyword means only SRv6 prefix SIDs with thespecified algorithm attached will be considered as contributing prefixes for thesummary.Core and Access Mutual RedistributionRedistribution between IGP instances should always utilize route policies with appropriate prefix-sets or tags to restrict the prefixes advertised between domains.Link-state IGP protocols only allow prefix summarization at boundaries such as IS-IS areas, levels, or OSPF area boundaries. Summarization can also be performed on external prefixes redistributed from another protocol or IGP instance.In our case the summary-prefix configuration for the ACCESS IGP instances is in the CORE instance configuration. Tagging can be used to filter multiple prefixes based on a single tag, such as all prefixes belonging to a specific domain.UPA prefixes must also be redistributed beyond domain boundaries. The longer UPAcomponent prefix is generated in the CORE IS-IS instance, and must beredistributed into the appropriate remote instance. As an example iffccc#0#100#1/128 in Access Domain 1 is unreachable, a UPA is generated in theCORE domain for the /128 prefix, and then must be redistributed in Access domain2 so ingress PEs can more quickly switch to an alternative path. In our exampleBGP next-hop for service routes is the /128 Loopback address assigned from Algo0, so we must leak /128 UPAs from the core into each access domain matching theremote domain prefixes. 
This can be simplified by using tagging instead of strict prefix-lists.Access-1 to Core Boundaryprefix-set ACCESS1-PE-uSID fccc#0#100##/40 eq 48, fccc#1#100##/40 eq 48, fccc#2#100##/40 eq 48, fccc#3#100##/40 eq 48, fccc#4#100##/40 eq 48end-setprefix-set ACCESS2-PE-uSID-Summary fccc#0#200##/40, fccc#1#200##/40, fccc#2#200##/40, fccc#3#200##/40, fccc#4#200##/40end-setprefix-set ACCESS2-PE-uSID-UPA fccc#0#200##/40 eq 128end-setroute-policy CORE-TO-ACCESS1-SRv6 if destination in ACCESS2-PE-uSID-Summary then pass else drop endifend-policyroute-policy ACCESS1-TO-CORE-SRv6 if destination in ACCESS1-PE-uSID then pass else drop endifend-policyrouter isis ACCESS address-family ipv6 unicast prefix-unreachable adv-metric 4261412866 adv-lifetime 1000 rx-process-enable ! summary-prefix fccc##/40 tag 100 adv-unreachable redistribute isis CORE route-policy CORE-TO-ACCESS1-SRv6In the ACCESS instance configuration the summary prefix is equivalent to fccc#0000#00/40 as IOS-XR removes trailing zeroes from the address in the configuration.router isis CORE address-family ipv6 unicast prefix-unreachable adv-metric 4261412866 adv-lifetime 1000 rx-process-enable ! summary-prefix fccc#0#100##/40 tag 101 adv-unreachable summary-prefix fccc#1#100##/40 algorithm 128 tag 101 adv-unreachable explicit summary-prefix fccc#2#100##/40 algorithm 129 tag 101 adv-unreachable explicit summary-prefix fccc#3#100##/40 algorithm 130 tag 101 adv-unreachable explicit summary-prefix fccc#4#100##/40 algorithm 131 tag 101 adv-unreachable explicit redistribute isis ACCESS route-policy ACCESS1-TO-CORE-SRv6 ! !!Access-2 to Core Boundaryprefix-set ACCESS2-PE-uSID fccc#0#200##/40 le 48, fccc#1#200##/40 le 48, fccc#2#200##/40 le 48, fccc#3#200##/40 le 48, fccc#4#200##/40 le 48end-setprefix-set ACCESS1-PE-uSID-Summary fccc#0#100##/40, fccc#1#100##/40, fccc#2#100##/40, fccc#3#100##/40, fccc#4#100##/40end-setprefix-set ACCESS1-PE-uSID-UPA fccc#0#100##/40 eq 128end-setroute-policy CORE-TO-ACCESS2-SRv6 if destination in ACCESS1-PE-uSID-Summary then pass else drop endifend-policyroute-policy ACCESS2-TO-CORE-SRv6 if destination in ACCESS2-PE-uSID then pass else drop endifend-policyrouter isis ACCESS address-family ipv6 unicast prefix-unreachable adv-metric 4261412866 adv-lifetime 1000 rx-process-enable ! summary-prefix fccc##/40 tag 100 adv-unreachable redistribute isis CORE route-policy CORE-TO-ACCESS2-SRv6 router isis CORE address-family ipv6 unicast prefix-unreachable adv-metric 4261412866 adv-lifetime 1000 rx-process-enable ! summary-prefix fccc#0#200##/40 tag 102 adv-unreachable summary-prefix fccc#1#200##/40 algorithm 128 tag 102 adv-unreachable explicit summary-prefix fccc#2#200##/40 algorithm 129 tag 102 adv-unreachable explicit summary-prefix fccc#3#200##/40 algorithm 130 tag 102 adv-unreachable explicit summary-prefix fccc#4#200##/40 algorithm 131 tag 102 adv-unreachable explicit redistribute isis ACCESS route-policy ACCESS2-TO-CORE-SRv6 ! !!SRv6-TE PoliciesSRv6 supports Traffic Engineering policies, using different metric types and additional path constraints to engineer traffic paths from head-end to endpoint node. 
SRv6 TE Policy configuration follows the same configuration as SR-MPLS TE policies.In the CST SRv6 design 1.0 dynamic path calculation is supported using SR-PCE.Please see the CST implementation guide for PCE configuration details.Additionally, in CST SRv6 1.0 path computation for SRv6-TE Policies requires a path of nodes supporting for SRv6, if the only path is via a node not supporting SRv6, path calculation will fail.segment-routing traffic-eng policy srte_c_21009_ep_fccc#0#215##1 srv6 locator LocAlgo0 binding-sid dynamic behavior ub6-insert-reduced ! source-address ipv6 fccc#0#103##1 color 21009 end-point ipv6 fccc#0#215##1 candidate-paths preference 100 dynamic pcep ! metric type igp ! ! ! ! ! !!Operational DetailsRP/0/RP0/CPU0#cst-a-pe3#show segment-routing traffic-eng policy color 21009Thu Jan 5 23#42#24.222 UTCSR-TE policy databaseColor# 21009, End-point# fccc#0#215##1 Name# srte_c_21009_ep_fccc#0#215##1 Status# Admin# up Operational# up for 13#36#55 (since Jan 5 10#05#28.918) Candidate-paths# Preference# 100 (configuration) (active) Name# srte_c_21009_ep_fccc#0#215##1 Requested BSID# dynamic PCC info# Symbolic name# cfg_srte_c_21009_ep_fccc#0#215##1_discr_100 PLSP-ID# 52 Constraints# Protection Type# protected-preferred Maximum SID Depth# 7 Dynamic (pce 101.0.1.101) (valid) Metric Type# IGP, Path Accumulated Metric# 260 SID[0]# fccc#0#108##/48 Behavior# uN (PSP/USD) (48) Format# f3216 LBL#32 LNL#16 FL#0 AL#80 Address# fccc#0#108##1 SID[1]# fccc#0#e##/48 Behavior# uN (PSP/USD) (48) Format# f3216 LBL#32 LNL#16 FL#0 AL#80 Address# fccc#0#e##1 SID[2]# fccc#0#215##/48 Behavior# uN (PSP/USD) (48) Format# f3216 LBL#32 LNL#16 FL#0 AL#80 Address# fccc#0#215##1 SRv6 Information# Locator# LocAlgo0 Binding SID requested# Dynamic Binding SID behavior# End.B6.Insert.Red Attributes# Binding SID# fccc#0#103#e014## Forward Class# Not Configured Steering labeled-services disabled# no Steering BGP disabled# no IPv6 caps enable# yes Invalidation drop enabled# no Max Install Standby Candidate Paths# 0Explicit segment list definitionIn the case of building a policy with explicit paths, the sid-format must be defined so the appropriate uSID container can be populated with each node SID in the path.segment-routing traffic-eng segment-lists srv6 sid-format usid-f3216 ! segment-list APE7-srv6 srv6 index 1 sid fccc#0#109## index 2 sid fccc#0#20f## index 3 sid fccc#0#215## ! ! ! !!On-Demand SRv6-TE PolicyOn-demand next-hop or ODN is supported for SRv6. In the 1.0 version of the CST SRv6 design, ODN paths must be computed using SR-PCE.In this case the dynamic binding SID for the policy is associated with the flex-algo Algo128 locator and has a constraint to use algorithm 128.segment-routing traffic-eng on-demand color 6128 srv6 locator LocAlgo128 binding-sid dynamic behavior ub6-insert-reduced ! source-address ipv6 fccc#1#103## dynamic pcep ! metric type igp ! ! constraints segments sid-algorithm 128 ! ! ! !!Enabling Services over SRv6Segment Routing with the IPv6 data plane is used to support all of the services supported by the MPLS data plane while also enabling advanced functionality not capable in MPLS based networks. In CST SRv6 1.0 L3VPN, EVPN-ELAN, and EVPN-VPWS services are supported using SRv6 micro-SID.Services Route Reflector DesignThe Converged SDN Transport design makes use of a set of service BGP route reflectors (S-RRs) communicating BGP service routes between PE nodes in a scalable and resilient manner. 
In CST SRv6 1.0 SRv6 service routes are expected from an IPv6 peer.SRv6 Service ForwardingMPLS uses a multi-label stack to carry overlay VPN services over an MPLS underlay network. There is always at least a 2-label stack identifying the egress node and specific underlay service or service entity.SRv6 utilizes the flexibility of IPv6 to encode the service information without multiple layers. Since the Locator assigned to the egress node is a /48 all traffic within the /48 will reach the node. This leaves the remaining 80 bits to be used for identifying services and service components. IOS-XR will dynamically assign a /64 out of the /48 Locator for services. As an example a L3VPN with per-VRF SRv6 SID allocation will be assigned a /64 as will an EVPN-VPWS service.In the simplest forwarding use case the ingress node simply sets the SRv6 packet destination address to the IPv6 service address. It is forwarded hop by hop based on the IPv6 DA, meaning intermediate nodes not SRv6 aware can also forward the traffic.L3VPN Forwarding ExampleIn the single-domain example below, a VPNv4 L3VPN is configured between routers R1 and R3. R3 advertises the VPNv4 prefix with the appropriate parameters required for R1 to properly build the packet to R3. As you can see, there is no SRH involvedin this simple example, all of the information is encoded in the VPNv4 advertisement to allow R1 to use a single IPv6 destination address to send traffic to the appropriateservice.Traffic will be routed across the proper Flex-Algo path. R4 will utilizestandard LPM (longest prefix match) routing using the Algo 128 topology.The uDT4 behavior means “decapsulate the packet and perform an IPv4 routing lookup”. The local SID fccc#1#215#e004##/64 is assigned to the specific L3VPN VRF.L3VPN Configuration ExampleThis is an example of a 3-node L3VPN using SRv6. Each node has already been assigned a SRv6 Locator to be used with this L3VPN. In this case we are using the Locator defined for the base algo, LocAlgo0. Service SIDs will be allocated from the fccc##103##/48 block. This service is carrying IPv4 routes over SRv6 and utilizes theuDT4 behavior type.Egress PE ConfigurationLocator Configurationsegment-routing srv6 locators locator LocAlgo0 micro-segment behavior unode psp-usd prefix fccc#0#103##/48VRF Configuration The VRF configuration is identical to non-SRv6 use cases.BGP ConfigurationSRv6 must be explicitly enabled for services utilizing SRv6. In IOS-XR 7.8.1 per-vrf is the only SID allocation mode supported.router bgp 100 vrf l3vpn-v4-srv6 rd 100#6001 address-family ipv4 unicast segment-routing srv6 locator LocAlgo0 alloc mode per-vrf ! redistribute connected ! !!SID Allocation Here we see the two SIDs allocated to each address family.RP/0/RP0/CPU0#cst-a-pe3#show segment-routing srv6 sid detail fccc#0#103#e004##Sun Jan 8 16#54#01.080 UTC*** Locator# 'LocAlgo0' ***SID Behavior Context Owner State RW-------------------------- ---------------- -------------------------------- ------------------ ----- --fccc#0#103#e004## uDT4 'l3vpn-v4-srv6' bgp-100 InUse Y SID Function# 0xe004 SID context# { table-id=0xe0000005 ('l3vpn-v4-srv6'#IPv4/Unicast) } Locator# 'LocAlgo0' Allocation type# Dynamic Created# Nov 28 20#20#21.184 (5w5d ago)L3VPN Route on Ingress NodeHere we see the route received from the egress node. 
There is a new SubTLV containing the SRv6 encoding type and we can see the service SID has been encoded as part of the BGP label TLV.RP/0/RP0/CPU0#cst-a-pe8#show bgp vrf l3vpn-v4-srv6 64.4.4.0/24 detailBGP routing table entry for 64.4.4.0/24, Route Distinguisher# 100#6001Versions# Process bRIB/RIB SendTblVer Speaker 2215286 2215286 Flags# 0x20041012+0x00000000;Last Modified# Dec 20 03#42#26.946 for 2w5dPaths# (1 available, best #1) Not advertised to any peer Path #1# Received by speaker 0 Flags# 0x2000000085060005+0x00, import# 0x39f Not advertised to any peer Local fccc#0#103##1 (metric 120) from fccc#0#216##1 (101.0.1.50), if-handle 0x00000000 Received Label 0xe0040 Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best, import-candidate, imported Received Path ID 0, Local Path ID 1, version 2215286 Extended community# RT#100#6001 Originator# 101.0.1.50, Cluster list# 101.0.2.202, 101.0.0.200, 101.0.1.201 PSID-Type#L3, SubTLV Count#1, R#0x00, SubTLV# T#1(Sid information), Sid#fccc#0#103##, F#0x00, R2#0x00, Behavior#63, R3#0x00, SS-TLV Count#1 SubSubTLV# T#1(Sid structure)# Length [Loc-blk,Loc-node,Func,Arg]#[32,16,16,0], Tpose-len#16, Tpose-offset#48 Source AFI# VPNv4 Unicast, Source VRF# l3vpn-v4-srv6, Source Route Distinguisher# 100#6001Forwarding Entry on Ingress NodeIn the forwarding entry we see the SRv6 encapsulation with a SID list of the service address. This address is used as a the destination address of the IPv6 packet sent from the ingress router. The router will utilize the /40 summary to reach the end service address.RP/0/RP0/CPU0#cst-a-pe8#show cef vrf l3vpn-v4-srv6 64.4.4.0/2464.4.4.0/24, version 11, SRv6 Headend, internal 0x5000001 0x30 (ptr 0x8c326328) [1], 0x0 (0x0), 0x0 (0x9016d028) Updated Dec 20 03#42#26.875 Prefix Len 24, traffic index 0, precedence n/a, priority 3 gateway array (0xa4a8b360) reference count 1, flags 0x2010, source rib (7), 0 backups [1 type 3 flags 0x48441 (0xa5ba6658) ext 0x0 (0x0)] LW-LDI[type=0, refc=0, ptr=0x0, sh-ldi=0x0] gateway array update type-time 1 Dec 20 03#42#26.874 LDI Update time Dec 20 03#42#26.874 Level 1 - Load distribution# 0 [0] via fccc#0#103##/128, recursive via fccc#0#103##/128, 619 dependencies, recursive [flags 0x6000] path-idx 0 NHID 0x0 [0x8df4b168 0x0] next hop VRF - 'default', table - 0xe0800000 next hop fccc#0#103##/128 via fccc#0#100##/40 SRv6 H.Encaps.Red SID-list {fccc#0#103#e004##} Load distribution# 0 1 (refcount 1) Hash OK Interface Address 0 Y TenGigE0/0/0/8 fe80##28a#96ff#fe4a#8078 1 Y TenGigE0/0/0/9 fe80##28a#96ff#fec7#e878RP/0/RP0/CPU0#cst-a-pe8#show route ipv6 fccc#0#103#e004##Routing entry for fccc#0#100##/40 Known via ~isis ACCESS~, distance 115, metric 120 Tag 1003, type level-2 Installed Dec 1 01#53#02.153 for 5w3d Routing Descriptor Blocks fe80##28a#96ff#fe4a#8078, from fccc#0#e##1, via TenGigE0/0/0/8, Protected, ECMP-Backup (Local-LFA) Route metric is 120 fe80##28a#96ff#fec7#e878, from fccc#0#e##1, via TenGigE0/0/0/9, Protected, ECMP-Backup (Local-LFA) Route metric is 120 No advertising protos.L2VPN EVPN-VPWSEVPN-VPWS is configured similar to the MPLS use case with the exception of specifying the transport type as srv6.RP/0/RP0/CPU0#cst-a-pe3#show run l2vpn xconnect group EVPN-VPWS-SRv6_MHSun Jan 8 17#22#09.140 UTCl2vpn xconnect group EVPN-VPWS-SRv6_MH p2p EVPN-VPWS-SRv6_MH interface TenGigE0/0/0/5.600 neighbor evpn evi 4600 service 600 segment-routing srv6 ! ! !!L2VPN EVPN-VPWS StateThe SRv6 behavior of uDX2 means micro-SID behavior with L2 cross-connect. 
This is a multi-homed service on the remote side, so there are two service SIDs listed as remote endpoints.Group EVPN-VPWS-SRv6_MH, XC EVPN-VPWS-SRv6_MH, state is up; Interworking none AC# TenGigE0/0/0/5.600, state is up Type VLAN; Num Ranges# 1 Rewrite Tags# [] VLAN ranges# [600, 600] MTU 1500; XC ID 0x264; interworking none Statistics# packets# received 480066518, sent 480034205 bytes# received 480066514256, sent 479074130038 drops# illegal VLAN 0, illegal length 0 EVPN# neighbor ##ffff#10.0.0.1, PW ID# evi 4600, ac-id 600, state is up ( established ) XC ID 0xc0000031 Encapsulation SRv6 Encap type Ethernet Ignore MTU mismatch# Enabled Transmit MTU zero# Enabled Reachability# Up SRv6 Local Remote ---------------- ---------------------------- -------------------------- uDX2 fccc#0#103#e002## fccc#0#214#e002## fccc#0#215#e002## AC ID 600 600 MTU 1514 0 Locator LocAlgo0 N/A Locator Resolved Yes N/A SRv6 Headend H.Encaps.L2.Red N/A Statistics# packets# received 480034205, sent 480066518 bytes# received 479074130038, sent 480066514256EVPN ELANIn this case we are using Algorithm 128, the low latency Flex Algo for the end to end path between the ingress node and egress node.Egres PE EVPN Configurationevpn evi 4500 segment-routing srv6 bgp route-target import 100#4500 route-target export 100#4500 ! advertise-mac ! locator LocAlgo128 !!Egress PE BVI Configurationl2vpn bridge group srv6_evpn_MH bridge-domain srv6_evpn_MH_1 interface Bundle-Ether25.4500 ! evi 4500 segment-routing srv6 ! ! !!Egress PE SRv6 SID AllocationThe EVPN detailed output shows the two SRv6 SIDs allocated for the service.These are allocated from the Algo128 Locator block. Unicast and BUM are handledby different labels in MPLS based EVPN, and with SRv6 use two separate SIDs one for the uDT2U unicast behavior and one for the uDT2M multicast behavior.RP/0/RP0/CPU0#cst-a-pe7#show evpn evi vpn-id 4500 detailVPN-ID Encap Bridge Domain Type---------- ---------- ---------------------------- -------------------4500 SRv6 srv6_evpn_MH_1 EVPN Stitching# Regular Unicast SID# fccc#1#215#e000## Multicast SID# fccc#1#215#e001## E-Tree# Root Forward-class# 0 Advertise MACs# Yes Advertise BVI MACs# No Aliasing# Enabled UUF# Enabled Re-origination# Enabled Multicast# Source connected # No IGMP-Snooping Proxy# No MLD-Snooping Proxy # No BGP Implicit Import# Enabled VRF Name# SRv6 Locator Name# LocAlgo128 Preferred Nexthop Mode# Off BVI Coupled Mode# No BVI Subnet Withheld# ipv4 No, ipv6 No RD Config# none RD Auto # (auto) 101.0.2.52#4500 RT Auto # 100#4500 Route Targets in Use Type ------------------------------ --------------------- 100#4500 Import 100#4500 Export*** Locator# 'LocAlgo128' ***SID Behavior Context Owner State RW-------------------------- ---------------- -------------------------------- ------------------ ----- --fccc#1#215#e000## uDT2U 4500#0 l2vpn_srv6 InUse Yfccc#1#215#e001## uDT2M 4500#0 l2vpn_srv6 InUse YMPLS and SRv6 MigrationIn 7.8.1 IOS-XR supports two transition technologies used to interworkbetween traditional MPLS and SRv6 domains. Interworking between different data planetypes can take place using transport layer interworking or service layer interworking. The transition methods supported in CST SRv6 1.0 and IOS-XR 7.8.1 utilize service interworking and support IPv4 and IPv6 L3VPN services.SRv6 and MPLS Service Interworking GatewayIETF draft draft-agrawal-spring-srv6-mpls-interworking-10 covers the semantics of the SRv6 and MPLS gateway functionality. 
A gateway node translates L3VPN service routes and their associated data plane forwarding information between SR-MPLS and SRv6 endpoints.In CST SRv6 1.0 the ASR 9000 and NCS 5500 series support this functionality using per-VRF MPLS label and SRv6 SID allocation.vrf gw-l3vpn-v4-srv6 address-family ipv4 unicast enable label-mode segment-routing srv6 import route-target 100#6200 stitching 100#6210 ! export route-target 100#6200 stitching 100#6210 ! !!router bgp 100 nsr bgp router-id 101.0.0.3 bgp redistribute-internal bgp graceful-restart segment-routing srv6 locator LocAlgo0 !neighbor 101.0.2.202 use neighbor-group SvRR address-family vpnv4 unicast import reoriginate stitching-rt route-reflector-client advertise vpnv4 unicast re-originated next-hop-self ! address-family vpnv6 unicast import reoriginate stitching-rt route-reflector-client advertise vpnv6 unicast re-originated !! neighbor fccc#0#10##1 use neighbor-group SvRR-Client-srv6 address-family vpnv4 unicast import stitching-rt reoriginate route-reflector-client encapsulation-type srv6 advertise vpnv4 unicast re-originated stitching-rt next-hop-self ! address-family vpnv6 unicast import stitching-rt reoriginate route-reflector-client encapsulation-type srv6 advertise vpnv6 unicast re-originated stitching-rt next-hop-self ! !The interworking gateway uses the concept of a stitching Route Target to identify prefixes requiring re-origination using the opposite data plane. L3VPN prefixes are advertised to the IPv4 BGP neighbor using the MPLS data plane and prefixes advertised to the SRv6 BGP neighbor using the SRv6 data plane. The gateway node is always in the data path for service prefixes being translated.The folowing shows the data plane in the MPLS to SRv6 directionIn-Depth SRv6 to MPLS Service Interworking Documentationhttps#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-8/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-78x/configure-srv6-micro-sid.html#id_133508Dual-Connected PEDual Connected PE allows the seamless migration of L3VPN PE routers from theMPLS to SRv6 data plane. In the dual-connected PE scenario the PE cancommunicate with MPLS and SRv6 neighbors within the same VRF. By default, IOS-XRwill advertise prefixes using an MPLS label, advertising with an SRv6 SIDrequires the encapsulation-type srv6 keyword. In the case below, routesadvertised to the 101.0.2.202 neighbor will use the default MPLS label, routesadvertised to fccc#0#10##1 will use the SRv6 encapsulation. The PE node willproperly process incoming packets using the MPLS label or SRv6 SID.vrf dual-l3vpn-v4-srv6 address-family ipv4 unicast enable label-mode segment-routing srv6 import route-target 100#7200 100#7210 ! export route-target 100#7200 100#7210 ! !!router bgp 100 segment-routing srv6 locator LocAlgo0 !neighbor 101.0.2.202 use neighbor-group SvRR-MPLS-ONLY address-family vpnv4 unicast route-reflector-client next-hop-self ! address-family vpnv6 unicast route-reflector-client next-hop-self !! neighbor fccc#0#10##1 use neighbor-group SvRR-SRV6-ONLY address-family vpnv4 unicast route-reflector-client encapsulation-type srv6 next-hop-self ! address-family vpnv6 unicast route-reflector-client encapsulation-type srv6 next-hop-self ! !SRv6 AutomationCrosswork Network Controller 4.1Crosswork Network Controller 4.1 supports the provisioning and visualization of SRv6 domains and SRv6-TE Policies. 
CNC also supports provisioning L2VPN EVPN-VPWS and L3VPN services utilizing an SRv6 data plane.The Traffic Engineering dashboard gives summary information for all TE path typesSelecting an SRv6-TE Policy will highlight the end to end path across domainsSelecting the ellipses and “details” will show details about the policyWe can now see the full details of the SRv6-TE policy incuding the uSID listwhich has been created to build the end-to-end path across IGP instances. Sincewe are crossing two domains we have two additional micro-SIDs at cst-pa1 andcst-pe3. The last SID is the egress node, a-pe7.Additional ResourcesConverged SDN Transport DesignHigh Level Design Guide# https#//xrdocs.io/design/blogs/latest-converged-sdn-transport-hld Implementation Guide# https#//xrdocs.io/design/blogs/latest-converged-sdn-transport-igCisco Segment Routing Homehttps#//segment-routing.net contains many blogs, demo videos, and configuration guides for SRv6 and SRv6 Micro-SID.SRv6 CCO DocumentationSRv6 Micro-SID Configuration Guide for ASR 9000# https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-8/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-78x/configure-srv6-micro-sid.htmlSRv6 Traffic Engineering Configuration Guide for ASR 9000# https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-8/segment-routing/configuration/guide/b-segment-routing-cg-asr9000-78x/configure-srv6-traffic-engineering.htmlSRv6 Micro-SID Configuration Guide for NCS 5500# https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/segment-routing/78x/b-segment-routing-cg-ncs5500-78x/configure-srv6-micro-sid.html SRv6 Traffic Engineering Configuration Guide for NCS 5500# https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/segment-routing/78x/b-segment-routing-cg-ncs5500-78x/configure-srv6-traffic-engineering.html", "url": "/blogs/latest-converged-sdn-transport-srv6", "author": "Phil Bedard", "tags": "iosxr, design, sr, 5g, transport, cst, srv6, routing" } , "blogs-2023-05-16-service-assurance-a-guide-for-the-perplexed": { "title": "Service Assurance: A Guide For the Perplexed", "content": "IntroductionThe concept of IP assurance is not quite as old as IP, but it’s close# ARAPNET deployed IPv4 in January of 1983; by December, Mike Muuss had developed the ping utility to test reachability and delay. Techniques for assuring the network’s forwarding functions through direct performance measurement have been evolving ever since, resulting in a confusing array of tools, protocols and choices. Many network architects are left wondering# “What should I deploy for assurance, where, and why?”These are big questions, so I’m going to tackle them across a series of blog posts. This first one will cover the what and why of assurance. In part two, I’ll cover the protocols and tools you have to work with. And in part three, I’ll look at some key design questions.What Is Service Assurance?Assurance is the process of measuring performance and managing faults with the ultimate goal of optimization. You can “assure” many things – a router, a path, a service, or an end-to-end digital experience. Assurance addresses different questions depending on the layer it applies to, so I like to divide it into three categories# Network Assurance seeks to answer the question of whether the network devices are operating as designed# Are interfaces up? Are protocols running? Are CPU, memory and bandwidth utilizations within acceptable ranges? 
Customer Assurance (also called Digital Experience Monitoring) includes Application Performance Monitoring. It measures things like page load times, voice & video quality and application responsiveness. This data could be provided by the application itself (e.g. WebEx metrics) or measured by a third party tool (e.g. ThousandEyes, AppDynamics). Service Assurance sits between Network Assurance and Customer Assurance. A Service Provider is responsible for more than just making sure that routers are up and protocols are running (Network Assurance) but not usually responsible for the full Digital Experience (Customer Assurance) since the provider doesn’t own all the applications, compute and other resources involved. The “service”, then, defines the limits of the Service Provider’s liability. The Service Is The ProductAs a network engineer, I am often guilty of thinking of “services” in terms of the technologies that enable them# L3VPN, L2VPN, EVPN, BGP, Segment-Routing, Traffic Engineering, etc. But if you take a step back, the only reason to invest in these cool technologies is because they deliver value to end customers. Those customers don’t buy EVPN or BGP; they buy point-to-point or multi-point services (along with enhancements like security and management) to meet specific business needs.A service has a contract, a price, start and end dates, and a Service Level Agreement (SLA). The SLA defines the expected quality level of the service and can include metrics around availability, bandwidth, latency, loss and jitter.SPs are highly motivated to assure that the service meets the SLA for several reasons# Lost Revenue# If the SP cannot deliver the terms of the contract, then the end customer is entitled to a refund. In the case of outages and degradations, the SP must be able to prove that the failure did not occur in the SP network or else pay up. Service Differentiation# SPs are always looking for ways to differentiate with new and improved services. Increasingly, monitoring and reporting are considered part of a differentiated service. One SP called this “Service Assurance as a Service” (SAaaS). For example, if you are providing a low latency service, you must be able to prove that the service meets delay guarantees. Service assurance is becoming monetizable. Customer-driven Measurement# The end customer has the most immediate experience of the quality of the service. After all, it’s their data that’s traversing the service, and their applications that are directly impacted by service quality. Increasingly, Enterprises are investing in tools that can deliver very detailed metrics about the services they pay for. These tools include hardware-based probes from Netscout or Accedian, monitoring solutions like Thousand Eyes, or even the integrated tools in SD-WAN solutions like Viptela. These tools are often more accurate than anything the SP measures, leading to a situation where more than half of customer outages and degradation issues are reported by the end customers, not the SPs. Customer Experience and Retention# One survey showed that 90% of customers only contact their SPs when they are ready to end their contracts due to service disruptions. Because service assurance data can give an objective measurement of customer experience, it can be used to understand and improve customer retention. Reduced Customer Support Calls# One SP estimated that well over 50% of customer support calls concerned outages that did not involve the SP’s network or past outages that had been already fixed. 
By collecting service assurance data and providing it to their customers, SPs could offload those support calls to a self-service portal, conserving valuable technical support resources. Streamline Troubleshooting# Swamped by a barrage of network events and alarms, network operators are challenged to understand the root cause of reported service disruptions as well as the service impact of network faults. Good service assurance data can help localize failures and identify root causes more quickly. Active Beats Passive for Service Assurance A lot of Network Assurance relies on what are called “passive” metrics. Passive monitoring looks at state and statistics, such as interface statistics and memory utilization, on individual devices. Techniques like sFlow and Netflow are passive metrics, since they observe streams of traffic at different points in the network. Passive metrics offer a good picture of the health of a device at a given moment in time. Taken together, they create a good, high level picture of the overall functioning of the network. While passive metrics are effective for Network Assurance, they provide only an indirect measurement of the health of a service. Everything can look good from a device perspective (BGP neighbors up, interface utilization normal, etc.) but service traffic could still be getting delayed, dropped or sub-optimally forwarded along the path. The only way to know if the service actually meets the SLA is to measure traffic sent through the service’s data path. Since services like L2 and L3VPNs do not define an embedded assurance function, an additional mechanism must be employed to probe the data path of the service. The most common way to do this is by injecting synthetic traffic to probe the network. This process is commonly referred to as “active monitoring.” There are different kinds of probes that can be generated from different devices at different parts of the network, but all active monitoring involves sending, receiving and tracking traffic to directly measure the end customer’s experience of the service. See RFC 7799 for more on the distinction between active and passive measurements as well as a definition of “hybrid” methods which modify customer traffic to carry performance data. Conclusion Unlike optical services (which carry performance, fault and path data in the header of every frame), IP/MPLS-based services like L2 and L3VPNs do not define an embedded assurance function. Nevertheless, end customers expect their SLAs to be delivered as promised. Assuring service performance through some form of active monitoring is increasingly a must-have for Service Providers. In part two of this series, I’ll take a deep dive into protocols and tools for active assurance.", "url": "/blogs/2023-05-16-service-assurance-a-guide-for-the-perplexed/", "author": "Shelly Cadora", "tags": "iosxr, cisco, Service Assurance" } , "blogs-2023-05-26-service-assurance-a-guide-for-the-perplexed-part-two": { "title": "Service Assurance: A Guide for the Perplexed, Part Two", "content": "In Part One of this series on Service Assurance, I discussed what Service Assurance is and why Service Providers need it. In Part Two, I take a closer look at the specific protocols and tools you have to accomplish the task. Active Assurance at Layer 3 The original forms of active assurance are old friends to most network engineers# ping and traceroute. Both rely on ICMP messages to report availability, latency and path information. Ubiquitous as they are, ICMP-based utilities face real headwinds as assurance tools.
For one, ICMP can travel a different path through the network and, indeed, through a device, than normal customer traffic, thus resulting in inaccurate reports of the actual performance. In addition, more SPs are enhancing security policies to drop ICMP traffic altogether, rendering these tools less useful.To enhance these stalwart tools, Cisco developed a set of performance measurement features collectively referred to as “IP-SLA.” IP-SLA enabled measurements of loss, delay and jitter to IP endpoints using a variety of operations, including ICMP, UDP, HTTP and more. IP-SLA is a form of active assurance that sends and receives probes from the router itself.The obvious usefulness of performance measurement made it an excellent candidate for standardization. In 2006, the IETF standardized One-Way Active Measurement Protocol (OWAMP). OWAMP provided the precision of one-way measurements but the requirement for network-wide clock synchronization limited its adoption.In 2008, RFC 5357 introduced Two-Way Active Measurement Protocol (TWAMP) to extend OWAMP to allow for two-way and round-trip measurements (with or without clock synchronization). Because TWAMP defined multiple logical roles for session establishment, most vendors ended up implementing a simpler architecture, “TWAMP Light”, that only needed a Sender and Responder.Unfortunately, TWAMP Light was only an appendix to RFC 5357 and not sufficiently specified to prevent interoperability issues. Hence we have RFC 8762, Simple Two Way Active Measurement Protocol (STAMP), which codified and extended TWAMP Light. STAMP is extensible and backwards compatible with TWAMP-Light, so hopefully it will stick around for a while.Other work is ongoing in the IETF to standardize STAMP extensions in order to leverage the forwarding capabilities of Segment Routing networks (sometimes called Segment Routing Performance Measurement or SR-PM).Insider Tip# In Cisco, “SR-PM” is sometimes used as a shorthand term for the superset of all L3 performance management features (SR and IP). So if you’re interested in TWAMP-Light and a Cisco person is talking to you about SR-PM, don’t worry, you’re both talking about the same thing.Active Assurance at Layer 2A different set of standards governs service assurance in Layer 2 networks (think L2VPN, VPLS, EVPN VPWS, etc). The building blocks of Ethernet Operation, Administration and Management (OAM) began with IEEE 802.1ag, which defined Connectivity Fault Management (CFM). The ITU came out with its own Ethernet OAM standard, Y.1731. Both 802.1ag and Y.1731 cover fault management, while performance management is solely covered by ITU-T Y.1731. Nowadays, IEEE 802.1ag is considered a subset of ITU-T Y.1731.Service Activation Testing MethodologiesIn addition to the “AMPs” and Y.1731, you may run across a few other measurement standards.Y.1564 is an out-of-service ethernet test methodology that validates the proper configuration and performance of an ethernet service before it is handed over to the end customer.RFC 6349 defines a methodology for TCP throughput testing.These methods are not suitable for on-going service assurance since they are only intended to be used during offline service activation or active troubleshooting.Where to Run# Embedded or External?Active assurance methods like TWAMP and Y.1731 require a sender and receiver/responder. 
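To make the OWAMP/TWAMP distinction above concrete, the small sketch below shows the timestamp arithmetic these probes rely on (T1/T4 taken at the sender, T2/T3 at the reflector, following the RFC 5357/8762 model): the round-trip calculation needs no clock synchronization, while the one-way split does. This is an illustration of the math only, not an implementation of either protocol.

def round_trip_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """Round-trip delay with the reflector's processing time removed.
    t1 = sender TX, t2 = reflector RX, t3 = reflector TX, t4 = sender RX (seconds).
    Only the sender's clock is used, so no synchronization is required."""
    return (t4 - t1) - (t3 - t2)

def one_way_delays(t1: float, t2: float, t3: float, t4: float) -> tuple:
    """Forward and reverse one-way delays; only meaningful with synchronized clocks."""
    return t2 - t1, t4 - t3

print(round_trip_delay(0.000000, 0.004100, 0.004150, 0.008300))   # ~0.00825 s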
The functionality can be embedded in an existing router or it can be a dedicated external device.Embedded# Simple But EffectiveThere are many reasons to take advantage of active probing capabilities built in to your routers. First of all, it’s free! You’ve already paid for the device to forward traffic, so if it can also do active assurance, then by all means, try this first. Operationally, it’s also a slam dunk. Whatever tools you use to manage your router config can also manage probe configuration. There are no new systems to manage or integrate.Embedded probes have a unique advantage in that they can test the internals of the network infrastructure. SR-PM, IOS XR’s performance measurement toolkit, includes the capability to test link performance as well as end-to-end traffic engineering paths. Emerging measurement techniques like Path Tracing bring ECMP awareness to performance measurement. These are things external probes can’t do.One common argument against embedded probes is performance. And that was certainly true in the past, when probes were punted to the RP for processing. If the punt path was congested, the probe would report poor performance when the actual datapath was fine. In modern systems, however, hardware timestamping ensures that the NPU stamps the probes in the datapath, giving a much more accurate measurement of network delay. In addition, many systems can support “hardware offload” which pushes the entire process of probe generation into the NPU, giving you a high fidelity measurement of the actual datapath and much higher performance than was possible in the past. So if you looked at IP-SLA a decade ago and dismissed it because of performance, you should take another look at modern implementations.Another consideration is interoperability. If you’re using embedded probes in a multi-vendor network, then you have to ensure interoperability between the vendors’ implementations of the protocol. This was especially painful with TWAMP-Light, since the lack of specificity in the RFC made it easy to interpret differently. This will get better as STAMP becomes the industry standard.Functionality is the final consideration for embedded probes. The limited memory and compute on a router means that more elaborate customer experience tests (e.g. page download or MOS scoring) are really not well-suited. Moreover, your upgrade cycle for assurance features is tied to the upgrade cycle of the entire router which can be measured in months or years. That’s a long time to wait for a new assurance feature.In sum, embedded probes offer an inexpensive way to get simple, scalable measurements of services, traffic engineered and ECMP paths and physical links with excellent fidelity to the actual data path and better performance than ever before. But if interoperability is a problem or you need more complex and/or end-to-end tests, then you may have to consider an external probing system.External# Extensive But ExpensiveExternal probing devices come in all shapes and sizes, from Network Interface Devices (NIDs) to pluggable SFPs to containerized agents running in generic compute. They can be deployed at any place in the network that a service provider has a presence, including the end customer site (if the SP has deployed a managed service) and in the cloud. 
Network vendor interoperability is not an issue since the probes are generated and received by the external probing devices, not the networking devices. Physical probe generators can be deployed in-line, which measures the service exactly as the end customer experiences it. This is very accurate but also very expensive, as you need one device for every service. Other deployment models place the NID, SFP or containerized agent at a place in the network where probes can be injected into multiple service paths (e.g. on a trunk port with many VLANs associated with many different VRFs). Unlike routers, whose primary function is to forward traffic, external probes are dedicated to the sole purpose of measuring the network. The breadth of functionality they support can be much wider, encompassing Ethernet OAM, TWAMP, and service activation protocols as well as detailed insight into Layer 7 transactions (e.g. HTTP, DNS, TCP, SSL, etc.) and high-level path analysis (e.g. using traceroute). Taken together, the information from external probes deployed at the right places in the network can give a good snapshot of the end customer’s digital experience. While external probes give good insight into end-to-end performance all the way up to the application layer, they can’t dig into the internals of the service provider network. The network is a black box to external probes. Things like link performance, path performance, and ECMP paths are essentially invisible to external probes. Probably the biggest drawback to external probes is cost, both capex and opex. Hardware probes, whether NIDs or SFPs, are expensive. One service provider reported spending as much on NIDs as on routers in their edge deployment! But operational costs can also be of concern. Every external probe represents one more network element to manage# hardware has to be deployed and monitored, software has to be upgraded and maintained. Adding thousands or tens of thousands of probes to your network is not a project to be taken lightly. Conclusion There are a couple of things to take away from this brief overview of protocols and probes. First, use the right protocol for the service layer you need to test (typically TWAMP for Layer 3, Y.1731 for Layer 2). Second, your use case will determine whether embedded or external probes will serve you best. The closer you want to get to the network infrastructure, the more sense it makes to use the specialized, embedded probe capabilities you’ve already paid for. The closer you get to the end-to-end customer experience, the more you’ll want to look at external probing capabilities. To help think through those gray areas where either option could work, I’ll dig into some more design considerations in Part Three of this series.", "url": "/blogs/2023-05-26-service-assurance-a-guide-for-the-perplexed-part-two/", "author": "Shelly Cadora", "tags": "iosxr, Service Assurance" } , "blogs-2023-05-26-service-assurance-a-guide-for-the-perplexed-part-three": { "title": "Service Assurance: A Guide For The Perplexed, Part Three", "content": "In Part Two of this series on Service Assurance, I reviewed the methods, protocols and tools available for active assurance of services. In this part, I want to look at some high level design options to help guide your deployment. I call these# Do nothing Do everything Do something The Do Nothing Approach Service assurance can be a complex undertaking.
One option is to do nothing# build a solid network with plenty of redundancy and lots of extra capacity, monitor for network performance and faults (“Network Assurance”) and have faith that the architecture can deliver the needed SLAs. This is a simple approach that can deliver value in a well-designed network. Because IP networks in general and services in particular don’t come with any built-in mechanisms for assurance, many operators start here by necessity. But this “best-effort” approach to service assurance has several weaknesses# If an SP can’t measure the latency or jitter of a service, they can’t sell more expensive low-latency or jitter-bound services. For lack of assurance, money is left on the table. The end customer knows more about the quality of their service and can detect impairments long before the SP is even aware of a problem. Troubleshooting a faulty service requires a lot of legwork on the part of smart (i.e. expensive) network engineers. When the number of services is small and commands a high price, that isn’t a big problem. But as Cloud and Video push bandwidth demand ever higher, that approach to services won’t scale. The cost of over-engineering excess capacity may exceed the cost of a good service assurance solution. The Do Everything ApproachIf “do-nothing” isn’t working out, then there’s always the “do-everything” approach, by which I mean assuring every service individually.As a first option, you should investigate the built-in Y.1731 or TWAMP capabilities at the PE (or even CPE, if the CPE is provider-managed). Since the capabilities are built-in to the router, this is the most cost-effective approach. The important thing to verify is that supported probe scale (number of probe sessions, number of simultaneous probes) matches the required service scale on the CPE or PE. Probe scale varies widely by silicon and software release, so get specifics.If your routers don’t support the scale or protocols you need, you can deploy an external NID or SFP to generate probes. An in-line deployment (where the NID or SFP is directly in the service data path) gives the most accurate measurement but could introduce forwarding issues of its own. This approach also limits the choices when it comes to purchasing SFPs (especially for 100G and above). A less expensive (but less accurate) option for multi-point services is to attach the SFP or NID on a separate port at the PE and inject probes into each service from there.The “do-everything” is obviously the best approach for ensuring your service SLAs, but in reality, the cost of deploying and maintaining it can make it impractical. It can also end up being very redundant. In a multi-point service, the number of required probes is equal to the number of connections which, for a fully meshed network, is calculated according to the formula (n*(n-1))/2. Say you have a small Enterprise L3VPN service with 5 customer sites connected to 5 PEs. You’ll need (5 * 4 / 2) = 10 probes for each class of service in the L3VPN. If the customer paid for Gold, Silver and Best Effort QoS profiles, that’s 3 * 10 = 30 probes for one small service. Now imagine you have 500 L3VPNs on that same set of 5 devices. If you tested each service individually, then you’d be running 30 * 500 = 15000 probes, all probing the same 10 paths.The Do Something ApproachGiven the scale and complexity issues, true per-service assurance may be unattainable. But there is a middle ground. 
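To put numbers on the trade-off just described, here is a quick sketch of the probe arithmetic from the example above (an illustration of the math, not a planning tool):

def full_mesh_probes(sites: int, classes_of_service: int) -> int:
    """Probe sessions for one fully meshed service: n*(n-1)/2 pairs per class."""
    return sites * (sites - 1) // 2 * classes_of_service

per_service = full_mesh_probes(sites=5, classes_of_service=3)   # 30 probes per L3VPN
all_services = per_service * 500                                # 15000 probes if every VPN is probed
per_path = full_mesh_probes(sites=5, classes_of_service=3)      # 30 probes if only the 10 shared paths are probed
print(per_service, all_services, per_path)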
Instead of taking a service as the unit of measurement, you could instead measure the shared transport paths and use that as a proxy for all the services that use those paths. Instead of 15000 probes for the 500 L3VPNS above, you could measure all 10 paths and 3 classes of service with 30 probes and use that in fulfillment of the SLA measurement.Probing the transport path is not a direct measurement of the service performance# it measures the performance of a shared path that goes between the internally facing PE interfaces and may not capture service-specific forwarding issues inside the PEs themselves. But in many cases, it can serve as a reasonable proxy measurement since the majority of forwarding issues occur in the shared transport network. In addition, path measurement is a natural fit for the built-in probing capabilities of your routers, making it a relatively inexpensive thing to deploy.Probing paths is inherently more scalable than probing services since many thousands of services might share the same path. However, relying on path probe data opens a new challenge# how to associate a service with a path (or paths). If all services use the shortest path between two PEs and the variations associated with ECMP paths can be ignored, then this might be “good enough.” In a more complicated scenario, a service might be configured to use a TE path but have fallen back to the shortest path because the TE path failed. Under these conditions, maintaining a real time mapping of services to paths represents a non-trivial amount of work for a management application (although, to be fair, the mapping of a service to the results of a service assurance probe requires work for external probes as well).ConclusionAlthough I’ve presented “nothing”, “everything” and “something” as separate deployment options, in reality, most providers will end up doing some of each.In the end, there are no really hard and fast rules about “what to deploy, where” when it comes to service assurance. Design choices are always about trade-offs. Active assurance is important but often expensive. If a service is going to include premium SLAs that can only be measured by individual probes, then that expense can be included in the price of the premium service. Perhaps only a portion of the services offered by a provider actually need individually monitored SLAs. Prudent deployment of path measurement and best-effort design principles could suffice for the rest.The good news is that service assurance is an area of active development for Cisco. New standards and tools are in process; new silicon will bring better scale and new features. IP networks may not have started with built-in assurance but the demands of today’s networks will continue to drive its evolution.", "url": "/blogs/2023-05-26-service-assurance-a-guide-for-the-perplexed-part-three/", "author": "Shelly Cadora", "tags": "iosxr, Service Assurance" } , "blogs-2023-07-31-convergence-of-ip-optical-and-control-routed-optical-networks": { "title": "Convergence of IP, Optical and Control: Routed Optical Networks", "content": "Many thanks Daniele Ceccareli, Principal Product Manager at Cisco, for co-authoring this post.Convergence of packet and optical network technologies has been attempted for many years, but it is only happening now, why? 
Because of RON…Routed Optical Networking (RON) is an architecture that combines optical networking technologies and packet network technologies into a single network that provides all types of services to customers from a single platform. This new paradigm has always been appealing because the resulting network has less layers and an overall simpler structure, but it is finally becoming reality for the following reasons# Advancements in Silicon enabling coherent WDM transceivers that fit into standard router ports without compromising on density Significant increase in volume for both pluggable and router Silicon due to massive deployments by web-scale companies driving down the cost These two factors enable dramatically lower overall cost (78%), power consumption (97%) and lower footprint (95%) - the numbers in parentheses are based on customer reported savings. Mature management and control architecture based on a hierarchical approach that allows for much easier and faster innovation compared to the control plane approach of the past (remember GMPLS?).Today’s collection of disparate management tools is being replaced with per technology SDN controllers orchestrated by a Hierarchical Controller (HCO) enabling simpler provisioning, troubleshooting, performance monitoring and much more. More details about Routed Optical Networking (RON) can be found here# Routed Optical Networking - HLDAchieving a full multivendor solution implies that the control system must have in depth understanding of how each domain and each technology must be configured, as well as an understanding of the possible failure modes, allowing for detailed troubleshooting. This implies that the control system must comprise of vendor specific tools that provide intimate knowledge of the different domains, and an umbrella controller on top that integrates information from all the domains into a single vendor-agnostic database. In other words, a hierarchical structure of domain controllers and a hierarchical controller as shown in the following figure.This architecture has been widely adopted by standardization bodies (e.g. IETF, ONF, MEF) and the major Service Providers all over the world. It is part of a larger hierarchy that includes OSS tools like service orchestrators and assurance systems and includes other parts of the Service Provider network such as access networks and data center resources.The following figure provides its mapping against IETF ACTN architecture (RFC 8453), where a clear and clean boundary separation between the packet and optical domains is provided and the Hierarchical Controller (MDSC) is the only entity capable to manage multidomain and multilayer services from a single UI/NBI.A few alternative architectures have been proposed by different vendors, spanning from an ambitious single “godbox” controller that knows everything about the IP layer and optical layer, to more modest attempts to just control the pluggable in the router via the optical controller of the line system.The godbox idea is clearly a bad idea# have we not learned from the past? This didn’t work well even when controlling IP and optical gear of the same vendor, let alone attempting to do it across the entire industry and keep it up to speed and fully tested against all possible gear…The more modest concept of controlling the pluggables in the router by the optical controller doesn’t sound so bad – after all this controller controls transponders, so what’s the difference? 
The state of the router and its pluggables is mainly owned by the IP controller and having multiple owners (optical controller for the WDM pluggables, IP controller for the rest of the router) is a recipe for resource contention, race conditions, synchronization issues, and security headaches.Add to this the fact that the state of the pluggables is not independent of the state of the rest of the router# when you modify parameters on the pluggable, you affect IP layer behavior. For example, if you change the modulation format to a lower bitrate format, the IP layer needs to know about this and send less traffic down the link.But the problem is even worse than that# many routers connect to other routers via different vendor line systems# for example, an edge router may connect to a core router via a core WDM system and to aggregation routers via a metro WDM system, typically not from the same vendor. So now we don’t have just two controllers wanting to change router state but 3 or 4…We believe that the right solution is simple# all changes in the router must be done via a single owner to the router state and this owner is the IP controller. This is aligned with the hierarchical SDN architecture and standards and allows for clean roles & responsibilities for the different controllers. Allowing an optical controller to be this single owner will block the evolution of RON – and this evolution holds much more value than mere transponder replacement.More details about the control architecture for RON can be found here.", "url": "/blogs/2023-07-31-convergence-of-ip-optical-and-control-routed-optical-networks/", "author": "Ori Gerstel", "tags": "iosxr, Optical, RON" } , "#": {} , "blogs-2023-08-24-cst-routed-optical-2-1": { "title": "Cisco Routed Optical Networking", "content": " On This Page PDF Download Revision History Solution Component Software Versions What is Routed Optical Networking? 
Key Drivers Changing Networks Network Complexity Inefficiences Between Network Layers Operational Complexity Network Cost Routed Optical Networking Solution Overview Today’s Complex Multi-Layer Network Infrastructure DWDM OTN Ethernet/IP Enabling Technologies Pluggable Digital Coherent Optics QSFP-DD, 400ZR, and OpenZR+ Standards Cisco High Power OpenZR+ Transceiver (DP04QSDD-HE0) New for 2.1 Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S) Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S) Cisco Routers Cisco Private Line Emulation Circuit Style Segment Routing Cisco DWDM Network Hardware Routed Optical Networking Network Use Cases Where to use 400ZR and where to use OpenZR+ Supported DWDM Optical Topologies NCS 2000 64 Channel FOADM P2P Deployment NCS 1010 64 Channel FOADM P2P Deployment NCS 2000 Colorless Add/Drop Deployment NCS 2000 Multi-Degree ROADM Deployment NCS 1010 Multi-Degree Deployment Long-Haul Deployment Core Networks Metro Aggregation Access DCI and 3rd Party Location Interconnect Routed Optical Networking Private Line Services Circuit Style Segment Routing CS SR-TE paths characteristics CS SR-TE path liveness detection CS SR-TE path failover behavior Static CS SR-TE Policies CS SR-TE Policy operational details Dynamic Circuit Style SR-TE Bi-directional Association ID Disjoint path group ID Path constraints Circuit-Style SR-TE with Bandwidth Admission Control using Crosswork Circuit Style Manager CNC Circuit-Style Manager Configuration SR-PCE to Circuit Style Manager (CSM) Communication Bandwidth Admission Control Operation CNC CS-SRTE Monitoring CNC CS-SRTE Provisioning Private Line Emulation Hardware Supported Client Transceivers Private Line Emulation Pseudowire Signaling Private Line Emulation EVPN-VPWS Configuration PLE Monitoring and Telemetry Client Optics Port State PLE CEM Controller Stats PLE CEM PM Statistics PLE Client PM Statistics Routed Optical Networking Architecture Hardware Routed Optical Networking Validated Routers Cisco 8000 Series Cisco 5700 Systems and NCS 5500 Line Cards ASR 9000 Series NCS 500 Series Routed Optical Networking Optical Hardware Network Convergence System 1010 Network Convergence System 2000 Network Convergence System 1000 Multiplexer Network Convergence System 1001 NCS 2000 and NCS 1001 Hardware Routed Optical Networking Automation Overview IETF ACTN SDN Framework Cisco’s SDN Controller Automation Stack Cisco Open Automation Crosswork Hierarchical Controller Crosswork Network Controller Cisco Optical Network Controller Cisco Network Services Orchestrator and Routed Optical Networking ML Core Function Pack Routed Optical Networking Service Management Supported Provisioning Methods OpenZR+ and 400ZR Properties ZR/ZR+ Supported Frequencies Supported Line Side Rate, FEC, and Modulation 50Ghz Spectrum Compatiblity Modes (New) Crosswork Hierarchical Controller UI Provisioning Cross-Layer Link Definition Cross-Layer Link Validation (New) IP Link Provisioning using Crosswork HCO Operational Discovery NSO RON-ML CFP Provisioning RON-ML End to End Service RON-ML API Provisioning IOS-XR CLI Configuration Model-Driven Configuration using IOS-XR Native Models using NETCONF or gNMI Model-Driven Configuration using OpenConfig Models Routed Optical Networking Assurance Crosswork Hierarchical Controller Multi-Layer Path Trace Routed Optical Networking Link Assurance ZRM Layer TX/RX Power ZRC Layer BER and Q-Factor / Q-Margin OTS Layer RX/TX Power Graph Event Monitoring IOS-XR CLI Monitoring of ZR400/OpenZR+ Optics Optics Controller Coherent DSP Controller EPNM 
Monitoring of Routed Optical Networking EPNM Chassis View of DCO Transceivers Chassis View Interface/Port View EPNM DCO Performance Measurement DCO Physical Layer PM KPIs Cisco IOS-XR Model-Driven Telemetry for Routed Optical Networking Monitoring ZR/ZR+ DCO Telemetry NCS 1010 Optical Line System Monitoring Open-source Monitoring Additional Resources Cisco Routed Optical Networking 2.1 Solution Guide Cisco Routed Optical Networking Home Cisco Routed Optical Networking Tech Field Day Cisco Champion Podcasts Appendix A Acronyms DWDM Network Hardware Overview Optical Transmitters and Receivers Multiplexers/Demultiplexers Optical Amplifiers Optical add/drop multiplexers (OADMs) Reconfigurable optical add/drop multiplexers (ROADMs) PDF Downloadhttps#//github.com/ios-xr/design/blob/master/Routed-Optical-Networking/2023-08-24-cst-routed-optical-2_1.pdfRevision History Version Date Comments 1.0 01/10/2022 Initial Routed Optical Networking Publication 2.0 12/01/2022 Private Line Services, NCS 1010, CW HCO updates 2.1 06/24/2023 High-Power ZR+ Optics, Bandwidth Guaranteed PLE, Connectivity Verification Solution Component Software Versions Element Version Router IOS-XR 7.9.1 NCS 2000 SVO 12.3.1 NCS 1010 IOS-XR 7.9.1 Cisco Optical Network Controller 2.1 Crosswork Network Controller 5.0 Crosswork Hierarchical Controller 7.0 Cisco EPNM 7.0.1 What is Routed Optical Networking?Routed Optical Networking as part of Cisco’s Converged SDN Transportarchitecture brings network simplification to the physical networkinfrastructure, just as EVPN and Segment Routing simplify the service andtraffic engineering network layers. Routed Optical Networking collapses complextechnologies and network layers into a single cost efficient and easy to managenetwork infrastructure. Here we present the Cisco Routed Optical Networkingarchitecture and validated design.Key DriversChanging NetworksInternet traffic has seen a compounded annual growth rate of 30% or higher overthe last ten years, as more devices are connected, end user bandwidth speedsincrease, and applications continue to move to the cloud. The introduction of 5Gin mobile carriers and backhaul providers is also a disruptor, networks must bebuilt to handle the advanced services and traffic increase associated with 5G.Networks must evolve so the infrastructure layer can keep up with the servicelayer. 400G Ethernet is the next evolution for SP IP network infrastructure, andwe must make that as efficient as possible.Network ComplexityComputer networks at their base are a set of interconnected nodes to deliverdata between two endpoints. In the very beginning, these networks were designedusing a layered approach to separate functions. The OSI model is an example ofhow functional separation has led to innovation by allowing different standardsbodies to work in parallel at each layer. In some cases even these OSI layersare further split into different layers. While these layers can bring some costbenefit, it also brings added complexity. Each layer has its own management,control plane, planning, and operational model.Inefficiences Between Network LayersOTN and IP network traffic must be converted into wavelengthsignals to traverse the DWDM network. This has traditionally required dedicatedexternal hardware, a transponder. All of these layers bring complexity, andtoday some of those layers, such as OTN, bring little to the table in terms ofefficiency or additional value. 
OTN switching, like ATM previously, has not beenable to keep up with traffic demands due to very complex hardware. UnlikeEthernet/IP, OTN also does not have a widely interoperable control plane, locking providers into a single vendor or solution long-term.Operational ComplexityNetworks involving opaque layers are difficult to plan, build, and operate. IPand optical networks often have duplicate teams covering similar tasks. Networkprotection and restoration is also often complicated by different schemesrunning independently across layers. The industry has tried over decades tosolve some of these issues with complex control planes such as GMPLS, but we arenow at an evolution point where simplifying the physical layers and reducingcontrol plane complexity in the optical layer allows a natural progression to asingle control-plane and protection/restoration layer.Network CostSimplyfing networks reduces both capex and opex. As we move to 400G, the networkcost is shifted away from routers and router ports to optics. Any way we canreduce the number of 400G interconnects on the network will greatly reduce cost.Modeling networks with 400ZR and OpenZR+ optics in place of traditionaltransponders and muxponders shows this in almost any network scenario. It also results in a reduced space and power footprint.Routed Optical Networking Solution OverviewAs part of the Converged SDN Transport architecture, Routed Optical Networkingextends the key tenet of network simplification. Routed Optical Networkingtackles the challenges of building and managing networks by simplifying both theinfrastructure and operations.Today’s Complex Multi-Layer Network InfrastructureDWDMMost modern SP networks start at the physical fiber optic layer. Above thephysical fiber is technology to allow multiple photonic wavelengths to traversea single fiber and be switched at junction points, we will call that the DWDMlayer.OTNIn some networks, above this DWDM layer is an OTN layer, OTN being theevolution of traditional SONET/SDH networks. OTN grooms low speed TDM servicesinto higher speed containers, and if OTN switching is involved, allows switchingthese services at intermediate points in the network. OTN is primarily used in network to carry guaranteed bandwidth services.Ethernet/IPIn all high bandwidth networks today, there is an Ethernet layer on which IPservices traverse, since almost all data traffic today is IP. Ethernetand IP is used due to its ability to support statistical multiplexing, topologyflexibility, and widespread interoperability between different vendors based onwell-defined standards. In larger networks today carrying Internet traffic, theEthernet/IP layer does not typically traverse an OTN layer, the OTN layer isprimarily used only for business services.Enabling TechnologiesPluggable Digital Coherent OpticsSimple networks are easier to build and easier to operate. As networks scale tohandle traffic growth, the level of network complexity must decline or at leastremain flat.IPoDWDM has attempted to move the transponder function into the router to removethe transponder and add efficiency to networks. In lower bandwidth applications,it has been a very successful approach. CWDM, DWDM SFP/SFP+, and CFP2-DCOpluggable transceivers have been used for many years now to build access,aggregation, and lower speed core networks. 
The evolution to 400G and advances in technology created an opportunity to unlock this potential in higher speed networks. Transponders or muxponders have typically been used to aggregate multiple 10G or 100G signals into a single wavelength. However, with reach limitations, and the fact transponders are still operating at 400G wavelength speeds, the transponder becomes a 1#1 input to output stage in the network, adding no benefit. The Routed Optical Networking architecture unlocks this efficiency for networks of all sizes, due to advancements in coherent pluggable technology. QSFP-DD, 400ZR, and OpenZR+ Standards As mentioned, the industry saw an opportunity to improve network efficiency by shifting coherent DWDM functions to router pluggables. Technology advancements have shrunk the DCO components into the standard QSFP-DD form factor, meaning no specialized hardware and the ability to use the highest capacity routers available today. ZR/OpenZR+ QSFP-DD optics can be used in the same ports as the highest speed 400G non-DCO transceivers. Cisco High Power OpenZR+ Transceiver (DP04QSDD-HE0) New for 2.1 Routed Optical Networking 2.1 introduces the Cisco High Power +1dB ZR+ transceiver. This high launch power DCO enables the use of the optics with optical add/drop systems requiring higher input power, and enables longer distances when used in passive or dark fiber applications without amplification. Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S) Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S) Two industry optical standards have emerged to cover a variety of use cases. The OIF created the 400ZR specification, https#//www.oiforum.com/technical-work/hot-topics/400zr-2, as a 400G interoperable standard for metro reach coherent optics. The industry saw the benefit of the approach, but wanted to cover longer distances and have flexibility in wavelength rates, so the OpenZR+ MSA was created, https#//www.openzrplus.org. The following table outlines the specs of each standard. ZR400 and OpenZR+ transceivers are tunable across the ITU C-Band, 196.1 to 191.3 THz. The following part numbers are used for Cisco’s ZR400 and OpenZR+ MSA transceivers Standard Part 400ZR QDD-400G-ZR-S OpenZR+ QDD-400G-ZRP-S High Power OpenZR+ DP04QSDD-HE0 The Cisco datasheet for the QDD-400G-ZRP-S and QDD-400G-ZR-S transceivers can be found at https#//www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/datasheet-c78-744377.html The Cisco datasheet for the DP04QSDD-HE0 can be found at https#//www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/400g-qsfp-dd-high-power-optical-module-ds.html Cisco Routers We are at a point in NPU development where the pace of NPU bandwidth growth has outpaced network traffic growth. Single NPUs such as Cisco’s Silicon One have a capacity exceeding 12.8Tbps in a single NPU package without sacrificing flexibility and rich feature support. This growth of NPU capacity also brings reduction in cost, meaning forwarding traffic at the IP layer is more advantageous vs. a network where layer transitions happen often. Cisco supports 400ZR and OpenZR+ optics across the NCS 540, NCS 5500, NCS 5700, ASR 9000, and Cisco 8000 series routers. This enables providers to utilize the architecture across their end to end infrastructure in a variety of router roles. Cisco Private Line Emulation Starting in Routed Optical Networking 2.0, Cisco now supports Private Line Emulation (PLE) hardware and IOS-XR support to provide bit-transparent private line services over the converged packet network.
Private Line Emulation supportsthe transport of Ethernet, SONET/SDH, OTN, and Fiber Channel services. See thePLE section of the document for in-depth information on PLE.Circuit Style Segment RoutingCircuit Style Segment Routing (CS-SR) is another Cisco advancement bringing TDM circuit like behavior to SR-TE Policies. These policies use deterministic hop by hop routing, co-routed bi-directional paths, hot standby protect pathswith end to end liveness detection, and bandwidth guaranteed services.Standard Ethernet services not requiring bit transparency can be transportedover a Segment Routing network similar to OTN networks without the additionalcost, complexity, and inefficiency of an OTN network layer.Cisco DWDM Network HardwareRouted Optical Networking shifts an expensive and now often redundanttransponder function into a pluggable transceiver. However, to make the mostefficient use of a valuable resource, the underlying fiber optic network, westill need a DWDM layer. Routed Optical Networking is flexible enough to workacross point to point, ROADM based optical networks, or a mix of both. Ciscomultiplexers, amplifiers, and ROADMs can satisfy any network need.Cisco NCS 1010Routed Optical Networking 2.0 introduces the new Cisco NCS 1010 open opticalline system. The NCS 1010 represents an evolution in open optical linesystems, utilizing the same IOS-XR software as Cisco routers and NCS 1004series transponders. This enables the rich XR automation and telemetry supportto extend to the DWDM photonic line system. The NCS 1010 also simplifies howoperators build DWDM networks with advanced integrated functions and aflexible twin 1x33 WSS.See the validated design hardware section for more information.Routed Optical Networking Network Use CasesCisco is embracing Routed Optical Networking in every SP router role. Access,aggregation, core, peering, DCI, and even PE routers can be enabled with highspeed DCO optics. Routed Optical Networking is also not limited to SP networks,there are applications across enterprise, government, and education networks.Where to use 400ZR and where to use OpenZR+The OIF 400ZR and OpenZR+ MSA standards have important differences.400ZR supports 400G rates only, and targets metro distance point to pointconnections up to 120km. 400ZR mandates a strict power consumption of 15W aswell. Networks requiring only 400G over distances less than 120km may benefitfrom using 400ZR optics. DCI and 3rd party peering interconnection are good usecases for 400ZR.If a provider needs flexibility in rates and distances and wants to standardizeon a single optics type, OpenZR+ can fulfill the need. In areas of the networkwhere 400G may not be needed, OpenZR+ optics can be run at 100G or 200G.Additionally, hardware with QSFP-DD 100G ports can utilize OpenZR+ optics in100G mode. This can be ideal for high density access and aggregation networks.Supported DWDM Optical TopologiesFor those unfamiliar with DWDM hardware, please see the overview of DWDM networkhardware in Appendix AThe future of networks may be a flat L3 network with simple point to pointinterconnection, but it will take time to migrate to this type of architecture.Routed Optical Network supports an evolution to the architecture by working overmost modern photonic DWDM networks. 
Below are just a few of the supported optical topologies, including both point-to-point and ROADM-based DWDM networks. NCS 2000 64 Channel FOADM P2P Deployment This example provides up to 25.6Tb on a single network span, and highlights the simplicity of the Routed Optical Networking solution. The “optical” portion of the network including the ZR/ZR+ configuration can be completed in a matter of minutes from start to finish. NCS 1010 64 Channel FOADM P2P Deployment The NCS 1010 includes two add/drop ports with embedded bi-directional EDFA amplifiers, ideal for connecting the new MD-32-E/O 32 channel, 150GHz spaced passive multiplexer. Connecting both even and odd multiplexers allows the use of 64 total channels. NCS 2000 Colorless Add/Drop Deployment Using the NCS2K-MF-6AD-CFS colorless NCS2K-MF-LC module along with the LC16 LC aggregation module, and SMR20-FS ROADM module, a scalable colorless add/drop complex can be deployed to support 400ZR and OpenZR+. NCS 2000 Multi-Degree ROADM Deployment In this example a 3-degree ROADM node is shown with a local add/drop degree. The Routed Optical Networking solution fully supports ROADM based networks with optical bypass. The traffic demands of the network will dictate the most efficient network build. In cases where an existing or new build requires DWDM switching capability, ZR and ZR+ wavelengths are easily provisioned over the infrastructure. NCS 1010 Multi-Degree Deployment A multi-degree NCS 1010 site utilizes a separate NCS 1010 OLT device for each degree. The degree may be an add/drop or bypass degree. In our example Site 3 can support the add/drop of wavelengths via its A/D ports on the upper node, or express those wavelengths through the interconnect to Site 4 via the additional 1010 OLT unit connected to Site 4. In our example the wavelength originating at Sites 1 and 4 using ZR+ optics is expressed through Site 3. Long-Haul Deployment Cisco has demonstrated in a physical lab 400G OpenZR+ services provisioned across 1200km using NCS 2000 and NCS 1010 optical line systems. 300G, 200G, and 100G signals can achieve even greater distances. OpenZR+ is not just for shorter reach applications; it fulfills an ideal sweet spot in most provider networks in terms of bandwidth and reach. Core Networks Long-haul core networks also benefit from the CapEx and OpEx savings of moving to Routed Optical Networking. Moving to a simpler IP enabled converged infrastructure makes networks easier to manage and operate vs. networks with complex underlying optical infrastructure. The easiest place to start in the journey is replacing external transponders with OpenZR+ QSFP-DD transceivers. At 400G, connecting a 400G gray Ethernet port to a transponder with a 400G or 600G line side is not cost or environmentally efficient. Cisco can assist in modeling your core network to determine the TCO of Routed Optical Networking compared to traditional approaches. Metro Aggregation Tiered regional or metro networks connecting hub locations to larger aggregation sites or datacenters can also benefit from Routed Optical Networking. Whether deployed in a hub and spoke topology or hop by hop IP ring, Routed Optical Networking satisfies providers’ growth demands at a lower cost than traditional approaches. Access Access deployments in a ring or point-to-point topology are ideal for Routed Optical Networking.
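As a quick sanity check on the span capacity figures mentioned above (64 channels, 25.6Tb per fiber), here is a small sketch of the C-band arithmetic. The 75GHz channel spacing is an assumption based on common 400ZR/OpenZR+ deployments; the tuning range and per-channel rate come from the text.

SPACING_GHZ = 75                      # assumed grid for 400G ZR/ZR+ channels
LOW_THZ, HIGH_THZ = 191.3, 196.1      # OpenZR+/400ZR tuning range from the text
CHANNEL_GBPS = 400

channels = int((HIGH_THZ - LOW_THZ) * 1000 // SPACING_GHZ)        # 64 channels
capacity_tbps = channels * CHANNEL_GBPS / 1000                    # 25.6 Tb/s per fiber
center_freqs_thz = [LOW_THZ + i * SPACING_GHZ / 1000 for i in range(channels)]
print(channels, capacity_tbps, round(center_freqs_thz[-1], 3))    # 64 25.6 196.025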
Shorter distances over dark fiber may not require active optical equipment, and with up to 400G per span may provide the bandwidthnecessary for growth over a number of years without the use of additional multiplexers.DCI and 3rd Party Location InterconnectIn this use case, Routed Optical Networking simplifies deployments by eliminating active transponders, reducing power, space, and cabling requirements between end locations. 25.6Tbps of bandwidth is available over a single fiber using 64 400G wavelengths and simple optical amplifiers and multiplexers requiring no additional configuration after initial turn-up.Routed Optical Networking Private Line ServicesRelease 2.0 introduces Circuit Style Segment Routing TE Policies and Private Line Emulation hardware to enable traditional TDM-like private line services over the converged Segment Routing packet network. The following provides an overview of the hardware and software involved in supporting PL services. The figure below gives an overview of PLE service signaling and transport.Circuit Style Segment RoutingCS-SR provides the underlying TDM-like transport to support traditionalprivate line Ethernet services without additional hardware and bit-transparentEthernet, OTN, SONET/SDH, and Fiber Channel services using Private LineEmulation hardware.CS SR-TE paths characteristics Co-routed Bidirectional - Meaning the paths between two client ports are symmetric Deterministic without ECMP - Meaning the path does not vary based on any load balancing criteria Persistent - Paths are routed on a hop by hop basis, so they are not subject to path changes induced by network changes End-to-end path protection - Entire paths are switched from working to protect with the protect path in a hot standby state for fast transition Bandwidth guaranteed pathsSR CS-TE policies are built using link adjacency SIDs without protection to ensure the paths do not take a TI-LFA path during path failover and instead fail over to the pre-determined protect path.CS SR-TE path liveness detectionPaths can be configured with end to end liveness detection. Liveness detectionuses STAMP (Simple Two-Way Active Measurement Protocol) probes which arelooped at the far end to determine if the end to end path is upbi-directionally. If more than the set number of probes is missed (set by themultiplier) the path will be considered down. Once liveness detection isenabled probes will be sent on all candidate paths. Either the defaultliveness probe profile can be used or if you want to modify the defaultparameters a customized one can be created.CS SR-TE path failover behaviorCS SR-TE policies contain multiple candidate paths. The highest preferencecandidate path is considered the working path, the second highest preferencepath is the protect path, and if a third lower preference path is configuredwould be a dynamic restoration path. This provides 1#1+R protection for CSSR-TE policies.Static CS SR-TE PoliciesStatic CS SR-TE policies are policies using explicit pre-defined paths for theworking and protect paths in both the forward and reverse direction. Theseexplicit paths define the hop by hop path using adjacency-SIDs. The explicit SIDlists are defined on the head-end routers and part of the persistent deviceconfiguration. They can be built by a user or an external controller which iscreating the paths by available network information. 
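Before moving on to the configuration, the sketch below models the path-selection behaviour described above: the highest-preference candidate path is the working path, the protect path sits in hot standby, and liveness detection declares a path down after a configurable number of consecutive missed probes (the multiplier). It is an illustration of the logic only, not router code.

from dataclasses import dataclass

@dataclass
class CandidatePath:
    name: str
    preference: int
    missed_probes: int = 0

def active_path(paths, multiplier=3):
    """Return the highest-preference path that liveness still considers up."""
    for path in sorted(paths, key=lambda p: p.preference, reverse=True):
        if path.missed_probes < multiplier:       # fewer misses than the multiplier -> still up
            return path
    return None                                   # all candidate paths invalid -> policy down

paths = [CandidatePath("working", 100, missed_probes=3),    # declared down by liveness
         CandidatePath("protect", 50),
         CandidatePath("restoration", 25)]
print(active_path(paths).name)                    # protect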
Explicit SID lists can be referenced by multiple SR Policies.The following below shows the configuration of a static CS SR-TEPolicy with a working, protect, and restoration path.segment-routing traffic-eng policy to-55a2-1 color 1001 end-point ipv4 100.0.0.44 path-protection ! candidate-paths preference 25 dynamic metric-type igp ! ! preference 50 explicit segment-list protect-forward-path reverse-path segment-list protect-reverse-path ! ! preference 100 explicit segment-list working-forward-path reverse-path segment-list working-reverse-path ! ! ! performance-measurement liveness-detection liveness-profile name liveness-checkCS SR-TE Policy operational detailsRP/0/RP0/CPU0#ron-ncs55a2-1#show segment-routing traffic-eng policy color 1001Sat Dec 3 13#32#38.356 PSTSR-TE policy database---------------------Color# 1001, End-point# 100.0.0.42 Name# srte_c_1001_ep_100.0.0.42 Status# adjmin# up Operational# up for 2d09h (since Dec 1 04#08#12.648) Candidate-paths# Preference# 100 (configuration) (active) Name# to-100.0.0.42 Requested BSID# dynamic PCC info# Symbolic name# cfg_to-100.0.0.42_discr_100 PLSP-ID# 1 Constraints# Protection Type# protected-preferred Maximum SID Depth# 12 Explicit# segment-list forward-adj-path-working (valid) Reverse# segment-list reverse-adj-path-working Weight# 1, Metric Type# TE SID[0]# 15101 [adjacency-SID, 100.1.1.21 - 100.1.1.20] SID[1]# 15102 SID[2]# 15103 SID[3]# 15104 Reverse path# SID[0]# 15001 SID[1]# 15002 SID[2]# 15003 SID[3]# 15004 Protection Information# Role# WORKING Path Lock# Timed Lock Duration# 300(s) State# ACTIVE Preference# 50 (configuration) (protect) Name# to-100.0.0.42 Requested BSID# dynamic PCC info# Symbolic name# cfg_to-100.0.0.42_discr_50 PLSP-ID# 2 Constraints# Protection Type# protected-preferred Maximum SID Depth# 12 Explicit# segment-list forward-adj-path-protect(valid) Reverse# segment-list reverse-adj-path-protect Weight# 1, Metric Type# TE SID[0]# 15119 [adjacency-SID, 100.1.42.1 - 100.1.42.0] Reverse path# SID[0]# 15191 Protection Information# Role# PROTECT Path Lock# Timed Lock Duration# 300(s) State# STANDBY Attributes# Binding SID# 24017 Forward Class# Not Configured Steering labeled-services disabled# no Steering BGP disabled# no IPv6 caps enable# yes Invalidation drop enabled# no Max Install Standby Candidate Paths# 0Dynamic Circuit Style SR-TERelease 2.1 of Routed Optical Networking introduces the concept of dynamic CSSR-TE. Dynamic CS SR-TE utilizes Cisco’s SR-PCE Path Computation Element tocompute the working and protection paths of the CS SR-TE policy. Each head-endnode acts as a Path Computation Client, utilizing PCEP and Circuit Style PCEPextensions to communicate the required path characteristics to SR-PCE. UtilizingSR-PCE to compute the optimal disjoint paths simplifies the configuration anddeployment of Circuit-Style Policies.Bi-directional Association IDCircuit style paths are bi-directional and co-routed. The association ID valueis used to communicate the constraint to SR-PCE. SR-PCE will then compute thesame co-routed path from each head-end router to the tail-end router. Theworking paths on both head-end routers require using the same ID. The protectpaths will also use the same ID, but unique from the working path ID. Likewise,the optional restoration paths will also utilize a unique ID. The identifiersshould be globally unique for each pair of CS SR-TE policies. 
The following gives an example of the identifiers# Router Path Type Association ID A Working 100 Z Working 100 A Protect 200 Z Protect 200 A Restoration 300 Z Restoration 300 Disjoint path group IDAnother property of CS SR-TE policies is working and protect paths are fully disjoint. A disjoint group ID is used to communicate the constraint to SR-PCE. On each head-end node the working and protect candidate paths are assigned the same disjoint group ID. The disjoint group ID will be globally unique on each head-end node. Available options for path disjointness are node, link, and srlg. Router Path Type Disjoint Group ID A Working 101 Z Working 201 A Protect 101 Z Protect 201 Path constraintsCircuit style paths utilize only unprotected adjacency SID, those constraints are communicated to SR-PCE using specific configuration and flags in PCEP.The following configuration shows the full dynamic SR-TE configuration on each head-end node.Router Apolicy dynamic-cs-srte-to-55a2-p2 color 119 end-point ipv4 100.0.0.44 path-protection ! candidate-paths preference 100 dynamic pcep ! metric type igp ! ! constraints segments protection unprotected-only adjacency-sid-only ! disjoint-path group-id 10 type link ! bidirectional co-routed association-id 101 ! ! preference 200 dynamic pcep ! metric type igp ! ! lock duration 30 ! constraints segments protection unprotected-only adjacency-sid-only ! disjoint-path group-id 10 type link ! bidirectional co-routed association-id 201 ! ! preference 50 dynamic pcep ! metric type igp ! ! backup-ineligible lock duration 60 ! constraints segments protection unprotected-only adjacency-sid-only ! ! bidirectional co-routed association-id 301 ! ! ! performance-measurement liveness-detection liveness-profile backup name CS-PROTECT liveness-profile name CS-WORKING invalidation-action downRouter Zpolicy dynamic-cs-srte-to-57c3-p2 color 119 end-point ipv4 100.0.0.42 path-protection ! candidate-paths preference 100 dynamic pcep ! metric type igp ! ! constraints segments protection unprotected-only adjacency-sid-only ! disjoint-path group-id 11 type link ! bidirectional co-routed association-id 101 ! ! preference 200 dynamic pcep ! metric type igp ! ! lock duration 30 ! constraints segments protection unprotected-only adjacency-sid-only ! disjoint-path group-id 11 type link ! bidirectional co-routed association-id 201 ! ! preference 50 dynamic pcep ! metric type igp ! ! backup-ineligible lock duration 60 ! constraints segments protection unprotected-only adjacency-sid-only ! ! bidirectional co-routed association-id 301 ! performance-measurement liveness-detection liveness-profile backup name CS-PROTECT liveness-profile name CS-WORKING invalidation-action downCircuit-Style SR-TE with Bandwidth Admission Control using Crosswork Circuit Style ManagerRouted Optical Networking 2.1 with Crosswork Network Controller 5.0 and IOS-XR7.9.1 now supports utilizing the new Circuit Style Manager to provide BandwidthAdmission Controller and guaranteed bandwidth paths for Circuit-Style Policies.CNC 5.0 also supports full provisioning, monitoring, and visualization ofCircuit-Style Policies.CNC Circuit-Style Manager ConfigurationIn CNC 5.0, Circuit Style Manager uses a simple network-wide bandwidthpercentage setting to reserve a specific amount of bandwidth for BW-guaranteedCS-SRTE policies. 
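The sketch below is a conceptual illustration of this reservation model, not CSM's actual algorithm: a bandwidth-guaranteed policy is admitted only if every link on the candidate path still has headroom inside the Circuit-Style pool, and the reservation is then committed hop by hop. Link names, speeds, and the pool percentage are illustrative values.

def admit(path_links, request_bps, reservations, link_speed_bps, cs_pool_percent=80):
    """Admit a CS policy if the request fits the CS bandwidth pool on every link."""
    for link in path_links:
        pool = link_speed_bps[link] * cs_pool_percent / 100
        if reservations.get(link, 0) + request_bps > pool:
            return False                              # insufficient CS headroom on this link
    for link in path_links:
        reservations[link] = reservations.get(link, 0) + request_bps   # commit on each hop
    return True

speeds = {"A-B": 400e9, "B-Z": 400e9}
reserved = {"A-B": 250e9}
print(admit(["A-B", "B-Z"], 10e9, reserved, speeds))  # True, a 10 Gbps request fits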
CNC’s network model will track the allocation of bandwidth oneach link and the amount of capacity reserved by active CS-SRTE Policies.The Link CS BW Min Threshold configuration is used to trigger system alerts whenthe BW on a link meets or exceeds the threshold percentage configured by theuser.Allocated and reserved link bandwidthSR-PCE to Circuit Style Manager (CSM) CommunicationCSM communicates to SR-PCE through the SR-PCE northbound API. When the session is established between CSM and SR-PCE, SR-PCE will delegate all CS-SRTE Policies with bandwidth constraints to CSM for path computation.Bandwidth Admission Control OperationBW CAC is supported for dynamic CS SR-TE Policies. Utilizing the “bandwidth”configuration option for the policy triggers the inclusion of the “bandwidth”object in the PCEP request to SR-PCE. SR-PCE will delegate path computationrequests with bandwidth constraints to CNC CSM. Based on the CS-SRTE Policyconfiguration, CSM will compute a Working, Protect, and Restoration path to beused by the policy. The paths computed by CSM will adhere to the CS-SRTEproperties with Working and Protect paths being fully disjoint (link, node, orSRLG) and each path will be co-routed meaning the Working path from A to Z willbe identical to the path from Z to A.CS SR-TE Policy Bandwidth Configurationsegment-routing traffic-eng policy srte_c_3000_ep_100.0.0.44 bandwidth 10000000 color 3000 end-point ipv4 100.0.0.44 path-protectionOperational InformationRP/0/RP0/CPU0#ron-ncs55a2-2#show segment-routing traffic-eng policy color 3000 | beg AttributesMon Jun 12 17#55#44.238 PDT Attributes# Binding SID# 24010 Forward Class# Not Configured Steering labeled-services disabled# no Steering BGP disabled# no IPv6 caps enable# yes Bandwidth Requested# 10.000 Gbps Bandwidth Current# 10.000 GbpsCNC CS-SRTE MonitoringStarting in CNC 5.0, CS-SRTE Policies are fully supported. CS-SRTE Policies are identified as a specific type of policy in the Traffic Engineering dashboard and have enhanced visualization and monitoring capabilities.CS-SRTE Dashboard CS-SRTE Policy Visualization of Working and Protect Paths When visualizing CS-SRTE policies we can see both Working and Protect paths including which path is currently active, denoted by the A icon.CS-SRTE Policy Path Details When we inspect the policy path details we can see both the requested and reserved bandwidth for the path and then the additional details such as path constraints and hop by hop path.CNC CS-SRTE ProvisioningIn CNC 5.0 the Circuit-Style SR-TE Function Pack is supported. This service typesimplifies CS-SRTE provisioning by dynamically allocating IDs for thebi-directional association ID and disjoint-group ID. 
It also simplifiesprovisioning by provisioning both the A to Z and Z to A policies at one time vs.defining each one independently.NSO Payload for Dynamic CS SR-TE Policy{ ~data~# { ~cisco-cs-sr-te-cfp#cs-sr-te-policy~# [ { ~name~# ~clus-demo-ple-fc8~, ~head-end~# { ~device~# ~ron-ncs55a2-1~, ~ip-address~# ~100.0.0.44~ }, ~tail-end~# { ~device~# ~ron-ncs55a2-2~, ~ip-address~# ~100.0.0.22~ }, ~color~# 3000, ~bandwidth~# 10000000, ~disjoint-path~# { ~forward-path~# { ~type~# ~node~, ~group-id~# 900 }, ~reverse-path~# { ~type~# ~node~, ~group-id~# 901 } }, ~path-protection~# { }, ~performance-measurement~# { ~liveness-detection~# { ~profile~# ~ple~, ~backup~# ~ple~, ~logging~# { ~session-state-change~# [null] } } }, ~working-path~# { ~dynamic~# { ~constraints~# { ~segments~# { ~protection~# ~unprotected-only~ } }, ~pce~# { }, ~metric-type~# ~latency~, ~bidirectional-association-id~# 1000 } }, ~protect-path~# { ~dynamic~# { ~constraints~# { ~segments~# { ~protection~# ~unprotected-only~ } }, ~pce~# { }, ~metric-type~# ~latency~, ~bidirectional-association-id~# 1001 }, ~revertive~# true, ~wait-to-revert-timer~# 30 } } ] }}Private Line Emulation HardwareStarting in IOS-XR 7.7.1 the NC55-OIP-02 Modular Port Adapter (MPA) is supportedon the NCS-55A2-MOD and NCS-57C3-MOD platforms. The NC55-OIP-02 has 8 SFP+ portsEach port on the PLE MPA can be configured independently. The PLE MPA is responsible for receiving data frames from the native PLE client and packaging those into fixed frames for transport over the packet network.More information on the NC55-OIP-02 can be found in its datasheet located athttps#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/network-con-5500-series-ds.pdf. A full detailed to end to end configuration for PLE can be found in the Routed Optical Networking 2.0 Solution Guide found at https#//www.cisco.com/c/en/us/td/docs/optical/ron/2-0/solution/guide/b-ron-solution-20/m-ron.pdfSupported Client Transceivers Transport Type Supported Transceivers Ethernet SFP-10G-SR/LR/ER, GLC-LH/EX/ZX-SMD, 1G/10G CWDM OTN (OTU2e) SFP-10G-LR-X, SFP-10G-ER-I, SFP-10G-Z SONET/SDH ONS-SC+-10G-LR/ER/SR (OC-192/STM-64), ONS-SI-2G-L1/L2/S1 (OC-48/STM-16) Fiber Channel DS-SFP-FCGE, DS-SFP-FC8G, DS-SFP-FC16G, DS-SFP-FC32G, 1/2/4/8G FC CWDM Note FC32G transceivers are supported in the even ports only and will disable the adjacent odd SFP+ port.Private Line Emulation Pseudowire SignalingPLE utilizes IETF SAToP pseudowire encoding carried over dynamically signalled EVPN-VPWS circuits. Enhancements to the EVPN VPWS service type have been introduced to the IETF viahttps#//datatracker.ietf.org/doc/draft-schmutzer-bess-ple.PLE services use Differential Clock Recovery (DCR) to ensure proper frame timing between the two PLE clients. In order to mmaintain accuracy of the clock each PLE endpoint router must have its frequency source traceable to a common primary reference clock (PRC).Private Line Emulation EVPN-VPWS ConfigurationPLE services can be configured to utilize a CS SR-TE Policy or use dynamic MPLS protocols. The example belows shows the use of CS SR-TE Policy as transport for the PLE EVPN-VPWS service. Note the name of the sr-te policy in the preferred path command is the persistent generated name and not the name used in the CLI configuration. This can be determined using the “show segment-routing traffic-engineering policies” command.l2vpn pw-class circuit-style-srte encapsulation mpls preferred-path sr-te policy srte_c_1001_ep_100.0.0.42 ! ! 
xconnect group ple p2p ple-cs-1 interface CEM0/0/2/1 neighbor evpn evi 100 target 4201 source 4401 pw-class circuit-style-srte ! !PLE Monitoring and TelemetryThe following “show” command can be used to monitor the state of PLE ports and services.Client Optics Port StateRP/0/RP0/CPU0#ron-ncs55a2-1#show controllers optics 0/0/2/1Sat Dec 3 14#00#10.873 PST Controller State# Up Transport Admin State# In Service Laser State# On LED State# Not Applicable Optics Status Optics Type# SFP+ 10G SR Wavelength = 850.00 nm Alarm Status# ------------- Detected Alarms# None LOS/LOL/Fault Status# Laser Bias Current = 8.8 mA Actual TX Power = -2.60 dBm RX Power = -2.33 dBm Performance Monitoring# Disable THRESHOLD VALUES ---------------- Parameter High Alarm Low Alarm High Warning Low Warning ------------------------ ---------- --------- ------------ ----------- Rx Power Threshold(dBm) 2.0 -13.9 -1.0 -9.9 Tx Power Threshold(dBm) 1.6 -11.3 -1.3 -7.3 LBC Threshold(mA) 13.00 4.00 12.50 5.00 Temp. Threshold(celsius) 75.00 -5.00 70.00 0.00 Voltage Threshold(volt) 3.63 2.97 3.46 3.13 Polarization parameters not supported by optics Temperature = 33.00 Celsius Voltage = 3.30 V Transceiver Vendor Details Form Factor # SFP+ Optics type # SFP+ 10G SR Name # CISCO-FINISAR OUI Number # 00.90.65 Part Number # FTLX8574D3BCL-CS Rev Number # A Serial Number # FNS23300J42 PID # SFP-10G-SR VID # V03 Date Code(yy/mm/dd) # 19/07/25PLE CEM Controller StatsRP/0/RP0/CPU0#ron-ncs57c3-1#show controllers CEM 0/0/3/1Sat Sep 24 11#34#22.533 PDTInterface # CEM0/0/3/1Admin state # UpOper state # UpPort bandwidth # 10312500 kbpsDejitter buffer (cfg/oper/in-use) # 0/813/3432 usecPayload size (cfg/oper) # 1280/1024 bytesPDV (min/max/avg) # 980/2710/1845 usecDummy mode # last-frameDummy pattern # 0xaaIdle pattern # 0xffSignalling # No CASRTP # EnabledClock type # DifferentialDetected Alarms # NoneStatistics Info---------------Ingress packets # 517617426962, Ingress packets drop # 0Egress packets # 517277124278, Egress packets drop # 0Total error # 0 Missing packets # 0, Malformed packets # 0 Jitter buffer underrun # 0, Jitter buffer overrun # 0 Misorder drops # 0Reordered packets # 0, Frames fragmented # 0Error seconds # 0, Severely error seconds # 0Unavailable seconds # 0, Failure counts # 0Generated L bits # 0, Received L bits # 0Generated R bits # 339885178, Received R bits # 17Endpoint Info-------------Passthrough # NoPLE CEM PM StatisticsRP/0/RP0/CPU0#ron-ncs57c3-1#show controllers CEM 0/0/3/1 pm current 30-sec cemSat Sep 24 11#37#02.374 PDTCEM in the current interval [11#37#00 - 11#37#02 Sat Sep 24 2022]CEM current bucket type # ValidINGRESS-PKTS # 2521591 Threshold # 0 TCA(enable) # NOEGRESS-PKTS # 2521595 Threshold # 0 TCA(enable) # NOINGRESS-PKTS-DROPPED # 0 Threshold # 0 TCA(enable) # NOEGRESS-PKTS-DROPPED # 0 Threshold # 0 TCA(enable) # NOINPUT-ERRORS # 0 Threshold # 0 TCA(enable) # NOOUTPUT-ERRORS # 0 Threshold # 0 TCA(enable) # NOMISSING-PKTS # 0 Threshold # 0 TCA(enable) # NOPKTS-REORDER # 0 Threshold # 0 TCA(enable) # NOJTR-BFR-UNDERRUNS # 0 Threshold # 0 TCA(enable) # NOJTR-BFR-OVERRUNS # 0 Threshold # 0 TCA(enable) # NOMIS-ORDER-DROPPED # 0 Threshold # 0 TCA(enable) # NOMALFORMED-PKT # 0 Threshold # 0 TCA(enable) # NOES # 0 Threshold # 0 TCA(enable) # NOSES # 0 Threshold # 0 TCA(enable) # NOUAS # 0 Threshold # 0 TCA(enable) # NOFC # 0 Threshold # 0 TCA(enable) # NOTX-LBITS # 0 Threshold # 0 TCA(enable) # NOTX-RBITS # 0 Threshold # 0 TCA(enable) # NORX-LBITS # 0 Threshold # 0 TCA(enable) # NORX-RBITS # 0 Threshold # 0 
TCA(enable) # NOPLE Client PM StatisticsRP/0/RP0/CPU0#ron-ncs57c3-1#show controllers EightGigFibreChanCtrlr0/0/3/4 pm current 30-sec fcSat Sep 24 11#51#55.168 PDTFC in the current interval [11#51#30 - 11#51#55 Sat Sep 24 2022]FC current bucket type # Valid IFIN-OCTETS # 16527749196 Threshold # 0 TCA(enable) # NO RX-PKT # 196758919 Threshold # 0 TCA(enable) # NO IFIN-ERRORS # 0 Threshold # 0 TCA(enable) # NO RX-BAD-FCS # 0 Threshold # 0 TCA(enable) # NO IFOUT-OCTETS # 0 Threshold # 0 TCA(enable) # NO TX-PKT # 0 Threshold # 0 TCA(enable) # NO TX-BAD-FCS # 0 Threshold # 0 TCA(enable) # NO RX-FRAMES-TOO-LONG # 0 Threshold # 0 TCA(enable) # NO RX-FRAMES-TRUNC # 0 Threshold # 0 TCA(enable) # NO TX-FRAMES-TOO-LONG # 0 Threshold # 0 TCA(enable) # NO TX-FRAMES-TRUNC # 0 Threshold # 0 TCA(enable) # NORouted Optical Networking Architecture HardwareAll Routed Optical Networking solution routers are powered by Cisco IOS-XR.Routed Optical Networking Validated RoutersBelow is a non-exhaustive snapshot of platforms validated for use with ZR andOpenZR+ transceivers. Cisco supports Routed Optical Networking in the NCS 540,NCS 5500/5700, ASR 9000, and Cisco 8000 router families. The breadth of coverageenabled the solution across all areas of the network.Cisco 8000 SeriesThe Cisco 8000 and its Silicone One NPU represents the next generation inrouters, unprecedented capacity at the lowest power consumption while supportinga rich feature set applicable for a number of network roles.See more information on Cisco 8000 at https#//www.cisco.com/c/en/us/products/collateral/routers/8000-series-routers/datasheet-c78-742571.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/cisco8000/Interfaces/73x/configuration/guide/b-interfaces-config-guide-cisco8k-r73x/m-zr-zrp-cisco-8000.htmlCisco 5700 Systems and NCS 5500 Line CardsThe Cisco 5700 family of fixed and modular systems and line cards are flexibleenough to use at any location in the networks. The platform has seen widespreaduse in peering, core, and aggregation networks.See more information on Cisco NCS 5500 and 5700 at https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-736270.html andhttps#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-744698.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/interfaces/73x/configuration/guide/b-interfaces-hardware-component-cg-ncs5500-73x/m-zr-zrp.htmlASR 9000 SeriesThe ASR 9000 is the most widely deployed SP router in the industry. It has arich heritage dating back almost 20 years, but Cisco continues to innovate onthe ASR 9000 platform. The ASR 9000 series now supports 400G QSFP-DD on avariety of line cards and the ASR 9903 2.4Tbps 3RU platform.See more information on Cisco ASR 9000 at https#//www.cisco.com/c/en/us/products/collateral/routers/asr-9000-series-aggregation-services-routers/data_sheet_c78-501767.htmlSpecific information on ZR/ZR+ support can be found at https#//www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-3/interfaces/configuration/guide/b-interfaces-hardware-component-cg-asr9000-73x/m-zr-zrp.html#Cisco_Concept.dita_59215d6f-1614-4633-a137-161ebe794673NCS 500 SeriesThe 1Tbps N540-24QL16DD-SYS high density router brings QSFP-DD and Routed Optical NetworkingZR/OpenZR+ optics to a flexible access and aggregation platform. 
Using OpenZR+ optics it allows a migration path from 100G to 400G access rings or uplinks when used in an aggregation role.See more information on Cisco NCS 540 at https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-500-series-routers/ncs-540-large-density-router-ds.htmlRouted Optical Networking Optical HardwareNetwork Convergence System 1010The NCS 1010 Open Optical Line System (O-OLS) is a next-generation DWDM platform available in fixed variants to satisfy building a modern flexible DWDM photonic network.The NCS 1010 Optical Line Terminal (OLT) uses a twin 33-port WSS architectureallowing higher scale for either add/drop or express wavelengths. The OLT alsohas two LC add/drop ports with integrated fixed gain EDFA to support theadd/drop of lower power optical signals. OLTs are available in models with orwithout RAMAN amplification. NCS 1010 Inline Amplifier nodes are available asbi-directional EDFA, EDFA with RAMAN in one direction, or bi-directional RAMAN.Each model of NCS 1010 is also available to support both C and L bands. In Routed Optical Networking 2.0 ZR and ZR+ optics utilize the C band, but may be used on the same fiber withL band signals using the NCS 1010 C+L combiner.The NCS 1010 utilizes IOS-XR, inheriting the advanced automation and telemetryfeatures similar to IOS-XR routers.NCS 1010 OLT with RAMAN NCS 1010 ILA with RAMAN The NCS1K-MD32-E/O-C 32-port 150Ghz spaced passive multiplexer is used with the NCS 1010, supporting the 75Ghz ZR/ZR+ signals and future higher baud rate signals. The MD-32 contains photodiodes to monitor RX power levels on each add/drop port.NCS 1010 MD-32 Passive Filter The NCS 1010 supports point to point and express DWDM optical topologies in Routed Optical Networking 2.0. All NCS 1010 services in Routed Optical Networking are managed using Cisco Optical NetworkController.See more information on the NCS 1010 series at https#//www.cisco.com/c/en/us/products/collateral/optical-networking/network-convergence-system-1000-series/network-conver-system-1010-ds.htmlNetwork Convergence System 2000The NCS 2000 Optical Line System is a flexible platform supporting all modernoptical topologies and deployment use cases. Simple point to point tomulti-degree CDC deployments are all supported as part of Routed OpticalNetworking.See more information on the NCS 2000 series at https#//www.cisco.com/c/en/us/products/optical-networking/network-convergence-system-2000-series/index.htmlNetwork Convergence System 1000 MultiplexerThe NCS1K-MD-64-C is a new fixed multiplexer designed specificallyfor the 400G 75Ghz 400ZR and OpenZR+ wavelengths, allowing up to 25.6Tbps on asingle fiber.Network Convergence System 1001The NCS 1001 is utiized in point to point network spans as an amplifier andoptionally protection switch. The NCS 1001 now has specific support for 75Ghzspaced 400ZR and OpenZR+ wavelengths, with the ability to monitor incomingwavelengths for power. 
The 1001 features the ability to determine the proper amplifier gain setpoints based on the desired user power levels.See more information on the NCS 1001 at https#//www.cisco.com/c/en/us/products/collateral/optical-networking/network-convergence-system-1000-series/datasheet-c78-738782.htmlNCS 2000 and NCS 1001 HardwareThe picture below does not represent all available hardware on the NCS 2000, however it does capture the modules typically used in Routed Optical Networking deployments.Routed Optical Networking AutomationOverviewRouted Optical Networking is by definition a disaggregated optical solution, creating efficiency by moving the coherent endpoints into the router. The solution requires a new way of managing the network, one which unifies the IP and Optical layers, replacing the traditional siloed tools used in the past. Real transformation in operations comes from unifying teams and workflows, rather than trying to make an existing tool fit a role it was not originally designed for. Cisco’s standards-based hierarchical SDN solution allows providers to manage a multi-vendor Routed Optical Networking solution using standard interfaces and YANG models.IETF ACTN SDN FrameworkThe IETF Abstraction and Control of TE Networks (ACTN) working group has defined a hierarchical controller framework that allows vendors to plug components into the framework as needed. The lowest level controller, the Provisioning Network Controller (PNC), is responsible for managing physical devices. These controllers expose their resources through standard models and interfaces to a Hierarchical Controller (HCO), called a Multi-Domain Service Controller (MDSC) in the ACTN framework.Note that while Cisco is adhering to the IETF framework proposed in RFC8453, Cisco is supporting the most widely adopted industry standards for controller-to-controller communication and service definition. In optical, the de facto standard is the ONF Transport API for the management of optical line system networks and optical services. In packet, we are leveraging OpenConfig device models where possible and IETF models for packet topology (RFC8345) and xVPN services (L2NM and L3NM).Cisco’s SDN Controller Automation StackAligning to the ACTN framework, Cisco’s automation stack includes a multi-vendor IP domain controller (PNC), an optical domain controller (PNC), and a multi-vendor hierarchical controller (HCO/MDSC).Cisco Open AutomationCisco believes not all providers consume automation in the same way, so we are dedicated to making sure we have open interfaces at each layer of the network stack. At the device level, we utilize standard NETCONF, gRPC, and gNMI interfaces along with native, standard, and public consortium YANG models. There is no aspect of a Cisco IOS-XR router today not covered by YANG models. At the domain level we have Cisco’s network controllers, which use the same standard interfaces to communicate with devices and expose standards-based NBIs. Our multi-layer/multi-domain controller likewise uses the same standard interfaces.Crosswork Hierarchical ControllerResponsible for Multi-Layer Automation is the Crosswork Hierarchical Controller. Crosswork Hierarchical Controller is responsible for the following network functions# CW HCO unifies data from the IP and optical networks into a single network model. HCO utilizes industry standard IETF topology models for IP and TAPI for optical topology and service information. HCO can also leverage legacy EMS/NMS systems or use device interrogation.
Responsible for managing multi-layer Routed Optical Networking links using a single UI. Providing assurance at the IP and optical layers in a single tool. The network model allows users to quickly correlate faults and identify at which layer faults have occurred. Additional HCO applications include the following# Root Cause Analysis# Quickly correlate upper layer faults to an underlying cause. Layer Relations# Quickly identify the lower layer resources supporting a higher layer network resource, or all network resources reliant on a selected lower layer network resource. Network Inventory# View IP and optical node hardware inventory along with network resources such as logical links, optical services, and traffic engineering tunnels Network History# View state changes across all network resources at any point in time Performance# View historical link utilization Please see the following resources for more information on Crosswork HCO. https#//www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/solution-overview-c22-744695.htmlCrosswork Network ControllerCrosswork Network Controller is a multi-vendor IP domain controller. Crosswork Network Controller is responsible for the following IP network functions. Collecting Ethernet, IP, RSVP-TE, and SR network information for internal applications and exposing it northbound via IETF RFC 8345 topology models Collecting traffic information from the network for use with CNC’s traffic optimization application, Crosswork Optimization Engine Enabling bandwidth-guaranteed Circuit-Style Segment Routing paths using Circuit Style Manager, new in CNC 5.0 / RON 2.1 Performing provisioning of SR-TE, RSVP-TE, L2VPN, and L3VPN using standard industry models (IETF TEAS-TE, L2NM, L3NM) via UI or northbound API Visualization and assurance of SR-TE, RSVP-TE, and xVPN services Optimization of network resources with its industry-first Tactical Traffic Engineering application to perform traffic engineering when needed. Use additional Crosswork applications to perform telemetry collection/alerting, zero-touch provisioning, and automated and assured network changesMore information on Crosswork and Crosswork Network Controller can be found at https#//www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/datasheet-c78-743456.htmlCisco Optical Network ControllerCisco Optical Network Controller (Cisco ONC) is responsible for managing Cisco optical line systems and circuit services. Cisco ONC exposes an ONF TAPI northbound interface, the de facto industry standard for optical network management. Cisco ONC runs as an application on the same Crosswork Infrastructure as CNC.More information on Cisco ONC can be found at https#//www.cisco.com/c/en/us/support/optical-networking/optical-network-controller/series.htmlCisco Network Services Orchestrator and Routed Optical Networking ML Core Function PackCisco NSO is the industry standard for service orchestration and device configuration management. The RON-ML CFP can be used to fully configure an IP link between routers utilizing 400ZR/OpenZR+ optics over a Cisco optical line system using Cisco ONC. This includes IP addressing and adding links to an existing Ethernet LAG. The CFP can also support optical-only provisioning on the router to fit into existing optical provisioning workflows.Routed Optical Networking Service ManagementSupported Provisioning MethodsWe support multiple ways to provision Routed Optical Networking services based on existing provider workflows.
Unified IP and Optical using Crosswork Hierarchical Controller IP router DCO provisioning using Cisco NSO Routed Optical Networking Multi-Layer Function Pack ZR/ZR+ Optics using IOS-XR CLI Cisco Native Model-driven ZR/ZR+ Optics configuration using Netconf or gNMI OpenConfig ZR/ZR+ Optics configuration using Netconf or gNMIOpenZR+ and 400ZR PropertiesZR/ZR+ Supported FrequenciesThe frequency on Cisco ZR/ZR+ transceivers may be set between 191.275Thz and196.125Thz in increments of 6.25Ghz, supporting flex spectrum applications. Tomaximize the available C-Band spectrum, these are the recommended 6475Ghz-spaced channels, also aligning to the NCS1K-MD-64-C fixed channel add/dropmultiplexer.                 196.100 196.025 195.950 195.875 195.800 195.725 195.650 195.575 195.500 195.425 195.350 195.275 195.200 195.125 195.050 194.975 194.900 194.825 194.75 194.675 194.600 194.525 194.450 194.375 194.300 194.225 194.150 194.075 194.000 193.925 193.850 193.775 193.700 193.625 193.550 193.475 193.400 193.325 193.250 193.175 193.100 193.025 192.950 192.875 192.800 192.725 192.650 192.575 192.500 192.425 192.350 192.275 192.200 192.125 192.050 191.975 191.900 191.825 191.750 191.675 191.600 191.525 191.450 191.375 Supported Line Side Rate, FEC, and ModulationOIF 400ZR transceivers support 400G only with the cFEC FEC type per the OIFspecification. OpenZR+ transceivers can support 100G, 200G, 300G, or 400G lineside rate. See router platform documentation for supported rates on eachplatform or line card. OpenZR+ optics can utilize the cFEC type in 400G mode to retain compatibility with OIF 400ZR.50Ghz Spectrum Compatiblity Modes (New)Starting in Routed Optical Networking 2.1 and IOS-XR 7.9.1 additional support has been added for the 200G-8QAM and 200G-16QAM modes. 200G-8QAM utilizes a symbol rate of 40.1Gbaud and 200G-16QAM utilizes a symbol rate of 30.1Gbaud. This ensures the signal width is compability with legacy 50Ghz filter optical line systems. 100G-QPSK also utilizes a 30.1Gbaud symbol rate.OpenZR+ Supported Configurations Transceiver Type Rate FEC Modulation Standard QDD-400G-ZR-S 400 cFEC 16QAM OIF 400ZR QDD-400G-ZRP-S 400 cFEC 16QAM OIF 400ZR QDD-400G-ZRP-S 400 oFEC 16QAM OpenZR+ QDD-400G-ZRP-S 300 oFEC 8QAM OpenZR+ QDD-400G-ZRP-S 200 oFEC QPSK OpenZR+ QDD-400G-ZRP-S 200 oFEC 8QAM OpenZR+ QDD-400G-ZRP-S 200 oFEC 16QAM OpenZR+ QDD-400G-ZRP-S 100 oFEC QPSK OpenZR+ DP04QSDD-HE0 400 cFEC 16QAM OIF 400ZR DP04QSDD-HE0 400 oFEC 16QAM OpenZR+ DP04QSDD-HE0 300 oFEC 8QAM OpenZR+ DP04QSDD-HE0 200 oFEC QPSK OpenZR+ DP04QSDD-HE0 200 oFEC 8QAM OpenZR+ DP04QSDD-HE0 200 oFEC 16QAM OpenZR+ DP04QSDD-HE0 100 oFEC QPSK OpenZR+ Crosswork Hierarchical Controller UI ProvisioningEnd-to-End IP+Optical provisioning can be done using Crosswork Hierarchical Controller’s GUI IP Linkprovisioning. Those familiar with traditional GUI EMS/NMS systems for servicemanagement will have a very familiar experience. Crosswork Hierarchical Controller provisioning will provisionboth the router optics as well as the underlying optical network to support theZR/ZR+ wavelength.Cross-Layer Link DefinitionEnd to end provisioning requires first defining the Cross-Layer or Inter-Layer links between therouter ZR/ZR+ optics and the optical line system add/drop ports. 
This is donein Crosswork HCO using a UI based “Link Manager” application, used to define the Network Media Channel (NMC) interconnection between ZR/ZR+ port and optical add/drop port.The below screenshot shows defined NMC cross-links.Cross-Layer Link Validation (New)Starting in RON 2.1 and HCO 7.0 users now have the ability to validate the connectivity of an NMC Cross-Layer link. Validation is done by manipulating the transmit power of the optics on the routers and continuously monitoring the powerseen on the receive side of the optical line system add/drop port. In RON 2.1 the link validation solution is supported using all XR based Cisco routers and NCS 1010 optical line systems. Validation can be done for all links or per-link using the “Validate Link” option.Using the Cross-Layer Link Validation is service affectingThe screenshot below shows a successful validation.IP Link Provisioning using Crosswork HCOCrosswork HCO supports end-to-end multi-layer provisioning of Routed Optical Networking circuits, providing a simplified way to provisioning DCO optics in the routers and the supporting optical line system OTSiMC channel in a single operation.HCO also supports separating the router and optical line system provisioning as two separate tasks, and also supports router-only provisioning for use cases where eitherdark fiber or passive optical components are being used.Once the cross layer links are created, the user can then proceed inprovisioning an end to end circuit spanning both IP and optical networks. Theprovisioning UI takes as input the two router endpoints, the associated ZR/ZR+ports, and the IP addressing or bundle membership of the link. The optical linesystem provisioning is abstracted from the user, simplifying the end to endworkflow. The frequency and power is automatically derived by Cisco OpticalNetwork Controller based on the add/drop port and returned as a parameter to beused in router optics provisioning.Operational DiscoveryThe Crosswork Hierarchical Controller provisioning process also performs a discovery phase to ensure theservice is operational before considering the provisioning complete. Ifoperational discovery fails, the end to end service will be rolled back.NSO RON-ML CFP ProvisioningProviders familiar with using Cisco Network Service Orchestrator have an optionto utilize NSO to provision optical and IP layer configuration for ZR/ZR+ routerDCOs.Cisco has created the Routed Optical Network Multi-Layer Core Function Pack,RON-ML CFP to perform the provisioning of Routed Optical Networking services onthe router endpoints. The aforementioned Crosswork HCO provisioning utilizes theRON-ML CFP to perform end device provisioning.Please see the Cisco Routed Optical Networking RON-ML CFP documentation for moredetails.RON-ML End to End ServiceThe RON-ML service is responsible for router DCO provisioning. All IP layer configuration such as bundle membership and IP addressing is optional, allowing potentially different teams to perform optical parameter provisioning vs. 
Ethernet/IP layer configuration.RON-ML API ProvisioningUse the following URL for NSO RESTCONF provisioning using a PATCH operation#http#//<nso host>#<port>/restconf/dataProvisioning ZR+ optics and adding interface to Bundle-Ether 100 interface{ ~cisco-ron-cfp#ron~# { ~ron-ml~# [ { ~name~# ~E2E_Bundle_ZRP_ONC57_2~, ~mode~# ~transponder~, ~bandwidth~# ~400~, ~circuit-id~# ~E2E Bundle ONC-57 S9|chan11 - S10|chan11~, ~grid-type~# ~100mhz-grid~, ~end-point~# [ { ~end-point-device~# ~ron-8201-1~, ~terminal-device-optical~# { ~line-port~# ~0/0/0/11~, ~transmit-power~# -100 }, ~ols-domain~# { ~end-point-state~# ~UNLOCKED~ }, ~terminal-device-packet~# { ~bundle~# [ { ~id~# 100 } ], ~interface~# [ { ~index~# 0, ~membership~# { ~bundle-id~# 100, ~mode~# ~active~ } } ] } }, { ~end-point-device~# ~ron-8201-2~, ~terminal-device-optical~# { ~line-port~# ~0/0/0/11~, ~transmit-power~# -100 }, ~ols-domain~# { ~end-point-state~# ~UNLOCKED~ }, ~terminal-device-packet~# { ~bundle~# [ { ~id~# 100 } ], ~interface~# [ { ~index~# 0, ~membership~# { ~bundle-id~# 100, ~mode~# ~active~ } } ] } } ] } ] } }IOS-XR CLI ConfigurationConfiguring the router portion of the Routed Optical Networking link is verysimple. All optical configuration related to the ZR/ZR+ optics configuration islocated under the optics controller relevent to the faceplate port. Defaultconfiguration the optics will be in an up/up state using a frequency of193.10Thz. The default transmit power is dependent on the optics type. The default transmit power for the QDD-400G-ZR-S (OIF 400ZR) and QDD-400G-ZRP-S is -10 dBm. The default transmit power for the High-Power ZR+ DP04QSDD-HE0 is 0 dBm.The basic configuration with a specific frequency of 195.65 Thz is located below, the only required component is the bolded channel frequency setting.ZR/ZR+ Optics Configurationcontroller Optics0/0/0/20 transmit-power -100 dwdm-carrier 100MHz-grid frequency 1956500 logging events link-statusModel-Driven Configuration using IOS-XR Native Models using NETCONF or gNMIAll configuration performed in IOS-XR today can also be done using NETCONF/YANG. The following payload exhibits the models and configuration used to perform router optics provisioning. This is a more complete example showing the FEC, power, and frequency configuration. .Note in Release 2.0 using IOS-XR 7.7.1 the newer IOS-XR Unified Models are utilized for provisioning<data xmlns=~urn#ietf#params#xml#ns#netconf#base#1.0~><controllers xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-interface-cfg~>    <controller>        <controller-name>Optics0/0/0/0</controller-name>        <transmit-power xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-cont-optics-cfg~>-115</transmit-power>        <fec xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-cont-optics-cfg~>OFEC</fec>        <dwdm-carrier xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-cont-optics-cfg~>          <grid-100mhz>            <frequency>1913625</frequency>          </grid-100mhz>        </dwdm-carrier>        <dac-rate xmlns=~http#//cisco.com/ns/yang/Cisco-IOS-XR-um-dac-rate-cfg~>1x1.25</dac-rate>      </controller></controllers> </data>Model-Driven Configuration using OpenConfig ModelsStarting on Release 2.0 all IOS-XR 7.7.1+ routers supporting ZR/ZR+ optics can be configured using OpenConfig models. 
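Whichever model is used, the payload is ultimately delivered over NETCONF (or gNMI). As a hedged sketch only, the Python below builds a small native-model payload like the one above and pushes it with the open-source ncclient library. The unit conversions follow the examples in this document (frequency configured in 100 MHz steps, transmit power in tenths of a dBm, so 195.65 THz becomes 1956500 and -10 dBm becomes -100); the device address, credentials, and port name are placeholders.

# Hedged sketch: build a Cisco-IOS-XR-um native-model payload for a ZR/ZR+
# optics controller and push it over NETCONF with ncclient.
# Device address, credentials, and port name are placeholder assumptions.
from ncclient import manager

UM_IF_NS = "http://cisco.com/ns/yang/Cisco-IOS-XR-um-interface-cfg"
UM_OPTICS_NS = "http://cisco.com/ns/yang/Cisco-IOS-XR-um-cont-optics-cfg"

def optics_payload(port, freq_thz, tx_power_dbm):
    # Units seen in this document's examples: frequency in 100 MHz steps,
    # transmit power in tenths of a dBm (195.65 THz -> 1956500, -10 dBm -> -100).
    freq = round(freq_thz * 10000)
    power = round(tx_power_dbm * 10)
    return f"""
    <config>
      <controllers xmlns="{UM_IF_NS}">
        <controller>
          <controller-name>{port}</controller-name>
          <transmit-power xmlns="{UM_OPTICS_NS}">{power}</transmit-power>
          <dwdm-carrier xmlns="{UM_OPTICS_NS}">
            <grid-100mhz><frequency>{freq}</frequency></grid-100mhz>
          </dwdm-carrier>
        </controller>
      </controllers>
    </config>"""

if __name__ == "__main__":
    with manager.connect(host="192.0.2.1", port=830, username="admin",
                         password="admin", hostkey_verify=False) as m:
        m.edit_config(target="candidate",
                      config=optics_payload("Optics0/0/0/20", 195.65, -10.0))
        m.commit()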
Provisioning utilizes the openconfig-terminal-device model and its extensions to the openconfig-platform model to support DWDM configuration parameters.Below is an example of an OpenConfig payload to configure ZR/ZR+ optics port 0/0/0/20 with a 300G trunk rate with frequency 195.20 THz.Please visit the blog at https#//xrdocs.io/design/blogs/zr-openconfig-mgmt for in depth information about configuring and monitoring ZR/ZR+ optics using OpenConfig models.<config> <terminal-device xmlns=~http#//openconfig.net/yang/terminal-device~> <logical-channels> <channel> <index>100</index> <config> <index>200</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>101</index> <config> <index>101</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>102</index> <config> <index>102</index> <rate-class xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#TRIB_RATE_100G</rate-class> <admin-state>ENABLED</admin-state> <description>ETH Logical Channel</description> <trib-protocol xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_100G_MLG</trib-protocol> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_ETHERNET</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>100</allocation> <assignment-type>LOGICAL_CHANNEL</assignment-type> <description>ETH to Coherent assignment</description> <logical-channel>200</logical-channel> </config> </assignment> </logical-channel-assignments> </channel> <channel> <index>200</index> <config> <index>200</index> <admin-state>ENABLED</admin-state> <description>Coherent Logical Channel</description> <logical-channel-type xmlns#idx=~http#//openconfig.net/yang/transport-types~>idx#PROT_OTN</logical-channel-type> </config> <logical-channel-assignments> <assignment> <index>1</index> <config> <index>1</index> <allocation>300</allocation> <assignment-type>OPTICAL_CHANNEL</assignment-type> <description>Coherent to optical assignment</description> <optical-channel>0/0-OpticalChannel0/0/0/20</optical-channel> </config> </assignment> </logical-channel-assignments> 
</channel> </logical-channels> </terminal-device> <components xmlns=~http#//openconfig.net/yang/platform~> <component> <name>0/0-OpticalChannel0/0/0/20</name> <optical-channel xmlns=~http#//openconfig.net/yang/terminal-device~> <config> <operational-mode>5007</operational-mode> <frequency>195200000</frequency> </config> </optical-channel> </component> </components> </config>Routed Optical Networking AssuranceCrosswork Hierarchical ControllerMulti-Layer Path TraceUsing topology and service data from both the IP and Optical network CW HCO candisplay the full service from IP services layer to the physical fiber. Below isan example of the “waterfall” trace view from the OTS (Fiber) layer to theSegment Routing TE layer across all layers. CW HCO identifies specific RoutedOptical Networking links using ZR/ZR+ optics as seen by the ZRC (ZR Channel) andZRM (ZR Media) layers from the 400ZR specification.When faults occur at a specific layer, faults will be highlighted in red,quickly identifying the layer a fault has occurred. In this case we can see thefault has occurred at an optical layer, but is not a fiber fault. Having theability to pinpoint the fault layer even within a specific domain is a powerfulway to quickly determine the root cause of the fault.Routed Optical Networking Link AssuranceThe Link Assurance application allows users to view a network link and all of its dependent layers. This includes Routed Optical Networking multi-layer services. In addition to viewing layer information, fault and telemetry information is also available by simply selecting a link or port.ZRM Layer TX/RX PowerZRC Layer BER and Q-Factor / Q-MarginOptionally the user can see graphs of collected telemetry data to quickly identify trends or changes in specific operational data. Graphs of collected performance data is accessed using the “Performance” tab when a link or port is selected.OTS Layer RX/TX Power GraphEvent MonitoringCrosswork HCO records any transition of a network resource between up/down operational states. This is reflected in the Link Assurance tool under the “Events” tab.IOS-XR CLI Monitoring of ZR400/OpenZR+ OpticsOptics ControllerThe optics controller represents the physical layer of the optics. 
In the caseof ZR/ZR+ optics this includes the frequency information, RX/TX power, OSNR, andother associated physical layer information.RP/0/RP0/CPU0#ron-8201-1#show controllers optics 0/0/0/20Thu Jun 3 15#34#44.098 PDT Controller State# Up Transport Admin State# In Service Laser State# On LED State# Green FEC State# FEC ENABLED Optics Status Optics Type# QSFPDD 400G ZR DWDM carrier Info# C BAND, MSA ITU Channel=10, Frequency=195.65THz, Wavelength=1532.290nm Alarm Status# ------------- Detected Alarms# None LOS/LOL/Fault Status# Alarm Statistics# ------------- HIGH-RX-PWR = 0 LOW-RX-PWR = 0 HIGH-TX-PWR = 0 LOW-TX-PWR = 4 HIGH-LBC = 0 HIGH-DGD = 1 OOR-CD = 0 OSNR = 10 WVL-OOL = 0 MEA = 0 IMPROPER-REM = 0 TX-POWER-PROV-MISMATCH = 0 Actual TX Power = -7.17 dBm RX Power = -9.83 dBm RX Signal Power = -9.18 dBm Frequency Offset = 9 MHz Baud Rate = 59.8437500000 GBd Modulation Type# 16QAM Chromatic Dispersion 6 ps/nm Configured CD-MIN -2400 ps/nm CD-MAX 2400 ps/nm Second Order Polarization Mode Dispersion = 34.00 ps^2 Optical Signal to Noise Ratio = 35.50 dB Polarization Dependent Loss = 1.20 dB Polarization Change Rate = 0.00 rad/s Differential Group Delay = 2.00 psPerformance Measurement DataRP/0/RP0/CPU0#ron-8201-1#show controllers optics 0/0/0/20 pm current 30-sec optics 1Thu Jun 3 15#39#40.428 PDTOptics in the current interval [15#39#30 - 15#39#40 Thu Jun 3 2021]Optics current bucket type # Valid MIN AVG MAX Operational Configured TCA Operational Configured TCA Threshold(min) Threshold(min) (min) Threshold(max) Threshold(max) (max)LBC[% ] # 0.0 0.0 0.0 0.0 NA NO 100.0 NA NOOPT[dBm] # -7.17 -7.17 -7.17 -15.09 NA NO 0.00 NA NOOPR[dBm] # -9.86 -9.86 -9.85 -30.00 NA NO 8.00 NA NOCD[ps/nm] # -489 -488 -488 -80000 NA NO 80000 NA NODGD[ps ] # 1.00 1.50 2.00 0.00 NA NO 80.00 NA NOSOPMD[ps^2] # 28.00 38.80 49.00 0.00 NA NO 2000.00 NA NOOSNR[dB] # 34.90 35.12 35.40 0.00 NA NO 40.00 NA NOPDL[dB] # 0.70 0.71 0.80 0.00 NA NO 7.00 NA NOPCR[rad/s] # 0.00 0.00 0.00 0.00 NA NO 2500000.00 NA NORX_SIG[dBm] # -9.23 -9.22 -9.21 -30.00 NA NO 1.00 NA NOFREQ_OFF[Mhz]# -2 -1 4 -3600 NA NO 3600 NA NOSNR[dB] # 16.80 16.99 17.20 7.00 NA NO 100.00 NA NOCoherent DSP ControllerThe coherent DSP controller represents the framing layer of the optics. 
It includes Bit Error Rate, Q-Factor, and Q-Margin information.RP/0/RP0/CPU0#ron-8201-1#show controllers coherentDSP 0/0/0/20Sat Dec 4 17#24#38.245 PSTPort # CoherentDSP 0/0/0/20Controller State # UpInherited Secondary State # NormalConfigured Secondary State # NormalDerived State # In ServiceLoopback mode # NoneBER Thresholds # SF = 1.0E-5 SD = 1.0E-7Performance Monitoring # EnableBandwidth # 400.0Gb/sAlarm Information#LOS = 10 LOF = 0 LOM = 0OOF = 0 OOM = 0 AIS = 0IAE = 0 BIAE = 0 SF_BER = 0SD_BER = 0 BDI = 0 TIM = 0FECMISMATCH = 0 FEC-UNC = 0 FLEXO_GIDM = 0FLEXO-MM = 0 FLEXO-LOM = 3 FLEXO-RDI = 0FLEXO-LOF = 5Detected Alarms # NoneBit Error Rate InformationPREFEC BER # 1.7E-03POSTFEC BER # 0.0E+00Q-Factor # 9.30 dBQ-Margin # 2.10dBFEC mode # C_FECPerformance Measurement DataRP/0/RP0/CPU0#ron-8201-1#show controllers coherentDSP 0/0/0/20 pm current 30-sec fecThu Jun 3 15#42#28.510 PDTg709 FEC in the current interval [15#42#00 - 15#42#28 Thu Jun 3 2021]FEC current bucket type # Valid EC-BITS # 20221314973 Threshold # 83203400000 TCA(enable) # YES UC-WORDS # 0 Threshold # 5 TCA(enable) # YES MIN AVG MAX Threshold TCA Threshold TCA (min) (enable) (max) (enable)PreFEC BER # 1.5E-03 1.5E-03 1.6E-03 0E-15 NO 0E-15 NOPostFEC BER # 0E-15 0E-15 0E-15 0E-15 NO 0E-15 NOQ[dB] # 9.40 9.40 9.40 0.00 NO 0.00 NOQ_Margin[dB] # 2.20 2.20 2.20 0.00 NO 0.00 NOEPNM Monitoring of Routed Optical NetworkingEvolved Programmable Network Manager, or EPNM, can also be used to monitor router ZR/ZR+ performance measurement data and display device level alarms when faults occur. EPNM stores PM and alarm data for historical analysis.EPNM Chassis View of DCO TransceiversThe following shows a chassis view of a Cisco 8201 router. The default view is to show all active alarms on the device and its components. Clicking on a specific component will give information on the component and narrow the scope of alarms and data.Chassis ViewInterface/Port ViewEPNM DCO Performance MeasurementEPNM continuously monitors and stores PM data for DCO optics for important KPIs such as TX/RX power, BER, and Q values. The screenshots below highlight monitoring. While EPNM stores historical data, clicking on a speciic KPI will enable realtime monitoring by polling for data every 20 seconds.DCO Physical Layer PM KPIsThe following shows common physical layer KPIs such as OSNR and RX/TX power. This is exposed by monitoring the Optics layer of the interface. DCO.The following shows common framing layer KPIs such as number of corrected words per interval and (BIEC) Bit Error Rate. This is exposed by monitoring the CoherentDSP layer of the interface.Cisco IOS-XR Model-Driven Telemetry for Routed Optical Networking MonitoringAll operational data on IOS-XR routers and optical line systems can be monitored using streaming telemetry based on YANG models. Routed Optical Networking is no different, so a wealth of information can be streamed from the routers in intervals as low as 5s.ZR/ZR+ DCO TelemetryThe following represents a list of validated sensor paths useful for monitoringthe DCO optics in IOS-XR and the data fields available within thesesensor paths. Note PM fields also support 15m and 24h paths in addition to the 30s paths shown in the table below. 
Sensor Path Fields Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info alarm-detected, baud-rate, dwdm-carrier-frequency, controller-state, laser-state, optical-signal-to-noise-ratio, temperature, voltage Cisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-lanes/optics-lane receive-power, receive-signal-power, transmit-power Cisco-IOS-XR-controller-otu-oper#otu/controllers/controller/info bandwidth, ec-value, post-fec-ber, pre-fec-ber, qfactor, qmargin, uc Cisco-IOS-XR-pmengine-oper#performance-management/optics/optics-ports/optics-port/optics-current/optics-second30/optics-second30-optics/optics-second30-optic dd__average, dgd__average, opr__average, opt__average, osnr__average, pcr__average, pmd__average, rx-sig-pow__average, snr__average, sopmd__average Cisco-IOS-XR-pmengine-oper#performance-management/otu/otu-ports/otu-port/otu-current/otu-second30/otu-second30fecs/otu-second30fec ec-bits__data, post-fec-ber__average, pre-fec-ber__average, q__average, qmargin__average, uc-words__data NCS 1010 Optical Line System MonitoringThe following represents a list of validated sensor paths useful for monitoringthe different optical resources on the NCS 1010 OLS. The OTS controller represents the lowest layer port interconnecting optical elements. The NCS 1010 supports per-channel monitoring, exposed as the OTS-OCH Sensor Path Fields Cisco-IOS-XR-controller-ots-oper#ots-oper/ots-ports/ots-port/ots-info total-tx-power, total-rx-power, transmit-signal-power, receive-signal-power, agress-ampi-gain, ingress-ampli-gain, controller-state Cisco-IOS-XR-controller-ots-och-oper#ots-och-oper/ots-och-ports/ots-och-port/ots-och-info total-tx-power, total-rx-power, transport-admin-state, line-channel, add-drop-channel Cisco-IOS-XR-controller-oms-oper rx-power, tx-power, controller-state, led-state Cisco-IOS-XR-controller-och-oper#och-oper/och-ports/och-port/och-info channel-frequency, channel-wavelength, controller-state, rx-power, tx-power, channel-width, led-state Cisco-IOS-XR-pmengine-oper#performance-management/optics/optics-ports opr, opt, opr-s, opt-s Cisco-IOS-XR-olc-oper#olc/span-loss-ctrlr-tables/span-loss-ctrlr-table neighbor-rid, rx-span-loss, tx-span-loss, name Open-source MonitoringCisco model-driven telemetry along with the open source collector Telegraf and the open source dashboard software Grafana can be used to quickly build powerful dashboards to monitor ZR/ZR+ and NCS 1010 OLS performance.Additional ResourcesCisco Routed Optical Networking 2.1 Solution Guidehttps#//www.cisco.com/content/en/us/td/docs/optical/ron/2-1/solution/guide/b-ron-solution-21.htmlCisco Routed Optical Networking Home https#//www.cisco.com/c/en/us/solutions/service-provider/routed-optical-networking.html Cisco Routed Optical Networking Tech Field Day Solution Overview# https#//techfieldday.com/video/build-your-network-with-cisco-routed-optical-networking-solution/ Automation Demo# https#//techfieldday.com/video/cisco-routed-optical-networking-solution-demo/Cisco Champion Podcasts Cisco Routed Optical Networking Solution for the Next Decade https#//smarturl.it/CCRS8E24 Simplify Network Operations with Crosswork Hierarchical Controller# https#//smarturl.it/CCRS8E48 Appendix AAcronyms     DWDM Dense Waveform Division Multiplexing OADM Optical Add Drop Multiplexer FOADM Fixed Optical Add Drop Multiplexer ROADM Reconfigurable Optical Add Drop Multiplexer DCO Digital Coherent Optics FEC Forward Error Correction OSNR Optical Signal to Noise Ratio BER Bit Error Rate 
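Referring back to the Open-source Monitoring note and the DCO sensor-path table earlier in this section, the short Python sketch below illustrates the kind of post-processing a collector pipeline might apply to streamed values such as post-fec-ber and qmargin, flagging a port whose Q-margin falls below a chosen level or whose post-FEC BER becomes non-zero. The record format and the 1 dB alert threshold are assumptions for illustration only; the sample values mirror the show output earlier in this post.

# Hedged example of post-processing streamed DCO telemetry records.
# Field names (pre-fec-ber, post-fec-ber, qmargin) follow the sensor-path table
# above; the record format and the 1.0 dB Q-margin threshold are assumptions.

Q_MARGIN_ALARM_DB = 1.0  # example alert threshold, not a Cisco default

def check_dco_record(port, record):
    """Return a list of human-readable alerts for one telemetry sample."""
    alerts = []
    if float(record.get("post-fec-ber", 0.0)) > 0.0:
        alerts.append(f"{port} post-FEC errors present (uncorrectable frames likely)")
    if float(record.get("qmargin", 99.0)) < Q_MARGIN_ALARM_DB:
        alerts.append(f"{port} Q-margin {record['qmargin']} dB below {Q_MARGIN_ALARM_DB} dB")
    return alerts

if __name__ == "__main__":
    sample = {"pre-fec-ber": 1.7e-3, "post-fec-ber": 0.0, "qmargin": 2.1}
    print(check_dco_record("CoherentDSP0/0/0/20", sample) or "no alerts")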
DWDM Network Hardware OverviewOptical Transmitters and ReceiversOptical transmitters provide the source signals carried across the DWDM network.They convert digital electrical signals into a photonic light stream on aspecific wavelength. Optical receivers detect pulses of light and and convertsignals back to electrical signals. In Routed Optical Networking, digital coherent QSFP-DD OpenZR+ and 400ZR transceivers in routers are used as optical transmitters and receivers.Multiplexers/DemultiplexersMultiplexers take multiple wavelengths on separate fibers and combine them intoa single fiber. The output of a multiplexer is a composite signal.Demultiplexers take composite signals that compatible multiplexers generate andseparate the individual wavelengths into individual fibers.Optical AmplifiersOptical amplifiers amplify an optical signal. Optical amplifiers increase thetotal power of the optical signal to enable the signal transmission acrosslonger distances. Without amplifiers, the signal attenuation over longerdistances makes it impossible to coherently receive signals. We use differenttypes of optical amplifiers in optical networks. For example# preamplifiers,booster amplifiers, inline amplifiers, and optical line amplifiers.Optical add/drop multiplexers (OADMs)OADMs are devices capable of adding one or more DWDM channels into or droppingthem from a fiber carrying multiple channels.Reconfigurable optical add/drop multiplexers (ROADMs)ROADMs are programmable versions of OADMs. With ROADMs, you can change thewavelengths that are added or dropped. ROADMs make optical networks flexible andeasily modifiable.", "url": "/blogs/2023-08-24-cst-routed-optical-2_1/", "author": "Phil Bedard", "tags": "iosxr, design, optical, ron, routing" } , "blogs-2023-11-15-routed-access-for-rural-broadband": { "title": "Routed Access for Rural Broadband", "content": " On This Page High-Level Design Key Drivers New Sources of Funding New Providers Focus on Fiber Value Proposition Solution Overview PON Agnostic Routed Access Hardware Components NCS 540 Series NCS 5500 / 5700 Fixed Chassis ASR 9900 Design Components Topology Options Routing and Forwarding Fast Re-Route using TI-LFA MPLS VPN Services IPv6 vs IPv4 QOS Use Cases High Speed Residential Data Services High Availability for PON Per-subscriber Policies with BNG Implementation Details Targets Testbed Overview Devices Key Resources to Allocate Role-Based Configuration Transport IOS-XR – All IOS-XR nodes IGP Protocol (ISIS) and Segment Routing MPLS configuration BGP – Access IOS-XR configuration Services EVPN-enabled LAG for Redundant OLT Connections Access Router Service Provisioning# Per-subscriber policies with BNG Access Router Service Provisioning# Service Router Provisioning# Summary# Routed Access for Rural Broadband High-Level DesignRouted Access for Rural Broadband introduces best-practice network design for small operators looking to deploy residential broadband services.Key DriversNew Sources of FundingSubstantial and ongoing federal investment in US broadband has created a historic opportunity to bring high speed access to the unserved and underserved across the country, primarily in rural areas. 
The following federal programs are some of the major funding initiatives that have brought $100 billion to solve the challenges of rural broadband access# The NTIA Broadband Equity, Access, and Deployment (BEAD) Program provides $42.45 billion to expand high-speed internet access by funding planning, infrastructure deployment and adoption programs in all 50 states FCC Rural Digital Opportunity Fund (RDOF) Phase 1 & 2 will disburse up to $20.4 billion over 10 years in two phases to bring fixed broadband and voice service to millions of unserved homes and small businesses in rural America. The American Rescue Plan (ARPA) has already spent or committed more than $25 billion to invest in affordable high-speed internet and connectivity. The NTIA Tribal Broadband Connectivity Program (TBCP) is a $3 billion program, from President Biden’s Bipartisan Infrastructure Law and the Consolidated Appropriations Act, to support Tribal governments bringing high-speed Internet to Tribal lands. With an estimated $40 billion in additional private equity funds being deployed in tandem with federal grants, 57 million additional homes will be served with fiber to the home (FTTH) in the next 5 years.New ProvidersWhile traditional, large-scale service providers will be able to expand their services with these new sources of funds, the goals and funding model of federal grants also open the door for new types of providers to enter the market. Smaller scale and non-traditional providers (such as utilities, tribal organizations, non-profit consortiums and public sector organizations) are taking advantage of federal grants to bring broadband to their constituents. For example, the BEAD program grants money to individual states that then distribute the funds through state-level broadband offices. These offices can give preference to smaller, community-based, and non-traditional providers that strive to serve the mission of providing broadband access to local underserved populations.Focus on FiberAlthough many types of technology can enable high-speed broadband access, a clear preference for fiber to the home has emerged. The majority of BEAD funding ($41.6 billion) is to be used to deploy last-mile fiber-optic networks. Only “extremely high-cost” areas will be permitted to deploy non-fiber technologies under the terms of the program.There are several reasons for the focus on fiber for rural broadband deployments. Network design for residential broadband services is constrained by the availability of power for electronic components that make up the network. A passive optical network (PON) running over fiber does not need to be powered by electricity (even splitters function without added power) which means that higher distances can be achieved between a customer’s home or business and the central office or hub. For rural deployments with low population densities and long distances between subscribers, PON is an obvious choice.In addition to being able to transmit high-bandwidth data over long distances with minimal electro-magnetic interference, fiber deployments can last for decades with the same fiber supporting higher and higher data rates as technology evolves over time. One such transition is happening now. For more than a decade, Gigabit-capable PON (GPON) has been the gold standard for fiber deployments, providing 2.5 Gbps downstream and 1.25 Gbps upstream. As the demand for bandwidth has grown, providers have started to shift to 10-Gigabit symmetric PON (XGS-PON).
Because GPON and XGS-PON run on different wavelengths, both can run on the same fiber. Providers can easily migrate from GPON to XGS-PON by swapping out the equipment on either end of the fiber.With all the advantages of fiber and 10 Gigabits of symmetric bandwidth, XGS-PON is a compelling choice for rural broadband over the next 5 years.Value PropositionConverged SDN Transport is the gold standard for modern service provider networking architecture, delivering a high-scale, programmable network capable of delivering all services (Residential, Business, 4G/5G Mobile Backhaul, Video, IoT). While many of the high-level design principles of Converged SDN Transport can be extended to rural broadband, many deployments will benefit from some modifications. In particular, rural broadband deployments may emphasize# Lower densities# rural deployments will have fewer people over greater distances than urban or suburban deployments Lower scale# unlike urban or suburban deployments, rural deployments may have much smaller total subscriber counts. The RBB Design is targeted at deployments from 5,000 to 100,000 subscribers. More flexibility# rural deployments may require the flexibility to build as they go, starting simple and adding functionality as requirements evolve and funding allows. The RBB design starts with basic connectivity with the option to add redundancy and subscriber-specific features later on. Residential PON# while large providers may need to support a wide variety of services, rural broadband will often focus on providing PON to residential subscribers with a simple high-speed data (with or without voice) service. The RBB design focuses on providing an access network design for PON networks. Simplicity# smaller, non-traditional providers need simple, easy-to-manage networks. The RBB design streamlines the Converged SDN Transport design, eliminating complex features where possible. These features can be added later if the scale and service requirements of the network change.Solution OverviewRouted Access for Rural Broadband is made of the following main building blocks# PON-Agnostic IOS-XR as a common operating system proven in Service Provider Networks Routed access with transport based on Segment Routing Redundancy and traffic isolation provided by Layer 2 (EVPN) and Layer 3 VPN services based on BGP BNG for optional per-subscriber traffic managementPON AgnosticOne of the first decisions a new rural broadband provider makes is how to deploy PON. The optical line terminal (OLT) is the starting point of the optical network and serves as the demarkation point of the access network. Many PON vendors offer OLTs in a variety of form-factors, from massive 48-port PON shelves to 4-port temperature-hardened remote OLTs, to pluggable SFPs that provide full OLT functionality when plugged into an ethernet port.The sparser densities and longer distances in rural broadband will typically favor smaller form factor OLTs. In any case, the Routed Access for Rural Broadband design supports any vendor’s OLT that connects to the access network via common ethernet technologies.Routed AccessAccess domains have traditionally been built using flat Layer 2 native ethernet technologies. But traditions sometimes persist even when the reasons for them no longer exist. In the past, many people defaulted to Layer 2 networks because switching was less expensive and Layer 2 seemed simpler than IP networks. 
But big shifts in the economics of routing silicon have made routers more affordable, and innovations in routing protocols have made them simpler to deploy.There are many benefits to bringing IP to the access network# Arbitrary Topologies# To prevent broadcast storms and duplicate packets, traditional Layer 2 topologies must not contain loops. G.8032 is a common loop prevention technique that requires an underlying ring topology. In other topologies, Spanning Tree Protocol is required to block redundant ports. Because IP networks use TTL to prevent loops, arbitrary topologies can be supported without disabling redundant paths. Reconvergence# In Layer 2 networks, MAC-learning occurs in the data plane. Changes in topology (i.e. link failures) require flushing all MAC tables and subsequent flooding of traffic. This can cause extended reconvergence times. L3 networks, on the other hand, can reconverge very quickly. Troubleshooting# It can be difficult to know the actual path in complex Layer 2 networks. Efficiency# Because redundant paths must be blocked in Layer 2 topologies, there is no notion of Equal Cost Multi-Pathing (ECMP). IP networks support equal-cost multi-path and active-active redundancy, so that all links can be used at the same time. Resilience# Layer 2 networks achieve multi-homing only with active-standby MC-LAG, which is vulnerable to traffic storms. Layer 3 networks can achieve multi-homing with EVPN-based LAG, which supports active-active redundancy.Hardware ComponentsNCS 540 SeriesThe NCS 540 family of routers is a powerful and versatile set of access routers capable of aggregating uplinks from XGS-PON devices. With a wide range of speeds (10G/25G/40G/100G) and port densities, the NCS 540 series can support many variations of access topologies.NCS 5500 / 5700 Fixed ChassisThe NCS 5500 / 5700 fixed series devices are scalable, low-power, and cost-optimized 100G routing platforms for aggregation and high-scale access roles in the Routed Access for Rural Broadband design. The platform family offers industry-leading density of 1/10/25/40/50/100/400G ports with efficient forwarding performance, low jitter, and the lowest power consumption per gigabit/sec at a very cost-effective price point.ASR 9900The ASR 9900 is the router family of choice for edge services. The Routed Access for Rural Broadband utilizes the ASR 9000 in a PE function role, performing Pseudowire headend termination for per-subscriber policy with BNG.Design ComponentsTopology OptionsUnlike the tidy racks of data center networks, the physical design of broadband networks tends to be messy, dominated by the geography and population density of the area to be served. Homes connect back to an OLT located at a point-of-presence, central office or outside plant in a point-to-multipoint topology. An XGS-PON OLT can serve homes in a 20 to 100 km radius. A small, remote OLT (hardened for outside conditions) with 4 ports and a 64-way split could serve up to 256 homes. A 16-port PON shelf with the same 64-way split might serve 1024 homes within a 40 km radius. For denser, less rural areas, larger PON shelves can be stacked to serve tens or hundreds of thousands of homes within the supported radius.Once the user’s traffic reaches the OLT, it gets handed off to the access network, which takes that traffic to the Internet (or other service). The design of access networks is also determined by geography and the availability of fiber. 
Two typical designs are ring and point-to-point.In a routed ring access topology, routers are connected in a ring. Because traffic can flow in either direction, the ring topology provides an automatic backup path if a link or router fails with sub 50-millisecond convergence times.In the following example topology, generic XGS-PON OLTs with 10G uplinks connect to a 100G routed access ring of NCS 540s. The 55A1-24Hs, with 24 100G ports, can aggregate multiple 100G NCS 540 rings and, if needed, provide uplinks for 100G OLT shelves.Depending on the density and distribution of subscribers, the uplink capabilities of the your PON solution, the availability requirements, and the acceptable oversubscription rate, different platforms in the NCS540 and NCS5500/5700 families can be used to construct the access ring and aggregation nodes.In a point to point access architecture, the access routers can be single or dual-homed to the central office. Dual-homing enables redundancy should either of the uplinks fail, as well as providing extra capacity during normal operation.Regardless of how the routers connect back to the Internet (ring or point-to-point), each router serves as an aggregation point for individual OLTs, OLT rings or OLT trees. Depending on the availability requirements, those OLTs can be single or dual-homed to the routers.Routing and ForwardingOnce the user traffic reaches the access router, it enters the layer 3 domain of the network. The ethernet frame header is discarded and the packet is routed over the IP network to the final destination.The routing protocol or IGP (Interior Gateway Protocol) allows routers in the same domain to exchange routing information so that packets can be routed according to the destination IP address. This design uses IS-IS as the IGP protocol. Because rural broadband designs are typically small enough to run a single IGP domain, the focus of the design is on intra-domain forwarding. For networks that exceed the scale limits of a single domain, refer to the Inter-Domain Operation guidelines of the Converged SDN Transport design.To improve forwarding performance and enable advanced features, this design uses Multiprotocol Label Switching (MPLS) for the data plane. MPLS is a “layer 2.5” technology that enables routers to transfer packets through the network using labels. Instead of needing expensive route table lookups at every hop, MPLS traffic can be efficiently forwarded through label-switching. Historically, a separate protocol like LDP was required to advertise labels throughout the network. Today, Segment Routing eliminates the need to configure and maintain LDP to advertise labels.Segment Routing reduces the amount of protocols needed in a Service Provider Network. Simple extensions to traditional IGP protocols like ISIS provide full Intra-Domain Routing and Forwarding Information over a label switched infrastructure, along with High Availability (HA) and Fast Re-Route (FRR) capabilities that enable fast recovery should a link or node fail.Segment Routing introduces the idea of a Prefix Segment Identifier or Prefix-SID. A prefix-SID identifies the router and must be unique for every router in the IGP Domain. Prefix-SID is statically allocated by the network operator in the IGP configuration process. The Prefix-SID is advertised by the IGP protocol which eliminates the need to use LDP or RSVP protocol to exchange MPLS labels. 
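To make the label allocation concrete, here is a small worked example, assuming the IOS-XR default Segment Routing Global Block (SRGB) of 16000-23999# a router whose Loopback0 is configured with prefix-sid index 65 (as in the configuration examples later in this document) is reachable throughout the IGP domain via MPLS label 16000 + 65 = 16065, and every other router independently derives that same label from the advertised index without any LDP or RSVP signaling.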
For these reasons, Segment Routing is a foundational technology in this design.Fast Re-Route using TI-LFASegment Routing embeds a simple Fast Re-Route (FRR) mechanism known as Topology Independent Loop Free Alternate (TI-LFA). TI-LFA provides sub-50ms convergence for link and node protection. TI-LFA is completely stateless and does not require any additional signaling mechanism, as each node in the IGP Domain calculates a primary and a backup path automatically and independently based on the IGP topology. After the TI-LFA feature is enabled, no further action is required from the network operator to ensure fast network recovery from failures. This is in stark contrast with traditional MPLS-FRR, which requires RSVP-TE signaling and therefore adds complexity to the transport design.MPLS VPN ServicesOne of the advantages of an IP/MPLS network is that it enables the deployment of VPN services. Many people associate L3VPN and L2VPN with expensive business services. But even small providers can benefit from judicious use of MPLS VPNs in residential deployments.In the simplest design, subscriber traffic arrives at the access device, receives an IP address via DHCP and gets access to the Internet via the global routing table. Many large, modern networks are being built to isolate the global routing table from the underlying infrastructure. In this case, the Internet global table is carried as an L3VPN service, leaving the infrastructure layer protected from the global Internet and subscribers.The other use case for MPLS VPN services is when Broadband Network Gateway (BNG) is deployed for per-subscriber traffic management. Because BNG requires more sophisticated treatment of the user traffic and, hence, more expensive forwarding hardware, the BNG device is often deployed in a centralized location to take advantage of economies of scale. Access devices can be configured to backhaul Layer 2 subscriber traffic to the BNG device using EVPN pseudowires.IPv6 vs IPv4To address the inevitable exhaustion of IPv4 addresses, the IETF standardized IPv6 decades ago. Even today, however, IPv6 adoption remains incomplete. Some content on the internet is only available via IPv4; some network infrastructure features are still optimized for IPv4.For rural broadband networks, this design recommends using IPv4 for the underlay transport infrastructure. SR-MPLS is very mature on IPv4 networks, and private IP addressing can be used for the routers in the network, thus avoiding address exhaustion concerns for these devices. At the same time, the design supports “dual-stack” (IPv4 and IPv6) deployments to ensure that the network can transition to IPv6-based transport should the need arise. The underlay transport architecture for IPv6 networks is called “SRv6.” SRv6 enables next-generation IPv6-based networks to support complex user and infrastructure services at very high scale. Rural broadband networks typically do not have the scale or complexity to make SRv6 compelling. But if, for any reason, IPv4 and SR-MPLS are not suitable for your deployment, reference the Cisco Converged SDN Transport SRv6 Transport High Level Design for details on SRv6-based transport.For end customers and customer premises equipment, addressing schemes can be IPv4-only, IPv6-only or dual-stack. Deploying IPv4 addresses will require either 1) acquiring globally routable IPv4 addresses; or 2) translating private IPv4 addresses using a Carrier Grade NAT (CGN) solution. Each option has drawbacks. 
If you have not already been allocated a block, globally routable IPv4 addresses are expensive. In 2022, IPv4 addresses were selling for around $60 each. In addition to being expensive, purchased addresses can come with baggage. Depending on the actions of the previous owner, the address may have acquired a bad reputation that resulted in it being blacklisted from certain networks. This can be a difficult problem to detect, troubleshoot and clean up. If you chose a CGN solution instead, you will need far fewer globally routed addresses but you will have to purchase, configure and maintain another device in the dataplane.Over 40% of the world’s top 1000 websites are accessible via native IPv6, which means that IPv6-only clients can quickly access that content. However, for the rest of the content on the network, some translation or tunneling mechanism must be implemented to allow IPv6-only end users to connect to IPv4-only content which adds complexity to the network design.Given the state of the internet today, a dual-stack deployment with CGN may represent the best tradeoff for rural broadband deployments. Native IPv6 traffic (which includes popular, bandwidth heavy services like Netflix and YouTube) can go natively between clients and content providers. IPv6 traffic will bypass the CGN function, reducing the load on that device. IPv4-only content will continue to leverage CGN until the day that IPv6 has been fully adopted world-wide.QOSQuality of Service (QoS) refers to the ability of a network to provide better service to various types of network traffic. To achieve an end to end QoS objective (i.e. between the subscriber and the Internet), it’s necessary to consider the different mechanisms in the PON network and routed access network.QoS can be applied in two directions# upstream (from the subscriber to the Internet) or downstream (from the internet to the subscriber). In PON networks, upstream traffic is controlled by the OLT, which implements a flow-control mechanism that lends itself to a simpler implementation of QoS. The ITU-T G.983.4 recommendation coupled with other QoS mechanisms enable the OLT to effectively schedule, shape and prioritize upstream traffic.While upstream traffic management in the OLT is well-defined and largely consistent across vendors, downstream traffic management can often be more effectively carried out deeper in the access and core network.The platforms in the design have sophisticated and rich QoS feature sets that are widely deployed in large-scale, converged, multi-service provider networks. Even on the smallest platforms, QoS has many options and can be endlessly customized. Rural broadband networks, however, tend to be relatively simple and well-supplied with bandwidth. Therefore, the goal of this design is to get as much value as possible from the simplest deployment of QoS. Simple policies with basic marking and egress queuing are often sufficient and require less monitoring and maintenance than more complex policies.QoS policies applied to network interfaces act on the aggregate subscriber traffic on that interface. This, combined with OLT QoS capabilities, may be sufficient for simple deployments of sparsely populated areas. Other providers may need to leverage a BNG solution for finer-grained QoS policies.Use CasesHigh Speed Residential Data ServicesThe primary use case for rural broadband deployments leveraging federal grants such as BEAD is a high speed (100M minimum) data service. 
Other offerings, such as telephone and video, may be offered over a funded network that meets the unicast data requirements but those services are subject to additional regulation.Depending on the configuration of the PON network, unicast subscriber traffic will arrive at the access router from the OLT in one of two ways# VLAN per service (N#1 VLAN) or VLAN per subscriber (1#1 VLAN). In the N#1 VLAN model, a service (e.g. high-speed data) is represented by a VLAN. Many (“N”) subscribers access that service via a single (“1”) VLAN. Other services, like voice and video, use other VLANs. This is the simplest solution to deploy and maintain.In the 1#1 VLAN model, each subscriber is assigned their own VLAN. Typically, the 1#1 VLAN model is deployed using Q-in-Q double tagging (C-VLAN + S-VLAN) to overcome the scale limitations of 4096 VLANs. However, managing thousands of customer VLANs can become administratively heavy, especially for provisioning, troubleshooting and documentation.The routed access design for rural broadband can support either VLAN model.High Availability for PONWhether you’re deploying a single OLT shelf, a tree of OLTs or a ring of OLTs, the connection from the OLT to the access router represents a potential point of failure. To protect against the failure of a single port on either the router or the OLT, multiple links are commonly bundled together for a redundant connection. To protect against a router failure, the OLT can be dual-homed to two different routers using EVPN. EVPN supports “all-active” redundant links to two or more routers. The OLT believes that it is connected to a single device using a normal bundle interface.The routed access design for rural broadband supports a single, non-redundant uplink, a bundled interface to the same router, and a bundled interface to two redundant routers with EVPN.Per-subscriber Policies with BNGTo manage individual subscriber sessions, providers can implement Broadband Network Gateway (BNG). BNG acts as a gatekeeper, verifying that only approved subscribers can get access to the network and implementing per-subscriber policies (such as QoS or access-control lists) to improve security and the quality of subscriber experience.BNG manages all aspects of subscriber access including# Authentication, authorization and accounting of subscriber sessions Address assignment Security Policy management Quality of Service (QoS)Cisco’s implementation of BNG is a mature, feature-rich technology that supports many use cases of varying degrees of complexity. 
The RBB design focuses on a simple IP over Ethernet (IPoE) deployment that authenticates subscribers, assigns addresses using DHCP, and applies per-subscriber policy.Implementation DetailsTargets Hardware# NCS540 and NCS5500 as Access Router ASR9903 as Provider Edge (PE) node NCS5500 as Aggregation Node Software# IOS-XR 7.9.2 on NCS540, NCS 5500 IOS-XR 7.8.1 on ASR9000 Key technologies Transport# End-To-End Segment-Routing Services# BGP-based L2 and L3 Virtual Private Network services(EVPN and L3VPN) Testbed OverviewFigure 1# Routed Access For Rural Broadband High Level TopologyDevicesAccess Routers Cisco N540-24Z8Q2C-M (IOS-XR) – PE101, PE102, PE104, PE105 Cisco N540X-16Z4G8Q2C-A (IOS-XR) - PE103 Area Border Routers (ABRs) and Provider Edge Routers# Cisco ASR9000 (IOS-XR) – PE3, PE4Key Resources to Allocate IP Addressing IPv4 address plan IPv6 address plan IS-IS unique instance identifiersRole-Based ConfigurationTransport IOS-XR – All IOS-XR nodesIGP Protocol (ISIS) and Segment Routing MPLS configurationRouter isis configurationkey chain ISIS-KEY key 1 accept-lifetime 00#00#00 january 01 2018 infinite key-string password 00071A150754 send-lifetime 00#00#00 january 01 2018 infinite cryptographic-algorithm HMAC-MD5All Access Routers are part of one IGP domain (ISIS ACCESS).router isis ISIS-ACCESS set-overload-bit on-startup 360 is-type level-2-only net 49.0001.0102.0000.0065.00 nsr nsf cisco log adjacency changes lsp-gen-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 lsp-refresh-interval 65000 max-lsp-lifetime 65535 lsp-password keychain ISIS-KEY lsp-password keychain ISIS-KEY level 1 address-family ipv4 unicast metric-style wide spf-interval maximum-wait 5000 initial-wait 50 secondary-wait 200 segment-routing mpls spf prefix-priority critical tag 5000 spf prefix-priority high tag 1000 ! interface Loopback0 address-family ipv4 unicast prefix-sid index 65 ! !TI-LFA FRR configuration interface HundredGigE0/0/1/0 point-to-point hello-password keychain ISIS-KEY address-family ipv4 unicast fast-reroute per-prefix fast-reroute per-prefix ti-lfa metric 100 ! ! !interface Loopback0 ipv4 address 101.0.2.65 255.255.255.255!MPLS Interface configurationinterface HundredGigE0/0/1/0 mtu 9216 ipv4 address 10.23.65.1 255.255.255.254 ipv4 unreachables disable ipv6 address 2405#23#65##/127 load-interval 30 dampening!QoS Policy-Map ConfigurationThe following represent simple policies to classify on ingress and queue on egress. For simplicity, only two classes and queues are used# high-priority (dscp 46) and default (everything else).class-map match-any match-ef description High priority, EF match dscp 46 end-class-map!class-map match-any match-traffic-class-1 description ~Match highest priority traffic-class 1~ match traffic-class 1 end-class-map!policy-map rbb-egress-queuing class match-traffic-class-1 priority level 1 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map!policy-map core-egress-queuing class match-traffic-class-1 priority level 1 queue-limit 500 us ! class class-default queue-limit 250 ms ! end-policy-map!policy-map rbb-ingress-classifier class match-ef set traffic-class 1 set qos-group 1 ! class class-default set traffic-class 0 set qos-group 0 ! end-policy-map!policy-map core-ingress-classifier class match-ef set traffic-class 1 ! class class-default ! 
end-policy-mapCore-facing Interface Qos Configinterface HundredGigE0/0/1/0 service-policy input core-ingress-classifier service-policy output core-egress-queuingOLT-facing Interface QoS Config interface Bundle-Ether1 service-policy input rbb-ingress-classifier service-policy output rbb-egress-queuingBGP – AccessEnable BGP when your deployment needs any of the following# Multi-homed, redundant connectivity to the OLT using EVPN-enabled LAG Isolation of user traffic into a non-default VRF Per-subscriber policy with BNGVery small networks can use fully meshed BGP peers. Otherwise, a route reflector is recommended.IOS-XR configurationrouter bgp 100 nsr bgp router-id 101.0.2.65 bgp graceful-restart bgp graceful-restart graceful-reset ibgp policy out enforce-modifications address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family l2vpn evpn ! neighbor-group SvRR remote-as 100 update-source Loopback0 address-family vpnv4 unicast ! address-family vpnv6 unicast ! address-family l2vpn evpn ! ! neighbor 100.0.1.201 use neighbor-group SvRR !ServicesEVPN-enabled LAG for Redundant OLT ConnectionsWhen applied to both pe101 and pe102, the following configuration creates a bridge group for VLAN 300 on both routers which share a default anycast gateway represented by the BVI interface. The BVI interface can be assigned a DHCP relay profile, included in the global routing table or put in a separate VRF.VRF (Virtual Routing and Forwarding) is a feature that allows the router to maintain multiple, independent routing tables. Creating a separate VRF for user and Internet traffic isolates the infrastructure layer and protects it from the Internet. Read more about the uses and benefits of VRFs in network design in the Peering Fabric Design.Note that this configuration can be applied even if you only have a single link to a single access router. By configuring a bundle interface from the beginning, you can easily add another to a second router without changing the configuration of the first router.Access Router Service Provisioning#PON-facing Access Interface Configurationinterface TenGigE0/0/0/0 bundle id 1 mode active !interface Bundle-Ether1 lacp system mac 0000.0000.0001 lacp system priority 1!interface Bundle-Ether1.300 l2transport encapsulation dot1q 300 rewrite ingress tag pop 1 symmetric !interface BVI1 host-routing ipv4 address 10.10.65.1 255.255.255.0 local-proxy-arp ipv6 address 3ffe#501#ffff#101##8/64 mac-address 0.0.3EVPN Ethernet-Segment Configurationevpn interface Bundle-Ether1 ethernet-segment identifier type 0 00.00.00.00.00.00.00.00.01L2VPN Bridge-Group Configurationl2vpn bridge group RBB_BRIDGE_GROUP bridge-domain RBB_BRIDGE_DOMAIN interface Bundle-Ether1.300 ! routed interface BVI1 ! evi 3Per-subscriber policies with BNGThe following configuration is used for a deployment of IPoE subscriber sessions. The configuration of some external elements such as the RADIUS authentication server are outside the scope of this document. 
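Although the RADIUS server itself is out of scope, the sketch below shows roughly what the corresponding router-side AAA configuration could look like. This is only an illustration# the server address (10.0.65.10) and key are hypothetical placeholders, and the exact command set should be verified against the BNG documentation for your software release.
aaa group server radius RBB_RADIUS
 server-private 10.0.65.10 auth-port 1812 acct-port 1813
  key cisco123
 !
!
aaa authentication subscriber default group RBB_RADIUS
aaa authorization subscriber RBB_AUTH1_LIST group RBB_RADIUS
aaa accounting subscriber default group RBB_RADIUS
The authorization list name (RBB_AUTH1_LIST) is chosen to match the list referenced by the subscriber control policy-map shown later in this section.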
For more information about the subscriber features and policy (including RADIUS-based QoS, security ACLs, Lawful Intercept and more), see the Broadband Network Gateway Configuration Guide for Cisco ASR 9000 Series Routers.When applied to both pe101 and pe102, the following configuration creates a bridge group for VLAN 311 on both routers that backhauls Layer 2 traffic to PE3 which can authenticate the subscriber, apply per-subscriber policies, assign IP addresses and route traffic to the internet.Access Router Service Provisioning#Interface Configurationinterface TenGigE0/0/0/0 bundle id 1 mode active !interface Bundle-Ether1 lacp system mac 0000.0000.0001 lacp system priority 1 interface Bundle-Ether1.311 l2transport encapsulation dot1q 311 rewrite ingress tag pop 1 symmetricEVPN Ethernet-Segment Configurationevpn interface Bundle-Ether1 ethernet-segment identifier type 0 00.00.00.00.00.00.00.00.01L2VPN Pseudowire Configuration l2vpn xconnect group RBB-PW-BNG p2p RBB-PW-BNG1 interface Bundle-Ether1.311 neighbor evpn evi 10101 service 101Service Router Provisioning#The pseudowire initiated on pe101 and pe102 is terminated on the BNG router, pe3, on a pseudowire headend endpoint (PW-Ether). When the first dhcp request from the subscriber arrives on the pseudowire headend on pe3, pe3 will authenticate the source mac-address and apply the per-subscriber policy.Interface configurationinterface PW-Ether2101 mtu 1518 ipv4 address 182.168.101.1 255.255.255.0 ipv6 address 2000#101#1##1#1/64 ipv6 enable service-policy type control subscriber RBB_IPoE_PWHE attach generic-interface-list PWHE ipsubscriber ipv4 l2-connected initiator dhcp ! ipsubscriber ipv6 l2-connected initiator dhcp!generic-interface-list PWHE interface HundredGigE0/0/0/15DHCP configurationdhcp ipv4 profile RBB_GROUP proxy helper-address vrf default 10.0.65.2 giaddr 101.0.0.3 relay information option allow-untrusted ! interface PW-Ether2101 proxy profile RBB_GROUPL2VPN Pseudowire configurationl2vpn xconnect group rbb-pwhe-bng p2p rbb-pwhe-bng1 interface PW-Ether2101 neighbor evpn evi 10101 service 101Policy-Map and dynamic policy for BNGpolicy-map type control subscriber RBB_IPoE_PWHE event session-start match-first class type control subscriber DHCP46 do-until-failure 1 authorize aaa list RBB_AUTH1_LIST identifier source-address-mac password cisco 10 activate dynamic-template PWHE_PBNG1 ! ! end-policy-map dynamic-template type ipsubscriber PWHE_PBNG1 ipv4 verify unicast source reachable-via rx ipv4 unnumbered Loopback1 ipv4 unreachables disable ipv6 enable !!To enforce per-subscriber rate-limiting, you can define a simple QoS policy#policy-map PLAN_100M class class-default police rate 100 mbps ! ! end-policy-map!And add it to the dynamic template#dynamic-template type ipsubscriber PWHE_PBNG1 service-policy output PLAN_100MSummary# Routed Access for Rural BroadbandThe Routed Access design brings the benefits of Converged SDN Transport to rural broadband networks with simple, flexible, smaller-scale approach to residential broadband. 
No matter what PON solution you deploy, Routed Access reduces the complexity associated with Layer 2 networks by introducing the many benefits of IP and Segment Routing as close to the PON network as possible.", "url": "/blogs/2023-11-15-routed-access-for-rural-broadband/", "author": "Shelly Cadora", "tags": "cisco" } , "blogs-xr-dco-monitoring": { "title": "Monitoring Digital Coherent Optics in IOS-XR", "content": " On This Page Revision History Document Overview Routed Optical Networking Pluggable Digital Coherent Optics Overview OIF 400ZR and OpenZR+ Standards using QSFP-DD Transceivers Cisco High-Power OpenZR+ “Bright” Transceiver (DP04QSDD-HE0) Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S) Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S) Cisco Hardware Support for 400G ZR/ZR+ DCO Transceivers Cisco 8000 NCS 500 Cisco ASR 9000 NCS 5500 and NCS 5700 Pluggable Digital Coherent Optics Detail Component Diagram Component Packaging Coherent DSP Photonic (Optical) Components Photonic Integrated Circuit (PIC) IOS-XR DCO Conventions Optics Controller CoherentDSP Controller DCO Performance Measurement Data Overview TL;DR What PM values to monitor and what to look for? Optical Layer PM Data Optical Power - Transmit Optical Power (Total) - Receive Optical Power (Signal) - Receive Optical Signal to Noise Ratio Chromatic Dispersion Polarization Mode Dispersion Differential Group Delay Polarization Dependent Loss Second-Order Polarization Mode Dispersion Polarization Change Rate General guidelines for acceptable values Digital Layer PM Data Pre-FEC Bit Error Rate Post-FEC BER and Uncorrectable Words Quality Factor (Q-Factor) Quality Margin (Q-Margin) General guidelines for acceptable values Environmental PM Temperature Voltage Laser Bias Current (LBC) IOS-XR DCO Monitoring “Current” Performance Data CLI “show controller optics” Optical layer PM data CLI “show controller coherentdsp” Digital layer PM data YANG data models for “current” PM data retrieval YANG data model for “current” Optical PM data YANG data model for “current” Digital PM data SNMP MIBs for “current” PM data retrieval SNMP MIB for “current” Optical PM data SNMP MIB for “current” Digital PM data Events and Alarms Optics Controller Alarm Thresholds Counted Optical Alarms Counted Digital Alarms Common Optical Alarms Common Digital Alarms Common Alarms on Fiber Cut IOS-XR Performance Measurement Engine and Threshold Crossing Alerts Performance Measurement Engine Overview Performance Measurement History Optics Controller PM Engine metrics Coherent DSP Controller PM Engine metrics Displaying and retrieving PM Engine Data YANG data models for XR PM Engine data retrieval YANG data model for current or historical PM Engine Optical PM data YANG data model for current or historical PM Engine Digital PM data Optical PM Engine data native model path Digital PM Engine data native model path PM Threshold Crossing Alert Overview Performance measurement configuration Threshold crossing alert configuration Optical Controller TCA alert configuration example CoherentDSP (Digital) Controller TCA alert configuration example TCA alert message example Monitoring DCO using Cisco Network Automation Crosswork Hierarchical Controller Link Assurance with PM Data Crosswork Network Controller Device Level Monitoring Key Performance Indicators and KPI Profiles L1 Optics Available PM Metrics L1 Optics KPI Profile L1 optics power data L1 optics temperature data Custom KPIs and KPI Profiles Additional Resources Routed Optical Networking Design Guide Routed Optical Networking 
Landing Page Crosswork Network Automation Home Crosswork Network Controller Crosswork Hierarchical Controller Revision History Version Date Comments 1.0 02/10/2024 Initial Publication Document OverviewIn this blog we will focus on the ongoing monitoring of digital coherent optics (DCO) once they are deployed in the network. More information on the installation and provisioning of DCO optics can be found in the above locations. We will look at the types of Performance Measurement data important to be monitored and how to monitor the PM data through several interfaces, including the router CLI interface, using streaming telemetry, and using Cisco’s Crosswork family of network automation products.Routed Optical NetworkingRouted Optical Networking, introduced by Cisco in 2020, represents a fundamental shift in how IP+Optical networks are built. Collapsing previously disparate network layers and services into a single unified domain, Routed Optical Networking simplifies operations and lowers overall network TCO. More information on Routed Optical Networking can be found at the following locations# https#//www.cisco.com/c/en/us/solutions/service-provider/routed-optical-networking.html https#//xrdocs.io/latest-routed-optical-networking-hldPluggable Digital Coherent Optics OverviewOne of the foundations of Routed Optical Networking is the use of small form-factor pluggable digital coherent optics. These optics can be used in a wide variety of network applications, reducing CapEx/OpEx cost and reducing complexity vs. using traditional external transponder equipment.OIF 400ZR and OpenZR+ Standards using QSFP-DD TransceiversThe networking industry saw an opportunity to improve network efficiency by shifting coherent DWDM functions to router pluggables. Technology advancements have shrunk the DCO components into the standard QSFP-DD form factor, meaning no specialized hardware and the ability to use the highest capacity routers available today. ZR/OpenZR+ QSFP-DD optics can be used in the same ports as the highest speed 400G non-DCO transceivers.Cisco High-Power OpenZR+ “Bright” Transceiver (DP04QSDD-HE0)Cisco OpenZR+ Transceiver (QDD-400G-ZRP-S)Cisco OIF 400ZR Transceiver (QDD-400G-ZR-S)Two industry optical standards have emerged to cover a variety of use cases. The OIF created the 400ZR specification, https#//www.oiforum.com/technical-work/hot-topics/400zr-2, as a 400G interoperable standard for metro reach coherent optics. The industry saw the benefit of the approach, but wanted to cover longer distances and have flexibility in wavelength rates, so the OpenZR+ MSA was created, https#//www.openzrplus.org.The following table outlines the specs of each standard. ZR400 and OpenZR+ transceivers are tunable across the ITU C-Band, 196.1 to 191.3 THz.The following part numbers are used for Cisco’s ZR400 and OpenZR+ MSA transceivers# Standard Part 400ZR QDD-400G-ZR-S OpenZR+ QDD-400G-ZRP-S OpenZR+ High-Power (Bright) DP04QSDD-HE0 The Cisco datasheet for these transceivers can be found at https#//www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/datasheet-c78-744377.html and https#//www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/400g-qsfp-dd-high-power-optical-module-ds.htmlCisco Hardware Support for 400G ZR/ZR+ DCO TransceiversCisco supports the OpenZR+ and OIF ZR transceivers across all IOS-XR product lines with 400G QSFP-DD ports, including the ASR 9000, NCS 540, NCS 5500, NCS 5700, and Cisco 8000. 
Please see the Routed Optical Networking Design or the individual product pages below for more information on each platform.Cisco 8000https#//www.cisco.com/c/en/us/products/collateral/routers/8000-series-routers/datasheet-c78-742571.htmlNCS 500https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-500-series-routers/ncs-540-large-density-router-ds.htmlCisco ASR 9000https#//www.cisco.com/c/en/us/products/collateral/routers/asr-9000-series-aggregation-services-routers/data_sheet_c78-501767.htmlNCS 5500 and NCS 5700https#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-736270.htmlhttps#//www.cisco.com/c/en/us/products/collateral/routers/network-convergence-system-5500-series/datasheet-c78-744698.htmlPluggable Digital Coherent Optics DetailDigital Coherent Optics are complex technology. In this blog we will not go into low-level detail of the inner workings of the optics, but look at some of the basic components to help understand where specific performance measurement data is derived from and how best to monitor the optics.Component DiagramComponent PackagingOn modern DCO like Cisco’s ZR/ZR+, multiple elements are co-packaged, leading to simpler design and greater power efficiency.Coherent DSPAt the heart of “Digital” coherent optics is a Digital Signal Processor (DSP). The definition of a DSP is in the name. A DSP analyzes an incoming signal, typically performs some type of manipulation, and then passes that signal on to another element. DSPs can be very simple, such as one re-sampling digital audio signals, while complex DSPs can perform many functions. The input interface into the DSP and the output interface may or may not be the same format, requiring the DSP to perform the conversion. Signal conversion is just one of many jobs performed by the DSP in a DCO. It is a bit of a stretch to only call the processing component inside the DCO a “DSP” due to the number of functions it performs.We will consider the Analog to Digital Conversion (ADC) and Digital to Analog Conversion (DAC) components part of the “DSP” since they are typically packaged with the actual signal processor and associated digital components. The ADC/DAC is responsible for converting between analog and digital electrical signals, not for converting an electrical signal to an optical signal.Photonic (Optical) ComponentsWhile the DSP takes on many functions performed by photonic components in the past, we still need photonic components to send/receive light across fiber. The optical components within the DCO are similar to those found in non-DCO applications. Common components are the TLA (Tunable Laser Assembly) generating the signal along with TOFs (Tunable Optical Filter) and other pure photonic components such as splitters, combiners, and waveguides. In higher power optics, an integrated EDFA is used to amplify the signal before it leaves the DCO.Photonic Integrated Circuit (PIC)At some point in the signal path a translation between the electrical signal and the optical signal must take place. This is done through the use of modulators and receivers built using Photonic Integrated Circuits in modern DCO. These are commonly referred to as opto-electric modules because they span both signal domains. 
The electrical signal generated by the DSP and then the DAC is used to modulate the transmit optical laser to match the data being sent. Receivers are likewise responsible for processing the optical signal into electrical domain signals which, after some pre-processing, can be processed by the DSP.IOS-XR DCO ConventionsAs we have seen, the DCO has both digital and optical components. IOS-XR represents the optics in a similar manner as a way to better manage each layer, including the PM data we gather and analyze.Optics ControllerThe optics controller on an IOS-XR router represents the physical pluggable faceplate port on the router. Whether the pluggable is DCO or not, there will be an optics controller. The optics controller is used to manage and monitor the physical layer characteristics of an optic. A plug and play gray pluggable optic doesn’t require any physical layer configuration since it’s built into the standard and can only be configured one way. However, there are a number of optical properties which can still be monitored on the optics, such as per-lane receive power.In the case of the DCO, the optics controller is where we configure physical layer properties such as the frequency and output power. It’s also where we monitor physical layer alarms and PM data.CoherentDSP ControllerThe CoherentDSP controller matches the naming conventions of the optics controller. It represents the digital layer of the DCO and is responsible for alarms and PM data associated with the digital signal.DCO Performance Measurement DataOverviewIn this section we will introduce the performance measurement data available with an explanation of what each component is, why it’s important, and what is considered an ideal range of values.TL;DR What PM values to monitor and what to look for?The rest of the document goes into detail about the PM data available and how to monitor them. These, however, are the key values which should be monitored# Total Optical Receive Power (RX Power) Optical Signal Receive Power (RX Signal Power) Q-Factor Q-MarginRX power, Q-Factor, and Q-Margin will have some absolute value once the circuit is operational. Each of these values has known minimums based on the DCO specifications, but will be different for every operational circuit. It’s important to monitor the values for changes.Note changes may occur due to other network changes such as adding more channels on the same fiber span.Optical Layer PM DataOptical Power - TransmitTransmit power, or TX power, represents the strength of the optical signal leaving the DCO. It can be adjusted by user configuration. In most cases leaving the TX power at the default value is advised. On the QDD-400G-ZR-S and QDD-400G-ZRP-S the default power for 400G mode is -10 dBm. On the DP04QSDD-HE0 the default power for 400G mode is 0 dBm. Changes in the TX power can help identify hardware issues. The TX power will fluctuate slightly, but changes of more than 0.2 dBm may indicate an issue.Optical Power (Total) - ReceiveReceive power, or RX power, represents the strength of the optical signal entering the DCO from the line side. There is a direct correlation between the RX power and the quality of the signal. The DCO has a defined receiver sensitivity range. If the RX power drops below the minimum RX sensitivity value, it may not be able to process the incoming signal. It is difficult to give a specific acceptable range for RX power since other impairments such as noise may also degrade the signal even at acceptable RX power levels. 
However, in practice the signal should always be above -20 dBm. The total optical power is measured at the ingress point into the optic. Note that in IOS-XR the minimum reported power is -40 dBm, indicating there is no signal present.Optical Power (Signal) - ReceiveReceive signal power represents the signal strength of the coherent channel the DCO is tuned to receive. It’s important to represent both the total power and the signal power since total power indicates there is some optical signal being received from the far end, and the RX signal power indicates the correct wavelength is being received. If the total RX power is in the normal range and the RX signal power is very low, it indicates a frequency configuration issue. Note that in IOS-XR the minimum reported power is -40 dBm, indicating there is no signal present.Optical Signal to Noise RatioOSNR is one of the key PM values in monitoring an optical signal. Background noise is inherent in almost all analog signals. It can also be introduced into the signal by different photonic elements. Amplifiers can introduce and increase noise. As the primary signal is amplified, so is the noise inherent in analog signals.Measuring the true optical signal to noise ratio requires test measurement equipment which cannot be packaged in the size of a DCO. The DCO estimates the OSNR based on DSP compensation, and the estimate is not precise in all conditions.Chromatic DispersionChromatic Dispersion is a linear optical impairment which degrades the overall signal and must be corrected by the DCO. Different wavelengths travel at different speeds; this can commonly be seen using a prism. 400G signals use 75 GHz of spectrum, which covers a wide band of frequencies, so they do not all travel together. As the signal travels along the fiber the signal spreads out, meaning the receive end must compensate for the delay between the beginning and end of the signal. The spread of the signal is measured in picoseconds/nanometer, or ps/nm.In IOS-XR the user can configure the “sweep” range used to compensate for the chromatic dispersion using the cd-min and cd-max threshold values. The user can also monitor the current estimated Chromatic Dispersion values. These values are derived from calculations done by the DSP during its CD compensation.Polarization Mode DispersionPMD is another type of signal dispersion or spread. PMD is unique to coherent signals and measures the spread in time between the X and Y polarized signals comprising the coherent signal. We don’t measure PMD directly, but several of the following PM values are used to measure the effect of PMD. PMD is typically introduced via imperfections in the fiber or mechanical manipulation of the fiber, such as bending.Differential Group DelayDGD is the measure of PMD and is used to express the difference in arrival time of the two orthogonal signals. DGD can also be known as First Order PMD or FOPMD. DGD is measured in picoseconds. Perfect fiber has no PMD/DGD, but fiber used in the field is not perfect. Modern SM fiber can introduce 0.5-1 ps of DGD over 100 km; it is directly related to fiber length. Optical components can also introduce DGD. DGD is wavelength dependent, so it may be different for different wavelengths; it can also be introduced by temperature fluctuations in the fiber.Polarization Dependent LossSignal polarization can also introduce signal loss as the signal propagates through the fiber. The value is dependent on fiber length and is compensated for by the coherent DSP. The value should remain low in most instances and is measured in dB. 
Keep in mind this is dB and not the dBm that optical power is expressed in.Second-Order Polarization Mode DispersionSOPMD is a measure of the PMD rates of change related to signal frequency. The name is due to SOPMD being the second-order differential with respect to PMD. The DGD value does not remain static, so the DSP must compensate not only for the DGD value but also for the rates of change in DGD. This value is measured in picoseconds squared, or ps^2.Polarization Change RateAs the polarized signals travel through the fiber, the fiber can cause changes in the polarization state or SOP (State of Polarization). The coherent DSP must compensate for these changes to properly decode the signal. The PCR is measured in radians/second or rad/s since the state change is rotational. On shorter fiber spans this value should be 0, but longer spans or poor fiber may introduce higher values.General guidelines for acceptable valuesThe following represents the range of values for each optical PM metric and what is considered a “good” value. The “range” represents the range of reporting on Cisco ZR/ZR+ DCO. Metric Units Range Healthy range Comment Optical Power - Receive dBm or mW -33 to +15 -14 to +8 High alarm is set to +10, low to -24 Optical Signal Power - Receive dBm or mW -33 to +15 -14 to +8 High alarm is set to +10, low to -24 Optical Signal Power - Transmit dBm or mW -15 to +5 -12 to +2 Healthy depends on DCO model and mode OSNR dB 16.5 to 28.5 need comment Healthy range changes dependent on mode CD ps/nm -100 to 100 need comment   DGD ps -10 to 10   need comment PCR rad/s 0-50 need comment   PDL dB       Digital Layer PM DataWe’ve covered optical impairments in the analog optical domain. Ultimately any type of signal degradation can lead to errors at the digital layer.Pre-FEC Bit Error RateHigh-speed coherent optics expect transmission errors at the bit level due to optical signal impairments. Forward error correction uses an algorithm to send extra data with the signal so it can be used to “correct” the original signal when information is lost or incorrect. Modern algorithms are used to minimize the amount of extra data which needs to be sent.The Pre-FEC BER is expressed as a ratio of bit errors per sampled bits, and at the rates being used for ZR/ZR+ optics it is very small. It’s expressed in scientific notation such as 3.7E-04, which is 0.00037. It’s difficult to monitor this as an absolute value, but it can be monitored for change over time to help identify issues with the fiber. Q-Margin, explained below, is an easier value to monitor.Post-FEC BER and Uncorrectable WordsPost-FEC BER measures the number of bits which are unable to be corrected and ultimately lead to UCs, or uncorrectable words/bytes. Any value other than 0 means there is a critical issue with the signal, and in most cases, due to the sharp FEC cliff, it’s unlikely the interface will be up if you are receiving bit errors.Quality Factor (Q-Factor)The Q-Factor is a DSP-calculated value closely related to the Pre-FEC BER and OSNR. The Q-Factor provides a minimum SNR value required to meet a certain BER requirement. Based on the properties of the DCO optics we know that at a certain BER we require a specific SNR; if the BER is very high then we need a higher SNR. The Q-Factor expresses this relationship via a single value. On the DCO the value is based on measured BER over a specific time period.Quality Margin (Q-Margin)The Q-Margin is a calculated value used to convey the health of the overall signal after being processed. 
It indicates how much signal margin exists and, if degradation occurs, how much it can degrade before the signal is lost. A Q-Margin less than 0.5 dB is considered unhealthy. The Q-Margin is useful during circuit turn-up as well as for checking the ongoing health of the circuit.General guidelines for acceptable valuesThe following table highlights some general acceptable values and how the OSNR, Pre-FEC BER, and Q-margin relate to each other. Q-margin Pre-FEC BER OSNR margin Optical channel health < 0.5dB > 1.5E-2 < 1dB Unhealthy 0.5dB to 1dB 1.5E-2 to 1.0E-2 1dB to 2.2dB Acceptable 1.0dB to 1.5dB 1.0E-2 to 7.0E-3 2.2dB to 3.4dB Healthy > 1.5dB < 7.0E-3 > 3.4dB Very Healthy Environmental PMTemperatureModern DCO have multiple temperature sensors. The “temperature” reading reported by the operating system may be the case temperature or the DSP temperature. The laser component typically has its own temperature sensor and is reported as the laser temperature.VoltageThe voltage supplied to the DCO is specified by various standards such as the QSFP-DD MSA specifications. The voltage should be approximately 3.3 V but can vary. Larger fluctuations in the voltage indicate a hardware problem.Laser Bias Current (LBC)The LBC is a measure of the bias current applied to the transmit laser used to maintain stable optical transmit power. This value may change due to fluctuations in voltage and temperature. The value can be measured either as an absolute value in milliamps or a percentage of the operating threshold of the laser.IOS-XR DCO MonitoringThe PM values we’ve introduced will be used in monitoring the health of our DCO circuit. We’ll focus on IOS-XR as the network operating system, but similar methods are usually available with other network operating systems.“Current” Performance DataWhen the user issues show commands such as “show controller optics” and “show controller coherentdsp”, what is shown for PM data is the last read or “current” value of the metric. Certain data is collected at different intervals but in general all data shown has an update interval below 30s. The following tables list the instant PM data available via different methods.CLI “show controller optics” Optical layer PM dataAs you can see below, the PM data we discussed in the overview section is shown when issuing the command. The “RX Power” is the total optical receive power, while the “RX Signal Power” is the power of the specific coherent channel the DCO is tuned to receive. Laser Bias Current = 273.1 mA Actual TX Power = -9.95 dBm Actual TX Power(mW) = 0.10 mW RX Power = -2.92 dBm RX Power(mW) = 0.51 mW RX Signal Power = -3.15 dBm Frequency Offset = -1 MHz Laser Temperature = 51.48 Celsius Laser Age = 0 % LBC High Threshold = 98 % Chromatic Dispersion 2 ps/nm Second Order Polarization Mode Dispersion = 46.00 ps^2 Optical Signal to Noise Ratio = 35.60 dB SNR = 18.80 dB Polarization Dependent Loss = 0.60 dB Polarization Change Rate = 0.00 rad/s Differential Group Delay = 1.00 ps Temperature = 52.00 Celsius Voltage = 3.29 V CLI “show controller coherentdsp” Digital layer PM dataAs you can see below, the PM data we discussed in the overview section is shown when issuing the command. PREFEC BER # 3.1E-04 POSTFEC BER # 0.0E+00 Q-Factor # 10.60 dB Q-Margin # 4.10 dB YANG data models for “current” PM data retrievalThe following YANG model paths can be used to retrieve all of the data shown in the CLI commands. 
Interfaces such as NETCONF, gNMI/GRPC, or native Cisco MDT can be used to retrieve the data on demand, on change, or via periodic subscription.YANG data model for “current” Optical PM dataCisco-IOS-XR-controller-optics-oper#optics-oper/optics-ports/optics-port/optics-info YANG data model for “current” Digital PM dataCisco-IOS-XR-controller-otu-oper#otu/controllers/controller/infoSNMP MIBs for “current” PM data retrievalIn IOS-XR the optics and coherentdsp controllers are modeled as interfaces. The SNMP OIDs used for DCO monitoring will reference the interface ifIndex for the respective controller.SNMP MIB for “current” Optical PM dataThe following Cisco native SNMP MIB can be used to retrieve the current PM data.CISCO-OPTICAL-MIB (OID 1.3.6.1.4.1.9.9.828)IF-MIB##ifDescr.27 = STRING# Optics0/0/0/24 CISCO-OPTICAL-MIB##coiOpticalControllerFrequency.27 = Gauge32# 1940000 100 MHz SNMP MIB for “current” Digital PM dataThe following Cisco native SNMP MIB can be used to retrieve the current PM data.CISCO-OTN-IF-MIB (OID 1.3.6.1.4.1.9.9.639)IF-MIB##ifDescr.65 = STRING# CoherentDSP0/0/0/24 CISCO-OTN-IF-MIB##coiIfControllerPreFECBERMantissa.65 = INTEGER# 140CISCO-OTN-IF-MIB##coiIfControllerPreFECBERExponent.65 = INTEGER# -5Previous values equal PreFEC BER of 1.4e-05Events and AlarmsThere are many alarms associated with the state of the optics. This is not meant to be an exhaustive list of alarms, just highlight some of the more commonly seen alarms during specific events.Platform alarms generated by DCO events are reported via the system log andoutput to remote monitoring tools as syslog messages and SNMP traps if enabled.These alarms will show up when executing a “show alarms” command on the CLI andretrieving alarms via YANG models such as openconfig-alarms andCisco-IOS-XR-alarmgr-server-oper.Optics Controller Alarm ThresholdsThe following PM threshold values are defined for the optics controller. See the table below on user-configurable values. When a threshold is crossed the system will generate an alarm of different severity based on Low vs. High crossing.It is recommended to leave the values at their defaults unless the user has characterized the optical network closely to determine proper values. Parameter High Alarm Low Alarm High Warning Low Warning ------------------------ ---------- --------- ------------ ----------- Rx Power Threshold(dBm) 13.0 -24.0 10.0 -21.0 Rx Power Threshold(mW) 19.9 0.0 10.0 0.0 Tx Power Threshold(dBm) 4.0 -18.0 2.0 -16.0 Tx Power Threshold(mW) 2.5 0.0 1.5 0.0 LBC Threshold(mA) 0.00 0.00 0.00 0.00 Temp. 
Threshold(celsius) 80.00 -5.00 75.00 15.00 Voltage Threshold(volt) 3.46 3.13 3.43 3.16 LBC High Threshold = 98 % Configured CD High Threshold = 160000 ps/nm Configured CD lower Threshold = -160000 ps/nm Configured OSNR lower Threshold = 9.00 dB Configured DGD Higher Threshold = 80.00 ps Parameter Explanation User-Configurable Rx Power Threshold(dBm) Receive power in dBm No Rx Power Threshold(mW) Receive power in mW No Tx Power Threshold(dBm) Transmit power in dBm No Tx Power Threshold(dBm) Transmit power in mW No LBC High Threshold Laser bias % of max Yes, lbc-high-threshold command Temp threshold Case temperature of transceiver No CD High Threshold Chromatic dispersion high Yes, cd-high-threshold command CD Low threshold Chromatic dispersion low Yes, cd-low-threshold command OSNR Low threshold Optical signal to noise ratio Yes, osnr-low-threshold DGD Low threshold Differential group delay low value Yes, dgd-low-threshold command Counted Optical AlarmsBased on either an event or built-in PM threshold, specific alarms increment over time. These are seen with the “show controller optics” CLI command or appropriate YANG model.Alarm Statistics# ------------- HIGH-RX-PWR = 0 LOW-RX-PWR = 0 HIGH-TX-PWR = 0 LOW-TX-PWR = 0 HIGH-LBC = 0 HIGH-DGD = 0 OOR-CD = 0 OSNR = 18 WVL-OOL = 0 MEA = 0 IMPROPER-REM = 0 TX-POWER-PROV-MISMATCH = 0 Alarm Expansion Comment HIGH-RX-PWR RX power too high   HIGH-TX-PWR TX power too high   LOW-RX-PWR RX power too low   LOW-TX-PWR TX power too low   HIGH-LBC Laser bias current too high   HIGH-DGD Diffeential group delay too high   OOR-CD Chromatic dispersion out of range   OSNR Optical signal to noise ratio too low   WVL-OOL     MEA Mismatch equipment alarm Optic speed not supported by port IMPROPER-REM Improper removal Not applicable to IOS-XR routers TX-POWER-PROV-MISMATCH Difference in configured and actual value too large   Counted Digital AlarmsBased on either an event or built-in PM threshold, specific alarms increment over time. 
These are seen with the “show controller coherentdsp” CLI command or appropriate YANG model.Alarm Information#LOS = 40 LOF = 0 LOM = 0OOF = 0 OOM = 0 AIS = 0IAE = 0 BIAE = 0 SF_BER = 0SD_BER = 0 BDI = 0 TIM = 0FECMISMATCH = 0 FEC-UNC = 0 FLEXO_GIDM = 0FLEXO-MM = 0 FLEXO-LOM = 0 FLEXO-RDI = 0FLEXO-LOF = 35Detected Alarms # LOS Alarm Expansion Comment LOS Loss of signal In IOS-XR <24.1.1 this is based on the RX power being below the sensitivity threshold, but possible for signal to still be up LOF Loss of frame Not used for ZR/ZR+ LOM Loss of multi-frame Not used for ZR/ZR+ OOF Out of frame Not used for ZR/ZR+ OOM Out of multi-frame Not used for ZR/ZR+ IAE Incoming alignment error Not used for ZR/ZR+ BIAE Backward incoming alignment error Not used for ZR/ZR+ SF_BER Signal fault due to high Bit-Error rate Not used for ZR/ZR+ SD_BER Signal degrade due to high Bit-Error rate Not used for ZR/ZR+ BDI Backward defect indication Not used for ZR/ZR+ TIM Trace identifier mismatch, OTN TTI mismatch Not used for ZR/ZR+ FECMISMATCH FEC Mismatch between endpoints Not used for ZR/ZR+ FEC-UNC Uncorrectable words   FLEXO-GIDM FlexO framing group ID mismatch Not used for ZR/ZR+ FLEXO-MM FlexO multi-frame mismatch Not used for ZR/ZR+ FLEXO-LOM FlexO framing loss of multi-frame Common alarm FLEXO-RDI FlexO remote defect indicator Not used for ZR/ZR+ FLEXO-LOF FlexO loss of frame Common alarm Common Optical Alarms%PKT_INFRA-FM-3-FAULT_MAJOR # ALARM_MAJOR #OPTICS RX LOS LANE-0 #DECLARE #0/RP0/CPU0# Optics0/0/0/10%PKT_INFRA-FM-3-FAULT_MAJOR # ALARM_MAJOR #OPTICS RX LOL LANE-0 #DECLARE #0/RP0/CPU0# Optics0/0/0/10%PKT_INFRA-FM-3-FAULT_MAJOR # ALARM_MAJOR #OPTICS MEDIA RX CHROMATIC DISPERSION LOSS OF LOCK #DECLARE #0/RP0/CPU0# Optics0/0/0/10%PKT_INFRA-FM-3-FAULT_MAJOR # ALARM_MAJOR #OPTICS MEDIA RX LOSS OF FRAME #DECLARE #0/RP0/CPU0# Optics0/0/0/10%PKT_INFRA-FM-4-FAULT_MINOR # ALARM_MINOR #OSNR #DECLARE #Optics0/0/0/20Common Digital Alarms%PKT_INFRA-FM-3-FAULT_MAJOR # ALARM_MAJOR #OPTICS MEDIA FEC DEGRADED #DECLARE #0/RP0/CPU0# Optics0/0/0/10%PKT_INFRA-FM-3-FAULT_MAJOR # ALARM_MAJOR #OPTICS MEDIA FEC EXCESS DEGRADED #DECLARE #0/RP0/CPU0# Optics0/0/0/10%PKT_INFRA-FM-2-FAULT_CRITICAL # ALARM_CRITICAL #FLEXO-LOF #DECLARE #CoherentDSP0/0/0/10#Common Alarms on Fiber CutThe active alarms on the device can be shown using the “show alarms brief system active”--------------------------------------------------------------------------------Active Alarms (Brief) for 0/RP0--------------------------------------------------------------------------------Location Severity Group Set Time Description -------------------------------------------------------------------------------- 0/RP0/CPU0 Major Software 11/30/2023 12#11#06 PST Optics0/0/0/0 - hw_optics# RX LOS LANE-0 ALARM 0/RP0/CPU0 Major Software 11/30/2023 12#11#06 PST Optics0/0/0/0 - hw_optics# RX POWER LANE-0 LOW ALARM --------------------------------------------------------------------------------Conditions (Brief) for 0/RP0--------------------------------------------------------------------------------Location Severity Group Set Time Description --------------------------------------------------------------------------------0/0 Critical OTN 11/30/2023 12#11#06 PST CoherentDSP0/0/0/0 - Incoming Payload Signal Absent IOS-XR Performance Measurement Engine and Threshold Crossing AlertsAnother class of flexible monitoring and alerting is available to the user usingIOS-XR’s Performance Measurement Engine feature.Performance Measurement Engine OverviewSelect metric data in IOS-XR is 
IOS-XR Performance Measurement Engine and Threshold Crossing Alerts
Another class of flexible monitoring and alerting is available to the user using IOS-XR's Performance Measurement Engine feature.
Performance Measurement Engine Overview
Select metric data in IOS-XR is collected and stored at regular intervals for consumption by the user or management systems. The data by default is collected across four different time periods or "bins/buckets"# 10s, 30s, 15m, and 24h. Within each period the min, max, and avg values during the period are stored. This can take some of the burden off the management system as it no longer needs to calculate these values. The actual collection interval is dependent on the specific metric. As an example, even though the storage bucket is 30 seconds, some data is collected at a faster cadence such as every 5 seconds. The "flex-bin" option uses a period of 10s and is not user configurable. The flex-bin period can be used to mimic the behavior of current/instantaneous PM.
Performance Measurement History
Data collected using the PM Engine is stored on the router for a number of time periods. The following table lists how many historical periods are stored for each time period. Note the data will NOT be retained across a router reload.
Period History buckets Max history
flex-bin (10s) 1 NA
30s 30 15m
15m 32 8h
24h 7 7d
Optics Controller PM Engine metrics
The following table lists all of the available optics controller PM metrics.
Metric Units Definition
LBC mA Laser bias current
OPT dBm in .01 increments Optical power transmit
OPR dBm in .01 increments Optical power receive
CD ps/nm Chromatic dispersion
DGD ps Differential group delay
OSNR dB Optical signal to noise ratio
SOPMD ps^2 Second order polarization mode dispersion
PDL dB Polarization dependent loss
PCR rad/s Polarization change rate
RX_SIG dBm in .01 increments Coherent signal power
FREQ_OFF MHz Frequency offset, difference between expected and actual receive frequency
SNR dB Signal to noise ratio (not OSNR)
Coherent DSP Controller PM Engine metrics
Metric Units Definition
EC-BITS NA Number of error corrected bits in interval
UC-WORDS NA Uncorrectable words in interval
PreFEC BER Rate Pre-FEC bit error rate
PostFEC BER Rate Post-FEC bit error rate
Q dB Quality factor
Q_Margin dB Quality margin
Host-Intf-0-FEC-BER Host side FEC bit error rate
Host-Intf-0-FEC-FERC Host side FEC received corrected
Note the last two metrics are associated with the electrical connection to the Ethernet PHY/NPU.
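The PreFEC BER and Q rows in the coherent DSP table are two views of the same FEC statistic. A quick way to sanity-check reported values is the usual Gaussian approximation relating Q factor and pre-FEC BER; the snippet below is my own illustration (not from the original post) and uses scipy for the inverse complementary error function.

```python
# Hedged illustration (not from the original post): relate the PreFEC BER and Q
# metrics using the common Gaussian approximation Q = sqrt(2) * erfcinv(2 * BER).
import math
from scipy.special import erfcinv

def q_db_from_prefec_ber(ber: float) -> float:
    """Approximate Q factor in dB for a given pre-FEC bit error rate."""
    q_linear = math.sqrt(2) * erfcinv(2 * ber)
    return 20 * math.log10(q_linear)

# The FEC example output later in this section reports PreFEC BER around 3.2E-04
# alongside Q of roughly 10.6 to 10.7 dB, which matches this approximation.
print(round(q_db_from_prefec_ber(3.2e-4), 2))  # ~10.65
```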
Displaying and retrieving PM Engine Data
The data being collected by the PM Engine can be displayed using CLI commands or retrieved using the following YANG models and paths.
The CLI command to output the optics controller PM Engine data is the following#
show controllers optics 0/0/0/10 pm <current,history> <flex-bin,30-sec,15-min,24-hour> optics 1
Note the last "1" is the lane, which will always be 1 for DCO.
This results in the following output#
RP/0/RP0/CPU0#ron-poc-8201-1#show controllers optics 0/0/0/10 pm current 30-sec optics 1
Mon Feb 19 09#53#38.238 PST
Optics in the current interval [09#53#30 - 09#53#38 Mon Feb 19 2024]
Optics current bucket type # Valid
 MIN AVG MAX Operational Configured TCA Operational Configured TCA
 Threshold(min) Threshold(min) (min) Threshold(max) Threshold(max) (max)
LBC[mA ] # 273 273 273 0 NA NO 524 NA NO
OPT[dBm] # -9.98 -9.98 -9.98 -15.09 NA NO 5.00 NA YES
OPR[dBm] # -2.92 -2.92 -2.91 5.00 5.00 YES 8.00 10.00 YES
CD[ps/nm] # 2 2 3 -160000 NA YES 160000 NA YES
DGD[ps ] # 1.00 1.00 1.00 0.00 NA NO 80.00 NA NO
SOPMD[ps^2] # 41.00 44.89 48.00 0.00 NA NO 2000.00 NA NO
OSNR[dB] # 35.10 35.38 35.60 0.00 NA YES 40.00 NA YES
PDL[dB] # 0.60 0.66 0.70 0.00 NA NO 7.00 NA NO
PCR[rad/s] # 0.00 0.00 0.00 0.00 NA NO 2500000.00 NA NO
RX_SIG[dBm] # -3.15 -3.15 -3.14 -10.00 -10.00 YES 1.00 5.00 YES
FREQ_OFF[Mhz]# -14 -13 -12 -3600 NA NO 3600 NA NO
SNR[dB] # 18.90 18.90 18.90 7.00 NA NO 100.00 NA NO
The CLI command to output the coherent DSP controller PM Engine data is the following#
show controllers coherentDSP 0/0/0/10 pm <current,history> <flex-bin,30-sec,15-min,24-hour> fec
This results in the following output#
RP/0/RP0/CPU0#ron-poc-8201-1#show controllers coherentDSP 0/0/0/10 pm current 30-sec fec
Mon Feb 19 09#58#33.576 PST
g709 FEC in the current interval [09#58#30 - 09#58#33 Mon Feb 19 2024]
FEC current bucket type # Valid
 EC-BITS # 729415917 Threshold # 111484000000 TCA(enable) # YES
 UC-WORDS # 0 Threshold # 5 TCA(enable) # YES
 MIN AVG MAX Threshold TCA Threshold TCA
 (min) (enable) (max) (enable)
PreFEC BER # 3.2E-04 3.2E-04 3.2E-04 0E-15 NO 0E-15 NO
PostFEC BER # 0E-15 0E-15 0E-15 0E-15 NO 0E-15 NO
Q[dB] # 10.60 10.60 10.70 0.00 NO 0.00 NO
Q_Margin[dB] # 4.10 4.10 4.10 5.00 YES 0.00 NO
Host-Intf-0-FEC-BER # 0E-15 0E-15 0E-15 0E-15 NO 0E-15 NO
Host-Intf-0-FEC-FERC # 0E-15 0E-15 0E-15 0E-15 NO 0E-15 NO
YANG data models for XR PM Engine data retrieval
The following YANG model paths can be used to retrieve all of the data shown in the CLI commands. Interfaces such as NETCONF, gNMI/GRPC, or native Cisco MDT can be used to retrieve the data on demand, on change, or via periodic subscription. Note the sensor path being used is dependent on whether retrieving current or historical data and the time period/bucket being used for retrieval. In the example given below the parent path is shown and then the path for retrieving the current dataset for the 30 second time period. "second30" can be replaced with minute15, hour24, or flex-bin.
YANG data model for current or historical PM Engine Optical PM data
Cisco-IOS-XR-pmengine-oper#performance-management/optics
Cisco-IOS-XR-pmengine-oper#performance-management-history/global/periodic/optics-history/optics-port-histories
Cisco-IOS-XR-pmengine-oper#performance-management/optics/optics-ports/optics-port/optics-current/optics-second30/optics-second30-optics/optics-second30-optic
YANG data model for current or historical PM Engine Digital PM data
Cisco-IOS-XR-pmengine-oper#performance-management/otu
Cisco-IOS-XR-pmengine-oper#performance-management-history/global/periodic/otu-history
Cisco-IOS-XR-pmengine-oper#performance-management/otu/otu-ports/otu-port/otu-current/otu-second30/otu-second30fecs/otu-second30fec
The data can also be retrieved using the following native YANG models and paths#
Optical PM Engine data native model path
Cisco-IOS-XR-pmengine-oper#performance-management/optics
Digital PM Engine data native model path
Cisco-IOS-XR-pmengine-oper#performance-management/otu/otu-ports/otu-port
The PM Engine data can also be retrieved via the following SNMP MIB
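As a concrete example of model-driven retrieval, the sketch below pulls the optics PM Engine container over NETCONF with the ncclient library. It is my own illustration rather than part of the original post; the host and credentials are placeholders, and the XML namespace follows the usual Cisco-IOS-XR native-model pattern, so verify it against the YANG modules shipped with your software release.

```python
# Hedged sketch (not from the original post): retrieve the PM Engine optics data
# over NETCONF using ncclient. Host/credentials are placeholders; the namespace
# assumes the standard http://cisco.com/ns/yang/<module> pattern for XR models.
from ncclient import manager

PM_OPTICS_FILTER = """
<performance-management xmlns="http://cisco.com/ns/yang/Cisco-IOS-XR-pmengine-oper">
  <optics/>
</performance-management>
"""

with manager.connect(
    host="ron-poc-8201-1",
    port=830,
    username="admin",
    password="admin",
    hostkey_verify=False,
) as m:
    reply = m.get(filter=("subtree", PM_OPTICS_FILTER))
    # The reply carries the same min/avg/max buckets shown in the CLI output above.
    print(reply.xml)
```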
PM Threshold Crossing Alert Overview
User defined TCAs can be set for the metrics the PM infrastructure collects. The TCAs can be set for the min and max values collected, and TCAs can be individually set for each time interval. TCA also includes the ability to report when a min or max TCA has been crossed. TCAs are not stored as system alarms; they are recorded in the main system log and also reported as syslog/SNMP traps if the system is configured to report those. Keep in mind that a TCA alert will be generated every time interval the alert is configured for. If TCA reporting is enabled for a min RX power in the 30s bucket, an alert will be generated every 30s the RX power is below the min threshold.
Performance measurement configuration
Performance measurement for all available metrics is enabled by default on DCO for both the optics controller and coherentdsp controller. See the tables above for a list of all PM metrics collected for each.
Threshold crossing alert configuration
Most metrics collected by the PM infrastructure have pre-defined TCA min/max values, but those can be changed by the user to match their specific deployment. TCA reporting is not enabled by default on the optics and coherentdsp controllers, with one exception# the coherentDSP controller has two TCAs set by the system, "EC-BITS" and "UC-WORDS." The EC-BITS is a measurement of error corrected bits over the time interval and UC-WORDS is a measure of the uncorrectable words post-FEC. These are absolute values and not time-series metrics. The EC-BITS threshold is set by the system based on the current rate of the DCO and should not be changed. Reporting must be enabled for both min and max values for the metric and for specific time intervals.
Optical Controller TCA alert configuration example
This example shows configuring the TCAs via XR CLI, however the TCAs could also be configured using the appropriate YANG models. The following configuration does the following#
Enables TCA reporting for crossing the min threshold for opr, cd, osnr, rx-sig-pow
Enables TCA reporting for crossing the max threshold for opt, opr, cd, osnr, rx-sig-pow
Changes the default optical power receive min threshold to 1 dBm (100*.01)
Changes the default optical power receive max threshold to 10 dBm (1000*.01)
controller Optics0/0/0/10
 pm 30-sec optics report opr min-tca
 pm 30-sec optics report cd min-tca
 pm 30-sec optics report osnr min-tca
 pm 30-sec optics report rx-sig-pow min-tca
 pm 30-sec optics report opt max-tca
 pm 30-sec optics report opr max-tca
 pm 30-sec optics report cd max-tca
 pm 30-sec optics report osnr max-tca
 pm 30-sec optics report rx-sig-pow max-tca
 pm 30-sec optics threshold opr-dbm min 100
 pm 30-sec optics threshold opr-dbm max 1000
You can see below TCAs are now enabled for the appropriate parameters.
RP/0/RP0/CPU0#ron-poc-8201-1#show controllers optics 0/0/0/10 pm current 30-sec optics 1
Thu Feb 22 06#15#22.811 PST
Optics in the current interval [06#15#00 - 06#15#22 Thu Feb 22 2024]
Optics current bucket type # Valid
 MIN AVG MAX Operational Configured TCA Operational Configured TCA
 Threshold(min) Threshold(min) (min) Threshold(max) Threshold(max) (max)
LBC[mA ] # 273 273 273 0 NA NO 524 NA NO
OPT[dBm] # -10.01 -10.00 -9.96 -15.09 NA NO 5.00 NA YES
OPR[dBm] # -2.96 -2.93 -2.91 5.00 5.00 YES 8.00 10.00 YES
CD[ps/nm] # 2 3 4 -160000 NA YES 160000 NA YES
DGD[ps ] # 1.00 1.00 1.00 0.00 NA NO 80.00 NA NO
SOPMD[ps^2] # 43.00 51.22 59.00 0.00 NA NO 2000.00 NA NO
OSNR[dB] # 35.20 35.40 35.60 0.00 NA YES 40.00 NA YES
PDL[dB] # 0.60 0.62 0.70 0.00 NA NO 7.00 NA NO
PCR[rad/s] # 0.00 0.00 0.00 0.00 NA NO 2500000.00 NA NO
RX_SIG[dBm] # -3.14 -3.14 -3.13 -10.00 -10.00 YES 1.00 5.00 YES
FREQ_OFF[Mhz]# -12 -5 -2 -3600 NA NO 3600 NA NO
SNR[dB] # 18.80 18.83 18.90 7.00 NA NO 100.00 NA NO
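Note that the threshold values in the configuration above are entered in hundredths of a dBm, so 100 corresponds to 1 dBm and 1000 to 10 dBm. The small helper below is my own illustration (not from the original post) for generating those lines without unit mistakes; the interface and values are placeholders.

```python
# Hedged helper (not from the original post): emit the optics TCA threshold lines
# used above, converting dBm to the hundredths-of-a-dBm integers the CLI expects.
def opr_threshold_lines(interface: str, min_dbm: float, max_dbm: float) -> list:
    to_units = lambda dbm: int(round(dbm * 100))
    return [
        f"controller Optics{interface}",
        f" pm 30-sec optics threshold opr-dbm min {to_units(min_dbm)}",
        f" pm 30-sec optics threshold opr-dbm max {to_units(max_dbm)}",
    ]

print("\n".join(opr_threshold_lines("0/0/0/10", 1.0, 10.0)))
```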
CoherentDSP (Digital) Controller TCA alert configuration example
This example shows configuring the TCAs via XR CLI, however the TCAs could also be configured using the appropriate YANG models. The following configuration does the following#
Enables TCA reporting for crossing the min threshold for Q-margin
Changes the default Q-margin min threshold to 5 dB (500*.01)
controller CoherentDSP0/0/0/10
 pm 30-sec fec report Q-margin min-tca enable
 pm 30-sec fec threshold Q-margin min 500
You can see below TCA is now enabled for the min threshold for the 30-second bucket.
RP/0/RP0/CPU0#ron-poc-8201-1#show controllers coherentDSP 0/0/0/10 pm current 30-sec fec
Thu Feb 22 06#03#27.771 PST
g709 FEC in the current interval [06#03#00 - 06#03#27 Thu Feb 22 2024]
FEC current bucket type # Valid
 EC-BITS # 4326181146 Threshold # 111484000000 TCA(enable) # YES
 UC-WORDS # 0 Threshold # 5 TCA(enable) # YES
 MIN AVG MAX Threshold TCA Threshold TCA
 (min) (enable) (max) (enable)
PreFEC BER # 2.9E-04 3.1E-04 3.2E-04 0E-15 NO 0E-15 NO
PostFEC BER # 0E-15 0E-15 0E-15 0E-15 NO 0E-15 NO
Q[dB] # 10.60 10.63 10.70 0.00 NO 0.00 NO
Q_Margin[dB] # 4.10 4.10 4.10 5.00 YES 0.00 NO
Host-Intf-0-FEC-BER # 0E-15 8.3E-14 2.3E-10 0E-15 NO 0E-15 NO
Host-Intf-0-FEC-FERC # 0E-15 0E-15 0E-15 0E-15 NO 0E-15 NO
TCA alert message example
The following will be shown in the system logs when a TCA is crossed. This example is when the Q-Margin has dropped below the min value of 5.0. All PM TCA alarms will use the L1-PMENGINE-4-TCA nomenclature when being logged.
RP/0/RP0/CPU0#2024 Feb 5 08#07#30.185 PST# optics_driver[192]# %L1-PMENGINE-4-TCA # Port CoherentDSP0/0/0/10 reports FEC Q-MARGIN-MIN(NE) PM TCA with current value 4.10, threshold 5.00 in current 30-sec interval window
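Because TCA events land in the system log rather than the alarm table, it can be handy to parse them on a syslog collector. The sketch below is my own illustration (not from the original post); the regular expression is written against the single sample message above, with a flexible separator, and may need adjusting for other TCA types.

```python
# Hedged sketch (not from the original post): parse %L1-PMENGINE-4-TCA messages
# like the sample above into structured fields. The regex is based on that one
# sample; the [:#] alternation covers the separator as rendered on this page.
import re

TCA_RE = re.compile(
    r"%L1-PMENGINE-4-TCA\s*[:#]\s*Port\s+(?P<port>\S+)\s+reports\s+(?P<tca>.+?)\s+PM TCA"
    r"\s+with current value\s+(?P<value>[-\d.Ee]+),\s+threshold\s+(?P<threshold>[-\d.Ee]+)"
    r"\s+in current\s+(?P<bucket>\S+)\s+interval window"
)

sample = ("%L1-PMENGINE-4-TCA # Port CoherentDSP0/0/0/10 reports FEC Q-MARGIN-MIN(NE) "
          "PM TCA with current value 4.10, threshold 5.00 in current 30-sec interval window")

match = TCA_RE.search(sample)
if match:
    print(match.groupdict())
    # {'port': 'CoherentDSP0/0/0/10', 'tca': 'FEC Q-MARGIN-MIN(NE)',
    #  'value': '4.10', 'threshold': '5.00', 'bucket': '30-sec'}
```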
Monitoring DCO using Cisco Network Automation
Crosswork Hierarchical Controller
Crosswork Hierarchical Controller, or HCO, has capabilities for monitoring Routed Optical Networking PM for both the IP and Optical Line System layers in the same tool. HCO's Link Assurance application presents an end-to-end multi-layer view of circuits using DCO endpoints and overlays relevant PM data on the view.
Link Assurance with PM Data
HCO's Performance application allows the user to explore the PM data at a device and interface level in a tabular format. The user can query based on a device or set of devices.
Crosswork Network Controller Device Level Monitoring
The Health Insights application within CNC can be used to monitor any available telemetry sensor paths, including the Optical and Digital PMs we've discussed. Health Insights can also be configured to alert based on different criteria such as deviations in measured values or absolute value changes. The Health Insights documentation located here can be used as a reference#
https#//www.cisco.com/c/en/us/td/docs/cloud-systems-management/crosswork-cahi/6-0/User-Guide/cisco-crosswork-CLNA-6-0-user-guide/m_healthinsights600.html
Key Performance Indicators and KPI Profiles
A KPI in Health Insights is used to monitor a specific telemetry sensor PM attribute. In YANG these refer to the YANG operational model leaf values. A set of KPIs are grouped together as part of a KPI Profile. This allows the user to have a set of KPIs applied to a set of devices without having to manage per-device individual KPIs. When a KPI is added or removed from an active profile, telemetry collection will start for the devices using the KPI profile. Health Insights includes a built-in set of pre-defined KPIs under the "Layer1-Optics" category. These sensor paths can be used to monitor both DCO and gray optics.
L1 Optics Available PM Metrics
L1 Optics KPI Profile
L1 optics power data
L1 optics temperature data
Custom KPIs and KPI Profiles
Health Insights allows the user to customize the KPIs being used and group them into KPI Profiles specific to the application. Here we see a KPI Profile being used to monitor RX/TX Power and Q-Margin which can then be applied to devices with DCO.
KPI graph being used specifically for DCO monitoring
Any sensor path leaf returning data as numeric data can be graphed. If the data is non-numeric it cannot be graphed but can be shown in a tabular format. The following shows alerts triggered by our custom KPIs. A critical alarm is raised when the Q-Factor of the DCO drops below .5 for a specific amount of time, and clears when the Q-Factor returns to a nominal value.
Additional Resources
Routed Optical Networking Design Guide
https#//xrdocs.io/design/blogs/latest-routed-optical-networking-hld
Routed Optical Networking Landing Page
https#//www.cisco.com/site/us/en/solutions/routed-optical-networking/index.html
Crosswork Network Automation Home
https#//www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-hierarchical-controller/solution-overview-c22-744695.html
Crosswork Network Controller
https#//www.cisco.com/c/en/us/products/cloud-systems-management/crosswork-network-controller/index.html
Crosswork Hierarchical Controller
https#//www.cisco.com/c/en/us/products/cloud-systems-management/crosswork-hierarchical-controller/index.html", "url": "/blogs/xr-dco-monitoring", "author": "Phil Bedard", "tags": "iosxr, design, optical, ron, routing, zr, controller, dco" } , "#": {} , "blogs-2024-09-17-how-to-restore-ip-traffic-over-optical": { "title": "How to restore IP traffic over optical", "content": " On This Page Protecting IP Traffic Against Fiber Cuts 1. Scope 2. Traffic protection technologies 2.1 IP protection/restoration 2.2 Dedicated optical protection (1+1) 2.3 Optical restoration 2.4 Multilayer restoration (MLR) 3. Which solution should be used for my network? 3.1 IP protection/restoration 3.2 Dedicated optical protection (1+1) 3.3 Optical restoration 3.4 Multilayer restoration 4. Conclusions
Protecting IP Traffic Against Fiber Cuts
1. Scope
Fiber cuts are a common cause of network outages. If you consider the huge amount of traffic that can be carried on a single fiber using modern optical line systems – in the order of 25-50Tb/s – these failures can cause significant service impact. Therefore, the design of the network must take these outages into account – despite the significant cost this adds to the network. While in some parts of the world fiber cuts are rare enough to justify designing the network against a single fiber cut, in densely populated developing countries, with a high amount of construction projects, fiber cuts are quite common, forcing the network designer to design the network against 2 or even 3 simultaneous fiber cuts.
In this document we focus on different solutions to protect IP traffic from fiber cuts. Some of these solutions use optical switching technology and restore traffic in the optical domain. Other solutions use the built-in protection capability of the IP layer to restore the traffic. Finally, some solutions leverage both the optical layer and IP layer to optimally restore the traffic.
2. Traffic protection technologies
The following options for protecting IP traffic against fiber cuts are the most common ones in SP networks.
2.1 IP protection/restoration
The IP network has inherent traffic protection capabilities thanks to fast convergence of its routing protocol. For high priority services, additional mechanisms exist – even fast end-to-end 1#1 protection for some segment routing tunnels. In addition to mechanisms that apply to all traffic types, the IP layer can distinguish between high priority traffic and best-effort traffic and favor the former in case there is not enough capacity to protect everything. This is, by far, the simplest and most flexible solution from an operational perspective since it only involves functionality that is needed anyway, while all other schemes below involve the optical layer, implying additional provisioning and more complex multilayer behavior.
2.2 Dedicated optical protection (1+1)
This is an optical mechanism for protecting an optical connection by duplicating the data carried over it at the source of the wavelength, sending the second copy over a different path, and selecting the best copy of the data at the destination, before the receiver. This is a very fast mechanism, but it requires dedicated hardware per wavelength (called a Photonic Switch Module) and is therefore complex and costly. Both the working and protection paths and wavelengths are static and cannot change if failure conditions require it. From the IP layer perspective, when such protection kicks in, IP links between routers go down and up – in some cases without triggering any IP reconvergence. This scheme is often coupled with optical restoration, and then it is referred to as "1+1+R". For purposes of the analysis below, both schemes can be treated independently – each one with its distinct advantages and disadvantages.
2.3 Optical restoration
A mechanism that utilizes the wavelength switching capability that exists in modern high-end DWDM networks. Upon failure of a fiber, the control plane of the optical layer kicks in (typically, such solutions have been implemented before the days of SDN control) to figure out all impacted optical connections and to dynamically find alternate paths for them around the failure. In some cases, this implies changing the wavelength of the connection, which can take tens of seconds. Therefore, this is not considered a fast mechanism. Indeed, there is no need for it to be fast – all it needs to do is to get the IP layer back to its full capacity after a fiber cut, but before the next fiber cut occurs. Optical restoration is typically based on the vendor control plane (e.g. GMPLS) and has never evolved to support multivendor use cases. Therefore the solution is limited to a closed single-vendor line system. From the IP layer perspective, IP links between routers go down and up, but the process takes a long time and therefore the IP layer will typically reconverge before the optical connection is restored, and then reconverge again once the optical connection is restored.
2.4 Multilayer restoration (MLR)
A more advanced restoration mechanism that considers both the IP layer and the optical layer.
Since it involves both routers and optical gear – often from different vendors – the solution is based on the modern hierarchical SDN control architecture, which involves optical and IP controllers to control the gear and a hierarchical controller (HCO) to coordinate between the IP and optical controllers.
The solution works in a similar way to optical restoration – dynamically finding alternative paths for failed optical connections leveraging optical switching – but it has an important additional flexibility lacking in optical restoration# it can change the capacity of optical connections and trigger a change in the IP layer to take into account the reduced capacity and avoid traffic loss. This flexibility is important because the restored optical path is often much longer than the working path and the modulation format of the signal must be changed to a more robust one to allow the connection to work – trading off reach for capacity.
MLR has another advantage, which comes into play when reverting back to normal, after the fiber cut has been fixed# it can reroute the connection without any traffic loss. This is impossible with optical restoration, since any change of the optical path means that an IP link goes down and traffic is lost. The way MLR achieves this is by coordinating the process with the IP layer, first moving traffic away from the IP links, and only then switching the optical path.
3. Which solution should be used for my network?
The optimal choice of a traffic protection solution highly depends on your network# is it an optical mesh network or a bunch of disparate point-to-point links and rings? Is it using traditional closed DWDM systems or transponders integrated into routers? (IPoDWDM or RON – click here for details.) The choice of solution also depends on your service availability considerations# are you worried about a single fiber cut, or are you prepared to invest to protect your network against multiple simultaneous cuts (or maintenance activities)? In this section we analyze the different options in light of the different network architecture and failure profile you are trying to protect against.
3.1 IP protection/restoration
IP protection is always needed to deal with IP layer failures or maintenance activities. Such failures cannot be protected against by the optical layer. When the IP layer is sized to have sufficient spare capacity to deal with IP failures (most notably a router outage), it can already deal with a single fiber cut quite well (a central router carries more capacity than each of the fibers connecting it to the rest of the network). Therefore, most network planners tend to be satisfied with IP protection if they are not concerned with multiple simultaneous failures. In fact, even if a couple of simultaneous failures are plausible, a careful simulation of the likely failure scenarios and the resulting behavior of the IP layer may reveal that the network is still protected without extra capacity or with just a little extra capacity.
It should be noted that IP protection is very efficient, since it protects the actual traffic carried over the link and not the entire capacity of the link. In most networks, most of the time, the actual traffic is much lower than the link capacity and can easily find room on other IP links (which are mostly equally empty).
In addition, the IP layer can be configured to prefer certain types of traffic should congestion occur. This allows planners to size the network for full protection of all traffic under one failure, but selectively protect only premium traffic under multiple failures. Since most traffic in IP networks is typically best-effort traffic, there is a lot of room for premium traffic to be protected while preempting best-effort traffic – allowing such traffic to survive many simultaneous fiber cuts.
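The kind of failure simulation mentioned above is easy to prototype. The sketch below is my own illustration and not part of the original post. It builds a toy five-link core with purely hypothetical capacities and one hypothetical demand, then checks whether the demand still fits after every combination of one or two fiber cuts.

```python
# Hedged sketch (not from the original post): a toy check of whether an IP core
# still carries a demand after one or two simultaneous fiber cuts.
# Topology, capacities (Gb/s) and the demand are purely hypothetical.
from itertools import combinations
import networkx as nx

LINKS = [("A", "B", 400), ("B", "C", 400), ("A", "C", 400), ("A", "D", 400), ("C", "D", 400)]
DEMAND = ("A", "C", 500)  # 500G of traffic from A to C

def build(skip=()):
    # Model each bidirectional IP link as two directed arcs with the link capacity.
    g = nx.DiGraph()
    for u, v, cap in LINKS:
        if (u, v, cap) in skip:
            continue
        g.add_edge(u, v, capacity=cap)
        g.add_edge(v, u, capacity=cap)
    return g

def survives(cut_links):
    g = build(skip=cut_links)
    src, dst, volume = DEMAND
    if src not in g or dst not in g or not nx.has_path(g, src, dst):
        return False
    flow_value, _ = nx.maximum_flow(g, src, dst)
    return flow_value >= volume

for n_cuts in (1, 2):
    failing = [c for c in combinations(LINKS, n_cuts) if not survives(c)]
    print(f"{n_cuts} simultaneous cut(s): {len(failing)} combination(s) lose traffic")
```

In this toy network every single cut is survivable, but some dual cuts are not, which mirrors the point above about sizing for one failure versus several.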
3.2 Dedicated optical protection (1+1)
Dedicated optical protection is useful for protecting pure optical services that do not enjoy the protection that the IP layer provides – such as high-speed ELINE. This is indeed its main use case. But for IP services this mechanism adds no value – it only adds cost and complexity. While dedicated protection may work well for single fiber cuts, it can't deal well with all dual failure scenarios since the second failure may affect the static protection path. The speed of protection that this scheme provides is not an advantage for IP traffic since some of the protection mechanisms in the IP layer are just as fast. We see very limited use cases that justify using this scheme for protecting IP links.
3.3 Optical restoration
Optical restoration is useful for optical mesh networks, where multiple paths exist to deal with multiple failures. Such deployments are typical in the core but sometimes in large metro networks as well. To allow the control plane to pick any path without many constraints, a highly functional optical switch is required at router locations# directionless switching at a minimum, but full colorless + directionless + contentionless switching provides much better support. Such systems are called CDC-ROADMs (see here for an excellent series of tutorials on the topic). While CDC-ROADMs are deployed in high-end networks, they are expensive and complex, limiting the applicability of optical restoration.
The main drawback of optical restoration is the design complexity it creates in long-haul networks, where a long restoration path cannot have the high capacity that the working path can have. The network designer can cope with this limitation in one of three ways#
1. Run the working path at its highest capacity and accept that not all paths will be restored under all failure combinations. In such cases, let the IP layer protect the traffic. This hybrid solution is operationally complex# when only some paths get restored, the optical team is unsure what to make of paths that fail to restore# is the current network state OK for the IP layer or not?
2. Derate working paths based on the feasible capacity of their restoration path. This solution solves the problem of partial restoration of paths, but it is very inefficient and costly, since even during normal conditions you pay the price of all possible failures, while in practice each failure combination only affects a subset of these paths. In this case, relative operational simplicity is achieved at a high cost per bit.
3. A way to keep working paths running at high capacity but ensure all of them can be restored is based on inserting regenerators into the network. These are used by the restoration path to extend its reach when needed. This solution has its own drawbacks# first, regens are not cheap, so there's still an economic penalty to be paid, but worse is the planning nightmare that the solution creates, since one needs to place regens in strategic locations so that they can be shared by multiple restoration paths under different failure scenarios.
This also has operational repercussions, since the team must make sure on an ongoing basis that the regens placed are still sufficient in light of changes in traffic patterns and traffic growth.
3.4 Multilayer restoration
This scheme is applicable in the same network topologies and requires the same optical switching that optical restoration uses, but for RON networks instead of closed DWDM systems. Optical restoration cannot be used in such networks because of the need to coordinate the configuration of the WDM interface in the router and the optical line system – a task that is impractical for the optical control plane due to the inability to achieve multivendor interoperability, as I explained above.
The big difference between these two schemes, from a network planning perspective, is that for MLR you don't need to worry about matching the capacity of the working path to that of the restoration path. You can run the working path at the highest capacity and let the restoration path use the maximum capacity it can under various failure conditions. This avoids the operational or planning nightmare that the first and third design options for optical restoration suffered from. It also avoids the significant cost increase that the second design option suffered from, but it may require some extra IP layer cost to ensure there is sufficient capacity to protect all traffic should the restoration path have reduced capacity.
Finally, MLR lays the foundation for many other novel capabilities, such as hitless coordinated maintenance between the layers and adaptive use of capacity in the face of optical degradations. Those will be covered in more detail in a subsequent post.
4. Conclusions
In this paper we surveyed different options network planners can use to protect IP services over an optical network. The right solution depends on the specific network architecture and conditions. If fiber cuts are rare or there is no requirement to protect all traffic under simultaneous failure conditions (especially best-effort traffic), the lowest cost and simplest solution to operate is to let the IP layer protect the traffic and keep the optical layer simple.
If multiple simultaneous fiber cuts are a big concern, the optical system is a closed one, and the vendor supports optical restoration, then this is a good solution, potentially justifying the cost of CDC-ROADMs and the additional planning and operational complexity. Under the same conditions, if the optical line system is open and WDM interfaces are integrated into routers, then multilayer restoration is the best solution, with some clear advantages over optical restoration. Finally, dedicated optical protection is rarely a good solution for protecting IP traffic – it adds cost and complexity and provides no value compared to IP layer traffic protection# protecting whole wavelengths is a lot less efficient than protecting the actual traffic that requires protection, since most of the wavelength capacity is typically unused.", "url": "/blogs/2024-09-17-how-to-restore-IP-traffic-over-optical/", "author": "Ori Gerstel", "tags": "iosxr, Optical, RON" } , "#": {} , "#": {} , "#": {} }