Cisco 8000 FIB Scale

13 minutes read

Introduction

Cisco 8000 routers have been widely adopted and deployed by service provider and cloud customers for core and peering roles. For those positions, FIB scale matters.

This article will document how Silicon One Q100 and Q200 based systems can support current and future BGP scale. Based on projected growth, tests will demonstrate they are also future proof. Last, a new feature introduced to increase FIB scale on Q200 will be demonstrated. Telemetry data will also be used to illustrate system capabilities.

What’s The FIB Scale

Forwarding Information Base was defined in RFC1812. This is what Fred Baker wrote in 1995:

Forwarding Information Base (FIB) The table containing the information necessary to forward IP Datagrams, in this document, is called the Forwarding Information Base. At minimum, this contains the interface identifier and next hop information for each reachable destination network prefix.

There is another old-school RFC3222 defining the FIB scale aspect:

Forwarding Information Base Size

Definition: Refers to the number of forwarding information base entries within a forwarding information base.

Discussion: The number of entries within a forwarding information base is one of the key elements that may influence the forwarding performance of a router. Generally, the more entries within the forwarding information base, the longer it could take to find the longest matching network prefix within the forwarding information base.

Measurement units:
Number of routes

Issues:

See Also:
forwarding information base (5.3)
forwarding information base entry (5.4)
forwarding information base prefix distribution (5.7)
maximum forwarding information base size (6.1)

20 years later, those points still matter when it comes to validate FIB scale on a modern router. Not only the number of prefixes is important, but it’s also key to consider prefixes distribution during validation phase. Ideally, a real FIB should be replayed and certification should not rely on artificial lab scenario.

Until now, Cisco 8000 officially supports up to 2M IPv4 and 512k IPv6 prefixes in FIB. This is a multi-dimensional number meaning both address families can be programmed to these maximum values. Released in March 2023, IOS-XR 7.9.1 brings FIB scale increase enhancement and raises this number to 3M IPv4 and 1M IPv6 prefixes. This new feature is covered later in this article.

Did you know? :bulb: A large-scale study realized in 2019 amongst 30 ISP and content providers revealed 95% of traffic was generated by less than 1000 BGP prefixes!

Cisco 8000 FIB Implementation

On Cisco Silicon One, the unicast FIB is stored into two major tables:

  • Longest Prefix Match (LPM) database: stores IPv4 and IPv6 routes
  • Central Exact Match (CEM) database: stores IPv6 host routes (/128)

q100-q200-lpm.png Q100 and Q200 Default LPM and CEM allocation (courtesy CS Lee, Cisco)

LPM and CEM are physically located amongst different Q100 and Q200 ASIC memories, which can be on-die and off-die (e.g TCAM, SRAM, HBM). This is out of this article scope.

Did you know? :bulb: Cisco 8100 systems use Silicon One ASICs which do not have HBM (Q200L, G100). This restricts buffering capacities but also FIB scale and is the reason why they are used for routed DC applications. Please refer to this deployment note for 8100 systems positioning.

FIB Scale Testing Methodology

FIB scale could have been tested artificially by injecting millions of consecutive prefixes. However, this does not match the field reality.

Instead, for the different tests performed in this article, a real customer production profile was used. A major Service Provider FIB was extracted from their live network and has been used to illustrate current resource utilization and validate target future growth.

Those production devices process a full BGP routing view. As of March 2023, this represents 935k IPv4 and 174k IPv6 prefixes.

On top of Internet routes, there is also a high number of internal routes. This customer runs a large-scale ISIS IGP with 28k IPv4 ISIS routes. In details, this IGP contains 18k loopbacks (/32) and 9k P2P links (/30 and /31). This backbone is dual stack and ISIS also carries 2.5k IPv6 prefixes distributed as 1.5k loopbacks (/128) and 1k P2P links (/126). Additional internal routes are carried over iBGP for both address families.

Table below summarizes prefixes origin, volume, and distribution:

Prefixes Current Scale IPv4 Current Scale IPv6
ISIS loopbacks 18k 1.5k
ISIS P2P & LAN 10k 1k
Total ISIS 28k 2.5k
BGP internal, including P2P (IPv4 /30, /31 & IPv6 /126, /127) 80.5k, 40k 34.5k, 1.6k
BGP Internet 935k 174k
Total FIB 1.1M 214k

This FIB was compared to additional ones collected over the last few months from Cloud Providers and Service Provider deployments across Europe, North America and Asia. It’s representative of a typical deployment.

The testbed is configured with Cisco 8000 routers running latest IOS-XR version available (7.9.1) and a Cisco’s inhouse tool to generate routes (routem). Customer’s FIB is replayed in the lab and all prefixes are injected through BGP protocol.

Current Resources utilization

Let’s have a look at prefixes distribution first.
IPv4 distribution is the following:


RP/0/RP0/CPU0:8202-1#sh cef summary
Router ID is 192.0.2.37

IP CEF with switching (Table Version 0) for node0_RP0_CPU0

  Load balancing: L4
  Tableid 0xe0000000 (0x921fb410), Vrfid 0x60000000, Vrid 0x20000000, Flags 0x1019
  Vrfname default, Refcount 1107623
  1107580 routes, 0 protected, 0 reresolve, 0 unresolved (0 old, 0 new), 239237280 bytes
    1107560 rib, 1 lsd, 15 aib, 0 internal, 2 interface, 4 special, 1 default routes
    Prefix masklen distribution:
        unicast: 61609 /32, 11224 /31, 38289 /30, 12009 /29, 3964 /28, 817 /27
                 1058 /26, 4128 /25, 614055 /24, 94219 /23, 108732 /22, 51033 /21
                 41683 /20, 24931 /19, 13801 /18, 8218 /17, 13499 /16, 2037 /15
                 1203 /14, 589 /13, 304 /12, 101 /11, 40 /10, 13 /9
                 17 /8 , 1 /0
      broadcast: 4 /32
      multicast: 1 /24, 1 /4

While IPv6 distribution as follows:


RP/0/RP0/CPU0:8202-1#sh cef ipv6 summary
Router ID is 192.0.2.37
IP CEF with switching (Table Version 0) for node0_RP0_CPU0

  Load balancing: L4
  Tableid 0xe0800000 (0x928b0778), Vrfid 0x60000000, Vrid 0x20000000, Flags 0x1019
  Vrfname default, Refcount 214659
  214637 routes, 0 protected, 0 reresolve, 0 unresolved (0 old, 0 new), 46361592 bytes
    214624 rib, 0 lsd, 7 aib, 0 internal, 0 interface, 6 special, 1 default routes
    Prefix masklen distribution:
        unicast: 4610 /128, 144 /127, 2514 /126, 47 /125, 1042 /124, 2 /120
                 148 /112, 183 /96 , 5367 /64 , 20673 /62 , 24 /61 , 6 /60
                 1 /59 , 22 /58 , 5 /57 , 314 /56 , 2 /53 , 2 /52
                 2 /51 , 1 /49 , 105403 /48 , 1979 /47 , 2345 /46 , 1120 /45
                 11066 /44 , 2617 /43 , 4549 /42 , 1312 /41 , 9970 /40 , 960 /39
                 1527 /38 , 873 /37 , 4778 /36 , 759 /35 , 2054 /34 , 2432 /33
                 20941 /32 , 206 /31 , 519 /30 , 3887 /29 , 122 /28 , 20 /27
                 15 /26 , 8 /25 , 27 /24 , 7 /23 , 6 /22 , 3 /21
                 14 /20 , 1 /19 , 1 /16 , 1 /10 , 1 /0
      multicast: 1 /128, 1 /104, 3 /16                 15 /26 , 8 /25 , 27 /24 , 7 /23 , 6 /22 , 3 /21
                 14 /20 , 1 /19 , 1 /16 , 1 /10 , 1 /0
      multicast: 1 /128, 1 /104, 3 /16

This is a visual representation of the FIB distribution using telemetry:

fib-distribution.png

This corresponds to @bgp4_table and @bgp6_table daily tweets:

bgp-prefixes-distribution.png BGP prefixes distribution as seen by @bgp4_table and @bgp6_table (courtesy Darren O’Connor)

Q100

For this test, a Cisco 8202 fixed system is used. It uses a single Silicon One Q100 ASIC.
For Q100, current LPM utilization is 52%:


RP/0/RP0/CPU0:8202-1#sh controllers npu resources lpmtcam location 0/RP0/CPU0
HW Resource Information
    Name                            : lpm_tcam
    Asic Type                       : Q100

NPU-0
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Green
        OOR State Change Time       : 2023.Mar.20 16:03:51 UTC


OFA Table Information
(May not match HW usage)
        iprte                       : 1107561
        ip6rte                      : 210022
        ip6mcrte                    : 0
        ipmcrte                     : 0

Current Hardware Usage
    Name: lpm_tcam
        Estimated Max Entries       : 100
        Total In-Use                : 52       (52 %)
        OOR State                   : Green
        OOR State Change Time       : 2023.Mar.20 16:03:51 UTC


       Name: v4_lpm
           Total In-Use                : 1107581


       Name: v6_lpm
           Total In-Use                : 210025

It’s expected to have marginal CEM occupancy as number of /128 prefixes is low:


RP/0/RP0/CPU0:8202-1#sh controllers npu resources centralem location 0/RP0/CPU0
HW Resource Information
    Name                            : central_em
    Asic Type                       : Q100

NPU-0
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Green


OFA Table Information
(May not match HW usage)
        brmac                       : 0
        iprte                       : 17
        ip6rte                      : 4610
        mplsrte                     : 5
        puntlptspolicer             : 92
        ipmcrte                     : 0
        ip6mcrte                    : 0
        aclpolicer                  : 0

Current Hardware Usage
    Name: central_em
        Estimated Max Entries       : 100
        Total In-Use                : 3        (3 %)
        OOR State                   : Green

Rest of this article will focus on LPM only.

Q200

For this test, a 88-LC0-36FH linecard is used inside a distributed system. It uses 3 x Silicon One Q200 NPUs.

LPM utilization is slightly lower with 41% resources consumed:


RP/0/RP1/CPU0:8812-1#sh controllers npu resources lpmtcam location 0/1/CPU0
HW Resource Information
    Name                            : lpm_tcam
    Asic Type                       : Q200

NPU-0
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Green


OFA Table Information
(May not match HW usage)
        iprte                       : 1107694
        ip6rte                      : 210110
        ip6mcrte                    : 0
        ipmcrte                     : 0

Current Hardware Usage
    Name: lpm_tcam
        Estimated Max Entries       : 100
        Total In-Use                : 41       (41 %)
        OOR State                   : Green


       Name: v4_lpm
           Total In-Use                : 1107748


       Name: v6_lpm
           Total In-Use                : 210156



NPU-1
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Green


OFA Table Information
(May not match HW usage)
        iprte                       : 1107694
        ip6rte                      : 210110
        ip6mcrte                    : 0
        ipmcrte                     : 0

Current Hardware Usage
    Name: lpm_tcam
        Estimated Max Entries       : 100
        Total In-Use                : 41       (41 %)
        OOR State                   : Green


       Name: v4_lpm
           Total In-Use                : 1107748


       Name: v6_lpm
           Total In-Use                : 210156



NPU-2
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Green


OFA Table Information
(May not match HW usage)
        iprte                       : 1107694
        ip6rte                      : 210110
        ip6mcrte                    : 0
        ipmcrte                     : 0

Current Hardware Usage
    Name: lpm_tcam
        Estimated Max Entries       : 100
        Total In-Use                : 41       (41 %)
        OOR State                   : Green


       Name: v4_lpm
           Total In-Use                : 1107748


       Name: v6_lpm
           Total In-Use                : 210156

Info: Inside a 8800 distributed system, LPM is programmed across all linecards and all NPUs.

Future BGP Growth Support

Now that resources utilization has been checked against a current production deployment for both Q100 and Q200 ASICs, it’s time to simulate 5-year growth. This will be mainly driven by the BGP table growth.

APNIC provides BGP table forecasts in its yearly BGP table report.

For IPv4, data have shown future growth follows a linear model with 150 additional prefixes daily. It’s expected to have ~ 1.2M IPv4 routes in 2028. This is the value which will be tested.

ipv4-bgp-2028.png

IPv4 BGP table predictions (courtesy Geoff Huston, APNIC)

For IPv6, things are more complex: linear and exponential models give between 400k and 1.2M IPv6 routes in January 2028, which is a huge difference. An average value of 800k will be tested.

ipv6-bgp-2028.png Projections of IPv6 BGP table size (courtesy Geoff Huston, APNIC)

To make the future FIB realistic enough, current prefixes distribution will be applied for additional injected prefixes. This means most additional IPv4 prefixes will be driven by /24 which are disaggregated from bigger blocks, and most additional IPv6 will be driven by recently allocated and advertised /48, /32, /44 etc.

Q100

With this 2028 profile, Q100 LPM utilization reaches 94%. It’s interesting to note system does not enter into Red Out of Resource state (95% utilization) and there are still free LPM entries:


RP/0/RP0/CPU0:8202-1#sh controllers npu resources lpmtcam location 0/RP0/CPU0
HW Resource Information
    Name                            : lpm_tcam
    Asic Type                       : Q100

NPU-0
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Yellow
        OOR State Change Time       : 2023.Mar.23 07:28:01 UTC


OFA Table Information
(May not match HW usage)
        iprte                       : 1398110
        ip6rte                      : 865460
        ip6mcrte                    : 0
        ipmcrte                     : 0

Current Hardware Usage
    Name: lpm_tcam
        Estimated Max Entries       : 100
        Total In-Use                : 94       (94 %)
        OOR State                   : Yellow
        OOR State Change Time       : 2023.Mar.23 07:28:01 UTC


       Name: v4_lpm
           Total In-Use                : 1398131


       Name: v6_lpm
           Total In-Use                : 865455

10 years after FCS, Q100 based systems will still be able to handle the BGP full view and extra prefixes.

Q200

With the 2028 profile, Q200 has more room for margin with 84% LPM utilization:


RP/0/RP1/CPU0:8812-1#sh controllers npu resources lpmtcam location 0/1/CPU0
HW Resource Information
    Name                            : lpm_tcam
    Asic Type                       : Q200

NPU-0
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Yellow
        OOR State Change Time       : 2023.Mar.23 14:13:53 UTC


OFA Table Information
(May not match HW usage)
        iprte                       : 1398243
        ip6rte                      : 865548
        ip6mcrte                    : 0
        ipmcrte                     : 0

Current Hardware Usage
    Name: lpm_tcam
        Estimated Max Entries       : 100
        Total In-Use                : 84       (84 %)
        OOR State                   : Yellow
        OOR State Change Time       : 2023.Mar.23 14:13:53 UTC


       Name: v4_lpm
           Total In-Use                : 1398297


       Name: v6_lpm
           Total In-Use                : 865594
snip

With 84% utilization, it’s also expected to hit Yellow threshold (default is 80%).

8000 FIB Scale Increase

Starting IOS-XR 7.9.1, a new hw-module profile has been introduced to increase Cisco 8000 FIB scale up to 3M IPv4 and 1M IPv6 prefixes: hw-module profile route scale lpm tcam-banks

This profile is only supported on Q200 ASICs. System must be reloaded after activation:


RP/0/RP0/CPU0:8201-32FH(config)#hw-module profile route scale lpm tcam-banks
Mon Mar 20 22:05:53.261 UTC
In order to activate/deactivate this Scale LPM TCAM Banks profile, you must manually reload the chassis/all line cards
RP/0/RP0/CPU0:8201-1(config)#commit
Mon Mar 20 22:05:56.956 UTC
RP/0/RP0/CPU0:Mar 20 22:05:56.994 UTC: npu_drvr[316]: %FABRIC-NPU_DRVR-3-HW_MODULE_PROFILE_CHASSIS_CFG_CHANGED : Hw-module profile config changed for "route scale lpm high", do chassis reload to get the most recent config updated

ASIC resources are finite: once the profile is activated, some TCAM banks previously allocated for classification ACL will be used to store prefixes. Consequence is ACL scale is reduced (from 9.5k IPv4 entries to 7.5k entries, or from 5k IPv6 entries to 4k entries). It also impacts other features leveraging TCAM classification (BGP Flowspec, LPTS).

This is what LPM utilization looks like for the same 5-year growth projection:


RP/0/RP0/CPU0:8201-32FH#sh controllers npu resources lpmtcam location 0/RP0/CPU0
Thu Mar 23 14:14:56.905 UTC
HW Resource Information
    Name                            : lpm_tcam
    Asic Type                       : Q200

NPU-0
OOR Summary
        Estimated Max Entries       : 100
        Red Threshold               : 95 %
        Yellow Threshold            : 80 %
        OOR State                   : Green


OFA Table Information
(May not match HW usage)
        iprte                       : 1398116
        ip6rte                      : 865463
        ip6mcrte                    : 0
        ipmcrte                     : 0

Current Hardware Usage
    Name: lpm_tcam
        Estimated Max Entries       : 100
        Total In-Use                : 42       (42 %)
        OOR State                   : Green


       Name: v4_lpm
           Total In-Use                : 1398141


       Name: v6_lpm
           Total In-Use                : 865467

With only 42% LPM utilization, this profile gives even more room for growth for 2030 and beyond.

Current (2023) utilization is illustrated in next section with telemetry.

FIB Telemetry

CEF prefixes distribution can be collected with following CEF sensor-group:

sensor-group CEF
  sensor-path Cisco-IOS-XR-fib-common-oper:fib/nodes/node/protocols/protocol/fib-summaries/fib-summary/prefix-masklen-distribution/unicast-prefixes

When using telegraf, additional configuration must be used to treat mask length as a tag:

embedded_tags = ["Cisco-IOS-XR-fib-common-oper:fib/nodes/node/protocols/protocol/fib-summaries/fib-summary/prefix-masklen-distribution/unicast-prefix/mask-length"]

Here is sample Grafana visualization: fib-distribution.png

Cisco 8000 LPM TCAM and other NPU resources can be monitored with Open Forwarding Agent (OFA) sensor-group:

sensor-group OFA
  sensor-path Cisco-IOS-XR-platforms-ofa-oper:ofa

As CEF, an extra configuration must be added to create tags when using telegraf:

embedded_tags = [
"Cisco-IOS-XR-platforms-ofa-oper:ofa/stats/nodes/node/Cisco-IOS-XR-8000-platforms-npu-resources-oper:hw-resources-datas/hw-resources-data/npu-hwr/bank/bank-name",
"Cisco-IOS-XR-platforms-ofa-oper:ofa/stats/nodes/node/Cisco-IOS-XR-8000-platforms-npu-resources-oper:hw-resources-datas/hw-resources-data/npu-hwr/bank/lt-hwr/name"
]

Here is sample Grafana visualization of the current 2023 profile on a Q200 system: fib-ofa-telemetry.png

Conclusion

These tests measured Cisco 8000 current and future FIB scale. While there is no immediate need, the new FIB scale increase feature was demonstrated to support even higher scale. Resources monitoring and visualization was done using both traditional CLI and streaming telemetry techniques. This ultimately confirms 8000 routers can deal with peering and core FIB requirements.

Acknowledgement

Special thanks to Pamela Krishnamachary, Uzair Rehman and Selvam Ramanathan.

Leave a Comment