Scaling Cisco 8000 BGP table to 25M paths
Introduction
It’s common for network operators Internet border routers to receive and process several full routing tables. They can then influence traffic leaving and entering their networks using route policies to manipulate BGP attributes.
For such role, a recurrent question is how many full tables can be processed by a device? How many routes can be installed into RIB? How many BGP paths can the router hold? This article will answer this question for Cisco 8000 Series routers, illustrated with a practical demonstration.
BGP table, RIB, FIB: the basics
There is still some confusion between BGP table, RIB, FIB and their associated scale; especially when several full tables are processed by a single device.
Here is a quick refresh on the basics:
- Internet Border router receives first BGP table, and stores prefixes in BGP table along with all their attributes. There is 1 path per prefix:
RP/0/RP0/CPU0:8201-32FH#sh bgp 173.37.145.0/24
Fri Sep 22 09:30:28.130 UTC
BGP routing table entry for 173.37.145.0/24
Versions:
Process bRIB/RIB SendTblVer
Speaker 1663360696 1663360696
Last Modified: Sep 22 09:17:58.867 for 00:12:29
Paths: (1 available, best #1)
Not advertised to any peer
Path #1: Received by speaker 0
Not advertised to any peer
64537 65010 65000 207564 6939 109, (received & used)
10.70.79.79 from 10.70.79.79 (13.37.79.79)
Origin IGP, localpref 100, valid, external, best, group-best
Received Path ID 0, Local Path ID 1, version 1663360696
Origin-AS validity: (disabled)
RP/0/RP0/CPU0:8201-32FH#
- Additional full tables are received from other BGP neighbors. The number of paths for the same prefix increases. Attributes could vary (different AS path, communities, etc.) for those extra paths:
RP/0/RP0/CPU0:8201-32FH#sh bgp 173.37.145.0/24
Fri Sep 22 09:33:32.621 UTC
BGP routing table entry for 173.37.145.0/24
Versions:
Process bRIB/RIB SendTblVer
Speaker 1663360696 1663360696
Last Modified: Sep 22 09:17:58.867 for 00:15:33
Paths: (2 available, best #1)
Not advertised to any peer
Path #1: Received by speaker 0
Not advertised to any peer
64537 65010 65000 207564 6939 109, (received & used)
10.70.79.79 from 10.70.79.79 (13.37.79.79)
Origin IGP, localpref 100, valid, external, best, group-best
Received Path ID 0, Local Path ID 1, version 1663360696
Origin-AS validity: (disabled)
Path #2: Received by speaker 0
Not advertised to any peer
64537 65011 65000 207564 6939 109, (received & used)
11.70.79.79 from 11.70.79.79 (13.37.79.79)
Origin IGP, localpref 100, valid, external
Received Path ID 0, Local Path ID 0, version 0
Origin-AS validity: (disabled)
RP/0/RP0/CPU0:8201-32FH#
- BGP Best Path algorithm kicks in, and by default it selects 1 single path: it’s simply called the best path.
RP/0/RP0/CPU0:8201-32FH#sh bgp 173.37.145.0/24
Fri Sep 22 09:33:32.621 UTC
BGP routing table entry for 173.37.145.0/24
Versions:
Process bRIB/RIB SendTblVer
Speaker 1663360696 1663360696
Last Modified: Sep 22 09:17:58.867 for 00:15:33
Paths: (2 available, best #1)
Not advertised to any peer
Path #1: Received by speaker 0
Not advertised to any peer
64537 65010 65000 207564 6939 109, (received & used)
10.70.79.79 from 10.70.79.79 (13.37.79.79)
Origin IGP, localpref 100, valid, external, best, group-best
Received Path ID 0, Local Path ID 1, version 1663360696
Origin-AS validity: (disabled)
Path #2: Received by speaker 0
Not advertised to any peer
64537 65011 65000 207564 6939 109, (received & used)
11.70.79.79 from 11.70.79.79 (13.37.79.79)
Origin IGP, localpref 100, valid, external
Received Path ID 0, Local Path ID 0, version 0
Origin-AS validity: (disabled)
RP/0/RP0/CPU0:8201-32FH#
RP/0/RP0/CPU0:8201-32FH#sh bgp 173.37.145.0/24 bestpath-compare
Fri Sep 22 09:39:59.671 UTC
BGP routing table entry for 173.37.145.0/24
Versions:
Process bRIB/RIB SendTblVer
Speaker 1663360696 1663360696
Flags: 0x00023001+0x28010000;
Last Modified: Sep 22 09:17:58.867 for 00:22:00
Paths: (16 available, best #1)
Not advertised to any peer
Path #1: Received by speaker 0
Flags: 0x3000000001068001+0x00, import: 0x020
Not advertised to any peer
64537 65010 65000 207564 6939 109, (received & used)
10.70.79.79 from 10.70.79.79 (13.37.79.79), if-handle 0x00000000
Origin IGP, localpref 100, valid, external, best, group-best
Received Path ID 0, Local Path ID 1, version 1663360696
Origin-AS validity: (disabled)
best of AS 64537, Overall best
Path #2: Received by speaker 0
Flags: 0x3000000000028001+0x00, import: 0x020
Not advertised to any peer
64537 65011 65000 207564 6939 109, (received & used)
11.70.79.79 from 11.70.79.79 (13.37.79.79), if-handle 0x00000000
Origin IGP, localpref 100, valid, external
Received Path ID 0, Local Path ID 0, version 0
Origin-AS validity: (disabled)
Higher neighbor address than best path (path #1)
-- snip --
- The BGP best path is then programmed into the routing table, often referred as RIB (Routing Information Base). RIB stores these routes and selects the best ones from among all other routing protocols. For Internet routes, they usually come from a single protocol: BGP.
RP/0/RP0/CPU0:8201-32FH#sh route 173.37.145.0/24
Fri Sep 22 09:43:26.021 UTC
Routing entry for 173.37.145.0/24
Known via "bgp 65537", distance 20, metric 0
Tag 64537, type external
Installed Sep 22 09:18:34.535 for 00:24:51
Routing Descriptor Blocks
10.70.79.79, from 10.70.79.79, BGP external
Route metric is 0
No advertising protos.
RP/0/RP0/CPU0:8201-32FH#
- Last, hardware must be programmed: this happens at Forwarding Information Base (FIB) level. In IOS XR, RIB downloads the set of selected best routes to the FIB processes using the Bulk Content Downloader (BCDL) process, onto each line card.
RP/0/RP0/CPU0:8201-32FH#sh cef 173.37.145.0/24
Fri Sep 22 09:43:52.319 UTC
173.37.145.0/24, version 3162068783, internal 0x5000001 0x40 (ptr 0xedca25a8) [1], 0x0 (0x0), 0x0 (0x0)
Updated Sep 22 09:18:35.153
Prefix Len 24, traffic index 0, precedence n/a, priority 4
gateway array (0x9a80a1e8) reference count 1023547, flags 0x2010, source rib (7), 0 backups
[1 type 3 flags 0x48441 (0xf782e658) ext 0x0 (0x0)]
LW-LDI[type=0, refc=0, ptr=0x0, sh-ldi=0x0]
gateway array update type-time 1 Sep 22 09:17:37.734
LDI Update time Sep 22 09:17:37.734
Level 1 - Load distribution: 0
[0] via 10.70.79.79/32, recursive
via 10.70.79.79/32, 3 dependencies, recursive, bgp-ext [flags 0x6020]
path-idx 0 NHID 0x0 [0x9b7be8a8 0x0]
next hop 10.70.79.79/32 via 10.70.79.79/32
Load distribution: 0 (refcount 1)
Hash OK Interface Address
0 Y HundredGigE0/0/0/5.10 10.70.79.79
RP/0/RP0/CPU0:8201-32FH#
Note: The default behavior described here can change if BGP multipath or add-path feature is used, or if the device processes Internet table inside different VRFs. This is outside of this article scope.
Scaling Cisco 8000 BGP table to 25M paths: practical demonstration
This section demonstrates Cisco 8000 capability to scale to more than 20M BGP paths.
To realize this test, a 8201-32FH router running IOS XR 7.9.2 is used as Device Under Test (DUT). 8201-32FH uses Intel Xeon D-1530 CPU @ 2.40GHz along with 32GB of RAM. This router has multiple sub interfaces configured with a shadow router. This shadow router contains both IPv4 BGP table and IPv6 BGP table which process RIPE Routing Information Service (RIS) Live updates. Testbed is shown in below topology:
24 x eBGP sessions are established to receive multiple copies of IPv4 global Internet routing table.
RP/0/RP0/CPU0:8201-32FH#sh bgp sum
Wed Sep 13 13:04:40.082 UTC
BGP router identifier 13.37.1.1, local AS number 65537
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0xe0000000 RD version: 551067265
BGP main routing table version 551067265
BGP NSR Initial initsync version 1025263 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
BGP scan interval 60 secs
BGP is operating in STANDALONE mode.
Process RcvTblVer bRIB/RIB LabelVer ImportVer SendTblVer StandbyVer
Speaker 551067265 551067265 551067265 551067265 551067265 0
Neighbor Spk AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down St/PfxRcd
1.63.51.21 0 65001 0 0 0 0 0 00:00:00 Active
10.70.79.79 0 65010 5280369 1719 551067265 0 0 1d04h 1010399
11.70.79.79 0 65011 5217380 1719 551067265 0 0 1d04h 1010399
12.70.79.79 0 65012 5205922 1719 551067265 0 0 1d04h 1010399
13.70.79.79 0 65013 5189440 1719 551067265 0 0 1d04h 1010399
14.70.79.79 0 65014 5196088 1719 551067265 0 0 1d04h 1010399
15.70.79.79 0 65015 5184072 1719 551067265 0 0 1d04h 1010399
16.70.79.79 0 65016 5187155 1719 551067265 0 0 1d04h 1010399
17.70.79.79 0 65017 5189922 1719 551067265 0 0 1d04h 1010399
18.70.79.79 0 65018 5197089 1719 551067265 0 0 1d04h 1010399
19.70.79.79 0 65019 5201890 1719 551067265 0 0 1d04h 1010399
20.70.79.79 0 65020 5201917 1719 551067265 0 0 1d04h 1010399
21.70.79.79 0 65021 5194267 1719 551067265 0 0 1d04h 1010399
22.70.79.79 0 65022 5194232 1719 551067265 0 0 1d04h 1010399
23.70.79.79 0 65023 5191012 1719 551067265 0 0 1d04h 1010399
24.70.79.79 0 65024 5193058 1719 551067265 0 0 1d04h 1010399
25.70.79.79 0 65025 5192615 1719 551067265 0 0 1d04h 1010399
26.70.79.79 0 65026 5197328 1719 551067265 0 0 1d04h 1010399
27.70.79.79 0 65027 5209301 1721 551067265 0 0 1d04h 1010399
28.70.79.79 0 65028 5191256 1719 551067265 0 0 1d04h 1010399
29.70.79.79 0 65029 5216363 1719 551067265 0 0 1d04h 1010399
30.70.79.79 0 65030 5192554 1723 551067265 0 0 1d04h 1010399
31.70.79.79 0 65031 5198880 1719 551067265 0 0 1d04h 1010399
32.70.79.79 0 65032 5192662 1719 551067265 0 0 1d04h 1010399
33.70.79.79 0 65033 5187821 1719 551067265 0 0 1d04h 1010399
And similarly for IPv6, 24 x eBGP sessions are setup:
RP/0/RP0/CPU0:8201-32FH#sh bgp ipv6 unicast summary
Tue Sep 12 12:35:46.551 UTC
BGP router identifier 13.37.1.1, local AS number 65537
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0xe0800000 RD version: 6546135
BGP main routing table version 6546135
BGP NSR Initial initsync version 165973 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
BGP scan interval 60 secs
BGP is operating in STANDALONE mode.
Process RcvTblVer bRIB/RIB LabelVer ImportVer SendTblVer StandbyVer
Speaker 6546135 6546135 6546135 6546135 6546135 0
Neighbor Spk AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down St/PfxRcd
2001:db8:10:7079::79
0 65010 5597166 1722 497765891 0 0 1d04h 186594
2001:db8:11:7079::79
0 65011 5597049 1722 497765891 0 0 1d04h 186594
2001:db8:12:7079::79
0 65012 5597016 1722 497765891 0 0 1d04h 186594
2001:db8:13:7079::79
0 65013 5601651 1722 497765891 0 0 1d04h 186594
2001:db8:14:7079::79
0 65014 5599112 1722 497765891 0 0 1d04h 186594
2001:db8:15:7079::79
0 65015 5588400 1722 497765891 0 0 1d04h 186594
2001:db8:16:7079::79
0 65016 5598537 1722 497765891 0 0 1d04h 186594
2001:db8:17:7079::79
0 65017 5601125 1722 497765891 0 0 1d04h 186594
2001:db8:18:7079::79
0 65018 5576339 1722 497765891 0 0 1d04h 186594
2001:db8:19:7079::79
0 65019 5601651 1722 497765891 0 0 1d04h 186594
2001:db8:20:7079::79
0 65020 5582612 1722 497765891 0 0 1d04h 186594
2001:db8:21:7079::79
0 65021 5591860 1722 497765891 0 0 1d04h 186594
2001:db8:22:7079::79
0 65022 5581286 1722 497765891 0 0 1d04h 186594
2001:db8:23:7079::79
0 65023 5598931 1722 497765891 0 0 1d04h 186594
2001:db8:24:7079::79
0 65024 5582482 1722 497765891 0 0 1d04h 186594
2001:db8:25:7079::79
0 65025 5584450 1722 497765891 0 0 1d04h 186594
2001:db8:26:7079::79
0 65026 5598266 1722 497765891 0 0 1d04h 186594
2001:db8:27:7079::79
0 65027 5600159 1722 497765891 0 0 1d04h 186594
2001:db8:28:7079::79
0 65028 5592619 1722 497765891 0 0 1d04h 186594
2001:db8:29:7079::79
0 65029 5596220 1722 497765891 0 0 1d04h 186594
2001:db8:30:7079::79
0 65030 5581874 1722 497765891 0 0 1d04h 186594
2001:db8:31:7079::79
0 65031 5600379 1722 497765891 0 0 1d04h 186594
2001:db8:32:7079::79
0 65032 5590597 1722 497765891 0 0 1d04h 186594
2001:db8:33:7079::79
0 65033 5600118 1722 497765891 0 0 1d04h 186594
2001:db8:1337:0:1:63:51:21
0 65001 0 0 0 0 0 00:00:00 Active
This represents a total of ~ 28M BGP paths spread across 48 x eBGP sessions:
RP/0/RP0/CPU0:8201-32FH#sh bgp scale
Wed Sep 13 13:07:29.428 UTC
VRF: default
Neighbors Configured: 50 Established: 48
Address-Family Prefixes Paths PathElem Prefix Path PathElem
Memory Memory Memory
IPv4 Unicast 1017385 24373748 1017385 143.60MB 2.00GB 103.82MB
IPv6 Unicast 188244 4486597 188244 28.72MB 376.53MB 19.21MB
------------------------------------------------------------------------------
Total 1205629 28860345 1205629 172.32MB 2.37GB 123.03MB
Total VRFs Configured: 0
Going in details, each IPv4 prefix has 24 paths:
RP/0/RP0/CPU0:8201-32FH#sh bgp ipv4 unicast dfa-regex _109_
Fri Sep 22 09:45:05.247 UTC
BGP router identifier 13.37.1.1, local AS number 65537
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0xe0000000 RD version: 1664830990
BGP main routing table version 1664830990
BGP NSR Initial initsync version 970816 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
BGP scan interval 60 secs
Status codes: s suppressed, d damped, h history, * valid, > best
i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 12.5.186.0/23 10.70.79.79 0 64537 65010 65000 52320 1299 7018 109 i
* 11.70.79.79 0 64537 65011 65000 52320 1299 7018 109 i
* 12.70.79.79 0 64537 65012 65000 52320 1299 7018 109 i
* 13.70.79.79 0 64537 65013 65000 52320 1299 7018 109 i
* 14.70.79.79 0 64537 65014 65000 52320 1299 7018 109 i
* 15.70.79.79 0 64537 65015 65000 52320 1299 7018 109 i
* 16.70.79.79 0 64537 65016 65000 52320 1299 7018 109 i
* 17.70.79.79 0 64537 65017 65000 52320 1299 7018 109 i
* 18.70.79.79 0 64537 65018 65000 52320 1299 7018 109 i
* 19.70.79.79 0 64537 65019 65000 52320 1299 7018 109 i
* 20.70.79.79 0 64537 65020 65000 52320 1299 7018 109 i
* 21.70.79.79 0 64537 65021 65000 52320 1299 7018 109 i
* 22.70.79.79 0 64537 65022 65000 52320 1299 7018 109 i
* 23.70.79.79 0 64537 65023 65000 52320 1299 7018 109 i
* 24.70.79.79 0 64537 65024 65000 52320 1299 7018 109 i
* 25.70.79.79 0 64537 65025 65000 52320 1299 7018 109 i
* 26.70.79.79 0 64537 65026 65000 52320 1299 7018 109 i
* 27.70.79.79 0 64537 65027 65000 52320 1299 7018 109 i
* 28.70.79.79 0 64537 65028 65000 52320 1299 7018 109 i
* 29.70.79.79 0 64537 65029 65000 52320 1299 7018 109 i
* 30.70.79.79 0 64537 65030 65000 52320 1299 7018 109 i
* 31.70.79.79 0 64537 65031 65000 52320 1299 7018 109 i
* 32.70.79.79 0 64537 65032 65000 52320 1299 7018 109 i
* 33.70.79.79 0 64537 65033 65000 52320 1299 7018 109 i
-- snip --
And the same for IPv6 prefixes:
RP/0/RP0/CPU0:8201-32FH#sh bgp ipv6 unicast dfa-regex _109_
Tue Sep 12 13:43:11.290 UTC
BGP router identifier 13.37.1.1, local AS number 65537
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0xe0800000 RD version: 8656025
BGP main routing table version 8656025
BGP NSR Initial initsync version 165973 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
BGP scan interval 60 secs
Status codes: s suppressed, d damped, h history, * valid, > best
i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 2001:420::/32 2001:db8:10:7079::79
0 64537 65010 65000 199524 6939 109 i
* 2001:db8:11:7079::79
0 64537 65011 65000 199524 6939 109 i
* 2001:db8:12:7079::79
0 64537 65012 65000 199524 6939 109 i
* 2001:db8:13:7079::79
0 64537 65013 65000 199524 6939 109 i
* 2001:db8:14:7079::79
0 64537 65014 65000 199524 6939 109 i
* 2001:db8:15:7079::79
0 64537 65015 65000 199524 6939 109 i
* 2001:db8:16:7079::79
0 64537 65016 65000 199524 6939 109 i
* 2001:db8:17:7079::79
0 64537 65017 65000 199524 6939 109 i
* 2001:db8:18:7079::79
0 64537 65018 65000 199524 6939 109 i
* 2001:db8:19:7079::79
0 64537 65019 65000 199524 6939 109 i
* 2001:db8:20:7079::79
0 64537 65020 65000 199524 6939 109 i
* 2001:db8:21:7079::79
0 64537 65021 65000 199524 6939 109 i
* 2001:db8:22:7079::79
0 64537 65022 65000 199524 6939 109 i
* 2001:db8:23:7079::79
0 64537 65023 65000 199524 6939 109 i
* 2001:db8:24:7079::79
0 64537 65024 65000 199524 6939 109 i
* 2001:db8:25:7079::79
0 64537 65025 65000 199524 6939 109 i
* 2001:db8:26:7079::79
0 64537 65026 65000 199524 6939 109 i
* 2001:db8:27:7079::79
0 64537 65027 65000 199524 6939 109 i
* 2001:db8:28:7079::79
0 64537 65028 65000 199524 6939 109 i
* 2001:db8:29:7079::79
0 64537 65029 65000 199524 6939 109 i
* 2001:db8:30:7079::79
0 64537 65030 65000 199524 6939 109 i
* 2001:db8:31:7079::79
0 64537 65031 65000 199524 6939 109 i
* 2001:db8:32:7079::79
0 64537 65032 65000 199524 6939 109 i
* 2001:db8:33:7079::79
0 64537 65033 65000 199524 6939 109 i
As described in the introduction, a single path is programmed into RIB:
RP/0/RP0/CPU0:8201-32FH#sh route
Tue Sep 12 13:57:17.681 UTC
Codes: C - connected, S - static, R - RIP, B - BGP, (>) - Diversion path
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
i - ISIS, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, su - IS-IS summary null, * - candidate default
U - per-user static route, o - ODR, L - local, G - DAGR, l - LISP
A - access/subscriber, a - Application route
M - mobile route, r - RPL, t - Traffic Engineering, (!) - FRR Backup path
Gateway of last resort is 10.70.79.79 to network 0.0.0.0
B* 0.0.0.0/0 [20/0] via 10.70.79.79, 00:00:02
B 1.0.0.0/24 [20/0] via 10.70.79.79, 00:01:06
B 1.0.4.0/22 [20/0] via 10.70.79.79, 00:01:06
B 1.0.5.0/24 [20/0] via 10.70.79.79, 00:01:06
B 1.0.16.0/24 [20/0] via 10.70.79.79, 00:01:06
B 1.0.32.0/24 [20/0] via 10.70.79.79, 00:01:06
B 1.0.64.0/18 [20/0] via 10.70.79.79, 00:01:06
B 1.0.128.0/17 [20/0] via 10.70.79.79, 00:01:06
B 1.0.128.0/18 [20/0] via 10.70.79.79, 00:01:06
B 1.0.128.0/19 [20/0] via 10.70.79.79, 00:01:06
B 1.0.128.0/24 [20/0] via 10.70.79.79, 00:01:06
B 1.0.129.0/24 [20/0] via 10.70.79.79, 00:01:06
B 1.0.130.0/23 [20/0] via 10.70.79.79, 00:01:06
B 1.0.132.0/24 [20/0] via 10.70.79.79, 00:01:06
-- snip --
RP/0/RP0/CPU0:8201-32FH#sh route ipv4 unicast summary
Tue Sep 12 13:44:10.178 UTC
Route Source Routes Backup Deleted Memory(bytes)
connected 53 1 0 11664
local 54 0 0 11664
application fib_mgr 0 0 0 0
ospf 1 0 0 0 0
bgp 65537 1001184 0 0 216255744
static 0 0 0 0
dagr 0 0 0 0
vxlan 0 0 0 0
isis CORE 0 0 0 0
Total 1001291 1 0 216279072
And the same for IPv6:
RP/0/RP0/CPU0:8201-32FH#sh route ipv6 unicast
Tue Sep 12 13:57:53.114 UTC
Codes: C - connected, S - static, R - RIP, B - BGP, (>) - Diversion path
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
i - ISIS, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, su - IS-IS summary null, * - candidate default
U - per-user static route, o - ODR, L - local, G - DAGR, l - LISP
A - access/subscriber, a - Application route
M - mobile route, r - RPL, t - Traffic Engineering, (!) - FRR Backup path
Gateway of last resort is fe80::46b6:beff:fe46:22d0 to network ::
B* ::/0
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:33, HundredGigE0/0/0/5.10
B 1::1/128
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 2::1/128
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 10:1:2::/64
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 10:1:5::/64
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 11:1:2::/64
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 12::/64
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 12:1:2::/64
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 40::/11
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 60::/14
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 64::/17
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 64:8000::/18
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
B 64:c000::/19
[20/0] via fe80::46b6:beff:fe46:22d0, 00:01:43, HundredGigE0/0/0/5.10
-- snip --
RP/0/RP0/CPU0:8201-32FH#sh route ipv6 unicast summary
Wed Sep 13 13:08:27.789 UTC
Route Source Routes Backup Deleted Memory(bytes)
local-iid sidmgr 0 0 0 0
local 52 0 0 11232
connected 52 0 0 11232
connected l2tpv3_xconnect 0 0 0 0
local-srv6 xtc_srv6 0 0 0 0
local-srv6 bgp-65537 0 0 0 0
local-srv6 isis-CORE 0 0 0 0
bgp 65537 190957 0 0 216614392
static 0 0 0 0
vxlan 0 0 0 0
isis CORE 0 0 0 0
Total 191061 0 0 216636856
Note: this router processes a real and live BGP feed. Therefore, it’s expected to see churn and the number of routes might slightly differ between outputs.
FIB scale is similar to overall number of prefixes:
RP/0/RP0/CPU0:8201-32FH#sh cef summary
Wed Sep 13 13:10:38.537 UTC
Router ID is 192.168.0.23
IP CEF with switching (Table Version 0) for node0_RP0_CPU0
Load balancing: L4
Tableid 0xe0000000 (0x975b57e8), Vrfid 0x60000000, Vrid 0x20000000, Flags 0x1019
Vrfname default, Refcount 1017634
1017271 routes, 0 protected, 0 reresolve, 0 unresolved (0 old, 0 new), 219730536 bytes
1017090 rib, 0 lsd, 71 aib, 0 internal, 106 interface, 4 special, 1 default routes
Prefix masklen distribution:
unicast: 13797 /32, 262 /31, 1330 /30, 1948 /29, 1399 /28, 1060 /27
934 /26, 1169 /25, 604749 /24, 108618 /23, 114766 /22, 54300 /21
45828 /20, 26009 /19, 14331 /18, 8490 /17, 13770 /16, 2143 /15
1211 /14, 576 /13, 297 /12, 103 /11, 40 /10, 14 /9
16 /8 , 1 /0
RP/0/RP0/CPU0:8201-32FH#sh cef ipv6 summary
Wed Sep 13 13:10:44.048 UTC
Router ID is 192.168.0.23
IP CEF with switching (Table Version 0) for node0_RP0_CPU0
Load balancing: L4
Tableid 0xe0800000 (0x97f85b90), Vrfid 0x60000000, Vrid 0x20000000, Flags 0x1019
Vrfname default, Refcount 165786
165554 routes, 0 protected, 0 reresolve, 0 unresolved (0 old, 0 new), 35759664 bytes
165482 rib, 0 lsd, 65 aib, 0 internal, 0 interface, 6 special, 1 default routes
Prefix masklen distribution:
unicast: 323 /128, 250 /127, 135 /126, 6 /125, 8 /123, 9 /112
1 /96 , 1 /95 , 1 /94 , 1 /93 , 1 /92 , 1 /91
1 /90 , 1 /89 , 1 /88 , 1 /87 , 1 /86 , 1 /85
1 /84 , 1 /83 , 1 /82 , 1 /81 , 1 /80 , 1 /79
1 /78 , 1 /77 , 1 /76 , 1 /75 , 1 /74 , 1 /73
1 /72 , 1 /71 , 1 /70 , 1 /69 , 1 /68 , 1 /67
1 /66 , 3 /65 , 1199 /64 , 6 /63 , 3 /62 , 5 /61
7 /60 , 4 /59 , 8 /58 , 4 /57 , 682 /56 , 22 /55
162 /54 , 7 /53 , 35 /52 , 165 /51 , 28 /50 , 28 /49
56339 /48 , 4169 /47 , 3979 /46 , 2922 /45 , 14303 /44 , 2451 /43
3597 /42 , 1020 /41 , 12737 /40 , 1765 /39 , 1839 /38 , 1382 /37
5114 /36 , 1148 /35 , 3249 /34 , 3315 /33 , 18188 /32 , 3482 /31
3339 /30 , 5731 /29 , 6122 /28 , 5386 /27 , 582 /26 , 3 /25
96 /24 , 4 /23 , 37 /22 , 2 /21 , 55 /20 , 25 /19
11 /18 , 4 /17 , 2 /15 , 3 /14 , 4 /13 , 2 /12
1 /11 , 1 /10 , 1 /9 , 2 /8 , 2 /7 , 1 /6
1 /5 , 2 /4 , 1 /0
All those prefixes are ultimately programmed into Silicon One LPM database:
RP/0/RP0/CPU0:8201-32FH#sh controllers npu resources lpmtcam location 0/RP0/CPU0
Wed Sep 13 13:12:22.068 UTC
HW Resource Information
Name : lpm_tcam
Asic Type : Q200
NPU-0
OOR Summary
Estimated Max Entries : 100
Red Threshold : 95 %
Yellow Threshold : 80 %
OOR State : Green
OFA Table Information
(May not match HW usage)
iprte : 1017390
ip6rte : 184229
ip6mcrte : 0
ipmcrte : 0
Current Hardware Usage
Name: lpm_tcam
Estimated Max Entries : 100
Total In-Use : 38 (38 %)
OOR State : Green
Name: v4_lpm
Total In-Use : 1017528
Name: v6_lpm
Total In-Use : 184252
Let’s now check impact on memory. Following snapshot shows BGP process uses 6.8GB of RAM with such scale:
RP/0/RP0/CPU0:8201-32FH#sh processes memory detail
Wed Sep 13 15:06:50.245 UTC
JID Text Data Stack Dynamic Dyn-Limit Shm-Tot Phy-Tot Process
============================================================================================================
1089 2M 6820M 132K 6763M 7447M 77M 6817M bgp
194 620K 4521M 132K 3452M 22528M 1376M 4332M npu_drvr
1169 1M 710M 132K 588M 8192M 126M 714M ipv4_rib
-- snip –
With overall memory utilization still in acceptable range (56%):
RP/0/RP0/CPU0:8201-32FH#sh memory summary
Wed Sep 13 15:08:34.147 UTC
node: node0_RP0_CPU0
------------------------------------------------------------------
Physical Memory: 31595M total (14122M available)
Application Memory : 31595M (14122M available)
Image: 4M (bootram: 0M)
Reserved: 0M, IOMem: 0M, flashfsys: 0M
Total shared window: 2G
RP/0/RP0/CPU0:8201-32FH#
Multiple Paths impact on RIB and FIB
To continue this experience, maximum-paths ebgp 64
is configured for both address families, along with bgp bestpath as-path multipath-relax
knob. This will allow to leverage all existing paths and install them into RIB, and ultimately into FIB.
The 24 x paths are installed for both IPv4 and IPv6:
RP/0/RP0/CPU0:8201-32FH#sh route 173.37.145.0/24
Fri Sep 22 09:54:47.409 UTC
Routing entry for 173.37.145.0/24
Known via "bgp 65537", distance 20, metric 0
Tag 64537, type external
Installed Sep 22 09:48:44.248 for 00:06:03
Routing Descriptor Blocks
10.70.79.79, from 10.70.79.79, BGP external, BGP multi path
Route metric is 0
11.70.79.79, from 11.70.79.79, BGP external, BGP multi path
Route metric is 0
12.70.79.79, from 12.70.79.79, BGP external, BGP multi path
Route metric is 0
13.70.79.79, from 13.70.79.79, BGP external, BGP multi path
Route metric is 0
14.70.79.79, from 14.70.79.79, BGP external, BGP multi path
Route metric is 0
15.70.79.79, from 15.70.79.79, BGP external, BGP multi path
Route metric is 0
16.70.79.79, from 16.70.79.79, BGP external, BGP multi path
Route metric is 0
17.70.79.79, from 17.70.79.79, BGP external, BGP multi path
Route metric is 0
18.70.79.79, from 18.70.79.79, BGP external, BGP multi path
Route metric is 0
19.70.79.79, from 19.70.79.79, BGP external, BGP multi path
Route metric is 0
20.70.79.79, from 20.70.79.79, BGP external, BGP multi path
Route metric is 0
21.70.79.79, from 21.70.79.79, BGP external, BGP multi path
Route metric is 0
22.70.79.79, from 22.70.79.79, BGP external, BGP multi path
Route metric is 0
23.70.79.79, from 23.70.79.79, BGP external, BGP multi path
Route metric is 0
24.70.79.79, from 24.70.79.79, BGP external, BGP multi path
Route metric is 0
25.70.79.79, from 25.70.79.79, BGP external, BGP multi path
Route metric is 0
26.70.79.79, from 26.70.79.79, BGP external, BGP multi path
Route metric is 0
27.70.79.79, from 27.70.79.79, BGP external, BGP multi path
Route metric is 0
28.70.79.79, from 28.70.79.79, BGP external, BGP multi path
Route metric is 0
29.70.79.79, from 29.70.79.79, BGP external, BGP multi path
Route metric is 0
30.70.79.79, from 30.70.79.79, BGP external, BGP multi path
Route metric is 0
31.70.79.79, from 31.70.79.79, BGP external, BGP multi path
Route metric is 0
32.70.79.79, from 32.70.79.79, BGP external, BGP multi path
Route metric is 0
33.70.79.79, from 33.70.79.79, BGP external, BGP multi path
Route metric is 0
No advertising protos.
RP/0/RP0/CPU0:8201-32FH#
It’s often wrongly assumed increasing number of paths will also increase RIB and FIB scale. As it can be observed below, this is not the case.
RIB still contains same number of prefixes:
RP/0/RP0/CPU0:8201-32FH#sh route ipv4 unicast summary
Wed Sep 13 13:14:07.372 UTC
Route Source Routes Backup Deleted Memory(bytes)
connected 53 1 0 11664
local 54 0 0 11664
application fib_mgr 0 0 0 0
ospf 1 0 0 0 0
bgp 65537 1017831 0 0 1149232896
static 0 0 0 0
dagr 0 0 0 0
vxlan 0 0 0 0
isis CORE 0 0 0 0
Total 1017938 1 0 1149256224
RP/0/RP0/CPU0:8201-32FH#sh route ipv6 unicast summary
Wed Sep 13 13:14:34.996 UTC
Route Source Routes Backup Deleted Memory(bytes)
local-iid sidmgr 0 0 0 0
local 52 0 0 11232
connected 52 0 0 11232
connected l2tpv3_xconnect 0 0 0 0
local-srv6 xtc_srv6 0 0 0 0
local-srv6 bgp-65537 0 0 0 0
local-srv6 isis-CORE 0 0 0 0
bgp 65537 184864 0 729 210133808
static 0 0 0 0
vxlan 0 0 0 0
isis CORE 0 0 0 0
Total 184968 0 729 210156272
RP/0/RP0/CPU0:8201-32FH#
As for FIB.
Instead, a recursive forwarding chain is built: IP route > next-hop group > IP next-hop > interface
Going beyond the limits
Cisco 8000 is currently certified for a maximum number of 20M BGP paths. While the test performed earlier shows it’s possible to go higher, the platform will hit limits after a certain scale.
There are multiple factors limiting this scale:
- Hardware: the CPU & amount of memory used in the platform
- But also software: what’s the maximum amount of memory BGP process can allocate?
While 8201-32FH ships with 32GB of RAM, IOS XR will restrict the BGP process to 8GB. This is called the RLIMIT (Resource Limit).
RP/0/RP0/CPU0:8201-32FH#sh bgp process
Tue Sep 12 12:49:13.450 UTC
BGP Process Information:
BGP is operating in STANDALONE mode
Autonomous System number format: ASPLAIN
Autonomous System: 65537
Router ID: 13.37.1.1 (manually configured)
Default Cluster ID: 13.37.1.1
Active Cluster IDs: 13.37.1.1
Fast external fallover enabled
Platform Loadbalance paths max: 1024
Platform RLIMIT max: 8589934592 bytes
-- snip --
Info: RLIMIT is platform dependent. It’s possible to observe some IOS XR based platforms with higher values (e.g 80GB for IOS XRv9000 virtual route-reflector platform).
Adding additional neighbors will ultimately trigger memory allocation failure, causing BGP process errors or crash.
RP/0/RP0/CPU0:Sep 12 07:54:29.477 UTC: bgp[1089]: %ROUTING-BGP-3-NOMEM_RESET : [10420] : Failed to allocate memory for path, resetting neighbor : bgp : (PID=10384) : -Traceback= 55f00d13d929 7f67254aedb1 7f67254b8c44 7f6725db87be 55f00d13cfaf
RP/0/RP0/CPU0:Sep 12 07:54:29.477 UTC: bgp[1089]: %ROUTING-BGP-5-ADJCHANGE : neighbor 20.70.79.79 Down - No memory (VRF: default) (AS: 65020)
RP/0/RP0/CPU0:Sep 12 07:54:29.477 UTC: bgp[1089]: %ROUTING-BGP-5-ADJCHANGE : neighbor 38.70.79.79 Down - No memory (VRF: default) (AS: 65038)
RP/0/RP0/CPU0:Sep 12 07:54:33.922 UTC: bgp[1089]: %ROUTING-BGP-3-NOMSGCHUNK : [10418] : Failed to allocate 414 bytes from the memory pool for message of type 2 : bgp : (PID=10384) : -Traceback= 55f00d188db2 7f67257412aa 7f67254aedb1 7f67254b8c44 7f6725db87be 55f00d18a30e
Monitoring
Several YANG models are available to monitor BGP process health, statistics, but also platform memory utilization:
Cisco-IOS-XR-ipv4-bgp-oper:bgp/bpm-instances-table/bpm-instances
Cisco-IOS-XR-nto-misc-oper:memory-summary/nodes/node/detail
Cisco-IOS-XR-procmem-oper:processes-memory/nodes/node/process-ids/process-id
There are also OpenConfig models available to monitor the number of BGP paths:
Cisco-IOS-XR-ipv4-bgp-oc-oper:oc-bgp/bgp-rib/afi-safi-table/ipv4-unicast/loc-rib/num-routes/num-routes
Cisco-IOS-XR-ipv4-bgp-oc-oper:oc-bgp/bgp-rib/afi-safi-table/ipv6-unicast/loc-rib/num-routes/num-routes
Here are sample dashboards which can be used:
Looking ahead: the future of BGP table scale
This article demonstrated how Cisco 8000 could scale up to 25M+ BGP paths. It also went through some IOS XR BGP implementation details.
Is this number good enough? Are 20M BGP paths sufficient for a peering router? Is it common for network operators to process 20+ full BGP feeds on a single device?
As usual, the answer is: it depends. Different customers have diverse designs, use different features, and have different scale requirements. So far, only few hyperscaler customers expressed to support more.
While BGP table continues to grow, there is still room for margin. If current maximum scale is not high enough, network operators can still rely on IOS XR BGP Multi-Instance feature: up to 4 BGP processes can be spawned, and address families can be spread across them to reach theoretically up to 80M paths.
Last, recent routers have plenty of RAM available: most current Cisco 8000 have 32GB of RAM and new products now ship with 64GB of RAM. Currently, IOS XR doesn’t keep advantage of all this memory. There is plan in progress to increase BGP process RLIMIT, with a recent prototype IOS XR image demonstrating support of 75M BGP paths on a 8202-32FH-M with 64GB of RAM. This was possible using a 24GB RLIMIT value for the BGP process during a Customer Proof of Concept.
Acknowledgement
I’d like to thanks Serge Krier, BGP software engineer & Technical Leader at Cisco for providing lab support and review.
Leave a Comment