nanog mailing list archives

Re: BGP offloading (fixing legacy router BGP scalability issues)

From: Łukasz Bromirski <lukasz () bromirski net>
Date: Thu, 9 Apr 2015 14:56:36 +0200

Hi Frederik,

On 09 Apr 2015, at 13:24, Frederik Kriewitz <frederik () kriewitz eu> wrote:

Thank you very much for all your responses.

First of all, the problems we see are really RIB (Processor memory)
and CPU related.
The TCAM/FIB limits are properly configured. From a FIB capacity
point of view they should last a couple more years. Software routing
doesn't cause the problem.
The most extreme case of Cisco 6500/SUP720 abuse I'm aware of is a
setup with 4 full table transit connections + 2 RR sessions + ~20
peerings, no downstreams. Besides the IPv4 and IPv6 peerings it's
pretty much only handling a small amount of OSPF and MPLS (<5k
prefixes, ~500 routers). No netflow or any other memory hog. Under
normal conditions it's running at 20% CPU and 90% processor memory
(1G/SUP720 XL).


The main limit here, apart from the rather slow RP CPU, is
the amount of memory you can have. I’d set up a CSR1000v as an RR
and offload the control plane from the 6500 completely. The 6500 is a
nice box for very fast hardware forwarding as long as the FIB fits
in the TCAM, which it seems it does in your scenario.

In case a session with a lot of prefixes (e.g. a transit) fails, it
takes up to 5 minutes for the BGP Router process to recompute the RIB,
etc. During that time it's running at 100% CPU. Low priority
processes are completely ignored (e.g. SNMP based monitoring stops
working). Occasionally it even drops OSPF neighbours or other BGP
sessions due to expired hold timers causing further havoc.


You can tune this with process time tweaks.
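
To make the failure mode above concrete, here is a minimal sketch of
which sessions are at risk while the RP is busy. It assumes the timer
values commonly seen as Cisco IOS defaults (BGP hold time 180 s, OSPF
dead interval 40 s on broadcast networks) and, as a deliberate
simplification, treats the stall as a window in which no keepalives or
hellos are processed at all. The function name and the stall durations
are hypothetical.

```python
# Timer values commonly seen as Cisco IOS defaults (assumed here;
# check the running configuration of the actual router).
DEFAULT_TIMERS_S = {
    "BGP hold time": 180,
    "OSPF dead interval": 40,
}


def sessions_at_risk(stall_s, timers=DEFAULT_TIMERS_S):
    """Return the timers that expire if no protocol packets are
    processed for stall_s seconds (a worst-case simplification)."""
    return sorted(name for name, limit in timers.items() if stall_s >= limit)


if __name__ == "__main__":
    # Hypothetical stall durations in seconds.
    for stall in (30, 60, 300):
        print(stall, sessions_at_risk(stall))
```

Under these assumptions, any tuning that keeps the longest unserviced
stretch below the shortest timer is what protects the adjacencies.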

Applying a /22 filter was suggested. In order to actually save the RIB
memory we would have to disable soft-reconfiguration on the
corresponding sessions.
I don't like that option for various reasons, as it trades lower memory
usage for longer convergence times and a significantly bigger impact
from route map updates.
Due to the IPv4 exhaustion we expect to see more small prefixes in the
future which can't be aggregated (considering the AS path). Simply
dropping them would result in less optimal routing.


If you have to filter somewhere on something, I’d rather try to filter
by AS_PATH (neighbors, etc.) than by prefix length.
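
As a rough illustration of the difference, the sketch below applies
both kinds of filter to a handful of made-up routes. The prefixes and
AS numbers are taken from documentation and private ranges; the helper
names and the "keep routes within N distinct ASes" policy are
assumptions for illustration, not something spelled out above.

```python
import ipaddress

# Hypothetical routes: (prefix, AS_PATH as received, leftmost = neighbor).
ROUTES = [
    ("198.51.100.0/24", [64500, 64501]),
    ("203.0.113.0/24", [64500, 64502, 64503, 64504]),
    ("192.0.2.0/24", [64500]),
    ("10.0.0.0/16", [64500, 64502, 64505]),
]


def by_prefix_length(routes, max_len=22):
    """Keep routes whose prefix is /max_len or shorter."""
    return [r for r in routes
            if ipaddress.ip_network(r[0]).prefixlen <= max_len]


def by_as_path(routes, max_ases=2):
    """Keep routes whose AS_PATH contains at most max_ases distinct
    ASNs, so that prepending does not count against a route."""
    return [r for r in routes if len(set(r[1])) <= max_ases]


if __name__ == "__main__":
    print("prefix-length filter:", [r[0] for r in by_prefix_length(ROUTES)])
    print("AS_PATH filter:      ", [r[0] for r in by_as_path(ROUTES)])
```

In this toy data set the prefix-length filter drops every /24, including
the ones originated by the neighbor itself, while the AS_PATH filter
keeps the nearby /24s and drops only the distant routes. Anything
filtered out still needs a default or covering route to stay reachable.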

-- 
"There's no sense in being precise when |               Łukasz Bromirski
 you don't know what you're talking     |      jid:lbromirski () jabber org
 about."               John von Neumann |    http://lukasz.bromirski.net
