This article discusses BGP session culling, a set of techniques that mitigate the negative impact of Internet Exchange Point (IXP) maintenance on IP networks. An IXP is a central point where ISPs (the IXP members) exchange traffic. It consists of one or several interconnected Ethernet Layer 2 switches to which Internet Service Provider (ISP) routers connect. A simplified network diagram is shown in the picture below.
Picture 1 – Typical Internet Exchange Point
Two ISP routers are connected through two IXP switches. Each ISP has its own AS number, and an eBGP session between them is established across the IXP. Let’s look at a scenario in which the IXP operator needs to perform maintenance on the right IXP switch and reboot it afterwards. We will explain what happens to production traffic during the maintenance.
Bidirectional Forwarding Detection (BFD) is Employed on ISP Routers
BFD is a protocol used for fast detection of link failures. In conjunction with BGP, it can be used for fast detection of dead BGP peers. Let’s configure the interface GigabitEthernet0/0 on both ISP routers to send BFD packets every 500 ms (the interval value) and to expect BFD packets every 500 ms (the min_rx parameter). The BGP peer is declared dead after 2500 ms, when five consecutive BFD packets have been missed (the multiplier parameter). Thanks to BFD, the router ISP1 does not have to wait for the BGP Hold Timer to expire. As a result, traffic is not blackholed by the IXP switch undergoing maintenance but is rerouted over an alternative path shortly after BFD detects the failure.
interface GigabitEthernet0/0
 bfd interval 500 min_rx 500 multiplier 5
Now configure BGP on the ISP1 router to use BFD for failure detection of the neighbor 198.51.100.2.
router bgp 64501
 neighbor 198.51.100.2 remote-as 64502
 neighbor 198.51.100.2 fall-over bfd
A similar configuration is needed on the ISP2 router.
router bgp 64502
 neighbor 198.51.100.1 remote-as 64501
 neighbor 198.51.100.1 fall-over bfd
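To check that the BFD session is up and that BGP is using it for the peer, show commands such as the following can be run on ISP1. This is only a sketch: both commands are standard on Cisco IOS, but the output format varies by platform and release, so none is reproduced here.

```
! Run in privileged EXEC mode on ISP1
show bfd neighbors details
show ip bgp neighbors 198.51.100.2 | include BFD
```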
Voluntary BGP Session Teardown
When BFD cannot be used, we rely solely on BGP Hold Timer expiration. By default, a Cisco device uses a keepalive interval of 60 seconds, so a BGP KEEPALIVE message is sent to each BGP peer every 60 seconds. When no KEEPALIVE message is received from a peer within the hold time, the peer is declared dead. In that case, a BGP NOTIFICATION message with the error code Hold Timer Expired is sent to the dead peer. The default hold time on Cisco devices is 180 seconds. Until the BGP Hold Timer expires on the ISP1 router, production traffic from ISP1 is blackholed for up to 180 seconds at the right IXP switch. This happens because the link between ISP1 and the left IXP switch is still up, so the router ISP1 has no information that the right IXP switch is down. As a result, up to 180 seconds of blackholing is added to the maintenance window, with negative consequences for user traffic.
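The timers discussed above can also be set explicitly. The sketch below assumes ISP1 runs AS 64501. The first command simply restates the Cisco defaults, and the per-neighbor values of 10 and 30 seconds are illustrative assumptions, not a recommendation from this article. The lower of the hold times proposed by the two peers is the one used for the session.

```
router bgp 64501
 ! Global defaults: keepalive 60 s, hold time 180 s
 timers bgp 60 180
 ! Hypothetical shorter timers for a single peer (keepalive 10 s, hold time 30 s)
 neighbor 198.51.100.2 timers 10 30
```

Shorter timers narrow the blackholing window, but they still react far more slowly than the 2500 ms BFD detection configured earlier.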
One feasible solution is for the IXP members to cull their BGP sessions themselves. The IXP operator informs the members (ISPs) about the maintenance ahead of time, and the members shut down their BGP sessions before the maintenance commences. This technique is called voluntary BGP session teardown and is described in the Internet draft on BGP session culling (since published as RFC 8327).
The commands below, issued on the ISP1 router, administratively shut down the eBGP session between ISP1 and its peer ISP2.
router bgp 64501
 neighbor 198.51.100.2 shutdown
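After the maintenance window, the session has to be brought back up on the ISP side. A minimal sketch, assuming ISP1 runs AS 64501 and the session was disabled with the neighbor shutdown command, whose no form clears it:

```
router bgp 64501
 ! Clear the administrative shutdown so the eBGP session to ISP2 can re-establish
 no neighbor 198.51.100.2 shutdown
```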
Involuntary BGP Session Teardown
According to Will Hargrave of the LONAP IXP, very few IXP members shut down their affected BGP sessions even though they are informed about the upcoming IXP maintenance. Obviously, an IXP operator has no access to the ISP devices and cannot shut down BGP sessions on behalf of the ISPs. For this reason, the BGP sessions should be culled on the IXP side. The solution is described in the above-mentioned Internet draft and is called involuntary BGP session teardown. It is essentially a Layer 4 access control list (ACL) applied to the IXP Layer 2 ports before maintenance. The ACL blocks BGP traffic (TCP port 179) to and from the IXP subnet and permits all other traffic. With BGP messages blocked, the BGP Hold Timers expire, the BGP sessions are torn down, and end-user traffic is rerouted over alternative paths. Only then is the maintenance started. Below is an example of the Layer 4 ACL configuration for our scenario.
ip access-list extended acl-permit-all-except-bgp
 10 deny tcp 198.51.100.0 0.0.0.255 eq bgp 198.51.100.0 0.0.0.255
 20 deny tcp 198.51.100.0 0.0.0.255 198.51.100.0 0.0.0.255 eq bgp
 30 permit ip any any
If IPv6 is used, we also need to create an IPv6 ACL.
ipv6 access-list acl-ipv6-permit-all-except-bgp
 sequence 10 deny tcp 2001:DB8:1:1::/64 eq bgp 2001:DB8:1:1::/64
 sequence 20 deny tcp 2001:DB8:1:1::/64 2001:DB8:1:1::/64 eq bgp
 sequence 30 permit ipv6 any any
The IPv4 and IPv6 ACLs block traffic in both directions to prevent reestablishment of a BGP session. They are applied on interface GigabitEthernet0/0 of the IXP switch for inbound packets.
interface GigabitEthernet0/0
 description IXP Participant Affected by Maintenance
 ip access-group acl-permit-all-except-bgp in
 ipv6 traffic-filter acl-ipv6-permit-all-except-bgp in
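Once the maintenance is complete, the ACLs have to be removed from the port so that the BGP sessions can re-establish. A minimal sketch, assuming the same interface and ACL names as above:

```
interface GigabitEthernet0/0
 ! Remove the culling ACLs so that BGP (TCP port 179) is permitted again
 no ip access-group acl-permit-all-except-bgp in
 no ipv6 traffic-filter acl-ipv6-permit-all-except-bgp in
```

The ACL definitions themselves can stay in the configuration for reuse in the next maintenance window.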
BGP session culling is gaining popularity and is being deployed at more and more IXPs. It mitigates the negative impact of maintenance activities while requiring no action from the ISPs. It is a relatively simple technique that has proved useful; LONAP IXP operators have applied it since 2013. Even though it can be considered more of a workaround than a fully automated solution, it is still better than having no solution at all.