IRP uses two types of BGP monitors and a BMP monitoring station to collect data, diagnose, and report on the state of the BGP sessions between the edge routers and the providers, as well as on network reachability through a specific provider. The information from the monitors keeps IRP from announcing routing updates that would misroute traffic (for example, by sending improvements to a failed provider) and also better informs IRP probing and improvement decisions.
1.2.5.1 Internal monitor
The Internal BGP Monitor checks the state of the Edge Router → Provider BGP session by regularly polling the router via SNMP. When queried, the router returns SNMP variables describing the session status, which IRP’s Internal BGP Monitor evaluates. If the session between the edge router and the provider is down, SNMP returns a value representing session failure, and IRP reacts as follows:
- the provider will be marked as FAILED,
- all the improvements towards this provider will be withdrawn from the routing tables to avoid creating black holes,
- new improvements towards this provider will not be made.
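The mapping from a polled session state to a provider status can be sketched as follows. This is a minimal illustration, not IRP's actual code: it assumes the polled variable is bgpPeerState from the standard BGP4-MIB (RFC 4273), which the text above does not confirm, and the helper name and the "OK" and "TIMEOUT" labels are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class BgpPeerState(IntEnum):
    """bgpPeerState values from the standard BGP4-MIB (RFC 4273)."""
    IDLE = 1
    CONNECT = 2
    ACTIVE = 3
    OPENSENT = 4
    OPENCONFIRM = 5
    ESTABLISHED = 6


def provider_status(snmp_value: Optional[int]) -> str:
    """Hypothetical helper: turn one SNMP poll result into a provider status.

    None models a poll that got no SNMP response at all.
    """
    if snmp_value is None:
        return "TIMEOUT"   # handled separately by the timers, not an immediate failure
    if snmp_value == BgpPeerState.ESTABLISHED:
        return "OK"        # assumed label for a healthy session
    return "FAILED"        # any other state means the session is down
```

With this sketch, a poll returning 6 (established) yields "OK", any other returned state yields "FAILED", and a missing response is kept distinct so that it can be handled by the timeout logic.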
In some cases (e.g. a DDoS attack or other factors causing router CPU overload) there may be no response to the SNMP queries at all. In this case a timeout status is reported to the Internal Monitor and a 30-minute timer, called the longhold timer (bgpd.mon.longholdtime), is started. During this time the monitor sends ICMP/UDP ping requests towards the configured provider’s next-hop IP address (peer.X.ipv4.next_hop or peer.X.ipv6.next_hop). One request is sent per keepalive period (bgpd.mon.keepalive, adjustable in the BGP daemon configuration interface). If the next hop stops responding to these requests, a second, 30-second timer, called the hold timer (bgpd.mon.holdtime), is started. If the ping responses indicate that the session is re-established during this time, the hold timer is discarded while the longhold timer keeps running. If either timer expires, the provider is switched to a FAIL state and all improvements towards this provider are withdrawn from the routing table. However, if the BGP session with the provider is re-established, the system starts rerouting traffic to this provider again.
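The interplay of the two timers can be sketched as a small state tracker. This is an illustration under stated assumptions, not the BGPd implementation: the class, its method names, and the "PENDING" label are hypothetical, time is passed in explicitly as seconds, and the defaults simply mirror the 30-minute and 30-second values quoted above.

```python
from dataclasses import dataclass
from typing import Optional

LONGHOLD_SECONDS = 30 * 60  # value quoted for bgpd.mon.longholdtime
HOLD_SECONDS = 30           # value quoted for bgpd.mon.holdtime


@dataclass
class TimeoutTracker:
    """Hypothetical tracker created when SNMP polling times out."""
    longhold_deadline: float
    hold_deadline: Optional[float] = None

    @classmethod
    def on_snmp_timeout(cls, now: float, longhold: float = LONGHOLD_SECONDS):
        # The longhold timer starts as soon as SNMP stops answering.
        return cls(longhold_deadline=now + longhold)

    def on_ping(self, now: float, replied: bool, hold: float = HOLD_SECONDS) -> str:
        """Process one keepalive ping result; return "FAIL" or "PENDING"."""
        if replied:
            # Next hop answers: discard the hold timer, longhold keeps running.
            self.hold_deadline = None
        elif self.hold_deadline is None:
            # First missed ping: start the hold timer.
            self.hold_deadline = now + hold
        if now >= self.longhold_deadline:
            return "FAIL"
        if self.hold_deadline is not None and now >= self.hold_deadline:
            return "FAIL"
        return "PENDING"
```

A caller would invoke on_ping once per keepalive period and withdraw improvements as soon as it returns "FAIL".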
When BGPd is started, the monitors are initialized and one SNMP query is sent to each router in order to check the status of the BGP sessions with providers. If there is no reply, the Internal Monitor sends up to two more SNMP requests, separated by a keepalive interval.
If none of the SNMP queries returns a status confirming that the sessions with providers are up, the provider is assigned a FAIL status and the Internal Monitor continues periodic SNMP polling (every 60 seconds) to recheck the status of the provider sessions.
Then the BGP session with the edge routers is initialized and BGPd starts retrieving the routing table from the edge routers. While IRP retrieves the routing table, SNMP requests may time out due to high CPU usage on the edge routers.
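The start-up check described above (one SNMP query, then up to two retries spaced by the keepalive interval) can be sketched like this. The function and its parameters are hypothetical; query stands in for whatever performs the SNMP request and is assumed to return True (session up), False (session down), or None (no reply).

```python
import time
from typing import Callable, Optional


def initial_session_check(
    query: Callable[[], Optional[bool]],
    keepalive: float,
    sleep: Callable[[float], None] = time.sleep,
    attempts: int = 3,
) -> str:
    """Hypothetical start-up check: first query plus up to two retries."""
    for attempt in range(attempts):
        reply = query()
        if reply is not None:
            # A definite answer ends the check immediately.
            return "OK" if reply else "FAIL"
        if attempt < attempts - 1:
            # No reply: wait one keepalive interval before retrying.
            sleep(keepalive)
    # No query confirmed the session is up.
    return "FAIL"
```

Passing sleep as a parameter keeps the sketch testable without real delays; after a "FAIL" result the periodic polling would take over.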
1.2.5.2 External monitor
The External BGP Monitor analyzes network reachability through a specific provider. It sends ICMP/UDP ping requests towards the configured remote IP address(es) (peer.X.ipv4.mon or peer.X.ipv6.mon) through the monitored provider. If any of the configured IP addresses is reachable, the monitor is marked as OK. If none of the monitored remote IP addresses replies through the examined provider, IRP reacts as follows:
- the provider will be marked as FAILED,
- all the improvements towards this provider will be withdrawn from the routing table,
- new improvements towards this provider will not be made.
If for some reason (e.g. the provider’s interface goes down) the next hop of the Policy Based Routing rule does not exist in the routing table, packet forwarding may fall back to the default route. In that case the External BGP Monitor will report a false-positive state. To avoid this by configuring PBR properly, please consult “Specific PBR configuration scenarios”.
The External BGP Monitor status does not depend on the state of the BGP session(s) between the edge router and the provider (which is monitored by the Internal BGP Monitor). Therefore, if the BGP session with one of the providers goes down, the External Monitor still shows an OK state, which remains unchanged as long as packets are successfully routed towards the monitored destination.
We recommend adding at least two remote IP addresses in order to prevent false-positive alerts.
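The External Monitor's decision rule reduces to "OK if at least one monitored address answers". A minimal sketch, assuming a caller-supplied is_reachable probe (a hypothetical stand-in for the ICMP/UDP ping sent through the monitored provider); the addresses in the test data come from the documentation ranges and are placeholders only:

```python
from typing import Callable, Iterable


def external_monitor_status(
    targets: Iterable[str],
    is_reachable: Callable[[str], bool],
) -> str:
    """Hypothetical helper: OK if any monitored address replies."""
    # any() stops at the first reachable target, so a single unresponsive
    # address does not fail the provider when a second one still answers.
    return "OK" if any(is_reachable(ip) for ip in targets) else "FAILED"
```

This also shows why two or more monitored addresses help: with a single target, one unresponsive remote host is enough to mark the provider as FAILED.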
When both BGP monitors are enabled (peer.X.mon.ipv4.internal.state, peer.X.mon.ipv4.external.state), they work in conjunction with each other. If either of them fails, the provider is declared FAILED and IRP reacts as described above. The BGP monitors’ statuses are displayed on the system dashboard as shown in the screenshot below.
For details refer to:
peer.X.mon.snmp
peer.X.ipv4.mon
peer.X.ipv6.mon
peer.X.mon.ipv4.bgp_peer
peer.X.mon.ipv6.bgp_peer
1.2.5.3 BMP monitoring station
A BMP monitoring station is included in IRP starting with version 3.9. It implements the monitoring station specified in RFC 7854, BGP Monitoring Protocol (BMP). The BMP monitoring station requires the monitored router to send, over BMP, the detailed routing information it receives from its neighbors.
The BMP monitoring station exposes detailed routing data to other IRP components so that better and timelier decisions are made, for example:
- BMP lists both active and inactive routes advertised by peers on an Internet Exchange. IRP uses this additional information to evaluate and identify the best candidate peers at all times. Without BMP data, IRP knows about active routes only, each of which points to a single peer on the IX while all the alternatives remain hidden.
- Route changes, even for inactive routes, are visible via BMP. This gives IRP the opportunity to revisit previously made probes and improvements not only at predefined re-probing intervals but also when route changes are detected for either active or inactive routes.
- Prefix monitors for IX improvements consume significant router CPU resources to service the SNMP requests that traverse the router’s relevant OIDs. Moreover, this information is at times inaccurate and vendor dependent. When BMP data is available, IRP uses it to determine whether IX peers still advertise the routes and no longer makes SNMP requests for those prefixes, significantly reducing CPU overhead, especially on routers serving very large IXes.
- IRP reconstructs the AS Path for candidate providers in order to make accurate iBGP announcements of improvements. Unfortunately, network configuration practices can cause errors when AS Paths are reconstructed using traceroute. BMP data makes this reconstruction unnecessary and the result more accurate, as the AS Path attribute can be retrieved from the actual (inactive) routes received from neighbors.
- Improvements can be revisited on AS Path changes. Both new and old provider AS Path attributes are monitored via BMP for changes. When changes are detected, IRP re-probes the prefix to ensure the network uses the best available route. Note that re-probing can be triggered on any AS Path change or only on major ones, that is, when the AS Path traverses a different set of autonomous systems.
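The distinction between any AS Path change and a major one (a different set of autonomous systems) can be sketched as a simple comparison. The function name and the "none"/"minor"/"major" labels are hypothetical and only illustrate the rule stated in the last item above; the AS numbers in the test data are taken from the range reserved for documentation.

```python
from typing import Sequence


def classify_as_path_change(old_path: Sequence[int], new_path: Sequence[int]) -> str:
    """Hypothetical classifier for an AS Path change."""
    if list(old_path) == list(new_path):
        return "none"
    if set(old_path) != set(new_path):
        # The path now traverses a different set of autonomous systems.
        return "major"
    # Same set of ASes, e.g. prepending added or removed, or order changed.
    return "minor"
```

Under this sketch, AS Path prepending counts as a minor change because the set of traversed autonomous systems stays the same, while replacing a transit AS counts as major.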
Passing BMP data to IRP offers many potential benefits. To take advantage of them, the monitored router must support BMP as well. Configuration is performed entirely on the monitored router by pointing it to the IP address and port of the IRP BMP monitoring station. The monitored router establishes the TCP connection and sends the data, while the IRP BMP monitoring station continuously listens for and accepts fresh routing data as it arrives.
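To give a sense of what the monitoring station receives on that TCP connection, the sketch below parses the 6-byte BMP common header defined in RFC 7854 (version, message length, message type). It is an illustration only and says nothing about how IRP's station is implemented; the function name is hypothetical.

```python
import struct

BMP_VERSION = 3
# Common header per RFC 7854: version (1 byte), length (4 bytes), type (1 byte).
COMMON_HEADER = struct.Struct("!BIB")

MESSAGE_TYPES = {
    0: "Route Monitoring",
    1: "Statistics Report",
    2: "Peer Down Notification",
    3: "Peer Up Notification",
    4: "Initiation Message",
    5: "Termination Message",
    6: "Route Mirroring Message",
}


def parse_common_header(data: bytes):
    """Return (total_message_length, message_type_name) for one BMP message."""
    if len(data) < COMMON_HEADER.size:
        raise ValueError("need at least 6 bytes for the BMP common header")
    version, length, msg_type = COMMON_HEADER.unpack_from(data)
    if version != BMP_VERSION:
        raise ValueError(f"unsupported BMP version {version}")
    if length < COMMON_HEADER.size:
        raise ValueError("message length is smaller than the header itself")
    # The length field covers the whole message, header included.
    return length, MESSAGE_TYPES.get(msg_type, f"Unknown ({msg_type})")
```

A receiver would read 6 bytes from the socket, call this function, and then read the remaining length minus 6 bytes to obtain the message body.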



