IRP Installation and Configuration Guide

1 Introduction

1.1 How to contact support

If you encounter any problems while using or setting up the Intelligent Routing Platform, please contact us in one of the following ways:

1.2 What is IRP

BGP is a fundamental technology for the fault tolerance of the Internet; it chooses network paths based on the number of hops traffic must traverse before it reaches its destination. However, BGP does not take important network performance factors into account. Even though multi-homing does provide some redundancy, multiple network outages have shown that multi-homing alone is not the solution for risk diversity and business continuity. When major blackouts or even congestion occur, multi-homing gives a fallback link for the “first-mile” connection, rather than providing a way to route around Internet “middle-mile” issues.
Noction Intelligent Routing Platform (IRP) is a product developed by Noction to help businesses optimize their multi-homed network infrastructure. The platform sits in the network and receives a copy of the traffic from the edge routers. The system passively analyzes it for specific TCP anomalies and actively probes remote destination networks for metrics such as latency, packet loss, throughput, and historical reliability. It computes a performance- or cost-improvement traffic engineering policy and applies the improved routes by announcing them to the network's edge routers over a traditional BGP session.
Noction IRP is a complete network monitoring and troubleshooting solution that facilitates the detection, diagnosis, and automatic resolution of performance issues. It delivers real-time views and dashboards that let you visually track network performance, and it generates triggers, alerts and notifications when specific problems occur.

1.2.1 IRP Features

The Intelligent Routing Platform is designed to help Service Providers improve the performance and reduce the costs of running a multi-homed BGP network. The system makes intelligent routing decisions by analyzing various network performance metrics and selecting the best-performing route for the traffic to pass through. As a result, Noction IRP allows you to:
  • Improve overall network performance
  • Route around congestion and outages
  • Decrease network downtime
  • Reduce latency and packet loss
  • Get comprehensive network performance analytics
  • Facilitate network troubleshooting
  • Decrease network operational costs
  • Monitor platform performance
  • Reduce the risk of human errors during BGP configuration

1.2.1.1 IRP Lite limitations

IRP Lite is designed as a promotional and educational tool that may also be useful for optimizing some smaller networks that are not encumbered by IRP Lite limitations. IRP Lite use is governed by Terms of Service that further constrain how IRP Lite can be used.
The high-level list of technical limitations in IRP Lite includes:
IRP Lite periodically sends event notifications to a designated @noction.com email address. The data is collected for statistics purposes and is used to enhance this and other Noction services like outage detection and confirmation and Internet health reports.

1.2.2 IRP Components

The IRP platform consists of a few interconnected components (see figure 1.1 on page 1↓) that work together to improve routing best-path selection and to eliminate various common network issues.
A short description of each component is given below; detailed information is available in the following chapters.
The Core is the most important part of the system. It runs all the logical operations and interconnects all the components. It handles the performance and cost improvements. It processes, stores and exports data.
The Collector receives, analyzes and processes all the traffic passing the network. It has two ways to gather data about the network: by mirroring network traffic or by using NetFlow/sFlow.
The Explorer runs all the probes and checks the metrics specified by the platform policies, such as packet loss and latency. This information is sent back to the Core.
To inject the Improvements into the network, the platform needs to ‘tell’ the routers what exactly needs to be changed in the routing table. The IRP BGP daemon announces the improved prefix with an updated next-hop for the traffic to start flowing through the new path.
The Frontend represents a web interface with a comprehensive set of reports, graphs and diagnostic information which can reflect the current and historical network state, as well as the benefits of IRP network optimization.

1.2.3 IRP Technical Requirements

To plan the IRP deployment in your network, a series of requirements must be met and specific information must be gathered to configure IRP.
Figure 1.2.1: IRP Components

1.2.3.1 Hardware requirements

In production, a dedicated server for each IRP instance is strongly recommended. The system can also be deployed on a virtual machine with matching specifications, provided that hardware-assisted or paravirtualization (Xen, KVM, VMware) is used. OS-level virtualization (OpenVZ/Virtuozzo or similar) is not supported.
  1. CPU
    • Recommended Intel® Xeon® Processor E3/E5 family, for example:
      • 1x Intel® Xeon® Processor E3 family for up to 20 Gbps traffic;
      • 1x Intel® Xeon® Processor E5 family for 40 Gbps or more traffic.
  2. RAM
    • if providing sFlow/NetFlow data: at least 16 GB, 32 GB recommended;
    • if providing raw traffic data by port mirroring:
      • minimum 16 GB for up to 10 Gbps traffic;
      • minimum 32 GB for 40 Gbps traffic.
  3. HDD
    • At least 160 GB of storage. Allocate at least 100 GB to the /var partition. SAS disks are recommended for production environments. An LVM setup is preferable.
  4. NIC
    • if providing sFlow/NetFlow data - at least one 1000 Mbps NIC; two NICs are recommended (one will be dedicated to management purposes).
    • if providing raw traffic data by port mirroring - additional 10G interfaces are required for each of the configured SPAN ports (Myricom 10G network cards with a Sniffer10G license are recommended for high-pps networks). When configuring multiple SPAN ports, the same number of additional CPU cores is needed to analyze the traffic.
In very large networks carrying hundreds of Gbps of traffic, and in IRP configurations with very aggressive optimization settings, configurations with 2 or 4 CPUs are recommended. The IRP servers in these cases should also have double the recommended RAM and use SSD storage.
Noction can size, set up and ship an appliance conforming to your needs. The appliance is delivered with the OS installed (latest CentOS 7) and the IRP software deployed.

A different supported OS can be installed on customer request.

1.2.3.2 Software requirements

A clean Linux system is required, with the latest CentOS 7 or Ubuntu Server LTS x86_64 version installed on the server.

IRP depends on a MySQL/MariaDB server and expects the latest version from the official OS repositories. If the DBMS has been installed from a different repository, it is strongly advised that the database instance and its configuration be purged before proceeding with the IRP installation. IRP requires root access to the local database instance during the first installation. If root access cannot be given, use the statements below to grant all necessary privileges to the 'irp' user and database:
GRANT ALL ON $dbdbname.* TO '$dbusername'@'$dbourhost' IDENTIFIED BY '$dbpassword' WITH GRANT OPTION
GRANT SELECT ON $dbdbname_fe.* TO '$dbusername_fe'@'$dbourhost' IDENTIFIED BY '$dbpassword_fe'
where $dbdbname (4.1.1↓), $dbusername (4.1.1↓), $dbpassword (4.1.1↓) and $dbourhost (4.1.1↓) are the corresponding parameters from /etc/noction/db.global.conf, and
$dbdbname_fe and $dbusername_fe are from /etc/noction/db.frontend.conf
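For illustration only, the two statements can be generated with a small shell snippet and piped to the mysql client. The values below are placeholders; the real ones are the parameters from /etc/noction/db.global.conf and /etc/noction/db.frontend.conf.

```shell
# Placeholder values; substitute the real parameters from
# /etc/noction/db.global.conf and /etc/noction/db.frontend.conf.
dbname=irp;       dbuser=irp;       dbpass='change-me'
dbname_fe=irp_fe; dbuser_fe=irp_fe; dbpass_fe='change-me-too'
dbhost=localhost

# Emit the two GRANT statements; apply them with: echo "$sql" | mysql -u root -p
sql=$(cat <<EOF
GRANT ALL ON ${dbname}.* TO '${dbuser}'@'${dbhost}' IDENTIFIED BY '${dbpass}' WITH GRANT OPTION;
GRANT SELECT ON ${dbname_fe}.* TO '${dbuser_fe}'@'${dbhost}' IDENTIFIED BY '${dbpass_fe}';
EOF
)
echo "$sql"
```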

IRP’s Frontend is built on PHP 5.6 and the Symfony 2.8 framework. When these are not part of the mainstream OS, they are retrieved from Noction’s repositories.

1.2.3.3 Network-related information and configuration

IRP is designed to help Service Providers (AS) optimize a multi-homed BGP network. This implies the basic prerequisites for using IRP:
  • Ownership of the AS for the network where IRP is deployed,
  • BGP protocol is used for routing and,
  • Network is multi-homed.
Eventually the following needs to be performed in order to deploy and configure IRP:
  1. Prepare a network diagram that includes all the horizontal (own), upstream (provider) and downstream (customer) routers. Check whether your network topology is logically similar to one or more of the samples listed in section Collector Configuration↓, for example Flow export configuration↓.
  2. Identify the list of prefixes announced by your AS that must be analyzed and optimized by IRP.
  3. Review the output of commands below (or similar) from all Edge Routers:
    • sh ip bgp summary
    • sh ip bgp neighbor [neighbor-address] received-routes
    • sh run (or similar)
    The settings related to the BGP configuration, the prefixes announced by your ASN, the route maps, routing policies, access control lists, and the sFlow/NetFlow and related interface configurations are used to set up similar IRP settings or to determine which settings do not conflict with existing network policies.
  4. Provide traffic data by:
    (a) sFlow, NetFlow (v1, 5, 9) or jFlow, sent to the main server IP. Make sure the IRP server gets both inbound and outbound traffic information. Egress flow accounting should be enabled on the provider links or, if this is not technically possible, ingress flow accounting should be enabled on all the interfaces facing the internal network. NetFlow is most suitable for high traffic volumes, or for a sophisticated network infrastructure where port mirroring is not technically possible. Recommended sampling rates:
      i. For traffic up to 1 Gbps: 1024
      ii. For traffic up to 10 Gbps: 2048
    (b) Or: configure port mirroring (a partial traffic copy will suffice). In this case, additional network interfaces on the server will be required - one for each mirrored port.
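As an illustration, a minimal Cisco IOS-style NetFlow v9 export configuration might look like the sketch below. The IRP server address (192.0.2.10), the export port (2055) and the interface names are assumptions; verify the port against the collector's configured listening port and adapt the sampling setup to your platform.

```
! Sketch only: Cisco IOS-style NetFlow v9 export toward the IRP server
ip flow-export version 9
ip flow-export destination 192.0.2.10 2055
ip flow-export source Loopback0
!
interface TenGigabitEthernet0/0
 description Provider-1 uplink
 ip flow ingress
 ip flow egress    ! egress flow accounting on the provider link
```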
  5. Setup Policy Based Routing (PBR) for IRP active probing.
    • Apart from the main server IP, add an additional alias IP for each provider and configure PBR so that traffic originating from each of these IPs is routed over a different provider.
    • No route maps should be enforced for the main server IP; traffic originating from it should pass the routers using the default routes.
    • Define the per-provider PBR IP routing map
    In specific complex scenarios, traffic from the IRP server has to pass multiple routers before getting to the provider. If a separate probing VLAN cannot be configured across all routers, GRE tunnels from IRP to the edge routers should be configured. The tunnels are mainly used to prevent additional overhead from route maps configured along the whole IRP-Edge routers path.
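For illustration, a Cisco IOS-style PBR sketch for two providers could look as follows; the probing alias IPs (192.0.2.11, 192.0.2.12), next-hop addresses and interface names are placeholders to adapt to your network.

```
! Sketch only: route probing traffic from each alias IP via its provider
access-list 11 permit host 192.0.2.11        ! alias IP for provider 1
access-list 12 permit host 192.0.2.12        ! alias IP for provider 2
!
route-map IRP-PBR permit 10
 match ip address 11
 set ip next-hop 198.51.100.1                ! provider 1 next-hop
route-map IRP-PBR permit 20
 match ip address 12
 set ip next-hop 203.0.113.1                 ! provider 2 next-hop
!
interface GigabitEthernet0/1                 ! interface facing the IRP server
 ip policy route-map IRP-PBR
! The main server IP matches no entry, so its traffic follows default routes.
```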
If the network has Flowspec capabilities, Flowspec policies can alternatively be used instead of PBR. Refer, for example, to Flowspec policies↓ and global.flowspec.pbr↓.
  6. Configure SNMP for each provider link and provide the following information:
    • SNMP interface name (or ifIndex)
    • SNMP IP (usually the router IP)
    • SNMP community
    This information is required for the report generation, Commit Control decision-making and prevention of overloading a specific provider with an excessive number of improvements.
The above applies to SNMP v2c. If SNMP v3 is used, further details will be required depending on the security services used.
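The SNMP details can be sanity-checked from the IRP server with the net-snmp command-line tools. The router IP (192.0.2.1), the community (public) and the ifIndex (5) below are placeholders, and the commands naturally require a reachable router:

```
# Find the ifIndex of each provider-facing interface by name:
snmpwalk -v2c -c public 192.0.2.1 IF-MIB::ifDescr
# Read the traffic counters for, e.g., ifIndex 5:
snmpget -v2c -c public 192.0.2.1 IF-MIB::ifHCInOctets.5 IF-MIB::ifHCOutOctets.5
```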
  7. To set up the cost-related settings as well as the Commit Control mechanism, provide the maximum allowed interface throughput for each provider link, as well as the cost per Mbps for each provider.


1.2.4 IRP Operating modes

The IRP platform can operate in two modes, which can be used at different stages of the deployment process. During the initial installation and configuration, it is recommended that the system not inject any improvements into the network until the configuration is completed.
After running several route propagation tests, the system can be switched to the full Intrusive mode.

1.2.4.1 Non-intrusive mode

While running in this mode, the system will not actually advertise any improvement to the network, and will only reflect the network improvements and events in the platform reports and graphs.

1.2.4.2 Intrusive mode

After the system configuration is completed, and manual route propagation tests were performed in order to ensure that the edge routers behavior is correct, the system can be switched to Intrusive mode. While running in this mode, the system injects all the computed improvements into the edge router(s) routing tables, allowing the traffic to flow through the best performing route.

1.2.4.3 Going Intrusive

While IRP operates in non-intrusive mode, it highlights the potential improvements within the client’s environment. Going Intrusive realizes that potential.
The difference between the Intrusive and Non-intrusive operating modes is that in Intrusive mode IRP advertises improvements to the edge routers. Switching to Intrusive follows a controlled process, whose highlights are as follows:

  1. The optimizing component of IRP (Core) is taken offline and existing improvements are purged. The Core being offline guarantees IRP will not automatically insert new improvements into its Current Improvement table and hinder the Go Intrusive process.
    Listing 1.1: Stop IRP Core and purge existing improvements
    root@server ~ $ service core stop
     root@server ~ $ mysql irp -e 'delete from improvements;'
    
  2. Enable Intrusive Mode and adjust IRP Core and BGPd parameters as follows:
    Listing 1.2: Switch to Intrusive Mode and adjust IRP Core and BGPd parameters
    root@server ~ $ nano /etc/noction/irp.conf
    global.nonintrusive_bgp = 0
    core.improvements.max = 100
    bgpd.improvements.remove.next_hop_eq = 0
    bgpd.improvements.remove.withdraw = 0
    
  3. Improvements towards test networks are introduced manually so that the client’s traffic is not affected. The improvements are chosen so that they cover all of the client’s providers. Any public networks can be used for test purposes; preferably, your network should carry no traffic towards the chosen test networks, so that no real traffic is re-routed. Use the template below to insert the test improvements:
    Listing 1.3: Inserting test improvements
    mysql> insert into improvements 
      (ni_bgp, prefix, peer_new, ipv6, asn) 
      values 
        (0, '10.10.10.0/24', 1, 0, 48232),
        (0, '10.10.11.0/24', 2, 0, 48232),
        (0, '10.10.12.0/24', 3, 0, 48232);
  4. Make sure that 'route-reflector-client' is set for the IRP BGP session.
  5. Make sure that 'next-hop-self' is not configured for the IRP BGP session.
  6. On iBGP sessions (between edge routers and route-reflectors; except the session with IRP) where 'next-hop-self' is configured, the following route-map should be applied:
    Listing 1.4: Remove next-hop-self route-map (RM-NHS) example
    route-map RM-NHS
    set ip next-hop peer-address
    neighbor X.X.X.X route-map out RM-NHS
    
    where X.X.X.X is the iBGP neighbor
    The route-map contents should be integrated into the existing route-map if another route-map is already configured on the iBGP session.
  7. Use the commands below to restart IRP BGPd so that it uses the actual IRP configuration and establishes the BGP session(s); then verify that BGP updates are being announced:
    Listing 1.5: Restart IRP BGPd
    root@server ~ $ service bgpd restart
     root@server ~ $ tail -f /var/log/irp/bgpd.log
    Wait for the following lines for each BGP session:
    NOTICE: Adding peer X
    NOTICE: BGP session established X
    INFO: N update(s) were sent to peer X
    
    where X is the router name and N is the number of the updates sent towards the X router.
  8. Verify that the IRP BGP announcements are properly propagated across the whole network. Run the following commands on each router (the commands vary depending on the router brand):
    Listing 1.6: Show BGP information for specified IP address or prefix
    show ip bgp 10.10.10.1
    show ip bgp 10.10.11.1
    show ip bgp 10.10.12.1
    
    Analyze the output from all the routers. If the IRP BGP announcements are properly propagated, you should see /25 announcements (refer to 4.1.4.31↓) and the next-hop for each announcement should be the improved provider’s next-hop:
    10.10.10.1 - provider 1 next-hop
    10.10.11.1 - provider 2 next-hop
    10.10.12.1 - provider 3 next-hop
    (refer to 4.1.12.25↓, 4.1.12.33↓, 4.1.15.3↓).

    Run the following commands in order to check if IRP improvements are announced and applied:
    Listing 1.7: Traceroute destination networks
    root@server ~ $ traceroute -nn 10.10.10.1
    root@server ~ $ traceroute -nn 10.10.11.1
    root@server ~ $ traceroute -nn 10.10.12.1
    
    Again, you should see corresponding providers’ next-hops in the traces.
  9. If the tests are successful, perform the steps below:
    (a)
    Delete test improvements
    Listing 1.8: Delete test improvements
     root@server ~ $ mysql irp -e "delete from improvements where prefix like '10.10.1%';"

    (b)
    Configure at most 100 improvements and revert BGPd configuration
    Listing 1.9: Configure at most 100 improvements and revert BGPd configuration
    root@server ~ $ nano /etc/noction/irp.conf
    core.improvements.max = 100
    bgpd.improvements.remove.next_hop_eq = 1
    bgpd.improvements.remove.withdraw = 1
    

    (c)
    Restart IRP Core and BGPd
    Listing 1.10: Restart IRP Core and BGPd
    root@server ~ $ service core restart
    root@server ~ $ service bgpd restart
    
  10. If everything goes well, the maximum number of announced improvements is increased to 1000 after 1-2 hours, and to 10000 after 24 hours.
As a rollback plan, revert the changes and switch the system to non-intrusive mode:
  1. Delete test improvements
    Listing 1.11: Delete test improvements
     root@server ~ $ mysql irp -e "delete from improvements where prefix like '10.10.1%';"
    
  2. Switch the system to non-intrusive mode
    Listing 1.12: Switch the system to non-intrusive mode
    root@server ~ $ nano /etc/noction/irp.conf
    global.nonintrusive_bgp = 1
    
  3. Restart IRP Core and BGPd
    Listing 1.13: Restart IRP Core and BGPd
    root@server ~ $ service core restart
    root@server ~ $ service bgpd restart
    

1.2.5 BGP Monitoring

IRP uses two types of BGP monitors and a BMP monitoring station to collect data, diagnose and report mainly the state of the BGP sessions between the edge routers and the providers, as well as network reachability through a specific provider. The information provided by the monitors enables IRP to avoid announcing routing updates that would result in traffic misrouting (for example, by sending improvements to a failed provider) and also better informs IRP probing and improvement decisions.

1.2.5.1 Internal monitor

The Internal BGP Monitor checks the state of the Edge Router → Provider BGP session by regularly polling the router via SNMP. The SNMP query returns variables describing the session status, which are used by IRP’s Internal BGP Monitor. If the session between the edge router and the provider is down, SNMP returns a value representing session failure and IRP reacts as follows:
  • the provider will be marked as FAILED,
  • all the improvements towards this provider will be withdrawn from the routing tables to avoid creating black holes,
  • new improvements towards this provider will not be made.
In some cases (e.g. a DDoS attack or other factors causing router CPU over-usage) there may be no response to the SNMP queries at all. In this case a timeout status is reported to the Internal Monitor and a 30-minute timer (called the longhold timer) (bgpd.mon.longholdtime↓) is started. During this time the monitor sends ICMP/UDP ping requests toward the configured provider’s next-hop IP address (peer.X.ipv4.next_hop↓ or peer.X.ipv6.next_hop↓). The requests are sent once per keepalive period, a parameter adjustable in the BGP daemon configuration interface (bgpd.mon.keepalive↓). If the next-hop stops responding to these requests, a 30-second timer (called the hold timer) (bgpd.mon.holdtime↓) is started. If, according to the ping responses, the session is re-established during this time, the hold timer is discarded while the longhold timer continues. If either timer expires, the provider is switched to a FAIL state and all the improvements towards this provider are withdrawn from the routing table. However, once the BGP session with the provider is re-established, the system starts rerouting traffic to this provider again.
When the BGPd is started, the monitors are initialized and one SNMP query is sent towards each router, in order to check the status of the BGP sessions with providers. If there is no reply, the Internal Monitor will send up to two more SNMP requests, separated by a keepalive interval.

Internal monitors for Internet Exchange peering partners are not initialized until an improvement is made towards the partner. When running in non-intrusive mode, internal monitors for IX peers are not initialized at all.
If none of the SNMP queries returned a status confirming that the sessions with the providers are up, the provider will be assigned a FAIL status and the Internal Monitor will continue the periodic SNMP polling (every 60 seconds) to recheck the providers’ session status.
Then, the BGP sessions with the edge routers are initialized and BGPd starts retrieving the routing table from the edge routers. While IRP retrieves the routing table, SNMP requests may time out due to high CPU usage on the edge routers.
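For reference, the session state that the Internal Monitor polls can also be inspected manually via the standard BGP4-MIB; the router IP (192.0.2.1), community (public) and peer address (198.51.100.1) below are placeholders:

```
# bgpPeerState is indexed by the remote peer address; established(6) is healthy
snmpget -v2c -c public 192.0.2.1 BGP4-MIB::bgpPeerState.198.51.100.1
```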

1.2.5.2 External monitor

The External BGP Monitor analyzes network reachability through a specific provider. It performs ICMP/UDP ping requests towards the configured remote IP address(es) (peer.X.ipv4.mon↓ or peer.X.ipv6.mon↓) through the monitored provider. If any of the configured IP addresses are accessible, the monitor is marked as OK. If the monitored remote IP addresses do not reply through the examined provider, IRP will react as follows:
  • the provider will be marked as FAILED,
  • all the improvements towards this provider will be withdrawn from the routing table,
  • new improvements towards this provider will not be made.
If for some reason (e.g. when the provider’s interface goes down) the next-hop of the Policy Based Routing rule does not exist in the routing table, packet forwarding may fall back to the default route. In that case, the External BGP Monitor will return a false-positive state. To avoid this by properly configuring PBR, please consult “Specific PBR configuration scenarios” (Specific PBR configuration scenarios↓).
The External BGP Monitor status does not depend on the state of the BGP session(s) between the edge router and the provider (which is monitored by the Internal BGP Monitor). Therefore, in the case that the BGP session with one of the providers goes down, the External Monitor still shows an OK state which will remain unchanged as long as the packets are successfully routed towards the monitored destination.
We do recommend adding at least two remote IP addresses, in order to prevent false-positive alerts.
When both BGP monitors are enabled (peer.X.mon.enabled↓), they function in conjunction with each other. If any of them fails, the provider will be declared as FAILED and IRP will react as described above. The BGP monitors’ statuses are displayed on the system dashboard as shown in the screenshot below.
Figure 1.2.2: System Dashboard

Starting with version 1.8.5, IRP requires at least the Internal Monitor to be configured. Otherwise, the system Frontend will return an error as shown below.
Figure 1.2.3: Error: Internal BGP Monitor not configured

1.2.5.3 BMP monitoring station

A BMP monitoring station is included in IRP starting with version 3.9. It implements the monitoring station specified in RFC 7854 BGP Monitoring Protocol (BMP). The BMP monitoring station requires a monitored router to communicate over BMP the detailed routing information received from neighbors.
The BMP monitoring station exposes detailed routing data to other IRP components so that better and timelier decisions are made, for example:
  • BMP lists both active and inactive routes advertised by peers on an Internet Exchange. The additional information is used by IRP to evaluate and identify the best candidate peers at all times. Without BMP data, IRP has knowledge of active routes only, which point to a single peer on the IX while all the alternatives are hidden.
  • route changes even for inactive routes are visible via BMP. This allows IRP the opportunity to revisit previously made probes and improvements not only at predefined re-probing intervals but also when route changes are detected for both active and inactive routes.
  • prefix monitors for IX improvements consume significant router CPU resources in order to service the SNMP requests traversing the router’s relevant OIDs. Moreover, this information is at times inaccurate and vendor-dependent. When BMP data is available, IRP uses this routing data to determine whether IX peers still advertise the routes and no longer makes the SNMP requests for those prefixes, thus significantly reducing the CPU overhead, especially on routers servicing very large IXes.
  • IRP reconstructs the AS Path for candidate providers in order to make accurate iBGP announcements of improvements. Unfortunately, network configuration practices may cause errors during reconstruction of AS Paths using traceroute. BMP data makes traceroute-based reconstruction of the AS Path unnecessary and the result more accurate, as this BGP attribute can be retrieved from the actual (inactive) routes received from neighbors.
  • improvements can be re-visited on AS Path changes. Both new and old provider AS Path attributes are monitored via BMP for changes. When changes are detected IRP re-probes the prefix to ensure the network uses the best available route. Note that re-probing can be triggered on any AS Path changes or only on major ones - when AS Path traverses a different set of autonomous systems.
The possible benefits of passing BMP data to IRP are many. To obtain them, the monitored router must support BMP too. Configuration is performed entirely on the monitored router by pointing it to the IRP BMP monitoring station’s IP address and port. The monitored router establishes the TCP connection and communicates the data, while the IRP BMP monitoring station continuously listens and accepts fresh routing data as it arrives.

As per the BMP RFC requirements, the IRP BMP monitoring station never attempts to establish BMP or any other connections with the monitored router, leaving the full scope of decisions regarding when and whether BMP data is communicated to the network’s responsibility.

1.2.6 Outage detection

A complete traffic path from source to destination typically passes through multiple networks with different AS numbers. This is reflected in the traceroute results. If Outage detection is enabled in the IRP configuration, the system gathers network performance information for all traceroute hops over which the traffic passes to the remote networks. Next, each hop is translated into an AS number. If any network anomalies are detected on a specific ASN, this ASN and the immediate neighbor ASN are declared a problematic AS-pattern. The system then re-probes the prefixes that pass through this AS-pattern. If the issue is confirmed, all related prefixes are rerouted to the best performing alternate provider.
Outage detection uses a statistical algorithm for selecting the best routing path. Rerouting will occur for all the prefixes that are routed through the affected AS-pattern, regardless of their current route.
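The hop-to-AS-pattern translation described above can be sketched in a few lines of shell; the per-hop ASN list below is made up for illustration only.

```shell
# Hypothetical per-hop ASN list obtained from a traceroute (made-up ASNs):
hops="174 174 3356 3356 3356 2914 2914 64500"

# Collapse consecutive duplicates to get the AS path the traffic traverses.
path=$(printf '%s\n' $hops | uniq | tr '\n' ' ')
echo "AS path: $path"

# Adjacent (ASN, neighbor-ASN) pairs are the candidate AS-patterns that
# outage detection re-probes when anomalies are seen on one of them.
pairs=$(echo $path | awk '{for (i = 1; i < NF; i++) print $i "-" $(i+1)}')
echo "$pairs"
```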

Several improvement-related reports need to indicate the original route for a specific prefix. This value is taken from the last probing results for the prefix, even if those results are outdated (but not older than 24h). Since the outage-affected prefixes are rerouted in bulk by AS-pattern, in some cases the reports can show the same provider for both the old and the new route.

1.2.7 VIP Improvements

VIP Improvements is a feature that allows manual specification of a list of prefixes or AS numbers that will be periodically probed by IRP and optimized in accordance with the probing results. This allows the system to monitor specific networks or Autonomous Systems without reference to the data provided by the IRP Collector.
Possible usage scenarios include, but are not limited to:
  • monitoring and optimizing traffic to commercial partners that should have access to your networks via the best performing routes
  • monitoring and optimizing traffic to your remote locations, operating as separate networks
  • monitoring and optimizing traffic to ASes that are known for frequent accessibility issues due to geographical or technical reasons
If a prefix is announced from multiple Autonomous Systems, you may see different ASNs in the Current improvements↓ report for the prefixes translated from an ASN.
IRP performs proactive monitoring (more frequent than regular probing) of the VIP prefixes/ASNs, which allows VIPs to be constantly improved.
For future reference, see:
core.vip.interval.probe↓

1.2.8 Retry Probing

Retry Probing is a feature that reconfirms the validity of initial and already reconfirmed improvements. The feature is applicable to all types of improvements made by the system (Performance, Cost and Commit Control improvements). Improvements that were made more than a retry-probing period ago (core.improvements.ttl.retry_probe↓) are sent to retry probing. If the probing results confirm that the current improvement is still valid, it stays in the system and its description is updated. Otherwise, it is removed with a log message as described further in this section.
During Retry Probing reconfirmation the improvement details will be updated in the following cases:
  • Performance and Cost improvements
    • An old provider has been removed from the system configuration.
      Example: “Old provider and performance metrics not known. New packet loss 55%, avg rtt 105 ms.”
  • Commit Control improvements
    • An old provider has been removed from the system configuration.
      Example: “Previous provider not known. Rerouted 1 Mbps to Peer5[5] (250 Mbps, 50%)”
    • An old provider’s bandwidth statistics are not available.
      Example: “Rerouted 6 Mbps from Peer1[1] to Peer5[5] (250 Mbps, 50%)”
    • A new provider’s bandwidth statistics are not available.
      Example: “Rerouted 6 Mbps from Peer1[1] (250 Mbps, 50%) to Peer5[5]”
    • The old and new providers’ bandwidth statistics are not available.
      Example: “Rerouted 6 Mbps from Peer1[1] to Peer5[5]”
Commit Control improvements are reconfirmed based on their average bandwidth usage (not on current bandwidth usage). This way, if performance characteristics allow it, the improvement is preserved even when current bandwidth usage is low but the average is still relevant, thus anticipating network usage cycles and reducing the number of route changes.
During Retry Probing reconfirmation the improvements will be removed from the system and the details will be logged into the log file (core.log↓) in the following cases:
  • The Commit Control feature has been disabled.
    Example: “Prefix 1.0.2.0/24 withdrawn from Improvements (Commit Control is disabled)“
  • The prefix traffic volume is less than the configured bandwidth limits (core.commit_control.agg_bw_min↓).
    Example: “Prefix 1.0.2.0/24 withdrawn from Improvements (low traffic volume, irrelevant for the Commit Control algorithm)”
  • The system has been switched from the Cost mode to the Performance mode (applied for cost improvements only).
    Example: “Prefix 1.0.2.0/24 withdrawn from Improvements (Performance Improvements mode)”
  • A prefix has been added to the ignored networks/ASN list.
    Example: “Prefix 1.0.2.0/24 withdrawn from Improvements (added to ignored networks/ASN)”
  • The improvement’s performance metrics are not the best ones anymore.
    Example: “Prefix 1.0.2.0/24 withdrawn from Improvements (Performance Improvement not actual anymore)”
  • The maximum number of improvements limits (core.improvements.max↓, core.improvements.max_ipv6↓) are exceeded.
    Example: “Prefix 1.0.2.0/24 withdrawn from Improvements (no more available Improvement slots)”
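The reconfirmation outcomes above can be summarized in a short decision sketch. This is an illustrative Python model only; the field names, thresholds and check ordering are assumptions, not IRP internals:

```python
# Illustrative model of the Retry Probing outcome; field names, thresholds
# and check ordering are assumptions, not IRP internals.

def retry_probe_outcome(imp, commit_control_enabled, mode,
                        agg_bw_min, max_improvements, current_count):
    """Return 'keep' or the withdrawal reason for a retried improvement."""
    if imp["type"] == "commit" and not commit_control_enabled:
        return "Commit Control is disabled"
    # Commit Control improvements are judged on *average* bandwidth,
    # anticipating usage cycles rather than reacting to momentary dips.
    if imp["type"] == "commit" and imp["avg_bw_mbps"] < agg_bw_min:
        return "low traffic volume, irrelevant for the Commit Control algorithm"
    if imp["type"] == "cost" and mode == "performance":
        return "Performance Improvements mode"
    if imp.get("ignored"):
        return "added to ignored networks/ASN"
    if imp["type"] == "performance" and not imp["still_best"]:
        return "Performance Improvement not actual anymore"
    if current_count > max_improvements:
        return "no more available Improvement slots"
    return "keep"

# A Commit Control improvement whose average bandwidth fell below the
# core.commit_control.agg_bw_min threshold is withdrawn.
imp = {"type": "commit", "avg_bw_mbps": 0.3}
print(retry_probe_outcome(imp, True, "cost", 1.0, 100, 10))
```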

1.2.9 Routing Policies

Routing Policies bring the capability of defining specific routing policies according to business objectives. This feature permits denying or allowing providers to be used for probing and reaching a specific prefix or ASN. It also makes it possible to set a static route through a particular provider. Within a policy you can choose between VIP or non-VIP (regular) probing mechanisms.

Policies can be configured for networks (ASN) or aggregate prefixes. These are unpacked into existing specific sub-prefixes from the routing table before being processed by IRP components.
In version 3.9 IRP added support for policies by Country. Note that IRP maps individual prefixes to a country and does not use transitive inference based on AS records. These prefix mappings allow IRP to work more accurately with large transcontinental ASes.
Starting with version 3.9 IRP introduces a priority attribute that defines what policy to choose in cases when different policies include the same unpacked specific prefix. The policy with the highest priority will apply for such a prefix.
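When several policies unpack to the same specific prefix, the priority attribute decides the winner. A minimal sketch of this selection rule (the data layout and names are illustrative):

```python
# Illustrative sketch: the policy with the highest priority wins when
# several policies unpack to the same specific prefix; pre-upgrade
# policies implicitly carry the lowest default priority of 0.

def effective_policy(policies, prefix):
    matching = [p for p in policies if prefix in p["prefixes"]]
    if not matching:
        return None
    return max(matching, key=lambda p: p.get("priority", 0))

policies = [
    {"name": "by-country", "prefixes": {"198.51.100.0/24"}},           # priority 0
    {"name": "by-asn", "prefixes": {"198.51.100.0/24"}, "priority": 5},
]
print(effective_policy(policies, "198.51.100.0/24")["name"])   # by-asn
```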

Note that after an upgrade all existing policies are implicitly assigned the lowest default priority of 0.
Below you can find some typical scenarios of Routing Policies usage. For instance, a specific prefix may consume a high volume of traffic while IRP redirects it through the most expensive provider due to performance considerations, even though two less costly providers are available. In this case, you could deny the expensive provider for the specific prefix, and IRP will choose the best-performing provider of the remaining two. This can be achieved by applying a Deny policy to the expensive provider or an Allow policy to the less costly providers.

The table below shows in which cases a specific policy can be applied, depending on the number of providers it targets.
No. of providers \ Policy                                     | Allow | Deny | Static
1 provider                                                    | No    | Yes  | Yes
2 providers or more (maximum: total number of providers - 1)  | Yes   | Yes  | No
All providers (VIP probing disabled)                          | No    | No   | No
All providers (VIP probing enabled)                           | Yes   | No   | No
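The applicability rules in the table can be transcribed directly into a small check, shown here as an illustrative Python sketch (function and argument names are assumptions):

```python
# The applicability table transcribed into a check (illustrative names).

def policy_applicable(policy, selected, total, vip_probing=False):
    """selected = number of providers the policy targets."""
    if selected == 1:
        return policy in ("Deny", "Static")
    if selected < total:                      # 2 or more, but not all
        return policy in ("Allow", "Deny")
    # all providers selected: only Allow, and only with VIP probing
    return policy == "Allow" and vip_probing

print(policy_applicable("Static", 1, 4))                     # True
print(policy_applicable("Allow", 4, 4, vip_probing=True))    # True
print(policy_applicable("Static", 2, 4))                     # False
```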

If a Static Route policy is applied to a prefix, VIP probing through each of the providers is unnecessary. Regular probing will suffice for detection of a provider failure that would trigger IRP to reroute the traffic to a different provider. Therefore IRP does not allow using VIP probing within a Static Route policy.

Routing policies are designed to control outgoing paths for destination networks outside your infrastructure. This feature should not be used to manipulate the behavior of your own infrastructure network.
Avoid setting up policies that point to the same prefix. When improvement by aggregate is enabled, multiple prefixes can point to the same aggregate, and this can cause unpredictable behaviour.
If a provider is added, removed, suspended or shut down, the routing policies are adjusted by IRP in the following way:
A new provider is added.
Policy Result
Allow all providers (VIP probing enabled) The new provider is automatically included into the configured policy and probed by the VIP probing mechanism.
Allow selected providers only The new provider is automatically ignored by the configured policy and not probed by the probing mechanism.
Deny The new provider is automatically included into the configured policy and probed by the selected probing mechanism.
Static Route The new provider is automatically ignored by the configured policy and probed by the Regular probing mechanism.

The provider under the policy is removed.
Policy Result
Allow all providers (VIP probing enabled) The provider is automatically removed from the configured policy and not probed by the VIP probing mechanism. If there is only one provider left, the policy is automatically deactivated.
Allow selected providers only The provider is automatically removed from the configured policy and not probed by the probing mechanism. If there is only one provider left, the policy is automatically deactivated.
Deny The provider is automatically removed from the configured policy and not probed by the probing mechanism. If there is no other provider under this policy, it is automatically deactivated.
Static Route The provider is automatically not probed by the probing mechanism. The policy is automatically deactivated.

The provider under the policy is suspended.
Policy Result
Allow all providers (VIP probing enabled) The provider is temporarily removed from the configured policy and not probed by the VIP probing mechanism. If there is only one provider left, the policy is temporarily deactivated.
Allow selected providers only The provider is temporarily removed from the configured policy and not probed by the default probing mechanism. If there is only one provider left, the policy is temporarily deactivated.
Deny The provider is temporarily removed from the configured policy and not probed by the default probing mechanism. If there is no other provider under this policy, it is temporarily deactivated.
Static Route The provider is temporarily not probed by the default probing mechanism. The policy is temporarily deactivated.

The provider under the policy is shut down.
Policy Result
Allow all providers (VIP probing enabled) The provider is temporarily removed from the configured policy and not probed by the VIP probing mechanism. If there is only one provider left, the policy is temporarily deactivated.
Allow selected providers only The provider is temporarily removed from the configured policy and not probed by the default probing mechanism. If there is only one provider left, the policy is temporarily deactivated.
Deny The provider is temporarily removed from the configured policy and not probed by the default probing mechanism. If there is no other provider under this policy, it is temporarily deactivated.
Static Route The provider is temporarily not probed by the default probing mechanism. The policy is temporarily deactivated.

Policies that target an AS can be cascaded. Cascading applies the same policy to the ASes downstream of the target AS, i.e. to the ASes that are transited by the target AS.

A typical use case of cascading is applying a policy to a remote AS that transits a few other ASes. Still, a cascading policy can cover a huge number of downstreams. This number is parameterized and can be set to values that best fit the customer’s needs. Refer to 4.1.4.15↓ for an example.
When multiple routing domains are configured, a policy can be configured to prevent global improvements. Refer to Optimization for multiple Routing Domains↓ for further details about routing domains.
Policies can be assigned a specific community, and all policy-based improvements will be marked with the designated value to allow their further manipulation on edge routers.
All Routing Policies are stored in the /etc/noction/policies.conf file, which is automatically generated by the system.

Do not alter the /etc/noction/policies.conf file manually, because any modifications will be overwritten by the system. Changes can only be performed from the Configuration -> Routing Policies section in the system Frontend.
If a Routing Policy contains a syntax or logical error, the rule will be ignored by the platform.
Please check Routing Policies settings↓ for a detailed description of Routing Policies parameters.

1.2.10 Support for Centralized Route Reflectors

Figure 1.2.4: Support for Centralized Route Reflectors

IRP gives the possibility to advertise routes into one or more route reflectors which subsequently advertise improvements into upper and/or lower routing layers (such as edge, core or distribution).
If iBGP sessions can’t be established between the IRP appliance and the edge routers, a route reflector is used. The following restrictions apply to such a solution:
  • The Next-Hop-Self option should not be used. If it is enabled, a direct iBGP session is required between IRP and each of the divergence points (Reflector1, Edge2). This restriction does not apply between reflectors and core/distribution layers.
  • Next-Hop addresses should be reachable (exist in the routing table) where next-hop-self is not applied (either static routes or an IGP is used).
  • An Internal Monitor should be configured to retrieve the eBGP session state from the device where the corresponding eBGP session is terminated. For example, ISP1 should be monitored on Reflector1, ISP2 on Edge1, and ISP3 and ISP4 on Edge2.
  • Injecting routes to reflector(s) can cause temporary routing loops.
In order to announce improvements into a route reflector, it should be configured as a BGP router in the “Configuration” → ”BGP and routers” section and assigned to all the related providers in the “Configuration” → ”Providers and Peers” section.

1.2.11 Support for Internet Exchanges

A transit provider can deliver traffic to any destination on the Internet. However, within an Internet Exchange, a peering partner gives access only to the set of prefixes originated or transiting its network. Therefore, when IRP evaluates the Exchange as a best path, it has to know the prefixes announced by each peer, to avoid inefficient probing of paths that cannot lead to the desired destination.
With this purpose, IRP gets the routing table from the edge router containing the list of IPs and the corresponding next-hop; this represents the next router’s IP address to which a packet is sent as it traverses a network on its journey to the final destination. IRP matches the prefix with the corresponding next-hop among the configured peers, allowing it to select for probing only those peers that have access to a specific prefix. This process is also performed in the case of a transit provider that gives access only to a limited set of prefixes, rather than the entire Internet.
Figure 1.2.5: IRP configuration in a multi-homed network connected to transit providers as well as an Internet Exchange

In the case of multiple transit providers, there is an additional IP alias added on the IRP platform for each provider. The edge router is configured in such a way that traffic originating from each of these IPs is routed over different providers. This is done with the help of Policy Based Routing (PBR) or Flowspec policies.
With PBR, a network engineer has the ability to dictate the routing behavior based on a number of different criteria other than the destination network. These PBR rules are applied to make sure that IRP probes are following the desired paths. However, when it comes to Internet Exchanges, configuring hundreds of IP aliases on the platform would result in inefficient IP address usage and an unmanageable setup.
To avoid this, a set of PBR rules is applied, ensuring that the probes to be sent through a specific provider originate from one of the configured IPs with a specific DSCP code assigned. DSCP (Differentiated Services Code Point) is a field in an IP packet that enables different levels of service to be assigned to network traffic. Since DSCP can take up to 64 different values, one configured IP can be associated with up to 64 peers. Although this mechanism considerably reduces the number of IP addresses required for aliases, configuring the PBR on the edge router as described above would still require substantial work.
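The arithmetic behind the savings is straightforward: with up to 64 DSCP values per source IP, the number of required aliases drops from one per peer to one per 64 peers. A quick illustrative calculation:

```python
import math

# With up to 64 DSCP values per source IP, the number of IP aliases
# needed to probe through N Exchange peers is ceil(N / 64) instead of N.

def aliases_needed(peers, dscp_values=64):
    return math.ceil(peers / dscp_values)

print(aliases_needed(300))   # 5 aliases instead of 300
```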
To solve this, IRP implemented a built-in PBR config-generator which provides the configuration code to be used for a specific router model. By running this generated set of commands, network administrators can easily configure the required PBR rules on the router.

1.2.12 Optimization for multiple Routing Domains

Overview
Some networks have multiple Points of Presence interconnected both internally via inter-datacenter links and externally via multiple transit providers. The diagram below depicts an example diagram with the available routes to one destination on the Internet.
IRP uses the concept of Routing Domains to separate the locations. A Routing Domain’s main characteristic is that its routing tables are mainly built on data received from its locally connected providers and the preferred routes are based on locally defined preferences.
The process of optimizing outbound network traffic in such a configuration is mainly to find better alternative routes locally (within a Routing Domain), and to reroute traffic to other Routing Domains via inter-datacenter links only when local routes are completely underperforming.
It must be noted that a multiple Routing Domain configuration works best if the Points of Presence are not too far apart (e.g. a network with POPs in San Francisco, Palo Alto and Danville is perfectly suitable under this scenario).
Figure 1.2.6: City wide network

POPs situated at larger distances, for example in Las Vegas and Salt Lake City, are still supported by a single IRP instance running in San Francisco.
Figure 1.2.7: Regional network

Intercontinental links for POPs in Tokyo and Melbourne are far too distant from the IRP instance in San Francisco, and in such a case multiple IRP instances are required.
Figure 1.2.8: Intercontinental network

Multiple Routing Domains implementation attributes
To further detail the multiple routing domain attributes the following diagram will be used:
Figure 1.2.9: Multiple routing domains

A multiple Routing Domain configuration has a series of attributes:
  • Multiple locations belonging to the same network (AS) represented in the diagram by POP SJC, POP SFO and POP LAX (of course, more than 3 routing domains are supported).
  • The locations are distinguished by the different Routing Domains within which they operate (depicted by RD SJC, RD SFO, and RD LAX)
  • The Routing Domains are managed by edge routers belonging to different locations
  • Nearby locations that process routing data differently should be split into different Routing Domains, even if they have the same upstream providers. In the diagram above RD SFO and RD SFO’ are depicted as part of a single Routing Domain. A decision to split them or keep them in the same routing domain should be made based on exact knowledge of how routing data is processed.
  • Inter-datacenter loop interconnects the different locations (depicted by idc1, idc2 and idc3 segments)
  • Data flows between locations take only the short path (in the example POP SJC can be reached from POP SFO via idc2 path (short) or idc3 + idc1 path (long))
  • Each Routing Domain has different providers and different preferred routes to reach a specific destination (a1, b1, c1)
  • A single IRP instance collects statistics about traffic (Irpflowd only), probes available destinations and makes improvements towards specific prefixes/networks on the Internet.
  • IRP assumes RTT of zero and unlimited capacity to route traffic within a Routing Domain
  • IRP assumes that sites are not physically too far away. It is acceptable to have different sites in the same city or region, as at this scale inter-datacenter links have predictable characteristics. For intercontinental links this is quite probably not the case.
  • Distances between sites (idc1, idc2, idc3 delays) are measured in advance and specified in IRP’s configuration.
Inter-datacenter link characteristics
Support for Multiple Routing Domains relies on the existence of inter-datacenter links. These links should be independent of upstream providers.
Examples of inter-datacenter links that multiple routing domains are designed for:
  • private connections,
  • L2 links with guaranteed service,
  • MPLS links
VPNs via the public Internet can be used with the Multiple Routing Domains feature, but this is a suboptimal choice. Under such conditions IRP MUST be prevented from making Global Improvements. This way IRP will perform only local optimizations in each Routing Domain and will operate similarly to multiple IRP instances (while probing excessively, because it probes destinations via remote locations too).
Constraints
At the moment IRP multiple Routing Domains implementation does not cover the following:
  • IRP does not take measurements of inter-datacenter link delays (idc1, idc2 and idc3). These values are configurable.
  • IRP does not monitor whether inter-datacenter links are operating normally. If such a link breaks, IRP is expected to lose BGP connectivity with the routing domain routers, which will cause IRP improvements to be withdrawn until the link is restored.
  • IRP does not try to detect if the traffic is following long or short paths on the inter-datacenter links. In the image above traffic from RD SJC can follow path idc1 (short) or idc2+idc3 (long). IRP always assumes the short path is being followed internally.
  • IRP does not take measurements of inter-datacenter link capacity and current bandwidth usage. At this stage IRP assumes there is enough inter-datacenter link capacity to also carry the (few) global improvements. Also, IRP tries to minimize usage of inter-datacenter links.
Routing domains
A routing domain is a generic term used to distinguish a logical location that works with different routing tables. The differences are caused by the fact that a router composes its routing table according to routes received from different providers. It is possible to have multiple routing domains in the same datacenter if routing data is received by different routers (even from the same or different sources) and data flows are distributed via different routers by different policies. In the image above, RD SFO and RD SFO’ can represent a single routing domain or multiple routing domains depending on what routing policies are applied.
Different routing domains are assigned identifiers in the range 1-100. The Routing Domain identifier is assigned individually to each provider via the parameter peer.X.rd↓. It must be noted that the routing domain hosting the IRP instance is considered the first routing domain (RD=1).
The parameter global.rd_rtt↓ gives the distances between routing domains. The format of the parameter is:

rda:rdb:rtt
For example, if RD SJC has Routing Domain id = 42, RD SFO has id 1 (since it hosts IRP) and RD LAX has id 3, then the idc1, idc2 and idc3 RTTs are defined as the collection:

global.rd_rtt = 3:42:20 42:1:17 1:3:35
This parameter is validated for correctness; besides the format above, it requires that the two routing domain identifiers in each entry are different and already configured (RD 1 is always present).
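The entry format and validation rules can be sketched as follows. This is an illustrative parser only, not IRP's actual validator:

```python
# Illustrative parser for global.rd_rtt entries of the form "rda:rdb:rtt";
# IRP performs its own validation, this only mirrors the rules described.

def parse_rd_rtt(value):
    rtts = {}
    for entry in value.split():
        rda, rdb, rtt = (int(x) for x in entry.split(":"))
        if rda == rdb:
            raise ValueError(f"routing domains must differ: {entry}")
        if not (1 <= rda <= 100 and 1 <= rdb <= 100):
            raise ValueError(f"routing domain id out of range: {entry}")
        rtts[frozenset((rda, rdb))] = rtt
    return rtts

rtts = parse_rd_rtt("3:42:20 42:1:17 1:3:35")
print(rtts[frozenset((1, 42))])   # 17 (idc2: RD SJC <-> RD SFO)
```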

Round trip time between one routing domain and another is calculated by executing PING towards the edge routers and taking the average integer value:
$ ping X -c 10 -q
PING X (X) 56(84) bytes of data.

--- X ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9085ms
rtt min/avg/max/mdev = 40.881/41.130/41.308/0.172 ms
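Extracting the integer average from ping's summary line can be sketched like this (illustrative only):

```python
import re

# Illustrative extraction of the integer average RTT from ping's summary
# line, as used to pre-measure inter-datacenter distances.

def avg_rtt_ms(ping_output):
    m = re.search(r"rtt min/avg/max/mdev = [\d.]+/([\d.]+)/", ping_output)
    return int(float(m.group(1)))

summary = "rtt min/avg/max/mdev = 40.881/41.130/41.308/0.172 ms"
print(avg_rtt_ms(summary))   # 41
```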
Flow agents
A natural constraint for Multiple Routing Domain networks is that IRP can rely only on Flow statistics - NetFlow or sFlow.

SPAN cannot be used because it does not carry attributes that distinguish traffic between different providers.
The Flow collector needs to know the exact details of such a configuration in order to correctly determine the overall provider volume and active flows. For this, each provider in an MRD setup must be assigned Flow agents to enable IRP to match Flow statistics accordingly. Refer to Flow agents↓ for further details.

Global and local improvements

Local improvements
Local improvements represent better alternative routes identified within a routing domain. If in the example image above the current routes are represented by black lines, then local improvements are depicted by the orange lines b2 and c2. Keep in mind that a1 merely reconfirms an existing current route, so no improvement is made in that case.
Local improvements are announced in their routing domains, which means that local traffic exits the customer’s network via local providers and the inter-datacenter interconnects are kept free of such traffic.
IRP prefers local routes and Improvements over Global Improvements.
The parameter bgpd.rd_local_mark↓ specifies a community marker that distinguishes local improvements from Global Improvements. A BGP speaker should not advertise these improvements outside its Routing Domain. Note that a single marker is used for all routing domains, and each of them must be configured to advertise local improvements within the domain and filter them out for inter-domain exchanges.
Local improvements must be stopped from propagating across routing domains; a route map is used for this. Sample route maps for Cisco IOS and JUNOS 9 are listed below.

Cisco IOS

Refer to your router’s capabilities in order to produce the correct route map. The route map MUST be integrated into existing route maps. It is not sufficient to simply append it.
! send-community should be configured for all iBGP sessions
neighbor <neighbor-from-another-RD> send-community
ip community-list standard CL-IRP permit 65535:1
route-map RM-IRP-RD deny 10
  match community CL-IRP
route-map RM-IRP-RD permit 20

router bgp AS
  neighbor <neighbor-from-another-RD> route-map RM-IRP-RD out

JUNOS 9

Refer to your router’s capabilities in order to produce the correct route map. The route map MUST be integrated into existing route maps. It is not sufficient to simply append it.
policy-options {
    policy-statement IRP-CL {
        term 0 {
            from {
                protocol bgp;
                community IRP-RD;
            }
            then reject;
        }
        term 1 {
            then accept;
        }
    }
    community IRP-RD members 65535:1;
}
protocols {
    bgp {
        group ebgp {
            type external;
            neighbor 10.0.0.1 {
                export IRP-CL;
            }
        }
    }
}

Global improvements
Global improvements are made when IRP identifies an alternative route in one routing domain that is better than the best local alternatives in all other routing domains, even after factoring in the latencies incurred by the inter-datacenter interconnects. Such an example is the alternative route c2 in the image above: c2 will become a global improvement if its loss characteristic is the best of all alternatives and its latency:
  • (c2+idc1 - margin) is better than best local alternative a1 in RD SJC
  • (c2+idc3 - margin) is better than best local alternative b2 in RD SFO
where:
  • a1, b2 and c2 represent roundtrip times determined by IRP during probing of a destination.
  • idc values are configurable and are set as one entry of global.rd_rtt↓ parameter.
  • margin is given by core.global.worst_ms↓.
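The two latency conditions above can be expressed as a small check, sketched here in Python with illustrative names and values:

```python
# Illustrative check of the global-improvement conditions: the candidate's
# RTT plus the inter-datacenter delay, minus the configured margin
# (core.global.worst_ms), must beat the best local alternative in every
# other routing domain. Names and numbers below are assumptions.

def is_global_improvement(candidate_rtt, idc_delay, local_best, margin):
    """idc_delay / local_best: dicts keyed by the other routing domains."""
    return all(candidate_rtt + idc_delay[rd] - margin < local_best[rd]
               for rd in local_best)

# c2 in RD LAX vs. a1 in RD SJC and b2 in RD SFO
print(is_global_improvement(
    candidate_rtt=30,
    idc_delay={"SJC": 20, "SFO": 35},
    local_best={"SJC": 80, "SFO": 90},
    margin=10))   # True: 40 < 80 and 55 < 90
```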
Global improvements can degrade performance in some routing domains. If performance for some routing domains degrades, IRP announcements for the global improvement also carry designated BGP community attributes set by rd.X.community_worsening↓.
Global improvements move traffic via inter-datacenter interconnects and as such are less desirable than local routes. Global improvements make sense as defined above, and even more so when packet loss is taken into consideration and routing via a different datacenter reduces packet loss significantly.

1.2.13 Improvements weight

IRP assigns each improvement a weight. The weight takes into consideration many network and environment aspects of the change, such as policy or VIP destinations, loss and latency differences, and whether it is a cost or commit control improvement. Based on all of the above, the improvement gathers more or less weight as appropriate.
Later on, instead of replacing the oldest improvements, which might still bring significant benefits, with new improvements just because they are fresh, IRP relies on the weights to decide whether the benefit of a new improvement is sufficient to replace an existing one. Besides preserving the most relevant improvements, this feature reduces route flapping by blocking the announcement of new improvements and the withdrawal of existing ones if the changes do not offer a good enough return.
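The weight-based replacement decision can be sketched as follows. This is an illustrative model only; the actual weighting formula is internal to IRP:

```python
# Illustrative model of weight-based replacement: a new improvement evicts
# the lightest existing one only when all slots are taken and its weight
# exceeds that minimum; otherwise it is rejected to avoid route flapping.

def try_add_improvement(existing, new, max_slots):
    if len(existing) < max_slots:
        existing.append(new)
        return True
    lightest = min(existing, key=lambda i: i["weight"])
    if new["weight"] > lightest["weight"]:
        existing.remove(lightest)
        existing.append(new)
        return True
    return False   # not enough benefit to justify a route change

slots = [{"prefix": "a", "weight": 9}, {"prefix": "b", "weight": 2}]
print(try_add_improvement(slots, {"prefix": "c", "weight": 5}, 2))   # True
print(try_add_improvement(slots, {"prefix": "d", "weight": 1}, 2))   # False
```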

1.2.14 Notifications and events

IRP produces a large number of events, some of which are critical for the customer’s awareness. Notifications allow customers to subscribe to any of the available events using the following channels:
  • SMS
  • Email
  • Slack (via Webhook)
  • SNMP Traps
The IRP service Irppushd provides this feature. In order for Notifications to be delivered correctly, the corresponding channel configuration must be provided. By default only email notifications can be delivered, since IRP uses the embedded system email service to send them.
In addition, users must subscribe to specific events.
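For instance, delivering an event to Slack via an Incoming Webhook boils down to an HTTP POST with a JSON payload. Below is a minimal sketch, not Irppushd's actual implementation; the webhook URL and message text are placeholders:

```python
import json
import urllib.request

# Minimal sketch of delivering an event text to Slack via an Incoming
# Webhook. This is not Irppushd's actual implementation; the webhook URL
# and message are placeholders.

def slack_payload(text):
    return json.dumps({"text": text}).encode()

def notify_slack(webhook_url, text):
    req = urllib.request.Request(
        webhook_url,
        data=slack_payload(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200

# notify_slack("https://hooks.slack.com/services/...",
#              "IRP BGP session disconnected")
```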

Only events for valid subscriptions using correctly configured channels will be delivered.
Refer to section Notifications and Events↓ for details about configuring, subscribing to and the contents of notifications.
Refer to section Notification and events parameters↓ for details about individual configuration parameters.

Events

The list of events monitored by IRP that can generate notifications is provided below.
When one of the IRP components detects a transition from normal to abnormal traffic behavior, or back, it fires these events:
  • Abnormal correction: irpflowd
  • Abnormal correction: irpspand
  • Inbound traffic low: SPAN
  • Inbound traffic low: Flow
  • Inbound traffic normal: Flow
  • Inbound traffic normal: SPAN
  • Outbound traffic low: SPAN
  • Outbound traffic low: Flow
  • Outbound traffic normal: Flow
  • Outbound traffic normal: SPAN
When Commit Control limits are exceeded per provider or overall, one of the following events fires. Refer to section 4.1.10↓ for configuring the actual limits of the events.
  • Commit Control overload by X Mbps
  • Commit Control overload by X%
  • Commit Control provider X overloaded by Y Mbps
  • Commit Control provider X overloaded by Y%
When an IRP component (re)loads the configuration, it validates it and, depending on the results, fires one of the following events:
  • Configuration Invalid: BGPd
  • Configuration Invalid: Core
  • Configuration Invalid: Explorer
  • Configuration Invalid: irpapid
  • Configuration Invalid: irpflowd
  • Configuration Invalid: irpspand
  • Configuration Invalid: irpstatd
  • Configuration Ok: BGPd
  • Configuration Ok: Core
  • Configuration Ok: Explorer
  • Configuration Ok: irpapid
  • Configuration Ok: irpflowd
  • Configuration Ok: irpspand
  • Configuration Ok: irpstatd
The outage detection algorithm fires one of the following events when it confirms congestion or outage problems and reroutes traffic around them:
  • Congestion or Outage
  • Outage: Confirmed and rerouted
Explorer periodically checks the PBRs and its expected probing performance and triggers the following events:
  • Failed PBR (IPv6) check for provider
  • Failed PBR (IPv4) check for provider
  • Successful PBR (IPv4) check for provider
  • Successful PBR (IPv6) check for provider
  • Explorer performance low
  • High number of VIP prefixes degrades IRP performance
IRP BGP Internal and External monitors fire the following events:
  • ExternalMonitor (IPv4) Failed status for a provider. All improvements towards the provider will be withdrawn.
  • ExternalMonitor (IPv4) OK status for a provider. All improvements towards the provider will be announced.
  • ExternalMonitor (IPv6) Failed status for a provider. All improvements towards the provider will be withdrawn.
  • ExternalMonitor (IPv6) OK status for a provider. All improvements towards the provider will be announced.
  • InternalMonitor (IPv4) Failed status for a provider. All improvements towards the provider will be withdrawn.
  • InternalMonitor (IPv4) OK status for a provider. All improvements towards the provider will be announced.
  • InternalMonitor (IPv6) Failed status for a provider. All improvements towards the provider will be withdrawn.
  • InternalMonitor (IPv6) OK status for a provider. All improvements towards the provider will be announced.
When statistics collection over SNMP is up or down IRP fires the following events:
  • Provider SNMP stats down: X
  • Provider SNMP stats up: X
BGPd raises these events when BGP sessions are established/disconnected:
  • IRP BGP session disconnected
  • IRP BGP session established
When IRP identifies conditions to re-route traffic (make an improvement) and additionally considers the differences to be excessive, it raises these events:
  • Excessive packet latency for prefix
  • Excessive packet loss for prefix
  • Improvements spike
  • Low rate of announced IPv4 improvements
  • Low rate of announced IPv6 improvements
  • New improvement
Once an IRP component is started, stopped or restarted it raises the following events:
  • Service started: BGPd
  • Service started: Core
  • Service started: Explorer
  • Service started: irpapid
  • Service started: irpflowd
  • Service started: irpspand
  • Service started: irpstatd
  • Service stopped: BGPd
  • Service stopped: Core
  • Service stopped: Explorer
  • Service stopped: irpapid
  • Service stopped: irpflowd
  • Service stopped: irpspand
  • Service stopped: irpstatd

SNMP Traps

SNMP traps are a widely used mechanism to alert about and monitor a system’s activity.
IRP SNMP traps not only notify about an IRP platform event but also include a list of varbinds which contain detailed information related to the thrown trap. The complete list of traps and varbinds with their descriptions can be found at /usr/share/doc/irp/NOCTION-IRP-MIB.txt.

1.2.15 IRP API

IRP exposes a web API that uses HTTP verbs and a RESTful endpoint structure. Request and response payloads are formatted as JSON.
The API runs on the IRP instance and is reachable by default over SSL at port 10443. If called directly from the IRP instance server, the API can be accessed at https://localhost:10443. Use https://hostname/api in order to access the API from elsewhere on the network.
An IRP user id is required to access most of the API services. Use IRP’s Frontend to manage users or configure external User Directories (refer to Security Configuration↓). The IRP API uses an authentication mechanism based on authentication tokens. The token is passed as a query parameter for all API requests that require authentication.
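A token-authenticated request can therefore be sketched as follows. The endpoint path and helper names below are illustrative assumptions; consult the bundled API Reference for the real services:

```python
import json
import urllib.parse
import urllib.request

# Illustrative sketch of a token-authenticated API call; the endpoint
# path and helper names are assumptions - consult the bundled API
# Reference for the actual services.

def api_url(base_url, path, token, **params):
    params["token"] = token          # the token travels as a query parameter
    return f"{base_url}{path}?{urllib.parse.urlencode(params)}"

def api_get(base_url, path, token, **params):
    with urllib.request.urlopen(api_url(base_url, path, token, **params)) as resp:
        return json.load(resp)

# improvements = api_get("https://localhost:10443", "/improvements", token="...")
```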
The API comes with an API Reference available from IRP’s Help menu; the API Reference also includes a sample PHP application with source code to aid in development.
IRP’s API is powered by the Irpapid service, which can be started, stopped and configured like any other IRP service. Refer to section 4.1.3↓ for Irpapid configuration parameter details.

1.2.16 IRP Failover

Overview

IRP offers failover capabilities that ensure Improvements are preserved in case of planned or unplanned downtime of the IRP server.
IRP’s failover feature uses a master-slave configuration. A second instance of IRP needs to be deployed in order to enable failover. For details about failover configuration and troubleshooting, refer to Failover configuration↓.

A failover license is required for the second node. Check with Noction’s sales team for details.
IRP’s failover solution relies on:
  • the slave node running the same version of IRP as the master node,
  • MySQL Multi-Master replication of the ’irp’ database,
  • announcement of the replicated improvements with different LocalPref and/or communities by both nodes,
  • monitoring by the slave node of BGP announcements originating from the master node, based on the higher precedence of the master’s announced prefixes,
  • activating/deactivating of slave IRP components in case of failure or resumed work by the master,
  • syncing the master configuration to the slave node.
For exact details about the IRP failover solution refer to the configuration guides (2.6↓, 3.12.13.5↓), template files, and (if available) working IRP configurations. Note, for example, that some ’irp’ database tables are not replicated, the ’mysql’ system database is replicated as well, and some IRP components are stopped.

IRP versions 3.5 and earlier do not offer failover capabilities for Inbound improvements. It is advised that in these versions only one of the IRP instances is configured to perform inbound optimization in order to avoid contradictory decisions. In case of a failure of this instance, inbound improvements are withdrawn.
An overview of the solution is presented in the following figure:
Figure 1.2.10: Failover high level overview

The diagram highlights the following:
  • two IRP nodes - Master and Slave,
  • grayed-out components are in stand-by mode - services are stopped or operating in limited ways. For example, the Frontend detects that it runs on the slave node and prohibits any changes to configuration while still offering access to reports, graphs or dashboards,
  • configuration changes are pushed by the master to the slave during synchronization. SSH is used to connect to the slave,
  • MySQL Multi-Master replication is set up for the ’irp’ database between the master and slave nodes. Existing MySQL Multi-Master replication functionality is used,
  • the master IRP node is fully functional and collects statistics, queues for probing, probes and eventually makes Improvements. All the intermediate and final results are stored in MySQL and, due to replication, will make it into the slave’s database as well,
  • BGPd works on both master and slave IRP nodes. They make the same announcements with different LocalPref/communities,
  • BGPd on the slave node monitors the number of master announcements on the router (master announcements have higher priority than the slave’s),
  • timers are used to prevent flapping of failover-failback.

Requirements

The following additional preconditions must be met in order to set up failover:
  1. a second server to install the slave,
  2. MySQL Multi-Master replication for the irp database.
MySQL replication is not configured by default. Configuration of MySQL Multi-Master replication is a mandatory requirement for a failover IRP configuration. Failover setup, and specifically MySQL Multi-Master replication, should follow a provided failover script. Only a subset of tables in the irp database are replicated. Replication requires extra storage space for replication logs on both failover nodes, depending on the overall traffic and platform activity.
  3. a second set of BGP sessions will be established,
  4. a second set of PBR IP addresses is required to assign to the slave node in order to perform probing,
  5. a second set of improvements will be announced to the router,
  6. a failover license for the slave node,
  7. key-based SSH authentication from master to slave is required. It is used to synchronize IRP configuration from master to slave,
  8. MySQL Multi-Master replication of the ’irp’ database,
  9. IRP set up in Intrusive mode on the master node.
In case IRP failover is set up in a multiple Routing Domain configuration and the IRP instances are hosted by different RDs, this must be specified in the IRP configuration too. Refer to Optimization for multiple Routing Domains↑, global.master_rd↓, global.slave_rd↓.

Failover

IRP failover relies on the slave node (running the same version of IRP) to determine if there are issues with the master node and to take over if such an incident occurs.
The slave’s BGPd service verifies that announcements from the master are present on a router. If announcements from the master are withdrawn for some reason, the slave node will take over.

In order for this mechanism to work, IRP needs to operate in Intrusive mode and the master node’s announcements must have higher priority than the slave’s.
During normal operation the slave is kept up to date by master so that it is ready to take over in case of an incident. The following operations are performed:
  • the master synchronizes its configuration to the slave. This uses an SSH channel to sync configuration files from master to slave and performs any necessary service restarts,
  • MySQL Multi-Master replication is configured on relevant irp database tables so that the data is available immediately in case of emergency,
  • components of IRP such as Core, Explorer, Irppushd are stopped or standing by on the slave to prevent split-brain or duplicate probing and notifications,
  • the slave node runs BGPd and makes exactly the same announcements with a lower BGP LocalPref and/or other communities, thus replicating Improvements too.
It is imperative that the master’s LocalPref value is greater than the slave’s value. This ensures that the master’s announcements are preferred and enables the slave to also observe them as part of monitoring.
In case of a master failure, its BGP session(s) go down and its announcements are withdrawn.

In networks with multiple edge routers, the slave node considers the master down and takes over only if the master’s Improvements are withdrawn from all edge routers.
The same announcements are already in the router’s local RIB from the slave, and the router chooses them as best.

This is true only if the LocalPref and/or communities assigned to the slave node are preferred. If more preferable announcements are sent by other network elements, the announcements from the slave node will no longer be best. This defeats the purpose of using IRP failover.
At the same time, failover logic runs a set of timers after the master’s routes are withdrawn (refer to global.failover_timer_fail↓). When the timers expire, IRP activates its standby components and resumes optimization.
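The failover and failback timers described above and in the next section can be sketched as a small state machine. This is an illustrative model, not IRP’s implementation; the actual behavior is governed by global.failover_timer_fail and global.failover_timer_failback.

```python
class FailoverMonitor:
    """Minimal sketch of the slave's decision logic: take over only
    after master announcements stay withdrawn for `fail_timer` seconds,
    and fail back only after they stay present for `failback_timer`
    seconds. Requiring the condition to persist prevents flapping."""

    def __init__(self, fail_timer: int, failback_timer: int):
        self.fail_timer = fail_timer
        self.failback_timer = failback_timer
        self.active = False   # are the slave's standby components active?
        self._since = None    # when the current trigger condition began

    def observe(self, now: float, master_announcements_present: bool) -> bool:
        if not self.active:
            condition = not master_announcements_present  # master gone?
            timer = self.fail_timer
        else:
            condition = master_announcements_present      # master back?
            timer = self.failback_timer
        if condition:
            if self._since is None:
                self._since = now
            if now - self._since >= timer:
                self.active = not self.active  # take over / fail back
                self._since = None
        else:
            self._since = None  # condition cleared before timer expired
        return self.active
```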

Failback

IRP includes a failback feature too. Failback happens when the master comes back online. Once BGPd on the slave detects announcements from the master, it starts its failback timer (refer to global.failover_timer_failback↓). The slave node will continue running all IRP components for the duration of the failback period. Once the failback timer expires, redundant slave components are switched to standby mode and the entire setup returns to normal. This timer is intended to prevent cases when the master is unstable after being restored and there is a significant risk it will fail again.

During failback it is recommended that both IRP nodes are monitored by network administrators to confirm the system is stable.

Recovery of failed node

The IRP failover configuration is capable of automatically restoring its entire failover environment if the downtime of the failed node is less than 24 hours.

Recovery speed is constrained by restoring replication of MySQL databases. On 1Gbps non-congested links replication for a full day of downtime takes approximately 30-45 minutes with 200-250Mbps network bandwidth utilization between the two IRP nodes. During this time the operational node continues running IRP services too.
If downtime was longer than 24 hours, MySQL Multi-Master replication is no longer able to synchronize the databases on the two IRP nodes, and manual MySQL replication recovery is required.

Upgrades

Failover configurations of IRP require careful upgrade procedures especially for major versions.

It is imperative that master and slave nodes are not upgraded at the same time. Update one node first, give the system some time to stabilize and only after that update the second node.


1.2.17 Inbound optimization

Starting with version 3.4, IRP introduced optimization of Inbound traffic. Inbound bandwidth control reshapes the traffic from different providers targeting your sub-prefixes.
IRP uses well known and proven BGP mechanisms to instruct your routers to adjust their advertisements of your network segments to upstream providers and subsequently to the World. The adjusted advertisements take advantage of existing BGP policies implemented by edge routers worldwide in order to increase or decrease the preference of your internal network segments as advertised by one or another AS to the world. This allows more traffic to lean towards some of your upstream providers and less towards others. In case of an incident that your multihomed configuration is designed to be resilient against, the entire world still knows of the alternative routes towards your network and will be able to adjust accordingly.
Noction’s IRP Inbound feature:
  • advises your edge routers to advertise different prefixes of your network with different counts of BGP prepends to each of your upstream providers;
  • monitors inbound and outbound traffic on your network interfaces using standard protocols (SNMP, NetFlow, sFlow) to determine if there is need for action;
  • continuously assesses network health and performance to ensure that when overloads are possible it knows good and reliable alternative routes to use;
  • uses a proprietary inferring mechanism that takes into account the inertial nature of the Internet to react to inbound preference changes and thus dampens the urge to “act now” and destabilize the network;
  • provides you with configuration, monitoring and visualization tools that allow you to cross-check and validate its actions.
The entire Inbound optimization solution covers:
  1. Segmenting your network into prefixes that are controlled by IRP
  2. Coordinating communication over BGP between IRP and routers
  3. Setting network conditions for making Inbound Improvements
  4. Reviewing Inbound optimization results
Starting with version 3.7, IRP also has the capability to monitor and improve transit routes. Refer to Optimization of transiting traffic↓ for details.
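The prepend-based shaping described in the bullets above can be illustrated with a deliberately simplified sketch. This is not IRP’s actual inference mechanism (which also dampens reactions and assesses network health); it only shows the basic idea of adding prepends on the most over-target provider so inbound traffic leans towards under-used ones.

```python
def next_prepend_adjustment(usage_bps, target_bps, current_prepends,
                            max_prepends=3):
    """Illustrative sketch (not IRP's algorithm): pick the provider
    whose measured inbound usage exceeds its target by the largest
    margin and that still has prepend headroom; advertising one more
    AS-path prepend towards it nudges traffic to other providers."""
    worst, worst_excess = None, 0
    for provider, used in usage_bps.items():
        excess = used - target_bps[provider]
        if excess > worst_excess and current_prepends.get(provider, 0) < max_prepends:
            worst, worst_excess = provider, excess
    return worst  # None means no adjustment is needed
```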
The use cases below highlight typical scenarios when IRP’s inbound bandwidth control capabilities might come in handy.
Unbalanced inbound and outbound peering
You have a peering agreement with one of your neighbors to exchange traffic. Unfortunately the rest of the neighboring network configuration significantly unbalances the volumes of inbound and outbound traffic.
Figure 1.2.11: Unbalanced inbound/outbound peering

You rely on manipulating prefix announcements towards neighbors in order to shape the traffic. Unfortunately, this is a reactive solution that consumes a lot of time while pushing the balance either one way or another. Often this pushes the network into the second typical scenario.

Fluctuating traffic shape
A multihomed configuration overwhelms some links while other links remain barely used. Your network administrators frequently get alerts during peak network use, and they manually add or remove prepends or altogether stop announcing some inbound prefixes to affected neighboring links. These changes are sometimes forgotten, or at other times just push the bulk of traffic towards other links, and the problem re-appears.
Figure 1.2.12: Fluctuating inbound traffic

For both scenarios above, Inbound commit control in IRP can automate most of the inbound traffic shaping operations. It uses a typical traffic shaping technique: it manipulates the prepend counts announced to different providers, which eventually propagates across the Internet and diverts traffic as desired. IRP automates your routine day-to-day traffic shaping actions and probably makes significantly more adjustments than you can afford to make manually. At the same time, it offers reliable reviewing and fine-tuning options that allow you to adjust IRP’s work in a changing and evolving network.

1.2.18 Flowspec policies


Flowspec policies can be used only in conjunction with Flowspec capable routers.
Starting with version 3.5, IRP supports Flowspec policies. This means that the Flowspec capability is recognized and can be used accordingly for BGP sessions established by IRP. In short, Flowspec defines matching rules that routers implement and enforce in order to ensure that only desirable traffic reaches a network. By relying on BGP to propagate the policies, Flowspec uses a well-understood and reliable protocol.

As specified by BGP, when a session disconnects all the announcements received over that session are discarded. The same is generally true for Flowspec policies. Still, since some vendors recognize Flowspec policies but implement them using capabilities other than BGP, a confirmation is needed that, on session disconnect, the specific router model indeed removes all the underlying constructs and reverts to a known state.
IRP, not being involved in direct packet forwarding, expects that Flowspec policies are implemented at least by your edge routers. If upstream providers also offer Flowspec support, these policies can be communicated upstream, where their implementation is even more effective.
Eventually, Flowspec policies help ensure that traffic entering or exiting your network conforms to your plans and expectations. The main use cases that can be accomplished with Flowspec policies in IRP are:
  • controlling bandwidth usage of your low priority traffic towards external services, for example throttling bandwidth usage originating on your backup systems towards off-premises services.
  • anticipating inbound traffic towards your services and shaping bandwidth use in advance, for example anticipating low numbers of legitimate customers from Russia, China or India on your e-commerce services and setting high but controllable rate limits on packets originating in those networks.
  • reacting on a packet flooding incident by dropping specific packets, for example dropping all packets targeting port 53.
  • redirecting some traffic for scrutiny or cleansing, for example forwarding port 80 packets through an intelligent device capable of detecting RUDY, slow read or other low-bandwidth/amplification attacks.
IRP Flowspec policies rely on a minimal set of matching rules and actions that offer most of the capabilities while keeping the learning curve low and integration simple:
  • Source or destination IP address specified as either CIDR format prefix or direct IP address
  • Traffic protocols, for example TCP, UDP or ICMP
  • Source or destination TCP/UDP ports
  • Throttle, drop and redirect actions.
It is important to note that IRP does not cross-validate Flowspec policies with improvements. While it is possible that, for example, a Flowspec redirect action pushes some traffic a different way than an improvement advises, improvements usually cover many prefixes, and while there will be a contradiction for one prefix, there will be many other prefixes that IRP improves to compensate for these unitary abnormalities. It is recommended that Flowspec policies take precedence over improvements in order to benefit from this compensating nature of improvements.
Consider that depending on whether source or destination prefix belongs to your network the policy applies to either inbound or outbound traffic while the choice of ports allows targeting different traffic types.
Compare Flowspec policies to the already well-known Routing Policies↑. For further details regarding Flowspec configuration refer to Flowspec policies↓.
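The minimal match set and actions listed above can be modeled as a compact structure. The field names below are illustrative assumptions for this sketch, not IRP configuration keys or Flowspec wire-format components.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FlowspecPolicy:
    """Sketch of an IRP-style Flowspec policy: the minimal match set
    (prefixes, protocol, ports) plus one of throttle/drop/redirect.
    Field names are assumptions made for illustration."""
    src_prefix: Optional[str] = None       # CIDR prefix or single IP
    dst_prefix: Optional[str] = None
    protocol: Optional[str] = None         # e.g. "tcp", "udp", "icmp"
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    action: str = "drop"                   # "throttle" | "drop" | "redirect"
    rate_limit_bps: Optional[int] = None   # used when action == "throttle"
    redirect_next_hop: Optional[str] = None  # used when action == "redirect"

# The flooding example from the text: drop all packets targeting port 53.
drop_dns_flood = FlowspecPolicy(protocol="udp", dst_port=53, action="drop")
```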

1.2.19 Throttling excessive bandwidth use with Flowspec

IRP can be configured to automatically add throttling Flowspec policies for prefixes that start using abnormal volumes of traffic. This feature can be used only if your network has Flowspec capabilities. Refer to Flowspec policies↑ for details about Flowspec.

If thresholds for excessive bandwidth use are set to very aggressive levels, IRP can create large numbers of Flowspec policies.
This feature can be described as follows:
  • configure the excess threshold and throttling multipliers,
  • periodically determine current and average prefix bandwidth usage for this hour of the day,
  • verify if current usage exceeds the average by a factor larger than the threshold multiplier,
  • rate-limit prefixes with excessive bandwidth usage at their average use times the throttling multiplier.
Past throttling rules are revised if/when a prefix’s abnormal usage pattern ends.
For example, when a prefix usually consumes 1-2Mbps of traffic, its current bandwidth spikes tenfold, and the spike is sustained for a significant period of time, a throttling rule limiting usage by this prefix to 5-6Mbps will still offer ample bandwidth for normal service use and will also protect other services from network capacity starvation.
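The steps above can be sketched as a single decision function. The multiplier values are assumptions for illustration; in IRP they come from configuration.

```python
def throttle_decision(current_bps, hourly_avg_bps, excess_threshold=5.0,
                      throttle_multiplier=3.0):
    """Sketch of the excess-bandwidth throttling steps described above
    (multiplier values are illustrative): flag a prefix whose current
    usage exceeds its hour-of-day average by more than
    `excess_threshold` times, and rate-limit it at
    `throttle_multiplier` times that average."""
    if current_bps > hourly_avg_bps * excess_threshold:
        return hourly_avg_bps * throttle_multiplier  # rate limit in bps
    return None  # usage is within normal bounds, no policy needed

# The worked example from the text: a prefix averaging ~2 Mbps that
# spikes tenfold gets limited to ~6 Mbps, ample for normal service use.
limit = throttle_decision(current_bps=20e6, hourly_avg_bps=2e6)
```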

1.2.20 Maintenance windows

Maintenance works are on everybody’s agenda in today’s fast-paced and continuously evolving networks. During maintenance, network engineers are very busy and will welcome any help their systems can offer in carrying out those works with the least amount of headaches. IRP is clearly not at the top of a network engineer’s priorities, and asking them to suspend or shut down a provider immediately before a maintenance window starts and to restart the provider once the maintenance works end is not very helpful, if not outright annoying.
Instead, IRP offers the facility to plan maintenance windows in advance. Knowing when a maintenance window starts and ends, IRP excludes the specific provider link from both performance optimization and bandwidth control. Moreover, IRP has the capability to reshape the traffic flowing in and out of a network to anticipate any downtime on a link.

Setting a maintenance window for a router gives each of the providers on that router a maintenance window of its own.
Properly configured maintenance windows allow IRP time to move most of the outbound traffic and deflect most of the inbound traffic away from the provider link that is scheduled for maintenance. Having only a small fraction of traffic, or none at all, on the maintenance link before the downtime starts avoids any (shall we say, catastrophic) spikes, possible overloads and, consequently, unpredictable behavior of the remaining live network equipment.
Specifically, the following applies:
  • a maintenance window is configured in advance and can be removed/revised at any time,
  • a maintenance window sets details for a single provider. If needed, multiple maintenance windows can be set up, and even overlapping maintenance windows are OK,
  • IRP highlights maintenance windows in IRP’s Frontend sidebar so that it is easy to spot the current maintenance window status,
  • optionally, IRP can preserve existing improvements so that once the maintenance window ends the improvements are reimplemented. It is advised that this feature be used only when the maintenance window is very short (a few minutes),
  • an unloading period can be set up. During unloading, IRP actively re-routes outbound prefixes through other available providers. While IRP is able to make most of the unloading improvements fast, consideration shall be given to the announcement rate limitations set up in BGPd in order for all the improvements to reach network routers before the maintenance window starts,
  • a prepend time can be set up. This is only applicable if Inbound optimization is operational. If this time is set, IRP will prepend configured inbound prefixes with the maximum allowed number of prepends through the provider link under maintenance in order to deflect inbound traffic towards other providers. Refer to Inbound optimization↑ for more details about Inbound optimization.
Refer to Maintenance windows↓ for details on how to configure, review, create, edit, and delete maintenance windows.
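The phases described above (prepending ahead of the window, an unloading period, then the maintenance window itself) can be sketched as a simple time classification. The phase names and default durations are assumptions for illustration, not IRP parameters.

```python
from datetime import datetime, timedelta

def window_phase(now, start, end, unload_minutes=60, prepend_minutes=120):
    """Illustrative sketch of maintenance-window phases: inbound
    prepending starts `prepend_minutes` before the window, active
    unloading of outbound traffic `unload_minutes` before, and during
    the window itself the provider is excluded from optimization."""
    if start <= now < end:
        return "maintenance"
    if start - timedelta(minutes=unload_minutes) <= now < start:
        return "unloading"
    if start - timedelta(minutes=prepend_minutes) <= now < start:
        return "prepending"
    return "normal"
```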

1.2.21 Optimization of transiting traffic

Starting with version 3.7, IRP introduces capabilities for optimization of transiting traffic. Optimization of transiting traffic is an enhancement of Inbound optimization↑.
Optimization of transiting traffic relies on the same method of influencing Internet-wide routing - manipulating best path selection by increasing the AS Path length for a prefix carrying traffic on an undesirable interface. The most secure and effective way of AS Path manipulation is to apply these changes on a router facing a provider. Edge routers prepend routes according to designated communities.
Although optimization of transiting traffic is an enhancement of Inbound optimization, transit prefixes are governed by a different set of constraints:
  • The number of potential routes is very large and only some will/should be targeted for optimization. For this, IRP uses filters by ASN/prefix, allowing network administrators to configure which segments of the Internet IRP should focus its attention on.
  • The improvements are visible on the Internet, and excessive route changes can be flagged by external monitoring services as flapping or route instability. IRP protects against this by rate-limiting the number of route changes through a specific provider.
  • Once a route has been improved, all its traffic might be diverted away from this network, and no new statistics will be available to make further inferences. In such cases IRP reverts old transit improvements during the network’s off-peak hour by either decreasing the number of advertised prepends or withdrawing the improvement altogether (configurable).
  • Transit improvements apply to the same prefixes as outbound improvements do. The outbound improvements are withdrawn in order to avoid the risk of contradictory routing decisions.
Implementing optimization of transiting traffic introduces a series of risks and some inherent drawbacks, for example:
  • Potential blackholing of traffic when all alternative routes are withdrawn. IRP implements protection against this by traversing all RIB-in entries for improved transit prefixes and confirming that the route is still being announced by other providers besides IRP. Still, there can be a short time period between confirmations when all alternative routes have been withdrawn and IRP has not yet had a chance to re-confirm this. Attempting to reduce this time period by setting a higher frequency of confirmations leads to increased load on the router, so a tradeoff needs to be made. For example, when the number of providers is quite large, the probability that a route will be withdrawn through all of them is quite small, and thus the frequency of confirmations can be reduced.
  • Additional CPU load on edge router(s) for servicing mandatory SNMP requests that check alternative route presence and BGP Best Path selection inside IRP.
  • Working with missing BGP attributes that are not available over SNMP.
  • Strict upper limits on the number of possible Inbound improvements, imposed by the trade-off required to reduce excessive router CPU load.

1.2.22 Circuit issues detection

Starting with version 3.8, IRP adds detection of circuit issues caused by excessive loss.

The circuit issues detection feature is available for transit providers only.
When this feature is enabled for a provider, IRP uses past probing data to detect when the provider suffers from excessive levels of packet loss. To determine excessive loss, IRP compares a provider’s average loss over an immediate past time horizon, the number of probes, and the average loss difference from other providers. Depending on packet loss thresholds, IRP can attempt different actions on the network.
Every time a circuit issue is detected IRP will raise corresponding alerts that can be subscribed to. Network engineers or external network management systems can act on them.
Additionally, even though IRP is constrained in how much it can do, it ensures that the following will take place:
  • the provider is marked with an issue badge and excluded from being considered a candidate for performance improvements,
  • reprobing is performed for destination prefixes routed through the affected provider,
  • outbound improvements through the affected provider are withdrawn,
  • max prepends are announced for inbound and transit prefixes through the affected provider,
  • FlowSpec rules to induce a drop of the BGP session towards the affected provider are added.
Note that this is only possible if FlowSpec is enabled and the network is capable of processing FlowSpec rules. Dropping the BGP session with the affected provider causes all outbound and inbound traffic through this provider to be re-routed through known good providers.
  • the affected provider is monitored to detect if the issue was temporary and, if loss averages return to normal, to restore it to a good state.
Enable this feature for each designated provider as detailed in Configuration editor: Provider name↓ and review the circuit issue detection thresholds as detailed in Core configuration↓.
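The excessive-loss check described above (enough probes, high average loss over the recent horizon, markedly worse than other providers) can be sketched as follows. All threshold values are assumptions for illustration; the real thresholds are set in Core configuration.

```python
def circuit_issue(provider_loss, other_losses, probes, min_probes=50,
                  loss_threshold=2.0, diff_threshold=1.5):
    """Illustrative sketch of circuit issue detection: flag a provider
    when enough probes exist, its average loss (percent) over the
    recent time horizon exceeds an absolute threshold, and it is
    markedly worse than the other providers' average loss."""
    if probes < min_probes:
        return False  # not enough data to make a judgment
    if provider_loss < loss_threshold:
        return False  # loss is within acceptable bounds
    others_avg = sum(other_losses) / len(other_losses)
    return provider_loss - others_avg >= diff_threshold
```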

1.3 IRP Optimization modes

1.3.1 Performance optimization

Performance optimization mode makes sure that traffic flows through the best performing routes by reducing packet loss and latency, while ignoring other characteristics such as provider bandwidth usage or transit cost. The system analyzes each of the connected providers and compares only their performance metrics in order to choose the best candidate and make an improvement.
Figure 1.3.1: Performance optimization algorithm

First of all, IRP considers packet loss. If loss is lower and the difference is greater than a predefined value, then the system checks if sending the traffic through this provider will not cause any network congestion. If it confirms that the traffic can flow freely, the system declares the provider as the best one to route through.
However, if loss values are equal or the difference between providers is smaller than predefined, the system continues by comparing latency values. If latency is lower and the difference in latency between providers is greater than predefined, then the system declares a latency-based improvement.
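The decision flow of Figure 1.3.1 can be summarized in code. The delta values and the handling of a congested candidate are assumptions for this sketch; IRP’s actual thresholds are configurable.

```python
def performance_decision(cur, cand, loss_delta=1.0, latency_delta=10.0,
                         candidate_congested=False):
    """Sketch of the performance-mode decision: loss is compared first,
    and only when the loss comparison ties does latency decide.
    `cur`/`cand` are (loss_pct, latency_ms) tuples; delta values are
    illustrative stand-ins for the predefined configuration values."""
    cur_loss, cur_lat = cur
    cand_loss, cand_lat = cand
    if cand_loss < cur_loss and cur_loss - cand_loss > loss_delta:
        # a loss win counts only if rerouting will not congest the candidate
        return None if candidate_congested else "loss improvement"
    if cand_lat < cur_lat and cur_lat - cand_lat > latency_delta:
        return "latency improvement"
    return None
```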

1.3.2 Cost optimization

Cost optimization mode decreases packet loss and improves the cost while not worsening latency more than allowed. If IRP cannot improve costs it tries to reduce latency. The platform runs the same algorithm for loss comparison as in performance optimization mode.
Figure 1.3.2: Cost optimization algorithm

Note that the diagram does not highlight higher cost cases. IRP operating in cost optimization mode can make improvements towards higher cost providers only when current routes suffer from loss.
Before comparing latency values, IRP compares the transit cost for each provider. If the cost of the new provider is better, the system goes further by checking the latency and validates the cost improvement only if the latency does not worsen more than predefined. However, if the cost cannot be improved, IRP tries to make a latency improvement using the same algorithm as in performance mode. Thus, if there is no way to make a cost improvement, the system reroutes the traffic to the best performing provider.
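The cost-mode branch can be sketched similarly; it assumes the loss comparison has already tied. The delta values are illustrative assumptions, not IRP defaults.

```python
def cost_decision(cur_cost, cand_cost, cur_lat, cand_lat,
                  max_latency_worsening=20.0, latency_delta=10.0):
    """Sketch of the cost-mode decision (Figure 1.3.2), assuming the
    loss comparison tied: a cheaper candidate wins only if latency
    does not worsen beyond the allowed margin; otherwise fall back to
    the performance-mode latency comparison. Latencies are in ms."""
    if cand_cost < cur_cost:
        # validate the cost win only if latency does not worsen too much
        if cand_lat - cur_lat <= max_latency_worsening:
            return "cost improvement"
        return None
    # cost cannot be improved: try a latency improvement instead
    if cand_lat < cur_lat and cur_lat - cand_lat > latency_delta:
        return "latency improvement"
    return None
```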

1.3.3 Commit Control

Commit Control allows keeping the commit levels for each provider at a pre-configured level. It includes bandwidth control algorithms for each provider as well as active traffic rerouting in case the bandwidth for a specific provider exceeds the configured limit. Commit Control also includes passive load adjustments inside each provider group.
A parameter called “precedence” (see peer.X.precedence↓) is used to set the traffic unloading priorities, depending on the configured bandwidth cost and provider throughput. The higher the precedence, the lower the probability of traffic being sent to a provider when its pre-configured 95th percentile usage is exceeded. The platform will reroute excessive bandwidth to providers whose current load is less than their 95th percentile. If all providers are overloaded, traffic is rerouted to the provider with the smallest precedence - usually this provider has either the highest available bandwidth throughput or the lowest cost.

IRP usually allows CC improvements when the candidate providers have better or equal loss to current route. This can be configured under core.commit_control.loss_override↓.
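The rerouting choice described above can be sketched as a selection function. The data structure and the tie-break among under-committed providers are assumptions for illustration; IRP’s actual algorithm also weighs loss and other factors.

```python
def pick_unload_target(providers):
    """Illustrative sketch: prefer providers still under their
    95th-percentile commit level; if all are overloaded, fall back to
    the provider with the smallest precedence (usually the highest
    capacity or lowest cost). Each provider dict carries 'name',
    'load', 'commit' (95th-percentile level) and 'precedence'."""
    under = [p for p in providers if p["load"] < p["commit"]]
    if under:
        # lower precedence values attract rerouted traffic first
        # (tie-break assumed for this sketch)
        return min(under, key=lambda p: p["precedence"])["name"]
    return min(providers, key=lambda p: p["precedence"])["name"]
```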

1.3.3.1 Flexible aggressiveness of Commit algorithm based on past overloads

Metering bandwidth usage by the 95th percentile permits the following alternative interpretation - the customer is allowed to exceed the limits 5% of the time. As such, IRP maintains a schedule of overloads based on the current time within the month and the actual number of overloads already made.
Figure 1.3.3: More aggressive when actual exceeds schedule

The sample image above highlights the remaining scheduled and actual amounts of allowed overloads for the month decreasing over time. Whenever IRP detects that the actual line goes below the schedule (meaning the number of actual overloads exceeds what is planned), it increases its aggressiveness by starting to unload traffic earlier than usual. For example, if the commit level is set at 1Gbps, IRP will start unloading traffic at possibly 90% or 80% of it, depending on the past overloads count.
The least aggressive level is set at 99% and the most aggressive level is constrained by configuration parameter core.commit_control.rate.low↓ + 1%.
This is a permanent feature of IRP Commit Control algorithm.
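The adaptive threshold described above can be sketched as follows. The per-overload step is an assumption for illustration; only the 99% ceiling and the core.commit_control.rate.low + 1% floor come from the text.

```python
def unload_threshold_pct(scheduled_overloads, actual_overloads,
                         rate_low=70.0, step=5.0):
    """Sketch of the flexible-aggressiveness rule: start unloading at
    99% of the commit level when on schedule, and lower the threshold
    as actual overloads exceed the schedule. The floor mirrors
    core.commit_control.rate.low + 1%; `step` is an assumed value."""
    least_aggressive = 99.0
    floor = rate_low + 1.0
    excess = max(0, actual_overloads - scheduled_overloads)
    return max(floor, least_aggressive - step * excess)
```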

1.3.3.2 Trigger commit improvements by collector

Commit control improvements are made after the flows carrying traffic are probed and IRP has fresh and relevant probe results. Subsequent decisions are made based on these results, which makes them more relevant. Still, probing takes some time, and in the case of fluctuating traffic patterns the improvements made will have reduced impact due to flows ending soon and being replaced by other flows that have not been probed or optimized.
For networks with very short flows (average flow duration under 5 minutes) probing represents a significant delay. In order to reduce the time to react to possible overload events, IRP added a feature to trigger commit control improvements on collector events. When the Flow Collector detects possible overload events for some providers, IRP will use data about past probed destinations in order to start unloading overloaded providers early. This data is incomplete and a bit outdated, but it still gives IRP the opportunity to reduce the wait time and prevent possible overloads. Later on, when probes are finished, another round of improvements will be made if needed.
Because the first round of improvements is based on older data, some of the improvements might become irrelevant very soon. This means that routes fluctuate more than necessary while on average getting a reduced benefit. This is the reason the feature is disabled by default. Enabling it represents a tradeoff that should be weighed against the benefit of a faster reaction time of the Commit Control algorithm.
This feature is configurable via parameter: core.commit_control.react_on_collector ↓. After enabling/disabling this feature IRP Core service requires restart.

1.3.3.3 Commit Control improvements on disable and re-enable

IRP up to version 2.2 preserved Commit Control improvements when the function was disabled globally or for a specific provider. The intent of this behavior was to reduce route fluctuation. In time, these improvements would be overwritten by new improvements.
We found that the behavior described above ran contrary to customers’ expectations and needs. Usually, when this feature is disabled, it is done in order to address a more urgent and important need. Past Commit Control improvements were getting in the way of addressing this need and were causing confusion. IRP versions starting with 2.2 align this behavior with customer expectations:
  • when Commit Control is disabled for a provider (peer.X.cc_disable↓ = 1), this Provider’s Commit Control improvements are deleted;
  • when Commit Control is disabled globally (core.commit_control↓ = 0), ALL Commit Control improvements are deleted.
It must be noted that the improvements are actually moved into a 'recycle bin' of sorts and, if need be, can be restored within a short period of time. When Commit Control is re-enabled globally or for a provider, IRP needs some time to make a series of improvements that level the traffic flows; this is only natural, since IRP starts without statistics or knowledge of flow behavior in the network.
Still, when the Commit Control feature is re-enabled, there is the possibility to restore old improvements based on past data and measurements. Because network characteristics fluctuate, IRP cannot restore arbitrarily old improvements, as they might no longer be relevant or might even be outright wrong. Therefore IRP preserves and restores only improvements no older than the retry time interval, which is configurable via the parameter core.improvements.ttl.retry_probe↓.
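A configuration sketch combining both behaviors described above might look like this (the provider id `3` and the interval value are illustrative assumptions, not defaults from the product):

```
# Disable Commit Control for provider 3 only;
# that provider's CC improvements are deleted (moved to the 'recycle bin')
peer.3.cc_disable = 1

# Improvements younger than this retry-probe interval (seconds, value illustrative)
# remain restorable when Commit Control is re-enabled
core.improvements.ttl.retry_probe = 900
```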

1.3.3.4 Provider load balancing

Provider load balancing is a Commit Control-related algorithm that allows a network operator to evenly balance traffic over multiple providers, or over multiple links to the same provider.
For example, a network with an average bandwidth usage of 6 Gbps has two separate ISPs, and the operator wants (for performance and/or cost reasons) to push 3 Gbps over each provider. In this case both upstreams are grouped together (see peer.X.precedence↓), and IRP reroutes traffic to maintain an even traffic distribution. Provider load balancing is enabled by default via the parameter peer.X.group_loadbalance↓.
Figure 1.3.4: Provider load balancing
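For the two-ISP example above, the grouping might be configured as follows (the provider ids and precedence value are illustrative assumptions):

```
# Both upstreams share the same precedence, forming one group
peer.1.precedence = 100
peer.2.precedence = 100

# Load balancing within the group (enabled by default)
peer.1.group_loadbalance = 1
peer.2.group_loadbalance = 1
```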

1.3.3.5 Commit control of aggregated groups

Customers can deploy network configurations with multiple physical links to a single ISP. The additional links can serve various purposes: provisioning sufficient capacity when requirements cannot be met over a single link, interconnecting different points of presence on either the customer side (in a multiple routing domain configuration) or the provider side, or redundancy. Individually, these links are configured in IRP as separate providers. When the customer's agreement with the ISP imposes an overall limitation on bandwidth usage, these providers are grouped together in IRP so that it can optimize the whole group.
The rationale of this feature, illustrated in the figure below, is that if overusage on one provider in the group is compensated by underusage on another, no action needs to be taken, since overall the commitments made by the customer to the ISP have not been violated. The Commit Control algorithm takes action only when the total bandwidth usage across all providers in the group exceeds the sum of the bandwidth limits for that group.
Figure 1.3.5: Commit control of aggregated groups

The image above highlights that many overusages on the green line are compensated by underusages on the purple line, so the group usage stays below the group's total limits. Only when traffic on the purple line increases significantly and the underusages on the other providers in the group can no longer compensate does Commit Control identify overusages (highlighted with a red x in the drawing above) and take action by rerouting some traffic towards providers outside the group.

It is important to note that for this feature to be effective, IRP must also have providers configured that are not part of the group. This ensures that when candidate improvements are considered, there are alternative routes via those providers to which traffic can be rerouted.
To configure providers that are optimized as an aggregated group: 1) configure the providers with the same precedence so that they form a group; 2) distribute the overall 95th limitation across the providers in the group as appropriate; 3) disable load balancing for the group via the parameter peer.X.group_loadbalance↓.
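Following those three steps, an aggregated-group configuration might look like this sketch (provider ids and the precedence value are illustrative assumptions):

```
# 1) Same precedence places both links in one aggregated group
peer.1.precedence = 50
peer.2.precedence = 50

# 2) The overall 95th commitment is split across the links via the
#    per-provider bandwidth limit parameters (not shown here)

# 3) Disable load balancing so only the group total is enforced
peer.1.group_loadbalance = 0
peer.2.group_loadbalance = 0
```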

1.3.3.6 95th calculation modes

Commit Control uses the 95th percentile to determine whether bandwidth usage is below or above commitments. Outbound and inbound traffic can be accounted for in different ways when determining the 95th percentile value. IRP supports the following 95th calculation modes:
  • Separate 95th for in/out: The 95th percentile values for inbound and outbound traffic are independent, and bandwidth control for each direction is performed independently of the other. In this mode IRP monitors two separate 95th percentile values, one for inbound and one for outbound traffic levels.
  • 95th from greater of in, out: At each time-point, the greater of the inbound and outbound bandwidth usage values is used to determine the 95th percentile.
  • Greater of separate in/out 95th: 95th percentiles are determined separately for inbound and outbound traffic, and the larger value is used to verify whether commitments have been met.
Refer to 4.1.12.5↓.
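The three modes can be illustrated with a short sketch. This is not IRP's actual implementation; it assumes the common nearest-rank method for the 95th percentile and equal-length series of inbound/outbound samples taken at the same time-points:

```python
import math

def p95(samples):
    """95th percentile using the nearest-rank method (assumption,
    commonly used for bandwidth billing)."""
    ranked = sorted(samples)
    rank = math.ceil(0.95 * len(ranked))  # 1-based nearest rank
    return ranked[rank - 1]

def separate_95th(inbound, outbound):
    """Mode 1: independent 95th for each direction."""
    return p95(inbound), p95(outbound)

def greater_then_95th(inbound, outbound):
    """Mode 2: take the greater of in/out at each time-point,
    then compute a single 95th over those maxima."""
    return p95([max(i, o) for i, o in zip(inbound, outbound)])

def greater_of_95ths(inbound, outbound):
    """Mode 3: separate 95ths per direction, then the larger value."""
    return max(p95(inbound), p95(outbound))
```

Note that modes 2 and 3 can differ: when inbound and outbound peak at different time-points, mode 2 yields a value greater than or equal to mode 3.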

1.3.3.7 Other commit control features

The Current Improvements report has been enhanced to keep CC improvements applied during an algorithm cycle in a single sorted batch, so the report no longer confuses the reader by intermingling the improvements.
Figure 1.3.6: Sorted Commit Control improvements

The excerpt above highlights new Commit Control improvements made at the same time. They move 9.11, 6.09, 5.70, and 4.26 Mbps from provider B to provider A, unloading B from 109% to 58% and increasing the load on A from 11% to 46%. The data would have been harder to read if the improvements in that single batch were intermingled.
During retry probing, IRP deletes Commit Control improvements whose current bandwidth is less than the configured threshold given by core.commit_control.agg_bw_min↓.

