NetFlow duplication occurs when records describing the same flow are exported to a NetFlow collector more than once. The volume of network traffic increases, and bandwidth may be depleted as identical copies of NetFlow records travel to one or more collectors. The collector’s resources (CPU, RAM) are consumed as well, because it must process the same flow multiple times while still avoiding performance degradation. Worse, the collected data is inaccurate because it contains duplicate entries, which leads to reports that cannot be trusted. Flow deduplication is therefore a must and should be supported by flow collectors.
|Note: The volume of exported flow information increases with higher NetFlow versions, resulting in higher bandwidth consumption.|
Flow Duplication Caused by Incorrect NetFlow Configuration
The most common causes of flow duplication are configuration errors. Let’s look at the R1 router with flow export configured for the interface Gi0/0 in the ingress direction and for the interface Gi0/1 in the egress direction (Picture 1). If a packet is sent from host A to host B, R1 exports the same flow (A to B) to the collector twice: once when the packet enters Gi0/0 and once when it leaves Gi0/1. A flow from B to A, however, is not matched at all, because it enters through Gi0/1 (monitored only on egress) and leaves through Gi0/0 (monitored only on ingress). Therefore, all interfaces on the R1 router should be configured to collect flows in one direction only, either ingress or egress. Here, we need to change the direction to ingress for the interface Gi0/1 so that both flows A->B and B->A are properly matched and no flow duplication occurs.
Picture 1: Flow Export to Collector is Duplicated When Collecting in Mixed Directions
|Note: When choosing whether to collect flows in the ingress or the egress direction, keep in mind that both methods have their pros and cons. For instance, ingress export includes blocked traffic. Moreover, NetFlow was originally supported only in the ingress direction. If egress export is used, traffic destined for multiple interfaces (multicast) is exported as different flows.|
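The mixed-direction mistake on R1 can be illustrated with a small sketch. It is only a toy model under stated assumptions: the mapping of hosts to interfaces and the function name `exported_records` are hypothetical, and a real router tracks flows rather than single packets.

```python
# Toy model of router R1 (assumption): host A sits behind Gi0/0, host B behind Gi0/1.
HOST_INTERFACE = {"A": "Gi0/0", "B": "Gi0/1"}


def exported_records(src, dst, directions):
    """Count how many flow records R1 would export for traffic src -> dst.

    `directions` maps an interface name to "ingress" or "egress",
    mirroring the per-interface flow collection direction.
    """
    in_if = HOST_INTERFACE[src]   # interface where the traffic enters R1
    out_if = HOST_INTERFACE[dst]  # interface where the traffic leaves R1
    count = 0
    if directions.get(in_if) == "ingress":
        count += 1  # matched on the way in
    if directions.get(out_if) == "egress":
        count += 1  # matched again on the way out
    return count


# Mixed directions, as in Picture 1.
mixed = {"Gi0/0": "ingress", "Gi0/1": "egress"}
# Both interfaces collect in the ingress direction.
fixed = {"Gi0/0": "ingress", "Gi0/1": "ingress"}

print(exported_records("A", "B", mixed))  # 2: A->B is exported twice
print(exported_records("B", "A", mixed))  # 0: B->A is never matched
print(exported_records("A", "B", fixed))  # 1
print(exported_records("B", "A", fixed))  # 1
```

With a single direction on every interface, each flow is matched exactly once, at the interface where it enters the router.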
Flow Duplication Caused by Export from Multiple Exporters
An example of flow duplication caused by the export of the same flow from multiple exporters is depicted in Picture 2. Flow export is enabled on routers Exporter1 (Gi0/1), Exporter2 (Gi0/1) and Exporter3 (Gi0/2) in the egress direction. The NetFlow configuration for all devices is shown below:
ip flow-export source GigabitEthernet0/1
ip flow-export version 5
ip flow-export destination 192.168.4.2 2055
interface GigabitEthernet0/1
 ip flow egress

ip flow-export source GigabitEthernet0/1
ip flow-export version 5
ip flow-export destination 192.168.3.2 2055
interface GigabitEthernet0/2
 ip flow egress

ip flow-export source GigabitEthernet0/2
ip flow-export version 5
ip flow-export destination 192.168.5.2 2055
interface GigabitEthernet0/0
 ip flow egress
Picture 2: Network Topology with a NetFlow Exporter, Samplicator and Two Collectors
Traffic between PC1 and PC2 passes through all three exporters, so the collector receives flow records about the same flow from three sources. The problem can be solved with either automatic or manual flow deduplication.
Automatic Flow Deduplication
One of the mandatory features that a NetFlow server should offer is the ability to remove duplicate flows automatically. This can be done based on the next-hop information carried inside the exported flow records (Picture 3), which is the IP address of the next-hop router. When an exporter sends a flow record whose next-hop field contains the IP address of another exporter, the NetFlow collector skips that record.
Picture 3: Flow Records Sent From Exporter1 Include IP Address of the Next-Hop Router
Below is the list of IP addresses of the exporters and their next-hop routers. Flow records received from Exporter1 and Exporter2 are ignored by the collector because their next-hop fields contain the IP addresses of other exporters. Only the flow records exported by Exporter3 are accepted, because their next-hop IP address (172.16.2.1) does not belong to any exporter.
IP addresses of exporters and their next-hops
Exporter 1: 192.168.1.1 – next-hop: 192.168.1.254
Exporter 2: 192.168.1.254 – next-hop: 192.168.2.1
Exporter 3: 192.168.2.1 – next-hop: 172.16.2.1
To achieve automatic deduplication, all devices in the flow chain must be configured for flow export. If they are not, the collector cannot correctly deduplicate the received flows even if automatic deduplication is enabled. Let’s examine a scenario where NetFlow is not enabled on Exporter2 while both Exporter1 and Exporter3 are configured for NetFlow. In this case, the flow records from both routers are accepted by the collector and duplication occurs. How is that possible? Flow records from Exporter1 contain the next-hop IP address of Exporter2 (192.168.1.254), but Exporter2 is not configured for flow export, so the collector does not know it as an exporter and accepts the records from Exporter1. Similarly, the collector accepts the flow records from Exporter3, as the IP address of the next-hop router (172.16.2.1) does not match the IP of any known exporter. As a result, the same flow is reported by two exporters and counted twice.
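The next-hop rule and the broken-chain scenario above can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular collector: the `FlowRecord` type, its field names and the `deduplicate` function are assumptions made for the example, while the IP addresses come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    # Hypothetical record layout, reduced to what the rule needs.
    exporter_ip: str  # address of the device that exported the record
    next_hop: str     # next-hop router address carried in the record
    src: str
    dst: str


def deduplicate(records, known_exporters):
    """Skip every record whose next hop is itself a known exporter."""
    return [r for r in records if r.next_hop not in known_exporters]


# The PC1 -> PC2 flow as reported by Exporter1, Exporter2 and Exporter3.
records = [
    FlowRecord("192.168.1.1", "192.168.1.254", "PC1", "PC2"),
    FlowRecord("192.168.1.254", "192.168.2.1", "PC1", "PC2"),
    FlowRecord("192.168.2.1", "172.16.2.1", "PC1", "PC2"),
]

# Case 1: all three routers export flows.
all_exporters = {"192.168.1.1", "192.168.1.254", "192.168.2.1"}
kept = deduplicate(records, all_exporters)
print([r.exporter_ip for r in kept])  # ['192.168.2.1']

# Case 2: NetFlow is disabled on Exporter2, so the chain is broken.
partial_exporters = {"192.168.1.1", "192.168.2.1"}
partial_records = [r for r in records if r.exporter_ip in partial_exporters]
kept_partial = deduplicate(partial_records, partial_exporters)
print([r.exporter_ip for r in kept_partial])  # ['192.168.1.1', '192.168.2.1']
```

In the second case the flow survives twice, because 192.168.1.254 is no longer in the set of known exporters and the record from Exporter1 is not skipped.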
In certain scenarios, automatic deduplication cannot be done or is not desirable. This might be the case when a device in the flow chain does not support NetFlow or when it is not convenient to collect flows from all devices. In such cases, we have to rely on manual deduplication.
Manual Flow Deduplication
Ideally, we should collect and export flows from a single centralized device through which all traffic flows. With a single exporter, records are not duplicated between devices, but we still need to collect flows in one direction only to avoid duplication on the exporter itself. It is important to mention that some networks are so large and complex that we need to collect flows from multiple locations. Therefore, manual flow deduplication is a feature that a NetFlow server should support. It is basically a filter enabled by an operator on the server, based on one parameter or a combination of parameters such as the IP address of the exporter, the IP address of the next-hop router or the exporter’s interface.
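A manual deduplication filter of this kind can be sketched as a predicate built from the operator's chosen parameters. This is only an illustration: `make_filter`, the dictionary keys and the sample records are hypothetical and do not correspond to the configuration syntax of any specific NetFlow server.

```python
def make_filter(exporter_ip=None, next_hop=None, interface=None):
    """Build a predicate that accepts records matching every given criterion.

    Criteria left as None are ignored; with no criteria at all,
    every record is accepted.
    """
    criteria = {
        "exporter_ip": exporter_ip,
        "next_hop": next_hop,
        "interface": interface,
    }
    active = {key: value for key, value in criteria.items() if value is not None}

    def accept(record):
        return all(record.get(key) == value for key, value in active.items())

    return accept


# Illustrative records for one flow seen by three exporters.
records = [
    {"exporter_ip": "192.168.1.1", "next_hop": "192.168.1.254", "interface": "Gi0/1"},
    {"exporter_ip": "192.168.1.254", "next_hop": "192.168.2.1", "interface": "Gi0/1"},
    {"exporter_ip": "192.168.2.1", "next_hop": "172.16.2.1", "interface": "Gi0/2"},
]

# Keep only what Exporter3 reports on Gi0/2; everything else is dropped.
only_exporter3 = make_filter(exporter_ip="192.168.2.1", interface="Gi0/2")
kept = [r for r in records if only_exporter3(r)]
print(len(kept))  # 1
```

Combining parameters narrows the filter, which is why the operator needs to know which exporter and interface see each flow.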
Flow deduplication is a must in complex networks where flows are exported from different devices. The main benefit of automatic deduplication is the ease of configuration, as it is enabled only once. Its big disadvantage is the need to keep flow continuity, which requires NetFlow configuration on all devices in the flow chain. Network operators might perceive this requirement negatively, as it consumes the exporters’ and the collector’s resources (CPU and RAM) and increases the amount of network traffic. Deduplication can also be achieved manually by filtering flows based on the IP addresses of exporters, next-hop IP addresses or the exporter’s interface. This approach, however, involves manual filter configuration and requires deeper knowledge of the managed network.