Optimizing Your Netflow Analysis System to Maximize Security Detection and Query Performance

By Dr. Vincent Berk | August 19, 2015


Show me a security analyst who doesn’t want fast answers.

If your security operations center is under attack right now, or you want to perform immediate queries on stored network traffic flow records to figure out the source or effects of an attack, performance is essential.

Network traffic flow analysis and security detection are most effective – and extremely fast – when set up in a parallel processing server environment. Fundamentally, FlowTraq’s architecture lets you keep scaling the incoming flow rate it can handle by adapting the cluster layout to the available environment.

If the underlying server architecture is not properly configured, a security analyst who uses FlowTraq may experience longer query times and sluggish performance. Most of these limitations are a direct result of the realities of modern hardware platforms – and it helps to be aware of them so you can avoid running into them. Here are some limits you may be faced with when building your FlowTraq cluster and how to work around those limits:

25K Flows Per Second (fps)
FlowTraq recommends that a server with 8 cores and 64GB of RAM handle no more than 25,000 flow updates per second of peacetime traffic. Although modern hardware is capable of handling more flow updates (single servers are reportedly capable of 10x that level), there comes a point where an analyst feels the system becomes “too slow” for analysis tasks. Specifically, as forensic recall into a flow history is limited by the IO subsystem, 25Kfps is a reasonable rate for a medium-grade server. The easiest way to handle a higher flow update rate is to build a cluster of FlowTraq systems, each capable of 25Kfps.

Since unpacking and processing flow records takes a fixed amount of CPU time, there is a limit where an individual thread running on a single CPU core can no longer keep up – and it starts dropping records. This limit on average processors lies around 100Kfps, ranging toward 200Kfps for more powerful server processors. Since the operating system needs time to receive a NetFlow packet and put it in a queue to the FlowTraq application, it can be difficult to avoid this limit. The easiest trick is to open multiple ports for flow ingress and direct different exporters toward different ports. Each port will be handled by a separate thread, avoiding packet drops.
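The multi-port trick can be sketched in a few lines of Python. This is only an illustration of the layout (one UDP socket and one receive thread per ingress port), not FlowTraq's own implementation; the class name `MultiPortReceiver` and its methods are hypothetical, and the packet handling is reduced to counting datagrams per port.

```python
import socket
import threading
from collections import Counter


class MultiPortReceiver:
    """Hypothetical sketch: one UDP socket and one receive thread per port."""

    def __init__(self, host="127.0.0.1", ports=(0, 0)):
        self.counts = Counter()          # datagrams seen, keyed by local port
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._threads = []
        self._socks = []
        for port in ports:
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            sock.bind((host, port))      # port 0 asks the OS for a free port
            sock.settimeout(0.2)         # lets the thread notice a stop request
            self._socks.append(sock)

    @property
    def ports(self):
        """Actual local port numbers, in the order they were requested."""
        return [s.getsockname()[1] for s in self._socks]

    def start(self):
        for sock in self._socks:
            t = threading.Thread(target=self._serve, args=(sock,), daemon=True)
            t.start()
            self._threads.append(t)

    def _serve(self, sock):
        port = sock.getsockname()[1]
        while not self._stop.is_set():
            try:
                _data, _addr = sock.recvfrom(65535)
            except socket.timeout:
                continue
            # A real collector would decode the flow records here.
            with self._lock:
                self.counts[port] += 1

    def stop(self):
        self._stop.set()
        for t in self._threads:
            t.join()
        for sock in self._socks:
            sock.close()
```

Each exporter would then be pointed at a different port from `receiver.ports`. Note that CPython threads share one interpreter lock, so this sketch shows the structure rather than the throughput a native multi-threaded collector would reach.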

When a single FlowTraq portal is used to receive and redistribute flow records to workers, it functions as a smart load balancer for the FlowTraq cluster. At 800,000 flow update records per second, the combined ingress and egress of flow records start to approach 1Gbps. This means that a single 1Gbps network card would saturate and records would start to drop. To avoid this bottleneck, you can either use multiple 1Gbps network cards or move to 10Gbps networking hardware.
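A rough back-of-the-envelope check of that bandwidth figure is sketched below. It assumes NetFlow v5 framing (24-byte header, 48-byte records, 30 records per datagram) plus 42 bytes of Ethernet, IPv4 and UDP headers, and it assumes every record is both received and forwarded once. These are illustrative assumptions, not FlowTraq measurements; the function name `portal_wire_mbps` is made up for this example.

```python
# Assumed NetFlow v5 framing; other export formats use different sizes.
NETFLOW_V5_HEADER_BYTES = 24
NETFLOW_V5_RECORD_BYTES = 48
RECORDS_PER_DATAGRAM = 30
ETH_IP_UDP_BYTES = 14 + 20 + 8   # Ethernet + IPv4 + UDP headers


def portal_wire_mbps(flows_per_second,
                     record_bytes=NETFLOW_V5_RECORD_BYTES,
                     records_per_datagram=RECORDS_PER_DATAGRAM,
                     directions=2):
    """Estimate the Mbps a portal moves when it receives and forwards records.

    directions=2 counts ingress plus egress; use 1 for ingress only.
    """
    datagram_bytes = (ETH_IP_UDP_BYTES + NETFLOW_V5_HEADER_BYTES
                      + records_per_datagram * record_bytes)
    datagrams_per_second = flows_per_second / records_per_datagram
    bits_per_second = datagrams_per_second * datagram_bytes * 8 * directions
    return bits_per_second / 1e6


if __name__ == "__main__":
    print(f"{portal_wire_mbps(800_000):.0f} Mbps at 800K flows/s")
```

Under these assumptions 800,000 flows per second works out to roughly 640 Mbps of combined traffic. Larger records (as templates in NetFlow v9 or IPFIX often produce) or partially filled datagrams raise the figure, which can be explored through the `record_bytes` and `records_per_datagram` parameters.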

When using 10 or 12 NetFlow ingress ports for a FlowTraq portal, some inherent limits in computer hardware start to become apparent. Contention on IO resources for the network hardware, as well as multiple CPU cores attempting to access RAM, puts a fuzzy limit on the maximum flow rate a single FlowTraq portal can distribute. This limit lies between 1.2M and 3Mfps for the most modern hardware. Working around this limitation is straightforward: use multiple load balancing portals, and this limit disappears.

At 2,000,000 flow updates per second moving through the network cards, we observe a data rate limit in the PCIe 2.0 bus. Adding network hardware does not alleviate this limit, because the PCI bus is the common shared resource for all IO; on that generation of hardware the limit is unavoidable. Thankfully, PCIe 3.0 is pretty ubiquitous now – in our lab, we zoom right past the 2Mfps limit on a single distribution portal.

At 3,200,000 flow updates per second we have reached the limit of what a single-level FlowTraq cluster using average hardware is recommended to handle. At full fidelity, a NetFlow rate of 3.2Mfps represents over 3TB per second of traffic crossing your network.

(That’s a lot.)

Breaking through this limit and handling a higher flow rate with FlowTraq is actually rather trivial: add a layer to your cluster, or consider the use of multiple distribution portals, possibly geographically distributed, each with their own analysis worker cluster.
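Pulling the limits above together, here is a small planning sketch. It takes the figures quoted in this post as rough planning constants (25Kfps per worker, about 100Kfps per ingress thread, and the lower 1.2Mfps bound for a single portal) and estimates how many workers, portals and ingress ports a given peak rate would call for. The names `ClusterPlan` and `plan_cluster` are hypothetical, and the even split of traffic across portals is an assumption.

```python
from dataclasses import dataclass

# Planning constants taken from the figures quoted in this post.
FPS_PER_WORKER = 25_000          # recommended rate for an 8-core, 64GB server
FPS_PER_INGRESS_PORT = 100_000   # conservative single-thread limit
FPS_PER_PORTAL = 1_200_000       # lower bound of the 1.2M-3M portal range


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating point."""
    return -(-a // b)


@dataclass(frozen=True)
class ClusterPlan:
    workers: int
    portals: int
    ingress_ports_per_portal: int


def plan_cluster(peak_fps: int) -> ClusterPlan:
    """Estimate cluster size for a peak flow rate, assuming an even split."""
    if peak_fps <= 0:
        raise ValueError("peak_fps must be positive")
    portals = ceil_div(peak_fps, FPS_PER_PORTAL)
    fps_per_portal = ceil_div(peak_fps, portals)
    return ClusterPlan(
        workers=ceil_div(peak_fps, FPS_PER_WORKER),
        portals=portals,
        ingress_ports_per_portal=ceil_div(fps_per_portal, FPS_PER_INGRESS_PORT),
    )


if __name__ == "__main__":
    print(plan_cluster(3_200_000))
```

With these conservative constants, a 3.2Mfps peak comes out at 128 workers behind 3 portals with 11 ingress ports each. Faster hardware would justify larger constants and a smaller plan.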

The bottom line
A single-pane-of-glass view of very large networks is extremely powerful, and FlowTraq offers you this view regardless of the size of your network.

Today we take advantage of multi-core processors, optimized data storage and clustering to provide the highest-performance, most scalable NetFlow processing available. As hardware gets faster, we will see real-time detection speed continue to fly, as well as the ability to run fast queries over longer periods of stored network traffic records. FlowTraq remains scalable, regardless of how large your traffic volumes grow, especially when provisioned with the underlying clustered servers as recommended.
