Deep network visibility and security in AWS VPC

By Alex Barsamian | May 9, 2018

AWS VPC is (nearly) a wonder of the world

Most networking in the AWS VPC cloud is familiar and intuitive to network engineering veterans. And if you think about that, that’s kind of a miracle. All you have to do is pick a region, declare a VPC, and suddenly you have an internal IP space, Elastic IPs for communicating with the world, and virtual routers, firewalls, NAT devices, and more.

All these abstractions behave uncannily like the brick-and-mortar hardware they’re modeling, making them easy to reason about and design with. And interaction with them is neatly packaged up as APIs, command-line tools, and a mostly decent user interface, enabling network engineers to forget many of the arcane details they used to have to worry about (“how do I configure a VLAN on a Nexus 5000?”).

The black boxes are so nice and shiny that it’s easy to forget all the engineering under the hood. The trouble with black boxes, though, is that sometimes what’s inside surprises you. And sometimes you need a little more control than the knobs, switches, and ports on the outside of the box can give you.

This is especially true in network security and visibility, where 1) your threat detection platforms and analysis tools absolutely need real-time, forensically accurate network data, and 2) your analysts need access to first-class tools across the whole network, especially the stuff in the cloud.

The state of flow monitoring in AWS

Recognizing those two needs, about a year ago I released a free AWS Lambda tool to convert CloudWatch flow logs to NetFlow v5 and forward them. My honest hope was that people would use it to forward their VPC traffic from AWS to their favorite flow analysis platform and use that to do some good in the world (and perhaps that some of them might discover FlowTraq and find a new favorite).
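To make that conversion concrete, here is a minimal Python sketch of the idea. It is not the code of the Lambda tool mentioned above. It assumes the default (version 2) space-delimited VPC flow log format and IPv4 addresses only, and the helper names `parse_flow_log` and `to_netflow_v5` are hypothetical. Fields that flow logs don't carry (next hop, interface indexes, TCP flags, AS numbers) are simply zeroed.

```python
import socket
import struct

# Field order of the default (version 2) VPC flow log line.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

HEADER = struct.Struct("!HHIIIIBBH")                 # 24-byte NetFlow v5 header
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")   # 48-byte NetFlow v5 flow record
U32 = 0xFFFFFFFF


def parse_flow_log(line):
    """Return one flow log line as a dict, or None if it carries no flow data."""
    values = line.split()
    if len(values) != len(FIELDS):
        return None
    rec = dict(zip(FIELDS, values))
    # NODATA and SKIPDATA lines use "-" placeholders instead of addresses.
    return rec if rec["log_status"] == "OK" else None


def to_netflow_v5(recs, export_time, sequence=0):
    """Pack a non-empty list of parsed IPv4 records into one NetFlow v5 datagram."""
    recs = recs[:30]  # v5 allows at most 30 flow records per datagram
    # v5 timestamps are milliseconds of exporter uptime, so pretend the
    # exporter "booted" at the earliest flow start in this batch.
    boot = min(int(r["start"]) for r in recs)
    uptime_ms = max(0, export_time - boot) * 1000 & U32
    out = [HEADER.pack(5, len(recs), uptime_ms, export_time, 0, sequence, 0, 0, 0)]
    for r in recs:
        out.append(RECORD.pack(
            socket.inet_aton(r["srcaddr"]), socket.inet_aton(r["dstaddr"]),
            b"\x00" * 4,                       # next hop: not in flow logs
            0, 0,                              # input/output interface indexes
            min(int(r["packets"]), U32), min(int(r["bytes"]), U32),
            (int(r["start"]) - boot) * 1000 & U32,
            (int(r["end"]) - boot) * 1000 & U32,
            int(r["srcport"]), int(r["dstport"]),
            0, 0,                              # pad, TCP flags (not logged)
            int(r["protocol"]), 0,             # IP protocol, ToS
            0, 0, 0, 0, 0))                    # AS numbers, masks, pad
    return b"".join(out)
```

The resulting bytes can be sent to a collector with a plain UDP socket, for example `sock.sendto(pkt, (collector_host, 2055))`, where the host and port are whatever your flow analysis platform listens on. Note that the records can be no fresher than the flow logs they were built from.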

Based on customer feedback, the biggest issue with this approach is the reliance on CloudWatch logging. CloudWatch flow logging is an AWS abstraction with some pretty serious limitations, the biggest one being that AWS only promises flow logs every ten minutes or so, and even then the promise isn’t a strong one.

Ten minutes’ lag time before detecting a DDoS attack or other security incident isn’t that far from not detecting it at all. It became clear we needed to take another approach to the AWS VPC security problem.

(Incidentally, while Google was late to the party, having just rolled out their version of VPC flow logs a few weeks ago, they really showed Amazon up here, with flow updates every five seconds. I plan to discuss their offering in a future post.)

When the built-in options aren’t enough

If you’re committed to the AWS ecosystem, where do you go from here? Well, in a traditional network environment, if your existing hardware doesn’t support flow generation, you have a couple of options:

  • If you can get a network tap in there, generate flow off of that.
  • If you can’t, put a device in-line and generate flow off of that. The simpler the device, the better; the best case would be a simple layer 2 bridge doing Proxy ARP.

Unfortunately, there’s no abstraction for a tap in the AWS VPC. So the first option is out.

As for the second option: it turns out you can’t build a working bridge in AWS, either. That marvelous simplicity-in-complexity of AWS that I was referring to earlier? It means that some networking concepts don’t translate. In this case the concept that isn’t a perfect match for on-premises networking is ARP.

In an AWS VPC, when host “A” sends an ARP request for host “B”, the response doesn’t come from “B”. In fact, the initial request never even arrives at “B”. It’s captured and handled by something called the “AWS mapping service”. Simply put, there isn’t a traditional broadcast domain in AWS at all. No, not even between machines on the same VPC subnet!

(Want to learn more about the mapping service and other AWS engineering marvels? Check out Eric Brandwine’s talk “A Day in the Life of a Billion Packets” at AWS re:Invent 2013. It’s fascinating stuff!)

What this means is, to get real-time, forensically accurate, reliable flow out of AWS VPCs we actually need to roll our own router instance. Yes, really. It’s not as bad as it sounds, and it will start paying dividends in networking security right away.

