nanog mailing list archives

Re: Real world sflow vs netflow?


From: Peter Phaal <peter.phaal () gmail com>
Date: Fri, 13 Jul 2012 13:20:45 -0700

Hi David,

The main architectural difference between sFlow and NetFlow is the
location of the flow cache:

1. NetFlow: Packets are decoded on the router, flow keys are extracted
and used to look up or create an entry in a flow cache, which is then
updated based on values in the packet. Records are exported from the
flow cache in the form of NetFlow datagrams when the flow completes or
when a timeout expires.
2. sFlow: Packets are randomly sampled in hardware and the packet
headers are immediately exported as sFlow datagrams - there is no flow
cache on the switch/router. In addition to exporting the packet
header, the sFlow agent captures the FIB state associated with
forwarding the sampled packet, exporting information such as next hop
router, AS path, communities, etc. An sFlow agent also periodically
sends all the MIB-II interface counters, eliminating the need for SNMP
polling. This isn't very important if you are only monitoring a few
links, but it makes a big difference if you are monitoring large
chassis switches or tens or hundreds of thousands of ports in a data
center or campus environment.
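
As a rough illustration of the two approaches above, here is a minimal
Python sketch. The packet tuple layout, the function names and the
dictionary fields are assumptions made up for this example; real agents
do this work in hardware/firmware and export binary datagrams, so treat
this as a model of where the state lives, not as an implementation.

```python
import random
from collections import defaultdict

# Hypothetical packet layout for this sketch:
# (src, dst, sport, dport, proto, size_bytes)

def netflow_style(packets):
    """Router-side flow cache: every packet updates a per-flow record."""
    cache = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, dst, sport, dport, proto, size in packets:
        key = (src, dst, sport, dport, proto)   # flow keys extracted on the router
        cache[key]["packets"] += 1
        cache[key]["bytes"] += size
    # A real exporter would emit each record on flow completion or timeout.
    return dict(cache)

def sflow_style(packets, sampling_rate, rng):
    """Device-side random sampling: no per-flow state is kept on the device."""
    samples = []
    sample_pool = 0                              # packets that could have been sampled
    for pkt in packets:
        sample_pool += 1
        if rng.random() < 1.0 / sampling_rate:   # on average 1 in sampling_rate
            # The header is exported right away, along with the data the
            # collector needs to scale the result.
            samples.append({"header": pkt,
                            "sampling_rate": sampling_rate,
                            "sample_pool": sample_pool})
    return samples
```

The point of the contrast is that netflow_style holds a table that grows
with the number of active flows, while sflow_style only keeps a counter.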

Moving the flow cache off the router has a number of benefits:
1. You are no longer limited by the hardware/firmware capabilities of
the router - your analysis software decides which fields to decode and
how to accumulate results. For example, if you are managing a mixed
IPv4/IPv6 environment you can decide to use sFlow to look into v6 over
v4 and v4 over v6 tunnels (to do the same thing with NetFlow would
likely require a hardware upgrade). You can even feed sFlow into
Wireshark for detailed analysis of protocols and packet headers.
2. Operational complexity is greatly reduced since the configuration
options and resource management issues associated with the flow cache
are eliminated.
3. Low latency. Measurements aren't delayed by the flow cache, so you
can detect DDoS attacks and large flows within seconds.
4. Scalability - you can turn on sFlow on every link (even 100G
links), on every device for a comprehensive view of traffic.
5. Multi-vendor interoperability. The sFlow measurements are
interoperable across vendors (since very little processing is
performed on the devices). With NetFlow, different vendors and devices
have different hardware limitations affecting the fields that they can
export.

Unsampled NetFlow is only practical for moderate traffic levels. If
you carry significant traffic you would want to enable sampling
anyway, even with NetFlow. However, there is a wide range of NetFlow
sampling implementations, many of which yield questionable results. In
contrast, the sFlow standard specifies how sampling must be performed
and ensures that the exported data includes the information needed to
scale the samples correctly and produce unbiased measurements.
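
A minimal sketch of the scaling step, assuming simple random 1-in-N
packet sampling. The function names are made up for this example. The
error figure is the usual binomial approximation: with c samples in a
class, the standard deviation of c is about sqrt(c), so the 95%
relative error is roughly 1.96/sqrt(c), i.e. 196 * sqrt(1/c) percent.

```python
import math

def estimate_total(samples_seen, sampling_rate):
    """Scale a sample count up to an estimate of the true packet count."""
    return samples_seen * sampling_rate

def percent_error_bound(samples_seen):
    """Approximate 95% confidence bound on the relative error, in percent."""
    return 196.0 * math.sqrt(1.0 / samples_seen)

# 100 samples of some traffic class at 1-in-1000 sampling:
estimated_packets = estimate_total(100, 1000)
error_pct = percent_error_bound(100)
```

Note that the accuracy depends on the number of samples collected in
the class of interest, not on the sampling rate itself, so large flows
are measured accurately very quickly while small flows take longer.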

Cheers,
Peter

On Fri, Jul 13, 2012 at 10:30 AM, David Hubbard
<dhubbard () dino hostasaurus com> wrote:
Can anyone on or off list give me some real world
thoughts on sflow vs netflow for border
routers? (multi-homed, BGP, straight v4 & v6 only
for web hosting, no mpls, vpns, vlans, etc.)

Finding it hard to decipher the vendor version
of the answer to that question.  We use
netflow v9 currently but are considering hardware
that would be sflow.  We don't use it for
billing purposes, mostly for spotting malicious
remote hosts doing things like scans, spotting
traffic such as weird ports in use in either
direction that warrant further investigation,
watching for ddos/dos destinations to act on
mitigation, or investigating the nature of unusual
levels of traffic on switch ports that set off
alarms.  I'm concerned things like port scans,
etc. won't be picked up by the NMS if fed by
sflow due to the sampling nature, or similar
concern if 500 ssh connections by the same remote
host are sampled as 1 connection, etc.  Of course
these concerns were put in my head by someone
interested in me continuing to use equipment that
happens to output netflow data, hence me wanting some
real people answers. :-)

Thanks!



