Firewall Wizards mailing list archives
Re: parsing logs ultra-fast inline
From: "Adrian Grigorof" <adi () grigorof com>
Date: Mon, 6 Feb 2006 23:08:59 -0500
To clarify this issue, I did not mean the total number of message types that
you can find in the documentation. I meant the message types that you find
in a typical log. While I was not thinking about a Cisco VPN concentrator in
particular when I posted my initial message, I've just run a script against
a log from such a device and there were 103 message types. I would say that's
closer to 50 than to 2049. And Cisco seems to be quite verbose when it comes
to logging. But I remember trying to get some information from a Nortel
Contivity VPN a few years ago - in debug mode there were maybe 10 or 15
message types. Which one would you prefer?
Another example, Cisco PIX firewall - from all the logs that we gathered
from our customers, and there were quite a few, we found a little bit over
160 unique message types. If you check the documentation, you'll probably
find that in theory there could be thousands as well.
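Counting the message types that actually occur in a log could be sketched as below. This is not the script mentioned above, only a minimal illustration. It assumes the message type is the token between "SEV=<n>" and "RPT=" (as in AUTH/23), and the sample lines, including the EXAMPLE/1 type, are made up for the demonstration.

```python
import re
from collections import Counter

# Assumption: the message type is the token after "SEV=<n> ", e.g. "AUTH/23".
MSG_TYPE = re.compile(r"SEV=\d+ (\S+) RPT=")

def count_message_types(lines):
    """Return a Counter mapping each message type to its number of occurrences."""
    counts = Counter()
    for line in lines:
        m = MSG_TYPE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Made-up sample lines; "EXAMPLE/1" is a hypothetical message type.
SAMPLE_LINES = [
    "04/22/2005 10:10:43.280 SEV=4 AUTH/23 RPT=1560 192.168.176.104 ...",
    "04/22/2005 10:12:01.110 SEV=4 AUTH/23 RPT=1561 192.168.176.105 ...",
    "04/22/2005 10:12:05.000 SEV=5 EXAMPLE/1 RPT=7 192.168.176.105 ...",
]

counts = count_message_types(SAMPLE_LINES)
for msg_type, n in counts.most_common():
    print(msg_type, n)
print("unique message types:", len(counts))
```

On a real log, the lines would come from iterating over the open file instead of a list.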
I am not a super-programmer but I did a little test. I took a Cisco VPN log
entry and measured how long it took me to write a regex for it. So for this:
2005-04-22 10:10:43 Local0.Notice 192.168.83.130 4721869 04/22/2005
10:10:43.280 SEV=4 AUTH/23 RPT=1560 192.168.176.104 User [192.168.176.104]
Group [192.168.176.104] disconnected: duration: 11:36:49
it took me 2 minutes to write this:
(\d{4})-(\d{2})-(\d{2}) (\d\d:\d\d:\d\d)\t(.*?)\t(.*?)\t(.*?) (.*?) (.*?)
SEV=(.*?) (.*?) RPT=(.*?) (.*?) User \[(.*?)\] Group \[(.*?)\] (.*?):
duration: (.*)
A regex that would capture the information applicable to an AUTH/23 message
type. And I can say that it was more a typing skills issue rather than one
of regex knowledge. Now, 103 unique message types for a Cisco VPN x 2 min =
206 min or approx. 3.5 hours. But let's make it 35 hours... it's still just a
few days of work for a medium level programmer to generate the regex for the
most common messages from a Cisco VPN. You want to do all 2049 of them? Do
the math - 8.5 days (8 hours per day) but let's be a good boss, give the
programmer a whole month! Better still, you can outsource it to Eastern
Europe or India. The good part, you only need to do it once.
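For illustration, the AUTH/23 regex above can be exercised in Python roughly as follows. This is a sketch under assumptions: the regex expects tab characters between the leading syslog fields, so the sample line is rebuilt here with tabs (the archived copy shows spaces), and the field names in FIELDS are hypothetical labels for the 17 capture groups, not names from any Cisco documentation.

```python
import re

# The regex from the message, joined onto one line. The \t separators assume
# the original log was tab-delimited (the archived sample shows spaces).
AUTH_23 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2}) (\d\d:\d\d:\d\d)\t(.*?)\t(.*?)\t(.*?) (.*?) (.*?) "
    r"SEV=(.*?) (.*?) RPT=(.*?) (.*?) User \[(.*?)\] Group \[(.*?)\] (.*?): "
    r"duration: (.*)"
)

# Hypothetical labels for the 17 capture groups, in order.
FIELDS = ("year", "month", "day", "time", "priority", "host", "sequence",
          "event_date", "event_time", "severity", "msg_type", "repeat",
          "peer", "user", "group", "action", "duration")

# The sample entry from the message, with tabs assumed between syslog fields.
SAMPLE = (
    "2005-04-22 10:10:43\tLocal0.Notice\t192.168.83.130\t4721869 04/22/2005 "
    "10:10:43.280 SEV=4 AUTH/23 RPT=1560 192.168.176.104 "
    "User [192.168.176.104] Group [192.168.176.104] "
    "disconnected: duration: 11:36:49"
)

def parse_auth_23(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = AUTH_23.match(line)
    return dict(zip(FIELDS, m.groups())) if m else None

record = parse_auth_23(SAMPLE)
print(record["msg_type"], record["action"], record["duration"])
```

Compiling the pattern once and reusing it keeps the per-line cost down, which matters more than the writing time once the logs get large.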
So let's face it, for practical purposes, the number of message types is not
an issue, not if you are willing to hardcode your program for a particular
device. What is an issue is the size of the logs - and yes, if you have to
analyze large log files, each with a large number of message types, it may
require all sorts of tricks to do the job, including the type of parallel
processing that Marcus was mentioning. But what choice do you have when this
is the case? For performance reasons, you have to hardcode the parsing of
each message type. More than that, you have to hardcode them so the most
common ones are checked first.
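Checking the most common message types first could look something like the sketch below. The OrderedParser class and the three message formats are hypothetical and exist only to show the idea: patterns are tried in order, hits are counted, and the list is re-sorted so that frequently matched types move to the front.

```python
import re
from collections import Counter

class OrderedParser:
    """Try hardcoded patterns in order; reorder so frequent ones come first."""

    def __init__(self, patterns):
        # patterns: list of (name, compiled regex); the initial order is a guess
        self.patterns = list(patterns)
        self.hits = Counter()

    def parse(self, line):
        """Return (name, groups) for the first matching pattern, else (None, ())."""
        for name, rx in self.patterns:
            m = rx.search(line)
            if m:
                self.hits[name] += 1
                return name, m.groups()
        return None, ()

    def reorder(self):
        # Stable sort: the most frequently matched types move to the front.
        self.patterns.sort(key=lambda p: -self.hits[p[0]])

# Hypothetical message formats, for illustration only.
parser = OrderedParser([
    ("login", re.compile(r"User \[(.*?)\] connected")),
    ("deny", re.compile(r"Deny (\S+) from (\S+)")),
    ("logout", re.compile(r"User \[(.*?)\] disconnected: duration: (.*)")),
])

lines = [
    "User [alice] disconnected: duration: 11:36:49",
    "User [bob] disconnected: duration: 00:05:10",
    "Deny tcp from 10.0.0.1",
]
results = [parser.parse(line) for line in lines]
parser.reorder()
order = [name for name, _ in parser.patterns]
print(order)
```

In a hardcoded parser the order would more likely be fixed in advance from statistics gathered on real logs; the runtime reordering here just makes the principle visible.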
That being said, interpreting the results of parsing is as hard for small
logs as it is for large logs and much harder than writing some regular
expressions. And that was the point of my initial post: there were several
discussions about how to parse the logs but hardly anything about what to do
with the results.
I remember the
http://airsnarf.shmoo.com/pipermail/loganalysis/2005-December/002906.html
thread - there were a couple of interesting messages, mostly about the
inability to correlate logs from various sources. I couldn't see though how
that was a log parsing problem - and in fact hardly anybody complained about
being unable to parse the logs. Instead, the most common issue was the
(in)ability to extract useful data from these logs.
Regards,
Adrian Grigorof
Altair Technologies
www.altairtech.ca
www.eventid.net
----- Original Message -----
From: "Anton Chuvakin" <anton () chuvakin org>
To: "Adrian Grigorof" <adi () grigorof com>;
<firewall-wizards () honor icsalabs com>
Sent: Monday, February 06, 2006 17:05
Subject: Re: [fw-wiz] parsing logs ultra-fast inline
All,
While I am preparing to enter this discussion in full force :-), I
figured I'd shoot a quick one on this:
> meaning. Take Tina's VPN example - how many types of log entries you would
> expect from a VPN concentrator? From my experience, not more than 20 but
> let's assume there are 50. Give a sample from each entry to a Perl
He-he, no :-) I just looked at the old documentation bundle of Cisco VPN 3000
messages and it's nowhere near the above. How about 2049 unique messages
documented by Cisco?
Parsing IS often a challenge, e.g. see this and the discussion that ensued:
http://airsnarf.shmoo.com/pipermail/loganalysis/2005-December/002906.html
Syslog is where it becomes just plain extreme (50,000 message types
anybody?), as Marcus pointed out, but there are some other fun areas where
it is tough.
Best,
--
Anton Chuvakin, Ph.D., GCIA, GCIH, GCFA
http://www.chuvakin.org
http://www.securitywarrior.com
_______________________________________________
firewall-wizards mailing list
firewall-wizards () honor icsalabs com
http://honor.icsalabs.com/mailman/listinfo/firewall-wizards
