Tuesday, June 5, 2012

Apache grep big log file

I need to parse an Apache log file to look for specific suspicious patterns (like SQL injections).

For example I'm looking for id='%20or%201=1;

I am using grep to check the log file for this pattern (and others), and because these logs are huge it takes a long time.

Here is my command:

grep 'id=' Apache.log | egrep "' or|'%20"

Is there a better or faster method or command I should use to speed up the search?

Source: Tips4all


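  1. For starters, you don't need to pipe your grep output to egrep. egrep provides a superset of grep's regular expression parsing, so you can combine both steps into a single pattern (note that it requires the quote to come right after id=, which is slightly stricter than your pipeline):

    egrep "id='( or|%20)" apache.log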

    Calling egrep is identical to calling grep -E.

    That may get you a little performance increase. If you can look for fixed strings rather than regular expressions, that might also help. You can tell grep to look for a fixed string with the -F option:

    grep -F "id='%20or" apache.log

    But with fixed strings you lose a lot of flexibility, since each variant of the pattern has to be spelled out literally.

  2. I assume most of the time is spent reading the data from disk (that is, CPU usage is not maxed out). In that case, tweaking the pattern won't help much. You could, however, try logging only the interesting lines to a separate file.

  3. Are you looking for grep -E "id=(' or|'%20)" apache.log ?
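
To compare the suggestions above side by side, here is a minimal sketch that runs the single-pass regex and a fixed-string equivalent against a tiny sample log. The log lines, the temporary file, and the variable names are made up for illustration; on a real system you would point LOG at your actual Apache access log instead.

```shell
#!/bin/sh
# Hypothetical sample data standing in for a real Apache access log.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
10.0.0.1 - - [05/Jun/2012:10:00:00 +0000] "GET /item.php?id=42 HTTP/1.1" 200 512
10.0.0.2 - - [05/Jun/2012:10:00:01 +0000] "GET /item.php?id='%20or%201=1; HTTP/1.1" 200 512
10.0.0.3 - - [05/Jun/2012:10:00:02 +0000] "GET /item.php?id=' or 1=1 HTTP/1.1" 200 512
10.0.0.4 - - [05/Jun/2012:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 1024
EOF

# One pass with an extended regex instead of the grep | egrep pipeline.
# -c prints the number of matching lines rather than the lines themselves.
regex_hits=$(grep -E -c "id=(' or|'%20)" "$LOG")

# Fixed strings: -F turns off regex parsing, and each -e adds one
# literal alternative, so every variant must be listed explicitly.
fixed_hits=$(grep -F -c -e "id=' or" -e "id='%20" "$LOG")

echo "regex: $regex_hits fixed: $fixed_hits"

# Clean up the temporary sample file.
rm -f "$LOG"
```

On the sample data both commands flag the same two requests. Whether the -F variant is noticeably faster on a huge log depends on your grep version and on whether the disk is the bottleneck, so it is worth timing both on a real file.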