2:57AM on a Monday. I have to be up at 8AM. The faster I get the job done, the more sleep I get. Sounds like the kind of thing to motivate a person.

TASK: Parse an access.log file and produce a page-visit trace for each visitor. Ex:

11.22.33.90 on Monday at 3pm   (Montreal, Firefox 4, on Mac OS X):
  /contents          (stayed for 3 secs)
  /derivatives       (stayed for 2m20sec)
  /contents          (6 secs)
  /derivative_rules  (1min)
  /derivative_formulas  (2min)
  end

I had already found some access.log parsing code and set up a processing pipeline the last time I wanted to work on this. Here is what we have so far.

3:45AM. Here is the plan. All the log entries are in a list called entries, which I will now sort and split by IP.
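
A minimal sketch of the sort-and-split step, assuming each item in entries carries an ip, a time (a datetime) and a path. The Entry tuple, the split_by_ip and trace helpers, and the sample rows are all made up for illustration; the real parser's fields may well be named differently.

```python
from collections import namedtuple
from datetime import datetime
from itertools import groupby
from operator import attrgetter

# Hypothetical entry shape; the actual parsing code may use other field names.
Entry = namedtuple("Entry", ["ip", "time", "path"])


def split_by_ip(entries):
    """Sort by (ip, time), then group so each IP maps to its time-ordered visits."""
    ordered = sorted(entries, key=attrgetter("ip", "time"))
    return {ip: list(group) for ip, group in groupby(ordered, key=attrgetter("ip"))}


def trace(visits):
    """Pair each path with the seconds until the next request (None for the last page)."""
    result = []
    for current, nxt in zip(visits, visits[1:] + [None]):
        stay = (nxt.time - current.time).total_seconds() if nxt else None
        result.append((current.path, stay))
    return result


# Made-up sample data, deliberately out of order.
entries = [
    Entry("11.22.33.90", datetime(2011, 5, 2, 15, 0, 3), "/derivatives"),
    Entry("55.66.77.80", datetime(2011, 5, 2, 15, 0, 1), "/contents"),
    Entry("11.22.33.90", datetime(2011, 5, 2, 15, 0, 0), "/contents"),
]

by_ip = split_by_ip(entries)
for ip, visits in by_ip.items():
    print(ip, trace(visits))
```

groupby only merges adjacent items, which is why the sort has to come first. The time on the last page of a visit cannot be known from the log alone, so the sketch leaves it as None.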

4:15AM. Done. Though I still have to clean up the output some more.
