Common Log Format

March 2004 | Fredrik Lundh

Here’s a simple regular expression that can be used to parse server log files in the Common Log Format.

 
import re

p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    )

for line in file.readlines():
    m = p.match(line)
    if not m:
        continue
    host, ignore, user, date, request, status, size = m.groups()
    ...
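
To see what the groups contain, here is a minimal sketch that runs the pattern against a single entry. The sample line is made up for illustration and is not taken from a real log. One thing to watch for: [^ ]* also matches a newline, so the final size field keeps the line terminator unless the line is stripped first. The sketch calls rstrip() before matching for that reason.

```python
import re

# Same pattern as above, written as a raw string.
p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    )

# Hypothetical sample entry, for illustration only.
line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326\n')

# Strip the trailing newline so it does not end up in the size field.
m = p.match(line.rstrip())
host, ignore, user, date, request, status, size = m.groups()

print(host)     # 127.0.0.1
print(user)     # frank
print(date)     # 10/Oct/2000:13:55:36 -0700
print(request)  # GET /apache_pb.gif HTTP/1.0
print(status)   # 200
print(size)     # 2326
```

All seven values come back as strings; a field the server did not record shows up as "-".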

Here’s a variation that parses the Extended Common Log Format (Apache calls it the Combined Log Format), which adds referrer and user-agent fields at the end of each line.

 
p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    r' "([^"]*)" "([^"]*)"' # extensions
    )


for line in file.readlines():
    m = p.match(line)
    if not m:
        continue
    (host, ignore, user, date, request, status, size,
        referer, agent) = m.groups()
    ...
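
The matched fields are all strings, so a typical next step is to convert some of them. The following is a rough sketch, not part of the original recipe: parse_line is a hypothetical helper name, the sample entry is invented, and the sketch assumes the status field is always numeric and that the timestamp uses English month abbreviations (the %b directive depends on the current locale).

```python
import re
from datetime import datetime

p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    r' "([^"]*)" "([^"]*)"' # extensions
    )

def parse_line(line):
    # Hypothetical helper: returns a dict of converted fields,
    # or None if the line does not match the pattern.
    m = p.match(line)
    if not m:
        return None
    (host, ignore, user, date, request, status, size,
        referer, agent) = m.groups()
    return {
        "host": host,
        "user": None if user == "-" else user,
        # e.g. "10/Oct/2000:13:55:36 -0700"; %z handles the UTC offset
        "time": datetime.strptime(date, "%d/%b/%Y:%H:%M:%S %z"),
        "request": request,
        "status": int(status),  # assumes a numeric status code
        "size": 0 if size == "-" else int(size),  # "-" means no body sent
        "referer": referer,
        "agent": agent,
    }

# Hypothetical sample entry, for illustration only.
sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" '
          '"Mozilla/4.08 [en] (Win98; I ;Nav)"\n')

entry = parse_line(sample)
print(entry["status"], entry["size"], entry["time"].isoformat())
```

Returning None for lines that do not match mirrors the continue in the loop above, so the caller can skip malformed entries in the same way.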
