The bz2 Module

(New in 2.3) The bz2 module provides tools for bzip2 compression, the format used by the command-line tool of the same name. The format is based on the Burrows-Wheeler block-sorting algorithm, combined with Huffman coding. bzip2 usually compresses somewhat better than the more commonly used zlib/deflate format (the output is typically around 10% smaller), at the cost of slower compression.

To compress data in a string, use the compress function. It returns an 8-bit string containing compressed data. To get the original data back, use the decompress function:

Example: Using the bz2 module
# File:

import bz2

MESSAGE = "the meaning of life"

compressed_message = bz2.compress(MESSAGE)
decompressed_message = bz2.decompress(compressed_message)

print "original:", repr(MESSAGE)
print "compressed message:", repr(compressed_message)
print "decompressed message:", repr(decompressed_message)
$ python
original: 'the meaning of life'
compressed message: 'BZh91AY&SY\xcb\x18\xf4\x9e\x00\x00\t\x11
\x80@\x00#\xe7\x84\x00 \x00"\x8d\x94\xc3!\x03@\xd0\x00\xfb\xf6
decompressed message: 'the meaning of life'

(Note that for very short strings like this, the compressed byte stream is actually larger than the original string.)
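The size overhead mentioned above can be checked directly. The following is a minimal sketch, not part of the original example set: it compares input and output sizes for the short message and for a longer, highly repetitive input. The variable names and the repetition count are assumptions chosen for illustration, and the code uses byte literals and a single-argument print call so that it runs under both Python 2.6+ and Python 3.

```python
import bz2

short_text = b"the meaning of life"
# Repeat the message to get a longer, highly repetitive input
# (the factor 1000 is arbitrary, chosen only for this sketch).
long_text = short_text * 1000

short_packed = bz2.compress(short_text)
long_packed = bz2.compress(long_text)

# The fixed stream overhead dominates for the short input,
# while the repetitive input shrinks considerably.
print("short: %d -> %d bytes" % (len(short_text), len(short_packed)))
print("long:  %d -> %d bytes" % (len(long_text), len(long_packed)))
```

The exact byte counts depend on the bzip2 library in use, so none are shown here.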

The module also provides BZ2Compressor and BZ2Decompressor classes, which support incremental compression and decompression. In the following example, the string is compressed word by word, and then decompressed by a single call to the decompress function:

Example: Using the bz2 module for incremental compression
# File:

import bz2

text = "the meaning of life"
data = ""

comp = bz2.BZ2Compressor()

for word in text.split():
    data += comp.compress(word + " ")

data += comp.flush()

print repr(bz2.decompress(data))
$ python
'the meaning of life '
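Decompression can be done incrementally as well. As a rough sketch (the 16-byte chunk size and the variable names are assumptions made for illustration), the compressed data can be fed to a BZ2Decompressor object piece by piece, as if it arrived over a network connection. The code is written to run under both Python 2.6+ and Python 3.

```python
import bz2

original = b"the meaning of life " * 100
packed = bz2.compress(original)

decomp = bz2.BZ2Decompressor()
pieces = []
chunk_size = 16  # arbitrary small size, to force many calls

# Feed the compressed stream to the decompressor a few bytes at a time.
# Each call returns whatever output is available so far (possibly nothing).
for start in range(0, len(packed), chunk_size):
    pieces.append(decomp.decompress(packed[start:start + chunk_size]))

restored = b"".join(pieces)
print(restored == original)
```

Note that a decompressor object handles a single stream; once the end of the stream has been reached, further calls to decompress raise EOFError.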

The module also makes it easy to read and write compressed files. The BZ2File class is similar to open, but automatically decompresses data as it is read from the file, and compresses data as it is written to it.

Example: Using the bz2 module for stream compression
# File:

import bz2

# open the compressed file for reading; lines are decompressed on the fly
infile = bz2.BZ2File("samples/sample.bz2", "r")

for line in infile:
    print repr(line)

infile.close()
$ python
'We will perhaps eventually be writing only small\n'
'modules which are identified by name as they are\n'
'used to build larger ones, so that devices like\n'
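Writing works the same way in the other direction. The sketch below is an illustration only: it writes two lines to a compressed file and reads them back. The file name sketch.bz2 and the temporary directory are hypothetical, chosen so that the example does not touch any existing files. It uses try/finally rather than a with statement so that it also runs on older Python 2 releases.

```python
import bz2
import os
import tempfile

# Hypothetical file name in a fresh temporary directory (sketch only).
path = os.path.join(tempfile.mkdtemp(), "sketch.bz2")

# Data written to the file object is compressed before it hits the disk.
out = bz2.BZ2File(path, "w")
try:
    out.write(b"first line\n")
    out.write(b"second line\n")
finally:
    out.close()

# Reading the file back decompresses the data transparently.
infile = bz2.BZ2File(path, "r")
try:
    lines = [line for line in infile]
finally:
    infile.close()

print(repr(lines))
```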
