This is an old copy of the Python FAQ. The information here may be outdated.

How do I get data out of HTML?

Try Beautiful Soup:

http://www.crummy.com/software/BeautifulSoup

Beautiful Soup is more forgiving than strict parsers: it won’t choke on bad markup such as unclosed or badly nested tags, and it still gives you a tree you can search.
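As a rough illustration, here is a minimal sketch of pulling links out of sloppy markup. It assumes the current `bs4` package (installed with `pip install beautifulsoup4`) rather than the older module this FAQ was written against, and it uses Python's built-in `html.parser` backend. The helper name `extract_links` and the sample markup are made up for this example.

```python
from bs4 import BeautifulSoup

# Deliberately bad markup: <h1> and <li> are never closed,
# and the closing </ul>, </body> and </html> tags are missing.
SAMPLE_HTML = """
<html><body>
<h1>Links
<ul>
<li><a href="http://example.com/a">First</a>
<li><a href="http://example.com/b">Second</a>
"""


def extract_links(html):
    """Return a list of (link text, href) pairs found in the document.

    extract_links is a hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a"):
        href = anchor.get("href")  # None if the tag has no href attribute
        if href is not None:
            links.append((anchor.get_text(strip=True), href))
    return links


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

The same pattern should work for other tags: `find_all` takes a tag name, and `get` reads an attribute without raising an error when it is missing.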

If you want to parse HTML into a structure compatible with Python’s ElementTree library, you can use the ElementSoup adapter:

http://effbot.org/zone/element-soup.htm

CATEGORY: tutor

 
