Welcome to Grab’s documentation!

Grab is a Python framework for building web scrapers. With Grab you can build web scrapers of varying complexity, from simple 5-line scripts to complex asynchronous website crawlers that process millions of web pages. Grab provides an API for performing network requests and for handling the received content, e.g. interacting with the DOM tree of an HTML document.

There are two main parts in the Grab library:

1) The single request/response API, which lets you build a network request, perform it, and work with the received content. This API is a wrapper around the pycurl and lxml libraries.

2) The Spider API for building asynchronous web crawlers. You write a class that defines a handler for each type of network request. Each handler can spawn new network requests. Network requests are processed simultaneously with a pool of asynchronous web sockets.
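The handler-per-request-type model described in the second item can be illustrated with a small offline sketch. This is not Grab's API: it uses only the standard library's asyncio, and every name in it (ToySpider, Task, FAKE_SITE, the task_<name> handler naming, the concurrency parameter) is a hypothetical stand-in chosen for illustration. Network access is simulated with an in-memory dictionary so the example runs without a connection.

```python
import asyncio
from collections import namedtuple

# Hypothetical description of a request: a handler name plus a URL.
Task = namedtuple("Task", ["name", "url"])

# Simulated "web" so the sketch runs offline: URL -> (title, outgoing links).
FAKE_SITE = {
    "http://example.com/": ("Home", ["http://example.com/a", "http://example.com/b"]),
    "http://example.com/a": ("Page A", []),
    "http://example.com/b": ("Page B", []),
}


class ToySpider:
    """Routes each finished request to the method named task_<task.name>."""

    def __init__(self, concurrency=2):
        self.concurrency = concurrency
        self.titles = {}

    def task_generator(self):
        # Initial requests that seed the crawl.
        yield Task("index", "http://example.com/")

    def task_index(self, body, task):
        # Handler for "index" requests: record the title, spawn "page" requests.
        title, links = body
        self.titles[task.url] = title
        for link in links:
            yield Task("page", link)

    def task_page(self, body, task):
        # Handler for "page" requests: record the title, spawn nothing.
        title, _links = body
        self.titles[task.url] = title
        return ()

    async def fetch(self, url):
        # Stand-in for a real network call.
        await asyncio.sleep(0.01)
        return FAKE_SITE[url]

    async def worker(self, queue):
        while True:
            task = await queue.get()
            try:
                body = await self.fetch(task.url)
                handler = getattr(self, "task_" + task.name)
                # Any tasks produced by the handler go back onto the queue.
                for new_task in handler(body, task) or ():
                    queue.put_nowait(new_task)
            finally:
                queue.task_done()

    async def run(self):
        queue = asyncio.Queue()
        for task in self.task_generator():
            queue.put_nowait(task)
        workers = [
            asyncio.create_task(self.worker(queue))
            for _ in range(self.concurrency)
        ]
        await queue.join()  # wait until every queued task has been handled
        for w in workers:
            w.cancel()
        await asyncio.gather(*workers, return_exceptions=True)
        return self.titles


def crawl():
    return asyncio.run(ToySpider().run())


if __name__ == "__main__":
    print(crawl())
```

The point of the sketch is the shape of the design: the crawl logic lives in small handlers keyed by request type, while a shared pool of workers decides when each request actually runs. Grab's own Spider classes and method signatures are described in the User Manual and the API Reference.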

Table of Contents

Grab User Manual

API Reference

Using the API Reference you can get an overview of what modules, classes, and methods exist, what they do, what they return, and what parameters they accept.

Indices and tables