Grab is a site scraping framework. Grab can be used for:
- website data mining
- working with network APIs
- automating actions performed on websites, e.g. creating a profile on some site
Grab consists of the following parts:
- Grab interface for creating network requests and working with the results of these requests. This interface is a good fit for simple scripts that do not need multithreading.
- Grab::Spider interface, which lets you develop complex multithreaded asynchronous site scrapers. This interface has two main benefits:
  - It forces your spider to have a clean structure
  - It lets you perform many concurrent requests without heavy CPU/memory consumption
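The two benefits listed for the spider interface can be illustrated with a minimal sketch. This is not Grab's code or API: it is a standard-library imitation, and `MiniSpider`, `Task`, `LinkCounter`, `fetch`, `thread_number` and the `task_<name>` handler convention are all assumptions made for illustration. It uses a plain thread pool instead of Grab's own network transport. Each kind of page gets its own handler method (the clean structure), and downloads run concurrently in a small pool while handlers run one at a time.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class Task:
    """A unit of work: which handler to call and which URL to fetch."""

    def __init__(self, name, url):
        self.name = name
        self.url = url


class MiniSpider:
    """Toy imitation of a task/handler spider (hypothetical, not Grab code)."""

    initial_urls = []

    def __init__(self, fetch, thread_number=4):
        self.fetch = fetch                  # callable: url -> page body
        self.thread_number = thread_number  # size of the download pool

    def task_initial(self, body, task):
        """Default handler for initial_urls; subclasses override it."""
        return []

    def run(self):
        queue = deque(Task("initial", url) for url in self.initial_urls)
        with ThreadPoolExecutor(max_workers=self.thread_number) as pool:
            while queue:
                batch = list(queue)
                queue.clear()
                # Downloads for the whole batch run concurrently in the pool.
                bodies = pool.map(lambda t: self.fetch(t.url), batch)
                # Handlers run one at a time in this thread, so they can
                # update shared state without locks.
                for task, body in zip(batch, bodies):
                    handler = getattr(self, "task_" + task.name)
                    queue.extend(handler(body, task) or ())


class LinkCounter(MiniSpider):
    """Example spider: follows URLs on the start page, records body sizes."""

    initial_urls = ["http://example.com/"]

    def __init__(self, fetch, thread_number=4):
        super().__init__(fetch, thread_number)
        self.sizes = {}

    def task_initial(self, body, task):
        # Each whitespace-separated token of the start page is a URL to visit.
        for url in body.split():
            yield Task("page", url)

    def task_page(self, body, task):
        self.sizes[task.url] = len(body)


if __name__ == "__main__":
    # In-memory "site" so the sketch runs without network access.
    pages = {
        "http://example.com/": "http://example.com/a http://example.com/b",
        "http://example.com/a": "alpha",
        "http://example.com/b": "be",
    }
    bot = LinkCounter(fetch=pages.__getitem__, thread_number=2)
    bot.run()
    print(sorted(bot.sizes.items()))
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library; a real scraper would supply a function that performs the network request.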
TODO:
* Working with proxies
* Utilities:
  * process_links
  * process_next_page
  * inc_count/add_item/save_list/render_stats/save_all_lists
  * process_object_image
If you are looking for information on a specific function, class or method, this part of the documentation is for you.
Base Interface
Extensions:
Tools: