Scrapy
Scrapy
Python web-crawling framework
Scrapy (/ˈskreɪpaɪ/[2] SKRAY-peye) is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general-purpose web crawler.[3] It is currently maintained by Zyte (formerly Scrapinghub), a web-scraping development and services company.
Scrapy project architecture is built around "spiders", which are self-contained crawlers that are given a set of instructions. Following the spirit of other don't repeat yourself frameworks, such as Django,[4] it makes it easier to build and scale large crawling projects by allowing developers to reuse their code.
Some well-known companies and products using Scrapy are: Lyst,[5][6] Parse.ly,[7] Sayone Technologies,[8] Sciences Po Medialab,[9] Data.gov.uk’s World Government Data site.[10]