Scrapy 2.14.1

A high-level Web Crawling and Web Scraping framework

Stars: 59753, Watchers: 59753, Forks: 11241, Open Issues: 645

The scrapy/scrapy repo was created 16 years ago and the last code push was 4 days ago.
The project is very popular, with 59753 GitHub stars.

How to Install scrapy

You can install Scrapy using pip:

pip install scrapy

Or add it to a project with Poetry:

poetry add scrapy
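
After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of Scrapy.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    `installed_version` is an illustrative helper, not a Scrapy API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if Scrapy is installed in this environment, else None.
print(installed_version("scrapy"))
```

If this prints `None`, the install most likely went into a different environment than the one running the script.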

Package Details

Author: None
License: None
Homepage: None
PyPI: https://pypi.org/project/Scrapy/
Documentation: https://docs.scrapy.org/
GitHub Repo: https://github.com/scrapy/scrapy

Classifiers

  • Internet/WWW/HTTP
  • Software Development/Libraries/Application Frameworks
  • Software Development/Libraries/Python Modules

Errors

A list of common scrapy errors.

Code Examples

Here are some scrapy code examples and snippets.

GitHub Issues

The scrapy package has 645 open issues on GitHub, including:

  • Suggest the user to set FORCE_CRAWLER_PROCESS when needed
  • Docs: Document job directory contents in jobs.rst
  • Failed test_start_deprecated_super with -n auto just after git clone
  • Add reactorless mode docs
  • Add Request to Response lifecycle documentation
  • Get rid of get_event_loop()
  • Foundations for the reactorless mode.
  • Mark settings that are specific to the Twisted reactor
  • Add a plain asyncio code path to AsyncCrawlerProcess
  • Add a reactorless test env
  • Refactor the shell to support reactorless mode and/or not running in a thread
  • Add a setting for enabling the reactorless mode
  • Check is_reactorless() in reactor-dependent components
  • Change is_asyncio_available() and add is_reactorless()
  • Use asyncio.in_thread() when available in FilesPipeline storages
