
Scrapy 2.14.1
A high-level Web Crawling and Web Scraping framework
Stars: 59753, Watchers: 59753, Forks: 11241, Open Issues: 645
The scrapy/scrapy repo was created 16 years ago and the last code push was 4 days ago.
The project is extremely popular, with 59753 GitHub stars.
How to Install scrapy
You can install scrapy using pip
pip install scrapy
or add it to a project with poetry
poetry add scrapy
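After installing, it can be useful to confirm which version ended up in the active environment. The snippet below is a minimal sketch that uses only the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of Scrapy.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of Scrapy itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("scrapy")
    print(f"scrapy {found}" if found else "scrapy is not installed")
```

Returning None instead of raising keeps the check easy to use in setup scripts that want to branch on whether the package is present.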
Package Details
- Author
- None
- License
- None
- Homepage
- None
- PyPI:
- https://pypi.org/project/Scrapy/
- Documentation:
- https://docs.scrapy.org/
- GitHub Repo:
- https://github.com/scrapy/scrapy
Classifiers
- Internet/WWW/HTTP
- Software Development/Libraries/Application Frameworks
- Software Development/Libraries/Python Modules
Code Examples
Here are some scrapy code examples and snippets.
GitHub Issues
The scrapy package has 645 open issues on GitHub
- Suggest the user to set FORCE_CRAWLER_PROCESS when needed
- Docs: Document job directory contents in jobs.rst
- Failed test_start_deprecated_super with -n auto just after git clone
- Add reactorless mode docs
- Add Request to Response lifecycle documentation
- Get rid of get_event_loop()
- Foundations for the reactorless mode.
- Mark settings that are specific to the Twisted reactor
- Add a plain asyncio code path to AsyncCrawlerProcess
- Add a reactorless test env
- Refactor the shell to support reactorless mode and/or not running in a thread
- Add a setting for enabling the reactorless mode
- Check is_reactorless() in reactor-dependent components
- Change is_asyncio_available() and add is_reactorless()
- Use asyncio.in_thread() when available in FilesPipeline storages