=====
w3lib
=====

.. image:: https://secure.travis-ci.org/scrapy/w3lib.png?branch=master
   :target: http://travis-ci.org/scrapy/w3lib

.. image:: https://img.shields.io/codecov/c/github/scrapy/w3lib/master.svg
   :target: http://codecov.io/github/scrapy/w3lib?branch=master
   :alt: Coverage report

Overview
========

This is a Python library of web-related functions that can, for example:

- remove comments or tags from HTML snippets
- extract the base URL from HTML snippets
- translate entities in HTML strings
- convert raw HTTP headers to dicts and vice versa
- construct HTTP auth headers
- convert HTML pages to unicode
- sanitize URLs (like browsers do)
- extract arguments from URLs

Requirements
============

Python 2.7 or Python 3.5+

Install
=======

``pip install w3lib``

Documentation
=============

See http://w3lib.readthedocs.org/

License
=======

The w3lib library is licensed under the BSD license.