scrapy / Parsel

Licence: other
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Parsel

Sirix
SirixDB is a temporal, evolutionary database system, which uses an accumulate only approach. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach called sliding snapshot.
Stars: ✭ 638 (+1.59%)
Mutual labels:  hacktoberfest, xml, xpath
Xquery
Extract data or evaluate value from HTML/XML documents using XPath
Stars: ✭ 155 (-75.32%)
Mutual labels:  xml, scraping, xpath
Htmlparser2
The fast & forgiving HTML and XML parser
Stars: ✭ 3,299 (+425.32%)
Mutual labels:  hacktoberfest, xml
Exist
eXist Native XML Database and Application Platform
Stars: ✭ 294 (-53.18%)
Mutual labels:  xml, xpath
Xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Stars: ✭ 335 (-46.66%)
Mutual labels:  xml, xpath
XPath2.Net
Lightweight XPath2 for .NET
Stars: ✭ 26 (-95.86%)
Mutual labels:  xml, xpath
XPathTools
A Visual Studio Extension which can run any XPath and XPath function; navigates through results at the click of a button. Can show and copy any XPath incl. XML namespaces, avoiding XML namespace induced headaches. Keeps track of the current XPath via the statusbar.
Stars: ✭ 40 (-93.63%)
Mutual labels:  xml, xpath
Fluentdom
A fluent api for working with XML in PHP
Stars: ✭ 327 (-47.93%)
Mutual labels:  xml, xpath
Panther
A browser testing and web crawling library for PHP and Symfony
Stars: ✭ 2,480 (+294.9%)
Mutual labels:  hacktoberfest, scraping
Zek
Generate a Go struct from XML.
Stars: ✭ 451 (-28.18%)
Mutual labels:  hacktoberfest, xml
Camaro
camaro is a utility to transform XML to JSON, using a Node.js binding to the native XML parser pugixml, one of the fastest XML parsers around.
Stars: ✭ 438 (-30.25%)
Mutual labels:  xml, xpath
Ferret
Declarative web scraping
Stars: ✭ 4,837 (+670.22%)
Mutual labels:  hacktoberfest, scraping
selectorlib
A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them
Stars: ✭ 53 (-91.56%)
Mutual labels:  scraping, xpath
Horaires Ratp Api
Web service for real-time RATP schedules and traffic
Stars: ✭ 232 (-63.06%)
Mutual labels:  hacktoberfest, xml
Dita Ot
DITA Open Toolkit — the open-source XML publishing engine for content authored in the Darwin Information Typing Architecture.
Stars: ✭ 279 (-55.57%)
Mutual labels:  hacktoberfest, xml
Home
A configurable and eXtensible Xml serializer for .NET.
Stars: ✭ 208 (-66.88%)
Mutual labels:  hacktoberfest, xml
Spidermon
Scrapy Extension for monitoring spiders execution.
Stars: ✭ 309 (-50.8%)
Mutual labels:  hacktoberfest, scraping
Basex
BaseX Main Repository.
Stars: ✭ 515 (-17.99%)
Mutual labels:  xml, xpath
Configurate
A simple configuration library for Java applications providing a node structure, a variety of formats, and tools for transformation
Stars: ✭ 148 (-76.43%)
Mutual labels:  hacktoberfest, xml
Xbmc
Kodi is an award-winning free and open source home theater/media center software and entertainment hub for digital media. With its beautiful interface and powerful skinning engine, it's available for Android, BSD, Linux, macOS, iOS and Windows.
Stars: ✭ 13,175 (+1997.93%)
Mutual labels:  hacktoberfest, xml

======
Parsel
======

.. image:: https://img.shields.io/travis/scrapy/parsel/master.svg
   :target: https://travis-ci.org/scrapy/parsel
   :alt: Build Status

.. image:: https://img.shields.io/pypi/v/parsel.svg
   :target: https://pypi.python.org/pypi/parsel
   :alt: PyPI Version

.. image:: https://img.shields.io/codecov/c/github/scrapy/parsel/master.svg
   :target: https://codecov.io/github/scrapy/parsel?branch=master
   :alt: Coverage report

Parsel is a BSD-licensed Python_ library to extract and remove data from HTML_ and XML_ using XPath_ and CSS_ selectors, optionally combined with `regular expressions`_.

Find the Parsel online documentation at https://parsel.readthedocs.org.

Example (`open online demo`_):

.. code-block:: python

    >>> from parsel import Selector
    >>> selector = Selector(text=u"""<html>
    ...     <body>
    ...         <h1>Hello, Parsel!</h1>
    ...         <ul>
    ...             <li><a href="http://example.com">Link 1</a></li>
    ...             <li><a href="http://scrapy.org">Link 2</a></li>
    ...         </ul>
    ...     </body>
    ...     </html>""")
    >>> selector.css('h1::text').get()
    'Hello, Parsel!'
    >>> selector.xpath('//h1/text()').re(r'\w+')
    ['Hello', 'Parsel']
    >>> for li in selector.css('ul > li'):
    ...     print(li.xpath('.//@href').get())
    http://example.com
    http://scrapy.org

.. _CSS: https://en.wikipedia.org/wiki/Cascading_Style_Sheets
.. _HTML: https://en.wikipedia.org/wiki/HTML
.. _open online demo: https://colab.research.google.com/drive/149VFa6Px3wg7S3SEnUqk--TyBrKplxCN#forceEdit=true&sandboxMode=true
.. _Python: https://www.python.org/
.. _regular expressions: https://docs.python.org/library/re.html
.. _XML: https://en.wikipedia.org/wiki/XML
.. _XPath: https://en.wikipedia.org/wiki/XPath
