Parsel: Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Stars: ✭ 628 (+1084.91%)
Mutual labels: scraping, xpath
Sqrape: Simple Query Scraping with CSS and Go Reflection (MOVED to GitLab)
Stars: ✭ 144 (+171.7%)
Mutual labels: scraping, web-scraping
Webhere: HTML scraping for Objective-C.
Stars: ✭ 16 (-69.81%)
Mutual labels: scraping, xpath
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+422.64%)
Mutual labels: scraping, web-scraping
reapr: 🕸→ℹ️ Reap Information from Websites
Stars: ✭ 14 (-73.58%)
Mutual labels: web-scraping, xpath
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+7592.45%)
Mutual labels: scraping, web-scraping
Humanoid: Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
Stars: ✭ 88 (+66.04%)
Mutual labels: scraping, web-scraping
top-github-scraper: Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (-24.53%)
Mutual labels: scraping, web-scraping
Scrape Linkedin Selenium: `scrape_linkedin` is a Python package that lets you scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
Stars: ✭ 239 (+350.94%)
Mutual labels: scraping, web-scraping
Xquery: Extract data or evaluate values from HTML/XML documents using XPath
Stars: ✭ 155 (+192.45%)
Mutual labels: scraping, xpath
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+5850.94%)
Mutual labels: scraping, web-scraping
trafilatura: Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+1241.51%)
Mutual labels: scraping, web-scraping
raspagem-de-dados-fatec: 📓 Short course on web data scraping with Python, taught at the FATEC Jundiaí Technology Week
Stars: ✭ 22 (-58.49%)
Mutual labels: scraping, web-scraping
Scrapple: A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+775.47%)
Mutual labels: scraping, web-scraping
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-71.7%)
Mutual labels: scraping, web-scraping
Detect Cms: PHP Library for detecting CMS
Stars: ✭ 78 (+47.17%)
Mutual labels: scraping, web-scraping
codechef-rank-comparator: Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library (XML Path Language).
Stars: ✭ 23 (-56.6%)
Mutual labels: web-scraping, xpath
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+33.96%)
Mutual labels: scraping, web-scraping
Phpscraper: PHP Scraper - a highly opinionated web-interface for PHP
Stars: ✭ 148 (+179.25%)
Mutual labels: scraping, web-scraping
PythonScrapyBasicSetup: Basic setup with random user agents and IP addresses for Python Scrapy Framework.
Stars: ✭ 57 (+7.55%)
Mutual labels: scraping, web-scraping