A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

Stars: ✭ 71 (+33.96%)

Mutual labels: scraping, web-scraping

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (+422.64%)

Mutual labels: scraping, web-scraping

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+5850.94%)

Mutual labels: scraping, web-scraping

Xquery

Extract data or evaluate value from HTML/XML documents using XPath

Stars: ✭ 155 (+192.45%)

Mutual labels: scraping, xpath

Parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Stars: ✭ 628 (+1084.91%)

Mutual labels: scraping, xpath

Sqrape

Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)

Stars: ✭ 144 (+171.7%)

Mutual labels: scraping, web-scraping

Phpscraper

PHP Scraper - a highly opinionated web-interface for PHP

Stars: ✭ 148 (+179.25%)

Mutual labels: scraping, web-scraping

codechef-rank-comparator

Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library (XML Path Language).

Stars: ✭ 23 (-56.6%)

Mutual labels: web-scraping, xpath

trafilatura

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Stars: ✭ 711 (+1241.51%)

Mutual labels: scraping, web-scraping

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-71.7%)

Mutual labels: scraping, web-scraping

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (+775.47%)

Mutual labels: scraping, web-scraping

Scrape Linkedin Selenium

`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles & company pages, turning the data into structured JSON.

Stars: ✭ 239 (+350.94%)

Mutual labels: scraping, web-scraping

ioweb

Web Scraping Framework

Stars: ✭ 31 (-41.51%)

Mutual labels: scraping, web-scraping

BookingScraper

🌎 🏨 Scrape Booking.com 🏨 🌎

Stars: ✭ 68 (+28.3%)

Mutual labels: web-scraping

telenium

Automation for Kivy Application

Stars: ✭ 56 (+5.66%)

Mutual labels: xpath

socials

👨‍👩‍👦 Social account detection and extraction in Python, e.g. for crawling/scraping.

Stars: ✭ 37 (-30.19%)

Mutual labels: scraping

etf4u

📊 Python tool to scrape real-time information about ETFs from the web and mix them together by proportionally distributing their asset allocations

Stars: ✭ 29 (-45.28%)

Mutual labels: scraping

faexport

The API for Furaffinity you wish existed

Stars: ✭ 61 (+15.09%)

Mutual labels: web-scraping

scrapy-fieldstats

A Scrapy extension to log items coverage when the spider shuts down

Stars: ✭ 17 (-67.92%)

Mutual labels: scraping

Neural-Scam-Artist

Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.

Stars: ✭ 18 (-66.04%)

Mutual labels: web-scraping

Architeuthis

MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.

Stars: ✭ 35 (-33.96%)

Mutual labels: scraping

saveddit

Bulk Downloader for Reddit

Stars: ✭ 130 (+145.28%)

Mutual labels: web-scraping

oversmash

Overwatch API library for player details and career stats

Stars: ✭ 42 (-20.75%)

Mutual labels: scraping

scrapers

scrapers for building your own image databases

Stars: ✭ 46 (-13.21%)

Mutual labels: scraping

shorter.recipes

A website dedicated to making recipes from any website easy to read.

Stars: ✭ 27 (-49.06%)

Mutual labels: scraping

uiautomatorview

Adds XPath waiting to uiautomatorview

Stars: ✭ 45 (-15.09%)

Mutual labels: xpath

docker-selenium-lambda

The simplest demo of chrome automation by python and selenium in AWS Lambda

Stars: ✭ 172 (+224.53%)

Mutual labels: scraping

turtle

Instagram Photo Downloader

Stars: ✭ 15 (-71.7%)

Mutual labels: scraping

diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

Stars: ✭ 53 (+0%)

Mutual labels: scraping

NBA-Fantasy-Optimizer

NBA Daily Fantasy Lineup Optimizer for FanDuel Using Python

Stars: ✭ 21 (-60.38%)

Mutual labels: scraping

double-agent

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

Stars: ✭ 123 (+132.08%)

Mutual labels: scraping

Python

Covers Python topics from basic to advanced, practice questions, logical problems in Python, and web development using HTML, CSS, Bootstrap, jQuery, DOM, and Django 🚀🚀. 💥 🌈

Stars: ✭ 29 (-45.28%)

Mutual labels: web-scraping

Stock-Market-Predictor

Stock Market Predictor with LSTM network. Web scraping and analyzing tools (ohlc, mean)

Stars: ✭ 28 (-47.17%)

Mutual labels: web-scraping

gochanges

**[ARCHIVED]** website changes tracker 🔍

Stars: ✭ 12 (-77.36%)

Mutual labels: scraping

codepen-puppeteer

Use Puppeteer to download pens from Codepen.io as single html pages