PythonScrapyBasicSetup: Basic setup with random user agents and IP addresses for the Python Scrapy framework.
Stars: ✭ 57 (+7.55%)
raspagem-de-dados-fatec: 📓 Short course on web data scraping with Python, taught at the Technology Week of FATEC Jundiaí
Stars: ✭ 22 (-58.49%)
top-github-scraper: Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (-24.53%)
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+7592.45%)
reapr: 🕸→ℹ️ Reap Information from Websites
Stars: ✭ 14 (-73.58%)
Humanoid: Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
Stars: ✭ 88 (+66.04%)
Webhere: HTML scraping for Objective-C.
Stars: ✭ 16 (-69.81%)
Detect Cms: PHP Library for detecting CMS
Stars: ✭ 78 (+47.17%)
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+33.96%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+422.64%)
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+5850.94%)
Xquery: Extract data or evaluate value from HTML/XML documents using XPath
Stars: ✭ 155 (+192.45%)
Parsel: Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Stars: ✭ 628 (+1084.91%)
Sqrape: Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Stars: ✭ 144 (+171.7%)
Phpscraper: PHP Scraper - a highly opinionated web-interface for PHP
Stars: ✭ 148 (+179.25%)
codechef-rank-comparator: Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library (XML Path Language).
Stars: ✭ 23 (-56.6%)
trafilatura: Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+1241.51%)
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-71.7%)
Scrapple: A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+775.47%)
Scrape Linkedin Selenium: `scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles & company pages, turning the data into structured JSON.
Stars: ✭ 239 (+350.94%)
ioweb: Web Scraping Framework
Stars: ✭ 31 (-41.51%)
telenium: Automation for Kivy Application
Stars: ✭ 56 (+5.66%)
socials: 👨👩👦 Social account detection and extraction in Python, e.g. for crawling/scraping.
Stars: ✭ 37 (-30.19%)
etf4u: 📊 Python tool to scrape real-time information about ETFs from the web and mix them together by proportionally distributing their asset allocations
Stars: ✭ 29 (-45.28%)
faexport: The API for Furaffinity you wish existed
Stars: ✭ 61 (+15.09%)
scrapy-fieldstats: A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-67.92%)
Neural-Scam-Artist: Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-66.04%)
Architeuthis: MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.
Stars: ✭ 35 (-33.96%)
saveddit: Bulk Downloader for Reddit
Stars: ✭ 130 (+145.28%)
oversmash: Overwatch API library for player details and career stats
Stars: ✭ 42 (-20.75%)
scrapers: Scrapers for building your own image databases
Stars: ✭ 46 (-13.21%)
shorter.recipes: A website dedicated to making recipes from any website easy to read.
Stars: ✭ 27 (-49.06%)
docker-selenium-lambda: The simplest demo of Chrome automation with Python and Selenium in AWS Lambda
Stars: ✭ 172 (+224.53%)
turtle: Instagram Photo Downloader
Stars: ✭ 15 (-71.7%)
diffbot-php-client: [Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (+0%)
double-agent: A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (+132.08%)
Python: Covers Python topics from basic to advanced, practice questions, logical problems in Python, and web development using HTML, CSS, Bootstrap, jQuery, DOM, Django 🚀🚀. 💥 🌈
Stars: ✭ 29 (-45.28%)
Stock-Market-Predictor: Stock Market Predictor with LSTM network. Web scraping and analyzing tools (ohlc, mean)
Stars: ✭ 28 (-47.17%)
gochanges: **[ARCHIVED]** website changes tracker 🔍
Stars: ✭ 12 (-77.36%)
codepen-puppeteer: Use Puppeteer to download pens from Codepen.io as single HTML pages
Stars: ✭ 22 (-58.49%)
RARBG-scraper: With Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (-28.3%)
web-poet: Web scraping Page Objects core library
Stars: ✭ 67 (+26.42%)
reason-rust-scraper: 🦀 Scraping & crawling websites using Rust and ReasonML
Stars: ✭ 21 (-60.38%)
covid19br-pub: Project for monitoring official publications related to COVID-19 in Brazil.
Stars: ✭ 12 (-77.36%)
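TikTokDownloader PyWebIO: 🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin|TikTok data scraping tool that supports API calls as well as online batch parsing and downloading.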
Stars: ✭ 919 (+1633.96%)
4cat: The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms.
Stars: ✭ 144 (+171.7%)
core: The complete web scraping toolkit for PHP.
Stars: ✭ 1,110 (+1994.34%)
actor-content-checker: You can use this act to monitor any page's content and get a notification when content changes.
Stars: ✭ 16 (-69.81%)
info-bot: 🤖 A Versatile Telegram Bot
Stars: ✭ 37 (-30.19%)
crawlzone: Crawlzone is a fast asynchronous internet crawling framework for PHP.
Stars: ✭ 70 (+32.08%)
exquery: EXQuery repository
Stars: ✭ 19 (-64.15%)
Goirate: Pillaging the seven seas for torrents, pieces of eight and other bounty.
Stars: ✭ 20 (-62.26%)
xpath2.js: xpath.js - Open source XPath 2.0 implementation in JavaScript (DOM agnostic)
Stars: ✭ 74 (+39.62%)