Linkedin Profile Scraper: 🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (+189.83%)
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+1237.29%)
Colly: Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+26230.51%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+8098.31%)
Crawly: Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+645.76%)
proxycrawl-python: ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-13.56%)
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+6810.17%)
browser-automation-api: Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other things, such as capturing a screenshot, generating a PDF, extracting content, or executing custom Puppeteer or Playwright functions.
Stars: ✭ 24 (-59.32%)
Instagram Bot: An Instagram bot developed using the Selenium Framework
Stars: ✭ 138 (+133.9%)
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+5245.76%)
diffbot-php-client: [Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-10.17%)
wget-lua: Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-11.86%)
Scrape Linkedin Selenium: `scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
Stars: ✭ 239 (+305.08%)
Dataflowkit: Extract structured data from web sites. Web sites scraping.
Stars: ✭ 456 (+672.88%)
Scrapyrt: HTTP API for Scrapy spiders
Stars: ✭ 637 (+979.66%)
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+20.34%)
Scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+71667.8%)
Seleniumcrawler: An example using Selenium webdrivers for Python and the Scrapy framework to create a web scraper that crawls an ASP site
Stars: ✭ 117 (+98.31%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (+384.75%)
Webster: A reliable high-level web crawling & scraping framework for Node.js.
Stars: ✭ 364 (+516.95%)
Geziyor: Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (+2011.86%)
Dotnetcrawler: DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on dotnet core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (+69.49%)
Squidwarc: Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
Stars: ✭ 125 (+111.86%)
Antch: Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Stars: ✭ 198 (+235.59%)
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-74.58%)
Awesome Puppeteer: A curated list of awesome puppeteer resources.
Stars: ✭ 1,728 (+2828.81%)
Jvppeteer: Headless Chrome For Java (Java crawler)
Stars: ✭ 193 (+227.12%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+369.49%)
Goose Parser: Universal scraping tool, which allows you to extract data using multiple environments
Stars: ✭ 211 (+257.63%)
Udemycoursegrabber: Your will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and returns the download link of the latest one, though you can also choose to see the others.
Stars: ✭ 137 (+132.2%)
Newspaper: News, full-text, and article metadata extraction in Python 3. Advanced docs:
Stars: ✭ 11,545 (+19467.8%)
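Tianyancha: A pip-installable Tianyancha scraper API that saves the business registration information of one or more specified companies to Excel/JSON format in one step. A batteries-included scraper API for Tianyancha, the best Chinese business data and investigation platform.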
Stars: ✭ 206 (+249.15%)
double-agent: A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (+108.47%)
docker-selenium-lambda: The simplest demo of Chrome automation with Python and Selenium in AWS Lambda
Stars: ✭ 172 (+191.53%)
TinderBotz: Automated Tinder bot and scraper using Selenium in Python.
Stars: ✭ 265 (+349.15%)
crawling-framework: Easily crawl news portals or blog sites using Storm Crawler.
Stars: ✭ 22 (-62.71%)
puppeteer-botcheck: 🕵️‍♂️ Bot detection tests for Puppeteer. Hide and seek!
Stars: ✭ 42 (-28.81%)
browserslist-generator: A library that makes generating and validating Browserslists a breeze!
Stars: ✭ 77 (+30.51%)
scrapman: Retrieve real (with JavaScript executed) HTML code from a URL. Ultra fast, and supports loading multiple web pages in parallel
Stars: ✭ 21 (-64.41%)
zcrawl: An open source web crawling platform
Stars: ✭ 21 (-64.41%)
TikTok: Download public videos on TikTok using Python with Selenium
Stars: ✭ 37 (-37.29%)
throughout: 🎪 End-to-end testing made simple (using Jest and Puppeteer)
Stars: ✭ 16 (-72.88%)
scrapy-fieldstats: A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-71.19%)
instagram-get-images: Instagram get images 🌄 (hashtags, account, locations) with Puppeteer
Stars: ✭ 69 (+16.95%)
ha-multiscrape: Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for each value. Support for (login) form-submit functionality.
Stars: ✭ 103 (+74.58%)
Instagram-to-discord: Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (+91.53%)
InstaBot: Simple and friendly bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-45.76%)
copycat: A PHP Scraping Class
Stars: ✭ 70 (+18.64%)
pumba: Fetch, store and access user agent strings for different browsers
Stars: ✭ 12 (-79.66%)
Recorder: A browser extension that generates Cypress, Playwright and Puppeteer test scripts from your interactions 🖱 ⌨
Stars: ✭ 277 (+369.49%)
site-audit-seo: Web service and CLI tool for SEO site audit: crawl site, lighthouse all pages, view public reports in browser. Also output to console, json, csv, xlsx, Google Drive.
Stars: ✭ 91 (+54.24%)
go-scrapy: Web crawling and scraping framework for Golang
Stars: ✭ 17 (-71.19%)
dijnet-bot: All your bills in yet another place :)
Stars: ✭ 17 (-71.19%)