Scrape Linkedin Selenium`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Stars: ✭ 239 (+61.49%)
papercutPapercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-89.86%)
SpidrA versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+343.24%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+213.51%)
OLX Scraper📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted regarding requested product and dumps to NOSQL MONGODB.
Stars: ✭ 15 (-89.86%)
AutoscraperA Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+2654.73%)
top-github-scraperScape top GitHub repositories and users based on keywords
Stars: ✭ 40 (-72.97%)
Detect CmsPHP Library for detecting CMS
Stars: ✭ 78 (-47.3%)
Linkedin-ClientWeb scraper for grabing data from Linkedin profiles or company pages (personal project)
Stars: ✭ 42 (-71.62%)
ZillowZillow Scraper for Python using Selenium
Stars: ✭ 141 (-4.73%)
scrapy facebookerCollection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-85.14%)
SeleniumcrawlerAn example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site
Stars: ✭ 117 (-20.95%)
document-dlCommand line program to download documents from web portals.
Stars: ✭ 14 (-90.54%)
Html MetadataMetaData html scraper and parser for Node.js (supports Promises and callback style)
Stars: ✭ 129 (-12.84%)
AzurLaneWikiScrapersA console application that can scrape the Azur Lane wiki and export the data to Json files
Stars: ✭ 12 (-91.89%)
scraperNodejs web scraper. Contains a command line, docker container, terraform module and ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSdom.
Stars: ✭ 37 (-75%)
bots-zooNo description or website provided.
Stars: ✭ 59 (-60.14%)
Php Curl ClassPHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
Stars: ✭ 2,903 (+1861.49%)
LinkedinLinkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (+108.78%)
copycatA PHP Scraping Class
Stars: ✭ 70 (-52.7%)
Apify JsApify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+2031.08%)
KatanaA Python Tool For google Hacking
Stars: ✭ 355 (+139.86%)
SqrapeSimple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Stars: ✭ 144 (-2.7%)
RodA Devtools driver for web automation and scraping
Stars: ✭ 1,392 (+840.54%)
Awesome CrawlerA collection of awesome web crawler,spider in different languages
Stars: ✭ 4,793 (+3138.51%)
Imagescraper✂️ High performance, multi-threaded image scraper
Stars: ✭ 630 (+325.68%)
wget-luaWget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-64.86%)
Captcha-ToolsAll-in-one Python (And now Go!) module to help solve captchas with Capmonster, 2captcha and Anticaptcha API's!
Stars: ✭ 23 (-84.46%)
Scraper-Projects🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (-83.11%)
SillyniumAutomate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements
Stars: ✭ 100 (-32.43%)
sp-subway-scraper🚆This web scraper builds a dataset for São Paulo subway operation status
Stars: ✭ 24 (-83.78%)
browser-poolA Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (-52.03%)
raspagem-de-dados-fatec📓 Minicurso de raspagem de dados web com Python ministrado na Semana de Tecnologia da FATEC Jundiaí
Stars: ✭ 22 (-85.14%)
ZeiverA Scraper, Downloader, & Recorder for static open directories.
Stars: ✭ 14 (-90.54%)
facebook-discussion-tkA collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.
Stars: ✭ 33 (-77.7%)
TorScrapperA Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)
Stars: ✭ 24 (-83.78%)
Gopa[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+87.16%)
PypatentSearch for and retrieve US Patent and Trademark Office Patent Data
Stars: ✭ 31 (-79.05%)
Project TauroA Router WiFi key recovery/cracking tool with a twist.
Stars: ✭ 52 (-64.86%)
FerretDeclarative web scraping
Stars: ✭ 4,837 (+3168.24%)
DataflowkitExtract structured data from web sites. Web sites scraping.
Stars: ✭ 456 (+208.11%)
Lulu[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+433.11%)
ArachnidPowerful web scraping framework for Crystal
Stars: ✭ 68 (-54.05%)
CrawlyCrawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+197.3%)
CascadiaGo cascadia package command line CSS selector
Stars: ✭ 67 (-54.73%)
Email ExtractorThe main functionality is to extract all the emails from one or several URLs - La funcionalidad principal es extraer todos los correos electrónicos de una o varias Url
Stars: ✭ 81 (-45.27%)
DaftlistingsA library that enables programmatic interaction with daft.ie. Daft.ie has nationwide coverage and contains about 80% of the total available properties in Ireland.
Stars: ✭ 86 (-41.89%)
GeziyorGeziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (+741.89%)
proxycrawl-pythonProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-65.54%)
Hockey ScraperPython Package for scraping NHL Play-by-Play and Shift data
Stars: ✭ 93 (-37.16%)
ScrapersA list of scrapers from around the web.
Stars: ✭ 366 (+147.3%)
Scrapy CraigslistWeb Scraping Craigslist's Engineering Jobs in NY with Scrapy
Stars: ✭ 54 (-63.51%)
HumanoidNode.js package to bypass CloudFlare's anti-bot JavaScript challenges
Stars: ✭ 88 (-40.54%)