Scrape Linkedin Selenium: `scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles & company pages, turning the data into structured JSON.
Wayback Machine Scraper: A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Docbao: A tool for crawling and analyzing keywords from Vietnamese online news sites
City Scrapers: Scrape, standardize and share public meetings from local government websites
Trump Lies: Tutorial: Web scraping in Python with Beautiful Soup
Bet On Sibyl: Machine Learning Model for Sport Predictions (Football, Basketball, Baseball, Hockey, Soccer & Tennis)
Twitter Intelligence: The Twitter Intelligence OSINT project performs tracking and analysis of Twitter
Grab: Web Scraping Framework
Learnpythonforresearch: This repository provides everything you need to get started with Python for (social science) research.
Netflix Clone: Netflix-like full-stack application with an SPA client and a backend implemented in a service-oriented architecture
Web Scraping: Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, SHFE and news data crawlers on BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Helena: A Chrome extension for writing custom web scraping programs and web automation programs. Just demonstrate how to collect the first row of data, then let the extension write the program for collecting all rows.
Juno crawler: Scrapy crawler to collect data on the back catalog of songs listed for sale.
Phpscraper: PHP Scraper - a highly opinionated web-interface for PHP
Sqrape: Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Zillow: Zillow Scraper for Python using Selenium
Html Metadata: MetaData html scraper and parser for Node.js (supports Promises and callback style)
Actor Page Analyzer: Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
Ayakashi: ⚡️ Ayakashi.io - The next generation web scraping framework
Dat8: General Assembly's 2015 Data Science course in Washington, DC
Scrapyd Cluster On Heroku: Set up free and scalable Scrapyd cluster for distributed web-crawling with just a few clicks.
Rod: A Devtools driver for web automation and scraping
Pulsar: Turn large websites into tables and charts using simple SQL.
Sillynium: Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements
Splashr: 💦 Tools to Work with the 'Splash' JavaScript Rendering Service in R
Hockey Scraper: Python Package for scraping NHL Play-by-Play and Shift data
Humanoid: Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
Daftlistings: A library that enables programmatic interaction with daft.ie. Daft.ie has nationwide coverage and contains about 80% of the total available properties in Ireland.
Rvest: Simple web scraping for R
Reader: Extract clean(er), readable text from web pages via Mercury Web Parser.
Ping Sm: Receive an email or Telegram message as soon as Migros Sanalmarket is available for delivery in your neighborhood.
Arachnid: Powerful web scraping framework for Crystal
Cascadia: Go cascadia package command line CSS selector
Instago: Download/access photos, videos, stories, story highlights, postlives, following and followers of Instagram
Project Tauro: A Router WiFi key recovery/cracking tool with a twist.
Actor Google Search Scraper: Apify actor that crawls Google Search result pages (SERPs) and extracts a list of organic results, ads, related queries and more. It supports selection of custom country, language and location.
Snoop: Snoop is an intelligence-gathering tool based on open data (OSINT world)
Webmiddle: Node.js framework for modular web scraping and data extraction
Letterboxd recommendations: Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username
Youtube tutorials: Collection of scripts corresponding to LucidProgramming YouTube tutorials
Spidr: A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Coolqlcool: Nextjs server to query websites with GraphQL
User Agents: A JavaScript library for generating random user agents with data that's updated daily.
Rpa: UI.Vision: Open-Source RPA Software (formerly Kantu) - Modern Robotic Process Automation with Selenium IDE++