Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, SHFE and news data crawlers on BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist

✭ 153

python web-scraping futures financial-data newsletter web-scraper

Helena

A Chrome extension for writing custom web scraping programs and web automation programs. Just demonstrate how to collect the first row of data, then let the extension write the program for collecting all rows.

✭ 151

javascript chrome-extension web-scraping synthesis

Juno crawler

Scrapy crawler to collect data on the back catalog of songs listed for sale.

✭ 150

python scrapy web-scraping

Phpscraper

PHP Scraper - a highly opinionated web interface for PHP

✭ 148

scraper scraping web-scraping web-scraper

Sqrape

Simple Query Scraping with CSS and Go Reflection (MOVED to GitLab)

✭ 144

go reflection scraping web-scraping magic css-selector

Zillow

Zillow Scraper for Python using Selenium

✭ 141

python selenium scraper web-scraping chromedriver

Html Metadata

HTML metadata scraper and parser for Node.js (supports Promises and callback style)

✭ 129

javascript nodejs web-scraping node-module web-scraper

Actor Page Analyzer

Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.

✭ 124

javascript headless-chrome web-scraping

30 Days Of Python

Learn Python for the next 30 (or so) Days.

✭ 1,748

HTML Jupyter Notebook api tutorial automation rest-api flask jupyter csv pandas selenium web-scraping selenium-webdriver fastapi

Ayakashi

⚡️ Ayakashi.io - The next generation web scraping framework

✭ 117

typescript automation data-mining headless-chrome web-scraping

Dat8

General Assembly's 2015 Data Science course in Washington, DC

Save For Offline

Android app for saving webpages for offline reading.

✭ 114

java android parser android-application offline viewer web-scraping html-parser

Scrapyd Cluster On Heroku

Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.

✭ 106

python heroku cluster scrapy web-scraping

Rod

A Devtools driver for web automation and scraping

✭ 1,392

go golang web testing automation scraper devtools headless web-scraping chrome-devtools chrome-headless

Pulsar

Turn large websites into tables and charts using simple SQL queries.

✭ 100

html data-science selenium web-scraping web-crawler

Sillynium

Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements

✭ 100

javascript python python3 automation chrome-extension chrome html5 opensource selenium scraper extensions web-scraping selenium-webdriver recorder paint automated-testing bookmarklet chromedriver

Splashr

💦 Tools to Work with the 'Splash' JavaScript Rendering Service in R

✭ 93

r rstats selenium web-scraping phantomjs

Hockey Scraper

Python Package for scraping NHL Play-by-Play and Shift data

✭ 93

python scraper web-scraping sports

Humanoid

Node.js package to bypass Cloudflare's anti-bot JavaScript challenges

✭ 88

javascript bot scraping web-scraping bypass

Daftlistings

A library that enables programmatic interaction with daft.ie. Daft.ie has nationwide coverage and contains about 80% of the total available properties in Ireland.

✭ 86

python web-scraping web-scraper beautifulsoup properties

Rvest

Simple web scraping for R

✭ 1,253

r html web-scraping

Detect Cms

PHP Library for detecting CMS

✭ 78

detection scraping web-scraping web-scraper

Reader

Extract clean(er), readable text from web pages via Mercury Web Parser.

✭ 75

python reader web-scraping extract readability cleaner

Ping Sm

Receive an email or Telegram message as soon as Migros Sanalmarket is available for delivery in your neighborhood.

✭ 71

python web-scraping

Arachnid

Powerful web scraping framework for Crystal

✭ 68

crystal bot crawler spider web-scraping crawling web-scraper

Cascadia

Command-line CSS selector tool built on the Go cascadia package

✭ 67

go command-line command-line-tool curl web-scraping extract web-scraper tsv css-selector

Decapitated

Headless 'Chrome' Orchestration in R

✭ 65

javascript r rstats headless-chrome web-scraping

Social Media Profile Scrapers

Fetch a user's data across social media

✭ 60

python request social-media web-scraping pinterest instagram-scraper web-scraper

Instago

Download/access photos, videos, stories, story highlights, postlives, following and followers of Instagram

✭ 59

go golang instagram downloader web-scraping webscraping gopherjs

Scrapy Craigslist

Web Scraping Craigslist's Engineering Jobs in NY with Scrapy

✭ 54

python scrapy web-scraping web-scraper

Project Tauro

A Router WiFi key recovery/cracking tool with a twist.

✭ 52

java hacking hacking-tool web-scraping web-security network-security web-scraper wifi-security

Actor Google Search Scraper

Apify actor that crawls Google Search result pages (SERPs) and extracts a list of organic results, ads, related queries and more. It supports selection of custom country, language and location.

✭ 38

html web-scraping

Uc Davis Cs Exams Analysis

📈 Regression and Classification with UC Davis student quiz data and exam data

✭ 33

r machine-learning nlp testing statistics regex unsupervised-learning training text-mining web-scraping logistic-regression linear-regression probability statistical-analysis

Snoop

Snoop: a reconnaissance tool based on open data (OSINT world)

✭ 886

python linux security windows scanner osint pentest infosec ctf redteam termux geolocation web-scraping ip geo blueteam

Webmiddle

Node.js framework for modular web scraping and data extraction

✭ 13

javascript nodejs framework jsx modular web-scraping

Letterboxd recommendations

Scrapes publicly accessible Letterboxd data and builds a movie recommendation model that generates recommendations for a given Letterboxd username

✭ 23

python flask web-scraping collaborative-filtering svd

Youtube tutorials

Collection of scripts corresponding to LucidProgramming YouTube tutorials

✭ 769

python python3 web-scraping

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

✭ 656

ruby web crawler spider scraper web-scraping web-crawler web-scraper

Faster Than Requests

Faster requests on Python 3

✭ 639