document-dl
Command line program to download documents from web portals.
Stars: ✭ 14 (-65%)
evine
Interactive CLI Web Crawler
Stars: ✭ 140 (+250%)
yt-videos-list
Create and **automatically** update a list of all videos on a YouTube channel (in txt/csv/md form) via a YouTube bot with end-to-end web scraping - no API tokens required. Multi-threaded support for YouTube videos list updates.
Stars: ✭ 64 (+60%)
mongodb-scraper
Scrapes for publicly accessible MongoDB instances and dumps user passwords
Stars: ✭ 33 (-17.5%)
open-bus
🚌 Analysing Israel's public transport data
Stars: ✭ 65 (+62.5%)
patreon-scraper
WIP Patreon attachment downloader written in TypeScript
Stars: ✭ 25 (-37.5%)
retro-gtfs
Collect real-time transit data and process it into a retroactive GTFS 'schedule' which can be used for routing/analysis
Stars: ✭ 45 (+12.5%)
discord-music
Discord music bot written in TypeScript
Stars: ✭ 12 (-70%)
impartus-downloader
Download Impartus lectures, convert to mkv for offline viewing.
Stars: ✭ 19 (-52.5%)
Scraper-Projects
🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (-37.5%)
stock-market-scraper
Scrapes historical stock market data from Yahoo Finance (https://finance.yahoo.com/)
Stars: ✭ 110 (+175%)
Zeiver
A Scraper, Downloader, & Recorder for static open directories.
Stars: ✭ 14 (-65%)
site-audit-seo
Web service and CLI tool for SEO site audits: crawl a site, run Lighthouse on all pages, view public reports in the browser. Can also output to console, json, csv, xlsx, Google Drive.
Stars: ✭ 91 (+127.5%)
gtfs-utils
Utilities to process GTFS data sets.
Stars: ✭ 19 (-52.5%)
PDAP-Scrapers
Code relating to scraping public police data.
Stars: ✭ 72 (+80%)
linky
Yet Another LinkedIn Scraper...
Stars: ✭ 44 (+10%)
scraper
A simple web scraper built around the JavaFX WebEngine
Stars: ✭ 13 (-67.5%)
covid-19
Current and historical coronavirus (COVID-19) confirmed, recovered, death and active case counts segmented by country and region. Includes csv, json and sqlite data along with an interactive website explorer.
Stars: ✭ 15 (-62.5%)
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+30%)
bot
Completely free and open-source human-like Instagram bot. Powered by UIAutomator2 and compatible with basically any Android device 5.0+ that can run Instagram - real or emulated.
Stars: ✭ 321 (+702.5%)
VK-Scraper
Scrapes a VK user's photos
Stars: ✭ 42 (+5%)
subwayclock
Display clock for NYC subways
Stars: ✭ 29 (-27.5%)
diosts
A Go scraper that validates security.txt files and outputs them in the disclose.io JSON format.
Stars: ✭ 18 (-55%)
linkedinscraper
LinkedinScraper is another information-gathering tool written in Python. You can scrape the employees of companies on Linkedin.com and then compile these employees' names, titles and emails.
Stars: ✭ 22 (-45%)
Coinsta
A Python package for acquiring both historical and current cryptocurrency data
Stars: ✭ 47 (+17.5%)
oge
Page metadata as a service
Stars: ✭ 22 (-45%)
RPICovidScraper
Scraper for Rensselaer Polytechnic Institute (RPI)'s Covid Dashboard
Stars: ✭ 12 (-70%)
citylines
Citylines.co is a collaborative platform for mapping the transit systems of the world!
Stars: ✭ 53 (+32.5%)
imdb-scraper
🎬 An attempt at the most complete IMDb API
Stars: ✭ 24 (-40%)
go-jd
Automatic login for the JD.com (京东) app and automatic ordering of online products
Stars: ✭ 158 (+295%)
sp-subway-scraper
🚆 This web scraper builds a dataset for São Paulo subway operation status
Stars: ✭ 24 (-40%)
freeDictionaryAPI
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
Stars: ✭ 1,352 (+3280%)
Linkedin-Client
Web scraper for grabbing data from Linkedin profiles or company pages (personal project)
Stars: ✭ 42 (+5%)
transitland-atlas
An open directory of mobility feeds and operators; powers both Transitland v1 and v2
Stars: ✭ 55 (+37.5%)
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+27.5%)
bing-ip2hosts
bingip2hosts is a Bing.com web scraper that discovers websites by IP address
Stars: ✭ 99 (+147.5%)
Instagram-to-discord
Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (+182.5%)
transitime
TheTransitClock real-time transit information system
Stars: ✭ 60 (+50%)
LeetCode
Currently contains scraped data from around 1500 problems on the site. More to follow...
Stars: ✭ 45 (+12.5%)
TorScrapper
A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)
Stars: ✭ 24 (-40%)
ha-multiscrape
Home Assistant custom component for scraping multiple values (html, xml or json) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.
Stars: ✭ 103 (+157.5%)
Captcha-Tools
All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha and Anticaptcha APIs!
Stars: ✭ 23 (-42.5%)
CourseCake
By serving course 📚 data that is more "edible" 🍰 for developers, we hope CourseCake offers a smooth approach to building useful tools for students.
Stars: ✭ 21 (-47.5%)
TikTok
Download public videos on TikTok using Python with Selenium
Stars: ✭ 37 (-7.5%)
WaGpScraper
A Python-oriented tool to scrape WhatsApp group links using Google dorks. It scrapes WhatsApp group links from Google results and returns working links.
Stars: ✭ 18 (-55%)
sotoki
StackExchange websites to ZIM scraper
Stars: ✭ 64 (+60%)
berlin corona cases
Scraper for the official dashboard with current Corona case numbers, traffic light indicators ("Corona-Ampel") and vaccination situation for Berlin.
Stars: ✭ 19 (-52.5%)
Mimo-Crawler
A web crawler that uses Firefox and JavaScript injection to interact with webpages and crawl their content, written in Node.js.
Stars: ✭ 22 (-45%)
html2rss-web
🕸 Generates and delivers RSS feeds via HTTP. Create your own feeds or get started quickly with the included configs.
Stars: ✭ 36 (-10%)
dm tomatrixled
Display (real-time) public transport departures using Raspberry Pi and LED matrices
Stars: ✭ 17 (-57.5%)
quoters
📝 Random quotes generator package. Available on npm and PyPI
Stars: ✭ 17 (-57.5%)
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features along with extras such as Page Caching and Geosearch.
Stars: ✭ 15 (-62.5%)
transit model
Managing transit data with Rust
Stars: ✭ 33 (-17.5%)
crawlkit
A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.
Stars: ✭ 23 (-42.5%)