Declarative web scraping
Explains textual analysis tools in Python, including preprocessing, skip-gram (word2vec), and topic modelling.
Translation: Puppeteer and Chrome Headless, from getting started to web crawling
Node.js package that fetches a given URL's title, description, images, links, etc.
A work-in-progress guide to web scraping as an artistic and critical practice
Command line program to download documents from web portals.
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).
Spin up Tor containers and then proxy HTTP requests via these Tor instances
Scrape and take screenshots of dynamic and static webpages
Scrape all eBay sold listings to determine average/median pricing, plot listings over time with trend lines, and export to Excel
Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working as of 2022!
Extract videos from YouTube in audio format using web scraping techniques 🎶
Currently contains scraped data from around 1,500 problems on the site. More to follow...
RECSM-UPF Summer School: Social Media and Big Data Research
**[ARCHIVED]** website changes tracker 🔍
A CLI for Mozilla Readability. Get clean, uncluttered, ready-to-read HTML from any webpage!
A webpage proxy that sends requests through Chromium (Puppeteer). Can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (such as curl)
Implementation of "Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading," in Findings of ACL 2021
Free anime streaming using jkanime content, obtained by scraping the jkanime website.