actor-scraper
House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
Stars: ✭ 83 (+118.42%)
comic-scraper
[Python] Scrapes comics and manga from various websites and creates cbz files from them
Stars: ✭ 16 (-57.89%)
halfstaff
🇺🇸 Is the US flag at half-staff?
Stars: ✭ 22 (-42.11%)
browser-pool
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+86.84%)
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+628.95%)
tableau-scraping
Tableau scraper Python library. R and Python scripts to scrape data from Tableau viz
Stars: ✭ 91 (+139.47%)
User Agents
A JavaScript library for generating random user agents with data that's updated daily.
Stars: ✭ 485 (+1176.32%)
Node-js-functionalities
This repository contains useful RESTful APIs and functionalities in Node.js, with tutorial code for mastering Node.js. All tutorials have been published on medium.com; the tutorial links are given below.
Stars: ✭ 69 (+81.58%)
linkextractor
A Docker tutorial using a link extraction application example
Stars: ✭ 41 (+7.89%)
htmlunit
🕸🧰☕️ Tools to Scrape Dynamic Web Content via the 'HtmlUnit' Java Library
Stars: ✭ 39 (+2.63%)
Ache
ACHE is a web crawler for domain-specific search.
Stars: ✭ 320 (+742.11%)
investigation-amazon-brands
Materials to reproduce our findings in our stories, "Amazon Puts Its Own 'Brands' First Above Better-Rated Products" and "When Amazon Takes the Buy Box, it Doesn’t Give it up"
Stars: ✭ 56 (+47.37%)
heroshi
Heroshi – open source web crawler.
Stars: ✭ 51 (+34.21%)
Php Curl Class
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
Stars: ✭ 2,903 (+7539.47%)
Youtube tutorials
Collection of scripts corresponding to LucidProgramming YouTube tutorials
Stars: ✭ 769 (+1923.68%)
OLX Scraper
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for the requested product and dumps them to MongoDB (NoSQL).
Stars: ✭ 15 (-60.53%)
Text-Analysis
Explaining textual analysis tools in Python. Including Preprocessing, Skip Gram (word2vec), and Topic Modelling.
Stars: ✭ 48 (+26.32%)
WaWebSessionHandler
(DISCONTINUED) Save WhatsApp Web Sessions as files and open them everywhere!
Stars: ✭ 27 (-28.95%)
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+1121.05%)
iww
AI-based web wrapper for web content extraction
Stars: ✭ 61 (+60.53%)
grailer
Web scraping tool for grailed.com
Stars: ✭ 30 (-21.05%)
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+10628.95%)
sp-subway-scraper
🚆 This web scraper builds a dataset for São Paulo subway operation status
Stars: ✭ 24 (-36.84%)
Coolql
cool Nextjs server to query websites with GraphQL
Stars: ✭ 623 (+1539.47%)
codechef-rank-comparator
Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library (XML Path Language).
Stars: ✭ 23 (-39.47%)
GSoC-Data-Analyser
Simple search for organisations that are participating or have participated in GSoC
Stars: ✭ 29 (-23.68%)
Letterboxd recommendations
Scraping publicly accessible Letterboxd data and building a movie recommendation model that generates recommendations for a given Letterboxd username
Stars: ✭ 23 (-39.47%)
restaurant-finder-featureReviews
Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).
Stars: ✭ 21 (-44.74%)
Apify Js
Apify SDK: The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+8200%)
top-github-scraper
Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (+5.26%)
scraping-ebay
Scraping eBay's products using the Scrapy web crawling framework
Stars: ✭ 79 (+107.89%)
Snoop
Snoop: a reconnaissance tool based on open-source data (OSINT world)
Stars: ✭ 886 (+2231.58%)
IMDB-Scraper
Scrapy project for scraping data from IMDB, with a movie dataset covering 58,623 movies.
Stars: ✭ 37 (-2.63%)
automation-scripts
Simple scripts that I'm using to automate the boring things.
Stars: ✭ 14 (-63.16%)
Rpa
UI.Vision: Open-Source RPA Software (formerly Kantu) - Modern Robotic Process Automation with Selenium IDE++
Stars: ✭ 477 (+1155.26%)
leetcode-compensation
Compensation analysis on the posts scraped from leetcode.com/discuss/compensation. At present, the reports have been generated only for Indian cities.
Stars: ✭ 83 (+118.42%)
raspagem-de-dados-fatec
📓 Short course on web data scraping with Python, taught at the FATEC Jundiaí Technology Week
Stars: ✭ 22 (-42.11%)
rreddit
𝐫⟋ Get Reddit data
Stars: ✭ 49 (+28.95%)
Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+1626.32%)
extractnet
A Dragnet that also extracts author, headline, date, and keywords from context
Stars: ✭ 52 (+36.84%)
Linkedin-Client
Web scraper for grabbing data from LinkedIn profiles or company pages (personal project)
Stars: ✭ 42 (+10.53%)
Awesome Web Scraping
List of libraries, tools and APIs for web scraping and data processing.
Stars: ✭ 4,510 (+11768.42%)
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-60.53%)
Webmiddle
Node.js framework for modular web scraping and data extraction
Stars: ✭ 13 (-65.79%)
Selectolax
Python binding to Modest engine (fast HTML5 parser with CSS selectors).
Stars: ✭ 368 (+868.42%)
PaperScraper
A web scraping tool to systematically extract the text of scientific papers and corresponding metadata from university-accessible journals.
Stars: ✭ 63 (+65.79%)