htmlunit🕸🧰☕️Tools to Scrape Dynamic Web Content via the 'HtmlUnit' Java Library
Stars: ✭ 39 (-50.63%)
actor-content-checkerYou can use this act to monitor any page's content and get a notification when content changes.
Stars: ✭ 16 (-79.75%)
leetcode-compensationCompensation analysis on the posts scraped from leetcode.com/discuss/compensation. At present, the reports have been generated only for Indian cities.
Stars: ✭ 83 (+5.06%)
grailerweb scraping tool for grailed.com
Stars: ✭ 30 (-62.03%)
fernando-pessoaClassifier of Fernando Pessoa's poems according to his heteronyms
Stars: ✭ 31 (-60.76%)
Data-Wrangling-with-PythonSimplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
Stars: ✭ 90 (+13.92%)
ArticleSpiderCrawling zhihu, jobbole, lagou by Scrapy, and using Elasticsearch+Django to build a Search Engine website --- README_zh.md (including: implementation roadmap, distributed-crawler and coping with anti-crawling strategies).
Stars: ✭ 34 (-56.96%)
163Music163music spider by scrapy.
Stars: ✭ 60 (-24.05%)
Neural-Scam-ArtistWeb Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-77.22%)
cl-torrentsSearching torrents on popular trackers - CLI, readline, GUI, web client. Tutorial and binaries (issue tracker on https://gitlab.com/vindarel/cl-torrents/)
Stars: ✭ 83 (+5.06%)
vietnam-ecommerce-crawlerCrawling the data from lazada, websosanh, compare.vn, cdiscount and cungmua with flexible configs
Stars: ✭ 28 (-64.56%)
rymscraperPython API to extract data from rateyourmusic.com.
Stars: ✭ 63 (-20.25%)
codepen-puppeteerUse Puppeteer to download pens from Codepen.io as single html pages
Stars: ✭ 22 (-72.15%)
WaWebSessionHandler(DISCONTINUED) Save WhatsApp Web Sessions as files and open them everywhere!
Stars: ✭ 27 (-65.82%)
web-poetWeb scraping Page Objects core library
Stars: ✭ 67 (-15.19%)
Web-IotaIota is a web scraper which can find all of the images and links/suburls on a webpage
Stars: ✭ 60 (-24.05%)
scrapy.dartScrapy, a fast high-level web crawling & scraping framework for dart and Flutter
Stars: ✭ 50 (-36.71%)
scrapy helperDynamically configurable crawler
Stars: ✭ 84 (+6.33%)
aioScrapyAsynchronous coroutine crawler framework based on asyncio and aiohttp. Stars welcome.
Stars: ✭ 34 (-56.96%)
lopezCrawling and scraping the Web for fun and profit
Stars: ✭ 20 (-74.68%)
rreddit𝐫⟋ Get Reddit data
Stars: ✭ 49 (-37.97%)
PythonScrapyBasicSetupBasic setup with random user agents and IP addresses for Python Scrapy Framework.
Stars: ✭ 57 (-27.85%)
Twitter IntelligenceTwitter Intelligence is an OSINT project that performs tracking and analysis of Twitter
Stars: ✭ 179 (+126.58%)
lgcrawlpython+scrapy+splash crawler for job postings across the entire Lagou site
Stars: ✭ 22 (-72.15%)
animecenterThe source code for animecenter
Stars: ✭ 16 (-79.75%)
estate-crawlerScraping the real estate agencies for up-to-date house listings as soon as they arrive!
Stars: ✭ 20 (-74.68%)
UofT-Timetable-GeneratorA web application that generates timetables for university students at the University of Toronto
Stars: ✭ 34 (-56.96%)
automation-scriptsSimple scripts that I'm using to automate the boring things.
Stars: ✭ 14 (-82.28%)
Scrape Linkedin Selenium`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Stars: ✭ 239 (+202.53%)
PythonCovers Python basic to advanced topics, practice questions, logical problems in Python, and web development using HTML, CSS, Bootstrap, jQuery, DOM, and Django 🚀🚀. 💥 🌈
Stars: ✭ 29 (-63.29%)
iowebWeb Scraping Framework
Stars: ✭ 31 (-60.76%)
Selenium Python HeliumSelenium-python but lighter: Helium is the best Python library for web automation.
Stars: ✭ 2,732 (+3358.23%)
R Web Scraping Cheat SheetGuide, reference and cheatsheet on web scraping using rvest, httr and Rselenium.
Stars: ✭ 207 (+162.03%)
OpenScraperAn open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (+1.27%)
Bet On SibylMachine Learning Model for Sport Predictions (Football, Basketball, Baseball, Hockey, Soccer & Tennis)
Stars: ✭ 190 (+140.51%)
GrabWeb Scraping Framework
Stars: ✭ 2,147 (+2617.72%)
iwwAI based web-wrapper for web-content-extraction
Stars: ✭ 61 (-22.78%)
savedditBulk Downloader for Reddit
Stars: ✭ 130 (+64.56%)
LearnpythonforresearchThis repository provides everything you need to get started with Python for (social science) research.
Stars: ✭ 163 (+106.33%)
Node-js-functionalitiesThis repository contains useful RESTful APIs and functionalities in Node.js, along with tutorial code for mastering Node.js. All tutorials have been published on medium.com.
Stars: ✭ 69 (-12.66%)
InstaBotSimple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-59.49%)
scrapy-fieldstatsA Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-78.48%)
Web ScrapingDetailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, SHFE and news data crawlers on BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Stars: ✭ 153 (+93.67%)
HelenaA Chrome extension for writing custom web scraping programs and web automation programs. Just demonstrate how to collect the first row of data, then let the extension write the program for collecting all rows.
Stars: ✭ 151 (+91.14%)
Linkedin-ClientWeb scraper for grabbing data from LinkedIn profiles or company pages (personal project)
Stars: ✭ 42 (-46.84%)