Tinking: 🧶 Extract data from any website without code, just clicks.
Stars: ✭ 331 (+451.67%)
Geeksforgeeks.pdf: Topic-wise PDFs of Geeks for Geeks articles. (Last updated in October 2018)
Stars: ✭ 489 (+715%)
Comic Dl: Comic-dl is a command-line tool to download manga and comics from various comic and manga sites. Supported sites: readcomiconline.to, mangafox.me, comic naver and many more.
Stars: ✭ 365 (+508.33%)
ARGUS: ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and can crawl a broad range of websites. On those websites, ARGUS can perform tasks such as scraping text or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (+13.33%)
Oj: Tools for various online judges: downloading sample cases, generating additional test cases, testing your code, and submitting it.
Stars: ✭ 517 (+761.67%)
Elixir Scrape: Scrape any website, article or RSS/Atom Feed with ease!
Stars: ✭ 306 (+410%)
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+1215%)
Crawly: Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+633.33%)
Coronadatascraper: COVID-19 Coronavirus data scraped from government and curated data sources.
Stars: ✭ 372 (+520%)
raspagem-de-dados-fatec: 📓 Short course on web data scraping with Python, taught at the FATEC Jundiaí Technology Week
Stars: ✭ 22 (-63.33%)
Socialreaper: Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Stars: ✭ 338 (+463.33%)
Instagram Scraper: Scrape the Instagram frontend. Inspired by twitter-scraper by @kennethreitz.
Stars: ✭ 903 (+1405%)
Spidermon: Scrapy extension for monitoring spider execution.
Stars: ✭ 309 (+415%)
Facebook Scraper: Scrape Facebook public pages without an API key
Stars: ✭ 499 (+731.67%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (+376.67%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+361.67%)
Scrapple: A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+673.33%)
facebook-discussion-tk: A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.
Stars: ✭ 33 (-45%)
Parsel: Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Stars: ✭ 628 (+946.67%)
policy-data-analyzer: Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-63.33%)
Jekyll: Jekyll-based static site for The Programming Historian
Stars: ✭ 387 (+545%)
Data Science: Collection of useful data science topics along with code and articles
Stars: ✭ 315 (+425%)
memes-api: API for scraping common meme sites
Stars: ✭ 17 (-71.67%)
Tabula: Tabula is a tool for liberating data tables trapped inside PDF files
Stars: ✭ 5,420 (+8933.33%)
Scrapy Cluster: This Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
Stars: ✭ 921 (+1435%)
Katana: A Python tool for Google hacking
Stars: ✭ 355 (+491.67%)
Gazpacho: 🥫 The simple, fast, and modern web scraping library
Stars: ✭ 525 (+775%)
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+6695%)
Facebook data analyzer: Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friend rankings by message, vocabulary, contacts, friends-added statistics and more
Stars: ✭ 515 (+758.33%)
Linkedin: Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (+415%)
Webhere: HTML scraping for Objective-C.
Stars: ✭ 16 (-73.33%)
Nickjs: Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)
Stars: ✭ 494 (+723.33%)
Clean Text: 🧹 Python package for text cleaning
Stars: ✭ 284 (+373.33%)
Mtnt: Code for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (-20%)
Lambdasoup: Functional HTML scraping and rewriting with CSS in OCaml
Stars: ✭ 280 (+366.67%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+7961.67%)
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+5156.67%)
Imagescraper: ✂️ High performance, multi-threaded image scraper
Stars: ✭ 630 (+950%)
instagram explorer: 📷 An app to scrape Instagram posts and analyze data.
Stars: ✭ 17 (-71.67%)
Dataflowkit: Extract structured data from websites. Website scraping.
Stars: ✭ 456 (+660%)
jazz: The Scripting Engine that Combines Speed, Safety, and Simplicity
Stars: ✭ 132 (+120%)
Configs: Public, free-to-use repository of digger configs for scraping / extracting data from various e-commerce websites and online stores
Stars: ✭ 37 (-38.33%)
bots-zoo: No description or website provided.
Stars: ✭ 59 (-1.67%)
Mechanize: Mechanize is a Ruby library that makes automated web interaction easy.
Stars: ✭ 4,158 (+6830%)
scraper: Node.js web scraper. Contains a command line, Docker container, Terraform module and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSdom.
Stars: ✭ 37 (-38.33%)
Newcrawler: Free Web Scraping Tool with Java
Stars: ✭ 589 (+881.67%)
Lookyloo: Lookyloo is a web interface that allows users to capture a website page and then display a tree of domains that call each other.
Stars: ✭ 381 (+535%)
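Awesome Python Primer: An index of high-quality Chinese-language resources for self-taught Python beginners, including books / documentation / videos, suited to web crawling / Web / data analysis / machine learning.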
Stars: ✭ 57 (-5%)
Artoo: artoo.js - the client-side scraping companion.
Stars: ✭ 1,029 (+1615%)
Pypatent: Search for and retrieve US Patent and Trademark Office Patent Data
Stars: ✭ 31 (-48.33%)
Undetected Chromedriver: Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Stars: ✭ 365 (+508.33%)