SocialreaperSocial media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Stars: ✭ 338 (+686.05%)
SasilaA flexible, friendly crawler framework
Stars: ✭ 286 (+565.12%)
Geeksforgeeks.pdfTopic wise PDFs of Geeks for Geeks articles. (Last updated in October 2018)
Stars: ✭ 489 (+1037.21%)
facebook-discussion-tkA collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.
Stars: ✭ 33 (-23.26%)
OjTools for various online judges. Downloading sample cases, generating additional test cases, testing your code, and submitting it.
Stars: ✭ 517 (+1102.33%)
SpidermonScrapy extension for monitoring spider execution.
Stars: ✭ 309 (+618.6%)
Ics Security ToolsTools, tips, tricks, and more for exploring ICS Security.
Stars: ✭ 749 (+1641.86%)
Gopa[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+544.19%)
CrawlyCrawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+923.26%)
Data ScienceCollection of useful data science topics along with code and articles
Stars: ✭ 315 (+632.56%)
policy-data-analyzerBuilding a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-48.84%)
KatanaA Python tool for Google hacking
Stars: ✭ 355 (+725.58%)
WebhereHTML scraping for Objective-C.
Stars: ✭ 16 (-62.79%)
Tinking🧶 Extract data from any website without code, just clicks.
Stars: ✭ 331 (+669.77%)
Facebook ScraperScrape Facebook public pages without an API key
Stars: ✭ 499 (+1060.47%)
Elixir ScrapeScrape any website, article or RSS/Atom Feed with ease!
Stars: ✭ 306 (+611.63%)
PypatentSearch for and retrieve US Patent and Trademark Office Patent Data
Stars: ✭ 31 (-27.91%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+979.07%)
ParselParsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Stars: ✭ 628 (+1360.47%)
ARGUSARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (+58.14%)
PandapowerConvenient Power System Modelling and Analysis based on PYPOWER and pandas
Stars: ✭ 387 (+800%)
Undetected ChromedriverCustom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)
Stars: ✭ 365 (+748.84%)
scraperNode.js web scraper. Includes a command-line tool, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSdom.
Stars: ✭ 37 (-13.95%)
TabulaTabula is a tool for liberating data tables trapped inside PDF files
Stars: ✭ 5,420 (+12504.65%)
CoronadatascraperCOVID-19 Coronavirus data scraped from government and curated data sources.
Stars: ✭ 372 (+765.12%)
Instagram ScraperScrape the Instagram frontend. Inspired by twitter-scraper by @kennethreitz.
Stars: ✭ 903 (+2000%)
Comic DlComic-dl is a command-line tool to download manga and comics from various comic and manga sites. Supported sites: readcomiconline.to, mangafox.me, Comic Naver, and many more.
Stars: ✭ 365 (+748.84%)
Gazpacho🥫 The simple, fast, and modern web scraping library
Stars: ✭ 525 (+1120.93%)
ConfigsPublic, free-to-use repository of digger configs for scraping / extracting data from various e-commerce websites and online stores
Stars: ✭ 37 (-13.95%)
AutoscraperA Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+9381.4%)
Facebook data analyzerAnalyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friend rankings by message, vocabulary, contacts, friends-added statistics, and more
Stars: ✭ 515 (+1097.67%)
Lulu[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+1734.88%)
LinkedinLinkedIn scraper using Selenium WebDriver, headless Chromium, Docker, and Scrapy
Stars: ✭ 309 (+618.6%)
NickjsWeb scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)
Stars: ✭ 494 (+1048.84%)
Clean Text🧹 Python package for text cleaning
Stars: ✭ 284 (+560.47%)
FerretDeclarative web scraping
Stars: ✭ 4,837 (+11148.84%)
LambdasoupFunctional HTML scraping and rewriting with CSS in OCaml
Stars: ✭ 280 (+551.16%)
Imagescraper✂️ High performance, multi-threaded image scraper
Stars: ✭ 630 (+1365.12%)
Apify JsApify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+7234.88%)
DataflowkitExtract structured data from websites. Website scraping.
Stars: ✭ 456 (+960.47%)
instagram explorer📷 An app to scrape Instagram posts and analyze data.
Stars: ✭ 17 (-60.47%)
Auto CpufreqAutomatic CPU speed & power optimizer for Linux
Stars: ✭ 843 (+1860.47%)
jazzThe Scripting Engine that Combines Speed, Safety, and Simplicity
Stars: ✭ 132 (+206.98%)
MechanizeMechanize is a ruby library that makes automated web interaction easy.
Stars: ✭ 4,158 (+9569.77%)
bots-zooNo description or website provided.
Stars: ✭ 59 (+37.21%)
NewcrawlerFree Web Scraping Tool with Java
Stars: ✭ 589 (+1269.77%)
JekyllJekyll-based static site for The Programming Historian
Stars: ✭ 387 (+800%)
X Cube Usb PdUSB-C Power Delivery Firmware for STM32 microcontroller (ARM Cortex M0 & M4)
Stars: ✭ 41 (-4.65%)
Usb EspHow to make a tiny USB powered ESP-12S
Stars: ✭ 39 (-9.3%)
Scrapy ClusterThis Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
Stars: ✭ 921 (+2041.86%)
LookylooLookyloo is a web interface that allows users to capture a website page and then display a tree of domains that call each other.
Stars: ✭ 381 (+786.05%)