Seleniumcrawler: An example using Selenium WebDriver for Python and the Scrapy framework to build a web scraper that crawls an ASP site
Stars: ✭ 117 (-20.41%)
Dataflowkit: Extract structured data from websites. Website scraping.
Stars: ✭ 456 (+210.2%)
Nimquery: Nim library for querying HTML using CSS selectors (like JavaScript's document.querySelector)
Stars: ✭ 75 (-48.98%)
Mechanize: Mechanize is a Ruby library that makes automated web interaction easy.
Stars: ✭ 4,158 (+2728.57%)
Lookyloo: Lookyloo is a web interface that allows users to capture a website page and then display a tree of domains that call each other.
Stars: ✭ 381 (+159.18%)
React Device Detect: Detect the device and render the view according to the detected device type.
Stars: ✭ 1,145 (+678.91%)
Data Science: Collection of useful data science topics along with code and articles
Stars: ✭ 315 (+114.29%)
Webmagic: A scalable web crawler framework for Java.
Stars: ✭ 10,186 (+6829.25%)
Mechaml: OCaml functional web scraping library
Stars: ✭ 60 (-59.18%)
Katana: A Python tool for Google hacking
Stars: ✭ 355 (+141.5%)
Embed: Get info from any web service or page
Stars: ✭ 1,808 (+1129.93%)
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+2673.47%)
Mtnt: Code for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (-67.35%)
D4n155: OWASP D4N155 - Intelligent and dynamic wordlist using OSINT
Stars: ✭ 105 (-28.57%)
Linkedin: LinkedIn scraper using Selenium WebDriver, headless Chromium, Docker and Scrapy
Stars: ✭ 309 (+110.2%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (+94.56%)
Configs: Public, free-to-use repository with digger configs for scraping / extracting data from various e-commerce websites and online stores
Stars: ✭ 37 (-74.83%)
Dotnetcrawler: DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-31.97%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+88.44%)
Scrapy Cluster: This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster.
Stars: ✭ 921 (+526.53%)
Fantasy Basketball: Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for DraftKings with a genetic algorithm. Capstone project for the Machine Learning Engineer Nanodegree by Udacity.
Stars: ✭ 146 (-0.68%)
facebook-discussion-tk: A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.
Stars: ✭ 33 (-77.55%)
Webhere: HTML scraping for Objective-C.
Stars: ✭ 16 (-89.12%)
Nintendeals: Library with a set of tools for scraping information about Nintendo games and their prices across all regions (NA, EU and JP).
Stars: ✭ 94 (-36.05%)
Facebook data analyzer: Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about a ranking of friends by messages, vocabulary, contacts, friends-added statistics and more
Stars: ✭ 515 (+250.34%)
policy-data-analyzer: Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-85.03%)
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+436.73%)
Htmlsql: htmlSQL is an experimental PHP library which allows you to access HTML values with an SQL-like syntax.
Stars: ✭ 120 (-18.37%)
memes-api: API for scraping common meme sites
Stars: ✭ 17 (-88.44%)
Parsel: Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Stars: ✭ 628 (+327.21%)
webdext: Intelligent Web Data Extractor
Stars: ✭ 75 (-48.98%)
Pastepwn: Python framework to scrape Pastebin pastes and analyze them
Stars: ✭ 87 (-40.82%)
PyLex: Perform lexical analysis on words, one word at a time.
Stars: ✭ 60 (-59.18%)
Zeiver: A Scraper, Downloader, & Recorder for static open directories.
Stars: ✭ 14 (-90.48%)
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-89.8%)
Tabula: Tabula is a tool for liberating data tables trapped inside PDF files
Stars: ✭ 5,420 (+3587.07%)
humanparser: Parse a human name string into salutation, first name, middle name, last name, suffix.
Stars: ✭ 78 (-46.94%)
Billy: legacy backend for Open States
Stars: ✭ 85 (-42.18%)
dust: Archive web pages with all relevant assets or save them as a single HTML file
Stars: ✭ 19 (-87.07%)
Browser.php: A PHP Class to detect a user's Browser. This encapsulation provides a breakdown of the browser and the version of the browser using the browser's user-agent string. This is not a guaranteed solution but provides an overall accurate way to detect what browser a user is using.
Stars: ✭ 546 (+271.43%)
pomp: Screen scraping and web crawling framework
Stars: ✭ 61 (-58.5%)
Awesome Puppeteer: A curated list of awesome Puppeteer resources.
Stars: ✭ 1,728 (+1075.51%)
Gazpacho: 🥫 The simple, fast, and modern web scraping library
Stars: ✭ 525 (+257.14%)
Phpscraper: PHP Scraper - a highly opinionated web interface for PHP
Stars: ✭ 148 (+0.68%)
Sqrape: Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Stars: ✭ 144 (-2.04%)
Udemycoursegrabber: Your will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and gives you back the download link of the latest one, but you can also choose to see the others!
Stars: ✭ 137 (-6.8%)
Souqscraper: Simple scripts to level up your scraping skills, and source code for the Level UP playlist on YouTube
Stars: ✭ 118 (-19.73%)
Email Extractor: The main functionality is to extract all the emails from one or several URLs
Stars: ✭ 81 (-44.9%)
Facebook Scraper: Scrape Facebook public pages without an API key
Stars: ✭ 499 (+239.46%)