1043 open source projects that are alternatives to or similar to Mimo-Crawler

Crawlab
Distributed web crawler admin platform for managing spiders regardless of language or framework.
Stars: ✭ 8,392 (+38045.45%)
Mutual labels:  web-crawler, webcrawler
crawlkit
A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.
Stars: ✭ 23 (+4.55%)
Mutual labels:  scraper, crawling
Xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Stars: ✭ 335 (+1422.73%)
Mutual labels:  scraper, webscraping
OLX Scraper
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).
Stars: ✭ 15 (-31.82%)
Mutual labels:  scraper, web-crawler
Polite
Be nice on the web
Stars: ✭ 253 (+1050%)
Mutual labels:  scraper, webscraping
Spidy
The simple, easy to use command line web crawler.
Stars: ✭ 257 (+1068.18%)
Mutual labels:  web-crawler, crawling
bing-ip2hosts
bingip2hosts is a Bing.com web scraper that discovers websites by IP address
Stars: ✭ 99 (+350%)
Mutual labels:  scraper, webscraping
bots-zoo
No description or website provided.
Stars: ✭ 59 (+168.18%)
Mutual labels:  scraper, crawling
Huginn
Create agents that monitor and act on your behalf. Your agents are standing by!
Stars: ✭ 33,694 (+153054.55%)
Mutual labels:  scraper, webscraping
Linkedin Profile Scraper
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (+677.27%)
Mutual labels:  scraper, crawling
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+136.36%)
Mutual labels:  scraper, crawling
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (+140.91%)
Mutual labels:  scraper, crawling
Singlefilez
Web Extension for Firefox/Chrome/MS Edge and CLI tool to save a faithful copy of an entire web page in a self-extracting HTML/ZIP polyglot file
Stars: ✭ 882 (+3909.09%)
Mutual labels:  firefox, webpage
flink-crawler
Continuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (+118.18%)
Mutual labels:  web-crawler, crawling
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (+209.09%)
Mutual labels:  crawling, webscraping
newsemble
API for fetching data from news websites.
Stars: ✭ 42 (+90.91%)
Mutual labels:  scraper, webscraping
metacritic api
PHP Metacritic API - Mirrored by my GitLab
Stars: ✭ 31 (+40.91%)
Mutual labels:  scraper, webscraping
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+18431.82%)
Mutual labels:  scraper, webscraping
Nutch
Apache Nutch is an extensible and scalable web crawler
Stars: ✭ 2,277 (+10250%)
Mutual labels:  web-crawler, crawling
Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+2881.82%)
Mutual labels:  scraper, web-crawler
Lulu
[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+3486.36%)
Mutual labels:  scraper, crawling
Youtube Projects
This repository contains all the code I use in my YouTube tutorials.
Stars: ✭ 144 (+554.55%)
Mutual labels:  scraper, webscraping
Django Dynamic Scraper
Creating Scrapy scrapers via the Django admin interface
Stars: ✭ 1,024 (+4554.55%)
Mutual labels:  scraper, webscraping
BookingScraper
🌎 🏨 Scrape Booking.com 🏨 🌎
Stars: ✭ 68 (+209.09%)
Mutual labels:  scraper, webscraping
ant
A web crawler for Go
Stars: ✭ 264 (+1100%)
Mutual labels:  scraper, web-crawler
Spam Bot 3000
Social media research and promotion, semi-autonomous CLI bot
Stars: ✭ 79 (+259.09%)
Mutual labels:  firefox, scraper
gotor
This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.
Stars: ✭ 97 (+340.91%)
Mutual labels:  webcrawler, webscraping
web-crawler
Python Web Crawler with Selenium and PhantomJS
Stars: ✭ 19 (-13.64%)
Mutual labels:  scraper, webcrawler
TrackPurchase
Scrape payment history from various shopping platforms with just a few lines of code!
Stars: ✭ 19 (-13.64%)
Mutual labels:  webcrawler, webscraping
Ferret
Declarative web scraping
Stars: ✭ 4,837 (+21886.36%)
Mutual labels:  scraper, crawling
Dotnetcrawler
DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (+354.55%)
Mutual labels:  crawling, webscraping
Skycaiji
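Skycaiji (蓝天采集器) is free crawler software for collecting and publishing data, developed with PHP + MySQL and deployable on cloud servers. It can collect almost all types of web pages, integrates seamlessly with all kinds of CMS site-building programs, publishes data in real time without logging in, and runs fully automatically with no manual intervention. It is a fully cross-platform cloud crawler system for web big-data collection.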
Stars: ✭ 1,514 (+6781.82%)
Mutual labels:  crawling, webcrawler
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+1159.09%)
Mutual labels:  web-crawler, crawling
Colly
Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+70513.64%)
Mutual labels:  scraper, crawling
newspaperjs
News extraction and scraping. Article Parsing
Stars: ✭ 59 (+168.18%)
Mutual labels:  scraper, webscraping
img-cli
An interactive command-line interface built in Node.js for downloading one or more images to disk from a URL
Stars: ✭ 15 (-31.82%)
Mutual labels:  webpage, crawling
evine
Interactive CLI Web Crawler
Stars: ✭ 140 (+536.36%)
Mutual labels:  scraper, web-crawler
Antch
Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Stars: ✭ 198 (+800%)
Mutual labels:  web-crawler, crawling
Rcrawler
An R web crawler and scraper
Stars: ✭ 274 (+1145.45%)
Mutual labels:  scraper, webscraping
Instagram-Scraper-2021
Scrape Instagram content and stories anonymously, using a new technique based on the HAR file (no token + no public API).
Stars: ✭ 57 (+159.09%)
Mutual labels:  scraper, webscraping
Crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+1900%)
Mutual labels:  scraper, crawling
Goscraper
Golang pkg to quickly return a preview of a webpage (title/description/images)
Stars: ✭ 72 (+227.27%)
Mutual labels:  scraper, webpage
Scrapyrt
HTTP API for Scrapy spiders
Stars: ✭ 637 (+2795.45%)
Mutual labels:  scraper, crawling
Headless Chrome Crawler
Distributed crawler powered by Headless Chrome
Stars: ✭ 5,129 (+23213.64%)
Mutual labels:  scraper, crawling
Mailinglistscraper
A python web scraper for public email lists.
Stars: ✭ 19 (-13.64%)
Mutual labels:  scraper, webscraping
Awesome Crawler
A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+21686.36%)
Mutual labels:  scraper, web-crawler
Newspaper
News, full-text, and article metadata extraction in Python 3.
Stars: ✭ 11,545 (+52377.27%)
Mutual labels:  scraper, crawling
Dataflowkit
Extract structured data from websites. Website scraping.
Stars: ✭ 456 (+1972.73%)
Mutual labels:  scraper, crawling
Linkedin scraper
A library that scrapes Linkedin for user data
Stars: ✭ 413 (+1777.27%)
Mutual labels:  firefox, scraper
robotstxt
robots.txt file parsing and checking for R
Stars: ✭ 65 (+195.45%)
Mutual labels:  scraper, webscraping
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+131.82%)
Mutual labels:  scraper, crawling
supervised-machine-learning
This repo contains regression and classification projects. Examples: development of predictive models for comments on social media websites; building classifiers to predict outcomes in sports competitions; churn analysis; prediction of clicks on online ads; analysis of the opioids crisis and an analysis of retail store expansion strategies using…
Stars: ✭ 34 (+54.55%)
Mutual labels:  webscraping
freeRep
Bypass repubblica.it and lastampa.it paywall
Stars: ✭ 34 (+54.55%)
Mutual labels:  firefox
impartus-downloader
Download Impartus lectures, convert to mkv for offline viewing.
Stars: ✭ 19 (-13.64%)
Mutual labels:  scraper
web-scraping-engine
A simple web scraping engine supporting concurrent and anonymous scraping
Stars: ✭ 27 (+22.73%)
Mutual labels:  scraper
ir
Project for automatically calculating income tax (Imposto de Renda) on Bovespa trades. Tags: canal eletronico do investidor, CEI, selenium, bovespa, IRPF, IR, imposto de renda, finance, yahoo finance, acao, fii, etf, python, crawler, webscraping, calculadora ir
Stars: ✭ 120 (+445.45%)
Mutual labels:  webscraping
containerise-lists
Containerise compatible domain lists
Stars: ✭ 28 (+27.27%)
Mutual labels:  firefox
stock-market-scraper
Scrapes historical stock market data from Yahoo Finance (https://finance.yahoo.com/)
Stars: ✭ 110 (+400%)
Mutual labels:  scraper
aliexscrape
Get Aliexpress product details in JSON
Stars: ✭ 80 (+263.64%)
Mutual labels:  scraper
VideoRecognition-realtime-autotrainer-alerts
State-of-the-art real-time object detection using the YOLOv3 algorithm, augmented with a process that allows easy training of the classifier as a plug & play solution. Provides an alert if an item in an alert list is detected.
Stars: ✭ 36 (+63.64%)
Mutual labels:  webscraping
1-60 of 1043 similar projects