1043 open source projects that are alternatives to or similar to Mimo-Crawler

Distributed web crawler admin platform for managing spiders, regardless of language or framework.

Stars: ✭ 8,392 (+38045.45%)

Mutual labels: web-crawler, webcrawler

crawlkit

A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.

Stars: ✭ 23 (+4.55%)

Mutual labels: scraper, crawling

Xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Stars: ✭ 335 (+1422.73%)

Mutual labels: scraper, webscraping

OLX Scraper

📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).

Stars: ✭ 15 (-31.82%)

Mutual labels: scraper, web-crawler

Polite

Be nice on the web

Stars: ✭ 253 (+1050%)

Mutual labels: scraper, webscraping

Spidy

The simple, easy to use command line web crawler.

Stars: ✭ 257 (+1068.18%)

Mutual labels: web-crawler, crawling

bing-ip2hosts

bingip2hosts is a Bing.com web scraper that discovers websites by IP address

Stars: ✭ 99 (+350%)

Mutual labels: scraper, webscraping

bots-zoo

No description or website provided.

Stars: ✭ 59 (+168.18%)

Mutual labels: scraper, crawling

Huginn

Create agents that monitor and act on your behalf. Your agents are standing by!

Stars: ✭ 33,694 (+153054.55%)

Mutual labels: scraper, webscraping

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (+677.27%)

Mutual labels: scraper, crawling

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (+136.36%)

Mutual labels: scraper, crawling

diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

Stars: ✭ 53 (+140.91%)

Mutual labels: scraper, crawling

Singlefilez

Web Extension for Firefox/Chrome/MS Edge and CLI tool to save a faithful copy of an entire web page in a self-extracting HTML/ZIP polyglot file

Stars: ✭ 882 (+3909.09%)

Mutual labels: firefox, webpage

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (+118.18%)

Mutual labels: web-crawler, crawling

ARGUS

ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9

Stars: ✭ 68 (+209.09%)

Mutual labels: crawling, webscraping

newsemble

API for fetching data from news websites.

Stars: ✭ 42 (+90.91%)

Mutual labels: scraper, webscraping

metacritic api

PHP Metacritic API - Mirrored by my GitLab

Stars: ✭ 31 (+40.91%)

Mutual labels: scraper, webscraping

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+18431.82%)

Mutual labels: scraper, webscraping

Nutch

Apache Nutch is an extensible and scalable web crawler

Stars: ✭ 2,277 (+10250%)

Mutual labels: web-crawler, crawling

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+2881.82%)

Mutual labels: scraper, web-crawler

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+3486.36%)

Mutual labels: scraper, crawling

Youtube Projects

This repository contains all the code I use in my YouTube tutorials.

Stars: ✭ 144 (+554.55%)

Mutual labels: scraper, webscraping

Django Dynamic Scraper

Creating Scrapy scrapers via the Django admin interface

Stars: ✭ 1,024 (+4554.55%)

Mutual labels: scraper, webscraping

BookingScraper

🌎 🏨 Scrape Booking.com 🏨 🌎

Stars: ✭ 68 (+209.09%)

Mutual labels: scraper, webscraping

ant

A web crawler for Go

Stars: ✭ 264 (+1100%)

Mutual labels: scraper, web-crawler

Spam Bot 3000

Social media research and promotion, semi-autonomous CLI bot

Stars: ✭ 79 (+259.09%)

Mutual labels: firefox, scraper

gotor

This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.

Stars: ✭ 97 (+340.91%)

Mutual labels: webcrawler, webscraping

web-crawler

Python Web Crawler with Selenium and PhantomJS

Stars: ✭ 19 (-13.64%)

Mutual labels: scraper, webcrawler

TrackPurchase

Scrape payment histories from various shopping platforms with just a few lines of code!

Stars: ✭ 19 (-13.64%)

Mutual labels: webcrawler, webscraping

Ferret

Declarative web scraping

Stars: ✭ 4,837 (+21886.36%)

Mutual labels: scraper, crawling

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (+354.55%)

Mutual labels: crawling, webscraping

Skycaiji

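Skycaiji is a free crawler for collecting and publishing data, developed with PHP + MySQL and deployable on cloud servers. It can collect almost any type of web page, integrates seamlessly with various CMS site-building programs, and publishes data in real time without login, fully automatically and with no manual intervention. Among web big-data collection software, it is a fully cross-platform cloud crawler system.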

Stars: ✭ 1,514 (+6781.82%)

Mutual labels: crawling, webcrawler

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (+1159.09%)

Mutual labels: web-crawler, crawling

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+70513.64%)

Mutual labels: scraper, crawling

newspaperjs

News extraction and scraping. Article Parsing

Stars: ✭ 59 (+168.18%)

Mutual labels: scraper, webscraping

img-cli

An interactive command-line interface built in Node.js for downloading single or multiple images to disk from a URL

Stars: ✭ 15 (-31.82%)

Mutual labels: webpage, crawling

evine

Interactive CLI Web Crawler

Stars: ✭ 140 (+536.36%)

Mutual labels: scraper, web-crawler

Antch

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

Stars: ✭ 198 (+800%)

Mutual labels: web-crawler, crawling

Rcrawler

An R web crawler and scraper

Stars: ✭ 274 (+1145.45%)

Mutual labels: scraper, webscraping

Instagram-Scraper-2021

Scrape Instagram content and stories anonymously, using a new technique based on the har file (No Token + No public API).

Stars: ✭ 57 (+159.09%)

Mutual labels: scraper, webscraping

Crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

Stars: ✭ 440 (+1900%)

Mutual labels: scraper, crawling

Goscraper

Golang pkg to quickly return a preview of a webpage (title/description/images)

Stars: ✭ 72 (+227.27%)

Mutual labels: scraper, webpage

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+2795.45%)

Mutual labels: scraper, crawling

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+23213.64%)

Mutual labels: scraper, crawling

Mailinglistscraper

A python web scraper for public email lists.

Stars: ✭ 19 (-13.64%)

Mutual labels: scraper, webscraping

Awesome Crawler

A collection of awesome web crawlers and spiders in different languages

Stars: ✭ 4,793 (+21686.36%)

Mutual labels: scraper, web-crawler

Newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Stars: ✭ 11,545 (+52377.27%)

Mutual labels: scraper, crawling

Dataflowkit

Extract structured data from web sites. Web sites scraping.

Stars: ✭ 456 (+1972.73%)

Mutual labels: scraper, crawling

Linkedin scraper

A library that scrapes Linkedin for user data

Stars: ✭ 413 (+1777.27%)

Mutual labels: firefox, scraper

robotstxt

robots.txt file parsing and checking for R

Stars: ✭ 65 (+195.45%)

Mutual labels: scraper, webscraping

proxycrawl-python

ProxyCrawl Python library for scraping and crawling

Stars: ✭ 51 (+131.82%)

Mutual labels: scraper, crawling

supervised-machine-learning

This repo contains regression and classification projects. Examples: development of predictive models for comments on social media websites; building classifiers to predict outcomes in sports competitions; churn analysis; prediction of clicks on online ads; analysis of the opioids crisis and an analysis of retail store expansion strategies using…

Stars: ✭ 34 (+54.55%)

Mutual labels: webscraping

freeRep

Bypass repubblica.it and lastampa.it paywall

Stars: ✭ 34 (+54.55%)

Mutual labels: firefox

impartus-downloader

Download Impartus lectures, convert to mkv for offline viewing.

Stars: ✭ 19 (-13.64%)

Mutual labels: scraper

web-scraping-engine

A simple web scraping engine supporting concurrent and anonymous scraping

Stars: ✭ 27 (+22.73%)

Mutual labels: scraper

Project for automatically calculating income tax on Bovespa trades. Tags: canal eletronico do investidor, CEI, selenium, bovespa, IRPF, IR, imposto de renda, finance, yahoo finance, acao, fii, etf, python, crawler, webscraping, calculadora ir

Stars: ✭ 120 (+445.45%)

Mutual labels: webscraping

containerise-lists

Containerise compatible domain lists

Stars: ✭ 28 (+27.27%)

Mutual labels: firefox

stock-market-scraper

Scrapes historical stock market data from Yahoo Finance (https://finance.yahoo.com/)

Stars: ✭ 110 (+400%)

Mutual labels: scraper

aliexscrape

Get Aliexpress product details in JSON

Stars: ✭ 80 (+263.64%)

Mutual labels: scraper

VideoRecognition-realtime-autotrainer-alerts

State-of-the-art object detection in real time using the YOLOv3 algorithm, augmented with a process that allows easy training of the classifier as a plug-and-play solution. Provides an alert if an item in an alert list is detected.

Stars: ✭ 36 (+63.64%)

Mutual labels: webscraping

1-60 of 1043 similar projects