506 open source projects that are alternatives to or similar to Crawlerforreader

A next-generation crawler platform: define crawler flows graphically and build crawlers without writing code.

Stars: ✭ 365 (+16.99%)

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

Stars: ✭ 231 (-25.96%)

Mutual labels: crawler, jsoup

Gecco

Easy-to-use lightweight web crawler

Stars: ✭ 2,310 (+640.38%)

Mutual labels: crawler, jsoup

Crawlerpack

Java web data crawler package

Stars: ✭ 99 (-68.27%)

Mutual labels: crawler, jsoup

Appcrawler

An Appium-based tool for automatic app traversal

Stars: ✭ 925 (+196.47%)

Mutual labels: crawler, xpath

Docs

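Source code for "Data Collection: From Getting Started to Giving Up" (《数据采集从入门到放弃》). Contents: introduction to crawlers, the job market, and interview questions for crawler engineers; introduction to the HTTP protocol; using Requests; introduction to the XPath parser; MongoDB and MySQL; multithreaded crawlers; introduction to Scrapy; introduction to Scrapy-redis; deploying with Docker; managing a Docker cluster with Nomad; querying Docker logs with EFK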

Stars: ✭ 118 (-62.18%)

Mutual labels: crawler, xpath

Graphquery

GraphQuery is a query language and execution engine tied to any backend service.

Stars: ✭ 112 (-64.1%)

Mutual labels: crawler, xpath

Jsoup

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.

Stars: ✭ 9,184 (+2843.59%)

Mutual labels: jsoup, xpath

Awesome Java Crawler

This repository collects and organizes crawler-related resources, mainly for Java

Stars: ✭ 228 (-26.92%)

Mutual labels: crawler, jsoup

crawler

A simple and flexible web crawler framework for Java.

Stars: ✭ 20 (-93.59%)

Mutual labels: crawler, jsoup

codes-scratch-crawler

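Reading notes on the book "Write Your Own Web Crawler" (《自己动手写网络爬虫》), with hand-typed code. Mainly covers basic web crawler implementation, web page deduplication algorithms, web page fingerprinting algorithms, and text mining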

Stars: ✭ 44 (-85.9%)

Mutual labels: crawler

rankr

🇰🇷 Realtime integrated information analysis service

Stars: ✭ 21 (-93.27%)

Mutual labels: crawler

Sitemap Generator

Easily create XML sitemaps for your website.

Stars: ✭ 273 (-12.5%)

Mutual labels: crawler

Weixin Spider

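WeChat official account crawler: historical articles, article comments, article read counts and 在看 ("Wow") counts, and a visual web page; deployable on Windows servers. Built on Python 3 with flask/mysql/redis/mitmproxy/pywin32 and others. An efficient WeChat crawler covering official accounts, historical articles, article comments, and data updates.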

Stars: ✭ 287 (-8.01%)

Mutual labels: crawler

indieweb-search

Source code for the IndieWeb search engine.

Stars: ✭ 16 (-94.87%)

Mutual labels: crawler

Arachni

Web Application Security Scanner Framework

Stars: ✭ 2,942 (+842.95%)

Mutual labels: crawler

ZhengFang System Spider

🐛 A small crawler that logs in to the ZhengFang academic administration system and scrapes data

Stars: ✭ 21 (-93.27%)

Mutual labels: crawler

bots-zoo

No description or website provided.

Stars: ✭ 59 (-81.09%)

Mutual labels: crawler

Bt Btt

Introduction to the magnet-link site U3C3, with domain updates

Stars: ✭ 261 (-16.35%)

Mutual labels: crawler

dijnet-bot

All your bills, in yet another place :)

Stars: ✭ 17 (-94.55%)

Mutual labels: crawler

slime

🍰 A visual crawler platform

Stars: ✭ 27 (-91.35%)

Mutual labels: crawler

Hquery.php

An extremely fast web scraper that parses megabytes of invalid HTML in a blink of an eye. PHP5.3+, no dependencies.

Stars: ✭ 295 (-5.45%)

Mutual labels: crawler

Gospider

A crawler framework implemented in Golang. Users only need to care about page rules, and a web management interface is provided. Built on colly.

Stars: ✭ 285 (-8.65%)

Mutual labels: crawler

Tumblr crawler

This is a multi-threaded crawler for Tumblr.

Stars: ✭ 258 (-17.31%)

Mutual labels: crawler

CrawlBox

An easy way to brute-force web directories.

Stars: ✭ 118 (-62.18%)

Mutual labels: crawler

Crawling-CV-Conference-Papers

Crawling CV conference papers with Python.

Stars: ✭ 32 (-89.74%)

Mutual labels: crawler

tg crawler

Just a crawler based on tg-cli for Telegram. Now deprecated; please use telegram-export.

Stars: ✭ 71 (-77.24%)

Mutual labels: crawler

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (-11.22%)

Mutual labels: crawler

scraper

Scraper example built on Scala, Akka and Jsoup

Stars: ✭ 15 (-95.19%)

Mutual labels: jsoup

Python Automation Scripts

Simple yet powerful automation scripts.

Stars: ✭ 292 (-6.41%)

Mutual labels: crawler

spparser

An async ETL tool written in Python.

Stars: ✭ 34 (-89.1%)

Mutual labels: xpath

Rcrawler

An R web crawler and scraper

Stars: ✭ 274 (-12.18%)

Mutual labels: crawler

html-query

A fluent and functional approach to querying HTML

Stars: ✭ 48 (-84.62%)

Mutual labels: crawler

Go Dork

The fastest dork scanner written in Go.

Stars: ✭ 274 (-12.18%)

Mutual labels: crawler

snapcrawl

Crawl a website and take screenshots

Stars: ✭ 37 (-88.14%)

Mutual labels: crawler

Line Bot Tutorial

line-bot-tutorial, using Python Flask

Stars: ✭ 267 (-14.42%)

Mutual labels: crawler

TumblTwo

TumblTwo, an Improved Fork of TumblOne, a Tumblr Downloader.

Stars: ✭ 57 (-81.73%)

Mutual labels: crawler

Sasila

A flexible, friendly crawler framework

Stars: ✭ 286 (-8.33%)

Mutual labels: crawler

WebCrawler

A lightweight, fast, multithreaded, multi-pipeline, flexibly configurable web crawler.

Stars: ✭ 39 (-87.5%)

Mutual labels: crawler

Weibo terminator workflow

Updated version of weibo_terminator. This is the workflow version, aimed at getting the job done!

Stars: ✭ 259 (-16.99%)

Mutual labels: crawler

videodl

Videodl: A lightweight video downloader written in pure Python.

Stars: ✭ 320 (+2.56%)

Mutual labels: crawler

Supercrawler

A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.

Stars: ✭ 306 (-1.92%)

Mutual labels: crawler

2017 PyConTW Talk

tw.pycon.org/2017/events/talk/314386410792550475/

Stars: ✭ 18 (-94.23%)

Mutual labels: crawler

Spidy

The simple, easy-to-use command-line web crawler.

Stars: ✭ 257 (-17.63%)

Mutual labels: crawler

Crawlertutorial

A minimalist crawler tutorial (fetch, parse, search, multiprocessing, API), using PTT as the example

Stars: ✭ 282 (-9.62%)

Mutual labels: crawler

WeiboCrawler

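A cookie-free Weibo crawler that can continuously crawl the profile information, posts, and post comments and reposts of one or more Sina Weibo users.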

Stars: ✭ 45 (-85.58%)

Mutual labels: crawler

lightnovel epub

🍭 EPUB generator for (light) novels. Supported sites: 轻之国度, 轻小说文库

Stars: ✭ 89 (-71.47%)

Mutual labels: crawler

BilibiliCrawler

🌀 Crawl bilibili user info and video info for data analysis | BiliBili crawler

Stars: ✭ 25 (-91.99%)

Mutual labels: crawler

lostark-wait-notifier

🐤️ Lost Ark wait notifier

Stars: ✭ 38 (-87.82%)

Mutual labels: crawler

galer

A fast tool to fetch URLs from HTML attributes by crawl-in.

Stars: ✭ 138 (-55.77%)

Mutual labels: crawler

TripAdvisor-Crawling-Suite

Fetching hotel data from TripAdvisor.

Stars: ✭ 17 (-94.55%)

Mutual labels: crawler

spiderable-middleware

🤖 Prerendering for JavaScript-powered websites. Great solution for PWAs (Progressive Web Apps), SPAs (Single Page Applications), and other websites built on top of front-end JavaScript frameworks

Stars: ✭ 29 (-90.71%)

Mutual labels: crawler

Exist

eXist Native XML Database and Application Platform

Stars: ✭ 294 (-5.77%)

Mutual labels: xpath

Scrapy Crawlera

Crawlera middleware for Scrapy

Stars: ✭ 281 (-9.94%)

Mutual labels: crawler

octopus

Recursive and multi-threaded broken link checker