876 open source projects that are alternatives to or similar to flink-crawler

Js Reverse

JS reverse-engineering research

Stars: ✭ 159 (+231.25%)

Mutual labels: crawler, spider

Fun crawler

Crawl some pictures for fun

Stars: ✭ 169 (+252.08%)

Mutual labels: crawler, spider

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Stars: ✭ 38 (-20.83%)

Mutual labels: spider, crawling

Xsrfprobe

The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.

Stars: ✭ 532 (+1008.33%)

Mutual labels: crawler, spider

Spoon

🥄 A package for building specific Proxy Pool for different Sites.

Stars: ✭ 173 (+260.42%)

Mutual labels: crawler, spider

Douyin

API of DouYin for Humans, used to crawl popular videos and music

Stars: ✭ 580 (+1108.33%)

Mutual labels: crawler, spider

Netdiscovery

NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.

Stars: ✭ 573 (+1093.75%)

Mutual labels: crawler, spider

Baiduimagespider

A super lightweight Baidu image crawler

Stars: ✭ 591 (+1131.25%)

Mutual labels: crawler, spider

Xxl Crawler

A distributed web crawler framework (XXL-CRAWLER).

Stars: ✭ 561 (+1068.75%)

Mutual labels: crawler, spider

Crawler

Go process used to crawl websites

Stars: ✭ 147 (+206.25%)

Mutual labels: crawler, crawling

Gain

Web crawling framework based on asyncio.

Stars: ✭ 2,002 (+4070.83%)

Mutual labels: crawler, spider

Ncov2019 data crawler

Epidemic data crawler: a 2019 novel coronavirus data repository with trajectory data, co-passenger data, and news reports

Stars: ✭ 175 (+264.58%)

Mutual labels: crawler, spider

N2h4

A tool for collecting Naver news

Stars: ✭ 177 (+268.75%)

Mutual labels: crawler, crawling

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+1227.08%)

Mutual labels: crawler, crawling

Crawler

A high performance web crawler in Elixir.

Stars: ✭ 781 (+1527.08%)

Mutual labels: crawler, spider

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-47.92%)

Mutual labels: crawler, spider

Zhihu Crawler

zhihu-crawler is a high-performance, Java-based distributed crawler project that supports a free HTTP proxy pool and horizontal scaling

Stars: ✭ 890 (+1754.17%)

Mutual labels: crawler, spider

Awesome Python Primer

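An index of high-quality Chinese-language resources for self-taught Python beginners, including books, docs, and videos, covering crawlers, web, data analysis, and machine learning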

Stars: ✭ 57 (+18.75%)

Mutual labels: crawler, spider

Photon

Incredibly fast crawler designed for OSINT.

Stars: ✭ 8,332 (+17258.33%)

Mutual labels: crawler, spider

Terpene Profile Parser For Cannabis Strains

Parser and database to index the terpene profile of different strains of Cannabis from online databases

Stars: ✭ 63 (+31.25%)

Mutual labels: crawler, web-crawler

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+1543.75%)

Mutual labels: crawler, crawling

Puppeteer Walker

a puppeteer walker 🕷 🕸

Stars: ✭ 78 (+62.5%)

Mutual labels: crawler, spider

Crawler examples

Some classic web crawler projects.

Stars: ✭ 74 (+54.17%)

Mutual labels: crawler, spider

Gopa Abandoned

GOPA, a spider written in Go. (NOTE: this project moved to https://github.com/infinitbyte/gopa)

Stars: ✭ 98 (+104.17%)

Mutual labels: crawler, spider

Spider

python crawler spider

Stars: ✭ 70 (+45.83%)

Mutual labels: crawler, spider

Crawler Detect

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

Stars: ✭ 1,549 (+3127.08%)

Mutual labels: crawler, spider

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output, based on dotnet core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (+108.33%)

Mutual labels: crawler, crawling

Fictiondown

Stars: ✭ 362 (+654.17%)

Mutual labels: crawler, spider

Ok ip proxy pool

🍿 Crawler proxy IP pool (proxy pool) in Python 🍟 A pretty OK IP proxy pool

Stars: ✭ 196 (+308.33%)

Mutual labels: crawler, spider

Decryptlogin

APIs for logging in to some websites using requests.

Stars: ✭ 1,861 (+3777.08%)

Mutual labels: crawler, spider

Zhihuspider

Multithreaded Zhihu user crawler, based on Python 3

Stars: ✭ 201 (+318.75%)

Mutual labels: crawler, spider

Jssoup

JavaScript + BeautifulSoup = JSSoup

Stars: ✭ 203 (+322.92%)

Mutual labels: crawler, spider

Examples Of Web Crawlers

Some very interesting Python crawler examples that are friendly to beginners, mainly crawling sites such as Taobao, Tmall, WeChat, Douban, and QQ.

Stars: ✭ 10,724 (+22241.67%)

Mutual labels: crawler, spider

Newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Stars: ✭ 11,545 (+23952.08%)

Mutual labels: crawler, crawling

Mm131

Image crawler for the MM131 website 🚨

Stars: ✭ 129 (+168.75%)

Mutual labels: crawler, spider

Amazonbigspider

😱 Fully automatic distributed Amazon spider | Distributed product collection and selection across four Amazon international sites | Account: admin, password: adminadmin

Stars: ✭ 140 (+191.67%)

Mutual labels: crawler, spider

Bilibili member crawler

Bilibili user crawler. Yay, it's a crawler!

Stars: ✭ 115 (+139.58%)

Mutual labels: crawler, spider

zcrawl

An open source web crawling platform

Stars: ✭ 21 (-56.25%)

Mutual labels: crawling, web-crawling

Python3 Spider

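Python crawlers in practice: simulated login to major websites, including but not limited to slider verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️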

Stars: ✭ 2,129 (+4335.42%)

Mutual labels: crawler, spider

Scrapingoutsourcing

ScrapingOutsourcing focuses on sharing crawler code, aiming to add one new crawler each week

Stars: ✭ 164 (+241.67%)

Mutual labels: crawler, spider

Jlitespider

A lite distributed Java spider framework :-)

Stars: ✭ 151 (+214.58%)

Mutual labels: crawler, spider

Laravel Crawler Detect

A Laravel wrapper for CrawlerDetect - the web crawler detection library

Stars: ✭ 227 (+372.92%)

Mutual labels: crawler, spider

Chromium for spider

dynamic crawler for web vulnerability scanner

Stars: ✭ 220 (+358.33%)

Mutual labels: crawler, spider

Ppspider

Web spider framework built on puppeteer. Supports flexible task queues and task scheduling via decorators, data storage in nedb / mongodb, and data visualization and user interaction.

Stars: ✭ 237 (+393.75%)

Mutual labels: crawler, spider

Proxy pool

Proxy IP pool for Python crawlers (proxy pool)

Stars: ✭ 13,964 (+28991.67%)

Mutual labels: crawler, spider

Douban Movie

Golang crawler that scrapes the Douban Movie Top 250

Stars: ✭ 114 (+137.5%)

Mutual labels: crawler, spider

Fooproxy

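A robust, efficient, score-based, targeted IP proxy pool plus API service. You can plug in your own collectors to crawl proxy IPs at any time, and it builds a separate database of valid proxy IPs for each of your crawler's target websites. Supports MongoDB 4.0; uses Python 3.7.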

Stars: ✭ 195 (+306.25%)

Mutual labels: crawler, spider

Querylist

🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

Stars: ✭ 2,392 (+4883.33%)

Mutual labels: crawler, spider

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (+295.83%)

Mutual labels: crawler, spider

Jd mask robot

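JD.com face mask stock monitoring crawler (non-Selenium): QR code login, price check, add to cart, order placement, and flash-sale purchase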