This package is a complete tool for creating a large dataset of images (designed especially, but not only, for machine learning enthusiasts). It can crawl the web, download images, rename / resize / convert the images and merge folders.

Stars: ✭ 51 (-76.17%)

Mutual labels: crawler

Black Widow

GUI based offensive penetration testing tool (Open Source)

Stars: ✭ 124 (-42.06%)

Mutual labels: crawler

Cloudmusic

NetEase Cloud Music crawler solution

Stars: ✭ 51 (-76.17%)

Mutual labels: spider

Lyrics Crawler

Get the lyrics for the song currently playing on Spotify

Stars: ✭ 49 (-77.1%)

Mutual labels: crawler

Apiproject

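[https://www.sofineday.com], Golang project scaffolding that integrates best practices (gin + gorm + go-redis + mongo + cors + jwt + zap JSON logging library (supports log collection to kafka or mongo) + kafka message queue + gopay for WeChat and Alipay payments + API encryption + API reverse proxy + go modules dependency management + chromedp headless crawler + makefile + binary compression + livereload hot reloading)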

Stars: ✭ 124 (-42.06%)

Mutual labels: spider

Zhihu Spider

Zhihu crawler that tracks question data and pushes trending topics on a schedule

Stars: ✭ 87 (-59.35%)

Mutual labels: spider

Qiandao

🌟⏳🌟 Automatic check-ins for various websites (no longer maintained)

Stars: ✭ 141 (-34.11%)

Mutual labels: spider

Alipayorderssupervisor Gui

GUI of AlipayOrdersSupervisor, implemented in Java and Swing

Stars: ✭ 85 (-60.28%)

Mutual labels: spider

Pixeval

A Strong, Fast and Flexible Pixiv Client based on .NET Core and WPF

Stars: ✭ 1,031 (+381.78%)

Mutual labels: crawler

Skill Share Crawler Dl

Download Skillshare videos by ID or by class

Stars: ✭ 122 (-42.99%)

Mutual labels: crawler

Fbiwarning

Node.js seed (torrent) downloader

Stars: ✭ 44 (-79.44%)

Mutual labels: spider

Owllook

owllook: a novel search engine

Stars: ✭ 2,163 (+910.75%)

Mutual labels: spider

Oddish

Crawls all CS:GO skins from a website.

Stars: ✭ 139 (-35.05%)

Mutual labels: crawler

Weibo Album Crawler

Multithreaded crawler for full-size images in Sina Weibo albums.

Stars: ✭ 83 (-61.21%)

Mutual labels: crawler

Datmusic Api

Alternative for VK Audio API

Stars: ✭ 160 (-25.23%)

Mutual labels: crawler

Tumblr crawler

Tumblr parsing website

Stars: ✭ 83 (-61.21%)

Mutual labels: crawler

Taiwan News Crawlers

Scrapy-based Crawlers for news of Taiwan

Stars: ✭ 83 (-61.21%)

Mutual labels: crawler

Wechat article

Crawls articles from WeChat official accounts

Stars: ✭ 121 (-43.46%)

Mutual labels: spider

Acm Statistics

An online tool (crawler) to analyze users' performance in online judges (coding competition websites). Supported OJ: POJ, HDU, ZOJ, HYSBZ, CodeForces, UVA, ICPC Live Archive, FZU, SPOJ, Timus (URAL), LeetCode_CN, CSU, LibreOJ, 洛谷, 牛客OJ, Lutece (UESTC), AtCoder, AIZU, CodeChef, El Judge, BNUOJ, Codewars, UOJ, NBUT, 51Nod, DMOJ, VJudge

Stars: ✭ 83 (-61.21%)

Mutual labels: crawler

Instagram Bot

An Instagram bot developed using the Selenium Framework

Stars: ✭ 138 (-35.51%)

Mutual labels: crawler

Is Google

Verify that a request is from Google crawlers using Google's DNS verification steps

Stars: ✭ 82 (-61.68%)

Mutual labels: crawler

Dbworld Search

🔍 A simple search engine built with the Django framework

Stars: ✭ 39 (-81.78%)

Mutual labels: crawler

Php Crawler

A PHP crawler that finds emails on the internet

Stars: ✭ 119 (-44.39%)

Mutual labels: crawler

Dirhunt

Find web directories without bruteforce

Stars: ✭ 983 (+359.35%)

Mutual labels: crawler

Sentinel Crawler

Xenomorph Crawler, a concise, declarative and observable distributed crawler (Node / Go / Java / Rust) for Web, RDB and OS that can also act as a monitor (with Prometheus) or ETL for infrastructure 💫 Multi-language executor, distributed crawler

Stars: ✭ 118 (-44.86%)

Mutual labels: crawler

Py Elasticsearch Django

A ten-million-scale search engine developed in Python

Stars: ✭ 207 (-3.27%)

Mutual labels: spider

Work crawler

Download tool for comics and novels. Comics: 腾讯漫画, 大角虫漫画, 有妖气, 知音漫客, 咪咕, SF漫画, 哦漫画, 看漫画, 漫画柜, 汗汗酷漫, 動漫伊甸園, 快看漫画, 微博动漫, 733动漫网, 大古漫画网, 漫画DB, 無限動漫, 動漫狂, 卡推漫画, 动漫之家, 动漫屋, 古风漫画网, 36漫画网, 亲亲漫画网, 乙女漫画, comico, webtoons, 咚漫, ニコニコ静画, ComicWalker, ヤングエースUP, モアイ, pixivコミック, サイコミ; novels: アルファポリス, カクヨム, ハーメルン, 小説家になろう, 起点中文网, 八一中文网, 顶点小说, 落霞小说网, 努努书坊, 笔趣阁 → epub.

Stars: ✭ 1,224 (+471.96%)

Mutual labels: crawler

Diskover

File system crawler, disk space usage, file search engine and file system analytics powered by Elasticsearch

Stars: ✭ 977 (+356.54%)

Mutual labels: crawler

Moodle Downloader 2

A Moodle downloader that downloads course content fast from Moodle (e.g. lecture PDFs)

Stars: ✭ 118 (-44.86%)

Mutual labels: crawler

An Open Source Search Engine

Stars: ✭ 139 (-35.05%)

Mutual labels: crawler

Wombat

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

Stars: ✭ 1,220 (+470.09%)

Mutual labels: crawler

Pythondemo

My Python Demo

Stars: ✭ 173 (-19.16%)

Mutual labels: spider

Swiftlinkpreview

It makes a preview from a URL, grabbing all the information such as title, relevant texts and images.

Stars: ✭ 1,216 (+468.22%)

Mutual labels: crawler

Douyin Crawler

Douyin crawler. Crawls users' posts and likes through a mobile proxy.

Stars: ✭ 33 (-84.58%)

Mutual labels: crawler

Copybook

Crawls all novels from novel websites, stores them in a database, and uses the crawled data to build your own novel website

Stars: ✭ 117 (-45.33%)

Mutual labels: spider

Universityrecruitment Ssurvey

Uses serious data to answer the question "What kinds of companies recruit at what kinds of universities?"

Stars: ✭ 30 (-85.98%)

Mutual labels: crawler

Baiducrawler

Sample of using proxies to crawl baidu search results.

Stars: ✭ 116 (-45.79%)

Mutual labels: crawler

Spider

A configurable web spider with an easy-to-use web console

Stars: ✭ 954 (+345.79%)

Mutual labels: spider

Instagram Scraper

Scrapes media, likes, followers, tags and all metadata. Inspired by instagram-php-scraper, bot

Stars: ✭ 2,209 (+932.24%)

Mutual labels: crawler

Poopak

POOPAK - TOR Hidden Service Crawler

Stars: ✭ 78 (-63.55%)

Mutual labels: crawler

Koreanewscrawler

A news crawler built to collect large volumes of news data.

Stars: ✭ 138 (-35.51%)

Mutual labels: crawler

Webb

Python: An all-in-one Web Crawler, Web Parser and Web Scraping library!

Stars: ✭ 77 (-64.02%)

Mutual labels: crawler

Gerapy

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

Stars: ✭ 2,601 (+1115.42%)

Mutual labels: spider

Tianyancha

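A pip-installable Tianyancha crawler API that saves the business registration information of one or more specified companies to Excel/JSON in one step. A batteries-included scraper API for Tianyancha, the best Chinese business data and investigation platform.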

Stars: ✭ 206 (-3.74%)

Mutual labels: crawler

Jvppeteer

Headless Chrome for Java (Java crawler)

Stars: ✭ 193 (-9.81%)

Mutual labels: crawler

Anticrawlersolution

Covers the blocking principles behind most anti-crawling strategies and the corresponding solutions. 👽👽👽👽

Stars: ✭ 77 (-64.02%)

Mutual labels: crawler

Zhihu Spider

A multithreaded Python crawler that fetches information from Zhihu user profile pages.