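A robust, efficient, score-based, targeted IP proxy pool with an API service. You can plug in your own collectors to crawl proxy IPs, and it builds a separate database of valid proxies for each of your crawler's target websites. Supports MongoDB 4.0; uses Python 3.7. (Scored IP proxy pool; custom proxy data crawlers can be added anytime.)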

Stars: ✭ 195 (-2.5%)

Mutual labels: crawler

Weibo terminator workflow

Updated version of weibo_terminator. This is the workflow version, aimed at getting the job done!

Stars: ✭ 259 (+29.5%)

Mutual labels: crawler

Tumblr Crawler

Easily download all the photos and videos from specified Tumblr blogs.

Stars: ✭ 1,118 (+459%)

Mutual labels: crawler

Spidy

The simple, easy to use command line web crawler.

Stars: ✭ 257 (+28.5%)

Mutual labels: crawler

Fontobfuscator

Font obfuscation service

Stars: ✭ 125 (-37.5%)

Mutual labels: crawler

galer

A fast tool to fetch URLs from HTML attributes by crawl-in.

Stars: ✭ 138 (-31%)

Mutual labels: crawler

Boj Autocommit

When you solve a problem on Baekjoon Online Judge, it automatically commits and pushes the solution to the remote repository.

Stars: ✭ 60 (-70%)

Mutual labels: crawler

PY-Login

Simulates logging in to various websites and uses their APIs to do all sorts of unspeakable things.

Stars: ✭ 26 (-87%)

Mutual labels: crawler

Instagram Scraper

Scrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper, bot

Stars: ✭ 2,209 (+1004.5%)

Mutual labels: crawler

weibo-scraper

Simple Weibo Scraper

Stars: ✭ 50 (-75%)

Mutual labels: crawler

Beanbun

Beanbun is a multi-process web crawler framework written in PHP, with good openness and high extensibility, built on Workerman.

Stars: ✭ 1,096 (+448%)

Mutual labels: crawler

tg crawler

Just a crawler based on tg-cli for Telegram. Now deprecated; please use telegram-export.

Stars: ✭ 71 (-64.5%)

Mutual labels: crawler

Crawlab Lite

Lite version of Crawlab, a crawler management platform.

Stars: ✭ 122 (-39%)

Mutual labels: crawler

codes-scratch-crawler

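Reading notes on the book 《自己动手写网络爬虫》 (Write Your Own Web Crawler), with code typed out by hand. Mainly covers the basic implementation of a web crawler, page deduplication algorithms, page fingerprinting algorithms, and text mining.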

Stars: ✭ 44 (-78%)

Mutual labels: crawler

Crawlergo

A powerful dynamic crawler for web vulnerability scanners

Stars: ✭ 1,088 (+444%)

Mutual labels: crawler

html-query

A fluent and functional approach to querying HTML

Stars: ✭ 48 (-76%)

Mutual labels: crawler

Leetcode Spider

Crawl your own LeetCode solution source code with Node.js.

Stars: ✭ 176 (-12%)

Mutual labels: crawler

snapcrawl

Crawl a website and take screenshots

Stars: ✭ 37 (-81.5%)

Mutual labels: crawler

Awesome Python Primer

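An index of high-quality Chinese-language resources for self-taught Python beginners, including books / docs / videos, covering crawlers / web / data analysis / machine learning.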

Stars: ✭ 57 (-71.5%)

Mutual labels: crawler

TumblTwo

TumblTwo, an Improved Fork of TumblOne, a Tumblr Downloader.

Stars: ✭ 57 (-71.5%)

Mutual labels: crawler

Qqmusicspider

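A Scrapy-based QQ Music spider that crawls song information, lyrics, featured comments, and more. It also shares a music corpus of 490,000+ entries from the top 6,400 mainland Chinese, Hong Kong, and Taiwanese artists on QQ Music.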

Stars: ✭ 120 (-40%)

Mutual labels: crawler

WebCrawler

A lightweight, fast, multi-threaded, multi-pipeline, flexibly configurable web crawler.

Stars: ✭ 39 (-80.5%)

Mutual labels: crawler

Leetcode Ranking Search

Leetcode Contest Ranking Searcher

Stars: ✭ 51 (-74.5%)

Mutual labels: crawler

videodl

Videodl: A lightweight video downloader written in pure Python.

Stars: ✭ 320 (+60%)

Mutual labels: crawler

Weibo wordcloud

Scrapes Weibo data by keyword, then generates a word cloud.

Stars: ✭ 154 (-23%)

Mutual labels: crawler

2017 PyConTW Talk

tw.pycon.org/2017/events/talk/314386410792550475/

Stars: ✭ 18 (-91%)

Mutual labels: crawler

Lyrics Crawler

Get the lyrics for the song currently playing on Spotify

Stars: ✭ 49 (-75.5%)

Mutual labels: crawler

WeiboCrawler

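A cookie-free Weibo crawler that can continuously crawl the profile information, posts, and post comments and reposts of one or more Sina Weibo users.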

Stars: ✭ 45 (-77.5%)

Mutual labels: crawler

Tiebamanager

(Abandoned) A moderation tool for Baidu Tieba that automatically scans posts and handles rule-violating ones.

Stars: ✭ 119 (-40.5%)

Mutual labels: crawler

lostark-wait-notifier

🐤️ Lost Ark wait notifier

Stars: ✭ 38 (-81%)

Mutual labels: crawler

Social Scraper

A collection of scripts for crawling data from Vietnamese-language social networks and websites.

Stars: ✭ 47 (-76.5%)

Mutual labels: crawler

spiderable-middleware

🤖 Prerendering for JavaScript-powered websites. Great solution for PWAs (Progressive Web Apps), SPAs (Single Page Applications), and other websites built on top of front-end JavaScript frameworks

Stars: ✭ 29 (-85.5%)

Mutual labels: crawler

Marmot

💐Marmot | Web Crawler/HTTP protocol Download Package 🐭

Stars: ✭ 186 (-7%)

Mutual labels: crawler

domfind

A Python DNS crawler to find identical domain names under different TLDs.

Stars: ✭ 22 (-89%)

Mutual labels: crawler

Weibo Crawler

Sina Weibo crawler: uses Python to crawl Sina Weibo data and download Weibo images and videos.

Stars: ✭ 1,019 (+409.5%)

Mutual labels: crawler

php-google

Google search results crawler: get the Google search results you need (PHP)

Stars: ✭ 23 (-88.5%)

Mutual labels: crawler

Free proxy website

A collection of websites for obtaining free SOCKS/HTTPS/HTTP proxies.

Stars: ✭ 119 (-40.5%)

Mutual labels: crawler

arachnod

High-performance crawler for Node.js

Stars: ✭ 17 (-91.5%)

Mutual labels: crawler

Avbook

AV movie management system with crawlers for avmoo, javbus, and javlibrary; an online Japanese adult video library and adult video magnet link database.

Stars: ✭ 8,133 (+3966.5%)

Mutual labels: crawler

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (-76%)

Mutual labels: crawler

Ngmeta

Dynamic meta tags in your AngularJS single page application

Stars: ✭ 152 (-24%)

Mutual labels: crawler

Vulnx

vulnx 🕷️ is an intelligent bot and auto shell injector that detects vulnerabilities in multiple types of CMS { `wordpress , joomla , drupal , prestashop .. `}

Stars: ✭ 1,009 (+404.5%)

Mutual labels: crawler

Laosj

golang light-weight image crawler

Stars: ✭ 199 (-0.5%)

Mutual labels: crawler

Ok ip proxy pool

🍿 Crawler proxy IP pool (Python) 🍟 A pretty OK IP proxy pool

Stars: ✭ 196 (-2%)

Mutual labels: crawler

Github Spider

A crawler for analyzing GitHub repositories and users.