Top 615 crawler open source projects

Crawler For Github Trending
🕷️ A Node.js crawler for GitHub Trending.
Proxy pool
Python crawler proxy IP pool (proxy pool)
Gain
Web crawling framework based on asyncio.
Fun crawler
Crawl some pictures for fun
Sitemap Generator Crawler
Script that generates a sitemap by crawling a given URL
Douyin crawler
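Douyin crawler (TikTok crawler): Douyin data collection API and watermark removal for Douyin videos. Claims a 100% success rate, with no server and no proxy IP required.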
Bitextor
Bitextor generates translation memories from multilingual websites.
Scrapingoutsourcing
ScrapingOutsourcing focuses on sharing crawler code, aiming to add one new crawler each week.
Gocrawl
Polite, slim and concurrent web crawler.
Datmusic Api
Alternative for VK Audio API
Yispider
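A distributed crawler platform that helps you manage and develop crawlers more effectively. It ships with a built-in set of crawler definition rules (templates): use the templates to define crawlers quickly, or use it as a framework and develop crawlers by hand. (A hobby project, updated whenever it becomes annoying to use.)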
Instagram Scraper
Scrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper, bot
Crawler
An easy-to-use, powerful crawler implemented in PHP. Can execute JavaScript.
Weibo wordcloud
Scrapes Weibo data by keyword, then generates a word cloud
Python3 Spider
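Python crawlers in practice: simulated login for major websites, including but not limited to slider verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️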
Ngmeta
Dynamic meta tags in your AngularJS single page application
Jlitespider
A lite distributed Java spider framework :-)
Ptt Alertor
📢 Ptt article notification bot! Get notified of Ptt articles in real time
Dxy Covid 19 Crawler
COVID-19/2019-nCoV real-time infection crawler and API
Cocrawler
CoCrawler is a versatile web crawler built using modern tools and concurrency.
Rendora
Dynamic server-side rendering using headless Chrome to effortlessly solve the SEO problem for modern JavaScript websites
Pachong
Some crawler code
Httpcode.core
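Simple, easy to use, and efficient: an opinionated open-source .NET HTTP request framework. Can be used to build crawlers, make API requests, and more.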
Th Music Video Generator
Touhou Project random music video generator/player that crawls images and videos from websites to generate music videos.
Javpy
Enjoy driving on a Javascriptive (originally Pythonic) way to Japanese AV!
Crawler
Go process used to crawl websites
Python Dcdownloader
A fully asynchronous batch manga downloader (crawler) for Dongman Zhijia (dmzj), written in Python
Soksaccounts
🔥 Shadowsocks account crawler
Google Play Scraper
Google Play scraper for Python, inspired by <facundoolano/google-play-scraper>
Robots Txt
Determine if a page may be crawled from robots.txt, robots meta tags and robot headers
Oddish
Crawls all CS:GO skins from a website.
Amazonbigspider
😱 Fully automatic distributed Amazon spider | Distributed collection and product selection across four Amazon international sites | Account: admin, password: adminadmin
Search
An Open Source Search Engine
Go spider
[Crawler framework (golang)] An awesome concurrent Go crawler (spider) framework. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.
Koreanewscrawler
A news crawler built to collect large volumes of news data.
Zhihu Spider
A multithreaded Python crawler that fetches Zhihu user profile page information.
Onegram
This repository is no longer maintained.
4chan Downloader
Python 3 script to continuously download all images/webms from multiple 4chan threads simultaneously, without installation
Goclone
Website cloner that uses goroutines to clone websites to your computer within seconds.
Newspaper
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Mm131
Image crawler for the MM131 website 🚨
Digger
Digger is a powerful and flexible web crawler implemented in pure Go
Weibo Topic Spider
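Weibo super-topic crawler with Weibo word-frequency statistics, sentiment analysis, and simple classification. Now also includes data crawled from the pneumonia super topic.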
Kuaishou Crawler
As you can see, a Kuaishou crawler
Sina Weibo Album Downloader
Multithreaded download of all HD photos/pictures from a user's Sina Weibo album.
Squidwarc
Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
Fontobfuscator
Font obfuscation service
Crawlab Lite
Lite version of the Crawlab crawler management platform
Skill Share Crawler Dl
Download Skillshare videos by ID or by class
Qqmusicspider
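Scrapy-based QQ Music crawler (QQ Music Spider) that scrapes song information, lyrics, featured comments, and more. Also shares a music corpus of 490,000+ items from the top 6,400 mainland Chinese, Hong Kong, and Taiwanese artists on QQ Music.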
Pspider
A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560