hunterhug / Lizard

Licence: other
💐 Full Amazon Automatic Download

Projects that are alternatives of or similar to Lizard

Spoon
🥄 A package for building specific Proxy Pool for different Sites.
Stars: ✭ 173 (+321.95%)
Mutual labels:  crawler, spider, distributed
Xxl Crawler
A distributed web crawler framework (XXL-CRAWLER).
Stars: ✭ 561 (+1268.29%)
Mutual labels:  crawler, spider, distributed
Amazonbigspider
😱 Fully automatic distributed Amazon spider | Distributed product-selection crawler for four international Amazon sites | account: admin, password: adminadmin
Stars: ✭ 140 (+241.46%)
Mutual labels:  amazon, crawler, spider
Jlitespider
A lite distributed Java spider framework :-)
Stars: ✭ 151 (+268.29%)
Mutual labels:  crawler, spider, distributed
Haipproxy
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+12078.05%)
Mutual labels:  crawler, spider, distributed
Newcrawler
Free Web Scraping Tool with Java
Stars: ✭ 589 (+1336.59%)
Mutual labels:  crawler, spider
Baiduimagespider
A super lightweight Baidu image crawler
Stars: ✭ 591 (+1341.46%)
Mutual labels:  crawler, spider
Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+1500%)
Mutual labels:  crawler, spider
Nodespider
[DEPRECATED] Simple, flexible, delightful web crawler/spider package
Stars: ✭ 33 (-19.51%)
Mutual labels:  crawler, spider
Fbcrawl
A Facebook crawler
Stars: ✭ 536 (+1207.32%)
Mutual labels:  crawler, spider
Grab Site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
Stars: ✭ 680 (+1558.54%)
Mutual labels:  crawler, spider
Crawler
A high performance web crawler in Elixir.
Stars: ✭ 781 (+1804.88%)
Mutual labels:  crawler, spider
Douyin
API of DouYin for Humans used to Crawl Popular Videos and Musics
Stars: ✭ 580 (+1314.63%)
Mutual labels:  crawler, spider
Netdiscovery
NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.
Stars: ✭ 573 (+1297.56%)
Mutual labels:  crawler, spider
Icrawler
A multi-thread crawler framework with many builtin image crawlers provided.
Stars: ✭ 629 (+1434.15%)
Mutual labels:  crawler, spider
Creeper
🐾 Creeper - The Next Generation Crawler Framework (Go)
Stars: ✭ 762 (+1758.54%)
Mutual labels:  crawler, spider
Zhihu Crawler
zhihu-crawler is a high-performance, distributed Java crawler project that supports a free HTTP proxy pool and horizontal scaling.
Stars: ✭ 890 (+2070.73%)
Mutual labels:  crawler, spider
Torbot
Dark Web OSINT Tool
Stars: ✭ 821 (+1902.44%)
Mutual labels:  crawler, spider
Scrapit
Scraping scripts for various websites.
Stars: ✭ 25 (-39.02%)
Mutual labels:  crawler, spider
Disec
Distributed Image Search Engine Crawler
Stars: ✭ 11 (-73.17%)
Mutual labels:  crawler, distributed

Project: Lizard

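This repository contains the server-side code of a site-wide product-selection tool for cross-border e-commerce. The visual front end lives in a separate repository: https://github.com/hunterhug/lizardWeb

A newer, friendlier product for fine-grained product selection has since been completed: the Taotie product-selection server, a high-performance fine-grained product-selection system written in Golang.

I. Introduction

This project is written in Golang and crawls quickly under high concurrency. The front end is built with beego. It is an antique by now: it took two years to develop and still mostly runs.

The front-end and back-end components of this project have been split out into separate packages: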

  1. 💐Marmot | Web Crawler/HTTP protocol Download Package 🐭
  2. 💐Rabbit | Beego Simple Web| Easy use for everyone🐰

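For details on using the platform, see the user manual 亚马逊大数据智能选款平台使用手册v1.2.pdf (Amazon Big Data Intelligent Product Selection Platform User Manual v1.2).

Latest screenshots:

1. Overview

Purpose: product selection, especially for companies running cross-border e-commerce on Amazon (Amazon China is not supported). Core strengths: four international sites (US, UK, Japan, Germany), distributed crawling, and an accompanying visual back office.

About product selection: the top 200,000 ranked products are available for you to choose from.

The Amazon crawler supports: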

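  1. Selectable proxy mode for list pages and detail pages
  2. Cookie persistence across multiple simulated browsers
  3. Automatic proxy switching once robot detections reach a threshold
  4. Automatic shutdown when an expiry date is detected
  5. Periodic scanning of the IP pool to refill proxy IPs
  6. Distributed, cross-platform crawling
  7. Configurable high-concurrency crawl processes
  8. Page de-duplication by default
  9. Logging
  10. An accompanying visual website that shows the data from several angles: sub-category data, top-level category data, ASIN data, and category data. It shows the history of every ASIN, such as changes in rank, price, rating, and reviews. Some data can be exported, and the site supports RBAC permissions, so view and use rights can be assigned per data section.
  11. Web-based crawler monitoring: view the crawl status for the current period, crawl progress, and IP consumption. Starting and stopping crawlers from the web, which would make it a full SaaS, is still to do.
  12. Custom IPs can be added, such as IPs obtained from the APIs of other proxy-IP sites
  13. Optional saving of HTML files to local disk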
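
Item 3 in the list above (switching proxies once robot detections reach a threshold) can be illustrated with a small sketch. This is not Lizard's actual code: `ProxyRotator`, `ReportRobot`, the proxy addresses, and the threshold value are all hypothetical names chosen for illustration, and the sketch assumes a non-empty proxy list.

```go
package main

import "fmt"

// ProxyRotator hands out one proxy at a time and moves to the next one
// after too many robot-check pages were seen on the current proxy.
// Hypothetical type for illustration; not part of Lizard's API.
type ProxyRotator struct {
	proxies   []string // candidate proxy addresses (assumed non-empty)
	current   int      // index of the proxy in use
	robotHits int      // robot checks seen on the current proxy
	threshold int      // number of hits that triggers a switch
}

func NewProxyRotator(proxies []string, threshold int) *ProxyRotator {
	return &ProxyRotator{proxies: proxies, threshold: threshold}
}

// Current returns the proxy to use for the next request.
func (r *ProxyRotator) Current() string {
	return r.proxies[r.current]
}

// ReportRobot records one robot-check page. When the count reaches the
// threshold it switches to the next proxy (wrapping around), resets the
// counter, and returns true.
func (r *ProxyRotator) ReportRobot() bool {
	r.robotHits++
	if r.robotHits < r.threshold {
		return false
	}
	r.current = (r.current + 1) % len(r.proxies)
	r.robotHits = 0
	return true
}

func main() {
	// Placeholder addresses; a real pool would be filled from an IP source.
	r := NewProxyRotator([]string{"10.0.0.1:8080", "10.0.0.2:8080"}, 3)
	for i := 1; i <= 4; i++ {
		switched := r.ReportRobot()
		fmt.Printf("hit %d: switched=%v proxy=%s\n", i, switched, r.Current())
	}
}
```

In this sketch the third reported hit triggers the switch to the second proxy, and the counter starts again from zero. A real crawler would also need locking if several goroutines share one rotator.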

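Distributed, highly concurrent, cross-platform, multi-site, highly configurable, and very fault tolerant: these are the crawler's defining features. With enough machines and proxy IPs, it can crawl several million product records per site per day.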

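2. A Quick Look

Categories: you can change the number of pages to crawl and whether a category is crawled at all.

Sub-category data: essentially the Top 100 products.

Top-level category data: very detailed, including the top-level category rank. It can be filtered with complex query conditions and downloaded.

Product trends: you can see how a product's rank and price change over ten-plus days.

Exported Excel file

3. Software Architecture

Old:

New:

The categories are roughly as follows.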

+----------------------------+-----------------+
| bigpname                   | count(bigpname) |
+----------------------------+-----------------+
| Amazon Launchpad           |              22 |
| Appliances                 |              34 |
| Arts Crafts & Sewing       |             470 |
| Automotive                 |            3162 |
| Baby                       |             333 |
| Beauty & Personal Care     |             406 |
| Camera & Photo             |             214 |
| Cell Phones & Accessories  |              61 |
| Clothing Shoes & Jewelry   |            1803 |
| Collectible Coins          |               3 |
| Computers & Accessories    |             294 |
| Electronics                |            1292 |
| Entertainment Collectibles |              43 |
| Gift Cards                 |              19 |
| Grocery & Gourmet Food     |            1324 |
| Health & Household         |            1185 |
| Home & Kitchen             |            1903 |
| Industrial & Scientific    |            3325 |
| Kitchen & Dining           |             738 |
| Musical Instruments        |             612 |
| Office Products            |             736 |
| Patio Lawn & Garden        |             590 |
| Pet Supplies               |             499 |
| Prime Pantry               |               1 |
| Sports & Outdoors          |            2686 |
| Sports Collectibles        |              57 |
| Tools & Home Improvement   |            1666 |
| Toys & Games               |             791 |
+----------------------------+-----------------+

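Disclaimer

This product is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International. You may use it for education and study, but not for commercial profit.

Regarding copyright: crawling carries risk, and the author accepts no liability arising from this open-source project.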

	All rights reserved
	Attribution-NonCommercial-NoDerivatives 4.0 International
	Notice: The copyright of the following code belongs to hunterhug. Please do not distribute or modify it.
	You may use the code for educational purposes, but companies and individuals may not use it commercially (it must not be used for profit without authorization).
	For commercial licensing, contact Mail:[email protected] or QQ:459527502

	2017.7 by hunterhug

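Support

For how to deploy the system, see the setup guide. You can also refer to a worked example of installing the product on Alibaba Cloud.

Support via WeChat:

Support via Alipay: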
