Winniekun / spider

Licence: other
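🌟 Powered by python3 (simple spider learning projects). Crawlers and scripts for Baidu Wenku, NetEase Cloud Music songs, Douban movies, GitHub, JD.com, Qzone (QQ空间), weather, a VIP video parsing helper, TED transcripts, a WiFi cracking script, setting Bing images as the desktop wallpaper, and more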

Programming Languages

python
139335 projects - #7 most used programming language
javascript
184084 projects - #8 most used programming language
HTML
75241 projects
julia
2034 projects

Projects that are alternatives of or similar to spider

go-movies
golang spider Crawler for movies
Stars: ✭ 168 (+35.48%)
Mutual labels:  spider
C64-WiFi-Modem-User-Port
A NodeMCU (ESP8266) based WiFi modem for the C64's user port
Stars: ✭ 49 (-60.48%)
Mutual labels:  wifi
163Music
163music spider by scrapy.
Stars: ✭ 60 (-51.61%)
Mutual labels:  spider
feedingbottle
FeedingBottle is an Aircrack-ng GUI, created with Fast Light User-Interface Designer ("FLUID").
Stars: ✭ 26 (-79.03%)
Mutual labels:  wifi
wireless-esp8266-dap
ESP8266 Wireless Debugger. Based on CMSIS-DAP v2.0.0. Optional 40MHz SPI acceleration, etc.
Stars: ✭ 154 (+24.19%)
Mutual labels:  wifi
FGRoute
Get your device ip address, router ip or wifi ssid
Stars: ✭ 128 (+3.23%)
Mutual labels:  wifi
ESP8266-WiFi-UART-transparent-bridge
Transparent serial communication sketch in Arduino IDE
Stars: ✭ 27 (-78.23%)
Mutual labels:  wifi
eewids
Easily Expandable Wireless Intrusion Detection System
Stars: ✭ 25 (-79.84%)
Mutual labels:  wifi
nodejs-meizitu
Full-site scrape of Meizitu (妹子图), about 10 GB of photo sets
Stars: ✭ 80 (-35.48%)
Mutual labels:  spider
WiFiSpi
SPI library for Arduino AVR and STM32F1 to connect to ESP8266
Stars: ✭ 55 (-55.65%)
Mutual labels:  wifi
Subbranch-China
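Bank and branch names. A crawler for the branch names of every bank in every region of China; the data comes from the WeChat merchant platform and has been organized into SQL files that can be imported directly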
Stars: ✭ 31 (-75%)
Mutual labels:  spider
ideal-alligator
PowerShell script to retrieve wifi ESSIDs and Passwords.
Stars: ✭ 24 (-80.65%)
Mutual labels:  wifi
UCAS-Helper
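Helper for UCAS (University of Chinese Academy of Sciences): campus network login, course resource download, automatic course evaluation, and grade lookup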
Stars: ✭ 105 (-15.32%)
Mutual labels:  wifi
OctoWifi-LEDs-Controller
LEDs driver for ESP32 ( support ART-NET, RGB888, RGB565, Z888 )
Stars: ✭ 16 (-87.1%)
Mutual labels:  wifi
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-58.06%)
Mutual labels:  spider
Apple-Signal
Connect Apple devices via bluetooth and wifi.
Stars: ✭ 27 (-78.23%)
Mutual labels:  wifi
qa
😚 Q & A website based on Spring Boot.
Stars: ✭ 46 (-62.9%)
Mutual labels:  spider
Rainbow-Wifi-Hack-Utility-Android
The program implements a Wi-Fi network brute-force method on the Android platform
Stars: ✭ 39 (-68.55%)
Mutual labels:  wifi
rpi-roam-webapp
Setup script and web application for a wireless Raspberry Pi bridge.
Stars: ✭ 13 (-89.52%)
Mutual labels:  wifi
network-interface
Operating-system network library for Node.js, used to obtain hardware status, network environment changes, etc.
Stars: ✭ 24 (-80.65%)
Mutual labels:  wifi

python3.x small crawler projects


Data I scraped while doing data analysis, treated as crawler practice 😸

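  • Scrape Chinese animation from Douban ---- 2017/10

  • Scrape all Qzone posts (说说) of QQ friends ---- 2017/11

  • Scrape information from Saikr (赛氪网) (unfinished) ---- 2017/11

  • Scrape Zhihu user information (based on 轮子哥, using scrapy) ---- 2017/11

  • Scrape WeChat (using itchat) ---- 2017/12

  • Cracking machine verification (unfinished) ---- 2017/12

  • Scrape Starbucks information ---- 2018/1

  • Scrape NetEase Cloud Music comments (continuously updated) ---- 2018/1

  • Scrape reviews of specific products on JD.com ---- 2018/1

  • Scrape Douban short reviews of Secret Superstar (神秘巨星) ---- 2018/2

  • Scrape GitHub --- 2018/2

  • VIP video parsing helper --- 2018/2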

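  • Scrape and download videos from the Douyin app (Fiddler) ---2018/2
  • Learning scrapy (following the official documentation) ---2018/3
  • Learning XPath ---2018/3
  • File downloader (browser downloads were too slow and I had not found a good download tool on Ubuntu, so I wrote a simple one myself) ---2018/3
  • Scrape the transcripts of TED videos, in preparation for later analysis
  • WiFi brute-force cracking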

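  • Added Baidu Wenku scraping (I have been using Baidu Wenku lately and it often warns that the copy/paste quota has been exceeded, so I wrote this script)

  • Concurrent scraping of IMDB data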

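Environment setup and walkthrough

1. Scraping Qzone posts (说说)

Steps:

  1. Obtain the cookies through a simulated login, because the parameters required by the request URL for the posts come from the cookies. The cookies can of course be obtained in other ways as well. The g_qzonetoken value is taken from the page source.
  2. Analyze the request URL for the posts, construct the parameters, and pass them in.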
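
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the regular expression assumes the token appears in the page source as an assignment of the form `g_qzonetoken = "..."`, and the parameter names `pos` and `num` are hypothetical placeholders for whatever the real request URL expects. The cookie list has the shape returned by Selenium's `driver.get_cookies()`.

```python
import re

# Assumed shape of the token assignment in the page source; adjust the
# pattern to whatever the real page actually contains.
TOKEN_PATTERN = re.compile(r'g_qzonetoken\s*=\s*"([^"]+)"', re.IGNORECASE)


def cookies_to_dict(selenium_cookies):
    """Flatten the list from driver.get_cookies() into {name: value}."""
    return {c["name"]: c["value"] for c in selenium_cookies}


def extract_qzonetoken(page_source):
    """Return the g_qzonetoken found in the page source, or None."""
    match = TOKEN_PATTERN.search(page_source)
    return match.group(1) if match else None


def build_params(page_source, offset=0, page_size=20):
    """Build query parameters for one page of posts.

    "pos" and "num" are hypothetical names used only for illustration.
    """
    token = extract_qzonetoken(page_source)
    if token is None:
        raise ValueError("g_qzonetoken not found; is the login complete?")
    return {"g_qzonetoken": token, "pos": offset, "num": page_size}


# Offline demonstration with made-up data (no browser or network needed).
fake_source = '<script>window.g_qzonetoken = "abc123";</script>'
fake_cookies = [{"name": "uin", "value": "o0123"}]
print(cookies_to_dict(fake_cookies))
print(build_params(fake_source, offset=20))
```

In a real run the page source would come from `driver.page_source` after login, and the cookie dict and parameters would be handed to `requests.get(url, params=..., cookies=...)`.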

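Environment:

  1. selenium
  2. requests

Notes

  1. If you use Chrome, make sure the chromedriver version matches your Chrome version.
  2. When using simulated login, set a suitable sleep time so that the rest of the program does not run before the login has actually happened (a check could be added; not done).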
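
The check mentioned in note 2 could replace the fixed sleep with polling. The helper below is a generic sketch, not code from this project: `wait_until` is a hypothetical name, and the condition passed to it is up to the caller. Selenium also ships its own `WebDriverWait` (in `selenium.webdriver.support.ui`) for the same purpose.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.5):
    """Poll condition() until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without the condition becoming truthy.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Example use with a Selenium driver (the cookie name is an assumption):
# wait_until(lambda: any(c["name"] == "p_skey" for c in driver.get_cookies()))
```

Polling returns as soon as the login is detected and fails loudly when it never happens, whereas a fixed sleep either wastes time or silently proceeds too early.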

TODO

  • Concurrent scraping
  • Support resuming an interrupted crawl
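
One possible way to approach both TODO items together is sketched below. It is an assumption about how they might be done, not the author's plan: pages are fetched with a thread pool, and the set of finished page numbers is written to a JSON checkpoint file so that a later run skips them. The names `crawl`, `fetch_page`, and the checkpoint file layout are all hypothetical.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def load_done(checkpoint_path):
    """Return the set of page numbers already fetched (empty if no file)."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path, encoding="utf-8") as f:
        return set(json.load(f))


def save_done(checkpoint_path, done):
    """Write the checkpoint to a temp file, then swap it into place."""
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(sorted(done), f)
    os.replace(tmp_path, checkpoint_path)


def crawl(pages, fetch_page, checkpoint_path, max_workers=4):
    """Fetch every page not yet in the checkpoint; return {page: result}."""
    done = load_done(checkpoint_path)
    pending = [p for p in pages if p not in done]
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_page, p): p for p in pending}
        for future in as_completed(futures):
            page = futures[future]
            try:
                results[page] = future.result()
            except Exception as exc:
                # A failed page stays out of the checkpoint and is retried
                # on the next run.
                print(f"page {page} failed: {exc}")
                continue
            done.add(page)
            save_done(checkpoint_path, done)  # only the main thread writes
    return results
```

Calling `crawl(range(100), fetch_page, "done.json")` a second time after an interruption would only request the pages missing from `done.json`.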