1414044032 / Sina_Spider

Licence: other

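A Sina crawler based on Python + Selenium. After a simulated login it saves the cookie so that the logged-in state persists. Enter a keyword to crawl the popular Weibo posts related to that keyword.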

Programming Languages

python

139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Sina Spider

Capturer

Capture pictures from websites such as sina, lofter, huaban, and so on

Stars: ✭ 76 (+204%)

Mutual labels: spider, sina

Shadow

Computer science fundamentals, data structures, design patterns, and an implementation of the Tomcat middleware

Stars: ✭ 19 (-24%)

Mutual labels: spider

fetchurls

A bash script to spider a site, follow links, and fetch urls (with built-in filtering) into a generated text file.

Stars: ✭ 97 (+288%)

Mutual labels: spider

yutto

🧊 A cute and willful Bilibili video downloader (bilili V2)

Stars: ✭ 383 (+1432%)

Mutual labels: spider

ComicSpider

A crawler for original-resolution images from the desktop version of the 动漫之家 manga site

Stars: ✭ 67 (+168%)

Mutual labels: spider

js block

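Research and study of various blocking techniques: anti-crawler measures, ad blocking, ad-injection prevention, countering scalpers, and so on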

Stars: ✭ 59 (+136%)

Mutual labels: spider

bangumi yearly report

No description or website provided.

Stars: ✭ 24 (-4%)

Mutual labels: spider

weibo topic

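Collection of Weibo topic keywords and personal Weibo posts, one-click deletion of Weibo posts; cookie obtained with selenium, processing done with requests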

Stars: ✭ 28 (+12%)

Mutual labels: spider

NScrapy

NScrapy is a .NET Core cross-platform distributed spider framework which provides an easy way to write your own spider

Stars: ✭ 88 (+252%)

Mutual labels: spider

job-spider

Multithreaded crawler for recruitment websites commonly used in the internet industry

Stars: ✭ 28 (+12%)

Mutual labels: spider

scripter

Some scripts and tools

Stars: ✭ 20 (-20%)

Mutual labels: spider

Bilibili manga download

A Bilibili manga download tool with a graphical interface

Stars: ✭ 52 (+108%)

Mutual labels: spider

robotstxt

robots.txt file parsing and checking for R

Stars: ✭ 65 (+160%)

Mutual labels: spider

Tieba-Birthday-Spider

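Baidu Tieba birthday crawler: scrapes the birthdays of members of a Tieba forum and automatically sends greetings on the corresponding dates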

Stars: ✭ 28 (+12%)

Mutual labels: spider

devsearch

A web search engine built with Python which uses TF-IDF and PageRank to sort search results.

Stars: ✭ 52 (+108%)

Mutual labels: spider

PTT Beauty Spider

Crawler and image downloader for the PTT Beauty board

Stars: ✭ 47 (+88%)

Mutual labels: spider

get LibSeat

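Automatic reservation and check-in program for the 利昂 library reservation system. Supports the library systems of 38 universities, including Renmin University of China, Beijing Normal University, University of Jinan, and Harbin Institute of Technology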

Stars: ✭ 39 (+56%)

Mutual labels: spider

zhihu-crawler

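A from-scratch implementation that crawls Zhihu on a schedule, mines valuable information from it, and visualizes the crawled data for display on a web page.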

Stars: ✭ 56 (+124%)

Mutual labels: spider

Spider

The Spider project will be continually updated with the crawling methods I have learned and used.

Stars: ✭ 16 (-36%)

Mutual labels: spider

crawler

A PHP crawler

Stars: ✭ 13 (-48%)

Mutual labels: spider


Sina_Spider

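A Sina crawler based on Python + Selenium. After a simulated login it saves the cookie so that the logged-in state persists. Enter a keyword to crawl the popular Weibo posts related to that keyword.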

Environment and tools:

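Python 3.6 + selenium + firefox_Driver. firefox_Driver download links: https://pan.baidu.com/s/1WGo7kVGsfRlE2XFvQRPHJA https://github.com/mozilla/geckodriver/releases Make sure the driver version matches the browser version. After downloading the driver, you can place it in the C:\Python36\Scripts directory; otherwise you need to configure the environment variables and add the driver's directory to Path. Firefox must also be installed: download it from the official website.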
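Before running the crawler, it can help to confirm that the driver really is reachable through Path. The following is a minimal sketch and not part of the project: the helper name find_driver is hypothetical, and it assumes the downloaded executable is named geckodriver, as in the mozilla/geckodriver releases.

```python
import shutil


def find_driver(name="geckodriver"):
    """Return the full path of the driver executable if it is on PATH, else None.

    The default name "geckodriver" is an assumption based on the
    mozilla/geckodriver release artifacts.
    """
    return shutil.which(name)


if __name__ == "__main__":
    location = find_driver()
    if location is None:
        print("geckodriver was not found on PATH; add its directory to Path first.")
    else:
        print(f"geckodriver found at {location}")
```

If the lookup returns None, either move the executable into a directory that is already on Path (such as C:\Python36\Scripts) or add its directory to Path.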

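In main, change the account and password to your own. Watch the browser window that opens during login to see whether a CAPTCHA appears. In testing, logging in with an email address generally does not trigger a CAPTCHA, while logging in with a phone number does, as does logging in from an unusual location. If a CAPTCHA appears, add time.sleep(20) before driver.find_element_by_css_selector("div.info_list:nth-child(6) > a:nth-child(1)").click() so that the driver pauses and you can enter the CAPTCHA by hand (within 20 seconds). After that the cookie can be obtained normally. The cookie is saved as a txt file in the same directory, so later logins do not need the simulated login.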
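The cookie round trip described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's own code: the file name cookies.txt, the JSON encoding, and the helper names save_cookies, load_cookies, and restore_session are all hypothetical. The only WebDriver calls used are get, add_cookie, and (in the usage note) get_cookies.

```python
import json
from pathlib import Path

COOKIE_FILE = Path("cookies.txt")  # hypothetical file name, kept next to the script


def save_cookies(cookies, path=COOKIE_FILE):
    """Write a list of cookie dicts (as returned by driver.get_cookies()) to a text file."""
    Path(path).write_text(json.dumps(cookies, ensure_ascii=False), encoding="utf-8")


def load_cookies(path=COOKIE_FILE):
    """Return the saved cookie dicts, or None if no cookie file exists yet."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def restore_session(driver, cookies, url):
    """Re-apply saved cookies to a WebDriver session instead of logging in again."""
    driver.get(url)  # the browser must be on the cookie's domain before add_cookie
    for cookie in cookies:
        driver.add_cookie(cookie)
    driver.get(url)  # reload so the page is requested with the cookies attached
```

After a successful simulated login, calling save_cookies(driver.get_cookies()) stores the session. On the next run, if load_cookies() returns a list, pass it to restore_session together with the site URL; if it returns None, or the cookies have expired, fall back to the simulated login.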
