Using Selenium, Neha scraped data about 35 top selling sneakers of Nike and Adidas from stockx.com. She used this data to draw insights about sneaker resales.

Stars: ✭ 32 (-88.28%)

Mutual labels: selenium

selenium-php

PHP Selenium data collection

Stars: ✭ 18 (-93.41%)

Mutual labels: selenium

frontend testing

Repository containing sample code used in a Frontend Testing workshop

Stars: ✭ 14 (-94.87%)

Mutual labels: selenium

Monocle

PowerShell Web Automation module, made to make automating websites easier

Stars: ✭ 47 (-82.78%)

Mutual labels: selenium

phoenix.webui.framework

WebDriver-based framework for automated web UI testing

Stars: ✭ 118 (-56.78%)

Mutual labels: selenium

TeslaPy

A Python module to use the Tesla Motors Owner API

Stars: ✭ 216 (-20.88%)

Mutual labels: selenium

scrape-youtube-channel-videos-url

This Python script is used to scrape all the video links from a youtube channel.

Stars: ✭ 34 (-87.55%)

Mutual labels: selenium

selenium-grid-docker-swarm

web scraping in parallel with Selenium Grid and Docker

Stars: ✭ 32 (-88.28%)

Mutual labels: selenium

Raspagem-de-dados-para-iniciantes

Raspagem de dados para iniciante usando Scrapy e outras libs básicas

Stars: ✭ 113 (-58.61%)

Mutual labels: web-crawler

justtestlah

Dynamic test framework for web and mobile applications

Stars: ✭ 43 (-84.25%)

Mutual labels: selenium

carina

Carina automation framework: Web, Mobile, API, DB etc testing...

Stars: ✭ 652 (+138.83%)

Mutual labels: selenium

RARBG-scraper

With Selenium headless browsing and CAPTCHA solving

Stars: ✭ 38 (-86.08%)

Mutual labels: selenium

frameworkium-examples

Sample project which utilises frameworkium-core, a framework for writing maintainable Selenium and REST API tests and facilitates reporting and integration to JIRA.

Stars: ✭ 52 (-80.95%)

Mutual labels: selenium

fBrowser

Helpful Selenium functions to make web-scraping easier and faster

Stars: ✭ 16 (-94.14%)

Mutual labels: selenium

Python-Studies

All studies about python

Stars: ✭ 56 (-79.49%)

Mutual labels: selenium

telenium

Automation for Kivy Application

Stars: ✭ 56 (-79.49%)

Mutual labels: selenium

robotframework-seleniumtestability

Extension for SeleniumLibrary that provides manual and automatic waiting for asynchronous events like fetch, xhr, etc.

Stars: ✭ 34 (-87.55%)

Mutual labels: selenium


WeReadScan

About

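A scraper library that scans books on WeRead (微信读书) and converts them into local PDF files.

Why this was developed

It has to be said that WeRead is a very good platform. Its shortcoming is obvious, though: users who buy a book can only read it, or add text annotations, inside the WeRead application ╮(╯▽╰)╭, which falls well short of what a purchased paper book allows. For example, the author is used to taking notes in notebook apps on an iPad, and WeRead has no support for handwritten notes with a pencil.

So, since WeRead does not provide this, the only option was to solve it myself. After 2 days of development this scraper script was ready, and handwritten notes are happily possible at last o(￣▽￣)ブ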

Get started

WeReadScan (original version)

pip install WeReadScan

WeReadScan-HTML(html-scrape version)

pip install WeReadScan-HTML

To use the WeReadScan-HTML version, see https://github.com/Algebra-FUN/WeReadScan/tree/html-variant

This project depends on selenium, so a basic understanding of selenium is required.

Demo

Without further ado, here is the code:

from selenium.webdriver import Chrome, ChromeOptions
from WeReadScan import WeRead

# Important: configure the webdriver to run headless
chrome_options = ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument("--disable-blink-features=AutomationControlled")
chrome_options.add_argument('disable-infobars')
chrome_options.add_argument('log-level=3')

# Start the webdriver (--headless)
headless_driver = Chrome(options=chrome_options)

# Start in debug mode, which keeps the cached PNG files
with WeRead(headless_driver, debug=True) as weread:
    # Important: log in
    weread.login()
    # Scrape the book at the given URL and save it to the current folder
    weread.scan2pdf('https://weread.qq.com/web/reader/2c632ef071a486a92c60226')

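Sample scan result:

Notes:

The webdriver must be started in headless mode.
A complete book can only be scanned after logging in; without logging in, only the parts that do not need unlocking can be scanned.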
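Because the headless requirement is easy to forget, a small guard can check the options object before the driver is created. The following is a minimal sketch, not part of WeReadScan: `ensure_headless` is a hypothetical helper name, and it only assumes that the options object exposes `arguments` and `add_argument`, as Selenium's `ChromeOptions` does.

```python
def ensure_headless(options):
    """Add a headless flag to a Selenium-style options object if none is set.

    `options` is assumed to expose `arguments` (a list of str) and
    `add_argument(str)`, which is the interface ChromeOptions provides.
    Returns True if the flag had to be added, False if it was already there.
    """
    # Accept both the plain flag and variants such as "--headless=new".
    already_headless = any(
        arg.startswith("--headless") for arg in options.arguments
    )
    if not already_headless:
        options.add_argument("--headless")
    return not already_headless
```

With Selenium installed, the helper would be called as `ensure_headless(chrome_options)` right before `Chrome(options=chrome_options)`.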

API Reference

WeRead

WeReadScan.WeRead(headless_driver)

A proxy for the WeRead web page, used to scan books

Args

headless_driver: a WebDriver instance configured to run headless

Returns

WeReadInstance

Usage

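from selenium.webdriver import Chrome, ChromeOptions
from WeReadScan import WeRead

chrome_options = ChromeOptions()
chrome_options.add_argument('--headless')
# Pass the options with the `options` keyword, as in the demo above
headless_driver = Chrome(options=chrome_options)
weread = WeRead(headless_driver)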

Login

WeReadScan.WeRead.login(wait_turns=15)

Displays a QR code for logging in to WeRead

Args

wait_turns: number of rounds to wait for the login QR code to be scanned

Usage

weread.login()
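The idea of waiting in "rounds" can be illustrated with a generic polling loop. This is only a sketch of the general pattern and makes no claim about how `login` is implemented internally: `wait_in_rounds`, `is_done`, and `interval` are hypothetical names introduced for illustration.

```python
import time


def wait_in_rounds(is_done, wait_turns=15, interval=1.0, sleep=time.sleep):
    """Poll `is_done()` up to `wait_turns` times, pausing between rounds.

    Returns True as soon as `is_done()` is truthy, or False once every
    round has been used up without success.
    """
    for _ in range(wait_turns):
        if is_done():
            return True
        # Not done yet: wait before starting the next round.
        sleep(interval)
    return False
```

Under this reading, a larger `wait_turns` simply gives the user more time to scan the QR code before the attempt is abandoned.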

Scan2pdf

WeReadScan.WeRead.scan2pdf(self, book_url, save_at='.', binary_threshold=95, quality=90, show_output=True,font_size_index=1)

Scans a WeRead book, converts it to PDF, and saves it locally

Args

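Parameter	Type	Default	Description
book_url	str	required	URL of the book to scan
save_at	str	'.'	Where to save the output
binary_threshold	int	200	Threshold used for binarization
quality	int	100	Quality of the scanned PDF
show_output	bool	True	Whether to open the generated PDF file when the method finishes
font_size_index	int	1	Font size setting (corresponds to the WeRead font size)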
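To make the role of a binarization threshold concrete, here is a minimal pure-Python sketch on a tiny grayscale "page". It assumes the common convention that values above the threshold become white and the rest become black; the exact rule and image library used by WeReadScan are not stated here, and `binarize` is a hypothetical name.

```python
def binarize(gray_rows, threshold=200):
    """Map each 0-255 grayscale value to 255 (white) if it is above
    `threshold`, otherwise to 0 (black)."""
    return [
        [255 if value > threshold else 0 for value in row]
        for row in gray_rows
    ]


# A 2x3 "page": light background pixels and darker text pixels.
page = [
    [250, 230, 40],
    [210, 120, 10],
]

print(binarize(page, threshold=200))  # [[255, 255, 0], [255, 0, 0]]
print(binarize(page, threshold=95))   # [[255, 255, 0], [255, 255, 0]]
```

Under this convention, a lower threshold turns more mid-gray pixels white, so faint strokes may vanish, while a higher threshold keeps more of them black.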

Usage

weread.scan2pdf('https://weread.qq.com/web/reader/a57325c05c8ed3a57224187kc81322c012c81e728d9d180')
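When several books need scanning, the call can be wrapped in a loop that passes the tuning parameters explicitly, which also avoids relying on the defaults (the signature and the parameter table above list different values for `binary_threshold` and `quality`). The sketch below is an assumption-laden illustration: `scan_books` is a hypothetical helper, and it only relies on `weread` having a `scan2pdf` method with the keyword arguments shown in the signature.

```python
def scan_books(weread, book_urls, save_at=".", binary_threshold=200, quality=100):
    """Scan every URL in `book_urls` with the same explicit settings.

    Returns a dict mapping each URL to None on success, or to the
    exception that was raised while scanning it.
    """
    results = {}
    for url in book_urls:
        try:
            weread.scan2pdf(
                url,
                save_at=save_at,
                binary_threshold=binary_threshold,
                quality=quality,
                show_output=False,  # do not open each PDF during a batch run
            )
            results[url] = None
        except Exception as error:  # keep going so one failure does not stop the batch
            results[url] = error
    return results
```

Inside the `with WeRead(...) as weread:` block from the demo, this would be called after `weread.login()`, for example `scan_books(weread, urls, save_at='.')`.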

Disclaimer

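This script may only be used to scrape books you have already purchased, for private study. Commercial use and online redistribution of the results are prohibited; please respect the interests of WeRead.
If a user uses this script for improper purposes, the responsibility lies with that user, and the author accepts none.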


Algebra-FUN / WeReadScan
