
Hexmagic / Baidu-Index

Licence: other
Accurate Baidu Index scraping

Programming Languages

python

Projects that are alternatives of or similar to Baidu-Index

YouTubeUploader
An automated, headless YouTube Uploader
Stars: ✭ 116 (+728.57%)
Mutual labels:  selenium
PhpScreenRecorder
A slim PHP wrapper around ffmpeg to record screen,best for recording your acceptance test using selenium, easy to use and clean OOP interface
Stars: ✭ 44 (+214.29%)
Mutual labels:  selenium
extensiveautomation-server
Extensive Automation server
Stars: ✭ 19 (+35.71%)
Mutual labels:  selenium
pyderman
Install Selenium-compatible Chrome/Firefox/Opera/PhantomJS/Edge webdrivers automatically.
Stars: ✭ 24 (+71.43%)
Mutual labels:  selenium
Crawler pubg.op.gg
This is a web crawler for pubg.op.gg, written by Ruichong Liu. It scrapes PUBG (PlayerUnknown's Battlegrounds) game data.
Stars: ✭ 15 (+7.14%)
Mutual labels:  selenium
WaWebSessionHandler
(DISCONTINUED) Save WhatsApp Web Sessions as files and open them everywhere!
Stars: ✭ 27 (+92.86%)
Mutual labels:  selenium
webdriverio-zap-proxy
Demo - how to easily build security testing for Web App, using Zap and Glue
Stars: ✭ 58 (+314.29%)
Mutual labels:  selenium
whatsapp-bot
Made with Python and Selenium, can be used to send multiple messages and send messages as characters made of emojis
Stars: ✭ 34 (+142.86%)
Mutual labels:  selenium
DeepLTranslator
The DeepL Translator is an API written in Java that translates via the DeepL website sentences. Without API key.
Stars: ✭ 45 (+221.43%)
Mutual labels:  selenium
selenium-openapi
The missing Selenium OpenAPI spec
Stars: ✭ 25 (+78.57%)
Mutual labels:  selenium
testbench
Vaadin TestBench is a tool for automated user interface testing of Vaadin Framework applications.
Stars: ✭ 20 (+42.86%)
Mutual labels:  selenium
selenified
The Selenified Test Framework provides mechanisms for simply testing applications at multiple tiers while easily integrating into DevOps build environments. Selenified provides traceable reporting for both web and API testing, wraps and extends Selenium calls to more appropriately handle testing errors, and supports testing over multiple browser…
Stars: ✭ 38 (+171.43%)
Mutual labels:  selenium
FaucetCryptoBot
A bot for FaucetCrypto a cryptocurrency faucet. The bot can currently claim PTC ads, main reward and all the shortlinks except exe.io and fc.lc.
Stars: ✭ 69 (+392.86%)
Mutual labels:  selenium
jdi-light
Powerful Framework for UI Automation Testing on Java
Stars: ✭ 84 (+500%)
Mutual labels:  selenium
Insta-Bot
Python bot using Selenium increasing Instagram Followers.
Stars: ✭ 62 (+342.86%)
Mutual labels:  selenium
python-data-from-web
API and web scraping workshops
Stars: ✭ 32 (+128.57%)
Mutual labels:  selenium
fb auto-commenter
Facebook auto comment script, without using API (BETA)
Stars: ✭ 25 (+78.57%)
Mutual labels:  selenium
codeigniter-tettei-apps
Sample applications from the book 『CodeIgniter徹底入門』 (CodeIgniter v3.1 edition)
Stars: ✭ 26 (+85.71%)
Mutual labels:  selenium
SeleniumDemo
Selenium automation test framework
Stars: ✭ 84 (+500%)
Mutual labels:  selenium
Selenium.HtmlElements.Net
Elements model for Selenium.WebDriver
Stars: ✭ 26 (+85.71%)
Mutual labels:  selenium

(⊙﹏⊙)

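I only noticed today that Baidu has had a change of heart: the index now returns the numbers directly, so image recognition is no longer needed. The new Baidu Index code is at https://github.com/Hexmagic/BaiduIndexNew.git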

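Advantages

Accurate Baidu Index scraping that combines the strengths of existing Baidu Index crawlers, making it both precise and easy to use.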

Usage

First, install the dependencies:

pip install -r requirement.txt

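Notes

  • Selenium's ChromeDriver must be downloaded separately, in the version that matches your Chrome browser.
  • The lxml package requires the VS Build Tools. You can find them on CSDN, or leave a comment on my blog and I will send them to you.
  • TensorFlow's latest supported Python version is 3.6, so Python 3.6 is recommended.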
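The Python version note above can be checked before installing anything. The following is only a minimal sketch: the helper name `is_recommended_python` is hypothetical and not part of the project, and the 3.6 target is taken from the note itself.

```python
import sys


def is_recommended_python(version=sys.version_info):
    """Return True when the interpreter is Python 3.6, as the note recommends."""
    # Compare only major and minor; the patch level does not matter here.
    return tuple(version[:2]) == (3, 6)


if __name__ == "__main__":
    if is_recommended_python():
        print("Python 3.6 detected.")
    else:
        print("Python %d.%d detected; the README recommends 3.6." % tuple(sys.version_info[:2]))
```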

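Once the dependencies are installed, you can start scraping. You must log in first: run the login command, which saves the session to a local cookie file for later runs. With that login file in place, you can run the script directly to scrape keywords.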
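The login step described above amounts to persisting the browser cookies and restoring them later. The sketch below shows one way this could look. It is not the project's own login code: the file name `cookies.json` and both helper names are assumptions, and the only Selenium calls it relies on are `get_cookies()` and `add_cookie()`.

```python
import json
from pathlib import Path

# Hypothetical file name; the project may store its cookies elsewhere.
COOKIE_FILE = Path("cookies.json")


def save_cookies(driver, path=COOKIE_FILE):
    """Write the current browser session's cookies to a local JSON file."""
    cookies = driver.get_cookies()  # Selenium returns a list of dicts
    Path(path).write_text(json.dumps(cookies), encoding="utf-8")
    return len(cookies)


def load_cookies(driver, path=COOKIE_FILE):
    """Add previously saved cookies to a driver that is already on the same domain."""
    cookies = json.loads(Path(path).read_text(encoding="utf-8"))
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

With a real Selenium driver, you would open the site, log in by hand, and call `save_cookies(driver)`. On later runs, open the site first (Selenium only accepts cookies for the domain currently loaded), call `load_cookies(driver)`, and refresh the page.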

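Generating model data

The model is trained with TensorFlow as the backend. Here is a brief explanation of how to generate the training and test samples: switch to the model directory and run the script below.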

python generate_date.py

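Follow the prompts to generate the test and training samples. The generated data is placed in the test and train directories respectively.

Make sure these directories do not exist before generating.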
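The precondition above can be verified before running the generator. This is a small sketch, assuming it is run from the model directory and that the outputs are named `train` and `test` as stated. The helper name `existing_output_dirs` is hypothetical.

```python
from pathlib import Path


def existing_output_dirs(base=".", names=("train", "test")):
    """Return the output directories under base that already exist."""
    return [Path(base) / name for name in names if (Path(base) / name).exists()]


if __name__ == "__main__":
    leftovers = existing_output_dirs()
    if leftovers:
        # Leave deletion to the user so that no data is removed by accident.
        print("Remove or rename before generating:", ", ".join(map(str, leftovers)))
    else:
        print("OK: train and test do not exist yet.")
```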

Training the model

Once the training and test data have been generated, run the command below:

python train_model.py

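This command produces a serialized model named model.h5 (the code ships with a pre-trained model).

Screenshot of the model's accuracy: trained on augmented data, it reaches 97% accuracy. On Baidu's original images it can probably reach 99%.

Training results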
