chenlongqiang / selenium-php

Licence: other
Data scraping with PHP and Selenium

Programming Languages

PHP
TSQL

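Scraping data with PHP and Selenium: a hands-on, step-by-step tutorial

Tips

This project uses the task listings on Zhubajie (猪八戒) as its scraping example and is intended for learning and discussion only. Read the target site's robots.txt before scraping.

Do not use it for anything illegal; you alone bear the consequences of misuse.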

Runtime environment and dependencies

Runtime: PHP 7.1, Redis 4.0, MySQL 5.6
Dependencies: Java, Chrome, chromedriver, Selenium
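
Before going further, it can help to confirm that the command-line tools are actually on your PATH. The following is a minimal sketch, not part of the project: `missing_commands` is a hypothetical helper, and the command names in the example call are assumptions that may differ on your system (Chrome in particular is installed under different names on different platforms).

```shell
# Print every command from the argument list that is not found on PATH.
# Returns non-zero if at least one command is missing.
missing_commands() {
  local status=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      status=1
    fi
  done
  return "$status"
}

# Example call (command names are assumptions; adjust to your install):
#   missing_commands php java chromedriver redis-server mysql
```

An empty output means every listed command was found.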

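All dependencies are already packaged for download on Baidu Netdisk (百度网盘)

Link: https://pan.baidu.com/s/1gbSckvixLMbW5JB3eaY6dQ  Extraction code: 29qb

If you would rather not use the netdisk, download each dependency from the links below

Dependency 1: Java JDK 8 download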

https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Dependency 2: Chrome download (my version: 76.0.3809.132)

https://www.chromedownloads.net/

Dependency 3: chromedriver download (my version: 72.0.3626.69)

https://chromedriver.storage.googleapis.com/index.html?path=72.0.3626.69/
Other versions:
https://chromedriver.storage.googleapis.com/index.html
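
ChromeDriver releases are numbered after the Chrome versions they target, so a Chrome/chromedriver pair with different major versions is one thing to check if a browser session fails to start. The versions listed above differ in their major number (76 vs. 72), so this may or may not matter in your setup. A minimal sketch of such a check follows; `major_version` and `same_major` are hypothetical helper names, not part of the project.

```shell
# Extract the major version from a dotted version string such as 76.0.3809.132.
major_version() {
  printf '%s\n' "${1%%.*}"
}

# Succeed only when both version strings share the same major version.
same_major() {
  [ "$(major_version "$1")" = "$(major_version "$2")" ]
}

# Example call with the versions listed above:
#   same_major "76.0.3809.132" "72.0.3626.69" || echo "major versions differ"
```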

Dependency 4: Selenium server download (version: selenium-server-standalone-3.141.59)

https://www.seleniumhq.org/download/

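Usage

  • Install the runtime environment and dependencies, then start the services
  • Create the database and import the table schema:
    mysql -u username -ppassword -e "create database selenium_php character set utf8 collate utf8_general_ci"
    mysql -u username -ppassword selenium_php < zhubajie.sql
  • Configure .env (Redis and MySQL settings)
  • cd selenium-php
  • java -jar selenium-server-standalone-3.141.59.jar
  • Scrape the list pages (the page range is currently hard-coded to pages 2 to 5):
    php scripts/zhubajie/spider_list.php >> ./log/spider_list.log 2>&1
  • Once the list pages are done, push the tasks into a Redis queue (so that the detail pages can be scraped by multiple processes):
    php scripts/zhubajie/get_db_id_to_redis.php
  • Scrape the detail pages:
    php scripts/zhubajie/spider_detail.php >> ./log/spider_detail.log 2>&1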
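
Because the detail tasks sit in a Redis queue, the detail-page step can in principle be run as several processes at once. The sketch below shows one way to do that from the shell. It assumes the detail script takes its tasks from the queue in a way that is safe for concurrent workers, which the steps above imply but do not spell out. `run_workers` and the per-worker log file names are hypothetical, not part of the project.

```shell
# Start <count> copies of a command in the background and wait for all of them.
# Each worker appends its output to <log_dir>/spider_detail_<n>.log.
run_workers() {
  local count="$1" log_dir="$2" i=1
  shift 2
  mkdir -p "$log_dir"
  while [ "$i" -le "$count" ]; do
    "$@" >> "$log_dir/spider_detail_$i.log" 2>&1 &
    i=$((i + 1))
  done
  wait
}

# Example call (adjust the script path to match your checkout):
#   run_workers 4 ./log php scripts/zhubajie/spider_detail.php
```

Writing one log file per worker keeps the output of concurrent processes from interleaving.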

FAQ

macOS refuses to open a file "because the developer cannot be verified"

sudo spctl --master-disable

How to set environment variables on Windows

https://www.java.com/zh_CN/download/help/path.xml
