
howie6879 / php-google

Licence: other
Google search results crawler, get google search results that you need - php

Programming Languages

PHP
23972 projects - #3 most used programming language

Projects that are alternatives of or similar to php-google

Fast Lianjia Crawler
A blazing-fast crawler that fetches data directly from the Lianjia API, the fastest in the universe~~ 🚀
Stars: ✭ 247 (+973.91%)
Mutual labels:  crawler
google-this
🔎 A simple yet powerful module to retrieve organic search results and much more from Google.
Stars: ✭ 88 (+282.61%)
Mutual labels:  google-search
auto crawler ptt beauty image
Automatically crawls PTT Beauty board images on a Python schedule
Stars: ✭ 35 (+52.17%)
Mutual labels:  crawler
Magic google
Google search results crawler, get google search results that you need
Stars: ✭ 247 (+973.91%)
Mutual labels:  crawler
weltschmerz
Weltschmerz by age - "I am X years old and... [Google autocomplete]"
Stars: ✭ 23 (+0%)
Mutual labels:  google-search
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-34.78%)
Mutual labels:  crawler
Ppspider
Web spider built on Puppeteer; supports task queues and task scheduling via decorators, nedb/mongodb storage, and data visualization
Stars: ✭ 237 (+930.43%)
Mutual labels:  crawler
arachnod
High performance crawler for Nodejs
Stars: ✭ 17 (-26.09%)
Mutual labels:  crawler
serp-parser
Nodejs lib to parse Google SERP html pages
Stars: ✭ 28 (+21.74%)
Mutual labels:  google-search
crawler
A simple and flexible web crawler framework for java.
Stars: ✭ 20 (-13.04%)
Mutual labels:  crawler
Polite
Be nice on the web
Stars: ✭ 253 (+1000%)
Mutual labels:  crawler
ublacklist
Blocks specific sites from appearing in Google search results
Stars: ✭ 3,726 (+16100%)
Mutual labels:  google-search
TaobaoAnalysis
A project for practicing NLP by analyzing Taobao reviews
Stars: ✭ 28 (+21.74%)
Mutual labels:  crawler
Weibopicdownloader
A crawler to download Weibo images without logging in
Stars: ✭ 247 (+973.91%)
Mutual labels:  crawler
flink-crawler
Continuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (+108.7%)
Mutual labels:  crawler
Strong Web Crawler
An advanced web crawler based on C#/.NET, PhantomJS, and Selenium. It can execute JavaScript, trigger various events, and manipulate the page DOM.
Stars: ✭ 238 (+934.78%)
Mutual labels:  crawler
Python3Webcrawler
🌈Python3 web crawling in practice: QQ Music songs, JD.com product info, Fang.com, cracking Youdao Translate, building a proxy pool, Douban Books, Baidu Images, cracking NetEase login, Bilibili simulated QR-code login, Xiaoe-tech, Lizhi Weike
Stars: ✭ 208 (+804.35%)
Mutual labels:  crawler
Sharingan
We will try to find as much of your visible social-media footprint as possible - 😤 more sites are coming soon
Stars: ✭ 13 (-43.48%)
Mutual labels:  crawler
sse-option-crawler
SSE 50 index options data crawler
Stars: ✭ 17 (-26.09%)
Mutual labels:  crawler
img-cli
An interactive Command-Line Interface Build in NodeJS for downloading a single or multiple images to disk from URL
Stars: ✭ 15 (-34.78%)
Mutual labels:  crawler

php-google

This is a simple Google search crawler that lets you extract whatever you need from the results page.

While crawling, be aware of Google's per-IP rate limits and the exceptions they trigger, so I suggest pausing the program between requests and using your own proxy IPs.

Python version: MagicGoogle

How to Use?

This project can be installed via Composer by requiring the howie6879/php-google package in your composer.json:

{
    "require": {
        "howie6879/php-google": "1.0"
    }
}
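Alternatively, assuming Composer is available on your PATH, the same dependency can be added from the command line:

```shell
composer require howie6879/php-google:1.0
```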

Once php-google is installed in your project, you can fetch the Google search results you need.

Example

# Add the bootstrap autoload file

require_once '../vendor/autoload.php';
use \howie6879\PhpGoogle\MagicGoogle;

# Pass a proxy URL, or use new MagicGoogle() for no proxy
$magicGoogle = new MagicGoogle('http://127.0.0.1:8118');

# The first page of results
$data = $magicGoogle->search_page('python');

# Get url
$data = $magicGoogle->search_url('python');

foreach ($data as $value) {
    var_dump($value);
}

/** Output
 * string(23) "https://www.python.org/"
 * string(33) "https://www.python.org/downloads/"
 * string(35) "https://docs.python.org/3/tutorial/"
 * string(44) "https://www.python.org/about/gettingstarted/"
 * string(43) "https://wiki.python.org/moin/BeginnersGuide"
 * string(41) "https://www.python.org/downloads/windows/"
 * string(24) "https://docs.python.org/"
 * string(59) "https://en.wikipedia.org/wiki/Python_(programming_language)"
 * string(39) "https://www.codecademy.com/learn/python"
 * string(25) "https://github.com/python"
 * string(38) "https://www.tutorialspoint.com/python/"
 * string(28) "https://www.learnpython.org/"
 * string(44) "https://www.programiz.com/python-programming"
 */
 
# Get {'title','url','text'}
$data = $magicGoogle->search('python', 'en', '1');

foreach ($data as $value) {
    var_dump($value);
}

/** Output
 * array(3) {
 * ["title"]=>
 * string(21) "Welcome to Python.org"
 * ["url"]=>
 * string(23) "https://www.python.org/"
 * ["text"]=>
 * string(54) "The official home of the Python Programming Language. "
 * }
 */

See sample.php for a complete example.

If you need to run a large number of queries but have only one IP address, I suggest waiting 5s ~ 30s between requests.
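A minimal sketch of that throttling as a helper function (the function name and default bounds are illustrative, not part of the library):

```php
<?php
// Illustrative throttle: return a random pause of 5-30 seconds
// to stay under Google's per-IP rate limits.
function throttleSeconds(int $min = 5, int $max = 30): int
{
    return rand($min, $max);
}

// Usage inside a query loop (MagicGoogle set up as in the example above):
//   foreach ($queries as $query) {
//       $data = $magicGoogle->search_url($query);
//       sleep(throttleSeconds());
//   }
```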

If the crawler keeps returning empty results, Google has likely blocked your IP and is responding with a redirect like this:

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://ipv4.google.com/sorry/index?continue=https://www.google.me/s****">here</A>.
</BODY></HTML>
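One way to detect this block page before parsing is a small check on the raw HTML (this helper is a hypothetical addition, not part of php-google):

```php
<?php
// Detect Google's "sorry" block page in a raw HTML response.
// Matches the "302 Moved" title and the google.com/sorry redirect target.
function isBlockedByGoogle(string $html): bool
{
    return stripos($html, '302 Moved') !== false
        || stripos($html, 'google.com/sorry') !== false;
}
```

When this returns true, back off for a while or switch to a different proxy IP before retrying.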
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].