apify / Actor Google Search Scraper

Apify actor that crawls Google Search result pages (SERPs) and extracts a list of organic results, ads, related queries and more. It supports selection of custom country, language and location.

Labels

html web-scraping

Projects that are alternatives of or similar to Actor Google Search Scraper

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (+628.95%)

Mutual labels: web-scraping

User Agents

A JavaScript library for generating random user agents with data that's updated daily.

Stars: ✭ 485 (+1176.32%)

Mutual labels: web-scraping

Youtube tutorials

Collection of scripts corresponding to LucidProgramming YouTube tutorials

Stars: ✭ 769 (+1923.68%)

Mutual labels: web-scraping

Ache

ACHE is a web crawler for domain-specific search.

Stars: ✭ 320 (+742.11%)

Mutual labels: web-scraping

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (+1121.05%)

Mutual labels: web-scraping

Pythoncode Tutorials

The Python Code Tutorials

Stars: ✭ 544 (+1331.58%)

Mutual labels: web-scraping

Php Curl Class

PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs

Stars: ✭ 2,903 (+7539.47%)

Mutual labels: web-scraping

Snoop

Snoop: a reconnaissance tool based on open data (OSINT world)

Stars: ✭ 886 (+2231.58%)

Mutual labels: web-scraping

Rpa

UI.Vision: Open-Source RPA Software (formerly Kantu) - Modern Robotic Process Automation with Selenium IDE++

Stars: ✭ 477 (+1155.26%)

Mutual labels: web-scraping

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+1626.32%)

Mutual labels: web-scraping

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+10628.95%)

Mutual labels: web-scraping

Awesome Web Scraping

List of libraries, tools and APIs for web scraping and data processing.

Stars: ✭ 4,510 (+11768.42%)

Mutual labels: web-scraping

Coolqlcool

Nextjs server to query websites with GraphQL

Stars: ✭ 623 (+1539.47%)

Mutual labels: web-scraping

Basketball reference web scraper

NBA Stats API via Basketball Reference

Stars: ✭ 279 (+634.21%)

Mutual labels: web-scraping

Letterboxd recommendations

Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username

Stars: ✭ 23 (-39.47%)

Mutual labels: web-scraping

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+8200%)

Mutual labels: web-scraping

Scrapy Fake Useragent

Random User-Agent middleware based on fake-useragent

Stars: ✭ 520 (+1268.42%)

Mutual labels: web-scraping

Uc Davis Cs Exams Analysis

📈 Regression and Classification with UC Davis student quiz data and exam data

Stars: ✭ 33 (-13.16%)

Mutual labels: web-scraping

Webmiddle

Node.js framework for modular web scraping and data extraction

Stars: ✭ 13 (-65.79%)

Mutual labels: web-scraping

Faster Than Requests

Faster requests on Python 3

Stars: ✭ 639 (+1581.58%)

Mutual labels: web-scraping

Google Search Results Scraper

Features
SERP API
Cost of usage
Use cases
Number of results
Input settings
Results
Tips and tricks
Changelog

Features

This SERP API actor crawls Google Search Result Pages (SERP or SERPs) and extracts data from the HTML to a structured format such as JSON, XML or Excel. Specifically, the actor extracts the following data from each SERP:

Organic results
Ads
Product ads
Related queries
People also ask
Price, reviews rating and count (under productInfo field if available)
Additional custom attributes

Note that the actor doesn't support special types of Google searches, such as Google Shopping, Google Images or Google News.

SERP API

Our Google Search Results Scraper gives you a RESTful SERP API that provides real-time results optimized for structured JSON output that you can download and use any way you want.

Cost of usage

The actor is free to use, but to scrape SERPs effectively you should use Apify Proxy, and your account needs a sufficient limit for Google SERP queries (you can see the limit on your Account page).

New Apify users have a free trial of Apify Proxy and Google SERPs, so you can use the actor for free at the beginning.

Once the Apify Proxy trial has expired, you'll need to subscribe to a paid plan to keep using the actor. If you need to increase your Google SERP limit or have any questions, please email [email protected]

Use cases

Google Search is the front door to the internet for most people around the world, so it's really important for businesses to know how they rank on Google. Unfortunately, Google Search does not provide a public API, so the only way to monitor search results and ranking is to use web scraping.

Our free googlescraper tool gives you your own, customizable SERP scraper. You can do whatever you want with the SERP data once you extract and download it.

Typical use cases include:

Search engine optimization (SEO) — Monitor how your website performs on Google for certain queries over time.
Analyze display ads for a given set of keywords.
Monitor your competition in both organic and paid results.
Build a URL list for certain keywords. This is useful if you, for example, need good relevant starting points when scraping web pages containing specific phrases.

Read more in the How to scrape Google Search blog post.

Number of results

You can change the number of results per page by using the resultsPerPage parameter. The default is 10 but allowed values are 10-100. You can also set maxPagesPerQuery to get more results for each query.
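As a rough illustration of how these two parameters interact, the following Python sketch validates the documented 10-100 range and computes the upper bound on results per query. Only the names resultsPerPage and maxPagesPerQuery come from the text above; the helper functions and the default of one page per query are assumptions made for this example, not part of the actor.

```python
def paging_settings(results_per_page: int = 10, max_pages_per_query: int = 1) -> dict:
    """Build the paging part of an actor input (hypothetical helper).

    The default of 1 page per query is an assumption for this sketch.
    """
    if not 10 <= results_per_page <= 100:
        raise ValueError("resultsPerPage must be between 10 and 100")
    if max_pages_per_query < 1:
        raise ValueError("maxPagesPerQuery must be at least 1")
    return {
        "resultsPerPage": results_per_page,
        "maxPagesPerQuery": max_pages_per_query,
    }


def max_results_per_query(settings: dict) -> int:
    """Upper bound only: Google may return fewer results than this."""
    return settings["resultsPerPage"] * settings["maxPagesPerQuery"]


settings = paging_settings(results_per_page=50, max_pages_per_query=3)
print(max_results_per_query(settings))  # prints 150
```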

Please keep in mind that, although Google shows that it internally found millions of results, Google will never display more than a few hundred results per single search query. You can try it in your own browser. If you need to get as many results as possible, try to create many similar queries and combine different parameters and locations.

Input settings

The actor gives you fine-grained control over what kind of Google Search results you'll get.

You can specify the following settings:

Query phrases or raw URLs
Country
Language
Exact geolocation
Number of results per page
Mobile or desktop version

For a complete description of all settings of the actor, see the input specification.

Results

The actor stores its results in the default dataset associated with the actor run, from which you can export them to various formats, such as JSON, XML, CSV or Excel.

The results can be downloaded from the Get dataset items API endpoint:

https://api.apify.com/v2/datasets/[DATASET_ID]/items?format=[FORMAT]

where [DATASET_ID] is the ID of the dataset and [FORMAT] can be csv, html, xlsx, xml, rss or json.
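A minimal Python sketch of filling in that URL template might look like this. The function name and the dataset ID "abc123" are placeholders invented for the example; the base URL and the list of formats are taken from the text above. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2/datasets"
# Formats listed in the text above.
ALLOWED_FORMATS = {"csv", "html", "xlsx", "xml", "rss", "json"}


def build_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Return the Get dataset items URL for one dataset and export format."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{API_BASE}/{dataset_id}/items?{urlencode({'format': fmt})}"


# "abc123" is a placeholder dataset ID, not a real one.
print(build_items_url("abc123", "csv"))
# prints https://api.apify.com/v2/datasets/abc123/items?format=csv
```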

For each Google Search results page, the dataset will contain a single record, which in JSON format looks as follows. Keep in mind that some fields have example values:

{
  "searchQuery": {
    "term": "Hotels in Prague",
    "page": 1,
    "type": "SEARCH",
    "domain": "google.cz",
    "countryCode": "cz",
    "languageCode": "en",
    "locationUule": null,
    "resultsPerPage": "10"
  },
  "url": "http://www.google.com/search?gl=cz&hl=en&num=10&q=Hotels%20in%20Prague",
  "hasNextPage": false,
  "resultsTotal": 138000000078,
  "relatedQueries": [
    {
      "title": "cheap hotels in prague",
      "url": "https://www.google.com/search?hl=en&gl=CZ&q=cheap+hotels+in+prague&sa=X&sqi=2&ved=2ahUKEwjem6jG9cTgAhVoxlQKHeE4BuwQ1QIoAHoECAoQAQ"
    },
    {
      "title": "best hotels in prague old town",
      "url": "https://www.google.com/search?hl=en&gl=CZ&q=best+hotels+in+prague+old+town&sa=X&sqi=2&ved=2ahUKEwjem6jG9cTgAhVoxlQKHeE4BuwQ1QIoAXoECAoQAg"
    },
    // ...
  ],
  "paidResults": [
    {
      "title": "2280 Hotels in Prague | Best Price Guarantee | booking.com",
      "url": "https://www.booking.com/go.html?slc=h3;aid=303948;label=",
      "displayedUrl": "www.booking.com/",
      "description": "Book your Hotel in Prague online. No reservation costs. Great rates. Bed and Breakfasts. Support in 42 Languages. Hotels. Motels. Read Real Guest Reviews. 24/7 Customer Service. 34+ Million Real Reviews. Secure Booking. Apartments. Save 10% with Genius. Types: Hotels, Apartments, Villas.£0 - £45 Hotels - up to £45.00/day - Book Now · More£45 - £90 Hotels - up to £90.00/dayBook Now£130 - £180 Hotels - up to £180.00/dayBook Now£90 - £130 Hotels - up to £130.00/dayBook Nowup to £45.00/dayup to £90.00/dayup to £180.00/dayup to £130.00/day",
      "siteLinks": [
        {
          "title": "Book apartments and more",
          "url": "https://www.booking.com/go.html?slc=h3;aid=303948;label=",
          "description": "Bookings instantly confirmed!Instant confirmation, 24/7 support"
        },
        {
          "title": "More than just hotels",
          "url": "https://www.booking.com/go.html?slc=h2;aid=303948;label=",
          "description": "Search, book, stay – get started!Hotels when and where you need them"
        }
      ]
    },
    {
      "title": "Hotels In Prague | Hotels.com™ Official Site‎",
      "displayedUrl": "www.hotels.com/Prague/Hotel",
      "description": "Hotels In Prague Book Now! Collect 10 Nights and Get 1 Free. Budget Hotels. Guest Reviews. Last Minute Hotel Deals. Luxury Hotels. Exclusive Deals. Price Guarantee. Photos & Reviews. Travel Guides. Earn Free Hotel Nights. No Cancellation Fees. Types: Hotel, Apartment, Hostel.",
      "siteLinks": []
    },
    // ...
  ],
  "paidProducts": [],
  "organicResults": [
    {
      "title": "30 Best Prague Hotels, Czech Republic (From $11) - Booking.com",
      "url": "https://www.booking.com/city/cz/prague.html",
      "displayedUrl": "https://www.booking.com › Czech Republic",
      "description": "Great savings on hotels in Prague, Czech Republic online. Good availability and great rates. Read hotel reviews and choose the best hotel deal for your stay.",
      "siteLinks": [],
      "productInfo": {
        "price": "$123",
        "rating": 4.7,
        "numberOfReviews": 4510
      }
    },
    {
      "title": "The 30 best hotels & places to stay in Prague, Czech Republic ...",
      "url": "https://www.booking.com/city/cz/prague.en-gb.html",
      "displayedUrl": "https://www.booking.com › Czech Republic",
      "description": "Great savings on hotels in Prague, Czech Republic online. Good availability and great rates. Read hotel reviews and choose the best hotel deal for your stay.",
      "siteLinks": [],
      "productInfo": {}
    },
    // ...
  ],
  "peopleAlsoAsk": [
    {
      "question": "What is the name of the best hotel in the world?",
      "answer": "Burj Al Arab Jumeirah, Dubai. Arguably Dubai's most iconic hotel, the Burj Al Arab rises above the Persian Gulf on its own man-made island like a giant sail. Everything here is over-the-top, from the gilded furnishings in its guest rooms to the house fleet of Rolls-Royces.",
      "url": "https://www.travelandleisure.com/worlds-best/hotels-top-100-overall",
      "title": "Best 100 Hotels: World's Best Hotels 2020 | Travel + Leisure | Travel ...",
      "date": "Jul 8, 2020"
    }
  ],
  "customData": {
    "pageTitle": "Hotels in Prague - Google Search"
  }
}
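For the rank-monitoring use case, a record of this shape can be processed with a few lines of Python. The sketch below is only an illustration: find_rank is a hypothetical helper, the sample record is a trimmed copy of the example above with one made-up entry (example.org) added, and the returned position is relative to a single results page rather than to the whole query.

```python
from urllib.parse import urlparse


def find_rank(record: dict, domain: str):
    """Return the 1-based position of the first organic result on `domain`.

    Returns None when the domain does not appear on this results page.
    """
    for position, result in enumerate(record.get("organicResults", []), start=1):
        host = urlparse(result.get("url", "")).hostname or ""
        if host == domain or host.endswith("." + domain):
            return position
    return None


# Trimmed sample record; the example.org entry is made up for illustration.
record = {
    "searchQuery": {"term": "Hotels in Prague", "page": 1},
    "organicResults": [
        {"title": "Some other site", "url": "https://example.org/prague"},
        {
            "title": "30 Best Prague Hotels, Czech Republic (From $11) - Booking.com",
            "url": "https://www.booking.com/city/cz/prague.html",
        },
    ],
}

print(find_rank(record, "booking.com"))  # prints 2
print(find_rank(record, "hotels.com"))   # prints None
```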

How to get one search result per row

If you are only interested in organic Google Search results and want just one result per row in the output, add the fields=searchQuery,organicResults and unwind=organicResults query parameters to the API endpoint URL:

https://api.apify.com/v2/datasets/[DATASET_ID]/items?format=[FORMAT]&fields=searchQuery,organicResults&unwind=organicResults

The API will return a result like this (in JSON format):

[
  {
    "searchQuery": {
      "term": "Restaurants in Prague",
      "page": 1,
      // ...
    },
    "title": "THE 10 BEST Restaurants in Prague 2019 - TripAdvisor",
    "url": "https://www.tripadvisor.com/Restaurants-g274707-Prague_Bohemia.html",
    "displayedUrl": "https://www.tripadvisor.com/Restaurants-g274707-Prague_Bohemia.html",
    "description": "Best Dining in Prague, Bohemia: See 617486 TripAdvisor traveler reviews of 6232 Prague restaurants and search by cuisine, price, location, and more.",
    "siteLinks": []
  },
  {
    "searchQuery": {
      "term": "Restaurants in Prague",
      "page": 1,
      // ...
    },
    "title": "The 11 Best Restaurants in Prague | Elite Traveler",
    "url": "https://www.elitetraveler.com/finest-dining/restaurant-guide/the-11-best-restaurants-in-prague",
    "displayedUrl": "https://www.elitetraveler.com/finest-dining/restaurant.../the-11-best-restaurants-in-prag...",
    "description": "Jan 16, 2018 - With the regional fare certainly a highlight of dining in Prague, a great number of superb international eateries have touched down to become ...",
    "siteLinks": []
  },
  // ...
]

When using a tabular format such as csv or xlsx, you'll get a table where each row contains just one organic result. For more details about exporting and formatting the dataset records, please see the documentation of the Get dataset items API endpoint.
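To make the effect of fields and unwind more concrete, here is a small local emulation in Python that produces rows of the same shape as the JSON example above. It is a sketch of the transformation only, written for this explanation; it is not how the Apify API implements these parameters, and the function names and sample values are hypothetical.

```python
def select_fields(record: dict, fields: list) -> dict:
    """Keep only the listed top-level fields, like fields=a,b."""
    return {key: record[key] for key in fields if key in record}


def unwind(record: dict, field: str) -> list:
    """Emit one row per item of `field`, merged with the remaining fields."""
    base = {key: value for key, value in record.items() if key != field}
    return [{**base, **item} for item in record.get(field, [])]


# Hypothetical, shortened record for illustration.
record = {
    "searchQuery": {"term": "Restaurants in Prague", "page": 1},
    "hasNextPage": False,
    "organicResults": [
        {"title": "First result", "url": "https://example.com/a"},
        {"title": "Second result", "url": "https://example.com/b"},
    ],
}

rows = unwind(select_fields(record, ["searchQuery", "organicResults"]), "organicResults")
for row in rows:
    print(row["searchQuery"]["term"], "|", row["title"], "|", row["url"])
```

Each row carries its own copy of searchQuery next to the fields of a single organic result, which is what makes the data fit a one-result-per-row table.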

Tips and tricks

Crawling the second and further result pages might be slower than the first page.
If you need to scrape a lot of results for a single query, then you can greatly improve the speed of the crawl by setting Results per page (resultsPerPage) to 100, instead of crawling 10 pages each with 10 results.
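The arithmetic behind this tip can be checked with a short Python sketch. The pages_needed helper is hypothetical and simply assumes each page returns the full resultsPerPage count, which Google does not guarantee.

```python
import math


def pages_needed(total_results: int, results_per_page: int) -> int:
    """Number of result pages to crawl, assuming every page is full."""
    if not 10 <= results_per_page <= 100:
        raise ValueError("resultsPerPage must be between 10 and 100")
    return math.ceil(total_results / results_per_page)


# Fetching 100 results: 10 page loads at 10 per page, 1 at 100 per page.
for results_per_page in (10, 100):
    print(results_per_page, pages_needed(100, results_per_page))
```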

Changelog

Google Search Results Scraper is under active development and we regularly introduce new features and fix bugs. We also often have to hotfix the extractor when Google changes the page layout. Check our Changelog for recent updates.

Stars: ✭ 38
