Leszek-Sieminski / pagespeedParseR

Licence: other
pagespeedParseR is an R wrapper for the Google PageSpeed Insights API that also enables convenient parsing of the results

pagespeedParseR v0.3.1.9000

R wrapper for Google Pagespeed Insights API

News and plans

2019-07-10, ver. 0.3.1.9000:
  • small bugfixes to the long_result = TRUE parameter in download_lighthouse(..., output_type = "simple")
  • small bugfixes to tests
2019-07-05, ver. 0.3.0.9000 (Lighthouse overhaul):
  • big overhaul of the download_lighthouse() function. Parsing to a data frame with the output_type = "simple" parameter now provides much more data. However, it can generate hundreds or thousands of columns (up to ~2500). What is more, the number of columns is NOT stable, because it depends on the number and type of errors found
  • to ease the pain of dealing with such data frames, I added a long_result parameter that defaults to FALSE. Setting it to TRUE makes the function reshape the data frame into a messy, long-like form that I hope is easier to comprehend
  • the behaviour of download_lighthouse(output_type = "raw") and of the other functions didn't change
  • please keep in mind that this is experimental and may cause unexpected errors. If you happen to find one, don't hesitate to inform me and/or open an Issue
Plans for the future:
  • reworking how the package is used. I want to have a single function that downloads just the nested lists, plus additional functions that work on the downloaded nested lists to extract chosen report categories, audits or error examples. I think this will speed up the process, especially for users who do not want to parse everything and need only some specific information. It would also allow me to add some parallelization features to speed everything up (without overwhelming the API)
  • increasing the amount of data parsed into the data frame
  • speeding things up

What is Google Pagespeed Insights?

Google PageSpeed Insights is an online tool that identifies performance issues for a given URL and suggests optimizations.

Google Lighthouse is another tool for web developers that helps fix page performance issues. It was recently integrated into the PageSpeed Insights API (version 5). You can also try it in the Chrome browser.
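
To make the API side more concrete, here is a minimal sketch of how a request URL for the public PageSpeed Insights v5 endpoint can be put together. The helper name build_psi_url() is made up for this illustration and is not part of pagespeedParseR; the package may build its requests differently. No request is sent.

```r
# Hypothetical helper (not part of pagespeedParseR): builds a request URL
# for the PageSpeed Insights v5 endpoint without sending anything.
build_psi_url <- function(page_url, strategy = "desktop",
                          category = "performance", api_key = NULL) {
  endpoint <- "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
  params <- c(url = page_url, strategy = strategy, category = category)
  if (!is.null(api_key)) params <- c(params, key = api_key)
  # Percent-encode every value, including reserved characters like ":" and "/"
  encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  paste0(endpoint, "?",
         paste(names(encoded), encoded, sep = "=", collapse = "&"))
}

psi_url <- build_psi_url("https://www.google.com", strategy = "mobile")
print(psi_url)
```

The strategy and category query parameters correspond to the arguments of the same names used in the examples further down.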

Other Pagespeed packages in R

There are other R packages for Pagespeed that I find very valuable:

But why another R package for Pagespeed API?

Parsing the API response in R yields a nested list, which is not convenient to extract data from, so the outputs of the functions in the packages above are limited (they do not contain recommendations). The goal of pagespeedParseR is to produce simple output in the form of a data frame that contains the performance scores, most of the recommendations and a list of errors. Alternatively, if that is not enough, you can obtain the output as a raw nested list and parse it yourself.
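
As a rough illustration of the problem (not the package's actual parsing code), the sketch below flattens two tiny mock responses into one-row data frames with base R. The list structure is invented and far smaller than a real API reply; it only shows why the number of columns depends on what was found on the page.

```r
# Two mock responses with a hypothetical structure
resp_a <- list(score = 0.93,
               audits = list(images = list(score = 1)))
resp_b <- list(score = 0.58,
               audits = list(images = list(score = 0.4,
                                           items = list("a.png", "b.png"))))

# unlist() walks the nested list and joins the element names with dots;
# transposing the named vector gives a one-row data frame
flatten_response <- function(x) {
  flat <- unlist(x)
  as.data.frame(t(flat), stringsAsFactors = FALSE)
}

df_a <- flatten_response(resp_a)
df_b <- flatten_response(resp_b)

names(df_a)  # fewer columns: no error items were reported
names(df_b)  # extra columns: one per reported item
```

Every additional reported item becomes its own column, so two pages rarely produce the same set of columns.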

Features

  • Authenticate with an API key
  • Choose report content: use download_pagespeed() for classic PageSpeed v4 results OR download_lighthouse() for new Lighthouse (PageSpeed v5) results
  • Choose output format: simple (conveniently parsed data frame with most important info) or raw (nested list with all the data provided by the API)
  • Query multiple URLs in one function call and keep track of which URLs failed the check
  • Query every page for both Desktop and Mobile in one function call
  • Control API quota usage with a simple interval mechanism
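
The multi-URL and interval features can be pictured with the following sketch. It is not the package's implementation: fetch_all() and fake_fetch() are hypothetical names, and fake_fetch() stands in for any function that takes one URL and either returns a result or throws an error.

```r
# Hypothetical sketch: query several URLs one by one, pause between calls,
# and remember which URLs failed instead of aborting the whole run.
fetch_all <- function(urls, fetch, interval = 1) {
  results <- list()
  failed <- character(0)
  for (i in seq_along(urls)) {
    res <- tryCatch(fetch(urls[i]), error = function(e) e)
    if (inherits(res, "error")) {
      failed <- c(failed, urls[i])       # keep the URL, drop the error
    } else {
      results[[urls[i]]] <- res
    }
    if (i < length(urls)) Sys.sleep(interval)  # wait before the next call
  }
  list(results = results, failed = failed)
}

# Stand-in for a real API call: fails for URLs containing "broken"
fake_fetch <- function(u) {
  if (grepl("broken", u)) stop("request failed")
  nchar(u)
}

out <- fetch_all(c("https://a.example", "https://broken.example"),
                 fake_fetch, interval = 0)
out$failed
```

Collecting the failures separately keeps the successful results usable even when a few pages cannot be checked.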

Acquiring API access token

Visit your Google Developers Console page, create a project, enable the PageSpeed Insights API and copy your API access token from the credentials screen into R.

Installation

install.packages("devtools")
devtools::install_github("Leszek-Sieminski/pagespeedParseR")

Authentication

library(pagespeedParseR)
api_key <- "12345" # example
auth_pagespeed(api_key)
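
If you would rather not keep the key in your scripts, one option is to read it from an environment variable (for example one set in ~/.Renviron). This is only a sketch: get_api_key() is a hypothetical helper and PAGESPEED_API_KEY is an assumed variable name, neither of which comes from the package.

```r
# Hypothetical helper: read the API key from an environment variable.
# PAGESPEED_API_KEY is an assumed name; use whatever name you prefer.
get_api_key <- function(var = "PAGESPEED_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Usage (not run):
# library(pagespeedParseR)
# auth_pagespeed(get_api_key())
```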

Usage

Startup

library(pagespeedParseR)
auth_pagespeed("12345") # not run, example

Lighthouse reports as wide data frames (most important data)

# Lighthouse reports - Data Frames --------------------------------------------
# download simple data frame with "Performance" Lighthouse report for Google.com.
# The result has a lot of columns, which can be unwieldy, but you can
# spread/gather them as you like

lh_df_1 <- download_lighthouse(
 url = "https://www.google.com",
 output_type = "simple") # return the results in a wide data frame

class(lh_df_1)
# [1] "data.frame"
dim(lh_df_1)   # 1 row, 779 columns. The number of columns may differ wildly
# [1]   1 779  # because it also depends on the number and type of spotted errors



# this time let's download it and parse into messy long-like table:
lh_df_1_long <- download_lighthouse(
  url = "https://www.google.com",
  output_type = "simple", # parse the results into a data frame
  long_result = TRUE) # reshape the data into a long, easier-to-digest form

class(lh_df_1_long)
# [1] "data.frame"
dim(lh_df_1_long) # 780 rows in 3 columns
# [1] 780   3



# check "Performance" for Google.com & Bing.com for both desktop & mobile and
# return in a data frame with most important columns
lh_df_2 <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return the results in a wide data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 1, # wait 1 second between the calls to API
  categories = "performance") # which Lighthouse reports
                              # are to be run?

class(lh_df_2)
# [1] "data.frame"
dim(lh_df_2)
# [1]    4 1231



# check "Performance" and "Accessibility" for Google.com & Bing.com for
# both desktop & mobile and return in a data frame with most important columns
lh_df_3 <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return the results in a wide data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 2,           # wait 2 seconds between the calls to API
  categories = c("performance", # run performance & accessibility
                 "accessibility"))

class(lh_df_3)
# [1] "data.frame"
dim(lh_df_3)
# [1]    4 1637



# check "Performance" and "Accessibility" for Google.com & Bing.com for
# both desktop & mobile and return in a data frame with even more data,
# including error occurrences and the importance of each report result
lh_df_4 <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return the results in a wide data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 2,           # wait 2 seconds between the calls to API
  categories = c("performance", # run performance & accessibility
                 "accessibility"))



# another run for a messy long-like data frame
lh_df_4_long <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return the results in a wide data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 2,           # wait 2 seconds between the calls to API
  categories = c("performance", # run performance & accessibility
                 "accessibility"),
  long_result = TRUE) # spread into 4 columns

class(lh_df_4_long)
# [1] "data.frame"
dim(lh_df_4_long)
# 1637 rows and 4 columns ("device" + "parameter" + one value column per page)
# [1] 1637    4
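
When working with the wide data frames above, two base R operations tend to be handy: selecting columns by name pattern and reshaping to a long form by hand. The sketch below uses a toy data frame with invented column names; real column names and the exact layout produced by long_result = TRUE will differ.

```r
# Toy stand-in for a wide result; the column names are invented
wide <- data.frame(
  device = c("desktop", "mobile"),
  performance_score = c(0.95, 0.71),
  accessibility_score = c(0.88, 0.88),
  stringsAsFactors = FALSE
)

# Keep only the columns whose names start with "performance"
perf_cols <- grep("^performance", names(wide), value = TRUE)
perf <- wide[, c("device", perf_cols), drop = FALSE]

# Reshape to long form: one row per device/parameter pair
value_cols <- setdiff(names(wide), "device")
long <- data.frame(
  device = rep(wide$device, times = length(value_cols)),
  parameter = rep(value_cols, each = nrow(wide)),
  value = unlist(wide[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
long
```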

Lighthouse reports as nested lists (all data)

# Lighthouse reports - Nested Lists -------------------------------------------                                           
# download nested list with "Performance" Lighthouse report for Google.com
lh_nl_1 <- download_lighthouse(url = "https://www.google.com", 
                               output_type = "raw")           # return nested list with all possible data

# check "Performance" for Google.com & Bing.com for both desktop & mobile and
# return in a nested list with all possible data
lh_nl_2 <- download_lighthouse(url = c("https://www.google.com", 
                                       "https://www.bing.com/"), 
                               output_type = "raw",           # return nested list with all possible data
                               strategy = c("desktop",        # check both desktop and mobile, bind
                                            "mobile"), 
                               interval = 1,                  # wait 1 second between the calls to API 
                               categories = "performance")    # which Lighthouse reports are to be run?
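
Once you have a raw nested list, you can pull out just the parts you need. The sketch below works on a small mock list that imitates a single PageSpeed Insights v5 response (lighthouseResult with categories and audits). How download_lighthouse() arranges several URLs and strategies inside its raw output is not shown here, so treat the top-level structure as an assumption and inspect your own object with str() first.

```r
# Mock of one response; real replies contain many more fields and audits
mock <- list(
  lighthouseResult = list(
    categories = list(
      performance = list(score = 0.92),
      accessibility = list(score = 0.81)
    ),
    audits = list(
      `first-contentful-paint` = list(title = "First Contentful Paint",
                                      score = 0.98),
      `uses-optimized-images` = list(title = "Efficiently encode images",
                                     score = 0.45),
      `network-requests` = list(title = "Network Requests",
                                score = NULL)  # informative audit, no score
    )
  )
)

# One score per category, as a named numeric vector
category_scores <- vapply(mock$lighthouseResult$categories,
                          function(x) x$score, numeric(1))

# One row per audit; a missing score becomes NA
audits <- mock$lighthouseResult$audits
audit_df <- data.frame(
  id = names(audits),
  title = vapply(audits, function(a) a$title, character(1)),
  score = vapply(audits,
                 function(a) if (is.null(a$score)) NA_real_ else a$score,
                 numeric(1)),
  stringsAsFactors = FALSE,
  row.names = NULL
)

# Audits that scored below 0.9 (which() drops the NA rows)
low <- audit_df[which(audit_df$score < 0.9), ]
low
```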

PageSpeed reports as wide data frames (most important data)

# PageSpeed reports -  Data Frames --------------------------------------------
# download simple data frame with Pagespeed report for Google.com
ps_df_1 <- download_pagespeed(url = "https://www.google.com", output_type = "simple")

# run Pagespeed reports for Google.com & Bing.com for mobile and
# return in a data frame with most important columns
ps_df_2 <- download_pagespeed(url = c("https://www.google.com", 
                                      "https://www.bing.com/"), 
                              output_type = "simple",      # return the results in a wide data frame
                              strategy = "mobile",         # run tests for mobile
                              interval = 1)                # wait 1 second between the calls to API 

# run Pagespeed reports for Google.com & Bing.com for both desktop & mobile and
# return in a data frame with most important columns                              
ps_df_3 <- download_pagespeed(url = c("https://www.google.com", 
                                      "https://www.bing.com/"), 
                              output_type = "simple",      # return the results in a wide data frame
                              strategy = c("desktop",      # check both desktop and mobile, bind
                                           "mobile"), 
                              interval = 2)                # wait 2 seconds between the calls to API 

PageSpeed reports as nested lists (all data)

# PageSpeed reports - Nested Lists --------------------------------------------
# download nested list with Pagespeed report for Google.com
ps_nl_1 <- download_pagespeed(url = "https://www.google.com", 
                              output_type = "raw")

# run Pagespeed for Google.com & Bing.com for desktop and
# return in a nested list with all possible data
ps_nl_2 <- download_pagespeed(url = c("https://www.google.com", 
                                      "https://www.bing.com/"), 
                              output_type = "raw", 
                              strategy = "desktop", 
                              interval = 1)

# run Pagespeed for Google.com & Bing.com for both desktop & mobile and
# return in a nested list with all possible data
ps_nl_3 <- download_pagespeed(url = c("https://www.google.com", 
                                      "https://www.bing.com/"), 
                              output_type = "raw", 
                              strategy = c("desktop", 
                                           "mobile"), 
                              interval = 2)