Google offers the PageSpeed Insights API to help SEO professionals and developers by combining real-world data with simulation data, providing load performance timing data for web pages.

The difference between Google PageSpeed Insights (PSI) and Lighthouse is that PSI involves both real-world and lab data, while Lighthouse performs a page loading simulation by modifying the connection and user-agent of the device.

Another point of difference is that PSI doesn’t supply any information related to web accessibility, SEO, or progressive web apps (PWAs), while Lighthouse provides all of the above.

Thus, when we use the PageSpeed Insights API for a bulk URL loading performance test, we won’t have any data for accessibility.

However, PSI provides more information related to page speed performance, such as “DOM Size,” “Deepest DOM Child Element,” “Total Task Count,” and “DOM Content Loaded” timing.

One more advantage of the PageSpeed Insights API is that it gives the “observed metrics” and “actual metrics” different names.

In this guide, you’ll learn how to:

  • Create a production-level Python script.
  • Use APIs with Python.
  • Construct data frames from API responses.
  • Analyze the API responses.
  • Parse URLs and process URL requests’ responses.
  • Store the API responses with proper structure.

An example output of the PageSpeed Insights API call with Python is below.

Example output of the PageSpeed Insights API. Screenshot from author, June 2022

Libraries For Using PageSpeed Insights API With Python

The necessary libraries to use the PSI API with Python are below.

  • Advertools retrieves testing URLs from the sitemap of a website.
  • Pandas constructs the data frame and flattens the JSON output of the API.
  • Requests makes a request to the specific API endpoint.
  • JSON takes the API response and loads it into a Python dictionary.
  • Datetime adds the current date to the output file’s name.
  • Urllib parses the test subject website URL.

How To Use PSI API With Python?

To use the PSI API with Python, follow the steps below.

  • Get a PageSpeed Insights API key.
  • Import the necessary libraries.
  • Parse the URL for the test subject website.
  • Take the date of the moment for the file name.
  • Take URLs into a list from a sitemap.
  • Choose the metrics that you want from the PSI API.
  • Create a for loop for taking the API response for all URLs.
  • Construct the data frame with the chosen PSI API metrics.
  • Output the results in the form of XLSX.

1. Get PageSpeed Insights API Key

Use the PageSpeed Insights API Documentation to get the API Key.

Click the “Get a Key” button below.

PSI API key. Image from developers.google.com, June 2022

Choose a project that you have created in Google Developer Console.

Google Developer Console API project. Image from developers.google.com, June 2022

Enable the PageSpeed Insights API on that specific project.

PageSpeed Insights API enable. Image from developers.google.com, June 2022

You will need to use this API key in your API requests.

2. Import The Necessary Libraries

Use the lines below to import the fundamental libraries.

    import advertools as adv
    import pandas as pd
    import requests
    import json
    from datetime import datetime
    from urllib.parse import urlparse

3. Parse The URL For The Test Subject Website

To parse the URL of the subject website, use the code structure below.

    domain = urlparse(sitemap_url)
    domain = domain.netloc.split(".")[1]

The “domain” variable is the parsed version of the sitemap URL.

The “netloc” represents the domain section of the URL. When we split it with “.” and take index 1, we get the “middle section,” which represents the domain name.

Here, “0” is for “www,” “1” is for the “domain name,” and “2” is for the “domain extension,” if we split it with “.”
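As a quick illustration of the index positions described above, here is a minimal sketch. The sitemap URL is a placeholder chosen for this example, not a value from the original script.

```python
from urllib.parse import urlparse

# Placeholder sitemap URL, used only for illustration.
sitemap_url = "https://www.example.com/sitemap.xml"

parsed = urlparse(sitemap_url)
parts = parsed.netloc.split(".")

print(parsed.netloc)  # www.example.com
print(parts)          # ['www', 'example', 'com']

# Index 1 is the domain name only when the host starts with "www".
domain = parts[1]
print(domain)         # example
```

For a host without “www” (for example, “example.com”), index 1 would return the extension instead, so the index may need adjusting for such sites.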

4. Take The Date Of Moment For File Name

To take the date of the specific function call moment, use the “datetime.now” method.

Datetime.now provides the specific time of the specific moment. Use “strftime” with the “%Y”, “%m”, and “%d” values. “%Y” is for the year. The “%m” and “%d” are numeric values for the specific month and day.

    date = datetime.now().strftime("%Y_%m_%d")
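To see what the format string produces, here is a small sketch with a fixed date so the result is predictable. The file name pattern and the “example” domain are assumptions made for illustration only.

```python
from datetime import datetime

# A fixed date is used here so the result is predictable;
# the script itself would call datetime.now() instead.
sample_moment = datetime(2022, 6, 15)
date = sample_moment.strftime("%Y_%m_%d")
print(date)  # 2022_06_15

# Hypothetical file name pattern combining a domain and the date.
domain = "example"
file_name = f"{domain}_psi_{date}.xlsx"
print(file_name)  # example_psi_2022_06_15.xlsx
```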

5. Take URLs Into A List From A Sitemap

To take the URLs into a list from a sitemap file, use the code block below.

    sitemap = adv.sitemap_to_df(sitemap_url)
    sitemap_urls = sitemap["loc"].to_list()

If you read the Python Sitemap Health Audit, you can learn more about sitemaps.

6. Choose The Metrics That You Want From PSI API

To choose the PSI API response JSON properties, you should look at the JSON file itself.

Doing so relies heavily on reading, parsing, and flattening JSON objects.

It’s even related to Semantic SEO, thanks to the concept of the “directed graph” and “JSON-LD” structured data.

In this article, we won’t focus on examining the specific PSI API response’s JSON hierarchies.
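For readers who do want a quick look at the hierarchy, one possible approach is to list every key path in the parsed response. The sketch below uses a tiny hand-written sample that only mimics the nesting style of a response; “list_key_paths” is a hypothetical helper, not part of any library.

```python
def list_key_paths(node, prefix=""):
    """Return dotted key paths for every leaf in nested dicts and lists."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else key
            paths.extend(list_key_paths(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            paths.extend(list_key_paths(value, f"{prefix}[{index}]"))
    else:
        # A leaf value (number, string, None): record the path that led here.
        paths.append(prefix)
    return paths


# Hand-written sample with made-up values, not a real API response.
sample = {
    "lighthouseResult": {
        "categories": {"performance": {"score": 0.87}},
        "audits": {"dom-size": {"details": {"items": [{"value": 812}]}}},
    }
}

for path in list_key_paths(sample):
    print(path)
```

Running the same helper on a real parsed response would print every available property, which makes it easier to pick the ones worth storing.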

Below, you can see the metrics that I have chosen to gather from the PSI API. The set is richer than the basic default output of PSI, which only gives the Core Web Vitals metrics, or Speed Index, Interaction to Next Paint, Time to First Byte, and First Contentful Paint.

Of course, it also gives “suggestions” such as “Avoid Chaining Critical Requests,” but there is no need to put a sentence into a data frame.

In the future, these suggestions, or even every individual chain event with its KB and MS values, could be taken into a single column with the name “psi_suggestions.”
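As a rough idea of how such a “psi_suggestions” column might be filled, the sketch below joins the titles of audits that did not get a full score. It assumes each audit carries a “title” and a “score” field; the sample audits and their scores are made up for illustration, and “collect_suggestions” is a hypothetical helper.

```python
def collect_suggestions(audits):
    """Join the titles of audits that did not get a full score."""
    titles = []
    for audit in audits.values():
        score = audit.get("score")
        # A score of None marks audits that are not scored; skip them.
        if score is not None and score < 1:
            titles.append(audit.get("title", ""))
    return " | ".join(titles)


# Made-up sample standing in for data_["lighthouseResult"]["audits"].
sample_audits = {
    "render-blocking-resources": {"title": "Eliminate render-blocking resources", "score": 0.4},
    "unused-javascript": {"title": "Reduce unused JavaScript", "score": 0.5},
    "viewport": {"title": "Has a viewport meta tag", "score": 1},
    "diagnostics": {"title": "Diagnostics", "score": None},
}

psi_suggestions = collect_suggestions(sample_audits)
print(psi_suggestions)
```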

For a start, you can check the metrics that I have chosen; many of them will probably be new to you.

The first section of the PSI API metrics is below.

    fid = []
    lcp = []
    cls_ = []
    url = []
    fcp = []
    performance_score = []
    total_tasks = []
    total_tasks_time = []
    long_tasks = []
    dom_size = []
    maximum_dom_depth = []
    maximum_child_element = []
    observed_fcp  = []
    observed_fid = []
    observed_lcp = []
    observed_cls = []
    observed_fp = []
    observed_fmp = []
    observed_dom_content_loaded = []
    observed_speed_index = []
    observed_total_blocking_time = []
    observed_first_visual_change = []
    observed_last_visual_change = []
    observed_tti = []
    observed_max_potential_fid = []

This section includes all the observed and simulated fundamental page speed metrics, along with some non-fundamental ones, like “DOM Content Loaded” or “First Meaningful Paint.”

The second section of the PSI metrics focuses on possible byte and time savings from removing unused code.

    render_blocking_resources_ms_save = []
    unused_javascript_ms_save = []
    unused_javascript_byte_save = []
    unused_css_rules_ms_save = []
    unused_css_rules_bytes_save = []

A third section of the PSI metrics focuses on server response time and on the benefit of using responsive images, or the harm of not using them.

    possible_server_response_time_saving = []
    possible_responsive_image_ms_save = []

Note: The overall Performance Score comes from “performance_score.”

7. Create A For Loop For Taking The API Response For All URLs

The for loop takes all of the URLs from the sitemap file and sends them to the PSI API one by one. The for loop for PSI API automation has several sections.

The first section of the PSI API for loop starts with duplicate URL prevention.

In sitemaps, the same URL can appear multiple times, with and without a trailing slash. This section prevents duplicate requests.

for i in sitemap_urls[:9]:
         # Prevent the duplicate "/" trailing slash URL requests from overriding the information.
         if i.endswith("/"):
               r = requests.get(f"https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={i}&strategy=mobile&locale=en&key={api_key}")
         else:
               r = requests.get(f"https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={i}/&strategy=mobile&locale=en&key={api_key}")
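A possible variation on this step is to normalize and deduplicate the URL list before the loop, and to let the standard library percent-encode the query string. This is a sketch, not the original script: “dedupe_urls” and “build_psi_url” are hypothetical helpers, and the example URLs and the “YOUR_API_KEY” value are placeholders.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def dedupe_urls(urls):
    """Treat 'page' and 'page/' as the same URL and keep the first occurrence."""
    normalized = [u if u.endswith("/") else u + "/" for u in urls]
    # dict.fromkeys drops duplicates while preserving the original order.
    return list(dict.fromkeys(normalized))


def build_psi_url(page_url, api_key, strategy="mobile", locale="en"):
    """Build the request URL with the query parameters percent-encoded."""
    query = urlencode(
        {"url": page_url, "strategy": strategy, "locale": locale, "key": api_key}
    )
    return f"{PSI_ENDPOINT}?{query}"


urls = dedupe_urls(
    [
        "https://www.example.com/a",
        "https://www.example.com/a/",
        "https://www.example.com/b/",
    ]
)
print(urls)
print(build_psi_url(urls[0], api_key="YOUR_API_KEY"))
```

Encoding the tested URL matters when it contains its own query string, since an unencoded “&” would otherwise be read as a separate API parameter.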

Remember to supply your own “api_key” value at the end of the PageSpeed Insights API endpoint. The “[:9]” slice limits the test to the first nine URLs of the sitemap; remove it to test every URL.

Check the status code. In sitemaps, there might be URLs with a non-200 status code; these should be cleaned.

         if r.status_code == 200:
               #print(r.json())
               data_ = json.loads(r.text)
               url.append(i)

The next section takes the specific metrics from the “data_” dictionary that we created above and appends them to the corresponding lists.

               fcp.append(data_["loadingExperience"]["metrics"]["FIRST_CONTENTFUL_PAINT_MS"]["percentile"])
               fid.append(data_["loadingExperience"]["metrics"]["FIRST_INPUT_DELAY_MS"]["percentile"])
               lcp.append(data_["loadingExperience"]["metrics"]["LARGEST_CONTENTFUL_PAINT_MS"]["percentile"])
               cls_.append(data_["loadingExperience"]["metrics"]["CUMULATIVE_LAYOUT_SHIFT_SCORE"]["percentile"])
               performance_score.append(data_["lighthouseResult"]["categories"]["performance"]["score"] * 100)
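Direct lookups like the ones above raise a “KeyError” when a key is missing, which may happen, for example, when a URL has no field data in “loadingExperience.” One defensive option is a small helper that returns a default value instead, so every list still receives one entry per URL. This is a sketch with a hand-written sample; “get_nested” is a hypothetical helper.

```python
def get_nested(data, path, default=None):
    """Walk a list of keys and indexes, returning default if any step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Hand-written sample with lab data but no field data for the URL.
sample = {"lighthouseResult": {"categories": {"performance": {"score": 0.91}}}}

lcp_value = get_nested(
    sample,
    ["loadingExperience", "metrics", "LARGEST_CONTENTFUL_PAINT_MS", "percentile"],
)
score = get_nested(sample, ["lighthouseResult", "categories", "performance", "score"])

print(lcp_value)  # None
print(score)      # 0.91
```

Appending “None” for a missing value keeps all metric lists the same length, which matters later when they are combined into one data frame.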

The next section focuses on the “total task” count and DOM Size.

               total_tasks.append(data_["lighthouseResult"]["audits"]["diagnostics"]["details"]["items"][0]["numTasks"])
               total_tasks_time.append(data_["lighthouseResult"]["audits"]["diagnostics"]["details"]["items"][0]["totalTaskTime"])
               long_tasks.append(data_["lighthouseResult"]["audits"]["diagnostics"]["details"]["items"][0]["numTasksOver50ms"])
               dom_size.append(data_["lighthouseResult"]["audits"]["dom-size"]["details"]["items"][0]["value"])

The next section takes the “DOM Depth” and “Deepest DOM Element.”

               maximum_dom_depth.append(data_["lighthouseResult"]["audits"]["dom-size"]["details"]["items"][1]["value"])
               maximum_child_element.append(data_["lighthouseResult"]["audits"]["dom-size"]["details"]["items"][2]["value"])

The next section takes the specific observed test results from our PageSpeed Insights API call.

               observed_dom_content_loaded.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedDomContentLoaded"])
               observed_fid.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedDomContentLoaded"])
               observed_lcp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["largestContentfulPaint"])
               observed_fcp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["firstContentfulPaint"])
               observed_cls.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["totalCumulativeLayoutShift"])
               observed_speed_index.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedSpeedIndex"])
               observed_total_blocking_time.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["totalBlockingTime"])
               observed_fp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedFirstPaint"])
               observed_fmp.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["firstMeaningfulPaint"])
               observed_first_visual_change.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedFirstVisualChange"])
               observed_last_visual_change.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["observedLastVisualChange"])
               observed_tti.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["interactive"])
               observed_max_potential_fid.append(data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]["maxPotentialFID"])

The next section takes the possible savings from unused code, in bytes and in milliseconds, along with the savings from render-blocking resources.

               render_blocking_resources_ms_save.append(data_["lighthouseResult"]["audits"]["render-blocking-resources"]["details"]["overallSavingsMs"])
               unused_javascript_ms_save.append(data_["lighthouseResult"]["audits"]["unused-javascript"]["details"]["overallSavingsMs"])
               unused_javascript_byte_save.append(data_["lighthouseResult"]["audits"]["unused-javascript"]["details"]["overallSavingsBytes"])
               unused_css_rules_ms_save.append(data_["lighthouseResult"]["audits"]["unused-css-rules"]["details"]["overallSavingsMs"])
               unused_css_rules_bytes_save.append(data_["lighthouseResult"]["audits"]["unused-css-rules"]["details"]["overallSavingsBytes"])

The next section takes the possible savings from responsive images and server response time.

               possible_server_response_time_saving.append(data_["lighthouseResult"]["audits"]["server-response-time"]["details"]["overallSavingsMs"])      
               possible_responsive_image_ms_save.append(data_["lighthouseResult"]["audits"]["uses-responsive-images"]["details"]["overallSavingsMs"])

The final section lets the loop move on to the next URL when the response status code is not 200.

         else:
               continue

Example Usage Of PageSpeed Insights API With Python For Bulk Testing

To use the specific code blocks, put them into a Python function.
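The last two steps from the list at the beginning, constructing the data frame and writing the XLSX file, could look roughly like the sketch below. The helper names, the column selection, the file name pattern, and the sample values are assumptions made for illustration; in the real script, the dictionary would hold the lists filled by the for loop.

```python
import pandas as pd


def build_data_frame(columns):
    """Build a data frame from a dict of equally long metric lists."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Metric lists differ in length: {lengths}")
    return pd.DataFrame(columns)


def save_as_xlsx(df, domain, date):
    """Write the data frame to an XLSX file named after the domain and date."""
    file_name = f"{domain}_psi_{date}.xlsx"  # hypothetical naming pattern
    df.to_excel(file_name, index=False)  # needs an Excel engine such as openpyxl
    return file_name


# Made-up values standing in for the lists filled by the loop.
columns = {
    "url": ["https://www.example.com/", "https://www.example.com/about/"],
    "performance_score": [87.0, 92.0],
    "dom_size": [812, 640],
}

df = build_data_frame(columns)
print(df.shape)  # (2, 3)
```

Calling “save_as_xlsx(df, domain, date)” with the “domain” and “date” variables from the earlier steps would then write the file.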

Run the script, and you will get 29 page speed-related metrics in the columns below.

PageSpeed Insights API output. Screenshot from author, June 2022

Conclusion

The PageSpeed Insights API provides different types of page loading performance metrics.

It demonstrates how Google engineers perceive the concept of page loading performance, and how they possibly use these metrics from a ranking, UX, and quality-understanding perspective.

Using Python for bulk page speed tests gives you a snapshot of the entire website to help analyze the possible user experience, crawl efficiency, conversion rate, and ranking improvements.

Featured Image: Dundanim/Shutterstock


