How to audit a site for structured data & search feature opportunities

Structured data in SEO, often used synonymously with Schema, is a way to mark up a webpage’s HTML so that search engine crawlers such as Googlebot can better understand the content on the page. Structured data can also make pages eligible for search features (also known as rich results).

While people may use “Schema” and “Structured Data Markup” interchangeably, they are not the same thing: structured data markup is the general practice of annotating a page’s content, while Schema refers to Schema.org, which provides a standardized vocabulary that search engines (Google, Bing, Yandex, etc.) read and support.

There are other vocabularies outside of Schema.org that can be used for structured data markup: microformats.org also offers vocabulary for defining a physical location, organization or person (hCard markup), or for product reviews (hReview markup).

What are search features?

Search features are results that provide information or context beyond the traditional links, titles, and meta descriptions. They’re designed to answer users’ queries as quickly as possible, often without the user having to click on a search result. Another advantage of search features is that they take up valuable real estate at the top of the results page, increasing your page’s chances of click-through.

An example of a Featured Snippet search feature

To be eligible for search features, you don’t necessarily have to add structured data to your site’s code; many search features (e.g. sitelinks) are served automatically by search engines without additional code. For other search features, like reviews and products, implementing structured data markup will increase your chances of appearing.

As a result, structured data markup (using approved vocabulary from Schema.org, for instance) can aid your SEO efforts by improving a given page’s appearance in search results.

The Structured Data Audit

NOTE: For this post, we refer to the Schema.org vocabulary hierarchy. You may use microformats.org (e.g. hCard, hReview) as it continues to be supported by search engines. That said, it’s recommended to pick one vocabulary (i.e. Schema.org or microformats.org) and stick with it, to avoid presenting duplicate or seemingly spammy structured data markup, which risks having your webpages’ rankings penalized.

Step 1: Assess current structured data markup (if any)

Your site may already have structured data implemented. In Google Search Console (GSC), you can check whether Google has found this markup and whether there are any errors to address.

Fixing errors is the first priority of any structured data audit, as Google may penalize sites (via either algorithmic quality factors or manual action) that are using markup incorrectly or in a way that looks spammy.

To get started, navigate to GSC’s Enhancements section; here is where any discovered structured data markup will appear.

Three examples of structured data reports within GSC’s Enhancements

These reports will tell you whether your pages with markup have errors (which should be addressed ASAP) or warnings (nice-to-have optimizations).

An example of Product markup warnings in Google Search Console

This site’s Product markup appears to have warnings that can be addressed to further optimize the existing structured data, but fortunately, there aren’t glaring errors making URLs ineligible for rich results.  

If you click into the details of a given warning, GSC provides additional information about the issue as well as a list of impacted URLs to investigate further.

An example of the details view for a warning in Google Search Console

If you want to validate your revised structured data, you can test it in the Rich Results Test (RRT), Google’s replacement for the Structured Data Testing Tool, which lets you test a code snippet directly or test a given URL.
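Before pasting markup into RRT, a quick local pre-check can catch plain syntax mistakes. The following is a minimal Python sketch under stated assumptions, not a substitute for RRT: it uses only the standard library, the names JsonLdExtractor and precheck_jsonld and the sample HTML are hypothetical, and it only confirms that each JSON-LD block parses as JSON and declares an @type. It does not unpack @graph containers and says nothing about rich result eligibility.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def precheck_jsonld(html):
    """Return (items, problems): parsed JSON-LD objects and a list of problems."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    items, problems = [], []
    for index, raw in enumerate(extractor.blocks):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"block {index}: invalid JSON ({exc.msg})")
            continue
        # A block may hold a single object or a list of objects.
        for item in parsed if isinstance(parsed, list) else [parsed]:
            if isinstance(item, dict) and "@type" in item:
                items.append(item)
            else:
                problems.append(f"block {index}: item has no @type")
    return items, problems


# Hypothetical page source: one valid block and one with a syntax error.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": }
</script>
</head><body></body></html>
"""

items, problems = precheck_jsonld(SAMPLE_HTML)
for problem in problems:
    print(problem)
```

Anything this pre-check passes should still go through RRT, which is what actually reports errors and warnings for specific rich result types.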

Step 2: Audit site content for search feature opportunity

What we’re looking for at this stage is simply the broad ‘types’ of content found on your site, and whether those content types have an appropriate search feature for which they can be optimized. 

For example, a typical e-commerce/retail site might include:

  • Product Pricing
    • Recommended markup: Product 
  • Videos of products
    • Recommended markup: VideoObject
  • Product Reviews
    • Recommended markup: Review
  • Editorial / Blog content
    • Recommended markup: BlogPosting
  • Location / Contact Info
    • Recommended markup: LocalBusiness
  • Software / Application Download
    • Recommended markup: SoftwareApplication
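
One way to keep track of this inventory is to group your site’s URLs by content type. The sketch below is only an illustration: the recommended markup values come from the list above, but the URL patterns, the function names classify and build_inventory, and the sample paths are hypothetical and would need to be replaced with the patterns your own site uses.

```python
import re

# Content types and their recommended markup, taken from the list above.
RECOMMENDED_MARKUP = {
    "product": "Product",
    "video": "VideoObject",
    "review": "Review",
    "blog": "BlogPosting",
    "location": "LocalBusiness",
    "download": "SoftwareApplication",
}

# Hypothetical URL patterns, checked in order (most specific first).
URL_PATTERNS = [
    (re.compile(r"^/products/[^/]+/reviews"), "review"),
    (re.compile(r"^/products/"), "product"),
    (re.compile(r"^/videos/"), "video"),
    (re.compile(r"^/blog/"), "blog"),
    (re.compile(r"^/(locations|contact)"), "location"),
    (re.compile(r"^/downloads/"), "download"),
]


def classify(path):
    """Return the content type for a URL path, or None if nothing matches."""
    for pattern, content_type in URL_PATTERNS:
        if pattern.search(path):
            return content_type
    return None


def build_inventory(paths):
    """Group URL paths under the markup type recommended for them."""
    inventory = {}
    for path in paths:
        markup = RECOMMENDED_MARKUP.get(classify(path), "(no mapping yet)")
        inventory.setdefault(markup, []).append(path)
    return inventory


sample_paths = ["/products/widget", "/products/widget/reviews", "/blog/launch", "/about"]
for markup, grouped_paths in build_inventory(sample_paths).items():
    print(markup, grouped_paths)
```

Paths that land under “(no mapping yet)” are the ones to review by hand in the next step.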

If you’re using keyword tracking software like STAT, you can also see which priority keywords have SERP features associated with them, like popular products, FAQ, jobs, etc. This can help you identify additional structured data markup opportunities to increase your chances of appearing in rich results.

Once you identify what the different types of content are, the next step is to figure out which of these can be marked up.

Step 3: Map content to Schema types

Once you identify the types of content to mark up, review Schema.org’s documentation to find all currently supported Schema.org types. The goal is to find the most specific markup that accurately applies to each type and/or piece of content on your site.

Assuming your site is in the ecommerce or B2B space, you will likely want to use the following markup types:

  • Product
  • Event – even if you’re not hosting in-person events, you can use this markup for virtual sessions and webinars
  • Organization
  • Person
  • Place
  • CreativeWork – within this type you’ll find Article & BlogPosting for any editorial content

These broad areas should cover the majority of content on your site. However, if you’re not sure whether there is a more specific type for what you want to mark up, perform a Google search for markup specific to that content type.

To create structured data markup, you have a few options:

  1. Use Google’s Data Highlighter tool: this requires your site to be set up in Google Search Console.
    • Pros: 
      1. it’s a WYSIWYG (what you see is what you get) tool, so it’s very easy to create the markup
      2. the tool applies to “page sets”, and thus can be templated & published across multiple pages at once
      3. you can publish your markup directly through the tool, so you don’t need to rely on a developer 
    • Cons: 
      1. markup options are limited, including only: Articles, Events, Local Businesses, Restaurants, Products, Software Applications, Movies, TV Episodes, and Books 
      2. the tool only works on pages that have already been crawled by Google; if you want to mark up a recently published article or product, crawl latency may delay the process
      3. your marked-up data will be recognized only by Google, not by other search engines like Bing
  2. Use Google’s Structured Data Markup Helper
    • Pros: 
      1. marked-up data is recognized by all applicable search engines (e.g. Google and Bing)
      2. generates structured data as either microdata or JSON-LD
    • Cons: 
      1. as with the Data Highlighter, the markup options are limited: Articles, Events, Local Businesses, Restaurants, Products, Software Applications, Movies, TV Episodes, Books, Datasets, and Questions and Answers
      2. unlike the Data Highlighter, a developer needs to implement the structured data
  3. Do it yourself using examples from Schema.org and validating your code in RRT
    • Pros:
      1. use any markup supported by Schema.org (as of October 2020 there are 800+ types)
      2. Schema.org provides JSON-LD, microdata, and RDFa examples of all its markup, which you can copy and repurpose
      3. have your choice of JSON-LD, microdata, or RDFa
    • Cons: 
      1. as with any new skill set/language, there’s a learning curve
      2. a developer needs to implement your markup

Side note: Google Tag Manager is generally not the most reliable way to implement structured data markup. Any markup you deploy there is injected with JavaScript after page load, so it is only visible to crawlers that render JavaScript, and not every crawler does.

As mentioned previously, Schema.org provides a vocabulary to use when implementing structured data markup for your content. There are three syntaxes you can use to apply that vocabulary to a page:

  • Microdata
  • RDFa
  • JSON-LD

Microdata

With Microdata, Schema markup is added as inline attributes (such as itemscope, itemtype, and itemprop) on the HTML elements that contain the content.

Take this example from Schema.org’s Product markup using Microdata: using itemprop you’re able to tag individual elements of a product, including its product image, name, ratings, price, and more.

An example of Microdata markup using itemprop
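
As a rough illustration of what inline Microdata looks like, the Python sketch below embeds a small product fragment and reads its itemprop values back out. The fragment is a hypothetical product written for this sketch (it is not Schema.org’s own example), the names ItempropCollector and extract_itemprops are made up, and the parser deliberately ignores nesting, so it is a way to eyeball tagged properties rather than a full Microdata parser.

```python
from html.parser import HTMLParser

# Hypothetical product fragment marked up with Microdata attributes.
MICRODATA_HTML = """
<div itemscope itemtype="https://schema.org/Product">
  <img itemprop="image" src="/img/widget.jpg" alt="Example Widget">
  <span itemprop="name">Example Widget</span>
  <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <meta itemprop="priceCurrency" content="USD">
    <span itemprop="price" content="19.99">$19.99</span>
  </div>
</div>
"""


class ItempropCollector(HTMLParser):
    """Collects (itemprop, value) pairs in document order, ignoring nesting."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # itemprop whose value is the element's text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("itemprop")
        if prop is None or "itemscope" in attributes:
            return  # not a property, or a nested item rather than a plain value
        # Prefer machine-readable attributes over the visible text.
        for name in ("content", "src", "href"):
            if attributes.get(name):
                self.pairs.append((prop, attributes[name]))
                return
        self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


def extract_itemprops(html):
    """Return the list of (itemprop, value) pairs found in the HTML."""
    collector = ItempropCollector()
    collector.feed(html)
    return collector.pairs


for prop, value in extract_itemprops(MICRODATA_HTML):
    print(prop, "=", value)
```

Note how every tagged value sits on the same element that displays it, which is why changes to a page template can silently break Microdata.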

Microdata was long the preferred way of implementing structured data markup and is still supported. That said, there are more streamlined ways to implement markup that will leave content editors and developers much happier.

RDFa

RDFa in Schema markup looks, and is implemented, very much like Microdata: using inline HTML attributes, you can define property types and tag elements like product image, ratings, prices, etc. Like Microdata, RDFa is an acceptable way to mark up your webpages. That said, inline markup is arguably more challenging to scale to multiple pages without using logic-based rules to identify similar elements across your site (think: hundreds of product images, prices, reviews, etc).

An example of RDFa property types and tagged elements
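
To show both the RDFa attributes (vocab, typeof, property) and the kind of logic-based templating that scaling them requires, here is a minimal Python sketch. It is an assumption-laden illustration: the template, the function name render_product_rdfa, and the product record are all hypothetical, and a real site would do this inside its own templating system.

```python
from html import escape

# Hypothetical template: the same RDFa attributes are reused for every product.
RDFA_TEMPLATE = """<div vocab="https://schema.org/" typeof="Product">
  <img property="image" src="{image}" alt="{name}">
  <span property="name">{name}</span>
  <div property="offers" typeof="Offer">
    <meta property="priceCurrency" content="{currency}">
    <span property="price" content="{price}">{display_price}</span>
  </div>
</div>"""


def render_product_rdfa(product):
    """Render one product record (a dict of strings) as RDFa-annotated HTML."""
    # Escape every value so product data cannot break the surrounding HTML.
    safe = {key: escape(str(value), quote=True) for key, value in product.items()}
    safe["display_price"] = f"{safe['currency']} {safe['price']}"
    return RDFA_TEMPLATE.format(**safe)


# Hypothetical product record.
example_product = {
    "name": 'Example "Pro" Widget',
    "image": "/img/widget.jpg",
    "price": "19.99",
    "currency": "USD",
}
print(render_product_rdfa(example_product))
```

Calling render_product_rdfa once per product record is the “logic-based rule” in miniature: the markup lives in one template instead of being hand-edited on hundreds of pages.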

JSON-LD

JSON-LD is the third structured data markup option from Schema.org, and differs from Microdata and RDFa in that the markup is kept separate from the content it describes: JSON-LD is a JSON object, placed inside a <script type="application/ld+json"> tag, that can live in the <head> or <body> of your webpages. Because JSON-LD isn’t implemented within individual HTML elements, its implementation (and any subsequent updates) is simpler: it can be copied and pasted as a standalone script. Arguably, JSON-LD is the “best” schema markup encoding for SEOs, content editors, and developers.

An example of JSON-LD structured data markup
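
Because JSON-LD is a standalone script, it is straightforward to generate from data you already have. The following Python sketch builds a Product block with the standard json module. The function name product_jsonld_script and the sample values are hypothetical, and the properties shown are only a small subset of what Schema.org defines for Product and Offer.

```python
import json


def product_jsonld_script(name, image_url, price, currency):
    """Return a <script type="application/ld+json"> tag describing one product."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "image": image_url,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
        },
    }
    # Escape "<" so a value containing "</script>" cannot close the tag early.
    payload = json.dumps(data, indent=2).replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical product values.
print(product_jsonld_script(
    "Example Widget",
    "https://example.com/img/widget.jpg",
    "19.99",
    "USD",
))
```

The resulting tag can be pasted into RRT as a code snippet before it is added to a page template.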

Closing thoughts

We hope this makes the structured data vs. schema markup distinction clear and enables you to either audit or mark up your site’s content today. As a reminder, you don’t necessarily need structured data markup to be eligible for search features, so if you’re unable to implement markup, we encourage you to test the layout of your content with tables, lists, bullets, etc. and check your page’s eligibility for rich results in RRT.