What is a content audit anyway?

By Roy Fisher and Isabelle Sperano

When you take your first steps into the content strategy space, it’s likely because you were asked to do a content audit. Maybe you’re about to migrate a public website to a new platform. Or maybe the team’s intranet is bloated with content accreted over the years. The prospect makes you feel more than a little lost, especially after you search the web for what a “content audit” means. There are lots of different possible activities involved in an audit, not to mention lots of different types.

Turns out that while most content strategists know what they mean by the term “content audit”, there’s no true consensus. Take capital-A Art. Everyone thinks they know what Art is and what it isn’t. But when you look at “Fountain”, literally a urinal flipped over on its side, you begin to wonder. When Marcel Duchamp submitted Fountain to an exhibition, the board members balked. It’s not that Fountain isn’t “Art”; it’s that Duchamp and the board meant different things by the term.

“Starry Night” by Van Gogh, and “Fountain” attributed to Marcel Duchamp (possibly created by Elsa von Freytag-Loringhoven or Louise Norton).
Both have been displayed at the Musée d’Art Moderne.

Like art, a content audit is tricky to nail down. Dr. Isabelle Sperano of MacEwan University is one of the few people, possibly the only one, to study content audits from an academic standpoint. To get the truthiest definition of “content audit” (because that’s what academics do), she looked at 129 different publications (books, blog posts, etc.) that tried to define it. And, of course, she found 129 different answers.

  • Some publications limit “content audits” to text content. Others include images, audio, and video.
  • Some consider a “content audit” to include creation of a list of content. Others call that a “content inventory”, and consider an “audit” to be a later analysis of that inventory.
  • Most publications suggest that “content audits” focus only on content that currently exists on the website. Some suggest that you should include only external-facing content, though most accept that you really need background information and metadata for an audit or inventory to be successful. And a few contend that to be truly useful, the audit should be structured to expose content that doesn’t exist on the website, but should.

Still, Dr. Sperano did find common attributes:

  • Nearly everyone says a content audit includes creation of a list or inventory of content.
  • Most include some form of analysis. That is, content audits are almost always completed in response to a question, which almost always requires some form of analysis to answer.
  • Many publications talk about different types of audits. Most distinguish between different “facets” (Dr. Sperano’s word): the things that the audit looks for in the content, such as whether content is properly internationalized or meets a desired reading level. Audits can also vary with scope (such as whether the audit includes all content or a subset) or with duration (such as a “rolling audit” that re-analyzes new content as it’s added to a website).
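
To make those common attributes concrete, here is a minimal Python sketch of an inventory (a list of content), a couple of facets (things to check for), and a scope (which items to look at). It is only an illustration: the `InventoryItem` fields, the facet names, and the 400-word threshold are all assumptions made up for this example, not anything drawn from Dr. Sperano's research.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InventoryItem:
    """One row of a content inventory (all field names are illustrative)."""
    url: str
    content_type: str          # e.g. "page", "image", "video"
    language: str              # e.g. "en", "fr"; empty string if unknown
    word_count: int = 0
    findings: List[str] = field(default_factory=list)


def audit(inventory, facets, in_scope=lambda item: True):
    """Run every facet check on every in-scope item.

    A facet is a name plus a function that returns True when the item passes.
    The names of failed checks are recorded in item.findings.
    Returns only the items that failed at least one check.
    """
    for item in inventory:
        if not in_scope(item):
            continue
        for name, check in facets.items():
            if not check(item):
                item.findings.append(name)
    return [item for item in inventory if item.findings]


# The inventory: a plain list of what exists (hypothetical sample data).
inventory = [
    InventoryItem("/about", "page", "en", word_count=420),
    InventoryItem("/a-propos", "page", "", word_count=390),
    InventoryItem("/logo.png", "image", "", word_count=0),
]

# The facets: what this particular audit looks for (hypothetical rules).
facets = {
    "has-language": lambda item: item.language != "",
    "not-too-long": lambda item: item.word_count <= 400,
}

# The scope: this audit only covers pages, not images.
flagged = audit(
    inventory,
    facets,
    in_scope=lambda item: item.content_type == "page",
)

for item in flagged:
    print(item.url, item.findings)
```

Swapping in different facets or a different scope changes what kind of audit this is, while the inventory stays the same. That is one way to picture why so many definitions separate the two.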

Table 1 presents a sample of audit definitions from Dr. Sperano’s research.

Sample of content audit definitions
Publication Definition

BRO06O

A list of all the information contained in a website, along with data that describes the information from several points of view, like target audience or location.

DET12W

A content audit is an assessment of a website’s content from both a quantitative perspective (i.e., “How much content is there?”) and a qualitative one (i.e., “Is the content any good?”).

HAL12O

An audit is an accounting of all currently published web content, with all the details recorded on a spreadsheet.

JON09W

Ultimately, a content inventory results in a detailed spreadsheet that, at a minimum, lists the existing content.

KAD12O

The content audit is an assessment, or inventory, of all the existing content on a website.

LAN14O

A content inventory is a quantitative assessment of all the content on a website–a list of all the pages, images, and other files that make up the content set as well as data associated with those files, such as content type and metadata.

A content audit is a qualitative evaluation of a set of content. When you audit content, you assess it against a variety of measures depending on your context and goals.

LEW12O

A content inventory answers the question: What do we have to work with? It includes a detailed list of all documents (white papers, presentations, product manuals, etc.), video and audio assets, and images related to your business. This analysis serves as a quantitative measure of what you have.

A content audit, on the other hand, is an analysis of the information assets you have. It is the assessment of that content and an evaluation of its importance and relevance with the surrounding messaging. The content audit will answer the question: Is this any good?

LYO12O

This is a review of the content (copy, imagery, videos, audio) and its organization in an existing product, or content created for a new product.

NICW

A Content Inventory captures the content within an organization and its publication channel(s). It frames the current-state content scope and may be referred to as a quantitative audit.

A Content Audit leverages the inventory to provide an assessment of the content and its quality.

WIKW-1

A content inventory is the process and the result of cataloguing the entire contents of a website. An allied practice — a content audit — is the process of evaluating that content.

WOD09O

A content inventory is a tally of everything that exists on the site and everything you expect to be added to the site.

YUN02O

A content audit is simply a thorough analysis of all content—text and graphics—on your website with an eye on what should or should not be localized.

Table 1 - Content audit definitions, Content Audit for the Assessment of Digital Information Space: Definitions and Exploratory Typology, I. Sperano

What does this mean when planning your own content audit?

1) Make sure you and your client understand the question you’re trying to answer

That is, make sure you understand the goals of the audit (or if applicable, the goals of your client). A content audit can be geared for many different kinds of problems. When you’re clear on what you want to accomplish, you can figure out the kinds of information you need to get about the content.

2) Make sure both you and your stakeholders understand the activities you’ll perform and the information you’ll collect

Given the variety of things that “content audit” can mean, you need to make sure what you’re doing is what your client (or boss) expects. Collecting metadata like word counts, language, and subject headings might seem obvious to a client. But since that metadata isn’t needed for every type of audit, it might not be obvious to you.

Some audit activities are more time-consuming than others. It’s easy for automated tools to collect things like URLs, word counts, and metatags. Some tools (like OnPoint’s Content Auditor) can even calculate a reading grade level or look for missing metatags that affect searchability and SEO ratings. But it’s less easy for automated tools to make human judgments, like identifying content that’s badly written or obsolete, or that violates your client’s branding guidelines. If your client expects those activities as part of a standard “content audit”, and you don’t do them, one of you will be a bit peeved.

We’re not saying it’s unreasonable to include those activities in a content audit. It’s totally reasonable! But it’s also reasonable to not include those activities, or to do them on a reduced scope (such as only content accessible from the home page). Because the term “content audit” is still pretty new, you should make sure everyone shares the same understanding for your project.
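
For a sense of what the "easy to automate" side looks like, here is a rough Python sketch that pulls a word count, missing metatags, and an estimated reading grade out of one HTML page, using only the standard library. It is not how OnPoint's Content Auditor or any other real tool works. The `PageScanner` and `scan` names, the sample page, and the choice of `description` as a required metatag are all assumptions for illustration, and the syllable counter is a crude vowel-group heuristic, so treat the grade level as a ballpark figure.

```python
import re
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects visible text and <meta name="..."> tags from one HTML page."""

    SKIPPED = ("script", "style", "title")  # text inside these is not counted

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_parts.append(data)


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (minimum 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def scan(url, html, required_meta=("description",)):
    """Return one inventory row of automatically collected metrics."""
    scanner = PageScanner()
    scanner.feed(html)
    text = " ".join(scanner.text_parts)
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    grade = None
    if words and sentences:
        syllables = sum(count_syllables(w) for w in words)
        # Flesch-Kincaid grade level formula
        grade = (0.39 * len(words) / len(sentences)
                 + 11.8 * syllables / len(words)
                 - 15.59)

    return {
        "url": url,
        "word_count": len(words),
        "missing_meta": [m for m in required_meta if not scanner.meta.get(m)],
        "reading_grade": None if grade is None else round(grade, 1),
    }


# Hypothetical sample page: it has a keywords metatag but no description.
sample = """<html><head><meta name="keywords" content="audit, content">
<style>p { color: red; }</style></head>
<body><h1>Our services</h1><p>We audit content. We keep it short.</p></body></html>"""

row = scan("/services", sample)
print(row)
```

Notice what the sketch can't tell you: whether the page is any good, out of date, or off-brand. Those calls still need a person.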

3) If the process feels confused and messy, you’re doing it right

When you start planning a content audit, it’s going to be confusing. And as you work out what needs to be done with your stakeholders, it’ll be messy. But as Dr. Sperano’s research reveals, that’s normal. Everyone thinks they know what a content audit is. Those definitions are all different, and they’re all correct. Clarity on what everyone means from the start saves time-consuming headaches and makes for happier stakeholders (and saner consultants).

So what is a content audit?

What definition did Dr. Sperano finally settle on? In Content Audit for the Assessment of Digital Information Space: Definitions and Exploratory Typology, she writes:

“[A] content audit is an evaluation method used to identify, describe, quantify, and assess content quality of a website or of a larger information ecosystem.”

As long as you know what qualities you’re identifying and describing (and quantifying, and assessing), that seems good enough for us.

References

  • Sperano, I. (2017). L’audit de contenu en architecture d’information : examen de la méthode à travers les écrits d’experts. Université Laval, Québec: http://hdl.handle.net/20.500.11794/27621
  • Sperano, I. (2017). Content Audit for the Assessment of Digital Information Space: Definitions and Exploratory Typology. In Proceedings of the 35th ACM International Conference on the Design of Communication (pp. 1–10). New York, NY, USA: ACM. https://doi.org/10.1145/3121113.3121227