What Schema Markup Does
Schema markup is structured data embedded in a webpage that tells AI systems, search engines, and retrieval systems what the page is: not just what it says, but what type of thing it is, who made it, what it is about, and how it connects to other known entities. Without schema markup, these systems have to infer all of this from text alone. With schema markup, the answers are explicit and machine-readable.
How JSON-LD Works
The most widely recommended format for schema markup is JSON-LD (JavaScript Object Notation for Linked Data). It is a block of structured data embedded in the HTML of a page, inside a script element of type application/ld+json that usually sits in the head element, and it describes the page using vocabulary from Schema.org. A news article page might have an Article schema that identifies the headline, author, publication date, and publisher organization. An FAQ page might have FAQPage schema that lists each question and answer in a structured, machine-readable format. This structured layer sits alongside the HTML content and is read by machines independently of how the page renders visually.
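To make the Article example concrete, here is a minimal sketch in Python that builds an Article description and wraps it in the script element that carries JSON-LD. The helper names (article_jsonld, to_script_tag) and the sample values are hypothetical, chosen only for illustration; the property names (headline, author, datePublished, publisher) come from the Schema.org vocabulary.

```python
import json


def article_jsonld(headline, author_name, date_published, publisher_name):
    """Build a Schema.org Article description as a plain dict.

    Hypothetical helper: the function name and arguments are illustrative.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date string
        "publisher": {"@type": "Organization", "name": publisher_name},
    }


def to_script_tag(data):
    """Serialize the dict into the script element that carries JSON-LD."""
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/", so the data itself is unchanged.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Sample values are placeholders, not real publication data.
    markup = article_jsonld(
        "Example headline", "Jane Doe", "2024-01-15", "Example Media"
    )
    print(to_script_tag(markup))
```

The printed element can be placed in the page's head. Because it is a separate script block, changing the visible HTML does not affect it, and changing it does not affect how the page renders.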
The Most Important Schema Types for AI Visibility
For most websites, the highest-value schema types to implement are WebSite and Organization (on the homepage, establishing site identity), Article or TechArticle (on articles and guides), FAQPage (on any page with question-and-answer content), BreadcrumbList (on all interior pages, showing hierarchical structure), and Service (on service pages, with Product as the corresponding type for product pages). Each type provides specific structured signals that AI systems can use when indexing, retrieving, and citing content.
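As a rough sketch of what three of these types look like in practice, the Python below builds the homepage identity pair (WebSite and Organization), an FAQPage, and a BreadcrumbList as plain dicts ready for JSON serialization. The function names, the "#organization" identifier convention, and the example.com URLs are assumptions made for illustration; the types and property names are from Schema.org.

```python
import json


def site_identity(name, url):
    """WebSite plus Organization for the homepage, linked through @id."""
    # The "#organization" fragment is an illustrative convention, not a rule.
    org_id = url.rstrip("/") + "/#organization"
    return {
        "@context": "https://schema.org",
        "@graph": [
            {"@type": "Organization", "@id": org_id, "name": name, "url": url},
            {"@type": "WebSite", "name": name, "url": url,
             "publisher": {"@id": org_id}},
        ],
    }


def faq_page(qa_pairs):
    """FAQPage with one Question/Answer entry per (question, answer) pair."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def breadcrumb_list(trail):
    """BreadcrumbList from an ordered list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


if __name__ == "__main__":
    # Placeholder site and content, used only to show the output shape.
    trail = [("Home", "https://example.com/"),
             ("Guides", "https://example.com/guides/")]
    print(json.dumps(site_identity("Example Co", "https://example.com/"), indent=2))
    print(json.dumps(faq_page([("What is schema markup?",
                                "Structured data describing a page.")]), indent=2))
    print(json.dumps(breadcrumb_list(trail), indent=2))
```

Each dict would be emitted in its own application/ld+json script element on the relevant page: the identity graph on the homepage, the FAQPage on question-and-answer pages, and the BreadcrumbList on interior pages.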
Schema Markup and AI Citation Accuracy
One underappreciated benefit of schema markup is that it can improve the accuracy of AI citations. When an LLM retrieves information from a webpage and presents it in an answer, it relies on signals to understand authorship, publication date, topic, and source authority. Schema markup provides those signals directly, so the system has less to guess. Sites with complete schema are therefore more likely to be cited accurately and attributed reliably than sites that rely on text inference alone.
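To illustrate what "provides those signals directly" means for a machine reader, the sketch below extracts JSON-LD from a page with Python's standard-library HTML parser and reads authorship and date fields without touching the visible text. It is a simplified illustration, not a description of how any particular AI system works; the class and function names are hypothetical, and it assumes author and publisher are single objects with a name property.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed markup rather than guessing at it


def _name_of(value):
    """Return the name of a nested entity, or None if it is not an object."""
    return value.get("name") if isinstance(value, dict) else None


def citation_signals(html):
    """Return attribution fields from the first Article-like block, if any."""
    parser = JsonLdExtractor()
    parser.feed(html)
    for block in parser.blocks:
        if isinstance(block, dict) and block.get("@type") in ("Article", "TechArticle"):
            return {
                "headline": block.get("headline"),
                "author": _name_of(block.get("author")),
                "date_published": block.get("datePublished"),
                "publisher": _name_of(block.get("publisher")),
            }
    return None  # no schema found: a consumer would fall back to text inference
```

When citation_signals returns None, the only remaining option is to infer author and date from the page text, which is the less reliable path the section describes.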