[WEB-4830] Add crawlable language selector links and structured data for SEO #2990
Add alternateLanguageLinks prop to Head component to support proper SEO signaling for language variants. This renders <link rel="alternate" hrefLang="..."> tags for each available language option, following SEO best practices for multi-language content. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Add HiddenLanguageLinks component that renders hidden anchor tags for each language variant, making them discoverable by crawlers and search engines. Update Layout to include this component at the bottom of the page. Update MDXWrapper to generate and pass alternate language links to the Head component for proper SEO signaling. This ensures that all language variants of documentation pages are discoverable by basic crawlers (curl, wget) and properly indexed by search engines. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
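A minimal, framework-free sketch of the hidden-links idea described in this commit. The actual component is a React component and is not reproduced here; the function name `hiddenLanguageLinksHtml`, the `LanguageOption` shape, and the string-based rendering are assumptions made for illustration only.

```typescript
// Hypothetical shape for one language option; not taken from the PR.
interface LanguageOption {
  key: string;   // value used in the ?lang= query parameter
  label: string; // human-readable name used as the link text
}

// Escape text before placing it inside HTML markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one visually hidden anchor per language variant.
// Assumption: pageUrl carries no query string of its own.
function hiddenLanguageLinksHtml(pageUrl: string, languages: LanguageOption[]): string {
  // Links are only rendered when multiple languages are available.
  if (languages.length < 2) {
    return "";
  }
  const anchors = languages.map(
    ({ key, label }) =>
      `<a href="${escapeHtml(pageUrl)}?lang=${encodeURIComponent(key)}">${escapeHtml(label)}</a>`
  );
  // sr-only hides the links visually; aria-hidden keeps them out of the accessibility tree.
  return `<div class="sr-only" aria-hidden="true">${anchors.join("")}</div>`;
}
```

Because the anchors are plain `<a href>` elements in the served HTML, a crawler that does not execute JavaScript can still follow them.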
Add comprehensive tests for:
- Head component JSON-LD structured data rendering
- HiddenLanguageLinks component crawlable links generation

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
  languages = activePageData.page.languages; // Use language overrides from the nav data first if possible
  } else if (pageContext?.languages) {
- languages = pageContext.languages.map(stripSdkType) as LanguageKey[]; // Use pageContext languages if available, this is generated for MDX pages
+ languages = Array.from(new Set(pageContext.languages.map(stripSdkType))) as LanguageKey[]; // Use pageContext languages if available, this is generated for MDX pages
Solves a separate problem where langs could be duplicated when there's a full bevy of realtime and rest langs
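A small sketch of why the `Set` wrapper helps. The body of `stripSdkType` below is a hypothetical stand-in (it assumes language keys may carry a `realtime_` or `rest_` prefix); the real helper in the codebase may behave differently.

```typescript
// Hypothetical stand-in for stripSdkType: assumes keys look like
// "realtime_javascript" or "rest_javascript".
function stripSdkType(lang: string): string {
  return lang.replace(/^(realtime|rest)_/, "");
}

// Strip the SDK prefix, then deduplicate. A Set keeps first-insertion order,
// so the resulting list preserves the order in which languages first appear.
function uniqueLanguages(pageLanguages: string[]): string[] {
  return Array.from(new Set(pageLanguages.map(stripSdkType)));
}

// Without the Set, "javascript" would appear twice here.
const deduped = uniqueLanguages(["realtime_javascript", "rest_javascript", "rest_python"]);
```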
Human preamble
For this ticket: crawlers are not able to get a good look at language variants of docs pages. Since rendering anchor links in the LanguageSelector is not viable (it disrupts SPA behaviour and is not within reach in `react-selector` Radix), this PR renders a set of links in the page when multiple languages are available, and also adds some JSON-LD structured data to the head.

The `lang` param is also considered in the canonical URL when present, since it fundamentally changes the page context and so should be indexed separately.

To test: look at a docs page with languages on (e.g. this), then look at the JSON-LD in the `<head>` and the hidden links at the bottom of the page.
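The canonical URL rule above could be sketched roughly as follows. The helper name `canonicalUrl` and its signature are assumptions for illustration and are not taken from the PR.

```typescript
// Hypothetical helper: keep ?lang= in the canonical URL only when a language is
// selected, so that each language variant is indexed as its own page.
function canonicalUrl(pageUrl: string, lang?: string): string {
  if (!lang) {
    return pageUrl; // no language selected: the canonical URL is the plain page URL
  }
  // Append to an existing query string if there is one.
  const separator = pageUrl.includes("?") ? "&" : "?";
  return `${pageUrl}${separator}lang=${encodeURIComponent(lang)}`;
}
```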
Summary
Changes
1. JSON-LD Structured Data (Head.tsx, MDXWrapper.tsx)
Implemented proper Schema.org structured data to signal programming language variants to search engines:
- TechArticle Schema: Represents technical documentation pages
  - `hasPart` property to link to multiple code examples
- SoftwareSourceCode Schema: Represents each programming language variant
  - `programmingLanguage` property indicates the specific language (JavaScript, Python, Flutter, etc.)
  - `url` property links to the page with the `?lang=` query parameter

Example output:

    {
      "@context": "https://schema.org",
      "@type": "TechArticle",
      "headline": "Real-time Channels",
      "description": "Learn how to use real-time channels...",
      "url": "https://ably.com/docs/channels",
      "hasPart": [
        {
          "@type": "SoftwareSourceCode",
          "programmingLanguage": "JavaScript",
          "url": "https://ably.com/docs/channels?lang=javascript"
        },
        {
          "@type": "SoftwareSourceCode",
          "programmingLanguage": "Python",
          "url": "https://ably.com/docs/channels?lang=python"
        }
      ]
    }

This follows Schema.org conventions, and JSON-LD is the format Google recommends for structured data.
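For illustration, a sketch of how such an object could be assembled before being serialized into a `<script type="application/ld+json">` tag. The function `buildTechArticleJsonLd` and the `PageMeta` and `LanguageVariant` types are hypothetical; the PR's actual code in Head.tsx and MDXWrapper.tsx may be structured differently.

```typescript
// Hypothetical input types; not taken from the PR.
interface PageMeta {
  headline: string;
  description: string;
  url: string; // assumed to carry no query string
}

interface LanguageVariant {
  key: string;  // value for the ?lang= query parameter, e.g. "javascript"
  name: string; // display name for programmingLanguage, e.g. "JavaScript"
}

// Build a TechArticle whose hasPart lists one SoftwareSourceCode entry per language.
function buildTechArticleJsonLd(page: PageMeta, variants: LanguageVariant[]): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    headline: page.headline,
    description: page.description,
    url: page.url,
    hasPart: variants.map((variant) => ({
      "@type": "SoftwareSourceCode",
      programmingLanguage: variant.name,
      url: `${page.url}?lang=${encodeURIComponent(variant.key)}`,
    })),
  };
}

// Serialize for embedding in the page head.
const jsonLd = JSON.stringify(
  buildTechArticleJsonLd(
    {
      headline: "Real-time Channels",
      description: "Learn how to use real-time channels...",
      url: "https://ably.com/docs/channels",
    },
    [
      { key: "javascript", name: "JavaScript" },
      { key: "python", name: "Python" },
    ]
  )
);
```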
2. Hidden Crawlable Links (HiddenLanguageLinks.tsx, Layout.tsx)
Created a component that renders visually hidden `<a>` tags for each language option:
- Uses the `sr-only` class and `aria-hidden="true"` to hide the links from users
3. Language Deduplication (layout-context.tsx)
Added `Set` deduplication to prevent duplicate language entries in the language list.
Why This Matters
Problem
Currently, basic crawlers and search engines cannot discover the language variants available in our language selector because:
Solution
Note on `rel="alternate"`
Initially considered using `rel="alternate"`, but this is semantically incorrect: it is meant for language translations (en/fr/es) or alternate formats (PDF/RSS), not programming language code variants. JSON-LD is the proper approach.
Test Plan
- JSON-LD is present in `<head>` (view source, search for `application/ld+json`)
- Hidden language links are present (`sr-only`)
- Language links use `?lang=` query params

🤖 Generated with Claude Code