Indexability Checker Tool Build
A focused indexability checker that tests whether a URL appears technically eligible to be crawled and indexed, then explains the blockers or caution signals without pretending to know what Google has actually indexed.
The Problem
Indexing problems are rarely one signal. A page can be live, linked, and sitting in a sitemap while still being blocked by robots.txt, carrying a noindex directive, canonicalizing somewhere else, or returning the wrong final status code.
That is where many small-site SEO checks get fuzzy. Site owners want to know whether Google can find a page, but most tools either show raw technical output or collapse everything into a fake score. Neither helps someone decide what to fix first.
The Build
I built the Indexability Checker as the bridge between the existing robots.txt checker and sitemap validator. The Astro page handles the interface and results display, while a dedicated Next.js API on Vercel performs the server-side URL fetches that browsers cannot do reliably.
The endpoint follows redirects, records the final HTTP response, checks robots.txt for the selected crawler, extracts meta robots and X-Robots-Tag directives, parses the canonical tag, and returns a verdict with prioritized findings. The public tool calls that endpoint with CORS restricted to bree-sharp.com.
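A minimal sketch of what that per-URL signal collection can look like, assuming a Node 18+ runtime with a global fetch. The names (`collectSignals`, `PageSignals`) are illustrative, not the endpoint's actual code, and the regex extraction stands in for a real HTML parser.

```ts
interface PageSignals {
  finalUrl: string;
  finalStatus: number;
  redirectChain: string[];
  xRobotsTag: string | null;
  metaRobots: string | null;
  canonical: string | null;
}

async function collectSignals(startUrl: string, maxHops = 10): Promise<PageSignals> {
  const chain: string[] = [];
  let current = startUrl;

  for (let hop = 0; hop < maxHops; hop++) {
    // redirect: "manual" keeps every hop visible instead of following silently.
    const response = await fetch(current, { redirect: "manual" });
    const location = response.headers.get("location");

    if (response.status >= 300 && response.status < 400 && location) {
      chain.push(current);
      current = new URL(location, current).toString(); // resolve relative Location headers
      continue;
    }

    const html = await response.text();
    return {
      finalUrl: current,
      finalStatus: response.status,
      redirectChain: chain,
      // Header-level directive, e.g. "noindex, nofollow".
      xRobotsTag: response.headers.get("x-robots-tag"),
      // Naive regex extraction for the sketch; a real parser should walk the DOM.
      metaRobots: html.match(/<meta[^>]+name=["']robots["'][^>]*content=["']([^"']+)["']/i)?.[1] ?? null,
      canonical: html.match(/<link[^>]+rel=["']canonical["'][^>]*href=["']([^"']+)["']/i)?.[1] ?? null,
    };
  }
  throw new Error(`Gave up after ${maxHops} redirects`);
}
```

Recording the chain hop by hop is what lets the results screen report the final URL and status rather than whatever the browser's opaque redirect handling would surface.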
The Product Decisions
The tool is intentionally careful with language. It does not say a page is indexed. It says whether the page appears technically eligible to be crawled and indexed based on visible signals. That distinction matters because actual index status belongs in Google Search Console, not in a public URL fetcher.
Security mattered too. Production blocks localhost, private IPs, and internal-network targets so the API cannot be used as a private-address fetch proxy. Private URLs are permitted only in local development, which keeps testing ergonomic without weakening the deployed tool.
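A hedged sketch of that guard, assuming the endpoint can inspect the target hostname before fetching. The helper names and the exact ranges checked are illustrative, not the deployed rules, and this version only checks the literal host; a production guard would also need to handle hostnames that resolve to private addresses.

```ts
import { isIP } from "node:net";

// Loopback, RFC 1918, link-local, and 0.0.0.0/8 ranges (illustrative, not exhaustive).
const PRIVATE_V4 = [
  /^127\./, /^10\./, /^192\.168\./, /^169\.254\./, /^0\./,
  /^172\.(1[6-9]|2\d|3[01])\./, // 172.16.0.0/12
];

function isBlockedHost(rawHostname: string): boolean {
  // URL.hostname wraps IPv6 literals in brackets; strip them before checking.
  const host = rawHostname.toLowerCase().replace(/^\[|\]$/g, "");

  if (host === "localhost" || host.endsWith(".local") || host.endsWith(".internal")) {
    return true;
  }
  if (isIP(host)) {
    // IPv6 loopback, unique-local, and link-local, plus the private IPv4 ranges above.
    if (host === "::1" || host.startsWith("fc") || host.startsWith("fd") || host.startsWith("fe80")) {
      return true;
    }
    return PRIVATE_V4.some((re) => re.test(host));
  }
  return false;
}

// Enforced only in production, so local testing against private URLs still works.
export function assertFetchable(target: URL): void {
  if (process.env.NODE_ENV === "production" && isBlockedHost(target.hostname)) {
    throw new Error("Private or internal addresses are not allowed");
  }
}
```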
The Takeaway
This tool turns a messy technical SEO diagnostic into a clear first-pass workflow: can the crawler reach the URL, is it allowed to crawl it, does the page tell search engines to keep it out, and is another URL being declared canonical?
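As a sketch, those four questions can collapse into a small verdict function. The signal shape, severity labels, and messages below are assumptions for illustration, not the tool's actual rules or wording.

```ts
type Verdict = "appears-indexable" | "blocked" | "caution";

interface Finding { severity: "blocker" | "caution"; message: string; }

function buildVerdict(s: {
  finalStatus: number;
  robotsAllowed: boolean;
  metaRobots: string | null;
  xRobotsTag: string | null;
  canonical: string | null;
  finalUrl: string;
}): { verdict: Verdict; findings: Finding[] } {
  const findings: Finding[] = [];

  // 1. Can the crawler reach the URL?
  if (s.finalStatus !== 200) {
    findings.push({ severity: "blocker", message: `Final status is ${s.finalStatus}, not 200.` });
  }
  // 2. Is it allowed to crawl it?
  if (!s.robotsAllowed) {
    findings.push({ severity: "blocker", message: "robots.txt disallows this URL for the selected crawler." });
  }
  // 3. Does the page tell search engines to keep it out?
  const directives = `${s.metaRobots ?? ""} ${s.xRobotsTag ?? ""}`.toLowerCase();
  if (directives.includes("noindex")) {
    findings.push({ severity: "blocker", message: "A noindex directive is present." });
  }
  // 4. Is another URL being declared canonical?
  if (s.canonical && new URL(s.canonical, s.finalUrl).href !== new URL(s.finalUrl).href) {
    findings.push({ severity: "caution", message: "The page canonicalizes to a different URL." });
  }

  const verdict: Verdict = findings.some((f) => f.severity === "blocker")
    ? "blocked"
    : findings.length > 0 ? "caution" : "appears-indexable";
  return { verdict, findings };
}
```

Ordering the findings by severity is what turns the raw signals into a "fix this first" list rather than a wall of equally weighted warnings.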
It completes the crawl-discovery suite. The robots.txt checker answers whether a crawler may request the URL. The sitemap validator checks what the site asks crawlers to discover. The Indexability Checker looks at the page-level signals that decide whether that discovered URL is even a candidate for search.
What I Built
- Astro tool page with responsive indexability checker UI
- Dedicated Next.js API endpoint deployed on Vercel
- Manual redirect-chain fetcher with final URL and status reporting
- Robots.txt fetching, parsing, and user-agent rule matching (sketched after this list)
- Meta robots and X-Robots-Tag noindex detection
- Canonical extraction with self-referencing vs canonicalized-elsewhere verdicts
- Production CORS and private-address protections
- WebApplication schema, FAQ schema, breadcrumbs, OG image, and share strip
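For the user-agent rule matching called out above, a simplified sketch of group selection and longest-match evaluation, without wildcard support. The parser structure and names are assumptions, not the production code; the longest-match, Allow-wins-ties behaviour follows the documented robots.txt convention.

```ts
interface RobotsRule { type: "allow" | "disallow"; path: string; }
interface RobotsGroup { agents: string[]; rules: RobotsRule[]; }

function parseRobots(text: string): RobotsGroup[] {
  const groups: RobotsGroup[] = [];
  let current: RobotsGroup | null = null;

  for (const raw of text.split(/\r?\n/)) {
    const line = raw.split("#")[0].trim(); // drop comments and surrounding whitespace
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();

    if (key === "user-agent") {
      // Consecutive User-agent lines share one group; a new group starts after rules.
      if (!current || current.rules.length) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
    } else if ((key === "allow" || key === "disallow") && current) {
      current.rules.push({ type: key, path: value });
    }
  }
  return groups;
}

function isAllowed(groups: RobotsGroup[], userAgent: string, path: string): boolean {
  const ua = userAgent.toLowerCase();
  // Prefer the group that names this crawler; otherwise fall back to the "*" group.
  const group =
    groups.find((g) => g.agents.some((a) => a !== "*" && ua.includes(a))) ??
    groups.find((g) => g.agents.includes("*"));
  if (!group) return true;

  // Longest matching rule wins; Allow wins a tie. An empty Disallow matches nothing.
  let best: RobotsRule | null = null;
  for (const rule of group.rules) {
    if (!rule.path || !path.startsWith(rule.path)) continue;
    if (
      !best ||
      rule.path.length > best.path.length ||
      (rule.path.length === best.path.length && rule.type === "allow")
    ) {
      best = rule;
    }
  }
  return !best || best.type === "allow";
}
```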
Need a technical SEO tool or audit workflow built?
I build practical SEO systems that do one useful job clearly, then wire them into the site, schema, analytics, and conversion path around them.
Book Your SEO Health Check →

Want work like this?
Whether you need a technical audit, a public-facing tool, or a workflow that turns messy SEO judgment into a repeatable system, I would love to hear what you are building.