Crawlable vs Indexable vs Indexed

Use these terms as a diagnostic sequence, not a vocabulary quiz. First ask whether the URL can be crawled. Then ask whether it is allowed to be indexed. Then ask whether Google actually chose to store it.

The short version

Crawlable, indexable, and indexed are three different checkpoints. A page has to clear the earlier ones before it has a fair shot at the last one. If your service page is missing from Google, do not start by rewriting every sentence. Start by finding the checkpoint it failed.

Signal	Plain-English meaning	Common failure
Crawlable	Search engines are allowed to request the URL.	Robots.txt blocks the page, the server fails, or redirects loop.
Indexable	The page can be considered for search results.	A noindex tag, canonical tag, or redirect points Google away.
Indexed	Google actually added the page to its search index.	The page is too weak, duplicate, orphaned, new, or not trusted enough yet.

The part that trips people up

A page can be crawlable and indexable without being indexed. That is the maddening part. Sometimes Google can access the page and simply decides it is not useful enough to store.

What crawlable means

A crawlable URL is a URL search engines can request. That sounds basic until a launch goes sideways and the money pages are technically live but quietly fenced off from crawlers. Customers can load the page. Googlebot cannot, or should not, according to the site's crawl-control signals.

Crawlability is mostly about access. Is the URL returning a clean response? Is robots.txt allowing the crawler? Are redirects working? Is the server responding? Can Google fetch the resources it needs to understand the page?

Robots.txt block: A rule like Disallow: /services/ can keep crawlers away from every service page under that folder.
Sitewide launch block: A leftover Disallow: / from staging tells every compliant crawler to stay away from the entire site. Painful, common, very rude.
Redirect problems: A page that redirects through multiple hops, loops, or lands on the wrong URL can waste crawl attention before Google ever evaluates the content.
Server errors: 5xx responses, timeouts, and intermittent hosting problems make a URL less reliable for crawlers.
Blocked resources: If important CSS or JavaScript is blocked, Google may not see the page the way a user does.
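
To see how the two robots.txt failures above behave, here is a minimal sketch using Python's standard-library robots.txt parser. The example.com URLs and the two rule sets are made up for illustration, and Python's parser may not match Google's own matching on every edge case, so treat the result as a first check rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets mirroring the two failures described above.
FOLDER_BLOCK = """\
User-agent: *
Disallow: /services/
"""

STAGING_LEFTOVER = """\
User-agent: *
Disallow: /
"""

print(is_allowed(FOLDER_BLOCK, "https://example.com/services/roof-repair"))  # False
print(is_allowed(FOLDER_BLOCK, "https://example.com/about"))                 # True
print(is_allowed(STAGING_LEFTOVER, "https://example.com/"))                  # False
```

Passing the robots.txt text in as a string keeps the check offline, so you can paste in the live file and test any URL against it.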

How to check crawlability

Start with the URL itself. Check the HTTP status, final destination after redirects, robots.txt permission for Googlebot, and whether the page can be fetched in Google Search Console's URL Inspection tool.
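
The status and final-destination check can be sketched as a small redirect tracer. This is an illustration, not a finished tool: the `fetch` argument is a hypothetical callable you supply that takes a URL and returns a (status code, Location header or None) pair without following redirects, and the fake site below stands in for real responses.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, fetch, max_hops=10):
    """Follow redirects one hop at a time and report where the URL ends up.

    `fetch` is a hypothetical callable: url -> (status_code, location_or_None).
    Returns (chain_of_urls, verdict).
    """
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        status, location = fetch(url)
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # resolve relative Location values
            if url in seen:
                return chain + [url], "loop"
            seen.add(url)
            chain.append(url)
            continue
        if status >= 500:
            return chain, "server error"
        if status == 200:
            return chain, "ok"
        return chain, f"status {status}"
    return chain, "too many hops"


# Hypothetical responses standing in for a real site.
FAKE_SITE = {
    "http://example.com/roofing": (301, "https://example.com/roofing"),
    "https://example.com/roofing": (301, "/services/roofing"),
    "https://example.com/services/roofing": (200, None),
}

chain, verdict = trace_redirects("http://example.com/roofing", FAKE_SITE.get)
print(verdict, len(chain) - 1)  # ok 2
```

Keeping the fetch step separate means the same logic works whether the responses come from a live HTTP client with redirect-following turned off or from a saved crawl export.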

What indexable means

An indexable page is not just crawlable. It is allowed to be considered for search results. This is where noindex tags, X-Robots-Tag headers, canonical tags, and redirects become the main characters.

Indexability answers the question: after Google gets to the page, does the page tell Google to keep it, ignore it, or treat a different URL as the real version?

Meta noindex: A tag like <meta name="robots" content="noindex"> tells search engines not to show the page in results.
X-Robots-Tag noindex: The same instruction can come from an HTTP header, which is easy to miss if you only look at page source.
Canonical points elsewhere: A service page that canonicalizes to the homepage is basically telling Google, "do not treat me as the main page." Please do not let your service pages do this to themselves.
Redirected URL: If the URL redirects, the original URL is not the page Google is being asked to index.
Conflicting signals: A URL in the sitemap that also has noindex, is blocked by robots.txt, or canonicalizes elsewhere is sending mixed instructions.

Indexable means eligible. It does not mean accepted.
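
The indexability signals above can be checked from two inputs you already have once a page is fetched: the HTML and the response headers. The sketch below uses only the standard library. The function name, the example.com URLs, and the sample markup are assumptions for illustration, and the matching is deliberately simple: it looks for the word noindex and does not handle bot-specific header syntax or the "none" directive.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _HeadSignals(HTMLParser):
    """Collect meta robots values and the first rel=canonical href."""

    def __init__(self):
        super().__init__()
        self.robots = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        if tag == "meta" and a.get("name", "").lower() in ("robots", "googlebot"):
            self.robots.append(a.get("content", "").lower())
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            if self.canonical is None and a.get("href"):
                self.canonical = a["href"]


def indexability_issues(url, html, headers):
    """Return a list of signals that point Google away from `url`."""
    parser = _HeadSignals()
    parser.feed(html)
    issues = []
    if any("noindex" in value for value in parser.robots):
        issues.append("meta noindex")
    # Header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            issues.append("X-Robots-Tag noindex")
            break
    if parser.canonical:
        canonical = urljoin(url, parser.canonical)
        if canonical.rstrip("/") != url.rstrip("/"):
            issues.append(f"canonical points to {canonical}")
    return issues


# Hypothetical service page that canonicalizes to the homepage.
PAGE = """<html><head>
<meta name="robots" content="noindex">
<link rel="canonical" href="https://example.com/">
</head><body>Roofing</body></html>"""

print(indexability_issues(
    "https://example.com/services/roofing", PAGE, {"X-Robots-Tag": "noindex"}
))
```

An empty list means none of these three signals is blocking the page. It says nothing about whether Google will actually index it.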

What indexed means

An indexed page is a page Google has actually stored in its index and can show in search results. This is the part people usually mean when they say, "Google can't find my page," but it is only the final checkpoint.

A page can be crawlable and indexable and still not indexed. That is usually where the diagnosis shifts from "is Google allowed to see this?" to "why would Google want to keep this?"

Thin or generic content: A service page that says the same thing every competitor says gives Google very little reason to store it.
Near-duplicate pages: Ten location pages with swapped city names and no real local value may look indexable but not worth indexing.
Weak internal links: If no important page links to it, Google may treat the page as low priority.
Orphaned URLs: A URL listed in a sitemap but not linked from the site has discovery help, but little internal context.
Poor canonical patterns: If several similar URLs compete with each other, Google may choose one and ignore the rest.
Low site trust or crawl priority: Newer, smaller, or messy sites can see slower indexing, especially for lower-value pages.

For a small service business, this often shows up as a technically fine page that still reads like a brochure. It has the service name, a phone number, and maybe a few paragraphs, but no proof, no specificity, no local context, no answers to real buyer questions, and no reason to choose that page over the other fifty pages Google already knows.

How Google Search Console statuses map to this

Two Google Search Console messages fit perfectly into this three-stage model: Discovered - currently not indexed and Crawled - currently not indexed. If you have seen either one in the Pages report and felt personally attacked, fair. But they do mean different things.

GSC status	Where it fits	What it usually means
Discovered - currently not indexed	Google knows the URL exists, but has not crawled it yet.	The URL may be low priority, poorly linked, one of many similar URLs, in a crawl queue, or part of a larger crawl-budget issue.
Crawled - currently not indexed	Google crawled the page, but did not add it to the index.	The page may be duplicate, thin, low value, soft-404-like, canonicalized oddly, or simply not strong enough compared with what Google already has.

Discovered - currently not indexed is usually a discovery and priority problem. Google found the URL somewhere, such as a sitemap or link, but has not spent the crawl on it yet. For a large site, that can become a crawl budget conversation. For a small business site, it is more often weak internal linking, too many low-value URLs, or a page that sits too far from the pages Google already trusts.

Crawled - currently not indexed is the tougher message. Google visited. It looked. It still did not keep the page. That does not automatically mean the page is terrible, but it does mean you should inspect quality, duplication, canonical signals, and internal links before repeatedly smashing the "request indexing" button and hoping the universe gets tired.

Why the words matter

If you treat every missing page like an indexing problem, you waste time. A blocked robots.txt rule needs a different fix than a thin service page. A canonical pointing somewhere else needs a different fix than a brand-new page waiting for discovery.

That is why the first question should not be "why is this not ranking?" It should be "where did this URL fall out of the process?" Otherwise you end up polishing a page Google was never allowed to keep in the first place.

If it is not crawlable: fix access first. Google cannot evaluate what it cannot request.
If it is crawlable but not indexable: check noindex, canonical, redirects, and HTTP headers.
If it is indexable but not indexed: look at content quality, internal links, duplication, and whether the page deserves to exist as its own result.
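
The same triage can be written as a tiny function that returns the first checkpoint a URL fails. Everything here is a hypothetical structure for illustration: the `UrlSignals` fields are values you would gather yourself, and `indexed` would come from something like the URL Inspection tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UrlSignals:
    """Hypothetical bundle of facts gathered for one URL."""
    status: int              # final HTTP status code
    robots_allowed: bool     # robots.txt permits Googlebot
    noindex: bool            # meta robots or X-Robots-Tag says noindex
    canonical_is_self: bool  # canonical points at this same URL
    indexed: bool            # Google reports the URL as indexed


def failed_checkpoint(s: UrlSignals) -> Optional[str]:
    """Return the first checkpoint the URL fails, or None if it clears all three."""
    if not s.robots_allowed or s.status >= 500:
        return "crawlable"   # fix access first
    if s.status != 200 or s.noindex or not s.canonical_is_self:
        return "indexable"   # check noindex, canonical, redirects, headers
    if not s.indexed:
        return "indexed"     # look at quality, links, duplication
    return None


print(failed_checkpoint(UrlSignals(200, True, False, True, False)))  # indexed
```

The order of the checks is the point: a content rewrite only makes sense once the function gets past the first two branches.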

How to diagnose one URL

Start with the exact page, not the whole website. Copy the exact URL from the browser's address bar and check the signals in this order. This keeps you from treating a crawl problem like a content problem or a content problem like a crawl problem.

Check the HTTP status. A clean indexable page should usually return 200. If it redirects, check the final URL. If it errors, fix the server or URL first.
Check robots.txt. Make sure the exact URL is allowed for Googlebot. A blocked page may still be known to Google, but Google cannot properly evaluate the content.
Check noindex. Look for both meta robots tags and X-Robots-Tag headers. The header version is the one that hides in plain sight.
Check canonical. The canonical should usually point to the preferred version of that same page. If it points to another URL, Google may follow that preference.
Check sitemap consistency. If the URL is in the sitemap, it should be the final canonical URL, not an HTTP version, redirected version, noindex page, or duplicate variant.
Check internal links. A page linked only from the sitemap is not getting the same internal trust as a page linked from navigation, service hubs, related articles, or location pages.
Check content value. Ask whether the page answers a distinct search need or just exists because someone made a template and kept clicking duplicate.
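
The sitemap consistency step is easy to automate. Here is a minimal sketch that parses a standard XML sitemap and flags entries that are HTTP versions or that canonicalize elsewhere. It assumes the site is served over HTTPS, and the `canonical_of` mapping is a hypothetical input you collect separately from each page; the sample sitemap is made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def flag_sitemap_entries(urls, canonical_of):
    """Map each problem URL to the reasons it should not be in the sitemap.

    `canonical_of` is a hypothetical dict: page URL -> canonical URL it declares.
    """
    problems = {}
    for url in urls:
        issues = []
        if urlsplit(url).scheme != "https":
            issues.append("not https")
        canonical = canonical_of.get(url)
        if canonical is not None and canonical != url:
            issues.append("canonicalizes elsewhere")
        if issues:
            problems[url] = issues
    return problems


SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/services/roofing</loc></url>
  <url><loc>http://example.com/services/gutters</loc></url>
  <url><loc>https://example.com/locations/austin</loc></url>
</urlset>"""

CANONICALS = {
    "https://example.com/locations/austin": "https://example.com/locations/texas",
}

for url, issues in flag_sitemap_entries(sitemap_urls(SITEMAP), CANONICALS).items():
    print(url, issues)
```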

Google Search Console is the source of truth for Google specifically. The URL Inspection tool can tell you whether Google has seen the page, what canonical Google selected, and whether the URL is on Google. A third-party checker is useful before or alongside that because it shows the public signals fast.

How to diagnose the pattern across a site

One weird URL is a cleanup task. Dozens or hundreds of similar URLs are a pattern. That is where you stop inspecting pages one at a time and start asking what kind of pages Google is skipping.

Sort by page type. Are the missing URLs service pages, blog posts, tag pages, filtered URLs, location pages, or old redirected paths?
Compare sitemap coverage. If the sitemap lists pages that are noindex, redirected, canonicalized elsewhere, or blocked, clean the sitemap before blaming Google.
Look for orphaned pages. Important pages should have internal links from relevant pages, not just a sitemap mention.
Check internal link depth. If a page takes six clicks to find, Google may treat it as less important than a page linked from the main service hub.
Group duplicate intent. If five URLs all target the same basic query, choose the strongest version and consolidate or differentiate the rest.
Watch crawl budget on large sites. Most small service sites do not have true crawl budget problems, but large sites with thousands of parameter, filter, or duplicate URLs absolutely can.
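
Orphaned pages and link depth both fall out of the same breadth-first walk over the site's internal links. The sketch below assumes you already have a link graph as a dict of page to linked pages, for example from a crawl export. The paths and function names are made up for illustration.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: fewest clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphans(sitemap_urls, links, start):
    """Sitemap URLs that no internal link path from `start` ever reaches."""
    reachable = click_depths(links, start)
    return sorted(url for url in sitemap_urls if url not in reachable)


# Hypothetical internal link graph: page -> pages it links to.
LINKS = {
    "/": ["/services", "/about"],
    "/services": ["/services/roofing"],
    "/services/roofing": ["/locations/austin"],
}
SITEMAP_URLS = ["/services/roofing", "/services/gutters", "/locations/austin"]

print(click_depths(LINKS, "/")["/locations/austin"])  # 3
print(orphans(SITEMAP_URLS, LINKS, "/"))              # ['/services/gutters']
```

Sorting the depth map by value gives a quick list of the pages buried deepest, which is usually a better place to start than inspecting URLs one by one.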

Small-site translation

If your site has 20 pages, "crawl budget" is probably not the villain. If Google is skipping important pages, look first at noindex, canonical tags, redirects, internal links, sitemap quality, and whether the page actually deserves to exist.

What to fix first

Do not rewrite the content until you know the page can be crawled, can be indexed, and is not telling Google to prefer a different URL.

Fix hard blockers first: server errors, accidental redirects, robots.txt blocks, and noindex tags. A page that returns a 500 error or says noindex does not need a pep talk. It needs the blocker removed.

Then fix mixed signals. A bad canonical pattern looks like a service page pointing its canonical tag to the homepage, an HTTP URL in the sitemap that redirects to HTTPS, or a city page canonicalizing to a generic statewide page even though it is meant to rank on its own. Those are not tiny technical quirks. They tell Google which URL you think matters.

After that, clean discovery signals. If the sitemap lists redirected URLs, replace them with the final destination URLs. If internal links still point to old paths, update them. If the page is important, link to it from a relevant page that already has authority.

Only after the technical signals are clean should you judge the content. For a service business, the most common pattern is not a catastrophic technical issue. It is usually a page that technically can be indexed but looks too similar to every other service page on the site. Google may see it, understand it, and still leave it out because it does not add enough value.

Add specific services, markets, FAQs, proof, process details, and examples from real jobs or client situations.
Make the page internally linked from related articles, service pages, and location pages.
Consolidate pages that target the same intent instead of making Google choose between thin duplicates.
Request indexing in Search Console after meaningful changes, not after every tiny wording tweak.

FAQ

Is crawlable the same as indexable?

No. Crawlable means search engines can request the URL. Indexable means the page is allowed to be considered for search results. A page can be crawlable but not indexable if it has a noindex tag or its canonical points elsewhere.

Can a page be indexable but not indexed?

Yes. Indexable only means the page is eligible. Google still decides whether the page is useful, unique, trustworthy, and worth storing in its index.

Is a site: search enough to check indexing?

A site: search is a quick clue, not a final answer. Use Google Search Console URL Inspection for a specific URL, and use a technical checker to review crawl and indexability signals.

What does Crawled - currently not indexed mean in Google Search Console?

It means Google crawled the page but has not added it to the index. Check for duplicate or thin content, weak internal links, poor canonical signals, soft-404-like pages, and whether the page adds enough distinct value to deserve indexing.

What does Discovered - currently not indexed mean?

It means Google knows the URL exists but has not crawled it yet, so it cannot be indexed yet either. The URL may be low priority, weakly linked, one of many similar URLs, or part of a larger crawl-priority or crawl-budget issue.

What is crawl budget and does it affect indexing?

Crawl budget is the amount of time and resources Google devotes to crawling a site. It matters most for large sites with many URLs. Most small service business sites should fix crawl blockers, sitemap quality, internal links, and page value before worrying about crawl budget.

How is robots.txt different from noindex in practice?

Robots.txt controls whether crawlers should request a URL. Noindex tells search engines not to show a crawlable page in search results. The two do not combine well: if robots.txt blocks the page, Google cannot crawl it, so it never sees a noindex tag on that page. For noindex to work, the page has to stay crawlable.

How long does it take for a new page to get indexed?

It can take anywhere from days to weeks, and sometimes longer for low-priority or weakly linked pages. Submitting the URL in Google Search Console can help discovery, but it does not guarantee indexing.