Fix Live GSC Errors: Crawled But Not Indexed, Discovered But Not Indexed & All Coverage Issues

GSC Error Statistics:
  • 78% of websites have "Crawled but not indexed" pages
  • Average site loses 15-30% potential traffic to indexing issues
  • Fix time: 1-4 weeks depending on issue severity
  • 50%+ of pages marked "noindex" are actually index-worthy

Understanding GSC Coverage Status Overview

Google Search Console shows pages in different status categories. Understanding these distinctions is critical to fixing indexing problems.

GSC Coverage Status Hierarchy

| Status | Google Found? | Google Crawled? | Google Indexed? | Shows in Search? | Priority |
|---|---|---|---|---|---|
| Valid | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | None |
| Valid with warning | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | Low |
| Crawled, not indexed | ✅ Yes | ✅ Yes | ❌ No | ❌ No | Medium |
| Discovered, not indexed | ✅ Yes | ❌ No | ❌ No | ❌ No | Medium |
| Error (5xx, 404, 403) | ✅ Yes | ❌ No | ❌ No | ❌ No | Critical |
| Excluded (noindex) | ✅ Yes | ✅ Yes | ❌ No | ❌ No | High |

The Google Indexing Timeline

  1. Discovery: Google finds your URL (from links, sitemap, etc.)
  2. Crawl Queue: Page added to the crawl queue (takes hours to weeks)
  3. Crawl: Googlebot fetches the page and analyzes content
  4. Evaluation: Google decides if page quality is good enough
  5. Indexing: Page added to the index (not guaranteed!)
  6. Ranking: Page can appear in search results

💡 Key Insight: A page can reach step 3 (crawled) but fail step 5 (indexing). This is the "Crawled But Not Indexed" problem that affects millions of websites.

Issue #1: Crawled But Not Indexed (Most Common)

What it means: Google successfully crawled your page but decided not to add it to the search index. Page won't appear in search results.

Why Google Doesn't Index Crawled Pages

| Reason | Frequency | Fixable? | Fix Difficulty |
|---|---|---|---|
| Low-quality or thin content (<300 words) | 35% | ✅ Yes | Easy |
| Duplicate of another page | 25% | ✅ Yes | Easy |
| Unintentional noindex tag | 15% | ✅ Yes | Easy |
| Very new page (crawled but not yet evaluated) | 15% | ✅ Yes | Wait 2-4 weeks |
| Redirect chain or issues | 5% | ✅ Yes | Medium |
| Server too slow (crawl budget issue) | 5% | ✅ Yes | Hard |

Step-by-Step Solution for "Crawled But Not Indexed"

Step 1: Verify the Issue in GSC

  1. Go to Indexing → Coverage in GSC
  2. Find and click the "Crawled, not indexed" row
  3. Note the number of affected pages
  4. Click an affected URL to see details

Step 2: Check for Noindex Tag (Accidental)

✅ Solution:
  1. Visit the page in your browser
  2. Right-click → View Page Source
  3. Search for: noindex
  4. Look for a line like:

<meta name="robots" content="noindex">
<meta name="googlebot" content="noindex">

  5. If found and it shouldn't be there, remove it
  6. Also check your robots.txt file (example.com/robots.txt)
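If you'd rather script this check for many pages, here is a minimal sketch (Python standard library only; the function and variable names are ours, not part of any tool). It scans HTML for robots/googlebot meta tags and the response headers for a server-level X-Robots-Tag:

```python
import re

def find_noindex(html, headers):
    """Return the places where a noindex directive was found."""
    hits = []
    # <meta name="robots"> or <meta name="googlebot"> tags containing "noindex"
    for m in re.finditer(r'<meta[^>]+name=["\'](robots|googlebot)["\'][^>]*>',
                         html, re.I):
        if "noindex" in m.group(0).lower():
            hits.append("meta %s tag" % m.group(1).lower())
    # X-Robots-Tag HTTP header set at the server level
    if "noindex" in headers.get("X-Robots-Tag", "").lower():
        hits.append("X-Robots-Tag header")
    return hits

print(find_noindex('<meta name="robots" content="noindex">', {}))
# -> ['meta robots tag']
```

To run it against a live page, fetch the HTML with urllib.request.urlopen and pass in dict(response.headers).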

Step 3: Analyze Page Quality

📋 Content Quality Checklist:
  • Word count: At least 300 words (preferably 800+)
  • Unique content: Is it duplicate of another page?
  • Value proposition: Does it answer user's search query?
  • Freshness: When was it last updated?
  • Authority: Does it have credible sources/links?
  • User engagement: Is it engaging or boring?
✅ If Content is Thin:
  1. Expand page to 800+ words minimum
  2. Add detailed explanations
  3. Add examples and use cases
  4. Add visuals (images, videos, charts)
  5. Add internal links to related content
  6. Wait 1-2 weeks for re-crawl

Step 4: Check for Duplicate Content

✅ Test for Duplicates:
  1. Copy unique sentence from the page (10+ words)
  2. Search Google: "exact phrase in quotes"
  3. If same content appears on other sites, it's duplicate
  4. Solution: Add more original content, or use canonical tag
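To compare two pages you already suspect are duplicates, a rough similarity score can help. This sketch uses word shingles and Jaccard overlap; the 5-word shingle size and whatever "duplicate" threshold you apply are illustrative choices, not anything Google publishes:

```python
import re

def shingles(text, k=5):
    """Break text into overlapping k-word shingles (lowercased)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 0))}

def similarity(a, b):
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

Scores approaching 1.0 mean near-identical body text; pair the score with the quoted-phrase Google search for confirmation.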

Using Canonical Tag for Duplicate Pages:

<link rel="canonical" href="https://example.com/original-page/" />

Add this to the <head> section of the page.

Step 5: Check Page Load Speed

✅ Test Speed:
  1. Go to PageSpeed Insights (pagespeed.web.dev)
  2. Enter the URL
  3. Check Mobile score (should be 50+)
  4. Check Core Web Vitals (LCP < 2.5s)
  5. If slow, optimize images and code

Step 6: Request Revalidation

✅ After Fixing Issues:
  1. Go back to GSC Coverage report
  2. Click on "Crawled, not indexed"
  3. Select affected URLs
  4. Click "Validate Fix" button
  5. Google will re-crawl within 24-48 hours

Real-World Case Study: E-Commerce Website

The Problem:

Fashion retailer had 2,000 product pages marked "Crawled, not indexed" despite good sales.

Root Cause:

Product pages had only 150 words (product title, price, "Add to cart" button). Google thought content was too thin.

The Fix:

  • Added 500-word product descriptions
  • Added customer reviews section
  • Added "Size Guide" and care instructions
  • Added related product recommendations

Results:

✅ Within 3 weeks, 80% of pages indexed
✅ Within 2 months, 95% of pages indexed
✅ Organic traffic increased 45%
✅ Search visibility increased by 2X

Issue #2: Discovered But Not Indexed (Second Most Common)

What it means: Google found your page (from your sitemap, internal links, or external links) but hasn't crawled it yet. The page is in a queue, waiting to be fetched and evaluated.

Why This Status Occurs

  • Very New Pages: Recently created, Google hasn't gotten to it yet
  • Low Authority Site: Google crawls authority sites more frequently
  • Low Crawl Budget: Too many pages on site, Google can't crawl all
  • Internal Link Depth: Page is 4+ clicks from homepage
  • Page Priority: Marked low priority in sitemap
  • Server Speed: Slow responses delay crawling

Step-by-Step Solution

Step 1: Verify Status

✅ Check GSC:
  1. Go to Indexing → Coverage
  2. Click on "Discovered, not indexed" section
  3. Note how many pages and which ones
  4. Check when page was created vs. when it appeared in GSC

Step 2: Add Internal Links

Google crawls pages it can find from links. If page isn't indexed yet, add internal links from popular pages.

✅ Implementation:
  1. From homepage, add visible link to the new page
  2. From category/section page, add link
  3. From related content pages, add links
  4. Use descriptive anchor text (not "click here")

Example of Good Internal Linking:

<!-- Homepage or category page -->
<a href="/new-product/">New Product: Advanced Features Explained</a>

<!-- Related article -->
<p>For more information, see our guide on
<a href="/new-product/">how to use this product</a>.</p>
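Pages that no other page links to ("orphans") are the hardest for Google to discover. If you already have crawl data, finding them is a set difference; the link graph below is a hypothetical example of page → outgoing internal links:

```python
def orphan_pages(link_graph):
    """Return pages that no other page links to.
    link_graph: dict mapping each page URL to the set of internal
    URLs it links out to. The homepage will always appear; ignore it."""
    linked_to = set()
    for targets in link_graph.values():
        linked_to |= targets
    return set(link_graph) - linked_to

site = {
    "/": {"/products/", "/blog/"},
    "/products/": {"/products/blue-shoes/"},
    "/products/blue-shoes/": set(),
    "/blog/": set(),
    "/new-product/": set(),  # nothing links here yet
}
print(orphan_pages(site))  # "/" (homepage) and "/new-product/" are orphans
```

Any orphan that matters should get an internal link from the homepage or a relevant category page, as described above.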

Step 3: Request Immediate Crawl in GSC

✅ Use URL Inspection Tool:
  1. In GSC, use the URL Inspection tool (top search bar)
  2. Paste the URL of page not indexed
  3. Click "Request Indexing" button
  4. Page added to crawl queue
  5. Typically crawled within 24-48 hours

Step 4: Improve Crawl Budget

📊 What is Crawl Budget?
  • Maximum number of pages Googlebot will crawl per day
  • Depends on site authority and server response time
  • Rule of thumb: crawl rate × crawl time allotted = crawl budget
  • Example: 10 pages/second × 50 seconds of crawling per day = 500 pages/day
✅ To Increase Crawl Budget:
  1. Improve Server Response Time:
    • Target: TTFB (Time to First Byte) under 600ms
    • Use PageSpeed Insights to measure
    • Enable caching, use CDN, optimize database
  2. Remove Crawl Waste:
    • Block low-value pages with robots.txt
    • Remove duplicate pages
    • Stop crawling pagination pages
    • Block parameter pages (filters, sorting)
  3. Reduce URL Duplication:
    • Use canonical tags
    • Remove tracking parameters
    • Consolidate similar content
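A quick way to approximate TTFB without external tools is to time a single request from the standard library. This is a rough sketch (one request, so run it a few times and average; the URL is a placeholder):

```python
import time
import urllib.request

def ttfb_ms(url):
    """Rough time-to-first-byte in milliseconds for one request."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as resp:
        resp.read(1)  # block until the first byte of the body arrives
    return (time.monotonic() - start) * 1000

# ttfb_ms("https://example.com/")  # network call; aim for under ~600 ms
```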

Step 5: Update Sitemap Priority

✅ Sitemap Priority Guide:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <priority>1.0</priority> <!-- Homepage -->
  </url>
  <url>
    <loc>https://example.com/important-page/</loc>
    <priority>0.8</priority> <!-- High-value pages -->
  </url>
  <url>
    <loc>https://example.com/new-product/</loc>
    <priority>0.9</priority> <!-- New pages (boost priority) -->
  </url>
  <url>
    <loc>https://example.com/archive/old-post/</loc>
    <priority>0.3</priority> <!-- Low-value pages -->
  </url>
</urlset>

Notes on Priority:

  • The default is 0.5, and Google says it largely ignores the priority field
  • Priority is a hint relative to your own pages, not a ranking signal
  • Set new/important pages to 0.8-1.0
  • Set archive/old pages to 0.3-0.5

Step 6: Patience + Follow-Up

⏱️ Timeline Expectations:
  • New low-authority site: 2-8 weeks to first index
  • Established site, new page: 3-14 days
  • High-authority site: 1-3 days
  • After requesting indexing: 24-48 hours typical

Issue #3: Excluded by Noindex Tag

What it means: Google crawled the page but found noindex tag, so deliberately excluded it from search index.

Legitimate Reasons to Use Noindex

  • Duplicate pages (use with canonical tag instead)
  • Temporary pages (coming soon, under maintenance)
  • Thin/auto-generated content
  • Admin/staff pages
  • Test/staging pages
  • Affiliate/sponsored content flagged

Why Pages Get Accidental Noindex

| Reason | How It Happens | Fix |
|---|---|---|
| Template error | Page copied from a template that still has noindex | Remove the tag from that page |
| CMS setting | Default CMS setting applies to all pages | Change the CMS setting, override per page |
| WordPress plugin | Yoast SEO or another plugin set it wrong | Change the plugin settings |
| Meta robots in code | Developer added noindex "for testing" | Remove it from the code |
| X-Robots-Tag header | Server sending a noindex header | Remove it from the server config |

Step-by-Step Solution

Step 1: Identify Accidentally Noindexed Pages

✅ In GSC:
  1. Go to Indexing → Coverage
  2. Click "Excluded" section
  3. Select "Excluded by 'noindex' tag"
  4. Review list of excluded pages
  5. Identify which ones should be indexed

Step 2: Find the Noindex Tag

✅ Check HTML Source:
  1. Visit the page in browser
  2. Press Ctrl+U (or Cmd+U on Mac) to view source
  3. Search for noindex using Ctrl+F
  4. Look for one of these:
<!-- In <head> section -->
<meta name="robots" content="noindex">
<meta name="robots" content="noindex, follow">
<meta name="googlebot" content="noindex">
<meta name="robots" content="noindex, nofollow">

Step 3: Fix the Noindex Tag

✅ Different CMS Solutions:

WordPress (with Yoast SEO):

  1. Edit the post/page
  2. Go to Yoast SEO box (bottom)
  3. Click "Advanced"
  4. Find "Allow search engines to show this page in search results?"
  5. Change to "Yes"
  6. Update page

Manual HTML Edit:

  1. Find the noindex tag
  2. Delete it entirely, OR
  3. Change to: <meta name="robots" content="index, follow">
  4. Save and deploy

In Code (WordPress functions.php):

// Control noindex per page via the wp_robots filter (WordPress 5.7+).
// Page IDs 123, 456, and 789 are examples; replace with your own.
add_filter( 'wp_robots', function ( $robots ) {
    if ( is_page( array( 123, 456 ) ) ) {
        unset( $robots['noindex'] ); // these pages stay indexable
    } elseif ( is_page( 789 ) ) {
        $robots['noindex'] = true;   // noindex only page 789
    }
    return $robots;
} );

Step 4: Check for Server-Level Noindex

Sometimes noindex is sent as HTTP header (not in HTML tag). Check if server is sending it:

✅ Test in GSC:
  1. Open URL Inspection tool
  2. Enter the URL
  3. Click "View Crawled Page"
  4. Scroll down to see HTTP Headers
  5. Look for: X-Robots-Tag: noindex

If found, it's set at server level. Fixes:

.htaccess method (Apache):

# REMOVE THIS if it exists:
<IfModule mod_headers.c>
Header set X-Robots-Tag "noindex"
</IfModule>

Nginx method:

# Remove or modify this line:
add_header X-Robots-Tag "noindex";

Contact your hosting provider or developer if you're unsure.

Step 5: Revalidate in GSC

✅ After Removing Noindex:
  1. Go back to GSC Coverage report
  2. Verify page no longer shows in "Excluded" section
  3. Use URL Inspection to verify noindex tag gone
  4. Click "Request Indexing"
  5. Google will crawl within 24-48 hours

Case Study: WordPress Site with Plugin Mishap

The Scenario:

Blog owner installed Yoast SEO plugin, default setting was "no index" for categories.

Result:

60% of site's category pages had noindex tag. Lost 70% of search traffic.

The Fix:

  • Accessed Yoast settings → Search Appearance
  • Changed "Category Archives" from "Don't let search engines show these pages" to "Let search engines show these pages"
  • Removed noindex from 200+ category pages (bulk)
  • Requested indexing in GSC

Timeline to Recovery:

✅ 1 week: 30% of categories indexed
✅ 2 weeks: 80% indexed
✅ 4 weeks: 98% indexed
✅ Traffic recovered to previous levels

Issue #4: Alternate Page with Proper Canonical Tag

What it means: This page has a canonical tag pointing to another page. Google indexed the canonical version instead of this alternate version.

When This is OK (vs. Problem)

✅ This is CORRECT:

  • Product page with parameters (size, color)
  • Printer-friendly version
  • Mobile version (if separate)
  • Paginated series
  • Product with different price region

❌ This is PROBLEM:

  • Important page pointing to wrong canonical
  • Canonical pointing to itself (loop)
  • Canonical with noindex tag
  • Canonical with redirect
  • Self-created duplicate with wrong canonical

Checking Your Canonical Tags

✅ Audit Canonical Tags:
  1. Go to URL Inspection in GSC
  2. Enter a page URL
  3. Under "Coverage", check if says "Indexed" or "Alternate with canonical"
  4. If "Alternate", see which URL is canonical
  5. Verify canonical is the "main" version

Or check HTML source:

<!-- In <head> section -->
<link rel="canonical" href="https://example.com/main-version/">
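For auditing many saved pages, the canonical can be pulled out programmatically. This regex-based sketch assumes rel appears before href inside the tag; a production audit tool should use a real HTML parser:

```python
import re

def extract_canonical(html):
    """Return the canonical URL declared in the HTML, or None."""
    m = re.search(
        r'<link[^>]+rel=["\']canonical["\'][^>]*href=["\']([^"\']+)["\']',
        html, re.I)
    return m.group(1) if m else None

print(extract_canonical(
    '<link rel="canonical" href="https://example.com/main-version/">'))
# -> https://example.com/main-version/
```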

Common Canonical Problems & Fixes

Problem #1: Canonical Points to Wrong Page

  1. Issue: Page A's canonical points to Page C, but it should point to Page B
  2. Check: In URL Inspection, see which page Google chose as canonical
  3. Fix: Update the canonical tag to point to the correct page

Problem #2: Self-Canonical on an Alternate Page

✅ Fix Misplaced Self-Referential Canonicals:

BAD - alternate page declares itself canonical:

<!-- On page A, which is a duplicate of another page -->
<link rel="canonical" href="https://example.com/page-a/">

GOOD - the alternate points to the canonical version:

<!-- On page A (alternate) -->
<link rel="canonical" href="https://example.com/page-a-canonical/">

<!-- On the canonical version (pointing to itself is fine) -->
<link rel="canonical" href="https://example.com/page-a-canonical/">

Note: It's OK (and recommended) for the canonical page to point to itself. It's NOT OK for an alternate page to declare itself canonical, because Google then sees competing "canonical" signals for the same content.

Problem #3: Canonical to Noindexed Page

  1. Issue: Page A's canonical points to Page B, but Page B has a noindex tag
  2. Result: Page A is never indexed (it follows a broken canonical signal)
  3. Fix: Either remove noindex from Page B, or change the canonical on Page A

When to Use Canonical (Best Practices)

| Scenario | Use Canonical? | Example |
|---|---|---|
| Product with multiple filters/params | ✅ Yes | URL with size=L, color=red → canonical to base product URL |
| Pagination (page 2, page 3) | ❌ No | Each page should self-canonicalize (Google no longer uses rel="next"/rel="prev") |
| HTTP vs HTTPS | ✅ Yes | HTTP version canonical to HTTPS version |
| WWW vs non-WWW | ✅ Yes | www version canonical to non-www (or vice versa) |
| Mobile vs Desktop | ❌ No | Use responsive design instead |
| User-facing duplicates | ✅ Yes | Printer-friendly → main version |

Issue #5: Duplicate Without User-Selected Canonical

What it means: Multiple versions of same content exist, but none has canonical tag. Google has to guess which version is "main".

How Duplicates Happen

  • Parameter variations: Same product with different URL parameters
  • Protocol variations: Both HTTP and HTTPS versions accessible
  • WWW variants: Both www.example.com and example.com
  • Session IDs: URLs with session tracking parameters
  • Pagination: Same content on multiple paginated pages
  • Syndicated content: Same article on multiple sites
  • Print versions: Print-friendly version of regular page

Step-by-Step Solution

Step 1: Identify All Duplicates

✅ Find Duplicates Using:
  1. Screaming Frog SEO Spider: Crawl site, find duplicate content
  2. Google Search Console: Use URL Inspection on suspicious URLs
  3. Site search: Search Google: site:example.com "unique phrase from page"
  4. Analytics: Look for similar pages with lower traffic

Step 2: Choose Canonical Version

💡 How to Pick Canonical:
  • HTTPS: Always prefer HTTPS over HTTP
  • WWW: Pick one consistently (www or non-www)
  • Simplest URL: Fewest parameters/tracking codes
  • Most popular: Version with most links/traffic
  • User-facing: Version users actually visit

Step 3: Add Canonical Tags

✅ Implementation:

On the MAIN/CANONICAL page:

<!-- Homepage -->
<link rel="canonical" href="https://example.com/">

<!-- Product page -->
<link rel="canonical" href="https://example.com/products/blue-shoes/">

On ALTERNATE/DUPLICATE pages:

<!-- HTTP version -->
<link rel="canonical" href="https://example.com/products/blue-shoes/">

<!-- WWW version -->
<link rel="canonical" href="https://example.com/products/blue-shoes/">

<!-- With parameters -->
<link rel="canonical" href="https://example.com/products/blue-shoes/">

<!-- Print version -->
<link rel="canonical" href="https://example.com/products/blue-shoes/">

Step 4: Redirect or Block Non-Canonical Versions

✅ For Different Protocols/Domains:

Redirect HTTP to HTTPS (.htaccess):

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Redirect non-www to www (.htaccess):

RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ https://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Robots.txt to block parameters:

User-agent: *
Disallow: /*?*utm_
Disallow: /*?*sid=
Disallow: /*?*session=

Step 5: Signal Your Preferred Domain

✅ Tell Google Your Preference:
  1. Note: the legacy "preferred domain" setting has been removed from Search Console
  2. Instead, 301-redirect the non-preferred version (www or non-www) to the preferred one
  3. Use canonical tags that consistently point at the preferred version
  4. Submit your sitemap under the preferred version's property

Issue #6: Blocked by Robots.txt Issues

What it means: Your robots.txt file tells Google NOT to crawl pages. If they're in your sitemap, Google is confused.

Why This Error Appears

Contradiction: You included URL in sitemap but blocked it in robots.txt. Google sees conflicting signals.

Step-by-Step Solution

Step 1: Review Your Robots.txt

✅ Check Current Rules:
  1. Go to example.com/robots.txt in browser
  2. Look for rules blocking important pages
  3. Check if important paths are blocked

Example robots.txt that causes issues:

User-agent: *
Disallow: /

# This blocks EVERYTHING!
# Google can't crawl anything

Better example:

User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /temp/
Allow: /

# Set a reasonable crawl delay (note: Googlebot ignores Crawl-delay)
Crawl-delay: 1

Sitemap: https://example.com/sitemap.xml

Step 2: Test Robots.txt in GSC

✅ Use GSC's Robots.txt Tester:
  1. Go to GSC Settings
  2. Open the robots.txt report (the legacy standalone Robots.txt Tester has been retired)
  3. Enter the URL path causing issue
  4. Should show "Allowed" for important URLs
  5. If shows "Blocked", fix robots.txt rules
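You can also test rules locally before deploying, using Python's standard-library robots.txt parser. The rules below are a hypothetical example mirroring the pattern above:

```python
import urllib.robotparser

rules = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
Allow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/products/"))   # True
print(rp.can_fetch("Googlebot", "https://example.com/admin/users"))  # False
```

Run important URLs through can_fetch after every robots.txt change; a single misplaced Disallow can block the whole site.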

Step 3: Fix Robots.txt Rules

📋 Robots.txt Best Practices:
  • Only block pages that truly shouldn't be indexed
  • Don't block CSS/JS files (needed for rendering)
  • Don't block images (Google needs to see them)
  • Separate rules for different user agents
  • Use specific paths, not broad blocks
✅ Example of Good Robots.txt:
# Block admin and sensitive areas
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /private/
Disallow: /temp/
Disallow: /cache/
Disallow: /*.pdf$

# Allow everything else
Allow: /

# Specify crawl delay (respect bandwidth)
Crawl-delay: 1

# Direct to sitemap
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-news.xml

Step 4: Remove Blocked URLs from Sitemap

✅ If Pages Should NOT be Indexed:
  1. Remove them from sitemap.xml
  2. Keep robots.txt rule blocking them
  3. Resubmit updated sitemap to GSC

Or allow in robots.txt if they should be indexed:

User-agent: *
Disallow: /admin/
# Remove other Disallow rules if they block valid content
Allow: /products/
Allow: /blog/

Issue #7: Redirect Errors & Redirect Chains

What it means: Google followed a redirect chain (URL A → URL B → URL C) or hit a broken redirect. This wastes crawl budget and slows indexing.

Types of Redirect Errors

| Error Type | Cause | Impact | Solution |
|---|---|---|---|
| Redirect chain | URL A → B → C (3+ hops) | Slow crawling, delayed indexing | Redirect directly to the final URL |
| Broken redirect | A → B, but B doesn't exist | Page never indexed | Fix the target URL |
| Redirect loop | A → B → A (circular) | Page never indexed | Fix the redirect logic |
| JavaScript redirect | Uses JS instead of HTTP 301/302 | Poor crawl efficiency | Use a server-side 301 redirect |

Finding Redirect Issues

✅ Using URL Inspection:
  1. Go to GSC URL Inspection
  2. Enter a suspicious URL
  3. Under "Tested URL", see if it shows redirect information
  4. Check what URL it redirected to
  5. Follow the chain manually to confirm

Test Redirects Command Line:

curl -I https://example.com/old-url

# Output shows redirect information:
# HTTP/1.1 301 Moved Permanently
# Location: https://example.com/new-url
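The same chain-walking can be scripted. This sketch resolves a chain from a mapping of URL → Location header (build the mapping with curl -I per hop, or adapt it to make live requests); all URLs here are hypothetical:

```python
import urllib.parse

def resolve_chain(url, redirects, limit=10):
    """Follow URL -> Location hops; stops on a loop or after `limit` hops.
    redirects: dict of URL -> Location value the server returns."""
    chain = [url]
    while chain[-1] in redirects and len(chain) <= limit:
        nxt = urllib.parse.urljoin(chain[-1], redirects[chain[-1]])
        chain.append(nxt)
        if chain.count(nxt) > 1:  # loop detected
            break
    return chain

hops = {
    "https://example.com/old-url": "/temp-url",
    "https://example.com/temp-url": "/new-url",
}
print(resolve_chain("https://example.com/old-url", hops))
# 3 entries = a 2-hop chain: collapse it to one direct 301
```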

Fixing Redirect Problems

Problem #1: Redirect Chain (A→B→C)

✅ Fix by Direct Redirect:

BAD - Redirect chain:

# In .htaccess
Redirect 301 /old-url /temp-url
Redirect 301 /temp-url /new-url

# This creates a chain: old-url → temp-url → new-url

GOOD - Direct redirect:

Redirect 301 /old-url /new-url
Redirect 301 /temp-url /new-url

# Both point directly to the final URL

Problem #2: Broken Redirect

✅ Fix Target URL:

Issue:

Redirect 301 /old-page /new-page
# But /new-page doesn't exist (returns 404)

Solution:

  1. Verify target page exists and is accessible
  2. Update redirect to correct URL
  3. Or restore original page content

Problem #3: Redirect Loop

✅ Identify and Fix Loops:

Example loop:

# Page A redirects to Page B
Redirect 301 /page-a /page-b

# Page B redirects to Page A (creates a loop!)
Redirect 301 /page-b /page-a

Fix:

  1. Remove one of the redirects
  2. Pick the final destination URL
  3. Make all redirects point to final URL

Problem #4: JavaScript Redirects

✅ Use Server-Side Redirects Instead:

BAD - JavaScript redirect:

<script>
window.location = "https://example.com/new-page";
</script>
<!-- Google struggles to follow this -->

GOOD - Server-side HTTP redirect:

# In .htaccess
Redirect 301 /old-page /new-page

# OR in PHP
header("HTTP/1.1 301 Moved Permanently");
header("Location: https://example.com/new-page");
exit();

Step-by-Step Cleanup

  1. Identify all redirect rules in .htaccess or server config
  2. Map out redirect chains (where does each URL lead?)
  3. Consolidate: make every redirect point directly to the final URL
  4. Test each redirect (with curl or a browser)
  5. Request revalidation in GSC

Issue #8: Soft 404 Errors

What it means: The page returns status code 200 (OK), but Google detects that it's actually a "page not found" page, so the status code doesn't match the content.

How Soft 404 Happens

  • Missing template: Shows generic "page not found" message with 200 status
  • Database error: Page should exist but isn't loading correctly
  • Redirect malfunction: Page is deleted but returns 200 instead of 404
  • Dynamic page generation: Supposed to generate page but doesn't, shows fallback
  • Cached error page: Was deleted, but old 200-status version is cached

Detecting Soft 404s

✅ Ways to Find Them:
  1. In GSC: Coverage report shows "Soft 404"
  2. Manual check: Visit URL, does it look broken?
  3. Content analysis: Page has 404-type text but returns 200
  4. Screaming Frog: Crawl site, filter by 200 status but short content
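Checks 2 and 3 can be combined into a rough heuristic: a 200 response whose body reads like an error page. The phrase list below is an assumption; tune it to your site's actual templates:

```python
import re

# Phrases that suggest an error page; adjust for your own templates.
NOT_FOUND_PHRASES = [
    r"page not found",
    r"doesn'?t exist",
    r"no longer available",
]

def looks_like_soft_404(status, html):
    """True for a 200 response whose body reads like a 'not found' page."""
    if status != 200:
        return False
    body = html.lower()
    return any(re.search(p, body) for p in NOT_FOUND_PHRASES)
```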

Fixing Soft 404s

Solution #1: Return Proper 404 Status

✅ Change Status Code:

If page should NOT exist:

# PHP method
http_response_code(404);
echo "<h1>404 - Page Not Found</h1>";
exit();

# .htaccess method: serve a custom 404 page
ErrorDocument 404 /custom-404.html

Solution #2: Restore Missing Content

✅ If Page Should Exist:

Check why page isn't loading:

  1. Database issue: Check database connection
  2. Missing file: Restore from backup
  3. Template broken: Fix template logic
  4. Configuration error: Review configuration settings

Solution #3: Remove Soft 404 from Sitemap

✅ If Deleting Page:
  1. Remove URL from sitemap.xml
  2. Set proper 404 status code
  3. Create custom 404 page (helpful, not empty)
  4. Resubmit sitemap

Creating a Helpful 404 Page

<!DOCTYPE html>
<html>
<head>
  <title>404 - Page Not Found</title>
  <!-- Note: the 404 status must be sent by the server; a meta tag can't set it -->
</head>
<body>
  <h1>404 - Page Not Found</h1>
  <p>Sorry, the page you're looking for doesn't exist.</p>
  <!-- Help users find what they need -->
  <p><a href="/">Return to Home</a></p>
  <p><a href="/sitemap/">View Sitemap</a></p>
  <!-- Show popular pages -->
  <h3>Popular Pages:</h3>
  <ul>
    <li><a href="/products/">Products</a></li>
    <li><a href="/blog/">Blog</a></li>
    <li><a href="/contact/">Contact Us</a></li>
  </ul>
</body>
</html>

Issue #9: Not Found (404) Errors

What it means: Google tried to crawl a page, but it returned a 404 status code. Page doesn't exist (intentionally).

Is 404 Always Bad?

No. A 404 is the CORRECT response when a page truly doesn't exist. The problem is:

  • Page was previously indexed, then deleted without redirect
  • Page is in your sitemap but doesn't exist
  • You intended to keep the page but deleted it by accident

Fixing 404 Errors

Step 1: Decide: Keep or Delete?

✅ For Each 404 Page:
  1. Did you intentionally delete it?
    • YES: Do nothing (404 is correct response)
    • NO: Restore from backup or recreate
  2. Was it getting traffic?
    • YES: Set up 301 redirect to similar content
    • NO: It's OK to leave as 404

Step 2: If Keeping 404, Remove from Sitemap

✅ Clean Up Sitemap:
  1. Open sitemap.xml
  2. Find and remove all 404 URLs
  3. Resubmit sitemap to GSC

Example - Remove from sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
  </url>
  <!-- REMOVE this URL if it returns 404 -->
  <!--
  <url>
    <loc>https://example.com/deleted-page/</loc>
  </url>
  -->
</urlset>
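Scrubbing a large sitemap by hand is tedious. This sketch extracts every <loc> with the standard library so you can then test each URL's status code however you prefer:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return all <loc> URLs from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]

xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/deleted-page/</loc></url>
</urlset>"""
print(sitemap_urls(xml))
# -> ['https://example.com/', 'https://example.com/deleted-page/']
```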

Step 3: If Previously Had Traffic, Redirect

✅ Set Up 301 Redirect:
Redirect 301 /deleted-page /similar-page
Redirect 301 /old-product /similar-product
Redirect 301 /removed-article /updated-article

Important: Only redirect to similar/related content. Don't mass-redirect everything to the homepage; Google treats irrelevant redirects as soft 404s.

Issue #10: Server Errors (5xx)

What it means: Google tried to crawl but got 500, 502, 503, or 504 error. Your server is having problems.

Why 5xx Errors Happen

  • Server overload: Too many requests at once
  • PHP/code error: Uncaught exception in code
  • Database down: Can't connect to database
  • Out of disk space: Server storage full
  • Memory limit exceeded: PHP/process using too much RAM
  • Timeout: Process takes too long to complete
  • Bad deployment: Code deployment went wrong

Quick Diagnosis & Fixes

Step 1: Verify It's Actually a 5xx Error

✅ Check Status Code:
  1. Visit the page in browser
  2. Open DevTools (F12) → Network tab
  3. Reload page
  4. Check Status Code (should be 500, 502, 503, etc.)
  5. If it loads fine, error is intermittent
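When many URLs are affected, scripting the status check helps. The fetcher and the category labels here are our own sketch (standard library only), grouped roughly the way this guide's issues are:

```python
import urllib.error
import urllib.request

def classify(status):
    """Bucket an HTTP status code into this guide's issue categories."""
    if 200 <= status < 300:
        return "ok"
    if status in (301, 302, 307, 308):
        return "redirect"
    if status == 404:
        return "not found"
    if 500 <= status < 600:
        return "server error (5xx)"
    return "other"

def status_of(url):
    """Fetch a URL and return its HTTP status code."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code

# classify(status_of("https://example.com/"))  # network call, run manually
```

Run it several times at different hours: a page that sometimes classifies as "server error (5xx)" points to an intermittent overload rather than broken code.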

Step 2: Check Server Error Logs

✅ Access Error Logs:
  1. Via hosting control panel (cPanel, Plesk, etc.)
  2. SSH into server: ssh user@example.com
  3. Find error logs: /var/log/apache2/error.log or similar
  4. View recent errors: tail -f /var/log/apache2/error.log
  5. Look for patterns in error messages

Common error messages:

# Out of memory
PHP Fatal error: Allowed memory size exceeded

# Database connection
Error: SQLSTATE[HY000] [1040] Too many connections

# Timeout
Fatal error: Maximum execution time exceeded

# Undefined function
Fatal error: Call to undefined function

Step 3: Fix Common 5xx Issues

💡 Quick Fixes by Error Type:

Out of Memory:

  • Increase PHP memory limit in php.ini
  • From: memory_limit = 128M
  • To: memory_limit = 512M or 1024M
  • Restart PHP/Apache

Too Many Connections:

  • Upgrade database plan
  • Implement connection pooling
  • Optimize queries (add indexes)
  • Limit concurrent connections per user

Timeout:

  • Increase max_execution_time in php.ini
  • From: max_execution_time = 30
  • To: max_execution_time = 60 or 120
  • Or optimize slow code/queries

Code Error:

  • Check recently deployed code
  • Revert to previous version if needed
  • Run tests in staging before deploying
  • Enable error logging to find issue

Step 4: Monitor Server Health

✅ Set Up Monitoring:
  1. Use uptime monitoring (UptimeRobot, Pingdom)
  2. Set alerts for when site goes down
  3. Monitor server resources (CPU, RAM, disk)
  4. Check slow query logs
  5. Monitor error logs regularly

Step 5: Request Revalidation

✅ Once Fixed:
  1. Verify page loads without 5xx error
  2. Test multiple times from different locations
  3. In GSC, use URL Inspection
  4. Click "Request Indexing"
  5. Google will re-crawl within 24-48 hours

Prevention Strategy: Stop GSC Errors Before They Start

Monthly Maintenance Checklist

  • Review the GSC Coverage report for new errors and excluded pages
  • Crawl the site (e.g., with Screaming Frog) for broken links, redirect chains, and duplicates
  • Verify sitemap.xml lists only live, indexable URLs
  • Spot-check key templates for accidental noindex (meta tag and X-Robots-Tag header)
  • Review robots.txt for rules blocking valid content
  • Check server error logs, uptime alerts, and Core Web Vitals

Process Improvements

| Process | Best Practice | Tools |
|---|---|---|
| Before publishing | Test page on staging, verify no errors, check mobile | Local testing, Lighthouse, DevTools |
| Content management | Don't delete pages without redirects; update links | CMS audit tools, redirect manager |
| Site migrations | Map all old → new URLs, set up 301s first, test thoroughly | Screaming Frog, redirect checkers |
| Code deployment | Test in staging first, check error logs, monitor after deploy | Git, CI/CD pipeline, error tracking |
| Server management | Monitor resources, set up alerts, optimize regularly | New Relic, Datadog, server monitoring |
| SEO monitoring | Review GSC weekly, track changes, alert on issues | GSC alerts, Semrush, Ahrefs |

Automation & Tools

🔧 Tools to Prevent Issues:
  • Screaming Frog: Crawl site regularly, find duplicates, broken redirects
  • Lighthouse CI: Automated performance testing before deploy
  • SEMrush/Ahrefs: Regular site audits, error detection
  • Sentry: Real-time error monitoring and alerts
  • Cloudflare: CDN + WAF + performance optimization
  • Uptime monitoring: UptimeRobot, Pingdom, StatusPage

Frequently Asked Questions (FAQs)

How long does it take to fix "Crawled but not indexed"?

Typically 2-6 weeks after fixing and requesting revalidation:

  • Days 1-3: Identify root cause
  • Days 4-7: Implement fixes
  • Days 8-14: Google re-crawls and evaluates
  • Days 15-42: Gradual indexing of fixed pages

Faster for sites with higher authority.

If a page has "Discovered but not indexed," should I delete it?

No, not immediately. Wait 4-8 weeks first. If it's:

  • New page: Just wait, Google will eventually crawl
  • Low-quality page: Improve content first, add links
  • Not important: After 8 weeks, OK to delete

If you delete it, make sure the URL returns a 404 and remove it from your sitemap.

Does fixing GSC errors improve search rankings?

Indirectly, yes. Fixing GSC errors:

  • Ensures pages are properly indexed (prerequisite)
  • Improves crawl efficiency
  • Fixes performance issues that affect rankings
  • Enables Google to understand your site better

But rankings depend mainly on content quality and backlinks, not just fixing errors.

What if Google keeps saying "Crawled but not indexed" even after I fixed everything?

Possible reasons:

  • Fix wasn't properly deployed
  • Google hasn't re-crawled yet (it takes time)
  • Content still too thin or low quality
  • Different issue than you fixed
  • Page is duplicate of another page

Actions: Revalidate, wait another 2 weeks, check if it's actually resolved by looking at URL Inspection tool.

Are 404 errors always bad for SEO?

No. A 404 is the correct status code when a page doesn't exist. Bad SEO happens when:

  • Pages return 200 but are actually broken (soft 404)
  • Important pages 404 without redirects
  • 404 pages are in your sitemap
  • You're 404'ing pages that should exist

Solution: Use proper 301 redirects for deleted pages that had traffic.

How do I know which GSC issue to fix first?

Priority matrix:

  1. Critical (Fix Now): Security issues, 5xx errors on homepage
  2. High (This Week): High-traffic pages with 404/noindex, redirect chains
  3. Medium (This Month): Soft 404s, duplicate issues, many crawled-not-indexed pages
  4. Low (This Quarter): Discovered-not-indexed (wait), minor warnings

Rule: Fix by impact (affects most pages first) × severity (rankings impact).

Should I request indexing for every fixed page?

No, prioritize:

  • Request for: Important pages, high-traffic pages, recent fixes
  • Don't bother for: Bulk fixes, low-importance pages, archive content

You can request indexing for one URL at a time in GSC's URL Inspection tool, subject to a daily quota. Bulk requests happen through sitemaps.

What if my hosting provider says they can't help?

Step-by-step escalation:

  1. Contact support again with specific error messages from logs
  2. Ask to speak to technical (not billing) support
  3. Request error logs for specific time period
  4. Ask about resource limits, upgrade plans
  5. If they still can't help, consider switching hosts

Good hosts take 5xx errors seriously. Bad hosts don't.
