RAID 5 Triage — When Your Array Goes Dark » Why Is My Rebuild Stuck at 0%?

Understanding stalled rebuilds, parity disagreement, and when to stop and image


1️⃣ What It Looks Like

Your RAID controller reports:

Rebuild Progress: 0%
or
Rebuild Stalled – No Progress for Hours

Drives appear online, and the array might even show “Rebuilding” in BIOS or MegaRAID Manager.
But the counter never moves.

This is one of the most dangerous moments for a RAID system — the controller is writing parity while fighting read errors or mismatched geometry.


2️⃣ What’s Really Happening

Controllers don’t rebuild blindly: for each stripe they read from every surviving disk, then XOR those blocks to regenerate the missing member’s data or parity and write it to the replacement.
If reads fail or metadata doesn’t agree, the rebuild halts, retries, or hangs at 0%.

Common underlying causes:

  • Unrecoverable read errors (UREs): bad sectors on a surviving member prevent a complete stripe from being read.
  • Geometry disagreement: stripe size, order, or member count mismatch between drives and controller.
  • Foreign epoch drift: one member has an older or newer config version.
  • Cache inconsistency: battery-backed cache lost last-known offsets.
  • Rebuild started on wrong member: controller flagged the good disk as failed, so parity math is invalid.

The 0% hang is a safety state — the controller cannot continue without risking data corruption.
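
To make the read-before-write dependency concrete, here is a minimal sketch of single-stripe RAID 5 reconstruction. The function names and byte values are illustrative assumptions, not any controller's actual firmware logic; the point is only that one unreadable block on a surviving member leaves the XOR with nothing to work from.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_block(surviving_blocks):
    """Reconstruct the missing member's block for one stripe.

    Each entry is the block read from one surviving member,
    or None if that read failed (a URE).
    """
    if any(block is None for block in surviving_blocks):
        raise OSError("unreadable block on a surviving member: stripe cannot be rebuilt")
    return xor_blocks(surviving_blocks)

# A 4-member stripe: three data blocks plus parity (values are made up).
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xaa\xbb"
parity = xor_blocks([d0, d1, d2])

# The member holding d1 is lost: XOR of the survivors brings it back.
recovered = rebuild_block([d0, d2, parity])
```

Passing `None` in place of any surviving block raises instead of returning data, which mirrors why a real rebuild halts or retries rather than writing a guess.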


3️⃣ What Not to Do

When the counter won’t move:

  • Do not restart the rebuild or reboot the system; each restart risks another parity rewrite.
  • Do not mark other drives offline or online to “force progress.”
  • Do not clear or import configurations.
  • Do not swap controllers or cables until the state is preserved.

These actions commit new metadata and destroy the original parity alignment.


4️⃣ What to Check Before Powering Down

  1. Controller Event Log: note the exact sequence of read retries or media errors.
  2. Physical Disk Screen: identify which member shows increasing Media Error or Predictive Failure counts.
  3. SMART Data: look for reallocated or pending sectors.
  4. Battery/BBU Status: if “Failed” or “Charging,” write-back caching is typically disabled and the controller falls back to write-through, which can slow the rebuild to a crawl or stall it.
  5. VD Configuration Table: confirm stripe size and layout match expected geometry.

If the same drive triggers every retry, that’s your likely point of failure — not the array itself.
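
As a rough illustration of the SMART check in step 3, the sketch below pulls the raw values of three commonly watched attributes out of an attribute table in the column layout that `smartctl -A` prints for ATA drives. The sample text and its numbers are made up for the example, and the helper names are hypothetical; drives behind a hardware RAID controller may need a controller-specific pass-through option to be queried at all.

```python
# Attribute IDs commonly associated with failing media.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

def parse_smart_attributes(text):
    """Return {attribute_id: raw_value} for the watched attributes."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit():
            attr_id = int(fields[0])
            if attr_id in WATCHED and fields[9].isdigit():
                found[attr_id] = int(fields[9])
    return found

def is_suspect(attrs):
    """Flag a drive if any watched raw counter is non-zero."""
    return any(value > 0 for value in attrs.values())

# Illustrative sample only; not output from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       24
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

attrs = parse_smart_attributes(SAMPLE)
```

A non-zero pending-sector count on the member that also appears in every controller retry is a reasonable signal to stop and image rather than keep rebuilding.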


5️⃣ Safe Response — The ADR Rule: Stop and Image

When rebuild progress stalls at 0%, it’s no longer recovery — it’s a diagnostic event.
Here’s how to preserve the array’s pre-damage state:

  1. Stop the rebuild immediately.
    • If still in BIOS, cancel rebuild.
    • If running in OS, power down gracefully.
  2. Label each drive with slot order and serial number.
  3. Clone all drives read-only using hardware or software imaging (DeepSpar, ddrescue, or ADR RAID Inspector™).
  4. Inspect metadata offline — verify each member’s sequence, size, and parity epoch.
  5. Virtually rebuild from the last clean state, excluding the bad member.

This preserves all valid data before controller-side writes overwrite historical parity.
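
The imaging step can be sketched as a block-by-block copy that never writes to the source, zero-fills anything it cannot read, and records where the holes are. This is a deliberately simplified stand-in for what dedicated imagers do (no retries, no splitting of bad regions, no map file), and `image_member` is a hypothetical helper name.

```python
import hashlib
import io

def image_member(src, dst, block_size=64 * 1024):
    """Copy src to dst block by block; zero-fill and record unreadable blocks.

    src and dst are binary file objects; src is only ever read.
    Returns (bad_offsets, sha256_hex_of_the_image_written).
    """
    bad_offsets = []
    digest = hashlib.sha256()
    size = src.seek(0, io.SEEK_END)
    offset = 0
    while offset < size:
        want = min(block_size, size - offset)
        try:
            src.seek(offset)
            block = src.read(want)
            if len(block) != want:
                raise OSError("short read")
        except OSError:
            # Placeholder keeps every later offset aligned with the source.
            block = b"\x00" * want
            bad_offsets.append(offset)
        dst.write(block)
        digest.update(block)
        offset += want
    return bad_offsets, digest.hexdigest()
```

In practice the source would be a device opened with mode `"rb"` behind a write blocker, and the returned hash would be stored next to the slot and serial labels from step 2 so later work can be verified against the same image.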


6️⃣ How Professionals Confirm What’s Broken

ADR engineers diagnose 0% rebuilds using direct parity analytics:

  • Extract controller logs and metadata blocks from each drive.
  • Compare stripe-level offsets between members.
  • Recalculate expected parity to locate the failing LBA range.
  • Test sector recoverability on the bad disk; in many cases the large majority of its sectors are still readable.
  • Reconstruct the array virtually, omitting the failing disk, and export data read-only.

By isolating the geometry disagreement or URE pattern, recovery succeeds without further parity loss.
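
One way to picture the "recalculate expected parity" step: in a consistent RAID 5 set, the XOR of all members at the same offset is zero, wherever the parity chunk happens to rotate. The sketch below scans member images for chunks that break that rule. It assumes the images are already trimmed so that array data starts at offset 0 on every member, and the function name is illustrative rather than taken from any recovery tool.

```python
def find_parity_mismatches(images, chunk_size):
    """Return offsets of chunks where the XOR across all members is non-zero.

    images: list of bytes-like objects, one per member image.
    """
    length = min(len(image) for image in images)
    mismatches = []
    for offset in range(0, length - chunk_size + 1, chunk_size):
        acc = 0
        for image in images:
            acc ^= int.from_bytes(image[offset:offset + chunk_size], "big")
        if acc != 0:
            mismatches.append(offset)
    return mismatches

# Two data members and their parity (tiny made-up images).
a = bytes([1, 2, 3, 4])
b = bytes([5, 6, 7, 8])
p = bytes(x ^ y for x, y in zip(a, b))

# Corrupt one byte of the parity image to simulate a stale or damaged region.
damaged = p[:2] + bytes([p[2] ^ 0xFF]) + p[3:]
bad_chunks = find_parity_mismatches([a, b, damaged], chunk_size=2)
```

A mismatch says only that the members disagree at that offset, not which one is wrong; mismatches clustered in one range point toward a media problem, while mismatches across nearly the whole array suggest a geometry or offset disagreement.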


7️⃣ When Geometry Mismatch Is the Cause

Controllers occasionally write an incorrect geometry map after firmware updates or slot remaps.
This causes parity math to misalign even though all drives are good.
Symptoms:

  • Array shows “Rebuilding” but parity check never progresses.
  • Rebuild completes instantly or hangs at 0%.
  • Reboot changes drive order or rebuild target.

Solution: export controller config, cross-check against disk headers, and revert to pre-firmware geometry before virtual reconstruction.
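
To show why a wrong member count or order scrambles everything downstream, the sketch below maps a logical data chunk to a member and stripe row under the left-symmetric RAID 5 layout. That layout is an assumption chosen for illustration; a given controller may rotate parity differently, and the real layout has to come from the on-disk metadata.

```python
def locate_chunk(logical_chunk, members):
    """Map a logical data chunk to (member_index, stripe_row).

    Assumes left-symmetric RAID 5: parity starts on the last member and
    moves one member to the left each row; data begins on the member
    right after parity and wraps around.
    """
    data_per_row = members - 1
    row, position = divmod(logical_chunk, data_per_row)
    parity_member = (members - 1) - (row % members)
    data_member = (parity_member + 1 + position) % members
    return data_member, row

# The same logical chunk lands in different places if the member count is wrong.
with_four_members = locate_chunk(4, members=4)
with_three_members = locate_chunk(4, members=3)
```

Because every chunk address depends on the member count and order, a controller working from the wrong geometry map reads the wrong blocks into its parity math even when every drive is healthy.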


8️⃣ The “0% Rule” Summary

| Condition                                   | What It Means                               | Correct Action                             |
|---------------------------------------------|---------------------------------------------|--------------------------------------------|
| Rebuild never starts                        | Bad sector or geometry conflict             | Stop, clone, inspect metadata              |
| Rebuild restarts repeatedly                 | Cache/battery issue                         | Replace BBU, verify config before resuming |
| Rebuild completes instantly                 | Wrong member flagged as failed              | Cancel, restore previous state             |
| Drive LEDs blink in sync but counter static | Controller retry loop on unreadable sectors | Halt and image drives immediately          |

9️⃣ Key Takeaways

  • “Stuck at 0%” = controller can’t complete read → parity math paused.
  • The controller isn’t idle — it’s retrying endlessly, risking further degradation.
  • Stop and image before metadata changes commit.
  • Rebuild only after metadata alignment and physical health are confirmed.
  • A professional parity reconstruction beats any on-controller rebuild.