


cat /var/log/stephen >/dev/eyes

"fsck.ext2 for device /dev/hda4 exited with signal 11"
My first day back at work after my break, and I got dropped right in it; had to fix a customer's computer that wouldn't boot.

Turned out that there was some filesystem corruption severe enough to cause fsck(8) to segfault. Which is bad. On every reboot, the corruption was detected, fsck was run, it crashed, and the cycle repeated.
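For anyone wondering how a script can tell that a program died from a signal rather than exiting normally: bash reports a child killed by signal N as exit status 128 + N, so a segfault (signal 11) shows up as 139. A minimal sketch, assuming bash; `describe_status` is a made-up helper name for illustration, not anything taken from the boot scripts on that machine, and a child shell that sends itself SIGSEGV stands in for the crashing fsck:

```shell
#!/bin/bash
# describe_status is a hypothetical helper: it interprets an exit status
# using bash's convention that 128 + N means "killed by signal N".
describe_status() {
    local status=$1
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Stand-in for the crashing fsck: a child shell that sends itself SIGSEGV.
bash -c 'kill -SEGV $$'
describe_status $?
```

The status is only a convention: a program that deliberately exits with a value above 128 would be misread by this check, so treat the result as a hint rather than proof.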

Bizarre segmentation faults are often caused by slightly dodgy memory that generally seems alright, but fails when the system is under stress (such as during a filesystem check, a really big compile, or other memory-hungry operation). I tried replacing the memory; didn't help. I tried the disk in a different computer; same problem.

On the face of it, the corruption wasn't serious. I could mount the filesystem read-only and read the files without any difficulty. I just couldn't fix it. So I copied the files onto another disk, wrote a new filesystem to the problematic partition, copied all the files back, and ran fsck on the new filesystem a couple of times just to make sure that the problem wasn't with the disk. (Though I would have expected to see DMA timeouts or other low-level errors if the disk had been at fault.)
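Roughly, that sequence looks like the sketch below. It is a reconstruction, not a record of the exact commands typed: only /dev/hda4 comes from the error message, while the mount point and backup directory are hypothetical placeholders, and ext2 is assumed because the failing checker was fsck.ext2. Since mke2fs wipes the partition, the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Sketch of the rescue sequence. Everything except /dev/hda4 is a
# hypothetical placeholder; adjust before use. mke2fs DESTROYS the
# partition's contents, so by default the commands are only printed.
DEV=${DEV:-/dev/hda4}                 # damaged partition (from the error message)
MNT=${MNT:-/mnt/damaged}              # hypothetical mount point
BACKUP=${BACKUP:-/mnt/spare/rescue}   # hypothetical directory on another disk
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode; otherwise run it and stop on failure.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@" || exit 1
    fi
}

rescue() {
    run mkdir -p "$MNT" "$BACKUP"
    run mount -o ro "$DEV" "$MNT"     # read-only: never write to the damaged fs
    run cp -a "$MNT/." "$BACKUP/"     # -a keeps owners, modes, times, symlinks
    run umount "$MNT"
    run mke2fs "$DEV"                 # write a fresh ext2 filesystem
    run mount "$DEV" "$MNT"
    run cp -a "$BACKUP/." "$MNT/"     # copy everything back
    run umount "$MNT"
    run e2fsck -f "$DEV"              # -f: check even if marked clean
    run e2fsck -f "$DEV"              # and once more, to be sure
}

rescue
```

Run as-is, it lists the ten steps prefixed with "+" so they can be checked before anything touches the disk.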

So, the computer is working again, the customer is happy, and another unexplained disk problem has been swept under the carpet.