One Developer's Experience With Real Life Bitrot Under HFS+
New submitter jackjeff (955699) writes with an excerpt from developer Aymeric Barthe about data loss suffered under Apple's venerable HFS+ filesystem. HFS+ lost a total of 28 files over the course of 6 years. Most of the corrupted files are completely unreadable. The JPEGs typically decode partially, up to the point of failure. The raw .CR2 files usually turn out to be totally unreadable: either completely black or having a large color overlay on significant portions of the photo. Most of these shots are not so important, but a handful of them are. One of the CR2 files in particular is a very good picture of my son when he was a baby. I printed and framed that photo, so I am glad that I did not lose the original.
(Barthe acknowledges that data loss and corruption certainly aren't limited to HFS+; "bitrot is actually a problem shared by most popular filesystems. Including NTFS and ext4." I wish I'd lost only 28 files over the years.)
Backup? (Score:4, Insightful)
Comment removed (Score:5, Insightful)
Re:Backup? (Score:5, Insightful)
The problem with bit rot is that backups don't help. The corrupted file goes into the backup and eventually replaces the good copy, depending on retention policy. You need a file system that uses checksums on all data blocks, so that it can detect a corrupted block after reading it and flag the file as corrupted, letting you restore it from a good backup.
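The detection half of what the parent describes can be approximated in user space, even on a filesystem without block checksums. A minimal sketch (hypothetical helper names, assuming SHA-256 as the corruption detector):

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Record a checksum for every file under root."""
    manifest = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest

def find_corruption(root, manifest):
    """Return files whose current checksum no longer matches the manifest."""
    return [rel for rel, digest in manifest.items()
            if sha256_of(os.path.join(root, rel)) != digest]
```

Run the manifest build periodically and store it with your backups; a mismatch on a file you know you haven't edited is a bitrot candidate to restore from an older, good copy. Unlike ZFS or Btrfs, this can't tell an intentional edit from corruption on its own, which is exactly why in-filesystem checksumming is the better answer.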
Re:Legacy file systems should be illegal (Score:5, Insightful)
At least you would know that the file was corrupted, so that you could restore it from a good backup.
Re:Legacy file systems should be illegal (Score:5, Insightful)
Yes, absolutely great idea! Rather than having technical decisions made at tech conferences and among developers, system administrators, and analysts, we should move that authority over to the legislature. Because we all know we are going to see a far better weighing of the costs and benefits of various technology choices by the legislature than by the technology marketplace.
Apple used HFS+ because it worked to successfully migrate people from Mac OS 9; it supported a Unix/Mac OS hybrid. They continue to use it because it has been good enough, and many of the more robust filesystems were pretty heavyweight. I'd like something like Btrfs too. But I don't think the people who disagree with me should be jailed.
Re: Bitrot not the fault of filesystem (Score:5, Insightful)
Re:I've also had this happen with HFS+ (Score:5, Insightful)
coincidentally jpg images as well
Well, JPGs are usually lossy and thus compressed. Flipping one bit in a compressed image file is likely to have severe consequences. OTOH, you could coXrupt a fewYentire byteZ in an uncompressed text file and it would still be readable. I suspect your drives also had a few "typos" that you didn't notice because of that.
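The parent's point is easy to demonstrate: flip one bit in a compressed stream and the whole thing typically becomes undecodable, while the same flip in plain text damages a single character. A small sketch using zlib as a stand-in for JPEG's entropy coding (the principle is the same):

```python
import zlib

text = b"The quick brown fox jumps over the lazy dog. " * 50
packed = zlib.compress(text)

def flip_bit(data, index, bit=0):
    """Return a copy of data with one bit flipped at byte `index`."""
    corrupted = bytearray(data)
    corrupted[index] ^= 1 << bit
    return bytes(corrupted)

# One flipped bit in the middle of the compressed stream.
bad_packed = flip_bit(packed, len(packed) // 2)
try:
    recovered = zlib.decompress(bad_packed)
    damage = ("decompressed, but contents changed"
              if recovered != text else "no visible damage")
except zlib.error:
    damage = "stream is unreadable"

# The same single-bit flip in the uncompressed text changes one byte;
# everything else stays perfectly readable.
bad_text = flip_bit(text, len(text) // 2)
readable = sum(a == b for a, b in zip(bad_text, text)) / len(text)
```

In the compressed case the decoder usually errors out entirely, while the plain-text copy remains over 99% intact, which matches the "typos you never noticed" hypothesis above.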
And the story is? (Score:4, Insightful)
Bitrot. It's a thing. It's been a thing since at least the very first tape drive; hell, it was a thing with punch cards (when it might well have involved actual rot). While the mechanism changes, every single consumer-level data-storage system in the history of computing has suffered from it. It's a physical phenomenon independent of the file system, and impossible to defend against in software unless the software transparently invokes the one and only defense: redundant data storage. Preferably in the form of multiple redundant backups.
So what is the point of this article?
Re:Backup? (Score:5, Insightful)
Depends on the backup methodology. If your backup works the way Apple's backups do (only modified files get pushed into a giant tree of hard links), then there's a good chance the corrupted data won't ever make it into a backup, because silent corruption never updates the file's modification time. Of course, the downside is that if the file never gets modified, you only have one copy of it, so if that backup gets corrupted, you have no backup.
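The hard-link scheme described above can be sketched in a few lines. This is a simplified, hypothetical model of a Time Machine-style snapshot (single flat directory, with mtime as the "was it modified" signal, which is exactly why silent corruption slips through unnoticed):

```python
import os
import shutil

def incremental_backup(source, prev_snapshot, new_snapshot):
    """Snapshot `source` into `new_snapshot`: files that look unchanged
    since `prev_snapshot` are hard-linked (shared inode, near-zero cost);
    modified or new files are copied fresh."""
    os.makedirs(new_snapshot, exist_ok=True)
    for name in os.listdir(source):
        src = os.path.join(source, name)
        prev = os.path.join(prev_snapshot, name) if prev_snapshot else None
        dst = os.path.join(new_snapshot, name)
        if prev and os.path.exists(prev) and \
           os.path.getmtime(prev) >= os.path.getmtime(src):
            os.link(prev, dst)      # unchanged: every snapshot shares one copy
        else:
            shutil.copy2(src, dst)  # modified: a genuinely new copy
```

Note the failure mode: a bit flipped on disk changes the contents but not the mtime, so every future snapshot keeps hard-linking the same, already-corrupted inode.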
So yes, in an ideal world, the right answer is proper block checksumming. It's a shame that neither of the two main consumer operating systems currently supports automatic checksumming in the default filesystem.
Re:HFS reliability (Score:4, Insightful)
Anyone who owned a Mac since the 80s remembers having to use Norton Disk Doctor and later DiskWarrior at least once per month to repair the filesystem. Entire folders could go randomly missing each time you booted up your Mac, and if you accidentally lost power to your hard drive, the use of one of those was mandatory.
Oh yes. I remember those days well. Journaled HFS+ fixed that, and for about the last decade the only times I have encountered a corrupted file system on a Mac, that discovery was followed shortly by total failure of the hard disk.
So, what was your fucking point?
Re:I've also had this happen with HFS+ (Score:4, Insightful)
Ultimately, it isn't a "failure" of HFS+ when your files get corrupted. It was (definitely) a hardware failure. It's just that HFS+ didn't catch the error when it happened.
Granted, HFS+ is due for an update. That's something I've said myself many times. But blaming it when something goes wrong is like blaming your Honda Civic for smashing your head in when you roll it. It wasn't designed with a roll cage. You knew that but you bought it anyway, and decided to hotdog.
Checksums also have performance and storage costs. So there are several different ways to look at it. One thing I strongly suggest is keeping records of your drive's S.M.A.R.T. status, and comparing them from time to time. And encourage Apple to update their FS, rather than blaming it for something it didn't cause, or for not doing something it wasn't designed to do.
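Keeping those S.M.A.R.T. records is easy to script. A sketch with hypothetical helper names that parses the attribute table from saved `smartctl -A` output and flags raw-value changes in the attributes most associated with failing media; it assumes smartmontools' usual ten-column table layout:

```python
import re

def parse_smart_attributes(report_text):
    """Parse the attribute table of saved `smartctl -A` output into
    {attribute_name: raw_value}. Assumes the common ten-column
    smartmontools layout (ID NAME FLAG VALUE WORST THRESH TYPE
    UPDATED WHEN_FAILED RAW_VALUE)."""
    attrs = {}
    row = re.compile(r"\s*\d+\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+"
                     r"\s+\S+\s+\S+\s+\S+\s+(\S+)")
    for line in report_text.splitlines():
        m = row.match(line)
        if m:
            attrs[m.group(1)] = m.group(2)
    return attrs

def diff_smart(old, new,
               watch=("Reallocated_Sector_Ct", "Current_Pending_Sector")):
    """Report (old, new) raw values for watched attributes that changed."""
    return {k: (old.get(k), new.get(k))
            for k in watch if old.get(k) != new.get(k)}
```

Save `smartctl -A /dev/...` output to a dated file every few weeks and diff the snapshots; a climbing reallocated or pending sector count is the drive telling you to replace it before the filesystem ever notices.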