One Developer's Experience With Real Life Bitrot Under HFS+
New submitter jackjeff (955699) writes with an excerpt from developer Aymeric Barthe about data loss suffered under Apple's venerable HFS+ filesystem: "HFS+ lost a total of 28 files over the course of 6 years. Most of the corrupted files are completely unreadable. The JPEGs typically decode partially, up to the point of failure. The raw .CR2 files usually turn out to be totally unreadable: either completely black or having a large color overlay on significant portions of the photo. Most of these shots are not so important, but a handful of them are. One of the CR2 files in particular is a very good picture of my son when he was a baby. I printed and framed that photo, so I am glad that I did not lose the original."
(Barthe acknowledges that data loss and corruption certainly aren't limited to HFS+; "bitrot is actually a problem shared by most popular filesystems, including NTFS and ext4." I wish I'd lost only 28 files over the years.)
I've also had this happen with HFS+ (Score:5, Informative)
Re:Legacy file systems should be illegal (Score:2, Informative)
The problem is, neither ZFS nor Btrfs would have stopped an arbitrary bit inside an arbitrary file from becoming corrupt if the disk failed to write or read it correctly. Only multiple disks and redundancy would have solved that.
Re:Legacy file systems should be illegal (Score:4, Informative)
copies=1 | 2 | 3 Controls the number of copies of data stored for this dataset. These copies are in addition to any redundancy provided by the pool, for example, mirroring or RAID-Z. The copies are stored on different disks, if possible. The space used by multiple copies is charged to the associated file and dataset, changing the used property and counting against quotas and reservations. Changing this property only affects newly-written data. Therefore, set this property at file system creation time by using the -o copies=N option.
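As a sketch of how that property is used in practice (the pool name "tank" and dataset name "photos" are hypothetical examples, not from the article):

```shell
# Create a dataset whose blocks are stored twice, even on a
# single-disk pool ("tank" and "photos" are example names).
zfs create -o copies=2 tank/photos

# Raising it later only affects data written from then on:
zfs set copies=3 tank/photos

# Confirm the current setting:
zfs get copies tank/photos
```

With copies=2 on a single disk, ZFS can repair a block whose checksum fails by reading the second copy, which is exactly the single-disk bitrot case discussed above; it still won't survive the whole disk dying.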
Isn't Samsung the largest UNIX vendor? *grin* (Score:2, Informative)
Due to their commanding smartphone marketshare, along with millions of devices with embedded Linux shipped every year, wouldn't Samsung be the largest UNIX vendor?
Oh? What's that? You weren't counting embedded Linux and I'm a pedantic #$(*#$&@!!!. Can't argue with that!
article is suspect, summary is worse (Score:5, Informative)
In a footnote he admits that the corruption was caused by hardware issues, not HFS+ bugs, and of course the summary ignores that completely.
So, for that, let me counter his anecdote with my own anecdote: I have an HFS+ volume with a collection of over 3,000,000 files on it. This collection started in 2004, approximately 50 people access thousands of files on it per day, and occasionally after upgrades or problems it gets a full byte-for-byte comparison against one of three warm standbys. No corruption found, ever.
Clueless article (Score:5, Informative)
People talking about "bit rot" usually have no clue, and this guy is no exception.
It's extremely unlikely that a file would become silently corrupted on disk. Block devices include per-block checksums, so you either get a read error (which may be what he saw) or the data read back is the same as the data previously written. As far as I know, ZFS doesn't help recover data from read errors; you would need RAID and/or backups.
Main memory is the weakest link; that's why my next computer will have ECC memory. When you copy a file (or defragment it, or otherwise modify it), you read a good copy, some bit flips in RAM, and you write back corrupted data. Your disk receives the corrupted data and happily computes a checksum for it, thereby ensuring you can read back your corrupted data faithfully. That's where ZFS helps. Using checksumming scripts is a good idea, and I do it myself. But I don't have auto-defrag on Linux, so I'm safer: when I detect a corrupted copy, I still have the original.
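The checksumming-script approach the commenter mentions can be sketched with standard tools (GNU coreutils' sha256sum; the ~/Photos path and manifest location are hypothetical examples):

```shell
#!/bin/sh
# Build a checksum manifest for a photo library (example path).
find "$HOME/Photos" -type f -print0 \
    | xargs -0 sha256sum > "$HOME/photos.sha256"

# Later, re-verify. Any file whose on-disk bytes changed is reported
# as FAILED, whether the cause was an edit, a bad copy, or a bit flip;
# --quiet suppresses the per-file OK lines.
sha256sum --check --quiet "$HOME/photos.sha256"
```

Note that this only detects change; telling a legitimate edit apart from rot, and actually recovering the data, still requires a second copy taken before the damage.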
ext2 was introduced in 1993, and so was NTFS. ext4 is just ext2 updated (ext was a different beast). If anything, HFS+ is more modern, not that it makes a difference. All of them are updated. By the way, I noticed recently that Mac OS X resource forks sometimes contain a CRC32. I noticed it in a file coming from Mavericks.
Re:Backup? (Score:2, Informative)
The bitrot will change the checksums and cause the files to show up as modified.
Moreover, what will you do about a reported bitrotted file unless you have genuine archival backups somewhere else?
Re: Legacy file systems should be illegal (Score:4, Informative)
Not if your OS is tied intimately to your filesystem. Linux might not be, because a large number of things are abstracted out, but FreeBSD depends on its file system, and Solaris took a long time and considerable effort before it could boot off ZFS. Forget about moving Windows off NTFS. Apple actually did some work on porting to ZFS; maybe they will continue.
Re:Backup? (Score:2, Informative)
Mac OS X's Time Machine works by listening to filesystem events, except for the first backup, where everything is copied over as-is. Bit rot doesn't get transferred until you overwrite the file, by which time it should have been obvious something was fishy, or the bitrot was negligible and you didn't notice it yourself. There are also situations where Time Machine itself says "this backup is fishy, regenerate from scratch?" That happened to me last week, but only after a failed drive had to be replaced, which triggered a fresh 150GB backup.
Re: Legacy file systems should be illegal (Score:4, Informative)
They did try and replace the file system around the time of the Intel switch. Got killed by licensing problems.
http://appleinsider.com/articl... [appleinsider.com]