One Developer's Experience With Real Life Bitrot Under HFS+

New submitter jackjeff (955699) writes with an excerpt from developer Aymeric Barthe about data loss suffered under Apple's venerable HFS+ filesystem: "HFS+ lost a total of 28 files over the course of 6 years. Most of the corrupted files are completely unreadable. The JPEGs typically decode partially, up to the point of failure. The raw .CR2 files usually turn out to be totally unreadable: either completely black or having a large color overlay on significant portions of the photo. Most of these shots are not so important, but a handful of them are. One of the CR2 files in particular is a very good picture of my son when he was a baby. I printed and framed that photo, so I am glad that I did not lose the original." (Barthe acknowledges that data loss and corruption certainly aren't limited to HFS+; "bitrot is actually a problem shared by most popular filesystems, including NTFS and ext4." I wish I'd lost only 28 files over the years.)

  • by carlhaagen ( 1021273 ) on Saturday June 14, 2014 @09:30AM (#47235971)
    I have an old partition of some 20,000 files, most of them 10 years or older, in which I found 7 or 8 files - coincidentally JPEG images as well - that were corrupted. It struck me as nothing other than filesystem corruption, as the drive was and still is working just fine.
  • by Anonymous Coward on Saturday June 14, 2014 @09:39AM (#47236003)

    The problem is, neither ZFS nor Btrfs would have stopped an arbitrary bit inside an arbitrary file from becoming corrupt if the disk failed to write or read it correctly. Only multiple disks and redundancy would have solved that.

  • by mgmartin ( 580921 ) on Saturday June 14, 2014 @09:54AM (#47236083)
    As does ZFS. From man zfs:

    copies=1 | 2 | 3
        Controls the number of copies of data stored for this dataset. These copies are in addition to any redundancy provided by the pool, for example, mirroring or RAID-Z. The copies are stored on different disks, if possible. The space used by multiple copies is charged to the associated file and dataset, changing the used property and counting against quotas and reservations. Changing this property only affects newly-written data. Therefore, set this property at file system creation time by using the -o copies=N option.
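
    For illustration, here is a minimal sketch of setting and confirming the copies property described above, written in Python driving the standard zfs command-line tool; the dataset name tank/photos is purely a placeholder:

      import subprocess

      DATASET = "tank/photos"  # hypothetical dataset name; substitute your own pool/dataset

      # Keep two copies of every block in this dataset (affects newly-written data only).
      subprocess.run(["zfs", "set", "copies=2", DATASET], check=True)

      # Read the property back to confirm it took effect.
      result = subprocess.run(
          ["zfs", "get", "-H", "-o", "value", "copies", DATASET],
          capture_output=True, text=True, check=True,
      )
      print("copies =", result.stdout.strip())

    As the man page notes, setting the property at filesystem creation time (zfs create -o copies=N) is preferable, since changing it later does not rewrite existing data.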
  • by sirwired ( 27582 ) on Saturday June 14, 2014 @10:06AM (#47236127)

    Due to their commanding smartphone market share, along with millions of devices with embedded Linux shipped every year, wouldn't Samsung be the largest UNIX vendor?

    Oh? What's that? You weren't counting embedded Linux and I'm a pedantic #$(*#$&@!!!. Can't argue with that!

  • by sribe ( 304414 ) on Saturday June 14, 2014 @10:15AM (#47236183)

    In a footnote he admits that the corruption was caused by hardware issues, not HFS+ bugs, and of course the summary ignores that completely.

    So, for that, let me counter his anecdote with my own: I have an HFS+ volume with a collection of over 3,000,000 files on it. This collection started in 2004, approximately 50 people access thousands of files on it per day, and occasionally, after upgrades or problems, it gets a full byte-for-byte comparison against one of three warm standbys. No corruption found, ever.

  • Clueless article (Score:5, Informative)

    by alexhs ( 877055 ) on Saturday June 14, 2014 @10:27AM (#47236227) Homepage Journal

    People talking about "bit rot" usually have no clue, and this guy is no exception.

    It's extremely unlikely that a file would become silently corrupted on disk. Block devices include per-block checksums, so you either get a read error (maybe he did) or the data read is the same as the data previously written. As far as I know, ZFS doesn't help to recover data from read errors. You would need RAID and/or backups.

    Main memory is the weakest link. That's why my next computer will have ECC memory. So, when you copy the file (or otherwise defragment or modify it, etc.), you read a good copy, a bit flips in RAM, and you write back corrupted data. Your disk receives the corrupted data and happily computes a checksum, thereby ensuring you can read back your corrupted data faithfully. That's where ZFS helps. Using checksumming scripts is a good idea, and I do it myself (a sketch of such a script follows this comment). But I don't have auto-defrag on Linux, so I'm safer: when I detect a corrupted copy, I still have the original.

    ext2 was introduced in 1993, and so was NTFS. ext4 is just ext2 updated (ext was a different beast). If anything, HFS+ is more modern, not that it makes a difference; all of them have been updated over the years. By the way, I noticed recently that Mac OS X resource forks sometimes contain a CRC32; I spotted one in a file coming from Mavericks.
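
    The checksumming approach mentioned above can be a short script: record a SHA-256 hash and modification time for every file once, then re-verify later. A file whose hash changed while its modification time did not is a likely bitrot candidate, since nothing legitimately rewrote it. A minimal sketch in Python; the manifest path and command names are illustrative:

      import hashlib
      import json
      import os
      import sys

      MANIFEST = "checksums.json"  # illustrative manifest location

      def sha256(path, bufsize=1 << 20):
          # Hash the file in chunks so large photos don't need to fit in memory.
          h = hashlib.sha256()
          with open(path, "rb") as f:
              while chunk := f.read(bufsize):
                  h.update(chunk)
          return h.hexdigest()

      def build(root):
          # Record hash and mtime for every file under root.
          manifest = {}
          for dirpath, _, names in os.walk(root):
              for name in names:
                  p = os.path.join(dirpath, name)
                  manifest[p] = {"sha256": sha256(p), "mtime": os.path.getmtime(p)}
          with open(MANIFEST, "w") as f:
              json.dump(manifest, f, indent=2)

      def verify():
          with open(MANIFEST) as f:
              manifest = json.load(f)
          for p, rec in manifest.items():
              if not os.path.exists(p):
                  print("MISSING ", p)
              elif sha256(p) != rec["sha256"]:
                  # Hash differs but mtime is unchanged: likely silent corruption.
                  kind = "BITROT? " if os.path.getmtime(p) == rec["mtime"] else "MODIFIED"
                  print(kind, p)

      if __name__ == "__main__":
          if sys.argv[1] == "build":
              build(sys.argv[2])
          else:
              verify()

    Run it once as "build" over an archive, then periodically as "verify"; anything flagged BITROT? should be restored from a backup made before the manifest date.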

  • Re:Backup? (Score:2, Informative)

    by Antique Geekmeister ( 740220 ) on Saturday June 14, 2014 @10:41AM (#47236285)

    The bitrot will change the checksums and cause the files to show up as modified.

    Moreover, what will you do about a reported bitrotted file unless you have genuine archival backups somewhere else?

  • by the_B0fh ( 208483 ) on Saturday June 14, 2014 @10:58AM (#47236357) Homepage

    Not if your OS is intimately tied to your filesystem. Linux might not be, because a large number of things are abstracted out, but FreeBSD depends on its file system, and Solaris took a very long time and a lot of effort before it could boot off ZFS. Forget about moving Windows off NTFS. Apple actually did some work on putting it onto ZFS; maybe they will continue.

  • Re:Backup? (Score:2, Informative)

    by Anonymous Coward on Saturday June 14, 2014 @12:21PM (#47236663)

    Mac OS X's Time Machine works by listening to filesystem events, except for the first backup, where everything is copied over as-is. Bit rot doesn't get transferred until you overwrite the file, by which time it should have been obvious something was fishy, or the bitrot was negligible and you didn't notice it yourself. There are also situations where Time Machine itself says "this backup is fishy, regenerate from scratch?". That happened to me last week, but only after a failed drive had to be replaced, which caused a 150GB backup.

  • by O('_')O_Bush ( 1162487 ) on Saturday June 14, 2014 @01:30PM (#47237023)
    Versus the chance that you back up already-corrupted files and don't notice until you've aged off the good versions.
  • by maccodemonkey ( 1438585 ) on Saturday June 14, 2014 @03:57PM (#47237569)

    They did try to replace the file system around the time of the Intel switch. It got killed by licensing problems.
    http://appleinsider.com/articl... [appleinsider.com]
