
Mac OS X 10.3 Defrags Automatically

EverLurking writes "There is a very interesting discussion over at Ars' Mac Forum about how Mac OS X 10.3 has implemented an on-the-fly defragmentation scheme for files on the hard drive. Apparently it uses a method known as 'Hot-File-Adaptive-Clustering' to consolidate fragmented files that are under 20 MB in size as they are accessed. Source code from the Darwin 7.0 kernel is cited as proof that this is happening."
This discussion has been archived. No new comments can be posted.

  • Amortized cost... (Score:4, Interesting)

    by Ianoo ( 711633 ) on Thursday October 30, 2003 @06:25PM (#7353112) Journal
    Obviously doing this process slows down file access a little. I wonder whether any safeguards are in place, such as turning the feature off once a certain I/O load is reached? If not, this may not be such a good idea.

    Also, I wonder how the extra time (perhaps 500 ms) to defragment each fragmented 20 MB file compares against doing a manual defrag every month, and whether it's actually worth it...

    Don't some Linux filesystems already do this to some extent? I could be hallucinating again, but I'm sure I read this somewhere.
    • The process could be delayed until the disk isn't being used. The file would still use twice as many blocks but with today's hard drives that shouldn't be a problem.

      As for Linux filesystems, they don't support FileIDs, so who cares |-)
      • I think it's a battery-saving feature for the new PowerBooks...
        • How do you figure, out of curiosity? Wouldn't these occasional defragmentation operations tend to counteract the power saving option that puts the drive to sleep as much as possible? Or do you assume that wouldn't be an issue because the defrags only happen when you're using the drive anyway? Even if that's the case, this is still causing more disc activity than would happen ordinarily, so I can't see how there would be a net gain as far as power saving for the battery goes...
          • ...was meant as humor! Obviously increasing background disk activity is gonna suck juice outta yer battery like a Whitehouse intern...
            • Ahhh.... :-)

              /me readjusts his sarcasm detector

            • Re:I think it's... (Score:3, Insightful)

              by coolgeek ( 140561 )
              Perhaps reading the source will reveal some insight for you all as it has for me. The defrag takes place on an hfs_open call, thus the disk, if not spinning at the time, will be powered on shortly. It also is NOT a background operation, and only applies to files that are being opened.

              I believe the rationale is that it takes little more than the same number of I/Os to defrag as it would take to read the file once, and it will take fewer I/Os on subsequent accesses to the file (after defrag), which would ap

    • OK, so I was only partially hallucinating. Searching Google reveals that most Linux filesystems don't usually need to be defragmented simply because they're better designed. I don't quite understand the reasons, however. Anyone have information?
      • Re:Amortized cost... (Score:5, Informative)

        by jrstewart ( 46866 ) on Thursday October 30, 2003 @06:58PM (#7353455) Homepage
        I can only speak for ext2/ext3. Linux tries to preallocate large blocks when writing files to prevent fragmentation. If your disk is mostly full (or even was once mostly full) or you have heavy concurrent disk activity going on, you can still get fragmentation.

        The end goal of the disk subsystem is to get your data to you as soon as you need it. In general that goal would be achieved if the data you want to read next happened to be under the read head. If you're reading sequentially through a single file then this will happen when the file is in a single contiguous region (i.e. unfragmented). For any other access pattern fragmentation doesn't matter as much, since you'll be skipping around the disk regardless of how the files are arranged.

        Prefetching heuristics and caching can hide a lot of these problems from the user as well.
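
        As a userland illustration of that preallocation idea (this is only a sketch, not what ext2/ext3 does internally; the file name and size below are made up), a program can reserve space up front with posix_fallocate() so the allocator has a chance to pick one contiguous run of blocks:

          /* Sketch: reserve blocks before writing so the filesystem can try to
           * place them contiguously instead of growing the file write by write.
           * Userland illustration only; "example.dat" and 16 MB are arbitrary. */
          #include <fcntl.h>
          #include <stdio.h>
          #include <string.h>
          #include <unistd.h>

          int main(void)
          {
              const off_t size = 16 * 1024 * 1024;  /* 16 MB, arbitrary */
              int fd = open("example.dat", O_CREAT | O_WRONLY, 0644);
              if (fd < 0) { perror("open"); return 1; }

              int err = posix_fallocate(fd, 0, size);  /* reserve the space up front */
              if (err != 0) {
                  fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
                  close(fd);
                  return 1;
              }

              /* ... now write the data into the reserved, hopefully contiguous, space ... */
              close(fd);
              return 0;
          }
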
        • For any other access pattern fragmentation doesn't matter as much, since you'll be skipping around the disk regardless of how the files are arranged.

          For an extent-based FS, a less fragmented file is more likely to have whatever part of its extent table is needed in memory to do a read or write on it. (For a block-mapped one that's a non-issue, because the block map takes space proportional to the file size, not to the number of fragments.)

      • information (Score:4, Insightful)

        by djupedal ( 584558 ) on Thursday October 30, 2003 @09:53PM (#7354644)
        Try this... [salmar.com]

        ...and this... [linuxgazette.com]
    • No such limits. (Score:4, Informative)

      by Trillan ( 597339 ) on Thursday October 30, 2003 @09:55PM (#7354655) Homepage Journal

      The source code is posted to that thread; the only conditions are (1) at least 3 minutes have passed since the system started (i.e. it avoids doing this while booting up), (2) the file is less than 20 MB in size, and (3) the file isn't already open.

      The only negative consequence is a possible speed hit, though. There's no danger.

      I'm pretty impressed by this. Sure, it's been done before. Sure, there are more elaborate methods. But this is just a simple little lump of code that'll defragment the worst files most of the time.
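
      Just to make those conditions concrete, here is a rough sketch in C of what the decision boils down to. The struct, field, and constant names are invented stand-ins for illustration, not the identifiers used in the Darwin source:

        /* Hypothetical sketch of the on-open defrag decision described above.
         * All names are invented for illustration; they are not the identifiers
         * from the Darwin 7.0 kernel source. */
        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        #define MAX_DEFRAG_SIZE  (20 * 1024 * 1024)  /* 20 MB cap */
        #define MIN_UPTIME_SECS  (3 * 60)            /* skip while still booting */
        #define MIN_FRAGMENTS    8                   /* only badly fragmented files */

        struct open_file_info {
            uint64_t size_bytes;       /* logical file size */
            uint32_t fragment_count;   /* number of extents on disk */
            uint32_t open_count;       /* how many opens are already outstanding */
            bool     volume_journaled; /* journaled HFS+ volume? */
        };

        static bool should_defrag_on_open(const struct open_file_info *f,
                                          uint64_t uptime_seconds)
        {
            if (uptime_seconds < MIN_UPTIME_SECS)   return false; /* still booting */
            if (!f->volume_journaled)               return false; /* journaled HFS+ only */
            if (f->size_bytes >= MAX_DEFRAG_SIZE)   return false; /* 20 MB limit */
            if (f->open_count > 0)                  return false; /* file already open */
            if (f->fragment_count <= MIN_FRAGMENTS) return false; /* not fragmented enough */
            return true; /* relocate the file into one contiguous run */
        }

        int main(void)
        {
            struct open_file_info f = { 5 * 1024 * 1024, 12, 0, true };
            printf("defrag on open? %s\n", should_defrag_on_open(&f, 600) ? "yes" : "no");
            return 0;
        }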

      • I'm a bit less sure of this one, but I think the file must be severely fragmented (i.e. into 8 pieces).
      • The only negative consequence is a possible speed hit, though.

        Isn't the point of keeping your drive defrag'ed to increase the performance of reading and writing?

        With 200+ GB hard drive capacities becoming ubiquitous, the performance hit of on-the-fly defragging is worthwhile. Over the long run, it improves the performance of a machine to keep it defragged, right?

        • Apple's choice of algorithm defrags files as it encounters them. If you are upgrading a drive, it may have a lot of fragmented files that need to be defragmented. So initially, it will be very slow.

          Once the code has been exercised for a while, yeah, it'll be much faster.

  • I'd still like to defragment larger files even if it's done manually. Plus Optimizer requires a reboot.
    • sudo find / -exec wc {} \;

      this should defrag all of the files of 20 MB or less on your hard drive.

      it locates every file, opens it and reads every byte, then closes it.

      This should force the defragger to run on all files under 20 MB. Note that technically the defragger only activates when the file is broken into more than 8 extent regions, so this does not actually defrag everything.

      but it's also possible that having the file broken into fewer extents is harmless. first because the first 8 extents are the fastt

      • sudo find / -exec wc {} \;

        That'll also read them even if they don't need to be defragged. This may be better:
        sudo find / -exec head {} >/dev/null \;

        Left as an exercise to the reader (a rough C sketch follows this list):

        • only run on stuff less than 20 MB (not that that will save you much, but it is a good way to learn how to use random tools)
        • Sort by access time, and head the files in that order so the most recently (and hopefully frequently) accessed files have more chance of being defragged than the older files
        • Parallelise it, and see
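
        Here is the rough C sketch mentioned above, covering the first two exercises (only files under 20 MB, walked in most-recently-accessed-first order and read so the open triggers). Purely illustrative: error handling is minimal, it uses nftw(), which older Mac OS X releases may not ship, and the "/Users" default starting point is just an example.

          /* Rough, illustrative sketch: collect regular files under 20 MB,
           * sort them by access time (most recent first), then open and read
           * a byte of each so any on-open defrag logic gets a chance to run.
           * Error handling is deliberately minimal. */
          #define _XOPEN_SOURCE 700
          #include <ftw.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <sys/stat.h>
          #include <time.h>

          #define SIZE_LIMIT (20 * 1024 * 1024)

          struct entry { char *path; time_t atime; };
          static struct entry *entries;
          static size_t count, capacity;

          static int visit(const char *path, const struct stat *st, int type, struct FTW *ftw)
          {
              (void)ftw;
              if (type != FTW_F || st->st_size >= SIZE_LIMIT)
                  return 0;                        /* skip non-files and big files */
              if (count == capacity) {
                  capacity = capacity ? capacity * 2 : 1024;
                  entries = realloc(entries, capacity * sizeof *entries);
                  if (!entries) return 1;          /* stop the walk on OOM */
              }
              entries[count].path = strdup(path);
              entries[count].atime = st->st_atime;
              count++;
              return 0;
          }

          static int by_atime_desc(const void *a, const void *b)
          {
              const struct entry *x = a, *y = b;
              return (y->atime > x->atime) - (y->atime < x->atime);
          }

          int main(int argc, char **argv)
          {
              const char *root = argc > 1 ? argv[1] : "/Users";  /* example default */
              if (nftw(root, visit, 32, FTW_PHYS) != 0)
                  perror("nftw");

              qsort(entries, count, sizeof *entries, by_atime_desc);

              for (size_t i = 0; i < count; i++) {
                  FILE *f = fopen(entries[i].path, "rb");
                  char byte;
                  if (!f) continue;
                  if (fread(&byte, 1, 1, f) == 0) { /* empty file; nothing to read */ }
                  fclose(f);
              }
              return 0;
          }
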
  • Hmm... (Score:1, Offtopic)

    by Ianoo ( 711633 )
    I also wonder how well Ars OpenForum will hold up to a Slashdotting. They run Infopop [infopop.com] Opentopic which is Java backed by Oracle on clusters, so I'd imagine pretty darned well. Then again, it's still dynamic content with a bigass XSL (XML to HTML) transform on the top before it reaches the browser...
  • by speechpoet ( 562513 ) on Thursday October 30, 2003 @07:34PM (#7353806) Homepage Journal

    In my day, we'd crack open the drive on our Mac SE/30s, sharpen a magnet on a whetstone, and defrag that sucker by hand.

    Kids these days. It's the MTV, ya know - makes 'em lazy.

  • What exactly are.... (Score:3, Interesting)

    by Creepy Crawler ( 680178 ) on Thursday October 30, 2003 @07:37PM (#7353828)
    MacOS FileIDs?

    Are they comparable to what Reiser4 will have? Are they better than the XYZ offerings in Linux?

    I'm seriously interested in what EXACTLY they are. Please spare the fanboy attitude if you do wish to answer.
    • A more technical explanation: http://developer.apple.com/technotes/tn/tn1150.html [apple.com]
      • Ahh, thanks ;-)

        After looking through the basics, isn't the FileID something similar to what Hans did in Reiser? Of course, in the earlier versions, the "same ID bug" got my /usr/X11R6 directory mixed up with /etc. Let's just say I've not used Reiser3 again...
        • I have no idea, you'll have to ask the author.

          If you have multiple files or directories referring to one file that sounds more like an inode than a FileID.
        • After looking through the basics, isn't the FileID something similar Hans did in Reiser?

          Any filesystem that supports hard links has to have something like a Mac FileID. Traditional Unix filesystems call them "inode numbers" or "i-numbers", try "ls -i" sometime.

          On Mac OS you can open a file given its FileID, but on most Unixes you cannot (I assume OS X can, and some versions of SunOS/Solaris can as part of one of the backup products). It opens a small security hole where you put files in a directory
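
          For what it's worth, the Unix-side identifier is easy to see from C as well; this little example (the path is just an example) prints what "ls -i" shows, although, as noted, stock Unix gives you no open-by-inode call to go with it:

            /* Print a file's inode number -- the same value "ls -i" shows.
             * Illustrative only: this shows the identifier exists, not that
             * you can open a file by it; standard Unix has no such call. */
            #include <stdio.h>
            #include <sys/stat.h>

            int main(int argc, char **argv)
            {
                const char *path = argc > 1 ? argv[1] : "/etc/hosts";  /* example path */
                struct stat st;
                if (stat(path, &st) != 0) {
                    perror("stat");
                    return 1;
                }
                printf("%s: inode %llu\n", path, (unsigned long long)st.st_ino);
                return 0;
            }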

          • In short Unix has something exactly like FileIDs
            BULLSHIT!!!
            FileIDs are not inodes. They are NOT equivalent as I pointed out elsewhere.
            • FileIDs are not inodes. They are NOT equivalent as I pointed out elsewhere.

              Um, all I saw you say one can do with FileIDs that you can't do on most Unixish systems is #1 open files directly by them, and #2 convert them to a name without a full scan of the filesystem. Am I wrong?

              If I'm right, they are equivalent data structures, but the operations you want are not normally available. With pretty minimal work one could put both operations into an Open Source Unixish system. I would say 3 hours work

                Firstly, I'd like to apologize for acting like some ass with Tourette's syndrome.

                Secondly the problem is the roles are in reverse. The FileID is like the path in UFS. You don't have a situation where multiple paths can point to one FileID because the path is more like a file attribute. In other words the FileID (or CNID) has a path, not the other way around.

                On UNIXish filesystems it's the path which has the inode, not the other way around.
  • by BortQ ( 468164 ) on Thursday October 30, 2003 @08:11PM (#7354051) Homepage Journal
    It seems as if the /. crowd isn't all that impressed by this advance by Apple.

    Well that's fine. The real upside of this is for people that have never heard of /. and don't really know what a hard drive is, let alone know how to defrag one.

    Previously these people would just go forever without defragging. Now they can still do that, because Apple is doing it for them behind the scenes.

    This is yet one more example of Apple's winning philosophy: Keep it simple, make it better.

    • With HFS+ and UFS, you simply don't need to defrag. The very nature of the filesystems keeps things from getting too fragmented. It might get you some trivial performance advantages for some specialized activities, but for everyday use it's simply not needed.

      The upshot is that 95% of this is simply marketing to wow the Windows users raised on FAT. IMHO.
  • Damn! (Score:4, Funny)

    by csoto ( 220540 ) on Thursday October 30, 2003 @09:39PM (#7354543)
    That's gonna mess up my UT 2003 ranking! I work hard for those frags! Every one of them!

  • Windows XP has a similar feature that waits until the computer has not been in use for a certain amount of time. It would make sense that Apple would give users the same option.
    • Oh, neat. Where's the option? Or is it always on?
    • Windows XP has a similar feature that waits until the computer has not been in use for a certain amount of time. It would make sense that Apple would give users the same option.

      I'm not sure the Windows approach is really better. Notice that the Apple approach is more minimalist in moving files.

      • If you aren't actively using a file it won't get moved--that's good, since moving a file always entails a tiny but finite risk of corruption.
      • (notice that the Apple method relies on journaling to save your butt if
        • If you aren't actively using a file it won't get moved--that's good, since moving a file always entails a tiny but finite risk of corruption.

        Not if you do it intelligently (copy data, compare to original, delete original as an atomic operation).

        (notice that the Apple method relies on journaling to save your butt if the computer crashes mid-write.)

        That mightn't save your file data. AFAIK HFS+'s journalling is metadata only.

        the Windows program won't be able to move files that are currently open (I would
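
        As a sketch of that copy-compare-swap idea (userland only, nothing like Apple's in-kernel extent relocation; the file names are placeholders), the atomic step is a rename() over the original, which POSIX guarantees replaces the target in one shot:

          /* Userland sketch of "copy, compare, then atomically replace".
           * Placeholder file names; rename() is the atomic step. Not how the
           * in-kernel HFS+ relocation works -- that swaps extents in place. */
          #include <stdio.h>
          #include <string.h>

          static int copy_file(const char *src, const char *dst)
          {
              FILE *in = fopen(src, "rb"), *out = fopen(dst, "wb");
              char buf[65536];
              size_t n;
              int ok = (in && out);
              while (ok && (n = fread(buf, 1, sizeof buf, in)) > 0)
                  ok = (fwrite(buf, 1, n, out) == n);
              if (in) fclose(in);
              if (out && fclose(out) != 0) ok = 0;  /* catch deferred write errors */
              return ok ? 0 : -1;
          }

          static int files_identical(const char *a, const char *b)
          {
              FILE *fa = fopen(a, "rb"), *fb = fopen(b, "rb");
              char ba[65536], bb[65536];
              size_t na, nb;
              int same = (fa && fb);
              while (same) {
                  na = fread(ba, 1, sizeof ba, fa);
                  nb = fread(bb, 1, sizeof bb, fb);
                  if (na != nb || memcmp(ba, bb, na) != 0) same = 0;
                  if (na == 0) break;               /* both hit EOF together */
              }
              if (fa) fclose(fa);
              if (fb) fclose(fb);
              return same;
          }

          int main(void)
          {
              const char *orig = "data.bin";           /* placeholder paths */
              const char *temp = "data.bin.rewrite";

              if (copy_file(orig, temp) != 0) { perror("copy"); remove(temp); return 1; }
              if (!files_identical(orig, temp)) {      /* verify before committing */
                  remove(temp);
                  fprintf(stderr, "copy mismatch, original left untouched\n");
                  return 1;
              }
              if (rename(temp, orig) != 0) {           /* the atomic replace */
                  perror("rename");
                  remove(temp);
                  return 1;
              }
              return 0;
          }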

        • Not if you do it intelligently (copy data, compare to original, delete original as an atomic operation).

          The algo Apple uses is: get a write lock on the file, write the current data out as if it were being appended (which attempts to write it in as few chunks as possible), then get a read lock on the file, then free up the "old part" of the file and adjust metadata to make the newly written blocks be the start of the file. I assume something prevents the file from looking twice as long as it should dur

  • by ploiku ( 217526 ) on Friday October 31, 2003 @05:06AM (#7356420) Homepage
    The summary appears not to be quite right.

    To clarify, there are 2 separate file optimizations going on here.

    The first is automatic file defragmentation. When a file is opened, if it is highly fragmented (8+ fragments) and under 20MB in size, it is defragmented. This works by just moving the file to a new, arbitrary, location. This only happens on Journaled HFS+ volumes.

    The second is the "Adaptive Hot File Clustering". Over a period of days, the OS keeps track of files that are read frequently - these are files under 10MB, and which are never written to. At the end of each tracking cycle, the "hottest" files (the files that have been read the most times) are moved to a "hotband" on the disk - this is a part of the disk which is particularly fast given the physical disk characteristics (currently sized at 5MB per GB). "Cold" files are evicted to make room. As a side effect of being moved into the hotband, files are defragmented. Currently, AHFC only works on the boot volume, and only for Journaled HFS+ volumes over 10GB.
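
    A toy model of that bookkeeping, just to make the numbers concrete (the data, names, and array-based approach below are made up; the real implementation lives in the HFS+ code and keeps its records on disk): size the hotband at 5 MB per GB of volume, rank files by how often they were read during the cycle, and greedily keep the hottest sub-10 MB files that fit.

      /* Toy model of adaptive hot file clustering bookkeeping -- made-up data
       * and structures for illustration only. */
      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>

      #define HOT_FILE_MAX_BYTES   (10 * 1024 * 1024)  /* only files under 10 MB */
      #define HOTBAND_BYTES_PER_GB (5 * 1024 * 1024)   /* 5 MB of hotband per GB */

      struct tracked_file {
          const char *name;
          uint64_t size_bytes;
          uint32_t reads_this_cycle;   /* "temperature" for this tracking cycle */
      };

      static int hotter_first(const void *a, const void *b)
      {
          const struct tracked_file *x = a, *y = b;
          return (y->reads_this_cycle > x->reads_this_cycle) -
                 (y->reads_this_cycle < x->reads_this_cycle);
      }

      int main(void)
      {
          uint64_t volume_gb = 80;                              /* made-up 80 GB disk */
          uint64_t hotband = volume_gb * HOTBAND_BYTES_PER_GB;  /* 400 MB hotband */

          struct tracked_file files[] = {                       /* made-up sample data */
              { "mail index",     6 * 1024 * 1024,       240 },
              { "browser cache",  2 * 1024 * 1024,        90 },
              { "big disk image", 700ULL * 1024 * 1024,  500 },  /* too big: skipped */
              { "prefs plist",    64 * 1024,              30 },
          };
          size_t n = sizeof files / sizeof files[0];

          qsort(files, n, sizeof files[0], hotter_first);

          uint64_t used = 0;
          for (size_t i = 0; i < n; i++) {
              if (files[i].size_bytes >= HOT_FILE_MAX_BYTES) continue;  /* 10 MB rule */
              if (used + files[i].size_bytes > hotband) continue;       /* hotband full */
              used += files[i].size_bytes;
              printf("move into hotband: %s (%u reads this cycle)\n",
                     files[i].name, files[i].reads_this_cycle);
          }
          return 0;
      }
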
  • I wonder how that interacts with the "secure" delete. Does it seek out previous copies of the file and securely delete them too? That would be quite a feat.

    (Also, has anyone confirmed that the code snippet is actually executed?)

"I'm a mean green mother from outer space" -- Audrey II, The Little Shop of Horrors

Working...