Slashdot
Measuring Fragmentation in HFS+
Posted by
pudge
on Wed May 19, 2004 12:03 PM
from the bring-out-the-big-guns dept.
keyblob8K writes "Amit Singh takes a look at fragmentation in HFS+. The author provides numbers from his experiments on several HFS+ disks, and more interestingly he also provides the program he developed for this purpose. From his own limited testing, Apple's filesystem seems pretty solid in the fragmentation avoidance department. I gave hfsdebug a whirl on my 8-month-old iMac and the disk seems to be in good shape. I don't have much idea about ext2/3 or reiser, but I know that my NTFS disks are way more fragmented than this after a similar amount of use."
This discussion has been archived.
Huh? (Score:5, Insightful)
Is this based off of instinct, actual data, or what?
Re:Huh? (Score:5, Funny)
(Last Journal: Friday February 18 2005, @03:11PM)
Re:Huh? (Score:5, Informative)
My own experience, using a small tool I wrote to analyze NTFS fragmentation:
NTFS is pretty good at avoiding fragmentation when creating new files if the size of the file is set before it is written. In other words, if the file is created, the EOF set, and then the file data is written, NTFS does a good job of finding a set of contiguous clusters for the file data.
NTFS does a poor job of avoiding fragmentation for files written sequentially. Consider a file retrieved with wget. An empty file is created, then the contents are written sequentially as they are read from the net. Odds are, the file data will be scattered all over the disk.
Here's a concrete example. Today, I downloaded Andrew Morton's 2.6.6-mm4.tar.bz2 patch set. (Yes, I run WinXP on my Toshiba laptop -- deal with it.) Anyway, the file is less than 2.5MB, but it is allocated in 19 separate fragments. I copied it to another file, and that file is unfragmented. Since the copy command sets EOF before writing the data, NTFS can try to allocate a contiguous run of clusters.
Note - This was done on uncompressed NTFS. My feeling is that compressed NTFS is even worse about fragmentation, but I don't have any numbers to back that up.
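The difference described above, setting EOF first versus appending sequentially, can be sketched in Python. This is a minimal illustration, not the analysis tool mentioned in this comment: `write_sequential` and `write_preallocated` are hypothetical names, and `truncate()` is used here simply as a portable way to set the file size before any data is written. Whether the allocator then finds a contiguous run (or reserves space at all) is up to the filesystem.

```python
import os
import tempfile


def write_sequential(path, chunks):
    """Append chunks as they arrive, the way a downloader typically does.

    The filesystem never learns the final size in advance.
    """
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)


def write_preallocated(path, chunks, total_size):
    """Set the file size (EOF) first, then fill in the data.

    This gives the filesystem a chance to pick one run of clusters;
    whether it actually does so is filesystem-dependent (an assumption).
    """
    with open(path, "wb") as f:
        f.truncate(total_size)  # set EOF before any data is written
        f.seek(0)
        for chunk in chunks:
            f.write(chunk)


if __name__ == "__main__":
    data = [b"x" * 4096 for _ in range(8)]
    size = sum(len(c) for c in data)
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "sequential.bin")
        b = os.path.join(d, "preallocated.bin")
        write_sequential(a, data)
        write_preallocated(b, data, size)
        print(os.path.getsize(a), os.path.getsize(b))
```

Both files end up with identical contents; only the order in which the filesystem learns the size differs.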
Re:Huh? (Score:5, Insightful)
(http://bitter-and-impotent-loser-counselling.com/ | Last Journal: Tuesday August 03 2004, @08:27PM)
Why would anybody have a problem with you running Windows XP on your laptop? I'm a card-carrying Linux Zealot, and I don't have a problem with it.
Re:Huh? (Score:5, Funny)
(Last Journal: Tuesday June 06 2006, @08:27PM)
Apparently you are actually a closet Rational Linux Advocate. I'm sure there are a few people in the drooling horde reading these comments that will have a problem with someone being foolish enough to actually choose to run Windows on anything ;)
I run Gentoo on my laptop, but the specs on the crusty old thing are so low that my only other "choice" would be to run Windows 95, and I'd sooner eat my usb key than do that.
Re:Huh? (Score:5, Interesting)
(Last Journal: Tuesday October 29 2002, @10:47AM)
For the record I also use XP on my laptop. Until everything works perfectly out of the box, ACPI and all, I'm not installing any *nix on it.
How to determine fragmentation... (Score:5, Funny)
1. Right click on drive icon, select properties
2. Select Tools tab and click on "Defragment Now"
3. Click on "Analyze"
4. When analysis finishes, click on "View Report"
This shows two list windows, one containing general properties of the disk such as volume size, free space, total fragmentation, file fragmentation and free space fragmentation. The second list shows all fragmented files and how badly they are fragmented.
Re:How to determine fragmentation... (Score:5, Insightful)
(http://www.popcornfilms.com/)
If you're not using the same tool to measure fragmentation on each OS, how do you know that they're using the same semantics to decide what a fragmented file is?
IIRC, the Linux tools use a different metric to calculate fragmentation than the NT ones.
Re:How to determine fragmentation... (Score:5, Informative)
NTFS is horrible. On a system installed less than a week ago, with a few programs on it (nwn, firefox, avg, itunes, aa, nvdvd, windows updates, and a couple more), it has 9.3GB used, and it reports "Total Fragmentation: 22%, File Fragmentation: 45%".
So yes, there are various methods of calculating file fragmentation. I can think of two: (# of files with fragments)/(total number of files), which is 0 for a totally defragmented hd (and gives nice percentages), and (# of file fragments)/(total number of files), which is 1 for a perfectly defragmented hd. Or variations on those. I haven't been able to find what calculations Windows and e2fstools use, so I can't tell.
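The two candidate metrics above can be written down directly. A minimal sketch, assuming each file is represented only by its fragment count (1 means contiguous) and reading "files with fragments" as "files in more than one piece". The function names are made up for illustration, and neither is claimed to be what Windows or the ext2 tools actually compute.

```python
def fraction_fragmented(fragment_counts):
    """(# of files in more than one piece) / (total number of files).

    0.0 for a totally defragmented disk.
    """
    if not fragment_counts:
        return 0.0
    fragmented = sum(1 for n in fragment_counts if n > 1)
    return fragmented / len(fragment_counts)


def fragments_per_file(fragment_counts):
    """(# of file fragments) / (total number of files).

    1.0 for a perfectly defragmented disk.
    """
    if not fragment_counts:
        return 1.0
    return sum(fragment_counts) / len(fragment_counts)


if __name__ == "__main__":
    # Hypothetical disk: three contiguous files and one file in 19 pieces.
    counts = [1, 1, 1, 19]
    print(fraction_fragmented(counts))  # 0.25
    print(fragments_per_file(counts))   # 5.5
```

The same hypothetical disk scores 25% on one metric and 5.5 on the other, which is why numbers from different tools can't be compared without knowing the formula behind them.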
Re:How to determine fragmentation... (Score:4, Informative)
(http://www.popcornfilms.com/)
As an example, look up the docs on ext2. Note that file fragments are not necessarily the same as fragmented files. Also note that people use the "file fragment" number as an indicator of how fragmented their ext2 partition is - which is wrong.
Re:How to determine fragmentation... (Score:4, Funny)
(http://edified.org/ | Last Journal: Wednesday May 14 2003, @02:00PM)
-Faithful Macuser
(ok I have a 3 button logitech)
HFS+ defrag source (Score:5, Informative)
(Last Journal: Friday May 21 2004, @12:42PM)
Re:HFS+ defrag source (Score:5, Interesting)
(http://www.exitthree.com/)
You've only defeated the purpose if you re-fragment the file again after opening it. If this isn't the case, the amortized cost (the initial cost of de-fragmentation when opening the first time minus the speed benefits from a file in a single chunk) over the many times the file is read yields a speed bonus, not a speed loss.
A good example is me, installing a program from disk onto my computer. I run the program and it accesses a group of files that have been fragmented when copied to my hard drive. The first time it opens the files it spends a little extra time de-fragmenting them. However, subsequent times that I open the program, these files will load faster.
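The amortization argument can be made concrete with a little arithmetic. A minimal sketch with made-up timings (none of these numbers come from measurements): if defragmenting costs a one-time `defrag_cost` and each later read saves `saving_per_read`, the work pays for itself once the number of reads exceeds their ratio.

```python
import math


def break_even_reads(defrag_cost, saving_per_read):
    """Smallest number of later reads after which defragmenting is a net win."""
    if saving_per_read <= 0:
        raise ValueError("defragmenting never pays off without a per-read saving")
    return math.floor(defrag_cost / saving_per_read) + 1


def net_benefit(defrag_cost, saving_per_read, reads):
    """Total time saved (positive) or lost (negative) after `reads` reads."""
    return saving_per_read * reads - defrag_cost


if __name__ == "__main__":
    # Hypothetical: 0.5 s one-time cost, 0.25 s saved on each later read.
    print(break_even_reads(0.5, 0.25))  # 3
    print(net_benefit(0.5, 0.25, 10))   # 2.0
```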
Re:HFS+ defrag source (Score:5, Informative)
I believe the actual sequence is this:
In other words, it defragments after the file has been returned to the program needing it, as a background process. The buffer to memory is a pre-existing optimization, so the only real trade-off is that background processor usage goes up. If you aren't doing major work at the time, you'll never notice. (And if you are doing major work, you probably are using files larger than 20MB in size anyway.)
Files larger than 20MB just aren't defragmented, unless you have another tool to do it.
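As a rough sketch of the rule described in this comment: only the 20MB ceiling comes from the text above. The function name, the reading of "MB" as 1024*1024 bytes, and the `min_fragments` threshold are hypothetical placeholders, and the real check lives inside the OS, not in user code.

```python
# The 20MB ceiling mentioned above (assumed here to mean 20 * 1024 * 1024 bytes).
MAX_AUTO_DEFRAG_BYTES = 20 * 1024 * 1024


def eligible_for_auto_defrag(size_bytes, fragment_count, min_fragments=2):
    """Return True if a file would be a candidate for defragmenting on open.

    min_fragments is a hypothetical knob: a file in a single piece has
    nothing to defragment. Files above the size ceiling are left alone.
    """
    if fragment_count < min_fragments:
        return False
    return size_bytes <= MAX_AUTO_DEFRAG_BYTES


if __name__ == "__main__":
    print(eligible_for_auto_defrag(2 * 1024 * 1024, 19))   # small, fragmented
    print(eligible_for_auto_defrag(700 * 1024 * 1024, 19)) # too large
```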
Re:HFS+ defrag source (Score:4, Informative)
(http://www.frostbytes.com/~jimf)
If you ever wondered why there is a "soft limit" on FFS filesystems, the reason why is that its allocator's effectiveness breaks down at about the point where the filesystem is 90% full. So they sacrifice 10% of the filesystem space so that they can avoid fragmentation problems. It's not a bad tradeoff, particularly these days.
I didn't know that HFS+ used an after-the-fact defragmentation system, but such systems have been around for a while too. Significant research was done into such things as part of log-based filesystem research in the early 1990s (reference BSD LFS and Sprite). You had to have a "cleaner" process with those filesystems anyway (to pick up abandoned fragments of the log and return them to the free pool) so it made sense to have it also perform some optimization features.
Re:HFS+ defrag source (Score:5, Informative)
(http://www.jazz-sax.com/)
Good luck
Re:HFS+ defrag source (Score:5, Interesting)
HFS+ has been around since OS 8.5 (?? somewhere in OS 8). So either this is a feature of HFS+ that hasn't been implemented until now, or it's a bit of code added to Panther. Or has HFS+ been updated?
Re:HFS+ defrag source (Score:5, Informative)
(http://www.macgeekery.com/)
Re:HFS+ defrag source (Score:5, Insightful)
(http://www.macgeekery.com/)
Then you didn't check hard. Again, HFS+ is a specification of how to write data to media in order to organize another collection of data. The implementation is what handles the defragging. There are no drivers involved as drivers are the software component of a hardware/software union and there is no hardware involved at this level (just logical organization).
Re:HFS+ defrag source (Score:4, Informative)
So therefore it might be a part of the operating system's filesystem. That's the system that deals with files. But that's not what was asked. What was asked was whether it was an inherent feature of HFS+, and that's not possible, since HFS+ doesn't tell the OS what to do when a file is opened, only how the stuff is stored on the disk.
Perhaps you didn't understand the dual nature of the word filesystem: it can be the subsystem of the OS that handles files, or it can be the physical representation of the data on the hard drive. If you assume it's only the first, your explanation makes sense. If you assume the second one (which would be the usage intended and understood by most people given the fact that the question and response were about HFS+ (physical filesystem) compared to Panther (OS filesystem)), then you'd be wrong.
And I've been trolled, but who cares.
Re:HFS+ defrag source (Score:5, Informative)
(Last Journal: Thursday March 25 2004, @06:59PM)
And the person who came up with this idea was a genius. This is far far better than what most other operating systems do (refuse to mount the volume.)
If I boot MS-DOS on a machine that has FAT-32 or NTFS volumes, I simply don't find any volume. I can't tell the difference between an unsupported file system and an unformatted partition. If the file system would create a FAT-compatible read-only stub (like HFS+ does), it would be much better for the user. Instead of thinking you have a corrupt drive, you'd know that there is a file system that your OS can't read.
Re:HFS+ defrag source (Score:5, Insightful)
http://developer.apple.com/documentation/Perfor
In theory, when you install anything (on any system) and have a reasonable amount of contiguous free space on your disk, the installed files should always be unfragmented since I believe that's what most file systems look for first to allocate: a large chunk of contiguous space.
Fragmentation typically occurs more when you open a file, increase its size, and write it back out. But operations that write large files to disk without knowing beforehand what the final size will be may also do this to some files that were only written once to your disk. For example, some of the largest fragmented files on my HFS+ volume are things snagged with BitTorrent. The fragments in these files are very regular chunks of blocks, which could be the typical 'buffer' size BT grabs when writing.
File allocation Table (Score:4, Interesting)
(Last Journal: Wednesday July 16 2003, @04:29PM)
Re:File allocation Table (Score:5, Insightful)
Re:File allocation Table (Score:5, Informative)
(http://www.intelligentblogger.com/ | Last Journal: Monday August 27, @11:47AM)
You're probably thinking "just store the size of the file". This is perfectly valid, but it does have certain implications. You see, in Comp-Sci, we refer to a list like this as a "linked list". The concept basically being that each item in the list has information (i.e. a "link") that helps identify the next item in the list. Such a data structure has a worst case access time of O(n). In other words, if your item is at the end of the list and you have 2000 files, you'll have to check through all two thousand headers before finding your file.
Popular file systems circumvent this by using what's called a Tree structure. A tree is similar to a linked list, but allows for multiple links that point to children of the node. A node that has no children is referred to as a "leaf node". In a file system the directories and files are nodes of a tree, with files being leaf nodes. This configuration gives us two performance characteristics that we must calculate for:
1. The maximum number of children in a node.
2. The maximum depth of the tree.
Let's call them "c" for children and "d" for depth. Our performance formula is now O(c*d) and is irrespective of the number of items in the data structure. Let's make up an example to run this calculation against:
Path:
Nodes:
/ (34)
Longest path:
Plugging the above numbers (72 for c, 4 for d) we get a worst case of 72*4 = 288 operations. Thus our worst case is much better than the linked list. And if we calculate the real case to access
Hope this helps.
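The two worst cases above can be compared with a small counting sketch. It models a directory as a plain mapping of child names that is scanned linearly, which is an assumption for illustration and not how any particular filesystem stores directories. The numbers 2000, 72 and 4 are the ones used in this comment; the tiny tree at the end is made up.

```python
def linked_list_worst_case(n_files):
    """Worst case for a flat linked list: every header is checked."""
    return n_files


def tree_worst_case(max_children, max_depth):
    """Worst case for a tree scanned linearly at each level: c * d."""
    return max_children * max_depth


def lookup(tree, path):
    """Walk a nested-dict tree, counting name comparisons along the way.

    `tree` maps a name to a sub-dict (directory) or None (file).
    Returns (found, comparisons).
    """
    comparisons = 0
    node = tree
    for part in path:
        if not isinstance(node, dict):
            return False, comparisons
        found = False
        for name in node:  # linear scan of this directory
            comparisons += 1
            if name == part:
                node = node[name]
                found = True
                break
        if not found:
            return False, comparisons
    return True, comparisons


if __name__ == "__main__":
    print(linked_list_worst_case(2000))  # 2000
    print(tree_worst_case(72, 4))        # 288
    # Tiny made-up tree: the target sits last in each directory.
    tree = {"a": {}, "b": {}, "usr": {"x": None, "y": None, "f.txt": None}}
    print(lookup(tree, ["usr", "f.txt"]))  # (True, 6)
```

In the made-up tree, c is 3 and d is 2, so the 6 comparisons counted by `lookup` match the c*d bound exactly.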
Measuring fragmentation in NTFS (Score:2, Informative)
This was my PhD Thesis.
Re:Measuring fragmentation in NTFS (Score:5, Interesting)
C: 5,72 GB Total, 1,97 GB (34%) Free, 4% Fragmented (8% file fragmentation)
D: 40,00 GB Total, 1,00 GB (2%) Free, 41% Fragmented (82% file fragmentation)
E: 66,69 GB Total, 105 MB (0%) Free, 10% Fragmented (21% file fragmentation)
F: 30,00 GB Total, 1,21 GB (4%) Free, 3% Fragmented (7% file fragmentation)
G: 10,00 GB Total, 1,54 GB (15%) Free, 5% Fragmented (9% file fragmentation)
H: 35,03 GB Total, 551 MB (1%) Free, 39% Fragmented (79% file fragmentation)
D ("Dump") and H ("Online") get a lot of throughput, by personal computing standards anyway. E ("Games") doesn't get changed that much, but when it does, a lot of data leaves and comes. Seems like whenever I defrag D or H, they're back to the values above within days. I guess Win XP has a hard time doing the internal on-the-fly defragging of hard drives that rarely have more than 1% free space... Guess I should just get a new HD and have some more free space that way - but I bet I'd have that filled up with junk after some weeks, anyway.
That said, I'm not sure how relevant this is for NTFS partitions, anyway. I recall hearing that they aren't affected by fragmentation as much as FAT partitions (which were a nightmare), however I'm not sure if that means they don't fragment that easily (heh) or whether accessing data isn't slowed down as much by any existing fragmentation.
I've also rarely heard anyone talking about fragmentation in the popular Linux file systems; a Unix partisan I know actually thought they didn't fragment, full stop, which I don't believe is possible, at least not if you consider situations which might not occur in practice. But then again, I suppose Linux might solve it the same way Apple seems to - I guess I'll know more after a couple of hundred comments on this article.
NTFS is not so bad (Score:5, Interesting)
(http://www.ws83.net/ | Last Journal: Monday May 14 2007, @03:38AM)
Bzzt! Nope. Close, though! (Score:4, Informative)
(http://homepage.mac.com/danaris/anaris.html | Last Journal: Monday February 14 2005, @03:58PM)
That's not quite correct. In Panther (Mac OS X 10.3, for the uninitiated), journaling is enabled by default: that is, when you first install Panther, it will add journaling to your existing HFS+ disk, and if you're reformatting, it will default to HFS+ (Journaled). However, prior to Panther, there was no journaling support in HFS+, to my knowledge.
Dan Aris
Re:Bzzt! Nope. Close, though! (Score:5, Informative)
(http://slashdot.org/)
Even in 10.3 it's optional, not required, but it's the new default for new disks. Probably because Apple decided that their code was solid enough to put into production. After testing it on 10.2 I agree with them.
Re:NTFS is not so bad (Score:5, Informative)
Re:NTFS is not so bad (Score:5, Informative)
(http://bestpractic.es/)
Re:NTFS is not so bad (Score:4, Informative)
NTFS fragments _very_ fast on me, after a few months of use, it is in the 20% or more range.
Same user (i.e. me), so same usage pattern, on my HPFS disks (yes, HPFS, that would be OS/2, not OS X), the fragmentation after 3 _years_ is less than 2% on ALL of my HPFS disks.
Re:NTFS is not so bad (Score:5, Interesting)
HFS Filesystem vs. ReiserFS (Score:2, Interesting)
(http://www.everythin...x.pl?node_id=1188435)
Under HFS+ in Mac OS X Jaguar or Panther, after about a day of having a clean install, fresh partition and format my hard drive starts making clunking noises and the system locks up (without actually freezing) -- then when reboot attempts are made they take aeons.
Under ReiserFS in Gentoo Linux for PPC: never have the problem. Same hard drive. Months of use, never once hear the hard drive being funky. No lockups.
Do I put the blame on HFS? OS X? I just can't figure out this strange problem.
My stats (Score:5, Informative)
I've got a G4 with an 80 GB root drive which I use all day, every day. Well, almost. It's never had anything done to it, filesystem-maintenance-wise, since I last did an OS upgrade last fall, about eight months ago. Not too shabby, methinks.
Big frag issues under EXT2 too (Score:1)
I know that Reiser does extremely well with space management on small files (CDDB database is a great example). Do any of the other Linux FSs do better than EXT2 with frag?
Re:Big frag issues under EXT2 too (Score:5, Informative)
(http://slashdot.org/ | Last Journal: Tuesday July 24, @05:09PM)
that's not a good measurement... (Score:5, Funny)
(http://pitchforkmedia.com/ | Last Journal: Tuesday March 23 2004, @09:08PM)
CVB
Panther Defrag (Score:5, Interesting)
HPFS (Score:3, Interesting)
(Last Journal: Tuesday April 12 2005, @11:12PM)
But I'm not sure how this is managed in Linux filesystems, not just ext2/3 and reiserfs, but also xfs and jfs.
Looks like I have a problem... (Score:2)
I can't find any info about this on the site. Is anyone else getting this error?
ReiserFS and fragmentation (Score:2, Interesting)
(http://00f.net/)
It's really good on filesystems with a lot of files or on databases.
Who even cares about Fragmentation anymore? (Score:5, Interesting)
(http://www.morbidgames.com/ | Last Journal: Tuesday November 30 2004, @07:38PM)
Both have 40gig HD's and both have applications installed/uninstalled quite often. My PC feels the worst of this as he gets games installed and uninstalled in addition to the apps.
For example, the last time I reinstalled either of these machines was back in January (new year fresh install), and since then my PC has felt the install/uninstall of various games, usually ranging from 2-5 gigs each. The Apple has been installed and, with the exception of updates, plugins, video codecs and basic small apps that get added/upgraded often, has done alright.
Right now Norton System Works on my PC is saying the drive is 4% fragmented. Disk Warrior on my Apple is saying the drive is 2% fragmented.
Conclusion: Fragmentation is no longer an issue for the HOME USER (note how I'm not saying your company's network doesn't need to be concerned), unless they're still running a FAT32 partition, in which case, well, they deserve to have their computer explode at that point anyway.
Other FSs (Score:1)
(http://surtani.org/)
Centrifugal Force (Score:5, Funny)