Measuring Fragmentation in HFS+
keyblob8K writes "Amit Singh takes a look at fragmentation in HFS+. The author provides numbers from his experiments on several HFS+ disks, and more interestingly he also provides the program he developed for this purpose. From his own limited testing, Apple's filesystem seems pretty solid in the fragmentation avoidance department. I gave hfsdebug a whirl on my 8-month-old iMac and the disk seems to be in good shape. I don't have much idea about ext2/3 or reiser, but I know that my NTFS disks are way more fragmented than this after a similar amount of use."
Huh? (Score:5, Insightful)
Is this based off of instinct, actual data, or what?
Re:Huh? (Score:5, Funny)
Re:Huh? (Score:5, Informative)
My own experience, using a small tool I wrote to analyze NTFS fragmentation:
NTFS is pretty good at avoiding fragmentation when creating new files if the size of the file is set before it is written. In other words, if the file is created, the EOF set, and then the file data is written, NTFS does a good job of finding a set of contiguous clusters for the file data.
NTFS does a poor job of avoiding fragmentation for files written sequentially. Consider a file retrieved with wget. An empty file is created, then the contents are written sequentially as it is read from the net. Odds are, the file data will be scattered all over the disk.
Here's a concrete example. Today, I downloaded Andrew Morton's 2.6.6-mm4.tar.bz2 patch set. (Yes, I run WinXP on my Toshiba laptop -- deal with it.) Anyway, the file is less than 2.5MB, but it is allocated in 19 separate fragments. I copied it to another file, and that file is unfragmented. Since the copy command sets EOF before writing the data, NTFS can try to allocate a contiguous run of clusters.
Note - This was done on uncompressed NTFS. My feeling is that compressed NTFS is even worse about fragmentation, but I don't have any numbers to back that up.
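To illustrate the difference, here is a minimal Win32 sketch of the "set EOF first" pattern described above. It's my own illustration of the technique, not what the copy command literally does internally, and it assumes the final size is known up front and fits in a single WriteFile call:

/* Preallocate the final size before writing, so NTFS can look for one
 * contiguous run of clusters. Sketch only; error handling is minimal. */
#include <windows.h>

BOOL write_preallocated(const wchar_t *path, const BYTE *data, LONGLONG size)
{
    HANDLE h = CreateFileW(path, GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return FALSE;

    /* Set EOF to the final size first... */
    LARGE_INTEGER li;
    li.QuadPart = size;
    BOOL ok = SetFilePointerEx(h, li, NULL, FILE_BEGIN) && SetEndOfFile(h);

    /* ...then rewind and write the actual data (assumed to fit in a DWORD). */
    li.QuadPart = 0;
    DWORD written;
    ok = ok && SetFilePointerEx(h, li, NULL, FILE_BEGIN)
            && WriteFile(h, data, (DWORD)size, &written, NULL);

    CloseHandle(h);
    return ok;
}

int main(void)
{
    static BYTE buf[1 << 20];   /* 1MB of zeroes, just for the demo */
    return write_preallocated(L"demo.bin", buf, sizeof buf) ? 0 : 1;
}

Skipping the SetEndOfFile step and dribbling the data out in small sequential writes instead is essentially the wget scenario above, and that's where the scattered allocations show up.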
Re:Huh? (Score:5, Insightful)
Why would anybody have a problem with you running Windows XP on your laptop? I'm a card-carrying Linux Zealot, and I don't have a problem with it.
Re:Huh? (Score:5, Funny)
Apparently you are actually a closet Rational Linux Advocate. I'm sure there are a few people in the drooling horde reading these comments that will have a problem with someone being foolish enough to actually choose to run Windows on anything ;)
I run Gentoo on my laptop, but the specs on the crusty old thing are so low that my only other "choice" would be to run Windows 95, and I'd sooner eat my usb key than do that.
Re:Huh? (Score:5, Interesting)
For the record I also use XP on my laptop. Until everything works perfectly out of the box, ACPI and all, I'm not installing any *nix on it.
Bring on the browser stats! (Score:3, Insightful)
A word about browsers (and anything else that requires change):
People, in general (more than 50% of them), prefer to resist change, and for that matter, extra work and/or thinking. It's just the way they are. It's what explains product loyalty. In this case, the product loyalty is browser based.
In my job, as a web server support admin, I find that 95%, or more, of the people I speak with in support situations are not even aware of the alternatives available to them. In fact, just last Sunday, a frie
How to determine fragmentation... (Score:5, Funny)
1. Right click on drive icon, select properties
2. Select Tools tab and click on "Defragment Now"
3. Click on "Analyze"
4. When analysis finishes, click on "View Report"
This shows two list windows, one containing general properties of the disk such as volume size, free space, total fragmentation, file fragmentation and free space fragmentation. The second list shows all fragmented files and how badly they are fragmented.
Re:How to determine fragmentation... (Score:5, Insightful)
If you're not using the same tool to measure fragmentation on each OS, how do you know that they're using the same semantics to decide what a fragmented file is?
IIRC, the Linux tools use a different metric to calculate fragmentation than the NT ones.
Re:How to determine fragmentation... (Score:5, Informative)
NTFS is horrible. On a system installed less than a week ago, with only a few programs (nwn, firefox, avg, itunes, aa, nvdvd, windows updates, and a couple more), it has 9.3GB used, and the analysis reports "Total Fragmentation: 22%, File Fragmentation: 45%".
So yes, there are various methods of calculating file fragmentation. Two I can think of: (# of files with fragments) / (total number of files), which is 0 for a totally defragmented drive and gives nice percentages, and (# of file fragments) / (total number of files), which is 1 for a perfectly defragmented drive. There are variations on those, and I haven't been able to find which calculation Windows and the e2fs tools use, so I can't tell.
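To make the two metrics concrete, here is a toy snippet that computes both from a made-up list of per-file fragment counts. This only illustrates the formulas above; it is not how Windows or the e2fs tools actually derive their numbers:

/* Two ways of reporting fragmentation, given how many pieces each file is in. */
#include <stdio.h>

int main(void)
{
    int fragments_per_file[] = { 1, 1, 19, 1, 3, 1, 1, 2 };   /* made-up data */
    int nfiles = sizeof fragments_per_file / sizeof fragments_per_file[0];

    int fragmented_files = 0;   /* files stored in more than one piece */
    int total_fragments  = 0;   /* total pieces across all files       */
    for (int i = 0; i < nfiles; i++) {
        total_fragments += fragments_per_file[i];
        if (fragments_per_file[i] > 1)
            fragmented_files++;
    }

    /* Metric 1: fraction of files that are fragmented (0% = fully defragmented). */
    printf("fragmented files: %.0f%%\n", 100.0 * fragmented_files / nfiles);

    /* Metric 2: average fragments per file (1.0 = fully defragmented). */
    printf("avg fragments per file: %.2f\n", (double)total_fragments / nfiles);
    return 0;
}

Note how a single badly fragmented file (the 19 up there) barely moves the first metric but dominates the second, which is one reason two tools can disagree so much about the same disk.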
Re:How to determine fragmentation... (Score:4, Informative)
As an example, look up the docs on ext2. Note that file fragments are not necessarily the same as fragmented files. Also note that people use the "file fragment" number as an indicator of how fragmented their ext2 partition is - which is wrong.
Re:How to determine fragmentation... (Score:4, Funny)
-Faithful Macuser
(ok I have a 3 button logitech)
Re:How to determine fragmentation... (Score:3, Interesting)
The files on the drive had an average size of 200 MB, and were downloaded in 1kB increments, several files at a time, over a period of about a week per file on average.
The reason it fails at defragging (it doesn't say it fails, it just doesn't do much and stops after a while) is that the free space was also so badly fragmented that it couldn't even defragment a
HFS+ defrag source (Score:5, Informative)
Re:HFS+ defrag source (Score:2)
Re:HFS+ defrag source (Score:3, Insightful)
That would seem to defeat the purpose to me. The main reason you want to avoid fragmentation of the data is that fragmented data takes longer to pull from the disk. So if by preventing fragmentation you slow down pulling data from the disk, you have just defeated your purpose.
Re:HFS+ defrag source (Score:2)
Re:HFS+ defrag source (Score:5, Interesting)
You've only defeated the purpose if you re-fragment the file again after opening it. If this isn't the case, the amortized cost (the initial cost of de-fragmentation when opening the first time minus the speed benefits from a file in a single chunk) over the many times the file is read yields a speed bonus, not a speed loss.
A good example is me, installing a program from disk onto my computer. I run the program and it accesses a group of files that have been fragmented when copied to my hard drive. The first time it opens the files it spends a little extra time de-fragmenting them. However, subsequent times that I open the program, these files will load faster.
Re:HFS+ defrag source (Score:5, Informative)
I believe the actual sequence is this:
1. The file is opened and its contents are read into the buffer cache and handed to the program that asked for it.
2. If the file is under 20MB and fragmented, it is rewritten into a contiguous run of blocks as a background task, and the catalog is updated to point at the new copy.
In other words, it defragments after the file has been returned to the program needing it, as a background process. The buffer to memory is a pre-existing optimization, so the only real trade off is that the background processor usage goes up. If you aren't doing major work at the time, you'll never notice. (And if you are doing major work, you probably are using files larger than 20MB in size anyway.)
Files larger than 20MB just aren't defragmented, unless you have another tool to do it.
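As a rough sketch of the policy being described (not Apple's actual code; the 20MB cutoff is the only number taken from the comment above, and the relocation helper is a stand-in):

/* "Defragment small files when they are opened" in miniature. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SMALL_FILE_LIMIT (20u * 1024u * 1024u)   /* 20MB, per the comment above */

struct file_info {
    uint64_t size;           /* file size in bytes             */
    unsigned extent_count;   /* how many pieces the file is in */
};

/* Stand-in for the real work: allocate a contiguous run, copy the data,
 * and point the catalog at the new extents. */
static bool relocate_contiguously(const char *path)
{
    printf("relocating %s into a single extent\n", path);
    return true;
}

static void maybe_defragment_on_open(const char *path, const struct file_info *fi)
{
    /* Big files are simply skipped; so are files already in one piece. */
    if (fi->size >= SMALL_FILE_LIMIT || fi->extent_count <= 1)
        return;

    /* The data has already been buffered and handed back to the caller,
     * so this housekeeping needn't delay the open itself. */
    relocate_contiguously(path);
}

int main(void)
{
    struct file_info fi = { 3u * 1024u * 1024u, 19 };   /* a made-up 3MB file in 19 pieces */
    maybe_defragment_on_open("/Users/example/patch.tar.bz2", &fi);
    return 0;
}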
Re:HFS+ defrag source (Score:2)
Re:HFS+ defrag source (Score:4, Informative)
If you ever wondered why there is a "soft limit" on FFS filesystems, the reason why is that its allocator's effectiveness breaks down at about the point where the filesystem is 90% full. So they sacrifice 10% of the filesystem space so that they can avoid fragmentation problems. It's not a bad tradeoff, particularly these days.
I didn't know that HFS+ used an after-the-fact defragmentation system, but they've been around for a while too. Significant research was done into such things as part of log-based filesystem research in the early 1990s (reference BSD LFS and Sprite). You had to have a "cleaner" process with those filesystems anyway (to pick up abandoned fragments of the log and return them to the free pool) so it made sense to have it also perform some optimization features.
Re:HFS+ defrag source (Score:3, Interesting)
You can also use UFS.
Re:HFS+ defrag source (Score:3, Interesting)
Re:HFS+ defrag source (Score:5, Informative)
Good luck
Re:HFS+ defrag source (Score:5, Interesting)
HFS+ has been around since OS 8.1, back in the OS 8 days. So either this is a feature of HFS+ that hasn't been implemented until now, or it's a bit of code added to Panther. Or has HFS+ been updated?
Re:HFS+ defrag source (Score:5, Informative)
Re:HFS+ defrag source (Score:5, Insightful)
Then you didn't check hard. Again, HFS+ is a specification of how to write data to media in order to organize another collection of data. The implementation is what handles the defragging. There are no drivers involved as drivers are the software component of a hardware/software union and there is no hardware involved at this level (just logical organization).
Re:HFS+ defrag source (Score:4, Informative)
So therefore it might be a part of the operating system's filesystem. That's the system that deals with files. But that's not what was asked. What was asked was whether it was an inherent feature of HFS+, and that's not possible, since HFS+ doesn't tell the OS what to do when a file is opened, only how the stuff is stored on the disk.
Perhaps you didn't understand the dual nature of the word filesystem: it can be the subsystem of the OS that handles files, or it can be the physical representation of the data on the hard drive. If you assume it's only the first, your explanation makes sense. If you assume the second one (which would be the usage intended and understood by most people, given that the question and response were about HFS+ (physical filesystem) compared to Panther (OS filesystem)), then you'd be wrong.
And I've been trolled, but who cares.
Re:HFS+ defrag source (Score:3, Informative)
Journalling didn't show up until one of the Jaguar updates, where it could be enabled via the command line on clients and via disk utility on Server.
Re:HFS+ defrag source (Score:5, Informative)
And the person who came up with this idea was a genius. This is far far better than what most other operating systems do (refuse to mount the volume.)
If I boot MS-DOS on a machine that has FAT-32 or NTFS volumes, I simply don't find any volume. I can't tell the difference between an unsupported file system and an unformatted partition. If the file system would create a FAT-compatible read-only stub (like HFS+ does), it would be much better for the user. Instead of thinking you have a corrupt drive, you'd know that there is a file system that your OS can't read.
Re:HFS+ defrag source (Score:3, Insightful)
At some companies, a developer would go to his project manager, propose this feature, and get a head shake. Too much work to test and spec, not worth the gains. Let's devote our time to our core competencies.
Apple on the other hand was built on details like this. In fact, one of my favorite things about OS 10.3 is Expose...a feature nobody really asked for, and now I can't live without it (fuck virtual desktops...I want one desktop I can use!)
Re:HFS+ defrag source (Score:5, Insightful)
http://developer.apple.com/documentation/Perfor
In theory, when you install anything (on any system) and have a reasonable amount of contiguous free space on your disk, the installed files should always be unfragmented since I believe that's what most file systems look for first to allocate: a large chunk of contiguous space.
Fragmentation typically occurs more when you open a file, increase its size, and write it back out. But operations that write large files to disk without knowing beforehand what the final size will be may also do this to some files that were only written once to your disk. For example, some of the largest fragmented files on my HFS+ volume are things snagged with BitTorrent. The fragments in these files are very regular chunks of blocks, which could be the typical 'buffer' size BT grabs when writing.
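If a program does know the final size up front, HFS+ can be asked to reserve contiguous space before the first byte is written. Here is a minimal sketch using the Darwin-specific F_PREALLOCATE fcntl (illustrative only, error handling trimmed):

/* Reserve the file's final size in one go, preferring a contiguous run. */
#include <fcntl.h>
#include <unistd.h>

int preallocate(int fd, off_t final_size)
{
    fstore_t store = {
        .fst_flags      = F_ALLOCATECONTIG,   /* ask for one contiguous run       */
        .fst_posmode    = F_PEOFPOSMODE,      /* ...measured from the current EOF */
        .fst_offset     = 0,
        .fst_length     = final_size,
        .fst_bytesalloc = 0,
    };

    if (fcntl(fd, F_PREALLOCATE, &store) == -1) {
        /* Couldn't get a contiguous run; settle for possibly scattered space. */
        store.fst_flags = F_ALLOCATEALL;
        if (fcntl(fd, F_PREALLOCATE, &store) == -1)
            return -1;
    }
    /* Extend the logical file size to match the reservation. */
    return ftruncate(fd, final_size);
}

int main(void)
{
    int fd = open("download.part", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0 || preallocate(fd, 2 * 1024 * 1024) != 0)
        return 1;
    /* ...then write the pieces into their final offsets as they arrive. */
    close(fd);
    return 0;
}

A client that wrote into a file reserved like this would tend to leave far fewer of those regular buffer-sized fragments behind.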
Re:Offtopic (Score:3, Insightful)
Ah, liberal tolerance rears its head again.
Bush lied, Bush continues to lie, and our country is far, FAR more in danger now than when we started this stupid fucking war.
I'd be interested in the metric you use to compute danger, seeing as how there have been exactly zero terrorist attacks on US soil since 9/11. (By the way, were you out protesting the "stupid fucking war" in Serbia, or are Democrats allowed to invade sovereign nations who pose no external threat?)
Bush said he was 1
Re:Offtopic (Score:3, Interesting)
And I'd be interested in the metric you use. Tell me, how often did we have terrorist attacks on US soil before 9/11? Well let's see, there was Oklahoma city, but that was an American so it doesn't really apply does it. So that leaves The first WTC bombing as the most recent attack preceding 9/11. 8 years lie between those two attacks.
If we take the extremely generou
Re:Offtopic (Score:3, Informative)
You are forgetting two embassies in Africa and an American Warship. All of those are American soil. So it is not an attack every eight years.
File allocation Table (Score:4, Interesting)
Re:File allocation Table (Score:5, Insightful)
Re:File allocation Table (Score:5, Informative)
You're probably thinking "just store the size of the file". This is perfectly valid, but it does have certain implications. You see, in Comp-Sci, we refer to a list like this as a "linked list". The concept basically being that each item in the list has information (i.e. a "link") that helps identify the next item in the list. Such a data structure has a worst case access time of O(n). Or in other words, if your item is at the end of the list, and you have 2000 files, you'll have to check through all two thousand headers before finding your file.
Popular file systems circumvent this by using what's called a Tree structure. A tree is similar to a linked list, but allows for multiple links that point to children of the node. A node that has no children is referred to as a "leaf node". In a file system the directories and files are nodes of a tree, with files being leaf nodes. This configuration gives us two performance characteristics that we must calculate for:
1. The maximum number of children in a node.
2. The maximum depth of the tree.
Let's call them "c" for children and "d" for depth. Our performance formula is now O(c*d), irrespective of the total number of items in the data structure. Let's make up an example to run this calculation against:
Example: a path four directories deep, starting at the root "/" (which holds 34 entries), where the widest directory along the path holds 72 entries.
Plugging in the above numbers (72 for c, 4 for d) we get a worst case of 72*4 = 288 operations. Thus our worst case is much better than the linked list. And if we calculate the real case to access
Hope this helps.
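For the curious, here is a toy version of that lookup in C: each path component costs at most one scan of a single directory's children, so the total work is bounded by c*d rather than by the number of files on the volume. This is purely illustrative; real filesystems use B-trees and on-disk structures, not in-memory arrays:

/* Resolve a path one component at a time over a tiny in-memory tree. */
#include <stdio.h>
#include <string.h>

struct node {
    const char         *name;
    const struct node **children;   /* NULL-terminated; NULL for plain files */
};

/* Scan one directory's children for a name: at most c comparisons. */
static const struct node *find_child(const struct node *dir, const char *name)
{
    if (!dir->children)
        return NULL;
    for (const struct node **c = dir->children; *c; c++)
        if (strcmp((*c)->name, name) == 0)
            return *c;
    return NULL;
}

/* Walk "a/b/c" component by component: at most c*d comparisons in total. */
static const struct node *resolve(const struct node *root, const char *path)
{
    char buf[256];
    strncpy(buf, path, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    const struct node *cur = root;
    for (char *part = strtok(buf, "/"); cur && part; part = strtok(NULL, "/"))
        cur = find_child(cur, part);
    return cur;
}

int main(void)
{
    /* A made-up hierarchy: / -> home -> doc.txt */
    static const struct node  doc         = { "doc.txt", NULL };
    static const struct node *home_kids[] = { &doc, NULL };
    static const struct node  home        = { "home", home_kids };
    static const struct node *root_kids[] = { &home, NULL };
    static const struct node  root        = { "/", root_kids };

    printf("%s\n", resolve(&root, "home/doc.txt") ? "found" : "not found");
    return 0;
}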
Measuring fragmentation in NTFS (Score:2, Informative)
This was my PhD Thesis.
Re:Measuring fragmentation in NTFS (Score:2)
Total fragmentation = 16 %
File fragmentation = 33 %
Free space fragmentation = 0 %
I win!
Re:Measuring fragmentation in NTFS (Score:5, Interesting)
C: 5,72 GB Total, 1,97 GB (34%) Free, 4% Fragmented (8% file fragmentation)
D: 40,00 GB Total, 1,00 GB (2%) Free, 41% Fragmented (82% file fragmentation)
E: 66,69 GB Total, 105 MB (0%) Free, 10% Fragmented (21% file fragmentation)
F: 30,00 GB Total, 1,21 GB (4%) Free, 3% Fragmented (7% file fragmentation)
G: 10,00 GB Total, 1,54 GB (15%) Free, 5% Fragmented (9% file fragmentation)
H: 35,03 GB Total, 551 MB (1%) Free, 39% Fragmented (79% file fragmentation)
D ("Dump") and H ("Online") get a lot of throughput, by personal computing standards anyway, E ("Games") doesn't get changed that much, but if it does, a lot of data leaves and comes. Seems like whenever I defrag D or H, they're back to the values above within days. I guess Win XP has a hard time doing the internal on-the-fly defragging of the hard drives that rarely have moer than 1% free space... Guess I should just get a new HD and have some more free space that way - but I bet I'd have that filled up with junk after some weeks, anyway.
That said, I'm not sure how relevant this is for NTFS partitions, anyway. I recall hearing that they aren't affected by fragmentation as much as FAT partitions (which were a nightmare); however, I'm not sure if that means they don't fragment that easily (heh) or whether accessing data isn't slowed down as much by any existing fragmentation.
I've also rarely heard anyone talking about fragmentation in the popular Linux file systems, a Unix partisan I know actually thought they didn't fragment full stop, which I don't believe is possible, at least not if you consider situations which might not occur in practice. But then again, I suppose Linux might solve it the same way Apple seems to - I guess I'll know more after a couple of hundred comments on this article.
Re:Measuring fragmentation in NTFS (Score:3, Insightful)
NTFS is not so bad (Score:5, Interesting)
Re:NTFS is not so bad (Score:3, Informative)
Bzzt! Nope. Close, though! (Score:4, Informative)
That's not quite correct. In Panther (Mac OS X 10.3, for the uninitiated), journaling is enabled by default: that is, when you first install Panther, it will add journaling to your existing HFS+ disk, and if you're reformatting, it will default to HFS+ (Journaled). However, prior to Panther, there was no journaling support in HFS+, to my knowledge.
Dan Aris
Re:Bzzt! Nope. Close, though! (Score:5, Informative)
Even in 10.3 it's optional, not required, but it's the new default for new disks. Probably because Apple decided that their code was solid enough to put into production. After testing it on 10.2 I agree with them.
Re:NTFS is not so bad (Score:5, Informative)
Re:NTFS is not so bad (Score:5, Informative)
Re:NTFS is not so bad (Score:3, Insightful)
Analysis is complete for: (C:)
You should defragment this volume.
I then looked at the report and found the following:
Total fragmentation = 21%
File fragmentation = 42%
Free space fragmentation = 1%
Pretty bad especially considering I've only
Re:NTFS is not so bad (Score:3, Insightful)
Any new machine will have an image dumped onto the hard-drive by the manufacturer.
Most imaging apps don't bother with defragmenting so you probably started out with it fairly fragmented from the initial build of the image.
Re:NTFS is not so bad (Score:3, Insightful)
http://en.wikipedia.org/wiki/NTFS
NTFS has its strong points. It is reliable and has several extensions that make it quite flexible. On the other hand, it's not hard to "outdo NTFS" in some respects. There are many things that HFS+ and ReiserFS do better than NTFS. There are many things that NTFS does better.
I think that NTFS is pretty good when it comes to cataloging chan
Re:NTFS is not so bad (Score:4, Informative)
NTFS fragments _very_ fast on me; after a few months of use, it is in the 20% or more range.
Same user (i.e. me), so same usage pattern, on my HPFS disks (yes, HPFS, that would be OS/2, not OS X), the fragmentation after 3 _years_ is less than 2% on ALL of my HPFS disks.
Re:NTFS is not so bad (Score:5, Interesting)
Re:NTFS is not so bad (Score:3, Insightful)
Youngster. Go back far enough in UNIX and it required PERFECT disk packs to function -- no handling of bad sectors. Of course, those were the days when disk "drives" were the size of a small washing machine, the top opened, and you loaded/unloaded the multi-platter disk pack that was the size of a hat box. Was always interesting to see one of the gurus arrive to troubleshoot your system carrying their own disk pack with their specialized utilities..
Re:Unsupported diss, unsupported support (Score:3, Insightful)
OK, I was being a bit snobbish in saying it is not a 'real filesystem'; it does have its uses - small devices, floppies, etc. BUT, even when it was originally designed it was considered primitive, and it had many known fla
HFS Filesystem vs. ReiserFS (Score:2, Interesting)
Under HFS+ in Mac OS X Jaguar or Panther, after about a day of having a clean install (fresh partition and format), my hard drive starts making clunking noises and the system locks up (without actually freezing) -- then when reboot attempts are made they take aeons.
Under ReiserFS in Gentoo Linux for PPC: never have the problem. Same hard drive. Months of use, never once hear the hard drive being funky. No lockups.
Do I put the blame on HFS? OS
Re:HFS Filesystem vs. ReiserFS (Score:2, Interesting)
Seriously - it's likely that gentoo just isn't using the particular sector on the drive that OSX is - perhaps there is a file there that doesn't get accessed regularly or something. In any case, Clunk-Clunk is never ok.
Re:HFS Filesystem vs. ReiserFS (Score:3, Insightful)
Re:HFS Filesystem vs. ReiserFS (Score:2, Informative)
My stats (Score:5, Informative)
I've got a G4 with an 80 GB root drive which I use all day, every day. Well, almost. It's never had anything done to it, filesystem-maintenance-wise, since I last did an OS upgrade last fall, about eight months ago. Not too shabby, methinks.
that's not a good measurement... (Score:5, Funny)
CVB
Panther Defrag (Score:5, Interesting)
Re:Panther Defrag (Score:3, Informative)
HPFS (Score:3, Interesting)
But I'm not sure how this is managed in Linux filesystems, not just ext2/3 and reiserfs, but also in xfs and jfs.
Looks like I have a problem... (Score:2)
I can't find any info about this on the site. Is anyone else getting this error?
ReiserFS and fragmentation (Score:2, Interesting)
It's really good on filesystems with a lot of files or on databases.
Re:ReiserFS and fragmentation (Score:3, Funny)
Parted [gnu.org]
Who even cares about Fragmentation anymore? (Score:5, Interesting)
Both have 40gig HD's and both have applications installed/uninstalled quite often. My PC feels the worst of this, as it gets games installed and uninstalled in addition to the apps.
For example, the last time I reinstalled either of these machines was back in January (new-year fresh install), and since then my PC has felt the install/uninstall of various games, usually ranging from 2-5 gigs each. The Apple has stayed installed and, with the exception of updates, plugins, video codecs and basic small apps that get added/upgraded often, has done alright.
Right now Norton System Works on my PC is saying the drive is 4% fragmented. Disk Warrior on my Apple is saying the drive is 2% fragmented.
Conclusion: Fragmentation is no longer an issue for the HOME USER (note how I'm not saying your company's network doesn't need to be concerned), unless they're still running a FAT32 partition, in which case they deserve to have their computer explode anyway.
Centrifugal Force (Score:5, Funny)
Defrag = placebo? (Score:3, Interesting)
Re:Defrag = placebo? (Score:5, Interesting)
As a cute side note, I remember having to explain fragmentation to my high school FORTRAN class and teacher back in the '80s. I'd changed schools in my senior year and the new school had just started using the Apple II FORTRAN environment, which happened to be the same as the Apple II Pascal environment that I'd used at the previous school. The file system was incapable of slapping files into whatever blocks happened to be available (I'm not even sure it used blocks. Probably not...) so you would not be able to save your files if the disk was too fragmented, even if there was enough space available to do so. Ah those were the days...
Re:Defrag = placebo? (Score:3, Informative)
Head seek and rotational latency is still much slower than contiguous blocks. True, modern systems deal with it better, partially due to b-tree a
Re:Defrag = placebo? - yes and no (Score:4, Interesting)
However, defragging is not the same for every defrag utility. For example, I was working with Avid Audiovision about 5-6 years ago on a TV show. It turned out that defragging a drive hosting files created or edited with Audiovision using Symantec's Speed Disk would actually corrupt the entire projects contained on the drive (the biggest mistake, and the only serious one, I had in my career; I didn't lose my job but my boss did lose his temper -- live and learn!). The audio files were not readable at all afterwards. It was actually a documented bug of Audiovision, and I even think it affected every OMF file, not just the ones used by Audiovision (not sure about this though). That's what happens when your boss won't let you RTFM. Only Disk Express, some Avid defragger or, later, TechTool could defrag those drives.
On a side note, on the Classic Mac OS (7-9.2), defragmenting your drive was also a way to prevent data corruption -- actually it's the other way around: not defragging would lead to data corruption. I don't know if that's also the case with NTFS, ext2 et al.
Re:Defrag = placebo? (Score:3, Informative)
If you need to save a 100kb file, it will take 10ms (1/100th of a second) to seek to the first block, and then, assuming everything is perfect, it will take 100kb / (100MB/sec) = 1/1000th of a second to write the file... so, seeking to the start of the file took 10 times as long as writing it!
This gross simplification actually trivializes the real effect. The 10ms seek figure is an average track-to-track seek delay between adjacent tr
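To put rough numbers on it anyway, here is a back-of-the-envelope model in C. It reuses the 10ms seek and 100MB/sec figures from the quote and the 2.5MB, 19-fragment tarball mentioned earlier in the thread; these are illustrative assumptions, not measurements:

/* Time to read one file as a function of how many pieces it is in. */
#include <stdio.h>

int main(void)
{
    const double seek_ms       = 10.0;    /* assumed average seek + rotational latency */
    const double transfer_mbps = 100.0;   /* assumed sustained transfer rate, MB/s     */
    const double file_mb       = 2.5;     /* e.g. the 2.5MB patch set above            */

    const double transfer_ms = file_mb / transfer_mbps * 1000.0;

    for (int fragments = 1; fragments <= 19; fragments += 6) {
        double total_ms = fragments * seek_ms + transfer_ms;
        printf("%2d fragment(s): %5.1f ms total (%5.1f ms seeking, %.1f ms transferring)\n",
               fragments, total_ms, fragments * seek_ms, transfer_ms);
    }
    return 0;
}

The transfer time never changes; the seek term scales linearly with the fragment count, which is the whole argument for keeping files contiguous.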
Disk Fragmentation (Score:5, Insightful)
On-the-fly defragmentation for NTFS disks in WinNT (Score:3, Informative)
Apple updated their stand recently (Score:5, Informative)
Mac OS X: About Disk Optimization
Do I need to optimize?
You probably won't need to optimize at all if you use Mac OS X. Here's why:
Defragging XP now... (Score:3, Interesting)
20 minutes later, and it's on 17%. That's pretty damn fragmented, in my opinion.
Re:Defragging XP now... (Score:4, Insightful)
No, it's just that the defragger built into Win2K/XP is shite. It runs like molasses in liquid helium, and it almost never does a complete job in a single run. You have to run it several times in a row before it's even close to doing a reasonable job. And if it's your system drive, then there are some files (including the swap file) that it simply won't touch no matter how badly the blocks are scattered. This can be a real pain in the posterior if you're trying to defrag a drive in preparation for a Linux install.
Schwab
Re:Defragging XP now... (Score:4, Informative)
With the latest versions of ntfsresize, fairly safe. I did it on a machine at work with very important data on it (yes, I backed it up first), and had no trouble at all. However, all ntfsresize can do is truncate an NTFS partition's free space. In other words, it won't relocate blocks to other free areas of the disk. So the most you can shrink it is by however much free space you have at the end of the partition. (After Googling around a bit, I've learned that the most recent versions of ntfsresize [rulez.org] will now move datablocks around, so apparently that restriction is now gone. I have not personally tested this, however.)
Incidentally, ntfsresize is part of Knoppix, and gets run through QTPartEd, a partition editing tool. It is an older, non-relocating version, however.
Schwab
my stats (Score:3, Interesting)
Not bad. That's 8 months of heavy use since my last format.
I gotta bring this to work today and see what that machine's like. My co-worker has been complaining that he doesn't have a defrag utility since he got OSX. I've been telling him that I don't think it matters. Now I can prove it to him.
I remember back in the days of my Powermac 8100/80av, we would leave the 2 800mb drives defragging over the weekend because they had like 75% fragmentation.
I rarely see XP drives w/ bad fragmentation probs (Score:5, Funny)
portable fragmenter (Score:3, Interesting)
Result: you have a bunch of large files, all very fragmented, and the free space is very fragmented.
HPFS fragments could be good (Score:3, Informative)
Defrag utils for OS/2 had options to only defrag if there were more than 3 extents, to avoid nullifying this effect.
Funny, years after the death of OS/2, it still kicks ass on much of what we use now.
Re:HPFS fragments could be good (Score:3, Informative)
Vendors used to do interleaving with the format/fdisk commands I recall. The idea was that writing the sectors in a continuous stream was not very efficient as the drives of the time could not move data to or from the disk so quickly. You'd read sector 1, and by the time you were ready to read sector two, sector 3 was under the head, so you had to wait almost an entire disk revolution to find sector 2 again.
The interleave told the OS to skip X physical dis
Microsoft really pisses me off (Score:5, Interesting)
That would be well and good if the problem were otherwise insurmountable. But, it turns out, we've known how to minimize, if not entirely eliminate, filesystem fragmentation for twenty years now - since the introduction of the BSD Fast File System.
It doesn't take expensive (in time, if not in money) tools. All it takes is a moderately clever block allocation algorithm - one that tries to allocate a block close in seek time to the previous one, rather than just picking one at random.
The fundamental insight that the authors of FFS had was that while there may only be one "optimal" block to pick for the next one in a file, there are tens of blocks that are "almost optimal" and hundreds that are "pretty darn good." This is because a filesystem is not a long linear row of storage bins, one after another, as it is treated by many simplistic filesystems. The bins are stacked on top of each other, and beside each other. While the bin right next to you might be "best", the one right next to that, or in another row beside the one you're on, or in another row above or below, is almost as good.
The BSD folk decided to group nearby bins into collections and try to allocate from within collections. This organization is known as "cylinder groups" because of the appearance of the group on the disk as a cylinder. Free blocks are managed within cylinder groups rather than across the whole disk.
It's a trivial concept, but very effective; fragmentation related delays on FFS systems are typically within 10% of optimum.
This kind of effectiveness is, unfortunately, difficult to achieve when the geometry of the disk is unknown -- and with many modern disk systems the actual disk geometry is falsely reported (usually to work around limits or bugs in older controller software). There has been some research into auto-detecting geometry but an acceptable alternative is to simply group some number of adjacent blocks into an allocation cluster. In any case, many modern filesystems do something like this to minimize fragmentation-related latency.
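The allocator idea is simple enough to sketch in a few lines. This is only an illustration of the "prefer a nearby block" policy described above, not FFS's actual allocator (which also spreads directories across groups, keeps per-group free counts, and so on):

/* Prefer a free block in the same allocation group as the previous block,
 * falling back to a global sweep. Toy bitmap allocator, illustration only. */
#include <stdint.h>
#include <stdio.h>

#define BLOCKS_PER_GROUP 4096   /* stand-in for a cylinder group / allocation cluster */

struct volume {
    uint8_t *block_bitmap;   /* 1 bit per block, 1 = in use */
    uint64_t total_blocks;
};

static int block_in_use(const struct volume *v, uint64_t b)
{
    return (v->block_bitmap[b / 8] >> (b % 8)) & 1;
}

/* Find a free block, starting the search in prev_block's group. */
static int64_t alloc_near(const struct volume *v, uint64_t prev_block)
{
    uint64_t group_start = (prev_block / BLOCKS_PER_GROUP) * BLOCKS_PER_GROUP;
    uint64_t group_end   = group_start + BLOCKS_PER_GROUP;
    if (group_end > v->total_blocks)
        group_end = v->total_blocks;

    /* First choice: anything "pretty darn good" -- a block in the same group,
     * ideally just past the previous one. */
    for (uint64_t b = prev_block + 1; b < group_end; b++)
        if (!block_in_use(v, b))
            return (int64_t)b;
    for (uint64_t b = group_start; b <= prev_block; b++)
        if (!block_in_use(v, b))
            return (int64_t)b;

    /* Fallback: sweep the rest of the volume. */
    for (uint64_t b = 0; b < v->total_blocks; b++)
        if (!block_in_use(v, b))
            return (int64_t)b;

    return -1;   /* volume full */
}

int main(void)
{
    uint8_t bitmap[2] = { 0x0F, 0x00 };   /* a 16-block toy volume: 0-3 used, 4-15 free */
    struct volume v = { bitmap, 16 };
    printf("next block after 2 -> %lld\n", (long long)alloc_near(&v, 2));
    return 0;
}

And the point stands that all of this lives in the allocator: nothing about the on-disk format has to change to get it.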
The gist of this is that Microsoft could have dramatically reduced the tendency towards fragmentation of any or all of their filesystems by doing nothing else but dropping in an improved block allocator, and done so with 100% backward compatibility (since there is no change to the on-disk format).
Maybe it was reasonable for them to not bother to so extravagantly waste a few days of their developers' time with MS-DOS and FAT, seeing as they only milked that without significant improvement for eight or nine years, but it's hard to explain the omission when it came to Windows NT. NTFS is a derivative of HPFS which is a derivative of FFS. They had to have known about cylinder group optimizations.
So the fact that, in 2004, we're still seeing problems with filesystem fragmentation absolutely pisses me off. There's no reason for it, and Microsoft in particular ought to be ashamed of themselves. It's ridiculous that I have to go and defragment my WinXP box every few months (which takes like 18 hours) when the FreeBSD box in the basement continues to run like a well-oiled machine despite the fact that it works with small files 24 hours a day, 365 days a year.
Hey Microsoft: You guys have like fifty billion bucks in the bank (well, ok, 46 or 47 billion after all the antitrust suits) and yet you can't even duplicate the efforts of some hippy Berkeleyite some twenty years after the fact? What's up with that?
(I mean "hippy Berkeleyite" in an affectionate way, Kirk. :-)
File types and fragmentation (Score:5, Informative)
There are fundamentally only a few types of files when it comes to fragmentation.
1. There are files that simply never change size, and once written don't get overwritten. (Type 1). Most programs are actually type 1, if you use sufficiently small values of never
2. There are files that will often shorten or lengthen in use, for example a word processor document in
Of type 2, there are files of type 2a. Files that may get either longer or shorter with use, on a (relatively) random basis. (as a relatively simple case, a
Then there are files of type 2b. Files that get longer or shorter only for predictable reasons, (such as a Windows
what to expect for these files, which suggests a well-written defragger could theoretically also auto-predict the consequences of the changes a user is making).
3. Then there are type 3 files, which only get longer. These too have predictable and unpredictable subtypes. Most log files for example, are set up to keep getting longer on a predictable basis when their associated program is run (type 3b). Anything that has been compressed (i.e.
4. Type 4 would be files that always get smaller, but there are no known examples of this type
These types are basic in any system, as they are implied by fundamental physical constraints. However, many defrag programs use other types instead of starting from this model, often with poor results.
In analyzing what happens with various defrag methods, such as reserving space for predicted expansion or defragging in the background/on the fly, the reader should try these various types (at least 1 through 3), and see what will happen when that method is used on each type. Then consider how many files of each type will be involved in the overall process, and how often.
For example, some versions of the Microsoft Windows (tm) FAT32 defragger move files that have been accessed more than a certain number of times (typically f
Fast! (Score:3, Informative)
It might be the way they've 'frobbed' UFS for use with OS X Server, but UFS really gives high priority to disk ops with GUI ops taking the back seat, and yet HFS+ is in comparison blazingly fast.
I believe in a good clean machine as much as anyone, and I do see the probability that DiskWarrior will be needed now and again, but the speed alone is quite a pedigree for HFS+ IMHO.
Re:Anonymous (Score:3, Informative)
Re:Anonymous (Score:5, Interesting)
Re:Anonymous (Score:3, Insightful)
Re:Anonymous (Score:2)
Re:Anonymous (Score:3, Informative)
Fragmentation is a performance killer for Win 9x on older machines
Re:Anonymous (Score:2)
Re:Anonymous (Score:2, Interesting)
Re:Anonymous (Score:4, Informative)
What are you talking about?
No, they don't. But since they borrow their design from BSD's FFS they don't need it either.
Erm, that's fsck. fsck doesn't do defragmentation.
It's true; however, performance is severely degraded when disk usage reaches around 90% for classic FFS-like filesystems. While the BSDs can mount ext2 partitions, none of them uses ext[23] as default. AIX uses a JFS version that's a bit different from the one you see in Linux, which was based on OS/2's code. I think you're mixing up filesystem integrity with fragmentation. In classic BSD UFS/FFS, data is stored in datablocks, which are subdivided into fragments, usually 1/4th of the datablock size. A fragmented file is a file that's stored in non-contiguous fragments. Just that. The performance impact of fragmented files vs the time needed to reorganize the data shows that it's not worth running a defrag program on FFS filesystems.
This paper [harvard.edu] has some more info on the subject.
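For anyone unfamiliar with the block/fragment split, here is the arithmetic in miniature. The 1/4 ratio is taken from the comment above and the sizes are just typical-looking assumptions:

/* A file's bulk goes into full blocks; the tail goes into fragments. */
#include <stdio.h>

int main(void)
{
    const unsigned block = 8192;        /* assumed block size            */
    const unsigned frag  = block / 4;   /* "usually 1/4th" -> 2048 bytes */
    const unsigned file  = 20000;       /* made-up file size             */

    unsigned full_blocks = file / block;
    unsigned tail        = file % block;
    unsigned tail_frags  = (tail + frag - 1) / frag;   /* round up */

    printf("%u bytes -> %u full block(s) + %u fragment(s) of %u bytes\n",
           file, full_blocks, tail_frags, frag);
    return 0;
}

The point is that a "fragment" here is a unit of allocation, not by itself a symptom of a scattered file.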
Re:Give it a rest (Score:4, Informative)
Re:Give it a rest (Score:3, Insightful)
Re:Big frag issues under EXT2 too (Score:5, Informative)
Re:fragmentation and dimension (Score:3, Informative)