Can Anyone Explain Fragmentation?

SNGX1275

Can anyone explain fragmentation? I know what it is, but what I'm getting at is why it occurs. If anyone has any information or links on this, I'd like to read up and understand why it happens.
It just doesn't make sense to me that something is read and then written in fragments all across the drive.
 
Of course it makes sense. When you write a file, the data goes into the first unused chunk found in the allocation table (FAT, MFT, etc.). Such a chunk is probably left over from something you previously wrote there and then deleted. If it's not big enough, the OS moves on to the next free chunk, and so on. When you delete a file, you leave behind a chunk which may or may not be the right size for the next file you write (in fact it's highly unlikely to be exactly the right size), but the OS will use the first free chunk anyway. It keeps writing the file into whatever chunks of unused space it finds until the file is either completely written, or the remainder has to go into the free space at the end of the partition, where nothing has been written yet and the unused space is continuous.
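To make that concrete, here's a minimal sketch of that first-fit idea in Python - a made-up 16-cluster "disk", not how FAT or the MFT actually store anything:

# Toy model of first-fit cluster allocation, just to show where fragments come from.
FREE = "."

def write_file(disk, name, clusters_needed):
    """Allocate the first free clusters found, scanning from the start of the disk."""
    placed = []
    for i, c in enumerate(disk):
        if c == FREE:
            disk[i] = name
            placed.append(i)
            if len(placed) == clusters_needed:
                break
    return placed

def delete_file(disk, name):
    for i, c in enumerate(disk):
        if c == name:
            disk[i] = FREE

disk = [FREE] * 16
write_file(disk, "A", 4)          # A occupies clusters 0-3
write_file(disk, "B", 4)          # B occupies clusters 4-7
delete_file(disk, "A")            # deleting A leaves a 4-cluster hole
print(write_file(disk, "C", 6))   # [0, 1, 2, 3, 8, 9] -> C is in two pieces
print("".join(disk))              # CCCCBBBBCC......

After deleting A and writing the larger file C, C ends up split across clusters 0-3 and 8-9, which is exactly the fragmentation described above.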

If we only ever wrote into that continuous free space at the end, then the holes left behind by deleted files would never be reused, and we would waste disk space. We can't afford to do that because HDD space (and RAM) is finite. One day, if we invent a HDD with infinite space (or near enough infinite at any rate), there will be no fragmentation, because the unused chunks left over from file deletions can simply stay unused and ALL data can be written to the end of the drive, where the unused space is continuous.

But for now we must use the method of writing to the first area of unused space, and so fragmentation occurs naturally as a partition is written to: files are deleted, more files are written, more files are deleted, and so on.

A similar thing occurs in some forms of memory management, where memory segments of differing sizes are freed up and then reused when and where available, leading to memory fragmentation.

There are ways around both disk space fragmentation and memory fragmentation, but they tend to involve some form of wasted space. The model above wastes less space but incurs fragmentation as a result.

Fragmentation is a problem because when we load a file from the HDD into RAM, we want to load it in one continuous read operation that does not involve zipping off to other parts of the drive. A file should start at one location and continue unbroken until its end; otherwise the time it takes to read it increases. A badly fragmented drive therefore leads to a slow computer.

As I said, it's the process of writing and deleting that leads to fragmentation. That's why, if you freshly format a partition and then copy a large number of files onto it in one continuous operation (like copying the whole contents of your old data partition onto a new, freshly formatted partition on a new HDD), there will be little or no fragmentation. But after a while, when you have deleted and written, written and deleted, and so on, things get fragmented. Thus the need to defragment.

A fresh Windows installation leaves a fairly fragmented partition, because in addition to the many files being written, a lot of temporary files are written and deleted during the process. So you should really defragment immediately after a fresh installation onto a newly formatted partition, or after installing a lot of software.

I hope I have explained this OK; it's late and I mean to turn in soon. Post back if you want to talk about it some more.
 
That makes sense.

What if you don't delete anything and only create new files or install new programs - do they get fragmented anyway? I guess maybe the OS creates and deletes temporary files regardless?
 
Well, one of the big files that the OS creates is the swap file. Typically this is pretty big - 1.5 times the size of the RAM, or even 2 times.

A big problem that can emerge is that the swap file itself becomes fragmented. That's why it's a good idea to have a dedicated swap partition. It's also good to have this partition near the beginning of the disk, where access times are better. When it's the only file on its partition, the swap file will not really get fragmented at all. This can be a real performance boost. I always have a dedicated swap partition. If you have several operating systems installed - say Windows 2000, Windows 2000 Server and Windows XP - they can all share a common swap partition with little or no problem.

As I explained, if all you did was create a new partition, format it, copy lots of files onto it and then do nothing more with it, that partition would not really be fragmented at all. But normal operation, especially on an OS partition, means lots of files being written, deleted, written, deleted, and so on. Thus fragmentation appears in the file system over a period of time.
 
That was very well explained, Phant; I can't think of anything to add. Well, maybe I'll just restate the importance of the dedicated swap partition. I also have a partition that I use to download stuff to - anything I put on my computer that I'm either not going to keep long, or haven't figured out where to put yet, goes there. You'd be amazed at how much that has cut down on the fragmentation of my main partition.

Hmm, maybe someone should make this a "Sticky" - I'm sure it would be of great help to others. I get asked about fragmentation all the time.
 
Originally posted by StormBringer
Hmm, maybe someone should make this a "Sticky" - I'm sure it would be of great help to others. I get asked about fragmentation all the time.
Done.
 
Just to add to the importance of the swap partition: by default, Windows has a "dynamic" swap file. Basically, the file grows and shrinks according to the machine's needs. This causes the pagefile to fragment and spread itself throughout your hard drive, and because of the size of this file, it often will not defragment without 20%-25% free hard drive space.

You can help reduce pagefile fragmentation by setting a permanent size on the file, but a swap partition is your best bet.
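If you want to check how your pagefile is currently set up, here's a rough Python sketch that reads the setting from the registry. It assumes the usual "PagingFiles" value under the Memory Management key, which normally holds the pagefile path plus its initial and maximum sizes in MB - double-check that on your own system:

import winreg  # Windows-only standard library module

KEY = r"SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as key:
    # "PagingFiles" is normally a multi-string like "C:\pagefile.sys 1536 1536"
    # (path, initial MB, maximum MB) - treat the exact format as an assumption.
    paging_files, _ = winreg.QueryValueEx(key, "PagingFiles")

for entry in paging_files:
    parts = entry.split()
    if len(parts) >= 3 and parts[-2] == parts[-1]:
        print(parts[0], "- fixed at", parts[-1], "MB (won't grow and fragment)")
    else:
        print(entry, "- dynamically sized, may fragment over time")

If the two numbers match, the file has a fixed size and won't keep growing and shrinking (and fragmenting) on its own.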
 
Fragmentation - 1 December 1997
We hadn't intended to write an article on fragmentation, since these articles primarily go to people who already use Diskeeper, but so many of you have asked for the article on fragmentation that we had to write one!

We have stated before that fragmentation is the most significant factor in system performance. Here's why:

An average fragments per file value of 1.2 means that there are 20% more pieces of files on the disk than there are files, indicating perhaps 20% extra computer work needed. It should be pointed out that these numbers are merely indicators. Some files are so small that they reside entirely within the MFT. Some files are zero-length. If only a few files are badly fragmented while the rest are contiguous, and those few fragmented files are seldom accessed, then fragmentation may have no performance impact at all. On the other hand, if your applications are accessing the fragmented files heavily, the performance impact could be much greater than 20%. You have to look further to be sure. For example, if there were 1,000 files and only one of those files is ever used, but that one is fragmented into 200 pieces (20% of the total fragments on the disk), you would have a serious problem, much worse than the 20% figure would indicate. In other words, it is not the fact that a file is fragmented that causes performance problems, it is the computer's attempts to access the file that degrade performance.
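As a quick worked example of that arithmetic (my own numbers, just to illustrate the article's point):

# 1,000 files split into 1,200 total pieces -> 1.2 fragments per file,
# i.e. 20% more pieces on the disk than there are files.
files = 1000
pieces = 1200
avg_fragments_per_file = pieces / files      # 1.2
extra_pieces = (pieces - files) / files      # 0.20, the "perhaps 20% extra work" hint
print(avg_fragments_per_file, extra_pieces)

# The average hides the skew the article warns about: if only one file is
# ever read, and that one file is in 200 pieces, every access pays for 200
# reads instead of 1, even though the disk-wide average still looks mild.
hot_file_pieces = 200
print("hot file:", hot_file_pieces, "reads instead of 1")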

To explain this properly, it is first necessary to examine how files are accessed and what is going on inside the computer when files are fragmented.

What's Happening to Your Disks?
Tracks on a disk are concentric circles, divided into sectors. Files are written to groups of sectors called "clusters". Often, files are larger than one cluster, so when the first cluster is filled, writing continues into the next cluster, and the next, and so on. If there are enough contiguous clusters, the file is written in one contiguous piece. It is not fragmented. The contents of the file can be scanned from the disk in one continuous sweep merely by positioning the head over the right track and then detecting the file data as the platter spins the track past the head.

Now, suppose the file is fragmented into two parts on the same track. To access this file, the read/write head has to move into position as described above, scan the first part of the file, then suspend scanning briefly while waiting for the second part of the file to move under the head. Then the head is reactivated and the remainder of the file is scanned.

As you can see, the time needed to read the fragmented file is longer than the time needed to read the unfragmented (contiguous) file. The exact time needed is the time to rotate the entire file under the head, plus the time needed to rotate the gap under the head. A gap such as this might add a few milliseconds to the time needed to access a file. Multiple gaps would, of course, multiply the time added. The gap portion of the rotation is wasted time due solely to fragmentation. Then, on top of that, you have to add all the extra operating system overhead required to process the extra I/Os.

Now, what if these two fragments are on two different tracks? We have to add time for movement of the head from one track to another. This track-to-track motion is usually much more time-consuming than rotational delay, since you have to physically move the head. To make matters worse, the relatively long time it takes to move the head from the track containing the first fragment to the track containing the second fragment can cause the head to miss the beginning of the second fragment, necessitating a delay of nearly one complete rotation of the disk, waiting for the second fragment to come around again to be read. Further, this form of fragmentation is much more common than the gap form.
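To get a feel for the scale of these delays, here's a rough back-of-the-envelope model in Python. The drive numbers are assumptions (roughly a 7,200 RPM drive with ~9 ms seeks and ~50 MB/s sustained transfer), not measurements of any particular disk:

SEEK_MS = 9.0                    # assumed average seek time
ROTATION_MS = 60000 / 7200       # one revolution at 7,200 RPM, about 8.33 ms
TRANSFER_MB_PER_S = 50.0         # assumed sustained transfer rate

def read_time_ms(file_mb, fragments):
    # Pure transfer time if the data streamed past the head without a break.
    transfer = file_mb / TRANSFER_MB_PER_S * 1000
    # Each fragment costs roughly a seek plus, on average, half a rotation
    # waiting for its first sector to come around under the head.
    positioning = fragments * (SEEK_MS + ROTATION_MS / 2)
    return transfer + positioning

for frags in (1, 2, 10, 100):
    print(frags, "fragment(s):", round(read_time_ms(10, frags), 1), "ms")

For a 10 MB file this gives roughly 213 ms in one piece versus about 1.5 seconds in 100 pieces - the same file, just scattered.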

But the really grim news is this: files don't always fragment into just two pieces. You might have three or four, or ten or a hundred fragments in a single file. Imagine the gymnastic maneuvers your disk heads are going through trying to collect up all the pieces of a file fragmented into 100 pieces!

On really badly fragmented files, there is another factor: The Master File Table record can only hold a limited number of pointers to file fragments. When the file gets too fragmented, you have to have a second MFT record, maybe a third, or even more. For every such file accessed, add to each I/O the overhead of reading a second (or third, or fourth, etc.) file record segment from the MFT.

On top of all that, extra I/O requests, due to fragmentation, are added to the I/O request queue along with ordinary and needful I/O requests. The more I/O requests there are in the I/O request queue, the longer user applications have to wait for I/O to be processed. This means that fragmentation causes everyone on the system to wait longer for I/O, not just the user accessing the fragmented file.

Fragmentation overhead certainly mounts up. Imagine what it is like when there are 100 users on a network, all accessing the same server, all incurring similar amounts of excess overhead.

What's Happening to Your Computer?
Now, let's take a look at what these excess motions and file access delays are doing to the computer.

Windows NT is a complicated operating system. This is a good thing because the complexity results from the large amount of functionality built in to the system, saving you and your programmers the trouble of building that functionality into your application programs, which is what makes Windows NT a truly great operating system. One of those functions is the service of providing an application with file data without the application having to locate every bit and byte of data physically on the disk. Windows NT will do that for you.

When a file is fragmented, Windows NT does not trouble your program with the fact, it just rounds up all the data requested and passes it along. This sounds fine, and it is a helpful feature, but there is a cost. Windows NT, in directing the disk heads to all the right tracks and clusters within each track, consumes system time to do so. That's system time that would otherwise be available to your applications. Such time, not directly used for running your program, is called overhead.

What's happening to your applications while all this overhead is going on? Simple: Nothing. They wait.

The users wait, too, but they do not often wait without complaining, as computers do. They get upset, as you may have noticed.

The users wait for their applications to load, then wait for them to complete, while excess fragments of files are chased up around the disk. They wait for keyboard response while the computer is busy chasing up fragments for other programs that run between the user's keyboard commands. They wait for new files to be created, while the operating system searches for enough free space on the disk and, since the free space is also fragmented, allocates a fragment here, a fragment there, and so on. They even wait to log in, as the operating system wades through fragmented procedures and data needed by startup programs. Even backup takes longer - a lot longer - and the users suffer while backup is hogging the machine for more and more of "their" time.

Fragmentation vs. CPU Speed
A system that does a lot of number crunching but little disk I/O will not be affected much by fragmentation. But on a system that does mainly disk I/O (say a mail server), severe fragmentation can easily slow a system by 90% or more. That's much more than the difference between a 486/66 CPU and a 250MHz Pentium II!

Of course, for the vast majority of computers, the impact of fragmentation will fall somewhere in the middle of this range. In our experience, many Windows NT systems that have run for more than two months without defragmenting have, after defragmentation, at least doubled their throughput. It takes quite a large CPU upgrade to double performance.

Fragmentation vs. Memory
The amount of memory in a computer is also important to system performance; just how important depends on where you are starting from. If you have 16 megabytes of RAM, it's almost a certainty that adding more will tremendously boost performance, but if you have 256 megabytes, most systems would get no benefit from more. Raising the RAM from 32 to 96 megabytes on the author's machine, which does much memory-intensive work, almost tripled performance. We see this as the high end of possible benefit from adding memory. The typical site, in our experience, will see about a 25% boost from doubling the RAM. Again, we generally see more performance improvement from eliminating fragmentation.

(This article was primarily excerpted from Chapter 4 of the book Fragmentation - the Condition, the Cause, the Cure, by Craig Jensen, CEO of Executive Software. It has been modified for application to Windows NT. The complete text of the book is available at this web site.)

Source>> http://www.execsoft.com/tech-support/NT-articles/article.asp?F=1997120112.htm
 
Just to add my tuppence worth ...

Fragmentation is bad for hard drives because of all the additional disk I/O required to read a fragmented file.

The story is different for memory fragmentation, because memory access does not suffer from the data access delays that affect hard drive performance - hard drives are mechanical devices.

To access memory merely requires that the correct address is used for the required memory location. Memory access speed is therefore not affected by fragmentation, because the data is reached by applying whatever series of memory addresses is required. Which bits in the address field have to change makes no difference whatsoever to a solid state device such as memory.

It's just as quick to change 8 address bits as it is to change 1. Defragmenting memory is just a waste of effort and shouldn't affect performance in any way.
 
This emphasizes my belief... Linux rocks. But as for my Win2k box for games, I ran Evidence Eliminator and had it defrag the registry, and that greatly improved Windows speed, just like when it was a fresh install. .02
 