[NTLUG:Discuss] Is there a disk defragmenter for Linux
Greg Edwards
greg at nas-inet.com
Mon Dec 23 10:27:59 CST 2002
Rev. wRy wrote:
> On Sat, 2002-12-21 at 16:03, Steve Baker wrote:
>
>>Kipton Moravec wrote:
>>
>>>Coming from a MS background, I always run the disk defragmenter once a week.
>>>
>>>Is there one for Linux? What should I look for? Is it needed?
>>
>>No - only crappy OS's need to have their disks defragged.
>
>
> Can you expand on this for a moment? It's a critical difference between M$
> and Linux, and I've yet to stumble across anything other than one line answers
>
> So how exactly does *nix write data to a hard drive that eliminates the need
> for a defrag? And why is there fsck if there is no need for a defrag?
>
>
> Ry
You won't get this info from normal *nix books and classes. To get it
you'll need OS-specific classes and books, which most people aren't
willing to spend the time and money on. If you want to dig into the
kernel source, you'll have to get into the low-level filesystem code and
device drivers that manage the inode structures and the physical
read/write routines.
The fsck utility has nothing to do with defragging a disk or physical
file layout. fsck validates the contents of the inode tables (and
superblock) and provides a way to recover inodes that have been orphaned
or damaged. After running fsck (depending on your command line options),
orphaned files are placed in the lost+found directory of that filesystem
so that you can inspect the contents and attempt to recover the data.
A short synopsis:
The *nix operating systems use an inode structure to manage disk
storage. I won't go into the details of the superblock vs. the regular
inode tables here. An inode, simply put, is a block of file control
information: file size, owner, permissions, timestamps, type, and
pointers to the disk blocks that hold the data. The file's name is not
in the inode; it lives in a directory entry that points at the inode. A
single inode can be referenced by several names (hard links), while each
inode describes exactly one file. The inodes, together with the
free-block maps, let the filesystem track both allocated and unallocated
space on your disk drive.
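You can see the inode/name split from user space without any kernel digging. Here's a minimal Python sketch (nothing filesystem-specific is assumed, just the standard os module) that hard-links a file and shows both names landing on the same inode:

```python
# One inode, several names: create a file, hard-link it, and observe
# that both directory entries report the same inode number. The name
# itself lives in the directory, not in the inode.
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "original")
    alias = os.path.join(d, "alias")

    with open(original, "w") as f:
        f.write("hello")

    os.link(original, alias)           # second name, same inode

    st_a = os.stat(original)
    st_b = os.stat(alias)

    print(st_a.st_ino == st_b.st_ino)  # True: same inode number
    print(st_a.st_nlink)               # 2: the inode's link count
```

Deleting one name just decrements the link count; the file's data is only freed when the last name (and last open descriptor) goes away.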
When a file is written to disk, *nix filesystems use a best-fit,
locality-aware logic for the file being stored. Space is handed out in
fixed-size blocks (1K-4K is typical on ext2; a disk sector is 512
bytes), and a file never gets less than one block. The allocator tries
to keep a file's blocks contiguous and near its inode, so the drive can
service them with large sequential transfers, which is what a spinning
disk does fastest.
When a file is updated on disk (this is where the two OSs differ), a
*nix filesystem will extend the file in place if the adjacent blocks are
free, or relocate it if a better contiguous fit is found. You might say
that the *nix OSs do their garbage collection every time they write data
to a disk.
Example:
Say you have a file that was created with 100 bytes. On a filesystem
with 512-byte blocks it will occupy one block of physical disk space and
take one inode. Later you update this file and append another 500 bytes.
Since the file is now 600 bytes, the OS expands the allocation to 1024
bytes (two blocks), and the file still takes one inode. The physical
location where the data starts may or may not be the same as before,
depending on whether the next physical block is free. The data is
written to disk and the inode is updated.
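The whole-block allocation in that example is easy to observe. A small Python sketch (my own illustration; note that st_blocks always counts 512-byte units even though the filesystem's actual block size is usually 1K-4K these days):

```python
# Allocation happens in whole blocks, so a tiny file still consumes
# at least one filesystem block, and growth is rounded up the same way.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 100)          # a 100-byte file...
os.fsync(fd)                      # force the allocation to disk
st = os.stat(path)
print(st.st_size)                 # 100 logical bytes
print(st.st_blocks * 512 >= 512)  # True: at least one full block on disk

os.write(fd, b"y" * 500)          # grow the file to 600 bytes
os.fsync(fd)
st = os.stat(path)
print(st.st_blocks * 512 >= st.st_size)  # True: allocation covers the data
os.close(fd)
os.unlink(path)
```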
Remember that the physical nature of a disk is that sequential transfers
are far cheaper than seeks. The best engineering solution is to take
advantage of this and keep as much related data physically adjacent as
possible.
Things may have changed since the last time I was in the bowels of M$'s
drivers and controllers, but I don't think so. The M$ file management
approach is to not move the data when a file overflows its current
physical allocation (clusters built from 512-byte sectors, IIRC).
Instead it allocates the next available free cluster for the expanded
data. This causes your fragmentation problem, since that next free
cluster may be far away on the disk. As long as the new cluster is
physically adjacent to the old one, there is no difference in
performance.
Example:
Using the same scenario as before: when the file is updated, 512 bytes
are written back to the original location and the remaining 88 bytes go
into a newly allocated 512-byte block, wherever the next free one
happens to be. FAT lives with the same physical realities, but because
of the FAT (and other adopted names) design, moving the original 100
bytes to keep the file contiguous would be grossly inefficient, so it is
simply left where it was.
The advantage to the M$ way is that you do not need to flush the FAT
before you shut down a system. The advantage to the *nix way is that you
do not have to defrag a system to maintain I/O performance.
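To make the difference concrete, here is a toy simulation of the two strategies; the "disk" is just a list of block numbers, the function names are mine, and this models the argument above rather than either filesystem's actual allocator:

```python
# Toy model: grow a one-block file by one block under two policies and
# count how many contiguous runs ("fragments") the file ends up in.

DISK_SIZE = 64

def fragments(blocks):
    """Count runs of non-adjacent block numbers (1 = fully contiguous)."""
    runs = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:
            runs += 1
    return runs

def next_free(used):
    b = 0
    while b in used:
        b += 1
    return b

def grow_fat_style(file_blocks, used, extra):
    """FAT-style: leave existing blocks alone, tack on the next free ones."""
    for _ in range(extra):
        b = next_free(used)
        used.add(b)
        file_blocks.append(b)

def grow_unix_style(file_blocks, used, extra):
    """The post's *nix model: relocate the whole file to a contiguous fit."""
    need = len(file_blocks) + extra
    for start in range(DISK_SIZE - need + 1):
        run = list(range(start, start + need))
        if all(b not in used or b in file_blocks for b in run):
            used.difference_update(file_blocks)  # free the old location
            used.update(run)
            file_blocks[:] = run
            return
    raise RuntimeError("no contiguous fit found")

# Scenario: our file owns block 0; another file already owns blocks 1-3.
fat_file, fat_used = [0], {0, 1, 2, 3}
nix_file, nix_used = [0], {0, 1, 2, 3}

grow_fat_style(fat_file, fat_used, 1)   # new block lands past the neighbor
grow_unix_style(nix_file, nix_used, 1)  # whole file moves to a free run

print(fragments(fat_file))  # 2 -> file is split across the disk
print(fragments(nix_file))  # 1 -> file stays contiguous
```

Run the growth step in a loop with more neighboring files and the FAT-style file keeps accumulating fragments while the relocating allocator stays at one; that accumulation is exactly what a weekly defrag pass undoes.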
I hope this helps, and I hope my memory isn't too corrupted by the
cobwebs ;)
--
Greg Edwards
New Age Software, Inc.
http://www.nas-inet.com