[NTLUG:Discuss] Is there a disk defragmenter for Linux
Greg Edwards
greg at nas-inet.com
Mon Dec 23 10:27:59 CST 2002
Rev. wRy wrote:
> On Sat, 2002-12-21 at 16:03, Steve Baker wrote:
>
>>Kipton Moravec wrote:
>>
>>>Coming from a MS background, I always run the disk defragmenter once a week.
>>>
>>>Is there one for Linux? What should I look for? Is it needed?
>>
>>No - only crappy OS's need to have their disks defragged.
>
>
> Can you expand on this for a moment? It's a critical difference between M$
> and Linux, and I've yet to stumble across anything other than one line answers
>
> So how exactly does *nix write data to a hard drive that eliminates the need
> for a defrag? And why is there fsck if there is no need for a defrag?
>
>
> Ry
You won't get this info from normal *nix books and classes. To get it
you'll need OS-specific classes and books, which most people aren't
willing to spend the time and money on. If you want to dig into the
kernel source, you'll have to get into the low-level filesystem code and
device drivers that manage the inode structures and the physical
read/write routines.
The fsck utility has nothing to do with defragging a disk or physical
file layout. fsck validates the contents of the inode tables (and
superblock) and provides a way to recover inodes that have been orphaned
or damaged. After running fsck (depending on your command line options),
orphaned files are placed in the lost+found directory of that filesystem
so that you can inspect the contents and attempt to recover the data.
A short synopsis:
The *nix operating systems use an inode structure to manage disk
storage. I won't go into the details of the superblock vs. the regular
inode tables here. An inode, simply put, is a block of file control
information: file size, owner, permissions, timestamps, type, and
pointers to the disk blocks that hold the data. The file's name is not
in the inode; it lives in a directory entry that points at the inode. A
single inode can be referenced by several names (hard links), while each
inode describes exactly one file. The inodes, together with the
free-block maps, let the filesystem track both allocated and unallocated
space on your disk drive.
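You can see the inode/name split from user space without any kernel digging. Here's a minimal Python sketch (nothing filesystem-specific is assumed, just the standard os module) that hard-links a file and shows both names landing on the same inode:

```python
# One inode, several names: create a file, hard-link it, and observe
# that both directory entries report the same inode number. The name
# itself lives in the directory, not in the inode.
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "original")
    alias = os.path.join(d, "alias")

    with open(original, "w") as f:
        f.write("hello")

    os.link(original, alias)           # second name, same inode

    st_a = os.stat(original)
    st_b = os.stat(alias)

    print(st_a.st_ino == st_b.st_ino)  # True: same inode number
    print(st_a.st_nlink)               # 2: the inode's link count
```

Deleting one name just decrements the link count; the file's data is only freed when the last name (and last open descriptor) goes away.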
When a file is written to disk, *nix filesystems use a best-fit,
locality-aware logic for the file being stored. Space is handed out in
fixed-size blocks (1K-4K is typical on ext2; a disk sector is 512
bytes), and a file never gets less than one block. The allocator tries
to keep a file's blocks contiguous and near its inode, so the drive can
service them with large sequential transfers, which is what a spinning
disk does fastest.
When a file is updated on disk (this is where the two OSs differ), a
*nix filesystem will extend the file in place if the adjacent blocks are
free, or relocate it if a better contiguous fit is found. You might say
that the *nix OSs do their garbage collection every time they write data
to a disk.
Example:
Say you have a file that was created with 100 bytes. On a filesystem
with 512-byte blocks it will occupy one block of physical disk space and
take one inode. Later you update this file and append another 500 bytes.
Since the file is now 600 bytes, the OS expands the allocation to 1024
bytes (two blocks), and the file still takes one inode. The physical
location where the data starts may or may not be the same as before,
depending on whether the next physical block is free. The data is
written to disk and the inode is updated.
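The whole-block allocation in that example is easy to observe. A small Python sketch (my own illustration; note that st_blocks always counts 512-byte units even though the filesystem's actual block size is usually 1K-4K these days):

```python
# Allocation happens in whole blocks, so a tiny file still consumes
# at least one filesystem block, and growth is rounded up the same way.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 100)          # a 100-byte file...
os.fsync(fd)                      # force the allocation to disk
st = os.stat(path)
print(st.st_size)                 # 100 logical bytes
print(st.st_blocks * 512 >= 512)  # True: at least one full block on disk

os.write(fd, b"y" * 500)          # grow the file to 600 bytes
os.fsync(fd)
st = os.stat(path)
print(st.st_blocks * 512 >= st.st_size)  # True: allocation covers the data
os.close(fd)
os.unlink(path)
```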
Remember that the physical nature of a disk is that sequential transfers
are far cheaper than seeks. The best engineering solution is to take
advantage of this and keep as much related data physically adjacent as
possible.
Things may have changed since the last time I was in the bowels of M$'s
drivers and controllers, but I don't think so. The M$ file management
approach is to not move the data when a file overflows its current
physical allocation (clusters built from 512-byte sectors, IIRC).
Instead it allocates the next available free cluster for the expanded
data. This causes your fragmentation problem, since that next free
cluster may be far away on the disk. As long as the new cluster is
physically adjacent to the old one, there is no difference in
performance.
Example:
Using the same scenario as before: when the file is updated, 512 bytes
are written back to the original location and the remaining 88 bytes go
into a newly allocated 512-byte block, wherever the next free one
happens to be. FAT lives with the same physical realities, but because
of the FAT (and other adopted names) design, moving the original 100
bytes to keep the file contiguous would be grossly inefficient, so it is
simply left where it was.
The advantage to the M$ way is that you do not need to flush the FAT
before you shut down a system. The advantage to the *nix way is that you
do not have to defrag a system to maintain I/O performance.
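To make the difference concrete, here is a toy simulation of the two strategies; the "disk" is just a list of block numbers, the function names are mine, and this models the argument above rather than either filesystem's actual allocator:

```python
# Toy model: grow a one-block file by one block under two policies and
# count how many contiguous runs ("fragments") the file ends up in.

DISK_SIZE = 64

def fragments(blocks):
    """Count runs of non-adjacent block numbers (1 = fully contiguous)."""
    runs = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:
            runs += 1
    return runs

def next_free(used):
    b = 0
    while b in used:
        b += 1
    return b

def grow_fat_style(file_blocks, used, extra):
    """FAT-style: leave existing blocks alone, tack on the next free ones."""
    for _ in range(extra):
        b = next_free(used)
        used.add(b)
        file_blocks.append(b)

def grow_unix_style(file_blocks, used, extra):
    """The post's *nix model: relocate the whole file to a contiguous fit."""
    need = len(file_blocks) + extra
    for start in range(DISK_SIZE - need + 1):
        run = list(range(start, start + need))
        if all(b not in used or b in file_blocks for b in run):
            used.difference_update(file_blocks)  # free the old location
            used.update(run)
            file_blocks[:] = run
            return
    raise RuntimeError("no contiguous fit found")

# Scenario: our file owns block 0; another file already owns blocks 1-3.
fat_file, fat_used = [0], {0, 1, 2, 3}
nix_file, nix_used = [0], {0, 1, 2, 3}

grow_fat_style(fat_file, fat_used, 1)   # new block lands past the neighbor
grow_unix_style(nix_file, nix_used, 1)  # whole file moves to a free run

print(fragments(fat_file))  # 2 -> file is split across the disk
print(fragments(nix_file))  # 1 -> file stays contiguous
```

Run the growth step in a loop with more neighboring files and the FAT-style file keeps accumulating fragments while the relocating allocator stays at one; that accumulation is exactly what a weekly defrag pass undoes.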
I hope this helps, and I hope my memory isn't too corrupted by the
cobwebs ;)
--
Greg Edwards
New Age Software, Inc.
http://www.nas-inet.com