Created by Chris Cox on August 19, 2009, at 03:42 AM

(but were afraid to ask)

Inside of Unix and Linux, files are very important. They serve not only as places for data and programs, but also as access points for devices, and sometimes they provide views into the deep recesses of the kernel itself!

A Folder is a File

To understand a folder you need to know that a file is just data... it has no name. Well, it does have an "address" if you will, but file names are assignments that are made inside of a special file called a directory. Thus a directory is a special file that contains a list of file names and the "address" of the data associated with each name. Directory files also have "addresses" and thus can be named inside of another directory file.

The Chicken and the Egg

If directories contain files and other directories, how does this all get started? A filesystem is mounted on a (normally empty) directory called a mount point, under which all folders and files of that filesystem will be found. But even a mount point has to go somewhere... where? The answer is the "root". One filesystem has to be mounted automatically at boot. This filesystem is called the "root" and has a special pseudo mount point called "/". The root filesystem is usually created at initial installation of the OS. It is essential to the proper operation of the system. That does not mean it has to be stored on a typical hard drive, though. Just note that it is important to have, however it is made.

What's Your Address?

Earlier we called the abstraction used to get at the file data an "address". Technically, this is called an "i-node", or inode for short. An inode stores information about the file that isn't part of the actual data content of the file. This includes things like permissions, ownership, etc. Within a mounted filesystem, these numeric inode "addresses" are unique. From them, the OS can get to the file content.
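The inode number and the metadata kept in the inode can be inspected from the shell. The following is a minimal sketch, assuming GNU coreutils stat (with its -c format option) and a scratch directory made with mktemp; the file name "sample" is hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils `stat`; "sample" is a made-up file name.
workdir=$(mktemp -d)          # scratch directory so nothing real is touched
touch "$workdir/sample"

# Inode number as reported by stat, and as reported by ls -i.
inode_from_stat=$(stat -c '%i' "$workdir/sample")
inode_from_ls=$(ls -i "$workdir/sample" | awk '{print $1}')
echo "stat: $inode_from_stat  ls -i: $inode_from_ls"

# Metadata that lives in the inode rather than in the file's data.
# %d is the device number, identifying the filesystem that holds the inode.
stat -c 'mode=%A owner=%U group=%G device=%d' "$workdir/sample"
```

The device number is printed alongside the rest because an inode number only means something within the filesystem that holds it.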
Inode numbers are not guaranteed to be unique across different filesystems, so the context for an inode is the filesystem in which it resides. Think of it like our own communities. A local area might consist of many neighborhoods, towns, villages or cities. My community may have a First St. and someone not terribly far away may also live on First St. We may both have the same numeric address on First St., yet our addresses are not the same location. And so it is for filesystems and files. An inode for a file is unique within the context of its filesystem.

Directories contain a mapping of names to inodes. A directory has its own inode number and, apart from the special root directory, a directory inode must have a name-to-inode mapping in a parent directory in order to be found.

Anytime a file or directory loses all connection to the directory hierarchy, that is, the inode has no name and can no longer be found, that inode is said to be an orphan inode. Orphans have no parent structure, and therefore cannot be found. Normally, you will not see orphaned inodes. Orphans are usually created as part of an interrupted and failed operation in which an inode was never associated with a name mapping inside of a directory. Many filesystems have transaction journaling that helps mitigate the formation of orphaned files. With that said, if a filesystem were to get corrupted (always possible), filesystem repair might end up identifying orphaned inodes. These orphans end up in a special directory at the root of a mounted filesystem called (appropriately) lost+found.

Oh What a Tangled Web We Weave...

To get a better handle on files, directories and these things called inodes, it's wise to examine them in a context. The most familiar directory for users is their own home directory. So let's start there.

No Place Like Home

Just like a booting system needs a root filesystem on which to hang all files, users start out in a home directory.
The home directory can be any directory inside of any filesystem. While typically found as a subdirectory of /home, a user's home directory could be anywhere. The mapping of a username to their home directory is found in the passwd database, which can be queried with getent:

    $ getent passwd ccox
    ccox:x:1000:100:Chris Cox:/localhome/ccox:/bin/bash

Where's my password?

"I don't see my password in there?" Nope. You don't. Linux and some newer Unix distributions use other hash types like MD5 and Blowfish to make brute force attacks much harder even when the hashes are made visible. The hash types available in Linux depend upon what was compiled into glibc. MD5 is almost always present. Distributions like openSUSE include Blowfish as well.

Your home directory is also stored in your login shell, in the variable called HOME.

    $ echo $HOME
    /localhome/ccox

You can also refer to any user's home directory using a tilde followed by the username; a tilde on its own refers to your own home directory.

    $ echo ~pulse
    /var/lib/pulseaudio
    $ echo ~
    /localhome/ccox

It is also good to know that if you do not supply a directory to the change directory command, cd, it takes you to your home directory.

Home is Where the Heart Is

Let's create a new directory in our home called "heart". Inside of that directory, create a file called "aorta" and make two directories called "atrium" and "ventricle". Go into both of those subdirectories and create files called "left" and "right".

    $ cd    # Remember this will take us to our home directory.
    $ mkdir heart
    $ cd heart
    $ touch aorta
    $ mkdir atrium ventricle
    $ cd atrium
    $ touch left right
    $ cd ../ventricle
    $ touch left right
    $ cd ~/heart

Tying Things Together

Directories are the key not only to associating names with inodes; they also serve as the canvas on which the whole structure of the filesystem we navigate as users is drawn. The current directory is known as ".", called "dot". Every directory has a parent directory to which it belongs, known as "..", called "dot-dot". Hidden filenames in Unix/Linux start with ".", so normally you cannot see these directory names. Use the -a option of ls to show them (the -i option adds each entry's inode number):

    $ cd
    $ cd heart
    $ ls -ia1F    # That 1 is a ONE!!
    2624 ./
       4 ../
    4842 aorta
    4852 atrium/
    4853 ventricle/
    $ cd atrium
    $ ls -ia1F
    4852 ./
    2624 ../
    4856 left
    4861 right

Notice that before we traversed into the directory atrium, it was listed in heart with inode 4852. Once inside, "." shows that same inode, and ".." shows 2624, the inode of heart.

The Key to Untying Things

So far we have seen an association of names to inodes. Another value that is kept on every file and directory is a counter called a "link count". When files are "removed" the link count is decremented. If the link count goes down to zero, then the file is not referenced anywhere else and the file is removed.

Referenced anywhere else? What does that mean? One way to examine this is to use our sample directory scheme. You can view the link count by using the long listing option, -l:

    $ cd
    $ cd heart
    $ ls -ia1Fl    # First -1 is a ONE, second -l is a lowercase L
    total 3
    2624 drwxr-xr-x  4 ccox users  128 2009-08-22 19:29 ./
       4 drwxr-xr-x 55 ccox users 3512 2009-08-22 19:28 ../
    4842 -rw-r--r--  1 ccox users    0 2009-08-22 19:28 aorta
    4852 drwxr-xr-x  2 ccox users   96 2009-08-22 19:29 atrium/
    4853 drwxr-xr-x  2 ccox users   96 2009-08-22 19:29 ventricle/
    $ cd atrium
    $ ls -ia1Fl
    total 0
    4852 drwxr-xr-x 2 ccox users  96 2009-08-22 19:29 ./
    2624 drwxr-xr-x 4 ccox users 128 2009-08-22 19:29 ../
    4856 -rw-r--r-- 1 ccox users   0 2009-08-22 19:29 left
    4861 -rw-r--r-- 1 ccox users   0 2009-08-22 19:29 right

The third field of the long listing shows the link count. For example, aorta has a link count of 1 and atrium has a link count of 2.

The Underneath

Underneath, user commands are programs that make function calls inside of code to perform operations. Removing a file is done by a system call named unlink.

It is NOT possible to create cross-directory linkages for directories. That is, you cannot create additional references to a directory inside of other directories. What makes the link count go up on directories is the fact that additional directories are created directly underneath a directory. Consider the heart directory above: its link count is 4, made up of its name in its parent directory, its own ".", and the ".." entries inside atrium and ventricle.

Creating Links

There are actually two kinds of "links". The true link, called a hard link, is what we have examined so far, so let's continue with that.
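How subdirectories affect a directory's link count can be checked with a short experiment. This is only a sketch, assuming GNU coreutils stat, whose %h format prints the link count; the directory names "chamber", "upper" and "lower" are hypothetical. Some filesystems (Btrfs, for example) do not count subdirectories this way and always report 1 for directories, so the numbers printed may differ from the listings in this article.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils `stat`; "chamber", "upper" and "lower" are made-up names.
workdir=$(mktemp -d)
mkdir "$workdir/chamber"

# A fresh directory is typically referenced twice: by its name in the
# parent directory and by its own "." entry.
before=$(stat -c '%h' "$workdir/chamber")

# Each new subdirectory carries a ".." entry that points back at the parent.
mkdir "$workdir/chamber/upper" "$workdir/chamber/lower"
after=$(stat -c '%h' "$workdir/chamber")

# Plain files inside the directory do not add to its link count.
touch "$workdir/chamber/file"
after_file=$(stat -c '%h' "$workdir/chamber")

echo "before: $before  after two mkdirs: $after  after touch: $after_file"
```

On a filesystem that counts subdirectory links, the middle number is expected to be two higher than the first.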
As noted already, you cannot create links to directories, but the system maintains directory linkages to enable you to traverse directories. Without those links, whole directory trees could become lost... and oddly enough, files that "exist" but cannot be reached are called orphans. To create a link to an existing inode, you can use the ln command:

    $ cd
    $ cd heart
    $ ln aorta artery
    $ ls -ia1Fl    # First -1 is a ONE, second -l is a lowercase L
    total 3
    2624 drwxr-xr-x  4 ccox users  128 2009-08-22 19:29 ./
       4 drwxr-xr-x 55 ccox users 3512 2009-08-22 19:28 ../
    4842 -rw-r--r--  2 ccox users    0 2009-08-22 19:28 aorta
    4842 -rw-r--r--  2 ccox users    0 2009-08-22 19:28 artery
    4852 drwxr-xr-x  2 ccox users   96 2009-08-22 19:29 atrium/
    4853 drwxr-xr-x  2 ccox users   96 2009-08-22 19:29 ventricle/

This creates a hard link in the same directory. The file data can be manipulated using either entry. Notice that both entries now show a link count of 2. We would have to remove both the artery entry and the aorta entry in order to actually free up any space occupied by the file (which is minuscule since our file is empty). File links like this can be created throughout a given filesystem, which allows for space saving and snapshot-like functions. For example:

    $ cd
    $ cp -alR heart new-heart    # lower case L option

The -l option tells cp to create hard links to the files rather than copying their data.

The Missing Link?

So which one is the REAL file? If you haven't guessed, there is no difference between the original file in its original location and a hard link you created anywhere else in the filesystem. Which one is real? All references to the inode are REAL. No difference.
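That every name is equally real can be demonstrated by removing the "original" name and reading the data through the other one. The following is a minimal sketch, assuming GNU coreutils stat and a scratch directory made with mktemp; the file content "blood" is made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils `stat`; runs in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "blood" > aorta          # create the file under its first name
ln aorta artery               # add a second name for the same inode

inode_aorta=$(stat -c '%i' aorta)
inode_artery=$(stat -c '%i' artery)
links_both=$(stat -c '%h' artery)      # link count while two names exist

rm aorta                      # remove the "original" name only
links_after=$(stat -c '%h' artery)     # link count with one name left
content=$(cat artery)         # the data is still reachable via the other name

echo "inodes: $inode_aorta $inode_artery"
echo "links: $links_both then $links_after, content: $content"
```

Both names report the same inode number, and the data survives until the last name is removed.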