Disks, partitions and filesystems

Posted 10-27-2020 at 10:08 AM by hazel
Updated 01-05-2022 at 12:57 PM by hazel

One thing that Linux newbies often find confusing is the way that Linux deals with disks. Windows does its disk management "under the hood". Partitions are checked to see if they are organised in a way that Windows recognises; if so, they are automatically "mounted", i.e. made available to the system. Each disk or partition is given a letter by which it can be accessed, starting with C: (A: and B: were historically reserved for floppy disks). Unrecognised disks, including Linux partitions, are simply ignored. I understand that there are now third-party Windows applications that can read some Linux filesystems, but you need to install those separately.

In Linux, the management of disks is much more visible. This gives users more control but, like many things in Linux, it needs to be learned. Fortunately, like everything in Linux, it is extremely logical.

Modern hard drives are large, so they are usually divided into independent partitions. This is not the case with CDs and DVDs, nor is it usual to partition USB memory sticks, although it can be done. Partitioned disks need to have a partition table starting in the first sector, so that the partitions can be found by the software.

On a traditional DOS disk, the partition table occupies the first sector only and has room to index only four primary partitions. However, one of these can be designated as an extended partition and can contain further "logical disks". In Windows, a primary partition has to be used for the Windows installation itself; logical disks can be used only for data storage. Linux, however, makes no such distinction and it is quite common in a multiboot system for distros to be hosted on logical disks. Modern GPT disks have a multi-sector partition table with room for far more primary partitions (128 by default), so the concept of a logical disk is disappearing.
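As a rough illustration of that first-sector layout, the sketch below checks for the two-byte boot signature (hex 55 aa) that closes a DOS-style first sector, immediately after the four 16-byte partition entries that begin at byte 446. It works on an ordinary scratch file rather than a real disk. The function name has_mbr_signature is made up for this example. GPT disks also carry a "protective" DOS table in their first sector, so finding the signature does not by itself tell you which kind of table a disk uses.

Code:

```shell
# Hypothetical helper: succeed if FILE ends its first sector with 55 aa.
has_mbr_signature() {
    # od: -An = no offsets, -tx1 = hex bytes, -j510 = skip 510 bytes, -N2 = read 2
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Build a blank 512-byte "sector" in a scratch file and stamp the signature on it.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null

if has_mbr_signature "$img"; then
    echo "boot signature found"
else
    echo "no boot signature"
fi
```

The same function could be pointed at a real device such as /dev/sda, but reading a raw disk normally requires root.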

The Linux kernel recognises both hard drives and plug-in drives by giving them a device name beginning with sd (for SCSI disk). A letter is then appended, so that the first internal hard drive usually becomes sda. The kernel names such drives in the order in which it detects them, which might change from boot to boot. This could cause confusion, so an ancillary program called udev creates additional names for each drive and partition that stay constant across boots: symbolic links under /dev/disk, based on properties such as the filesystem UUID or label. The partitions are represented by an added number, for example sda1 for the first partition. In Linux, counting usually starts from zero but partitions are the exception. Optical drives are named by modern kernels as sr0 and upward.
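To make the naming scheme concrete, here is a small sketch that splits an sd-style partition name into its disk and partition number. The function name split_sd_name is invented for this example, and it deliberately handles only names of the sda1 kind; other naming schemes are not covered.

Code:

```shell
# Hypothetical helper: split an sd-style partition name into disk and number.
# Only handles names like sda1 or sdb12.
split_sd_name() {
    if ! printf '%s\n' "$1" | grep -Eq '^sd[a-z]+[0-9]+$'; then
        echo "not an sd-style partition name: $1" >&2
        return 1
    fi
    disk=$(printf '%s\n' "$1" | sed 's/[0-9]*$//')   # strip the trailing digits
    part=$(printf '%s\n' "$1" | sed 's/^[a-z]*//')   # strip the leading letters
    echo "$disk $part"
}

split_sd_name sda1     # prints: sda 1
split_sd_name sdb12    # prints: sdb 12
```

On a running system, lsblk lists the drives and partitions the kernel has found, and ls -l /dev/disk/by-uuid shows the stable links that udev maintains alongside the sd names.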

All these device names recognise drives and partitions purely as physical entities, that is as hardware devices, and not as containers for information. The names are stored in the special pseudo-directory /dev along with many other names for devices that the kernel can access. Each filename in /dev gives processes access to code inside the kernel that reads from or writes to the device, or performs other operations on it, and passes the results back to the requesting process.
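The shell can tell you what kind of device node a name in /dev is. Disks and partitions are "block" devices, while things like terminals and /dev/null are "character" devices. The sketch below uses the standard -b and -c file tests; the function name dev_type is made up, and /dev/sda is only an assumption about what exists on your machine.

Code:

```shell
# Hypothetical helper: report what kind of node a /dev entry is.
dev_type() {
    if [ -b "$1" ]; then
        echo block    # disks and partitions: read and written in blocks
    elif [ -c "$1" ]; then
        echo char     # terminals, /dev/null and the like: byte streams
    else
        echo other    # ordinary files, directories, or nothing there at all
    fi
}

dev_type /dev/null
# /dev/sda is an assumption; substitute a disk that exists on your system.
dev_type /dev/sda
```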

Device names are good enough to give the kernel the kind of raw access to disks that it needs to create partitions, format them, identify bad spots on the disk and so on. But programs usually want to access the files stored on the disks, rather than the disks themselves, and this requires additional information about where the files are and how to read them. In other words it requires a filesystem.

Filesystems are simply ways of organising data on a disk or partition so that it can be retrieved at will. Linux recognises many different filesystems: native ones like ext4 and btrfs, and foreign ones like Microsoft's vfat and ntfs. Indeed there are few filesystems in common use that the Linux kernel cannot read. That is made possible by the separation of the read/write operations in the kernel from the actual filesystem on the disk. The kernel uses its own virtual filesystem layer to find the data and a separate driver within the kernel translates from the actual filesystem into this virtual format.
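You can ask a running kernel which filesystem drivers it currently has registered by reading /proc/filesystems. Entries flagged nodev are virtual filesystems that do not live on a disk. The sketch below filters those out; the function name disk_filesystems is invented for this example. Note that the list only shows drivers that are built in or already loaded, so a filesystem missing from it may still be supported through a module that has not been loaded yet.

Code:

```shell
# Hypothetical helper: read /proc/filesystems-style text on stdin and print
# the names of filesystems that need a real block device (no "nodev" flag).
disk_filesystems() {
    awk 'NF && $1 != "nodev" { print $NF }'
}

if [ -r /proc/filesystems ]; then
    disk_filesystems < /proc/filesystems
fi
```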

In Windows each partition is an island universe. All pathnames therefore begin with the drive letter. The path to a file might take the form C:\dir\subdir\file. Of course you won't actually see this pathname in the graphical interface, but that is the form it takes internally. In Linux, partitions are linked together into a single tree. The root of this tree (designated by /) is the root directory of the designated root partition. Thus the corresponding absolute pathname in Linux would be /dir/subdir/file.

What about files on other partitions? To access them, it is necessary to link the filesystem on that partition to the one on the root partition, which is done by "mounting" it on a suitable empty directory. That directory then becomes an alias for the root directory of the mounted partition. So if your root partition is sda1 and your home partition is sda2, then sda2 will be mounted on /home and a possible pathname might be something like /home/yourname/dir/subdir/file.

NOTE
If you accidentally mount a filesystem on a directory that is not empty, the original contents of that directory will appear to vanish! But this is only true for the duration of the mount. As soon as you unmount the filesystem, the directory will revert to its normal state.
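One way to avoid the surprise described in the note is to check that the directory really is empty before mounting on it. This is only a sketch: the function name is_empty_dir, the directory /mnt/data and the device /dev/sda5 are all assumptions chosen for illustration.

Code:

```shell
# Hypothetical helper: succeed only if DIR exists, is readable and contains
# nothing at all, hidden files included.
is_empty_dir() {
    [ -d "$1" ] && [ -r "$1" ] && [ -z "$(ls -A "$1")" ]
}

# /mnt/data and /dev/sda5 are assumptions; the mount itself normally needs root.
mnt=/mnt/data
if is_empty_dir "$mnt"; then
    echo "$mnt is empty, safe to use: mount /dev/sda5 $mnt"
else
    echo "$mnt is missing or not empty; choose another directory"
fi
```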

There are three main ways in which partitions can get mounted: automatically at boot time, by hand as required, and (for plug-in devices) automatically when they are detected. All three methods use the kernel's mount() system call, and the first two do this via the traditional mount command, which is a wrapper for that call. The automounting of plug-in devices, often on a directory like /run/media/yourname, is usually handled by the udisks program.
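Whichever way a filesystem was mounted, the kernel lists it in /proc/mounts, one line per mount, starting with the device, the mount point and the filesystem type. The sketch below looks up what is mounted on a given directory. The function name mounted_on is made up for this example.

Code:

```shell
# Hypothetical helper: given a mount point and /proc/mounts-style text on
# stdin, print the device and filesystem type mounted there, if any.
mounted_on() {
    awk -v mp="$1" '$2 == mp { print $1, $3 }'
}

if [ -r /proc/mounts ]; then
    mounted_on / < /proc/mounts
fi
```

Running the mount command with no arguments, or findmnt, gives a fuller and more readable view of the same information.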

On a traditional Linux system, automounting at boot (as well as mounting by hand for non-root users) depends on the file /etc/fstab, which contains lines specifying the partitions and plug-in storage devices known to the system, what kind of filesystem each of them carries, where they are to be mounted, what parameters are to be used for the mount, and whether the device is to be mounted automatically at boot. One of the boot scripts carries out any specified automounts by using the mount command, which collects the necessary data from this file.
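To show what such a file looks like, the sketch below feeds a made-up fstab to a small filter that prints the entries which would be mounted at boot, that is, those without the noauto option. The devices and mount points are examples only, and the function name boot_mounts is invented. Each fstab line has six fields: device, mount point, filesystem type, options, dump flag and fsck order.

Code:

```shell
# Hypothetical helper: read fstab-style text on stdin and print
# "device mountpoint" for each entry that is mounted automatically at boot.
boot_mounts() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }    # skip blank lines and comments
        $3 == "swap"         { next }    # swap is switched on, not mounted
        {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noauto") next
            print $1, $2
        }'
}

# A made-up fstab for illustration.
boot_mounts <<'EOF'
# device   mountpoint  type  options      dump fsck
/dev/sda1  /           ext4  defaults     0    1
/dev/sda2  /home       ext4  defaults     0    2
/dev/sda3  none        swap  sw           0    0
/dev/sda5  /home/data  ext4  noauto,user  0    0
EOF
```

With this sample, only the root and /home entries are printed; /home/data is left for someone to mount by hand.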

If your distro uses systemd to initialise itself, there will usually be an /etc/fstab file because users expect it to be there. Systemd can use this file but does not actually need it and can work entirely from its own configuration files, which contain the same information in a slightly different form.
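In systemd's scheme each mount point has a "mount unit" whose name is derived from the path: /home/data becomes home-data.mount, and the root directory becomes -.mount. The sketch below reproduces that rule for simple paths only. The function name mount_unit_name is made up, and anything containing characters beyond letters, digits, underscores and slashes is refused, because the real escaping rules are more involved; systemd ships a systemd-escape tool for those cases.

Code:

```shell
# Hypothetical helper: derive the systemd mount unit name for a simple path.
# Handles only letters, digits, underscores and single slashes.
mount_unit_name() {
    case "$1" in
        /)
            echo "-.mount"
            return 0 ;;
        */ | *//* | *[!A-Za-z0-9_/]*)
            echo "not a simple path, use systemd-escape: $1" >&2
            return 1 ;;
        /*) ;;
        *)
            echo "need an absolute path: $1" >&2
            return 1 ;;
    esac
    # Drop the leading slash, then turn the remaining slashes into dashes.
    printf '%s.mount\n' "$(printf '%s' "${1#/}" | tr '/' '-')"
}

mount_unit_name /home/data    # prints: home-data.mount
```

On a systemd machine, systemctl list-units --type=mount shows the mount units currently in use, including those generated from /etc/fstab.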

Mounting storage devices by hand works in exactly the same way. If there is a line in fstab containing the requisite information, and if the options specify that the device is user-mountable, any user can mount it by using the mount command. You can use either the device name or the name of the mount point, since the two are associated in the file:

Code:

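# By device name (the mount point is looked up in /etc/fstab)
mount /dev/sda5
# Or by mount point (the device is looked up in /etc/fstab)
mount /home/data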

If the device is not marked as user-mountable, then only root can mount it. Mounting with different parameters from those specified in /etc/fstab or mounting devices that are not named in that file are likewise operations reserved for root.
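Whether an entry is user-mountable comes down to the options field of its fstab line: the user option (or users) is what allows an ordinary user to mount it. The sketch below checks for that. The function name user_mountable and the sample line are assumptions for illustration, and it ignores related options such as owner.

Code:

```shell
# Hypothetical helper: given a mount point and fstab-style text on stdin,
# succeed if that entry carries the "user" or "users" option.
user_mountable() {
    awk -v mp="$1" '
        NF == 0 || $1 ~ /^#/ { next }    # skip blank lines and comments
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "user" || opts[i] == "users") found = 1
        }
        END { exit !found }'
}

# A made-up fstab line; on a real system you would read /etc/fstab instead.
if user_mountable /home/data <<'EOF'
/dev/sda5  /home/data  ext4  noauto,user  0 0
EOF
then
    echo "any user may mount /home/data"
else
    echo "only root may mount /home/data"
fi
```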

Automounting of plug-in devices by udisks uses a slightly different configuration mechanism. The device is not looked for in /etc/fstab but is mounted somewhere under /run/media (or /media on some distros), provided that the user who plugs it in is permitted to do so. Depending on the distro, that may mean being logged in at the local console or being a member of a group such as plugdev.

The kernel communicates with disks via large buffers. As a result, data which has apparently been written out to disk may still be physically buffered, waiting to be written out later. To ensure that this is actually done, filesystems must be cleanly unmounted using the umount command (note the spelling: umount, not unmount). For mounted internal partitions, umount is invoked by the closedown scripts. For plug-in devices, this must be done by hand in a terminal, or by clicking an unmount option in a file manager window. If you physically disconnect a writeable device without unmounting it, it may get corrupted and you will probably get a warning when you next use it.
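You can watch this buffering at work. The kernel reports the amount of data still waiting to be written in the Dirty line of /proc/meminfo, and the sync command asks it to flush everything out. The sketch below prints the figure before and after a sync; the function name dirty_kb is made up. This is only a way of seeing the buffers, not a substitute for unmounting.

Code:

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print the
# number of kilobytes waiting to be written out (the "Dirty" line).
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }'
}

if [ -r /proc/meminfo ]; then
    echo "before sync: $(dirty_kb < /proc/meminfo) kB dirty"
    sync    # ask the kernel to flush buffered writes to disk
    echo "after sync:  $(dirty_kb < /proc/meminfo) kB dirty"
fi
```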

The umount command mirrors the mount command:

Code:

# By device name
umount /dev/sda5
# Or by mount point
umount /home/data

Posted in For newbies

Comments

	I just now noticed you're doing the blog thing... my goodness, these are excellent articles. Carry on! (I also see there's an RSS feed for my convenience!)
	Posted 01-05-2022 at 12:24 AM by Reziac