Fedora 16 GPT, GRUB2 and BIOS Boot Partition

As people start experimenting with Fedora 16 (Verne), many are encountering the concept of a GPT (GUID Partition Table), the GRUB2 multiboot utility and the concept of a BIOS Boot partition for the first time.

Here is how Fedora 16 Beta set up the partitions on an 8 GiB disk when all defaults were selected:

# parted  /dev/vda
GNU Parted 3.0
Using /dev/vda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 8590MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name  Flags
 1      1049kB  2097kB  1049kB                     bios_grub
 2      2097kB  526MB   524MB   ext4         ext4  boot
 3      526MB   8589MB  8063MB                     lvm

(parted) q

# gdisk /dev/vda
GPT fdisk (gdisk) version 0.7.2

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.

Command (? for help): p
Disk /dev/vda: 16777216 sectors, 8.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): FB0FEC7D-7C6F-4E35-A5DC-4496C5C02A7E
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 16777182
Partitions will be aligned on 2048-sector boundaries
Total free space is 4029 sectors (2.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            4095   1024.0 KiB  EF02
   2            4096         1028095   500.0 MiB   EF00  ext4
   3         1028096        16775167   7.5 GiB     8E00

Command (? for help): i
Partition number (1-3): 1
Partition GUID code: 21686148-6449-6E6F-744E-656564454649 (BIOS boot partition)
Partition unique GUID: 4DF28119-09B0-4C6E-855E-427C62CF832D
First sector: 2048 (at 1024.0 KiB)
Last sector: 4095 (at 2.0 MiB)
Partition size: 2048 sectors (1024.0 KiB)
Attribute flags: 0000000000000000
Partition name: ''

Command (? for help): q
# 


Note that the partition scheme is GPT and that I used gdisk rather than fdisk, because fdisk does not understand GPT. If you examine the output from the two utilities, you will notice that parted uses the bios_grub flag to indicate that a partition is a BIOS Boot partition, whereas gdisk uses a partition type code of 0xEF02. I recommend that you use gdisk instead of parted when working with GPT-labelled disks, as Rod Smith’s gdisk utility provides much more information in a more usable format.
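
As a quick cross-check of the two listings, the following minimal sketch recomputes the partition sizes from the sector numbers that gdisk printed. The variable and function names are my own; only the sector numbers and the 512-byte sector size come from the output above. It shows why parted (decimal MB) and gdisk (binary MiB) report different-looking figures for the same partitions, and confirms that each partition starts on a 2048-sector (1 MiB) boundary.

```python
SECTOR = 512  # logical sector size reported by both tools

# (number, first sector, last sector), copied from the gdisk listing
partitions = [(1, 2048, 4095), (2, 4096, 1028095), (3, 1028096, 16775167)]


def size_bytes(first, last):
    """Size in bytes of an inclusive sector range."""
    return (last - first + 1) * SECTOR


for number, first, last in partitions:
    size = size_bytes(first, last)
    aligned = first % 2048 == 0  # 2048 sectors * 512 bytes = 1 MiB
    print(f"{number}: {size / 2**20:.1f} MiB, {size / 10**6:.1f} MB, "
          f"starts on a 1 MiB boundary: {aligned}")
```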

Many Linux users are only aware of the disk partitioning scheme called Master Boot Record (MBR), which has been around since the dawn of the PC. Now they are going to have to become familiar with GPT. As an aside, there are many other disk partitioning schemes (often referred to as disk labels) including the following:

  • MIPS DVH
  • Amiga partition table
  • PC98 partition table
  • Sun disk label
  • BSD disk label
  • AIX disk label
  • Macintosh Partition Map

I will not be discussing any of the above disk partitioning schemes in this post nor going into detail on GPT. I assume that you have some knowledge of GPT. If you are unfamiliar with GPT, I suggest you read this Wikipedia article.

The Fedora installer is called Anaconda and is written in Python. Among the listed Anaconda changes for Fedora 16 are the following:

  • grub2 is now the default bootloader, though upgrades will stay with whatever was previously installed.
  • x86 uses GPT disklabels by default on all machines, even non-EFI.

Earlier releases of Fedora did not support the use of a GPT boot disk on non-EFI systems. In fact, a warning message to that effect was added to the Fedora 15 installation script.

So what do these two changes really mean? To find out we need to look at the relevant sections of the Fedora 16 installation scripts as the Beta release notes are incomplete. (Note to Fedora developers – you really need to improve your Beta release notes as they relate to major changes.) If you want to browse the Anaconda source code, you can do so here.

From …/pyanaconda/platform.py:

def getPlatform(anaconda):
    """Check the architecture of the system and return an instance of a
       Platform subclass to match.  If the architecture could not be determined,
       raise an exception."""
    if iutil.isPPC():
        ppcMachine = iutil.getPPCMachine()

        if (ppcMachine == "PMac" and iutil.getPPCMacGen() == "NewWorld"):
            return NewWorldPPC(anaconda)
        elif ppcMachine in ["iSeries", "pSeries"]:
            return IPSeriesPPC(anaconda)
        elif ppcMachine == "PS3":
            return PS3(anaconda)
        else:
            raise SystemError, "Unsupported PPC machine type"
    elif iutil.isS390():
        return S390(anaconda)
    elif iutil.isSparc():
        return Sparc(anaconda)
    elif iutil.isEfi():
        return EFI(anaconda)
    elif iutil.isX86():
        return X86(anaconda)
    else:
        raise SystemError, "Could not determine system architecture."

....

class X86(Platform):
    _bootloaderClass = bootloader.GRUB2
    _boot_stage1_device_types = ["disk"]
    _boot_mbr_description = N_("Master Boot Record")
    _boot_descriptions = {"disk": _boot_mbr_description,
                          "partition": Platform._boot_partition_description,
                          "mdarray": Platform._boot_raid_description}


    _disklabel_types = ["gpt", "msdos"]
    # XXX hpfs, if reported by blkid/udev, will end up with a type of None
    _non_linux_format_types = ["vfat", "ntfs", "hpfs"]

    def setDefaultPartitioning(self):
        """Return the default platform-specific partitioning information."""
        from storage.partspec import PartSpec
        ret = Platform.setDefaultPartitioning(self)
        ret.append(PartSpec(fstype="biosboot", size=1,
                            weight=self.weight(fstype="biosboot")))
        return ret

    def weight(self, fstype=None, mountpoint=None):
        score = Platform.weight(self, fstype=fstype, mountpoint=mountpoint)
        if score:
            return score
        elif fstype == "biosboot":
            return 5000
        else:
            return 0

The GRUB2 class is defined in …/pyanaconda/bootloader.py.

class GRUB2(GRUB):
    """ GRUBv2

        - configuration
            - password (insecure), password_pbkdf2
                - http://www.gnu.org/software/grub/manual/grub.html#Invoking-grub_002dmkpasswd_002dpbkdf2
            - --users per-entry specifies which users can access, otherwise
              entry is unrestricted
            - /etc/grub/custom.cfg

        - how does grub resolve names of md arrays?

        - disable automatic use of grub-mkconfig?
            - on upgrades?

        - BIOS boot partition (GPT)
            - parted /dev/sda set <partition_number> bios_grub on
            - can't contain a filesystem
            - 31KiB min, 1MiB recommended

    """
    name = "GRUB2"
    packages = ["grub2"]
    obsoletes = ["grub"]
    _config_file = "grub.cfg"
    _config_dir = "grub2"
    config_file_mode = 0600
    defaults_file = "/etc/default/grub"
    can_dual_boot = True
    can_update = True

    # requirements for boot devices
    stage2_format_types = ["ext4", "ext3", "ext2", "btrfs"]
    stage2_device_types = ["partition", "mdarray", "lvmlv"]
    stage2_raid_levels = [mdraid.RAID0, mdraid.RAID1, mdraid.RAID4,
                          mdraid.RAID5, mdraid.RAID6, mdraid.RAID10]

    # XXX we probably need special handling for raid stage1 w/ gpt disklabel
    #     since it's unlikely there'll be a bios boot partition on each disk

    #
    # constraints for target devices
    #
    def _gpt_disk_has_bios_boot(self, device):
        """ Return False if device is gpt-labeled disk w/o bios boot part. """
        ret = True

        if device is None:
            return ret

        if not self.platform.weight(fstype="biosboot"):
            # this platform does not need bios boot
            return ret

        # check that a bios boot partition is present if the stage1 device
        # is a gpt-labeled disk
        if device.isDisk and getattr(device.format, "labelType", None) == "gpt":
            ret = False
            partitions = [p for p in self.storage.partitions
                          if p.disk == device]
            for p in partitions:
                if p.format.type == "biosboot":
                    ret = True
                    break

            if not ret:
                self.errors.append(_("Your BIOS-based system needs a special "
                                     "partition to boot with Fedora's new "
                                     "disk label format (GPT). To continue, "
                                     "please create a 1MB 'BIOS Boot' type "
                                     "partition."))

        log.debug("_gpt_disk_has_bios_boot(%s) returning %s" % (device.name,
                                                                ret))
        return ret

    def is_valid_stage1_device(self, device):
        ret = super(GRUB2, self).is_valid_stage1_device(device)
        ret = ret and self._gpt_disk_has_bios_boot(device)

        log.debug("is_valid_stage1_device(%s) returning %s" % (device.name,
                                                                ret))
        return ret

    #
    # grub-related conveniences
    #

    def grub_device_name(self, device):
        """ Return a grub-friendly representation of device.

            Disks and partitions use the (hdX,Y) notation, while lvm and
            md devices just use their names.
        """
        drive = None
        name = "(%s)" % device.name

        if device.isDisk:
            drive = device
        elif hasattr(device, "disk"):
            drive = device.disk

        if drive is not None:
            name = "(hd%d" % self.drives.index(drive)
            if hasattr(device, "disk"):
                name += ",%d" % device.partedPartition.number
            name += ")"
        return name

    def write_config_console(self, config):
        if not self.console:
            return

        console_arg = "console=%s" % self.console
        if self.console_options:
            console_arg += ",%s" % self.console_options
        self.boot_args.add(console_arg)

    def write_device_map(self, install_root=""):
        """ Write out a device map containing all supported devices. """
        map_path = os.path.normpath(install_root + self.device_map_file)
        if os.access(map_path, os.R_OK):
            os.rename(map_path, map_path + ".anacbak")

        dev_map = open(map_path, "w")
        dev_map.write("# this device map was generated by anaconda\n")
        devices = self.drives
        if self.stage1_device not in devices:
            devices.append(self.stage1_device)

        if self.stage2_device not in devices:
            devices.append(self.stage2_device)

        for drive in devices:
            dev_map.write("%s      %s\n" % (self.grub_device_name(drive),
                                            drive.path))
        dev_map.close()

    def write_defaults(self, install_root=""):
        defaults_file = "%s%s" % (install_root, self.defaults_file)
        defaults = open(defaults_file, "w+")
        defaults.write("GRUB_TIMEOUT=%d\n" % self.timeout)
        defaults.write("GRUB_DISTRIBUTOR=\"%s\"\n" % productName)
        defaults.write("GRUB_DEFAULT=saved\n")
        if self.console and self.console.startswith("ttyS"):
            defaults.write("GRUB_TERMINAL=\"serial console\"\n")
            defaults.write("GRUB_SERIAL_COMMAND=\"%s\"\n" % self.serial_command)

        # this is going to cause problems for systems containing multiple
        # linux installations or even multiple boot entries with different
        # boot arguments
        defaults.write("GRUB_CMDLINE_LINUX=\"%s\"\n" % self.boot_args)
        defaults.close()

    def _encrypt_password(self, install_root=""):
        """ Make sure self.encrypted_password is set up properly. """
        if self.encrypted_password:
            return

        if not self.password:
            raise RuntimeError("cannot encrypt empty password")

        (pread, pwrite) = os.pipe()
        os.write(pwrite, "%s\n%s\n" % (self.password, self.password))
        os.close(pwrite)
        buf = iutil.execWithCapture("grub2-mkpasswd-pbkdf2", [],
                                    stdin=pread,
                                    stderr="/dev/tty5",
                                    root=install_root)
        os.close(pread)
        self.encrypted_password = buf.split()[-1].strip()
        if not self.encrypted_password.startswith("grub.pbkdf2."):
            raise BootLoaderError("failed to encrypt bootloader password")

    def write_password_config(self, install_root=""):
        if not self.password and not self.encrypted_password:
            return

        users_file = install_root + "/etc/grub.d/01_users"
        header = open(users_file, "w")
        header.write("#!/bin/sh -e\n\n")
        header.write("cat << EOF\n")
        # XXX FIXME: document somewhere that the username is "root"
        header.write("set superusers=\"root\"\n")
        self._encrypt_password(install_root=install_root)
        password_line = "password_pbkdf2 root " + self.encrypted_password
        header.write("%s\n" % password_line)
        header.write("EOF\n")
        header.close()
        os.chmod(users_file, 0755)

    def write_config(self, install_root=""):
        self.write_config_console(None)
        self.write_device_map(install_root=install_root)
        self.write_defaults(install_root=install_root)

        # if we fail to setup password auth we should complete the
        # installation so the system is at least bootable
        try:
            self.write_password_config(install_root=install_root)
        except (BootLoaderError, OSError, RuntimeError) as e:
            log.error("bootloader password setup failed: %s" % e)

        # make sure the default entry is the OS we are installing
        entry_title = "%s Linux, with Linux %s" % (productName,
                                                   self.default.version)
        rc = iutil.execWithRedirect("grub2-set-default",
                                    [entry_title],
                                    root=install_root,
                                    stdout="/dev/tty5", stderr="/dev/tty5")
        if rc:
            log.error("failed to set default menu entry to %s" % productName)

        # now tell grub2 to generate the main configuration file
        rc = iutil.execWithRedirect("grub2-mkconfig",
                                    ["-o", self.config_file],
                                    root=install_root,
                                    stdout="/dev/tty5", stderr="/dev/tty5")
        if rc:
            raise BootLoaderError("failed to write bootloader configuration")


If the boot disk already has an MBR and you are not going to use the whole disk for Fedora 16, the disk remains MBR and is not changed to GPT.

If the boot disk is already GPT, Anaconda checks that a BIOS Boot partition exists and reports an error if it does not. I have not examined the code in detail, but I do not think that Anaconda checks whether an existing BIOS Boot partition meets some minimum size.
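
To make that check concrete, here is a simplified, standalone sketch of the logic in _gpt_disk_has_bios_boot() shown earlier. The Disk and Partition classes and the platform_needs_bios_boot parameter are hypothetical stand-ins of my own for Anaconda's device objects and platform weight lookup; this is not Anaconda code.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    number: int
    format_type: str  # e.g. "biosboot", "ext4", "lvmpv"


@dataclass
class Disk:
    name: str
    label_type: str  # "gpt" or "msdos"
    partitions: list = field(default_factory=list)


def gpt_disk_has_bios_boot(disk, platform_needs_bios_boot=True):
    """Return False only for a GPT-labelled disk without a BIOS Boot partition."""
    if disk is None or not platform_needs_bios_boot:
        return True
    if disk.label_type != "gpt":
        return True  # MBR disks keep using the post-MBR gap
    return any(p.format_type == "biosboot" for p in disk.partitions)


fedora16 = Disk("vda", "gpt", [Partition(1, "biosboot"),
                               Partition(2, "ext4"),
                               Partition(3, "lvmpv")])
print(gpt_disk_has_bios_boot(fedora16))
```

Like the original, the sketch only tests for the presence of the partition and says nothing about its size.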

Note the undocumented minimum and maximum limits on the size of a BIOS Boot partition that are embedded in Anaconda. According to the GRUB2 documentation a minimum of 31 KiB is required with 512K recommended for possible future use. See …/formats/biosboot.py.

class BIOSBoot(DeviceFormat):
    """ BIOS boot partition for GPT disklabels. """
    _type = "biosboot"
    _name = "BIOS Boot"
    _udevTypes = []
    partedFlag = PARTITION_BIOS_GRUB
    _formattable = True                 # can be formatted
    _linuxNative = True                 # for clearpart
    _bootable = True                    # can be used as boot
    _maxSize = 2                        # maximum size in MB
    _minSize = 0.5                      # minimum size in MB

....


The actual space needed by GRUB2 depends on the number of GRUB2 modules you add to core.img when building the image.

If Anaconda goes the GRUB2 route, it sets up the new GPT with a BIOS Boot partition. A BIOS Boot partition is a partition on a data device that may be used by BIOS-based systems in order to boot when the partition table of the device is a GPT label. The de facto Globally Unique Identifier (GUID) for a GPT BIOS Boot partition is 21686148-6449-6E6F-744E-656564454649. On disk, I understand that this translates to Hah!IdontNeedEFI when you byteswap the first 8 octets of the string but I have not checked.
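
The claim is easy to test. A minimal sketch using Python's standard uuid module: the bytes_le attribute gives the mixed-endian layout in which the first three GUID fields are stored little-endian, which is the byteswap in question.

```python
import uuid

BIOS_BOOT_GUID = uuid.UUID("21686148-6449-6E6F-744E-656564454649")

# bytes_le stores the first three fields (8 octets) little-endian and the
# remaining 8 octets in the order written.
on_disk = BIOS_BOOT_GUID.bytes_le
print(on_disk.decode("ascii"))
```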

The concept behind a BIOS Boot partition is not particularly new. On BIOS-based computers, boot loader images are larger than can fit in a single disk block or two. To overcome this limitation, boot loaders are often split into a number of stages. For instance, GRUB Legacy has Stage1 code that lives in bytes 0 to 445 of the MBR, i.e. LBA0 (Logical Block Address 0), of a disk, and other code (the core image) that lives in what is known as the post-MBR gap. Colin Watson referred to this space as the boot track on the GRUB mailing list, but I have never come across this terminology before. The post-MBR gap is the remainder of the first track, that is, the contiguous sectors between LBA0 and the start of the first partition. This comes from the early days of DOS, where the default starting offset for the first partition on a hard disk drive was sector 0x3F (63). Where did 63 sectors come from? Well, in the early days of PC hard disks, disks typically had 63 sectors per track. 63 sectors translates to 31.5 KiB; excluding LBA0 itself, the gap is 62 sectors, or 31 KiB.
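
The arithmetic can be spelled out as follows. This is a small sketch that assumes 512-byte sectors and a first partition starting at sector 63; the constant names are mine. The 31 KiB figure for the gap proper is the same number quoted as the minimum BIOS Boot partition size in the GRUB2 class docstring above.

```python
SECTOR_BYTES = 512
SECTORS_PER_TRACK = 63    # classic geometry reported by PC hard disks
FIRST_PARTITION_LBA = 63  # DOS default start of the first partition

# A whole track, LBA0 included
track_kib = SECTORS_PER_TRACK * SECTOR_BYTES / 1024

# The gap proper: LBA1 to LBA62, since LBA0 holds the MBR itself
gap_sectors = FIRST_PARTITION_LBA - 1
gap_kib = gap_sectors * SECTOR_BYTES / 1024

print(track_kib, gap_sectors, gap_kib)
```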

Using a post-MBR gap was never a good idea because there is no agreement about who or what may use these sectors. GRUB Legacy simply assumed the sectors belonged to it. Over the years there have been a number of other users of some of the sectors in this gap. It is unmanaged space, and has all the potential problems of unmanaged space. There are also partitioning systems out there that do not reserve any space after the MBR and some that reserve more. For example, Windows Vista and later use a 1 MiB alignment boundary instead of the traditional 63 sectors.

An alternative mechanism is to install the GRUB core image in a file system and store the list of blocks that make up the core image in the first sector of that partition. However, this also has its drawbacks. The bootloader is vulnerable to its blocks being moved around by certain filesystem features such as tail packing (for example in ReiserFS) or by aggressive fsck implementations. The bootloader must also determine the correct drive number.

Along came the GPT specification in the late 1990s as part of Intel’s Extensible Firmware Interface (EFI) effort. GPT is now part of the UEFI specification and is defined in section 5 of that specification. The GPT specification does not provide for a post-MBR gap, so a small number of concerned parted developers came up with the concept of reserving a GPT partition to contain code that previously was placed in the post-MBR gap.

If you want to know more about the history of how the BIOS Boot partition came into being, read the parted-devel mailing list for December 2007 and the first few months of 2008. It is quite interesting to read as it shows you, with the benefit of hindsight, how poorly thought out open source design decisions can be when there is no one individual taking ownership of the problem.

Support for the concept of a BIOS Boot partition by parted and GRUB2 is not something that has only happened in the last few months. Such support was initially added by Robert Millan to parted by this patch in February 2008. If you want to look at the source for parted, a web interface to the parted Git repository is available here. By the way, Millan also wrote the initial Wikipedia entry for BIOS Boot partition way back in February 2008 and was an early experimenter with hybrid GPT-MBR disk labels and GRUB2.

Are there other ways to boot a GPT disk on a BIOS-based platform? Absolutely. For example, you can put the boot code in a filesystem and hardcode its offset, or you can assume that filesystems reserve some space at the beginning of their partitions and put the boot code there. Another way is to use a hybrid MBR-GPT label.

The following document by H. Peter Anvin of SYSLINUX fame, who currently works for Intel, is interesting. From …/syslinux/doc/gpt.txt initially authored May 13, 2008:

			  

                   GPT boot protocol

There are two ways to boot a GPT-formatted disk on a BIOS system.
Hybrid booting, and the new GPT-only booting protocol originally
proposed by the author, and later adopted by the T13 committee in
slightly modified form.


	*** Hybrid booting ***

Hybrid booting uses a standard MBR, and has bootable ("active")
partitions present, as partitions, in the GPT PMBR sector.  This means
the PMBR, instead of containing only one "protective" partition (type
EE), may contain up to three partitions: a protective partition (EE)
*before* the active partition, the active partition, and a protective
partition (EE) *after* the active partition.  The active partition is
limited to the first 2^32 sectors (2 TB) of the disk.

All partitions, including the active partition, should have GPT
partition entries.  Thus, changing which partition is active does NOT
change the GPT partition table.

This is the only known way to boot Microsoft operating systems from a
GPT disk with BIOS firmware.


	*** New protocol ***

This defines the T13-approved protocol for GPT partitions with BIOS
firmware.  It maintains backwards compatibility to the extent
possible.  It is implemented by the file mbr/gptmbr.bin.

The (P)MBR format is the normal PMBR specified in the UEFI
documentation, with the first 440 bytes used for the boot code.  The
partition to be booted is marked by setting bit 2 in the GPT Partition
Entry Attributes field (offset 48); this bit is reserved by the UEFI
Forum for "Legacy BIOS Bootable".


    -> The handover protocol

The PMBR boot code loads the first sector of the bootable partition,
and passes in DL=<disk number>, ES:DI=<pointer to $PnP>, sets EAX to
0x54504721 ("!GPT") and points DS:SI to a structure of the following
form:

	Offset	Size	Contents
	---------------------------------------------------------
	  0	  1	0x80 (this is a bootable partition)
	  1	  3	CHS of partition (using INT 13h geometry)
	  4	  1	0xED (partition type: synthetic)
	  5	  3	CHS of partition end
	  8	  4	Partition start LBA
	 12	  4	Partition end LBA
	 16	  4	Length of the GPT entry
	 20	varies	GPT partition entry

The CHS information is optional; gptmbr.bin currently does *NOT*
calculate them, and just leaves them as zero.

Bytes 0-15 matches the standard MBR handover (DS:SI points to the
partition entry), except that the information is provided
synthetically.  The MBR-compatible fields are directly usable if they
are < 2 TB, otherwise these fields should contain 0xFFFFFFFF and the
OS will need to understand the GPT partition entry which follows the
MBR one.  The "!GPT" magic number in EAX and the 0xED partition type
also informs the OS that the GPT partition information is present.

Syslinux 4.00 and later fully implements this protocol.
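
To make the table of offsets easier to follow, here is a minimal sketch that packs the structure DS:SI points to, using Python's struct module. The helper build_handover() and the constant names are hypothetical and mine; the layout, the 0x80 and 0xED values, the 0xFFFFFFFF placeholder and the "!GPT" magic are taken from the text above. I assume 512-byte sectors, so that "< 2 TB" means an LBA that fits in 32 bits, and I leave the CHS fields as zero, as gptmbr.bin is said to do.

```python
import struct

# EAX magic from the document: stored little-endian, it spells "!GPT".
GPT_MAGIC = 0x54504721

# Offsets 0-19 of the table: flag, CHS start, type, CHS end,
# start LBA, end LBA, length of the GPT entry that follows.
HEADER = struct.Struct("<B3sB3sIII")

NO_CHS = bytes(3)      # CHS information left as zero
TOO_BIG = 0xFFFFFFFF   # placeholder when an LBA does not fit in 32 bits


def build_handover(start_lba, end_lba, gpt_entry):
    """Return the bytes DS:SI would point at (hypothetical helper)."""
    start32 = start_lba if start_lba < 2**32 else TOO_BIG
    end32 = end_lba if end_lba < 2**32 else TOO_BIG
    header = HEADER.pack(0x80, NO_CHS, 0xED, NO_CHS,
                         start32, end32, len(gpt_entry))
    return header + gpt_entry


# Example: a partition spanning sectors 4096-1028095 with a dummy
# 128-byte GPT entry appended.
blob = build_handover(4096, 1028095, bytes(128))
print(struct.pack("<I", GPT_MAGIC), HEADER.size, len(blob))
```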


The T13 committee referred to above is the T13 Committee of the INCITS (International Committee for Information Technology Standards). This committee (AT Attachment) is responsible for all interface standards relating to the AT Attachment (ATA) storage interface utilized as the disk drive interface on most personal computers.

The idea behind hybrid partitions is that a boot disk has both a GPT disk label and an MBR disk label. The MBR has the boot partition(s), aliased from the corresponding GPT partitions, and possibly other partitions below 2 TB plus the PMBR (Protective Master Boot Record), with all other partitions being defined in the GPT. Christoph Pfisterer wrote a tool called gptsync which can be used to read the GPT partition table on a device and synchronise the PMBR partition table with the GPT partition table.

Hybrid MBR-GPT partitioning is also not something new. Millan demonstrated and advocated an interesting proof-of-concept hybrid GPT-MBR scheme in the 2008 timeframe but abandoned his work when H. Peter Anvin started work on standardizing his handover protocol (see above). However, what emerged from the standardization process is slightly different. A GPT partition record now includes a Legacy BIOS Bootable bit that can be set to indicate that the partition can be used for something like the BIOS Boot partition. For example, the GRUB2 boot code in LBA0 could search through the GPT partition table for a partition with that bit set to 1 and load its second stage (core.img) from that partition. This algorithm is documented in T13 EDD-4 revision 2.
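
The search described above can be sketched in a few lines. This is an illustration only, not GRUB2's actual boot code (which would be 16-bit assembly): it assumes the standard 128-byte GPT partition entry layout, with the attributes field at offset 48 as stated in the Syslinux document, and the helper names make_entry() and find_legacy_bootable() are hypothetical.

```python
import struct
import uuid

# 128-byte GPT partition entry: type GUID, unique GUID, first LBA,
# last LBA, attributes (offset 48), UTF-16LE name.
ENTRY = struct.Struct("<16s16sQQQ72s")
LEGACY_BIOS_BOOTABLE = 1 << 2  # bit 2 of the attributes field
BIOS_BOOT = uuid.UUID("21686148-6449-6E6F-744E-656564454649")


def make_entry(type_guid, first_lba, last_lba, attributes=0):
    """Build one entry for the demo (name left empty)."""
    return ENTRY.pack(type_guid.bytes_le, uuid.uuid4().bytes_le,
                      first_lba, last_lba, attributes, bytes(72))


def find_legacy_bootable(table):
    """Return the index of the first used entry with bit 2 set, else None."""
    for index in range(len(table) // ENTRY.size):
        fields = ENTRY.unpack_from(table, index * ENTRY.size)
        type_guid, attributes = fields[0], fields[4]
        if type_guid == bytes(16):  # all-zero type GUID: unused slot
            continue
        if attributes & LEGACY_BIOS_BOOTABLE:
            return index
    return None


# Demo table: an arbitrary data partition, a flagged BIOS Boot partition,
# and one unused slot.
table = (make_entry(uuid.uuid4(), 4096, 1028095)
         + make_entry(BIOS_BOOT, 2048, 4095, LEGACY_BIOS_BOOTABLE)
         + bytes(ENTRY.size))
print(find_legacy_bootable(table))
```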

Hybrid GPT-MBRs have their own set of problems. GPT-aware BIOS-based operating systems can use GPT partitions, whereas GPT-unaware BIOS-based operating systems can only use three MBR partitions, the fourth entry being taken by the PMBR. The 2 TB partition size limit remains for MBR partitions and for non-GPT-aware BIOS-based operating systems. In spite of these issues, recent versions of Microsoft operating systems use a hybrid MBR-GPT for BIOS-based systems.

Where things get confusing with BIOS Boot partitions is the use of a flag named bios_grub by parted to, as Jim Meyering and Millan discussed, "mark the selected partition as usable for BIOS-based boot, so that bootloaders that use static embedding (like GNU GRUB) can put their boot code in it." According to Millan (see Debian Bug #48111), the use of bios_grub was "an unfortunate choice of words, but HPA insisted in using it"; HPA being H. Peter Anvin.

Instead of trying to claim that a BIOS Boot partition is a reserved partition for use by any boot loader, why not acknowledge reality and stop pretending that the BIOS Boot partition can be used by other bootloaders? The current design is as bad as the post-MBR gap. Basically, what the parted developers are saying is: here is unmanaged disk space, defined by a BIOS Boot partition, which any boot loader can use. Should another boot loader simply assume this space is free and place its code in this partition, or should it check first and create another BIOS Boot partition if it finds that some or all of an existing BIOS Boot partition is used? Does a BIOS Boot partition have to be the first partition? See the problem? We are back to the unmanaged space issue that plagued MBR boot disks.

Currently, as far as I am aware, a BIOS Boot partition is only used by GRUB2. It would be better in the long term for Linux users if the concept of a BIOS Boot partition were deprecated and GRUB2 simply used a partition with a GUID that reflected the fact that it is reserved for exclusive use by GRUB2. If other bootloaders needed a similar exclusive partition, they would simply create their own GPT partition with a well-known GUID.

Finally, as of Fedora 16 there must be a BIOS Boot partition for the bootloader to be installed successfully by Kickstart. You can create this partition with the following Kickstart option:

part biosboot --fstype=biosboot --size=1 


Time to finish this blog! As a teaser, I will point out that you do not need a BIOS Boot partition if you plan to chainload GRUB2 from another bootloader. Can you figure out why?

1 comment to Fedora 16 GPT, GRUB2 and BIOS Boot Partition

  • Ben

    Either way, just another boot partition. And smaller, to boot. :D

    Thanks for the incredibly detailed post.