Cooking and Baking Linux Distributions in Nix

Figure: Nix package manager logo.

The Linux distribution space is vast and fragmented. It’s the wild wild west, and that’s all right in my book. Having one distribution installed means, to a lesser degree, having all of them installed (provided that the architecture is the same). Packages in one distribution may be unavailable in another, but it’s always nice to be able to run unavailable programs without much ado. No one can stop you from spinning up a “container” or virtual machine with Arch Linux to try out an idea, while mixing in packages from NixOS to compare or hack around. The process of mixing binaries and creating ad-hoc virtual machines is sometimes called “cooking” or “baking”. Bedrock Linux is a fully cooked/baked Linux distribution.

The terms “cook” and “bake” are a bit of jargon. In this context and for my purposes, “cooking” refers to the cherry picking of various binaries and files from multiple Linux distributions into a final system, and “baking” refers to the process of injecting a Linux kernel into a “cooked” distribution image to make it bootable. The recipe for a “cake” (a cook and a bake) can be composed in any preferred programming language or process.

Countless programs offer a way to cook or bake systems in some form. LinuxKit, Darch, BitBake, and manual chroots (change root) are just a few. Let’s use Nix, nix-shell, and PRoot (user space chroot) to produce a basic environment that lets us cook and bake various distributions.

Nix Shell

The nix-shell command sets up an interactive shell environment according to the specified nix code. The minimal shell.nix file below creates an empty interactive shell environment with a custom PS1 (Prompt String #1).

let

  name = "nix-shell";
  pkgs = import <nixpkgs> { };

in pkgs.mkShell {

  inherit name;

  buildInputs = [ ];

  shellHook = ''
    export PS1='\h (${name}) \W \$ '
  '';
}

This shell.nix sets a name that describes the environment, buildInputs that add a list of programs to the new shell path, and a shellHook that runs arbitrary commands after nix-shell invocation.

Executing nix-shell without a file path acts on the shell.nix in the current directory, or on the default.nix if no shell.nix exists.

$ nix-shell

Adding --pure as an argument to nix-shell clears the majority of the environment variables before entering the nix shell environment.

$ nix-shell --pure

The --command argument can be used to further clean up the environment variables by specifying a shell of choice and the sourcing procedure.

nix-shell --pure --command 'bash --login --norc --noprofile'

Cooking

Now that the nix-shell summary is out of the way, let’s set up a basic shell.nix that implements the cooking functionality. Using the previous minimal shell.nix setup, the interactive environment is named, the pkgs attribute is locked to a specific version of nixpkgs, and a root file system for Alpine Linux 3.12 is pulled using pkgs.dockerTools.pullImage, a cheap way to obtain slim root file systems. At the time this article was written there were certificate issues with Alpine Docker version 3.14. My stack rarely uses docker, so that was a minor inconvenience.

let

  name = "nix-shell.cake";

  pkgs = import (builtins.fetchTarball {
    url = "https://releases.nixos.org/nixos/21.05/nixos-21.05.650.eaba7870ffc/nixexprs.tar.xz";
    sha256 = "08fpds1bkv9106c6s5w3p5r4v3dc24bhk9asm9vqbxxypjglqg9l";
  }) { };

  alpine-3-12-amd64 = pkgs.dockerTools.pullImage rec {
    imageName = "alpine";
    imageDigest = "sha256:2a8831c57b2e2cb2cda0f3a7c260d3b6c51ad04daea0b3bfc5b55f489ebafd71";
    sha256 = "1px8xhk0a3b129cc98d3wm4s0g1z2mahnrxd648gkdbfsdj9dlxp";
    finalImageName = imageName;
    finalImageTag = "3.12";
  };

in pkgs.mkShell {

  inherit name;

  buildInputs = [ ];

  shellHook = ''
    export PS1='\h (${name}) \W \$ '
  '';
}

Inside the let block, the cook function creates a derivation that extracts a rootfs, passes it to proot for bootstrapping, and cherry picks contents into the file system. The nix sandbox defaults to a strict policy that prohibits Internet access. The derivation is given a fixed output hash with a recursive hash mode to allow Internet access inside the proot. If sandboxing is set to relaxed in the nix configuration file nix.conf, then Internet access is allowed by setting the __noChroot option to true in the derivation, without the need for an output hash. Most attributes set in a derivation are exposed as bash variables in a phase. Many would gasp while reading some parts of the nixpkgs source code: bash is everywhere. See “The dark secret of nixpkgs”.

{
  cook = { name, src, contents ? [ ], path ? [ ], script ? "", prepare ? "", cleanup ? "", sha256 ? pkgs.lib.fakeSha256 }: pkgs.stdenvNoCC.mkDerivation {
    inherit name src contents;
    phases = [ "unpackPhase" "installPhase" ];
    buildInputs = [ pkgs.proot pkgs.rsync pkgs.tree pkgs.kmod ];
    bootstrap = pkgs.writeScript "bootstrap-${name}" ''
      ${script}
      rm "$0"
    '';
    installPhase = ''
      set -euo pipefail
      mkdir --parents rootfs $out/rootfs
      tar --extract --file=layer.tar -C rootfs

      ${prepare}

      cp $bootstrap rootfs/bootstrap
      proot --cwd=/ --root-id --rootfs=rootfs /usr/bin/env - /bin/sh -euc '. /etc/profile && /bootstrap'
      printf 'PATH=${pkgs.lib.strings.makeBinPath path}:$PATH' >> rootfs/etc/profile

      [ -n "$contents" ] && {
        printf "\n"
        for paths in $contents; do
          printf "Cooking... Adding %s \n" "$paths"
          rsync --copy-dirlinks --relative --archive --chown=0:0 "$paths/" "rootfs" || exit 1
        done
        printf "\n"
      } || printf '\n%s\n' 'No contents to cook.';

      ${cleanup}

      printf '\n%s\n\n' "$(du --all --max-depth 1 --human-readable rootfs | sort --human-numeric-sort)"
      cp -rT rootfs $out/rootfs
    '';
    outputHashAlgo = "sha256";
    outputHashMode = "recursive";
    outputHash = sha256;
  };
}
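
As a rough illustration of how a list attribute such as contents reaches installPhase: Nix flattens the list into a single space-separated string, which is why the loop above iterates over an unquoted $contents. The sketch below reproduces only that loop in plain shell. The two store paths are made-up placeholders, and nothing is copied.

```shell
# Sketch: how a Nix list attribute looks from inside a build phase.
# The two store paths are made-up placeholders, not real hashes.
contents='/nix/store/aaaa-glibc /nix/store/bbbb-gawk'

count=0
# $contents is left unquoted on purpose so the shell splits it on spaces.
for path in $contents; do
  count=$((count + 1))
  printf 'Cooking... Adding %s\n' "$path"
done

# An empty list arrives as an empty string, which is what [ -n ... ] tests.
contents=''
[ -n "$contents" ] || printf '%s\n' 'No contents to cook.'
```

The same flattening applies to any other list handed to the derivation, so a store path containing spaces would break the loop. Nix store paths do not contain spaces, so the shortcut is safe here.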

This soft abstraction allows us to minimally cook a distribution. Binaries are added in from the nix store or from other distributions. The NixOS packages glibc and awk are thrown into the mix just for fun. GNU’s awk is added to the system path, and the serial console ttyS0 is activated for low level virtual machine access.

{
  alpine = cook {
    name = "alpine";
    src = alpine-3-12-amd64;
    contents = [ pkgs.glibc pkgs.gawk ];
    path = [ pkgs.gawk ];
    script = ''
      apk update
      apk upgrade
      apk add openrc
      sed -i 's/#ttyS0/ttyS0/' /etc/inittab
      printf 'migh7Lib\nmigh7Lib\n' | adduser alpine
    '';
  };
}

Notice that the sha256 is left off due to the chicken or the egg problem. The cooking hash is unknown until the shell completes the derivation because, as you can imagine, running apk update breaks purity. Once the image is cooked to my liking, the hash is added to save the derivation as a fixed point output. The output is referred to by hash and cached inside the nix store. Nothing else matters.

This cooked alpine file system is good enough for useful work. Inside pkgs.mkShell, serve up the system by running proot from the shellHook. Pass through selected host directories to the guest using the --bind argument with proot. These host directories will be available inside the guest. Depending on your use case, /nix and /home may be convenient pass through directories on the guest system.

pkgs.mkShell {

  inherit name;

  buildInputs = [ pkgs.proot pkgs.qemu ];

  shellHook = ''
    export PS1='\h (${name}) \W \$ '
    proot --cwd=/ --rootfs=${alpine}/rootfs --bind=/proc --bind=/dev /usr/bin/env - /bin/sh -c '. /etc/profile && sh'
    exit
  '';
}
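
The proot command above starts the guest shell through /usr/bin/env - so that no host variables leak into the guest before /etc/profile is sourced. Here is a minimal sketch of that behaviour outside of proot. COOK_DEMO is a hypothetical variable used only for the illustration, and the sketch assumes an env that accepts a lone - as GNU coreutils and BusyBox do.

```shell
# Sketch: show that `env -` starts the child with an empty environment.
# COOK_DEMO is a hypothetical variable used only for this illustration.
COOK_DEMO=leaky
export COOK_DEMO

# Without `env -` the child shell inherits the variable.
inherited=$(/bin/sh -c 'printf "%s" "${COOK_DEMO:-unset}"')

# With `env -` the variable is gone before the child shell starts.
cleaned=$(env - /bin/sh -c 'printf "%s" "${COOK_DEMO:-unset}"')

printf 'inherited=%s cleaned=%s\n' "$inherited" "$cleaned"
```

Because the guest starts with nothing, sourcing /etc/profile is what gives it a usable PATH, including the entries appended during cooking.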

Check out my cake.nix shell that creates this environment. You can run this shell using the nix-shell argument --expr on the raw format URL, a trick discovered in a discussion on invoking the contents of a URL in a nix-shell. Never ever run code from a URL in your shell without first downloading and auditing the code. The command below is like a curl to bash invocation.

nix-shell -E 'import (builtins.fetchurl "https://raw.githubusercontent.com/tdro/dotfiles/b9a9051a06d7bba2911c2ed0da0f2e73b4b5de81/.config/nixpkgs/shells/cake.nix")'

Baking

The bake function begins by cooking the initrd (initial ramdisk) image. The initial RAM (Random Access Memory) disk is the first root file system the kernel sees after initialization. The simplest initrd my mind can come up with is worked out in a cooking script. Kernel modules are prepared and cooked before baking.

{
  bake = { name, image, size ? "1G", debug ? false, kernel ? pkgs.linux, options ? [ ], modules ? [ ], uuid ? "99999999-9999-9999-9999-999999999999", sha256 ? pkgs.lib.fakeSha256 }: let
    initrd = cook {
      inherit sha256;
      name = "initrd-${name}";
      src = alpine-3-12-amd64;
      script = ''
        rm -rf home opt media root run srv tmp var
        printf '#!/bin/sh -eu
        mount -t devtmpfs none /dev
        mount -t proc none /proc
        mount -t sysfs none /sys
        sh /lib/modules/initrd/init
        ${pkgs.lib.optionalString (debug) "sh +m"}
        mount -r "$(findfs UUID=${uuid})" /mnt
        mount -o move /dev /mnt/dev
        umount /proc /sys
        exec switch_root /mnt /sbin/init
        ' > init
        chmod +x init
        find . ! -name bootstrap ! -name initramfs.cpio | cpio -H newc -ov > initramfs.cpio
        gzip -9 initramfs.cpio
      '';
      prepare = ''
      modules='${pkgs.lib.strings.concatMapStringsSep " " (module: module) modules}'
      initrd_directory=rootfs/lib/modules/initrd
      [ -n "$modules" ] && {
      mkdir --parents "$initrd_directory"
      printf "\n"
      for module in $modules; do
        module_file=$(find ${kernel} -name "$module.ko*" -type f)
        module_basename=$(basename "$module_file")
        printf "Cooking initrd... Adding module %s \n" "$module"
        cp "$module_file" "$initrd_directory" || exit 1
        printf 'insmod /lib/modules/initrd/%s\n' "$module_basename" >> "$initrd_directory/init"
      done
      } || printf '\n%s\n' 'No modules to cook.'
      '';
    }; in pkgs.writeScript name ''
      # Baking Script
    '';
}
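
The prepare step above writes one insmod line per module into /lib/modules/initrd/init. Since insmod does not resolve dependencies on its own, the order of the modules list becomes the load order at boot. The loop can be sketched on its own, minus the kernel lookup, under the assumption that every module file is named "<module>.ko.xz". The real script finds the file name with find, so the suffix may differ.

```shell
# Sketch of the module loop in `prepare`, minus the kernel lookup.
# Assumption: every module file is named "<module>.ko.xz".
modules='virtio virtio_ring virtio_blk'
initrd_directory=$(mktemp -d)

for module in $modules; do
  module_basename="$module.ko.xz"
  # Append one insmod line per module, preserving the list order.
  printf 'insmod /lib/modules/initrd/%s\n' "$module_basename" >> "$initrd_directory/init"
done

cat "$initrd_directory/init"
```

The generated file is what the initrd's /init runs with sh /lib/modules/initrd/init before looking for the root file system.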

The script for the bake function creates a disk image, formats its partitions, installs the syslinux bootloader, and injects the kernel. Some commands are partially parameterized for more control.

{
  pkgs.writeScript name ''
    set -euo pipefail
    PATH=${pkgs.lib.strings.makeBinPath [
        pkgs.coreutils
        pkgs.e2fsprogs
        pkgs.gawk
        pkgs.rsync
        pkgs.syslinux
        pkgs.tree
        pkgs.utillinux
      ]
    }
    IMAGE=${name}.img
    LOOP=/dev/loop0
    ROOTFS=rootfs
    rm "$IMAGE" || true
    fallocate --length ${size} $IMAGE && chmod o+rw "$IMAGE"
    printf 'o\nn\np\n1\n2048\n\na\nw\n' | fdisk "$IMAGE"
    dd bs=440 count=1 conv=notrunc if=${pkgs.syslinux}/share/syslinux/mbr.bin of="$IMAGE"
    mkdir --parents "$ROOTFS"
    umount --verbose "$ROOTFS" || true
    losetup --detach "$LOOP" || true
    losetup --offset "$((2048 * 512))" "$LOOP" "$IMAGE"
    mkfs.ext4 -U ${uuid} "$LOOP"
    mount --verbose "$LOOP" "$ROOTFS"
    rsync --archive --chown=0:0 "${image}/rootfs/" "$ROOTFS";
    mkdir --parents "$ROOTFS/boot"
    cp ${kernel}/bzImage "$ROOTFS/boot/vmlinux"
    cp ${initrd}/rootfs/initramfs.cpio.gz "$ROOTFS/boot/initrd"
    printf '
    DEFAULT linux
    LABEL linux
      LINUX  /boot/vmlinux
      INITRD /boot/initrd
      APPEND ${pkgs.lib.strings.concatMapStringsSep " " (option: option) options}
    ' > "$ROOTFS/boot/syslinux.cfg"
    extlinux --heads 64 --sectors 32 --install $ROOTFS/boot
    printf '\n%s\n\n' "$(du --max-depth 1 --human-readable $ROOTFS | sort --human-numeric-sort)"
    umount --verbose "$ROOTFS"
    rm -r "$ROOTFS"
    losetup --detach "$LOOP"
  '';
}
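
Two details of the script above are easy to miss: what the string piped into fdisk answers, and where the losetup offset comes from. The sketch below spells out both. The 512 byte sector size is the assumption the script itself makes, and the variable names are only illustrative.

```shell
# fdisk answers, one per line:
#   o        create a new empty DOS (MBR) partition table
#   n        add a new partition
#   p        make it a primary partition
#   1        partition number
#   2048     first sector
#   (empty)  accept the default last sector
#   a        toggle the bootable flag
#   w        write the table and exit
fdisk_answers='o\nn\np\n1\n2048\n\na\nw\n'
answer_count=$(printf "$fdisk_answers" | wc -l)

# The partition starts at sector 2048, so losetup must skip that many sectors.
sector_size=512      # bytes per logical sector, as assumed by the bake script
start_sector=2048    # first sector of partition 1, from the fdisk answers
offset=$((start_sector * sector_size))

printf '%s answers, partition 1 starts at byte %s\n' "$answer_count" "$offset"
```

If the first sector in the fdisk answers changes, the losetup offset has to change with it, otherwise mkfs.ext4 would format the wrong region of the image.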

Inside the let block, expand further and bake the cooked image. The desired kernel version is injected, and kernel options for tracking the virtual console tty1 and serial console ttyS0 are added. The modules are baked in dependency order using a minimal initrd, which reminds me of a tiny ramdisk project repository. Module dependency order is determined using modprobe with the --show-depends argument on a live machine. These modules are just enough to sustain a virtio based block device and a root switch to an ext4 file system. This is the bare minimum for a virtual machine.

{
  alpine-machine = bake {
    name = "alpine-machine";
    image = alpine;
    kernel = pkgs.linuxPackages_5_10.kernel;
    options = [ "console=tty1" "console=ttyS0" ];
    size = "128M";
    modules = [
      "virtio"
      "virtio_ring"
      "virtio_blk"
      "virtio_pci"
      "jbd2"
      "mbcache"
      "crc16"
      "crc32c_generic"
      "ext4"
    ];
  };
}
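
One possible way to get from modprobe --show-depends output to a modules list like the one above is to strip each insmod line down to the bare module name. The sketch below does this with a hypothetical helper, module_names, which is not part of cake.nix. The sample input paths are made up for illustration, and lines for built-in modules are simply skipped.

```shell
# Sketch: turn `modprobe --show-depends` output into a list of module names.
# module_names is a hypothetical helper, not part of cake.nix.
module_names() {
  # Keep only insmod lines and strip the directory and the .ko* suffix.
  sed -n 's|^insmod .*/\([^/]*\)\.ko.*$|\1|p'
}

# Made-up sample input; on a live machine this would instead be:
#   modprobe --show-depends ext4 | module_names
names=$(module_names <<'EOF'
insmod /lib/modules/5.10.0/kernel/fs/jbd2/jbd2.ko.xz
insmod /lib/modules/5.10.0/kernel/fs/mbcache.ko.xz
insmod /lib/modules/5.10.0/kernel/fs/ext4/ext4.ko.xz
EOF
)

printf '%s\n' "$names"
```

The names come out in the order modprobe would load them, which is the order the modules list needs.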

Baking delegates to the shellHook because keeping the implementation inside the nix derivation sandbox requires specialized wizardry. It’s easier to use privileged programs such as doas or sudo within the shell environment to bake the image. Once baked, initialize the image inside a virtual machine using QEMU (Quick Emulator) in -curses or -nographic serial mode.

pkgs.mkShell {

  inherit name;

  buildInputs = [ pkgs.proot pkgs.qemu ];

  shellHook = ''
    export PS1='\h (${name}) \W \$ '
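    # Bake once with whichever privilege escalation tool is available.
    if command -v doas > /dev/null 2>&1; then doas ${alpine-machine}; else sudo ${alpine-machine}; fi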
    qemu-system-x86_64 -nographic -drive if=virtio,file=./${alpine-machine.name}.img,format=raw
    exit
  '';
}

Conclusion

The nix-shell paradigm is extremely powerful. The functional compositions give way to replacing a lot of tools with minimal effort through abstraction. A lot more could be done with more advanced usage of this type of nix-shell, but it is good enough to bind a few distributions into my user environment and cook up small virtual machines quickly.

My main systems run somewhat lean NixOS setups, and while spinning out nix code is easier for me now, the language seems more like a glue, and getting too much of it everywhere may slow you down. Escape pods like virtual machines and easily bound user space chroots are nice to have.

Updated 4 July 2021