One of the many delicate issues when provisioning a bare metal server is dealing with GRUB and kernel changes.
If you are coming from an AWS world, this might be a new adventure: installing an OS across 10 different hardware vendors is not so simple. From testing each new kernel to making sure that we enable the latest drivers for the Mellanox ConnectX-5 NICs, there are quite a few moving parts in the middle.
NOTE: We will try our best to keep this KB up to date, but customers are always responsible for any changes they make after a server is provisioned, and for any custom kernels they choose to run.
GRUB (short for GNU GRand Unified Bootloader) is a boot loader package from the GNU Project. GRUB lets a user choose which of the operating systems installed on a computer to boot, or select a specific kernel configuration available on a particular operating system's partitions.
On each newly provisioned server, we install grub2 on the boot partition (and grub2 tools in Ubuntu).
Additionally, in order to offer console access through our SOS console and to ensure predictable NIC naming, we populate /etc/default/grub with the necessary configs, including a combination of the following two lines:
GRUB_CMDLINE_LINUX='console=tty0 console=ttyS1,115200n8 biosdevname=0 net.ifnames=1'
GRUB_SERIAL_COMMAND='serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1'
NOTE: When running an upgrade or installing a new kernel, please always make sure to keep the local version of /etc/default/grub.
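As a minimal sketch of how to confirm that the local settings survived an upgrade, the shell function below checks a GRUB defaults file for the parameters quoted above. The function name check_grub_defaults is hypothetical and not part of any provided tooling, and the sketch assumes that both lines were present on your server at provision time; adjust the parameter list if your server shipped with a different combination.

```shell
#!/bin/sh
# Hypothetical helper: verify that a GRUB defaults file still carries the
# console and NIC-naming settings quoted in this article.
# Usage: check_grub_defaults [path]   (path defaults to /etc/default/grub)
check_grub_defaults() {
    file="${1:-/etc/default/grub}"
    missing=0
    # Each parameter must appear on the GRUB_CMDLINE_LINUX line.
    for param in 'console=tty0' 'console=ttyS1,115200n8' 'biosdevname=0' 'net.ifnames=1'; do
        if ! grep -q "^GRUB_CMDLINE_LINUX=.*${param}" "$file"; then
            echo "missing from GRUB_CMDLINE_LINUX: ${param}"
            missing=1
        fi
    done
    # The serial settings live on their own line.
    if ! grep -q '^GRUB_SERIAL_COMMAND=' "$file"; then
        echo "missing: GRUB_SERIAL_COMMAND"
        missing=1
    fi
    # Exit status 0 means everything was found, 1 means something is missing.
    return "$missing"
}
```

Running check_grub_defaults with no argument after a package upgrade prints any setting that is no longer present and returns a non-zero status, which makes it easy to spot a replaced file before the next reboot.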
We inject a known good kernel on every server we provision. Sometimes it is the same kernel as the current default kernel for the upstream distribution of your chosen operating system, but sometimes it is not.
In all cases we drop the kernel, modules, and a pre-built initramfs into the correct locations and maintain /boot/grub/grub.cfg so it all works properly. While these are always official distro kernels, they may not be known to the package manager, so /boot is the source of truth here.
root@ubuntu1604:~# ls /boot
initrd ← our injected initrd
vmlinuz ← our injected kernel
Part of /boot/grub/grub.cfg:
linux /boot/vmlinuz root=UUID=$ROOT_DEVICE_UUID ....
We make sure that what we inject is properly tested on each specific server hardware and has all the right drivers enabled, so that the server provisions properly.
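Because /boot rather than the package manager is the source of truth, it can help to see which kernel images grub.cfg actually references. The following is a rough sketch only: list_grub_kernels is a hypothetical helper name, and it assumes that kernel entries use the linux, linux16, or linuxefi commands followed by the image path, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: print the unique kernel image paths referenced by
# a grub.cfg file.
# Usage: list_grub_kernels [path]   (path defaults to /boot/grub/grub.cfg)
list_grub_kernels() {
    cfg="${1:-/boot/grub/grub.cfg}"
    # The first field is the GRUB command, the second is the kernel image path.
    awk '$1 ~ /^linux(16|efi)?$/ { print $2 }' "$cfg" | sort -u
}
```

On a freshly provisioned server, the output would be expected to include the injected /boot/vmlinuz. Note that paths in grub.cfg are relative to the GRUB root device, which may or may not be the same filesystem as /.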
Do's and Don'ts
After a server is provisioned, you are free to upgrade/downgrade the kernel, or even install a custom one. We recommend that you keep a backup of the original grub config and kernel, so that you can manually boot the server up if anything breaks.
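One possible way to take that backup is sketched below. The function name backup_boot_files and the default destination /root/boot-backup are illustrative assumptions. The file names vmlinuz and initrd follow the /boot listing shown earlier and may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: copy the GRUB configuration and the injected
# kernel/initrd into a backup directory before changing kernels.
# Usage: backup_boot_files [boot_dir] [defaults_file] [dest_dir]
backup_boot_files() {
    boot_dir="${1:-/boot}"
    defaults_file="${2:-/etc/default/grub}"
    dest="${3:-/root/boot-backup}"   # assumed location, pick your own
    mkdir -p "$dest" || return 1
    for f in "$defaults_file" "$boot_dir/grub/grub.cfg" "$boot_dir/vmlinuz" "$boot_dir/initrd"; do
        if [ -e "$f" ]; then
            # -p preserves mode and timestamps of the original files.
            cp -p "$f" "$dest/" || return 1
        else
            echo "skipping missing file: $f" >&2
        fi
    done
}
```

Calling backup_boot_files as root with no arguments right after provisioning leaves copies named grub, grub.cfg, vmlinuz, and initrd in the destination directory, which can later be copied back by hand if a new kernel fails to boot.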
Our SOS console works great as long as /etc/default/grub has not changed. If you are not able to access the server even through the console, you can use our Rescue OS to manually roll back to the original (hopefully backed up!) grub config and kernel.