Hardware discovery and kernel auto-configuration in MAAS
by Andres Rodriguez on 12 September 2019
In this post, we explore how to use MAAS for hardware discovery and kernel auto-configuration using tags.
In many cases, certain pieces of hardware require extra kernel parameters to be set before they can be used. For example, when configuring GPU passthrough we typically need to boot the machine with kernel parameters specific to the GPU card. To achieve this, we will rely on MAAS’ hardware discovery, XPath expressions and machine tags.
Tags, XPath expressions and kernel parameters
Machine tags are a mechanism MAAS uses to identify machines easily. Tags can be assigned to machines manually, but they can also be assigned automatically to machines that match a specific pattern. That pattern is an XPath expression, which describes the location of an element or an attribute in an XML document.
When commissioning a machine, MAAS gathers the lshw output (in XML), which lists all the information about the attached hardware. When creating a tag, MAAS lets you provide an XPath definition. This definition is then matched against the gathered lshw information, and the tag is applied to every commissioned machine whose hardware matches.
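To get a feel for how such a definition behaves, it can be tried locally against lshw-style XML. The following is a minimal sketch, not how MAAS evaluates tags internally: it assumes xmllint (from libxml2) is installed, and the XML is a hand-written, heavily trimmed stand-in for real lshw output. The expression uses two clauses of the kind that appear in the practical example later in this post.

```shell
#!/usr/bin/env bash
# Hand-written, trimmed stand-in for `lshw -xml` output (illustrative only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0"?>
<list>
  <node id="host" class="system">
    <node id="cpu:0" class="processor">
      <capabilities>
        <capability id="vmx">CPU virtualization (Vanderpool)</capability>
      </capabilities>
    </node>
    <node id="display" class="display">
      <description>3D controller</description>
      <product>GV100GL [Tesla V100 PCIe 16GB]</product>
      <vendor>NVIDIA Corporation</vendor>
    </node>
  </node>
</list>
EOF

# The expression evaluates to an XPath boolean, which xmllint prints as
# "true" or "false".
definition='//node[@id="cpu:0"]/capabilities/capability/@id = "vmx" and //node[@id="display"]/vendor[contains(.,"NVIDIA")]'

if command -v xmllint >/dev/null 2>&1; then
  match=$(xmllint --xpath "$definition" "$sample")
else
  match="skipped"   # xmllint is not installed on this system
fi
echo "definition matches sample: $match"
rm -f "$sample"
```

Changing the vendor text in the sample, or removing the vmx capability, should flip the result to false, which is a quick way to check a definition before handing it to MAAS.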
Similarly, when creating a tag, one can specify kernel parameters that apply to any machine carrying the tag. Combining the definition and the kernel options in a single tag lets MAAS automatically discover all machines that match the XPath expression and apply the kernel parameters when those machines are deployed. The following demonstrates the base command to use.
$ maas <username> tags create name=<tag name> \
  definition='<XPath expression>' \
  kernel_opts='<Kernel parameters>'
A practical example
As a practical example, we want to configure GPU passthrough. For this, we create a tag that automatically matches all machines that have Intel virtualization support enabled (the vmx CPU capability) and a Tesla V100 PCIe 16GB GPU. We do so by using a definition similar to:
definition='//node[@id="cpu:0"]/capabilities/capability/@id = "vmx" and //node[@id="display"]/vendor[contains(.,"NVIDIA")] and //node[@id="display"]/description[contains(.,"3D")] and //node[@id="display"]/product[contains(.,"Tesla V100 PCIe 16GB")]'
Since we want the GPU to be configured at deployment time, we also set the kernel parameters to apply on a deployed machine:
kernel_opts="nomodeset modprobe.blacklist=nouveau,nvidiafb,snd_hda_intel nouveau.blacklist=1 video=vesafb:off,efifb:off intel_iommu=on rd.driver.pre=pci-stub rd.driver.pre=vfio-pci pci-stub.ids=10de:1db4 vfio-pci.ids=10de:1db4 vfio_iommu_type1.allow_unsafe_interrupts=1 vfio-pci.disable_vga=1"
These kernel parameters will:
- Blacklist drivers and disable displays
- Enable IOMMU
- Pre-load kernel modules
- Reserve the GPU's PCI ID (10de:1db4) for GPU passthrough
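The grouping in this list can be made explicit by assembling the string from a bash array, one group per purpose. This is only a sketch: the assignment of each parameter to a group is our reading of the list, the last two VFIO options are not covered by it, and the repeated nouveau.blacklist=1 entry is listed once because repeating it has no additional effect.

```shell
#!/usr/bin/env bash
# Kernel parameters from the example, grouped by purpose.
# The grouping is an interpretation, not something MAAS requires.
params=(
  # Blacklist drivers and disable displays
  nomodeset
  modprobe.blacklist=nouveau,nvidiafb,snd_hda_intel
  nouveau.blacklist=1
  video=vesafb:off,efifb:off
  # Enable IOMMU
  intel_iommu=on
  # Pre-load kernel modules
  rd.driver.pre=pci-stub
  rd.driver.pre=vfio-pci
  # Reserve the PCI ID (10de:1db4) for GPU passthrough
  pci-stub.ids=10de:1db4
  vfio-pci.ids=10de:1db4
  # Remaining VFIO options from the same string
  vfio_iommu_type1.allow_unsafe_interrupts=1
  vfio-pci.disable_vga=1
)

# "${params[*]}" joins the elements with the first character of IFS,
# which is a space by default, giving a single-line value.
kernel_opts="${params[*]}"
printf '%s\n' "$kernel_opts"
```

The resulting single-line value can then be passed as kernel_opts="$kernel_opts" when creating the tag, which avoids accidental line breaks inside the quoted string.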
Putting it together, creating a tag that auto-applies to all machines matching the hardware definition, and that applies the kernel parameters at deployment time, looks like this:
$ maas <username> tags create name=gpgpu-tesla-vi \
  comment="Enable passthrough for Nvidia Tesla V series GPUs on Intel" \
  definition='
    //node[@id="cpu:0"]/capabilities/capability/@id = "vmx"
    and //node[@id="display"]/vendor[contains(.,"NVIDIA")]
    and //node[@id="display"]/description[contains(.,"3D")]
    and //node[@id="display"]/product[contains(.,"Tesla V100 PCIe 16GB")]' \
  kernel_opts="console=tty0 console=ttyS0,115200n8r nomodeset \
modprobe.blacklist=nouveau,nvidiafb,snd_hda_intel \
nouveau.blacklist=1 video=vesafb:off,efifb:off \
intel_iommu=on rd.driver.pre=pci-stub \
rd.driver.pre=vfio-pci pci-stub.ids=10de:1db4 \
vfio-pci.ids=10de:1db4 \
vfio_iommu_type1.allow_unsafe_interrupts=1 \
vfio-pci.disable_vga=1"
Once this tag is created, MAAS automatically applies it to every newly commissioned machine that matches the definition, allowing administrators to configure their homogeneous hardware at scale by defining just a small set of tags.
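After a tagged machine has been deployed, one way to confirm that the parameters reached the kernel is to look for them on the kernel command line, which Linux exposes in /proc/cmdline. The sketch below is an assumption about how one might check this, not a MAAS feature: has_kernel_param is a hypothetical helper, and a sample command line is hard-coded so the snippet runs anywhere.

```shell
#!/usr/bin/env bash
# has_kernel_param CMDLINE PARAM
# Succeeds if PARAM appears as a whole whitespace-separated word in CMDLINE.
has_kernel_param() {
  local -a words
  local word
  read -r -a words <<< "$1"
  for word in "${words[@]}"; do
    [ "$word" = "$2" ] && return 0
  done
  return 1
}

# On a deployed machine, use: cmdline=$(cat /proc/cmdline)
# The sample below is made up for illustration.
cmdline="BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro nomodeset intel_iommu=on vfio-pci.ids=10de:1db4"

for param in intel_iommu=on vfio-pci.ids=10de:1db4 vfio-pci.disable_vga=1; do
  if has_kernel_param "$cmdline" "$param"; then
    echo "present: $param"
  else
    echo "missing: $param"
  fi
done
```

Matching whole words rather than substrings keeps a check for iommu=on from being satisfied by intel_iommu=on.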
For more information, please contact us or visit https://maas.io/docs/tags.