Friday, 23 March 2012

Xen Part 9: PCI Passthrough

One of my requirements was to expose a dedicated graphics card to the guest OS. Before you start this, make sure you have switched over to using xl instead of xm (or you may encounter some surprises with pci-list-assignable-devices).

This is how PCI passthrough is going to work. The xen-pciback driver is loaded and bound to the dedicated graphics card. xen-pciback exposes the graphics card to Xen, and we instruct Xen to offer it to a guest OS.

Loading PCIBack

Debian Wheezy's 3.2.0-1 kernel includes xen-pciback as a module, which means we have to do a little work to load the module and to bind the graphics card to it. First, we load the pciback module:

# modprobe xen_pciback
# echo "xen_pciback" >> /etc/modules

Binding the Graphics Card

Now we have to bind the graphics card to the pciback driver. 

To do this, we have to know the PCI ID of the card in BDF notation. We can find this easily via lspci:

# lspci | grep VGA

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
01:00.0 VGA compatible controller: ATI Technologies Inc Cayman XT [Radeon HD 6970]

In the above example, 01:00.0 is the ID of the graphics card I'm going to pass through to the guest. To convert this to BDF notation, all I need to do is add 0000: to the front, giving 0000:01:00.0. Now that we have the card's BDF ID, we can bind it to xen-pciback.
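As a small sketch of that conversion, the helper below prepends the 0000: domain to a short ID as printed by lspci and leaves an already complete ID alone. The name to_bdf is my own invention (it is not part of Xen or pciutils), and it assumes the card lives in PCI domain 0000, which is the usual case on a single-domain machine.

```shell
# to_bdf: hypothetical helper, not part of Xen or pciutils.
# Turns the short ID printed by lspci (bus:device.function) into the
# full domain:bus:device.function form, assuming PCI domain 0000.
to_bdf () {
        case "$1" in
                *:*:*) printf '%s\n' "$1" ;;        # domain already present
                *)     printf '0000:%s\n' "$1" ;;   # prepend the default domain
        esac
}

to_bdf 01:00.0    # prints 0000:01:00.0
```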

This binding code needs to be added to a startup script and should ideally run before the other xen init scripts. In the following example I have chosen to piggyback off of the /etc/init.d/xencommons init script, but you could just as easily create your own init script and use update-rc.d if you prefer.

# vim /etc/init.d/xencommons

Add this line at the top of the do_start() function:

        init_passthrough

so it looks like this:

do_start () {
        init_passthrough
        echo do_start
        local time=0

Add this function in some clear space, e.g. after the end of the do_stop() function:

init_passthrough () {
        # BDF ID of the card to pass through (change this to yours)
        BDF=0000:01:00.0
        # Unbind a PCI function from its driver as necessary
        [ ! -e /sys/bus/pci/devices/$BDF/driver/unbind ] || \
                echo -n $BDF > /sys/bus/pci/devices/$BDF/driver/unbind
        # Add a new slot to the PCI Backend's list
        echo -n $BDF > /sys/bus/pci/drivers/pciback/new_slot
        # Now that the backend is watching for the slot, bind to it
        echo -n $BDF > /sys/bus/pci/drivers/pciback/bind
}

so it looks like this:

        echo WARNING: Not stopping xenstored, as it cannot be restarted.
}

init_passthrough () {

Don't forget to change the BDF= line so it uses your BDF ID.
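If you would rather keep this in your own init script than edit xencommons, the same steps can be sketched as a standalone function that takes the BDF ID as an argument. This is only a sketch: the name bind_to_pciback and the optional second argument (an alternative sysfs directory, there purely so the function can be dry-run against a scratch directory) are my own additions, and printf is used in place of echo -n for portability.

```shell
# bind_to_pciback: hypothetical standalone variant of init_passthrough.
# $1 is the BDF ID; $2 is the sysfs PCI directory and defaults to the
# real /sys/bus/pci (overridable only so the function can be dry-run).
bind_to_pciback () {
        bdf=$1
        pci=${2:-/sys/bus/pci}

        # Give up early if xen-pciback has not registered its driver
        if [ ! -d "$pci/drivers/pciback" ]; then
                echo "pciback driver not found under $pci" >&2
                return 1
        fi

        # Unbind the device from its current driver, if it has one
        if [ -e "$pci/devices/$bdf/driver/unbind" ]; then
                printf '%s' "$bdf" > "$pci/devices/$bdf/driver/unbind"
        fi

        # Add the slot to pciback's list, then bind the device to it
        printf '%s' "$bdf" > "$pci/drivers/pciback/new_slot" &&
        printf '%s' "$bdf" > "$pci/drivers/pciback/bind"
}
```

From an init script running as root you would call it as bind_to_pciback 0000:01:00.0, after the xen_pciback module has been loaded.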

Now that's done, reboot your machine, and check the card is showing as available for passthrough:

# xl pci-list-assignable-devices

Hopefully you got a line of output which matches your BDF ID (0000:01:00.0 in my case). If not, check the pciback module is loaded:

# lsmod | grep pciback && echo loaded || echo "not loaded"

double-check it is actually working:

# [ -e /sys/bus/pci/drivers/pciback/ ] && echo working || echo "not working"

and check your graphics card is bound:

# BDF=0000:XX:XX.X    # enter your BDF ID
# [ -e /sys/bus/pci/drivers/pciback/$BDF ] && echo bound || echo "not bound"
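
The two sysfs checks can also be rolled into one function, so a single call tells you how far the binding got. This is a rough sketch rather than anything shipped with Xen: check_pciback is a made-up name, and the optional second argument only exists so the function can be pointed at a scratch directory instead of the real /sys/bus/pci.

```shell
# check_pciback: hypothetical helper combining the two sysfs checks.
# $1 is the BDF ID; $2 is the sysfs PCI directory (default /sys/bus/pci).
check_pciback () {
        bdf=$1
        pci=${2:-/sys/bus/pci}

        # Is the pciback driver registered at all?
        if [ ! -d "$pci/drivers/pciback" ]; then
                echo "pciback driver not registered"
                return 1
        fi
        # Is our device listed under the pciback driver?
        if [ ! -e "$pci/drivers/pciback/$bdf" ]; then
                echo "$bdf not bound"
                return 1
        fi
        echo "$bdf bound"
}
```

Run it as check_pciback 0000:01:00.0, substituting your own BDF ID.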

Cede the Device to the Guest

All that remains is to expose the device to the guest. Start the guest, open its console, and run lspci to see what it has before the device is attached:

# xl create /etc/xen/ace2x1
# xl console ace2x1
root@ace2x1:~# lspci

On dom0, issue the passthrough command with your BDF ID:

# xl pci-attach ace2x1 0000:01:00.0

See if dom0 thinks passthrough worked:

# grep pciback /var/log/messages | tail -n 1
Mar 18 22:44:26 ace2 kernel: [ 3703.267343] xen-pciback: vpci: 0000:01:00.0: assign to virtual slot 0
# xl pci-list ace2x1
Vdev Device
00.0 0000:01:00.0
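
If you want to script that last check, a filter along these lines should do. It is a sketch that assumes xl pci-list keeps the two-column layout shown above (a header line, then one Vdev/Device pair per line); is_attached is a name I made up.

```shell
# is_attached: hypothetical helper. Reads `xl pci-list` output on stdin
# and succeeds if the given BDF ID appears in the Device column.
is_attached () {
        awk -v bdf="$1" '
                NR > 1 && $2 == bdf { found = 1 }   # skip the header line
                END { exit !found }                 # exit 0 only on a match
        '
}
```

Usage on dom0 would be something like: xl pci-list ace2x1 | is_attached 0000:01:00.0 && echo attached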

Back on the guest, verify that the device can be seen:

root@ace2x1:~# lspci
00:00.0 VGA compatible controller: ATI Technologies Inc Cayman XT [AMD Radeon HD 6900 Series]

If you see errors like these in dmesg, as per this bug:

pci 0000:00:00.0: address space collision: [mem
0xfa8f8000-0xfa8fffff 64bit] conflicts with System RAM [mem
pcifront pci-0: Could not claim resource 0000:00:01.0/4! Device
offline. Try giving less than 4GB to domain.

Then reduce the RAM assigned to the domU to less than 4GB. I haven't had time to find a proper workaround for this just yet.

Finishing Off

Once you're happy that everything's working, set things up permanently by adding the BDF ID (this time without the 0000: prefix) to the domU's config file.

# vim /etc/xen/ace2x1
pci = [ '01:00.0' ]

I recommend you shut down the guest and start it again. Verify that your device shows under lspci and that the dmesg logs look healthy.

root@ace2x1:~# lspci
00:00.0 VGA compatible controller: ATI Technologies Inc Cayman XT [AMD Radeon HD 6900 Series]

root@ace2x1:~# dmesg | grep -E "(pci.*0000|pcifront)"
[    6.003747] pcifront pci-0: Installing PCI frontend
[    6.003861] pcifront pci-0: Creating PCI Frontend Bus 0000:00
[    6.004351] pci 0000:00:00.0: [1002:6718] type 0 class 0x000300
[    6.004577] pci 0000:00:00.0: reg 10: [mem 0xc0000000-0xcfffffff 64bit pref]
[    6.004744] pci 0000:00:00.0: reg 18: [mem 0xfbe20000-0xfbe3ffff 64bit]
[    6.004860] pci 0000:00:00.0: reg 20: [io  0xe000-0xe0ff]
[    6.005522] pci 0000:00:00.0: supports D1 D2
[    6.007309] pcifront pci-0: claiming resource 0000:00:00.0/0
[    6.007313] pcifront pci-0: claiming resource 0000:00:00.0/2
[    6.007316] pcifront pci-0: claiming resource 0000:00:00.0/4