Linux Optimus Setup
I’ve finally obtained a laptop. It is an end-of-life Metabox Prime-S P950EP, based on the Clevo P950EP6. Its sole purpose is mobile development and demonstration of `Edict' (and other associated software) at some meetups around Melbourne; I will continue to use my desktop for almost all day-to-day work.
It includes what has become the standard combination of an Intel GPU, for simple desktop workloads, and an nVidia GPU, for the more resource intensive graphics operations.
Naturally (for me) I’m using Linux for the majority of my work. However, I rapidly encountered the constellation of platform quirks and driver frictions that one hears about so often in the Linux community in relation to nVidia devices.
If your device hangs with a black screen as your graphical environment is starting, and the GPU is disabled, you may need to work around some ACPI bugs.
My particular laptop appears to require that Windows 10 ACPI functionality be disabled. We can do this by adding an option to the kernel’s command line; in my case by appending a line to my default GRUB configuration.
Other laptops require some combination of `Windows 2009', `Windows 2013', `Windows 2015', or the negation of some or all of these. Unfortunately, the only way to discover the correct combination may be through brute force.
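The option in question is the kernel's acpi_osi parameter, and `Windows 2015' is the string firmware uses for Windows 10. As a rough sketch only, assuming GRUB 2 with its defaults in /etc/default/grub, and with quiet standing in for whatever options are already on the line, the entry might look like this:

```shell
# /etc/default/grub (fragment; grub-mkconfig sources this file as shell)
# "quiet" is a placeholder for any options already present.
# The leading "!" removes the "Windows 2015" (Windows 10) string from
# the list the kernel reports to the firmware.
GRUB_CMDLINE_LINUX_DEFAULT="quiet acpi_osi=\"!Windows 2015\""
```

After editing, the GRUB configuration needs regenerating (on many distributions with grub-mkconfig -o /boot/grub/grub.cfg, though the output path varies) before a reboot picks up the change.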
The most commonly cited mechanism for switching between GPUs on Linux is a combination of Bumblebee and bbswitch.
Bumblebee provides a method to execute an application using a hidden X server (for exclusive use of the nVidia driver), and an environment that is modified to promote nVidia’s libGL.so (and friends) above the default Intel installation.
bbswitch is a kernel module that provides a robust mechanism to power down (and up) the nVidia GPU, and manage the loading and unloading of the nvidia kernel module.
However, given that this appears to trigger ACPI-related system hangs on newer systems, the better option appears to be avoiding the use of bbswitch altogether. Instead we can rely on the kernel’s default PCIe power management facilities.
A blacklist entry for the nvidia module in the modprobe configuration prevents it from being automatically loaded at boot, but does not prevent it from being loaded manually, as the module_blacklist kernel parameter does.
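A modprobe blacklist entry has exactly this behaviour. A minimal sketch follows; the file name nvidia-blacklist.conf is a placeholder (modprobe reads any .conf file under /etc/modprobe.d), and the helper function is purely illustrative:

```shell
# Print a modprobe.d fragment that blacklists the nvidia module.
blacklist_fragment() {
    cat <<'EOF'
# Stop nvidia being loaded automatically at boot.
# An explicit "modprobe nvidia" still works.
blacklist nvidia
EOF
}
```

Run as root, blacklist_fragment > /etc/modprobe.d/nvidia-blacklist.conf would put it in place.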
Bumblebee requires that the PMMethod directive be set to none so as to avoid the use of bbswitch. It will instead default to the kernel’s power management system.
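A sketch of the PMMethod setting, assuming the configuration lives at /etc/bumblebee/bumblebee.conf and that the nvidia driver section is the one in use; the helper function is only a convenient way to show the text:

```shell
# Print the bumblebee.conf lines that switch off bbswitch-based
# power management for the nvidia driver.
bumblebee_pm_fragment() {
    cat <<'EOF'
[driver-nvidia]
# Leave power management to the kernel rather than bbswitch.
PMMethod=none
EOF
}
```

The PMMethod line should replace the existing value in that section rather than be appended as a second section.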
The kernel will only power down the device when the driver is unloaded, so we also require
Alas, while the driver was not loaded at the point my greeter was displayed, it was loaded at some point while XFCE was starting up.
After evaluating some overkill solutions to answering the `who loaded the module' question via systemtap, I instead used a technique usually used for blacklisting the module.
The install directive will execute the listed command instead of loading the module.
Instead of actually loading the module we’ll dump a list of all running processes in the system. With a bit of luck one might see a likely candidate.
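A sketch of what such a rule could look like. The log path /tmp/nvidia-load.log and the choice of ps -ef are assumptions for illustration; modprobe hands the command to a shell, so the redirection should work as written:

```shell
# Print a modprobe.d rule that logs the process list instead of
# loading the nvidia module.
trace_fragment() {
    cat <<'EOF'
# Temporary tracing rule: remove it once the culprit is found.
install nvidia /bin/ps -ef >> /tmp/nvidia-load.log
EOF
}
```

Written to a file such as /etc/modprobe.d/nvidia-trace.conf (a hypothetical name), the rule appends a snapshot of the running processes to the log each time something asks for the module.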
In our case the likely offender was nvidia-settings, which is a sufficiently unique name to just grep `/etc' and come out with a call in `/etc/X11/xinit/xinitrc.d/95-nvidia-settings'. It’s extraordinarily easy to accidentally trigger nVidia binaries/libraries into loading the kernel module, so anything related is a good candidate.
The 95-nvidia-settings script belongs to the nvidia-drivers package, which we obviously can’t remove. But we can disable it by removing execute permissions from the script (and thus punt the problem back to our future selves when we next reinstall the driver and undo our changes).
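Something along these lines, run as root, removes the execute permission. The function name is hypothetical; the path is the one the grep turned up:

```shell
# Remove execute permission for user, group and other.
disable_script() {
    chmod a-x "$1"
}
```

For this laptop that would be disable_script /etc/X11/xinit/xinitrc.d/95-nvidia-settings.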
Now that the driver is likely to be unloaded by default we can set the PCIe bus to automatically power down when idle.
echo "auto" > /sys/bus/pci/devices/0000:01:00.0/power/control
An easy method for this is to use something like powertop --auto-tune, or automate it via a `Laptop Mode Tools' rule, or via `systemd'.
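Whichever route is taken, the result can be checked by reading the device's power attributes back from sysfs. A small sketch, assuming the GPU sits at 0000:01:00.0; the function name and the optional sysfs root argument (there only to allow a dry run against a scratch directory) are hypothetical:

```shell
# Report the runtime power-management settings of one PCI device.
#   $1: PCI address, e.g. 0000:01:00.0
#   $2: sysfs root (optional, defaults to /sys)
pci_pm_report() {
    pm="${2:-/sys}/bus/pci/devices/$1/power"
    printf 'control=%s status=%s\n' \
        "$(cat "$pm/control")" "$(cat "$pm/runtime_status")"
}
```

Calling pci_pm_report 0000:01:00.0 should show control=auto; a status of suspended would suggest the kernel has powered the device down.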
To verify we’ve got the correct behaviour after all this we reboot, login and then check:
- The GPU fan isn’t overly loud, and
- lsmod | grep nvidia does not report any loaded modules, and
- optirun glxinfo | grep NVIDIA reports the vendor is some variant of `NVIDIA'
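For scripting the module check, here is a sketch with a hypothetical helper that reads lsmod output on stdin and sets its exit status accordingly:

```shell
# Succeed if the lsmod output on stdin lists a module whose name
# starts with "nvidia" (nvidia, nvidia_drm, nvidia_modeset, ...).
# The first line of lsmod output is a header and is skipped.
nvidia_module_loaded() {
    awk 'NR > 1 && $1 ~ /^nvidia/ { found = 1 } END { exit !found }'
}
```

Used as lsmod | nvidia_module_loaded && echo "nvidia is still loaded", it stays silent when the driver is unloaded.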
I hope this helps someone avoid a goodly number of painful days rebooting their system.