Idling Efficiently on Linux*: A Case Study

BY Joe Konno ON Aug 09, 2019

Introduction

This article presents a practical case study of how I got a reference platform to idle deeply on Clear Linux OS. I will go over some tools, Linux kernel debug features, and tricks of the trade I used along the way. This case study is meant to provide first-pass guidance for integrators trying to increase energy efficiency on their Intel platform—running Linux—during idle. I assume that you, the reader, have a high level of Linux expertise, are comfortable running low-level commands as a superuser, and understand that over the course of debugging, data loss or system damage can occur.

Idle and sleep are the parts of power management concerned with device activity and energy consumption when the device is not in active use. This article is concerned with the idle part: when the user has stopped interacting with the system for a short period of time, typically less than 15 minutes. For cases where the user stows a device for an extended period—more than 15 minutes, perhaps many hours—the device is expected to enter a sleep state. Whether the device is idle or sleeping, the device is expected to consume minimal energy. These expectations from you, as the end-user, combined with new regulations, such as California’s new energy regulations for compute devices, are applying downward pressure on how much energy your platform can consume while idle or sleeping. For the sake of your battery life and your local power grid’s capacity, effective idle power management is needed to meet consumer and regulatory requirements.

In order to talk about idle and sleep on Intel platforms running Linux operating systems, like Clear Linux OS, we need to know about S0ix, Intel’s modern approach to platform power management. Briefly, S0ix is shorthand for S0 Idle States. Device manufacturers can build in support for these S0 Idle States, which allow for increased energy efficiency during idle and sleep, without the latency associated with traditional ACPI S3[1].

My colleague, Wendy Wang, published an article about S0ix on Linux in October 2018. If you are unfamiliar with S0ix, I highly recommend you give Wendy’s article a read. Another of my colleagues, Rajneesh Bhardwaj, also posted an article about S0ix debug on Linux in June 2019. Both articles are worthy of your time, so I encourage you to read them.

Please note the following: S0ix is fundamentally a hardware technology, so you must have a device that supports S0ix if you want to use its features.

Case Study: Clear Linux OS on a Laptop

I started working for the Clear Linux Project a couple of months ago. Power management is a passion of mine, so I wanted to see where Clear Linux OS stood in terms of idle power management. Within minutes I knew there was work to do, and I set out to make some changes. This is my account of what I did, how, and why.

The reference platform for this case study is a Dell* XPS 13 9360 laptop. My sole peripheral is Dell’s power adapter for the device. The platform was running Clear Linux OS Desktop. I picked this platform because it supports S0ix (Wendy’s article provides instructions for how to discover this) and has Intel integrated graphics and a PSR (panel self-refresh) display. The PSR display was important to me because I wanted to be able to idle deeply with the display on.

The objective: ensure the laptop, during idle, resides in CPU-LPI (low power idle) with Clear Linux OS “out of the box”.

The First Pass

My first task involved determining how deeply my reference platform could idle while running Clear Linux OS. I booted to the Clear Linux desktop, installed the linux-tools bundle, launched gnome-terminal, put it in full-screen mode, and ran the following:

$ sudo turbostat -q -i 60 -n 3 -S -s \
       GFX%rc6,Pkg%pc2,Pkg%pc8,Pkg%pc9,CPU%LPI,SYS%LPI

Turbostat is an excellent first test because it can narrow the field considerably with minimal effort. Each column (an argument to -s) provides residency data as a percentage of elapsed time. I selected column GFX%rc6 to determine residency for graphics idle (render c-states). I also selected a few package c-states (PkgCstates) to see how deeply the package was able to idle. Finally, I wanted to see if the package could idle in its deepest idle state (CPU%LPI) or if the system could idle in its deepest S0 Idle state (SYS%LPI).
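
To make this check repeatable, the residency columns can be read by a small script. The following is a minimal sketch, assuming the turbostat table has been captured as text (a header row of column names followed by rows of numbers). The function names column_values and has_residency are hypothetical helpers, not part of turbostat.

```shell
# column_values NAME: read turbostat-style output on stdin (a header row of
# column names followed by rows of numbers) and print NAME's value per row.
column_values() {
    awk -v col="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        idx && $idx != col { print $idx }   # skip any repeated header rows
    '
}

# has_residency NAME: succeed if any sampled row shows non-zero residency.
has_residency() {
    column_values "$1" | awk '$1 > 0 { found = 1 } END { exit !found }'
}
```

One way to use it is to write the table to a file with turbostat's --out option and then run has_residency 'CPU%LPI' < idle.txt, where idle.txt is a file name chosen for illustration.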

The results were interesting. I got well over 90% residency in GFX%rc6, which was promising. I got some Pkg%pc8 residency, but zero Pkg%pc9 and zero CPU%LPI residency. Graphics rc6 residency is a key consideration for deep idle, so with a PSR display and high rc6 residency, I would expect the package to be idling in CPU%LPI. So, why was I not getting residency in CPU%LPI?

Next, I ran PowerTop and checked the Tunables tab, looking for Bad settings. Two items were called out:

Bad   Autosuspend for USB Device Touchscreen [ELAN]
Bad   Runtime PM for PCI Device Toshiba America Info Systems Device 0116

If you do not run Clear Linux OS, the list of “bad” tunables will be much longer, due to a Clear Linux OS optimization I will discuss later.
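
The same kind of information can be read straight from sysfs. This is a minimal sketch, assuming the two tunables above correspond to the power/control attribute of the device directories under /sys/bus/usb/devices and /sys/bus/pci/devices, where "on" means runtime power management is disabled and "auto" means it is enabled. The function name list_pm_disabled is a hypothetical helper.

```shell
# list_pm_disabled BUS_DIR: print each device directory under BUS_DIR whose
# power/control attribute is "on" (runtime PM / autosuspend disabled).
list_pm_disabled() {
    for ctl in "$1"/*/power/control; do
        [ -r "$ctl" ] || continue          # no match, or attribute unreadable
        if [ "$(cat "$ctl")" = "on" ]; then
            dirname "$(dirname "$ctl")"    # strip the trailing /power/control
        fi
    done
}
```

For example, list_pm_disabled /sys/bus/usb/devices and list_pm_disabled /sys/bus/pci/devices should roughly mirror the "Bad" entries on the Tunables tab for those two buses, although PowerTop checks other tunables as well.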

I had two areas of concern:

  • rc6 but no CPU%LPI during idle
  • two “bad” tunable devices

Graphics

As my reference platform has Intel graphics (driven by the i915 driver), I did some dmesg grepping[2]. Modern Intel graphics have a loadable firmware component (the DMC) which plays a key role in graphics power management, enough that it’s required in order to achieve deep idle. The Clear Linux OS native/bare-metal kernel builds the i915 driver into the kernel, so the loadable firmware has to be available to the driver early, before the root filesystem is mounted[3]. This can also be verified through an i915 debugfs node, specifically the fw loaded: field:

$ sudo cat /sys/kernel/debug/dri/0/i915_dmc_info

At the time, Clear Linux OS only used an initrd (initial ramdisk) when the root filesystem was encrypted, and said initrd did not have the DMC firmware files included. So, in release 29760, my colleagues and I introduced general initrds into the Clear Linux OS native boot flow so DMC loadable firmware files were available to the i915 device driver at the right time.

So, i915 was now able to load DMC firmware reliably during early boot and do its own power management. This was verified through dmesg as well as debugfs:

$ sudo cat /sys/kernel/debug/dri/0/i915_dmc_info
fw loaded: yes
path: i915/kbl_dmc_ver1_04.bin
version: 1.4
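
For a scripted check, the debugfs text can be parsed rather than read by eye. This is a minimal sketch, assuming the "key: value" layout shown above. The function names dmc_loaded and dmc_field are hypothetical helpers.

```shell
# dmc_loaded: read i915_dmc_info text on stdin; succeed only if it
# contains the line "fw loaded: yes".
dmc_loaded() {
    grep -q '^fw loaded: yes$'
}

# dmc_field KEY: read i915_dmc_info text on stdin and print the value
# that follows "KEY: " (for example "path" or "version").
dmc_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}
```

The text can be piped in from the debugfs node, for example sudo cat /sys/kernel/debug/dri/0/i915_dmc_info | dmc_loaded.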

How did we fare? I re-ran my initial check with turbostat... alas, turbostat reported the platform was still stuck at PkgC8.

I decided to re-run my turbostat initial check with the display off. I went into the Gnome* system settings, into the Power panel, and reduced the Blank Screen time to 1 minute. I let the initial check run. After 3 minutes passed, I woke the display back up and checked turbostat’s output: still no CPU%LPI.

Then I ran sudo powertop --auto-tune (remember those two “bad” list items?) and my turbostat checks again, display on and display off. Still no CPU%LPI, regardless of the display’s state. I started a discussion with my colleagues through another bug filing.

I could verify i915 was loading its DMC firmware, but I had made no apparent progress on the CPU%LPI residency front. Past experience had me thinking about bad software actors in play: device drivers, system services, or otherwise. So, it was time to do the tried and true power management debug exercise: “clear the deck”, and re-introduce system services one by one.

I made sure weston was installed, so I could start building up from a minimal graphical environment once the deck was clear. There is some setup to do, particularly in the “Running Weston” section, so a typical user can invoke weston-launch.

$ sudo swupd bundle-add weston weston-extras

The first thing I did was disable GDM, the Gnome Display Manager. This ensured I would get a text console and no graphical environment and, most importantly, no Gnome services:

$ sudo systemctl disable gdm

Then I rebooted. I logged into my text console, and noted all running system.slice services:

$ sudo systemctl status
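
Keeping a record of what is running makes it easier to compare later boots against this cleared-deck baseline. This is a minimal sketch, assuming the input comes from systemctl list-units --type=service --state=running --no-legend --plain, which prints one unit per line with the unit name in the first column. The function names service_names and new_services are hypothetical helpers.

```shell
# service_names: read `systemctl list-units --no-legend --plain` output on
# stdin and print the unit name (first column) of each line, sorted.
service_names() {
    awk 'NF { print $1 }' | sort
}

# new_services BEFORE AFTER: given two sorted files of unit names, print
# the units that appear only in AFTER.
new_services() {
    comm -13 "$1" "$2"
}
```

Saving one snapshot from the text console and another from the graphical session, then running new_services on the two files, lists only the services the graphical session added.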

Now that the deck was clear of services, it was time to clear the “bad” tunables. So, I ran:

$ sudo powertop --auto-tune

Then, I ran my initial turbostat check once more, and I saw something very interesting (first line only):

GFX%rc6     Pkg%pc2     Pkg%pc8     Pkg%pc9     CPU%LPI     SYS%LPI
100.00      2.09        17.67       0.00        72.70       0.00

I was now seeing CPU%LPI, meaning that the CPU package, with the display on, was able to idle in its deepest idle state. So, this narrowed the universe of possibilities to software or services launched between a text console and the Gnome display manager.
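
It helps to be clear about what a figure such as 72.70 means: the time spent in the state divided by the length of the sampling interval. This is a minimal sketch of that arithmetic, assuming a residency counter that reports cumulative microseconds, such as /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us on kernels that expose it. The function name lpi_percent is a hypothetical helper.

```shell
# lpi_percent START_US END_US INTERVAL_S: residency as a percentage of the
# sampling interval, from two readings of a microsecond residency counter.
lpi_percent() {
    awk -v a="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", (b - a) / (s * 1000000) * 100 }'
}
```

For instance, a counter that advances by 43,620,000 microseconds over a 60-second interval corresponds to 72.70% residency, the value in the CPU%LPI column above.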

Now it was time to start introducing system services. Time to start weston.

$ weston-launch

I then launched weston-terminal, put it into full-screen mode, and re-ran my turbostat check. I got some interesting results (first line only):

GFX%rc6     Pkg%pc2     Pkg%pc8     Pkg%pc9     CPU%LPI     SYS%LPI
99.98       1.33        21.03       0.00        74.67       0.00

This exercise established that I could, with a minimal graphical environment, achieve CPU%LPI residency with the display on. Recall that, under Gnome, running powertop --auto-tune did not translate to CPU%LPI residency, so it’s hard to blame a device driver at this juncture. Further, the “clear the deck” exercise seemed to indicate something within the Gnome services ecosystem was at fault.

Gnome

The weston exercise showed I could get CPU%LPI residency in a minimal graphical environment. Still not done yet, but all activities to this point have brought the spotlight on the Gnome services ecosystem. It was time to start digging.

For the next round of investigation, I re-enabled GDM:

$ sudo systemctl enable gdm

And rebooted. Once I got the graphical login prompt, I elected to log into a weston desktop. I then launched weston-terminal and full-screened it. Then:

$ sudo powertop --auto-tune

Followed by our turbostat check, which yielded (first line only):

GFX%rc6     Pkg%pc2     Pkg%pc8     Pkg%pc9     CPU%LPI     SYS%LPI
99.98       18.52       75.23       0.00        0.00        0.00

This shows that logging into a weston desktop via the Gnome display manager affected how my reference platform idled. When I started weston with weston-launch after “clearing the deck”, I saw CPU%LPI residency. Now there was none. This is where it is appropriate to scrutinize all running system services.

$ sudo systemctl status
* localhost
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Fri 2019-06-28 14:59:41 PDT; 5min ago
   CGroup: /
            ├─user.slice
            │ └─user-1000.slice
            │   ├─user@1000.service
            │   │ ├─gvfs-daemon.service
            │   │ │ ├─741 /usr/libexec/gvfsd
            │   │ │ └─746 /usr/libexec/gvfsd-fuse /run/user/1000/gvfs -f -o big_writes
            │   │ ├─init.scope
            │   │ │ ├─703 /usr/lib/systemd/systemd --user
            │   │ │ └─705 (sd-pam)
            │   │ └─dbus.service
            │   │   └─722 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
            │   └─session-2.scope
            │     ├─689 gdm-session-worker [pam/gdm-password]
            │     ├─720 /usr/libexec/gdm-wayland-session weston
            │     ├─724 weston
            │     ├─726 /usr/libexec/weston-keyboard
            │     ├─727 /usr/libexec/weston-desktop-shell
            │     ├─728 /usr/bin/weston-terminal
            │     ├─729 /bin/bash
            │     ├─767 sudo bash
            │     ├─768 bash
            │     └─836 systemctl status
            ├─init.scope
            │ └─1 /usr/lib/systemd/systemd
            └─system.slice
              ├─clr_debug_fuse.service
              │ ├─393 /usr/bin/clr_debug_fuse
              │ └─527 /usr/bin/clr_debug_fuse
              ├─pacdiscovery.service
              │ └─458 /usr/sbin/pacdiscovery
              ├─tallow.service
              │ └─394 /usr/sbin/tallow
              ├─systemd-udevd.service
              │ └─328 /usr/lib/systemd/systemd-udevd --children-max=16
              ├─thermald.service
              │ └─446 /usr/bin/thermald --no-daemon --dbus-enable
              ├─polkit.service
              │ └─455 /usr/lib/polkit-1/polkitd --no-debug
              ├─rtkit-daemon.service
              │ └─593 /usr/libexec/rtkit-daemon
              ├─bluetooth.service
              │ └─611 /usr/libexec/bluetooth/bluetoothd
              ├─accounts-daemon.service
              │ └─482 /usr/libexec/accounts-daemon
              ├─wpa_supplicant.service
              │ └─480 /usr/bin/wpa_supplicant -u
              ├─ModemManager.service
              │ └─629 /usr/bin/ModemManager
              ├─systemd-journald.service
              │ └─326 /usr/lib/systemd/systemd-journald
              ├─NetworkManager.service
              │ └─397 /usr/bin/NetworkManager --no-daemon
              ├─gdm.service
              │ └─465 /usr/bin/gdm
              ├─upower.service
              │ └─585 /usr/libexec/upowerd
              ├─mcelog.service
              │ └─400 /usr/sbin/mcelog --ignorenodev --daemon --foreground
              ├─systemd-resolved.service
              │ └─414 /usr/lib/systemd/systemd-resolved
              ├─dbus.service
              │ └─396 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
              ├─systemd-timesyncd.service
              │ └─392 /usr/lib/systemd/systemd-timesyncd
              └─systemd-logind.service
                └─401 /usr/lib/systemd/systemd-logind

There are a lot more services running than we saw after we cleared the deck earlier on. Which one is at fault? Is more than one at fault? This is how the next round of analysis began.

So, we have GDM, a Weston desktop, an otherwise idle system with its PSR display on, and no CPU%LPI residency. Looking at the system.slice services listed earlier, we have several possible bad actors. I took a quasi-scientific approach, resolving to kill all non-essential services one by one until I saw CPU%LPI residency. Once I saw CPU%LPI residency, I would note the last killed service, reboot into GDM and a weston desktop and kill that service to see if it was the solitary bad actor.
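
The elimination procedure described here can be sketched as a loop. This is a minimal sketch, assuming the caller supplies two commands of their own: one that stops a named service and one that succeeds only when CPU%LPI residency is observed. The function name find_culprit and its argument order are hypothetical.

```shell
# find_culprit CHECK STOP SERVICE...: stop services one at a time with the
# STOP command, running CHECK after each; print the first service whose
# removal makes CHECK succeed. Returns 1 if none does.
find_culprit() {
    check=$1 stop=$2
    shift 2
    for svc in "$@"; do
        "$stop" "$svc" || continue      # could not stop it; try the next one
        if "$check"; then
            printf '%s\n' "$svc"
            return 0
        fi
    done
    return 1
}
```

Because the stops are cumulative, the service this prints is only a suspect. Confirming that it is the sole cause still requires a reboot and stopping that one service alone.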

The first service I killed was accounts-daemon, because I have had issues with it in the past:

$ sudo systemctl stop accounts-daemon

I re-ran turbostat. Still no CPU%LPI. Next I tried killing thermald:

$ sudo systemctl stop thermald

Re-ran turbostat. Nope. Next I tried killing BlueZ:

$ sudo systemctl stop bluetooth

Re-ran turbostat. First line only:

GFX%rc6     Pkg%pc2     Pkg%pc8     Pkg%pc9     CPU%LPI     SYS%LPI
99.98       1.34        12.34       0.00        82.18       0.00

Bluetooth? It was now time to establish whether this was a solitary bad actor.

Bluetooth®

I rebooted, logged into a weston desktop through GDM, launched weston-terminal, full-screened it, and then:

$ sudo powertop --auto-tune

Followed by:

$ sudo systemctl stop bluetooth

And re-ran turbostat. Sure enough. It was a solitary bad actor: Bluetooth. Or was it? Recall earlier that PowerTop identified two "bad" tunable items. By running powertop --auto-tune, we dodged those.

Bad   Autosuspend for USB Device Touchscreen [ELAN]
Bad   Runtime PM for PCI Device Toshiba America Info Systems Device 0116

It was time to reboot (hopefully) one last time. I logged into weston through GDM, launched weston-terminal, full-screened it, and ran turbostat straight away. Lo and behold, I was stuck at PkgC8.

Next, I went into PowerTop, and tabbed to the Tunables tab:

$ sudo powertop

And pressed ENTER on the top-most tunable (USB Device Touchscreen). This flipped the tunable to “Good”. I exited PowerTop, and re-ran turbostat. CPU%LPI residency once more.
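
The same toggle can be applied without the PowerTop interface. This is a minimal sketch, assuming the tunable corresponds to the device's power/control attribute in sysfs, where "auto" lets the kernel runtime-suspend the device when it is idle. The function name set_runtime_pm is a hypothetical helper, and the device directory must be looked up on your own platform.

```shell
# set_runtime_pm DEVICE_DIR: write "auto" to DEVICE_DIR/power/control,
# which allows the kernel to runtime-suspend the device when it is idle.
# Returns non-zero if the attribute is missing or not writable.
set_runtime_pm() {
    [ -w "$1/power/control" ] || return 1
    printf 'auto\n' > "$1/power/control"
}
```

Writing to sysfs requires root, so this needs to run from a root shell. The setting does not persist across reboots.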

Wrap-Up

The objective of this case study was to get my Dell XPS 13 9360 laptop, running Clear Linux OS, to reside in CPU-LPI during idle “out of the box.” Through a few rounds of analysis, three issues were called out:

  • DMC firmware was not loaded by the built-in i915 device driver at boot time
  • The USB Touchscreen held the package at PkgC8 unless autosuspend was enabled
  • Bluetooth held the package at PkgC8

The DMC firmware issue seems to have been resolved as of Clear Linux OS 29760, when initrds were introduced to all native installations. This ensures the built-in i915 device driver is able to load the DMC before the root filesystem is mounted. The DMC firmware plays a role in graphics power management, so for our purposes DMC is required for deep idle.

The USB Touchscreen issue required a line item be added to clr_power, a oneshot service run at system boot. This ensured the device always had USB autosuspend enabled.

Bluetooth, on the other hand, requires more investigation. For now, my advice is to disable Bluetooth if you do not use it. I continue to work with my colleagues with far more BlueZ and Bluetooth knowledge on a reasonable solution.

It is important to note that this exercise is the first pass. Just because you can idle in CPU%LPI does not mean the platform is idling as efficiently as possible, though strides have been made in that direction. To maximize idle energy efficiency, and to make products with long battery life or low power draw, additional steps are needed to optimize the platform’s hardware and software—all of which are outside the scope of this article.

Conclusion

In order to build an energy efficient device, manufacturers have a power management strategy, which requires they scrutinize the entire platform: from CPU and memory, to components and peripherals, as well as the software they intend to run on the device. The case study presented here covers a small part of the software aspect of that strategy, and hopefully provides a window into what occurs when a device manufacturer ensures a device can idle deeply in Linux.

By default, many Linux distributions do not aggressively enable power management features in devices; instead, they err on the side of stability to the detriment of energy efficiency. However, “bad” tunables have the potential to prevent the platform from deeply idling. As an experiment, I invite you to install other Linux distributions on the same platform to see the list of “bad” tunables grow.

In contrast, Clear Linux OS introduced clr-power-tweaks early in its life. This optimization is a simple program, run at boot time, that enables power management features on a known-good list of devices on various buses. This program can certainly be improved, but it raises a larger question: why is such a program needed in the first place? I believe that concerted efforts within the kernel development community can make energy efficiency the default across device drivers, which will improve the end-user experience.

Your analysis of your platform may highlight particular pieces of software—in my platform’s case, it was graphics, Bluetooth, and a touch screen. Bluetooth, as it happens, is an interesting software stack, spanning kernel space, user space, and lots of IPC. BlueZ, the userspace component, seems to pivot on the rfkill (software kill switch) of the Bluetooth controller(s) on the platform. And if BlueZ is on, the Bluetooth controller on my reference platform is on, even without any paired Bluetooth devices. On this platform, Bluetooth being on is enough to keep the package at PkgC8 during idle. I think we can do better.
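
To see where the Bluetooth kill switch currently stands, the rfkill entries in sysfs can be inspected. This is a minimal sketch, assuming the usual /sys/class/rfkill layout in which each rfkillN directory has name, type, and soft attributes, with soft set to 1 when the radio is soft-blocked. The function name bluetooth_rfkill_state is a hypothetical helper.

```shell
# bluetooth_rfkill_state RFKILL_DIR: for each rfkill entry of type
# "bluetooth", print "<name> blocked" or "<name> unblocked" from its
# soft-block flag.
bluetooth_rfkill_state() {
    for dev in "$1"/rfkill*; do
        [ -r "$dev/type" ] || continue
        [ "$(cat "$dev/type")" = "bluetooth" ] || continue
        if [ "$(cat "$dev/soft")" = "1" ]; then
            state=blocked
        else
            state=unblocked
        fi
        printf '%s %s\n' "$(cat "$dev/name")" "$state"
    done
}
```

Running bluetooth_rfkill_state /sys/class/rfkill reports each controller's state. If you do not use Bluetooth, sudo systemctl disable --now bluetooth stops the service and keeps it from starting at boot, and rfkill block bluetooth soft-blocks the radio.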

This case study underscores a much larger problem within the Linux software ecosystem. Power management feels like an afterthought, to the extent that the user has to a) know that they want these features, b) have a nuanced understanding of their device, c) have well-maintained Linux device drivers for their device’s components, d) be prepared to analyze said device, and e) have the wherewithal to optimize their device for energy efficiency. This means you are responsible for the energy efficiency of your device, as major distributions do not prioritize it today. This is problematic for the average end-user. The parties with the wherewithal to improve the Linux ecosystem with respect to energy efficiency are: device manufacturers, Linux developers (kernel or userspace), and Linux distributions.

The broader Linux ecosystem has a lot of rough edges with respect to power management. Being a contributor to that ecosystem, it behooves me, and others like me, to make energy efficiency a priority across myriad Linux software stacks. The Linux kernel will need work, as will userspace stacks like BlueZ, or “noisy” applications that do unnecessary work when they should be idle. There is much work to be done, and together we can make energy efficiency the default across the Linux ecosystem.

Call to Action

Hopefully you found this case study helpful, and will use many of the same tools I did (and more) to optimize your platform’s idle energy efficiency. I invite you to make energy efficient behaviors the default across kernel space, user space, and Linux distributions. By increasing energy efficiency throughout the software stack, we will encourage more device manufacturers to build Linux-driven solutions—as a Linux enthusiast, that prospect excites me, and I hope it excites you, too.

Footnotes

[1] In Linux, this is known as the “mem” power state, or the “suspend-to-RAM” flow. The S3 state is described in the latest ACPI Specification.

[2] The linked bug report is for an Intel NUC, which exhibited the same issue. Dmesg output was similar enough to justify a link.

[3] It is worth mentioning that, if the i915 driver is built as a module, the DMC firmware is read directly from the root filesystem.

Disclaimer: The Bluetooth® word mark and logos are registered trademarks owned by Bluetooth SIG, Inc. and any use of such marks by Intel Corporation is under license.