Linux* S0ix Troubleshooting
Introduction
The S0ix system sleep states combine the low latency of CPU idle power states with the low power of system-wide sleep states. They also provide a rich set of wakeup sources, which are not available in deeper system sleep states, such as ACPI S3. Due to these advantages, S0ix is being positioned to replace S3 in the future. For more information on S0ix states in Linux*, refer to "How to achieve S0ix states in Linux" (see References).
Linux can enter S0ix states through two paths. The first path is via system-wide suspend to idle, referred to as s2idle. This is very similar to the path Linux uses to suspend to ACPI S3. The second path is opportunistic idle. Reaching S0ix via opportunistic idle is very rare since common device states (such as display-on) and system activity often prevent hardware from sleeping so deeply. This document focuses on the first system-wide s2idle path to S0ix, which is a more fundamental power-saving feature. If system-wide s2idle doesn't reach S0ix, it is less likely for opportunistic idle to reach S0ix.
The downside of S0ix states is that achieving them requires the cooperation of many devices in the system, any of which can prevent high S0ix residency or block entry into S0ix altogether.
S0ix works when the system suspends and resumes correctly, and when the value stored in /sys/kernel/debug/pmc_core/slp_s0_residency_usec is greater than zero and less than the total system suspend time. If that is not the case, the tips in this document can help to identify the problem and possibly to correct it.
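The pass criterion above can be turned into a quick check. The following is a minimal sketch rather than part of the original procedure: the helper name s0ix_residency_ok is hypothetical, and it assumes the counter accumulates across suspend cycles, so it compares the difference between two readings against the suspend duration.

```shell
#!/bin/sh
# Hypothetical helper: decide whether an S0ix residency delta looks sane.
#   $1 = slp_s0_residency_usec read before suspend
#   $2 = slp_s0_residency_usec read after resume
#   $3 = total suspend time in seconds
# Succeeds when the delta is greater than zero and less than the suspend time.
s0ix_residency_ok() {
    delta=$(( $2 - $1 ))
    total_usec=$(( $3 * 1000000 ))
    [ "$delta" -gt 0 ] && [ "$delta" -lt "$total_usec" ]
}

# Example use on a real system (run as root; not executed here):
#   f=/sys/kernel/debug/pmc_core/slp_s0_residency_usec
#   before=$(cat "$f"); rtcwake -m freeze -s 30; after=$(cat "$f")
#   s0ix_residency_ok "$before" "$after" 30 && echo "S0ix reached"
```

A delta of zero points at an S0ix blocker; the later chapters of this document cover how to narrow it down.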
This document also provides step-by-step Linux S0ix troubleshooting tips to help users identify and remove Linux S0ix blockers.
The flow-chart in the next section illustrates the appropriate troubleshooting steps for each of the five chapters of this document.
S2idle entry and exit failure
For a Linux system to achieve satisfactory residency in S0ix, it first must be able to successfully enter and exit the S2idle state. In the flowchart below, the orange blocks summarize basic troubleshooting tips to understand how to analyze S2idle entry and exit failures.
We summarized four cases based on different S2idle entry or exit failure symptoms. Compare your case scenario to find the recommended troubleshooting methods. Be sure to use the latest versions of the Linux kernel and BIOS.
Case 1. S2idle suspend failure
In Linux, the commands to place the system into Suspend to idle (S2idle) are:
~# echo s2idle > /sys/power/mem_sleep && echo mem > /sys/power/state
or
~# echo freeze > /sys/power/state
If the system fails to suspend after the S2idle command, here are the steps to take:
- Compile the kernel with CONFIG_ACPI_DEBUG=y, append initcall_debug to the kernel command line, and enable power management debug messages before issuing the S2idle command:
~# echo 1 > /sys/power/pm_debug_messages
Check the Linux kernel dmesg log for any device driver failure, error, timeout, call trace, Bug, or Warning indicating a device driver issue that may prevent the system from entering suspend. As an alternative, you may run the sleepgraph tool to gather S2idle ftrace and dmesg logs with the following command example:
~# sleepgraph.py -m freeze -multi 3 5 -rtcwake 15 -dev
For more information about the sleepgraph tool, refer to: https://01.org/pm-graph/
- Disable pm_async to see where the suspend procedure stops:
~# echo 0 > /sys/power/pm_async
Check the Linux kernel dmesg log for any device driver failures between PM: suspend entry (s2idle) and PM: suspend exit:
PM: suspend entry (s2idle)
. . .
PM: suspend exit
- Use pm_test to determine which mode breaks the S2idle entry. For example, check which pm_test modes are supported:
~# cat /sys/power/pm_test
[none] core processors platform devices freezer
Change the value of the /sys/power/pm_test variable from "none" to "freezer" using the following commands:
~# echo freezer > /sys/power/pm_test
~# echo freeze > /sys/power/state
For more information about pm_test, refer to https://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt.
If a device driver fails to suspend, unload the driver and then execute the following commands to confirm that S2idle works without it:
~# grep . /sys/power/suspend_stats/*
~# rtcwake -m freeze -s 30
~# grep . /sys/power/suspend_stats/*
Note that this may be a kernel regression; for example, one of the previous versions of the kernel worked, but the most recent one does not. If so, use the git-bisect tool to identify the failing commit: https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html
- Check if runtime suspend has been enabled for any device. If so, disable it with the PowerTop tool.
Launch PowerTop, switch to the Tunables tab and disable runtime suspend for the suspected device (i.e. switch the setting from Good to Bad).
- If an NVMe SSD is suspected, upgrade the NVMe firmware to the latest version available. Or, if practical, use a SATA SSD instead of the NVMe drive. Alternatively, append pcie_aspm=off to the kernel command line to see if that makes any difference.
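When checking the dmesg log between the "PM: suspend entry" and "PM: suspend exit" markers as described above, it can help to cut the log down to that window first. The sketch below is only an illustration: the function names are hypothetical, and the keyword filter is a coarse assumption about what a driver failure looks like, so expect false positives.

```shell
#!/bin/sh
# Hypothetical helpers for narrowing down a dmesg log; names are illustrative.

# Print only the lines from "PM: suspend entry" through "PM: suspend exit".
suspend_window() {
    awk '/PM: suspend entry/ { inside = 1 }
         inside              { print }
         /PM: suspend exit/  { inside = 0 }'
}

# Coarse keyword filter over that window; expect some false positives.
suspend_errors() {
    suspend_window | grep -E '[Ff]ail|[Ee]rror|[Tt]imeout|Call Trace|BUG|WARNING'
}

# Example use on a real system (not executed here):
#   dmesg | suspend_errors
```

Both functions read the log from standard input, so they work equally well on a saved dmesg file.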
Case 2. Devices fail after S2idle exit
The second case is when the system can enter the S2idle state, but some devices fail after S2idle exit. This may not block S0ix directly, but it can be annoying to users or even harmful. Here are some tips for narrowing down issues of this kind.
- Always check the dmesg log for driver failures, with “initcall_debug” appended to the kernel command line.
- If a PCIe device is affected, check the ASPM status; in particular, check whether the PCIe ASPM policy has been set to default:
~# cat /sys/module/pcie_aspm/parameters/policy
- If the TLP power management tool is installed, disable it by setting TLP_ENABLE=0 in the TLP configuration file and see if that makes any difference.
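The ASPM policy file marks the active policy with square brackets, in the same way as /sys/power/pm_test. As a small sketch (the function names are hypothetical), the selected entry can be extracted like this:

```shell
#!/bin/sh
# Hypothetical helper: print the bracketed (currently selected) entry of a
# sysfs "choice" line such as "[default] performance powersave".
active_choice() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Succeeds when the selected ASPM policy read from stdin is "default".
aspm_policy_is_default() {
    [ "$(active_choice)" = "default" ]
}

# Example use on a real system (not executed here):
#   aspm_policy_is_default < /sys/module/pcie_aspm/parameters/policy \
#       || echo "ASPM policy is not default"
```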
Case 3. System cannot wake up after S2idle entry
The third case is when the system successfully suspends to idle state, but it fails to wake up from it. The troubleshooting tips for that case are the following:
- Check which ACPI wakeup sources are in use, and whether any other wakeup method works if that source fails:
~# cat /proc/acpi/wakeup
- If a USB device plug/unplug cannot wake up the system from S2idle, try the PowerTop tool (v2.10 or newer), whose WakeUp tab allows the USB device wakeup setting to be changed from “Disable” to “Enable”.
- Try to set up the RTC wake alarm to wake up the system:
~# echo +0 > /sys/class/rtc/rtc0/wakealarm
~# echo +30 > /sys/class/rtc/rtc0/wakealarm
~# echo freeze > /sys/power/state
Or
~# rtcwake -m freeze -s 30
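To see quickly which ACPI wakeup sources are armed, the /proc/acpi/wakeup listing can be filtered. This sketch assumes the usual column layout (Device, S-state, Status, Sysfs node) with the status written as *enabled or *disabled; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list devices marked "*enabled" in /proc/acpi/wakeup
# style input (columns: Device, S-state, Status, Sysfs node).
enabled_wakeup_devices() {
    awk 'NR > 1 && $3 == "*enabled" { print $1 }'
}

# Example use on a real system (not executed here):
#   enabled_wakeup_devices < /proc/acpi/wakeup
```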
Case 4. Unexpected S2idle wake up
The fourth case is when the system wakes up from S2idle unexpectedly, which may be difficult to diagnose.
The unexpected wake up events are usually invisible to end users, but there is a kernel debug patch that can help to print the suspected wakeup IRQ.
- Apply the patch shown below against the test kernel and enable power management debug messages through the command:
~# echo 1 > /sys/power/pm_debug_messages
Reproduce the issue, capture the dmesg log, and submit a report with the dmesg log attached.
From fd29ab5871f6089f28f5181923aea82bd0d61903 Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang@intel.com>
Date: Mon, 10 Jun 2019 15:08:16 +0800
Subject: [PATCH] pm/wakeup: export wakeup reason

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/base/power/wakeup.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index 5b2b6a0..30f6d95 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -852,6 +852,12 @@ bool pm_wakeup_pending(void)
 
 void pm_system_wakeup(void)
 {
+	if (pm_suspend_target_state != PM_SUSPEND_ON) {
+		/* dump the call trace of wakeup events during suspend */
+		if (pm_debug_messages_on)
+			dump_stack();
+	}
+
 	atomic_inc(&pm_abort_suspend);
 	s2idle_wake();
 }
@@ -864,6 +870,9 @@ void pm_system_cancel_wakeup(void)
 
 void pm_wakeup_clear(bool reset)
 {
+	if (pm_wakeup_irq)
+		pm_pr_dbg("Previous wakeup irq %d cleared\n", pm_wakeup_irq);
+
 	pm_wakeup_irq = 0;
 	if (reset)
 		atomic_set(&pm_abort_suspend, 0);
-- 
2.7.4
- On some old platforms, the embedded controller (EC) generates spurious wakeup events. To check whether the issue is related to the EC, add acpi.ec_no_wakeup=1 to the kernel command line, which prevents the EC from generating wakeup events.
Note: acpi.ec_no_wakeup=1 is just for diagnostics. It should not be regarded as a final solution because it prevents the EC from generating any wakeup events, which may include the power button and lid (for laptops).
Report S2idle bugs
If a broken driver is detected during S2idle entry or exit, verify it by disabling the controller in the platform firmware (BIOS) setup or by unloading the driver via modprobe before repeating the s2idle test. Drivers may also be blacklisted via kernel parameters (e.g. if the intel_ish_ipc driver is suspected, it can be blacklisted by appending modprobe.blacklist=intel_ish_ipc to the kernel command line). If the issue cannot be reproduced without the driver, it is very likely that the driver is the source of it. In that case, report the issue through the kernel Bugzilla against “power management” and “system suspend/hibernation” (attach the kernel dmesg log and the output of lspci -vvv to the bug entry; if ACPI is involved, also attach the output of acpidump and dmidecode from the affected system).
No PC10 residency, no PC2 and PC8 residency
After confirming that the platform generally supports S0ix (refer to how-to-achieve-s0ix-states-linux for details on how to do that), and S2Idle entry and exit are successful, the next key indication before getting S0ix residency is whether the system can achieve PC10 residency.
The commands to check PC10 residency are:
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
Or
~# cat /sys/kernel/debug/pmc_core/package_cstate_show
Note: The pmc_core debug sysfs relies on the intel_pmc_core driver, which requires the kernel to be compiled with CONFIG_INTEL_PMC_CORE=y. Refer to PMC debug for an explanation of how the Intel Core SoC Power Management Controller driver supports low power S0 idle debug features.
When you observe non-zero PC10 residency after S2idle exit, and the PC10 residency keeps increasing following multiple S2idle entry and exit cycles, this means your system can achieve PC10 residency consistently.
The command to read PC10 residency through pmc debug sysfs file is:
~# cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 9144254
Package C3 : 3659722
Package C6 : 66391
Package C7 : 105517
Package C8 : 41277015
Package C9 : 0
Package C10 : 28814142
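Checking that PC10 residency keeps increasing across S2idle cycles amounts to comparing two saved copies of package_cstate_show. The following is a minimal sketch under that assumption; the function names are hypothetical, and the input format is taken from the example output above.

```shell
#!/bin/sh
# Hypothetical helpers for comparing saved package_cstate_show snapshots.

# Print the counter for one package C-state.
#   $1 = state name (e.g. C10), $2 = file holding package_cstate_show output
pkg_cstate_value() {
    awk -v s="$1" '$1 == "Package" && $2 == s { print $NF }' "$2"
}

# Succeeds when the PC10 counter grew between two snapshots.
#   $1 = snapshot before S2idle, $2 = snapshot after resume
pc10_increased() {
    before=$(pkg_cstate_value C10 "$1")
    after=$(pkg_cstate_value C10 "$2")
    # $(( )) accepts both decimal and 0x-prefixed hexadecimal values.
    [ "$(( ${after:-0} ))" -gt "$(( ${before:-0} ))" ]
}

# Example use on a real system (run as root; not executed here):
#   f=/sys/kernel/debug/pmc_core/package_cstate_show
#   cat "$f" > before.txt; rtcwake -m freeze -s 15; cat "$f" > after.txt
#   pc10_increased before.txt after.txt && echo "PC10 residency increased"
```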
If you do not observe PC10 residency or PC2 residency, here are some other things to check.
In the following flow-chart, the orange blocks summarize three key checkpoints to analyze PC2 and PC8 residency failures:
- Any abnormal interrupts?
- Are the graphics in high RC6 residency?
- Are all the CPU Cores in C6 or deeper C-state?
Check any abnormal Interrupts
Execute the commands below to check if any abnormal interrupts are generated before and after S2idle. If you notice that a large number of interrupts occurred, that may block PC2 entry.
~# cat /proc/interrupts
~# rtcwake -m freeze -s 15
~# cat /proc/interrupts
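Comparing the two /proc/interrupts listings by eye is tedious on systems with many CPUs. The sketch below sums the per-CPU counters of each interrupt in two saved snapshots and prints the ones that grew. The function name is hypothetical, and it assumes the standard layout with a CPU header line followed by one line per interrupt.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved /proc/interrupts snapshots.
#   $1 = snapshot taken before S2idle, $2 = snapshot taken after resume
# Prints "<delta> <irq>" for every interrupt whose total count grew,
# largest delta first. The two snapshots must be different files.
irq_deltas() {
    awk -v before="$1" '
        FNR == 1 { ncpu = NF; next }   # header line: one field per CPU
        {
            irq = $1; sub(/:$/, "", irq)
            total = 0
            # Sum the per-CPU counters that follow the IRQ label.
            for (i = 2; i <= ncpu + 1 && i <= NF && $i ~ /^[0-9]+$/; i++)
                total += $i
            if (FILENAME == before)
                base[irq] = total
            else if (total > base[irq])
                print total - base[irq], irq
        }' "$1" "$2" | sort -rn
}

# Example use on a real system (not executed here):
#   cat /proc/interrupts > irq_before.txt
#   rtcwake -m freeze -s 15
#   cat /proc/interrupts > irq_after.txt
#   irq_deltas irq_before.txt irq_after.txt | head
```

What counts as an abnormal number of interrupts depends on the platform, so treat the output as a ranking of suspects rather than a verdict.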
Check graphics RC6 residency
The commands to gather graphics RC6 residency are:
~# cat /sys/class/drm/card0/power/rc6_residency_ms
Or
~# turbostat --show GFX%rc6 rtcwake -m freeze -s 15
If you observe graphics RC6 residency but zero or low PC2 residency, check whether the RC6 residency is low enough (e.g. < 50%) to prevent entry into PC2. Disable the graphics controller in the BIOS setup or append the nomodeset parameter to the kernel command line to disable graphics, and repeat the S2idle steps to check if that makes any difference.
The commands are shown below:
~# rtcwake -m freeze -s 15
~# cat /sys/kernel/debug/pmc_core/package_cstate_show
Or
~# turbostat --show GFX%rc6,Pkg%pc2,Pkg%pc3,Pkg%pc6,Pkg%pc7,Pkg%pc8,Pkg%pc9,Pk%pc10 rtcwake -m freeze -s 15
If you observe higher PC2 residency or deeper Package C-state residency after disabling graphics, then graphics driver bugs should be suspected and reported to freedesktop.org.
Check CPU core C7 residency
Use the latest version of the turbostat tool to gather CPU Core C States. The command is shown below:
~# turbostat --Summary --show CPU%c1,CPU%c6,CPU%c7 rtcwake -m freeze -s 15
If the deepest CPU core C-state (C7) is not achieved, check which cpuidle driver is in use and which idle states the CPUs expose by executing the following commands:
~# cat /sys/devices/system/cpu/cpuidle/current_driver
~# grep . /sys/devices/system/cpu/cpu*/cpuidle/state*/*
If CPU C7 residency is not achieved during S2idle, submit the bug and attach the outputs from the following commands:
~# cat /sys/devices/system/cpu/cpuidle/current_driver
~# grep . /sys/devices/system/cpu/cpu*/cpuidle/state*/*
~# turbostat -o ts.out rtcwake -m freeze -s 15
~# dmesg > dmesg.log
If you observe CPU Core C7 residency, graphics RC6 residency, and no abnormal interrupts during S2idle, but the system remains stuck at the PC2 state, submit the bug in Bugzilla. It is possible that other kernel issues need to be debugged.
No PC10 residency, only PC8 residency
Determine if you have PC8 residency after S2idle exit, but no PC10 residency by running the command shown below, and then compare your output to the example output below:
~# cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 0xcba771c664
Package C3 : 0x878f5e39
Package C6 : 0x1295869d7
Package C7 : 0x0
Package C8 : 0x10fe366620
Package C9 : 0x0
Package C10 : 0x0
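The "PC8 but no PC10" pattern in the example above can also be detected mechanically. This is a sketch only: the function name is hypothetical, and it assumes the "Package Cn : value" layout shown above, with values in either decimal or 0x-prefixed hexadecimal.

```shell
#!/bin/sh
# Hypothetical helper: succeed when package_cstate_show output on stdin
# shows PC8 residency but no PC10 residency.
pc8_without_pc10() {
    pc8=0; pc10=0
    while read -r _pkg name _sep value; do
        case "$name" in
            # $(( )) converts 0x-prefixed hex values to decimal.
            C8)  pc8=$(( $value )) ;;
            C10) pc10=$(( $value )) ;;
        esac
    done
    [ "$pc8" -gt 0 ] && [ "$pc10" -eq 0 ]
}

# Example use on a real system (run as root; not executed here):
#   pc8_without_pc10 < /sys/kernel/debug/pmc_core/package_cstate_show \
#       && echo "Stuck at PC8: check display state, LTR and link PM"
```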
To move from PC8 to PC10, the key requirements include, but are not limited to, the graphics display state, device LTR values, PCIe LPM, and XHCI LPM. See the orange blocks in the flow-chart below for PC10 troubleshooting tips:
Check display state
From the north bridge (NB) perspective, the major additional requirement for moving from PC8 to PC9/PC10 is Display State 9 (DC9).
For the integrated Graphics Controller on Broxton, Gemini Lake, Ice Lake, and newer platforms, check the DC9 state when the display screen is OFF. Usually no DC9 counter is exposed for users to read, so you must check DC9 indirectly through the DC5 counter. By design, the DC5 value is reset after DC9 entry.
Note: Reading the DC5 value has two important prerequisites: the platform display must support the Panel Self Refresh (PSR) feature, and the latest Display Microcontroller (DMC) firmware must be loaded. The following commands, captured on an Ice Lake platform, show how to check whether the DC5 count was reset:
~# cat /sys/kernel/debug/dri/0/i915_edp_psr_status
Sink support: yes [0x03]
PSR mode: PSR2 enabled
Source PSR ctl: enabled [0xc2000016]
Source PSR status: DEEP_SLEEP [0x80030130]
Busy frontbuffer bits: 0x00000000
Frame: PSR2 SU blocks:
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
~# cat /sys/kernel/debug/dri/0/i915_dmc_info
fw loaded: yes
path: i915/icl_dmc_ver1_07.bin
version: 1.7
DC3 -> DC5 count: 125
DC5 -> DC6 count: 54
program base: 0x0b004040
ssp base: 0x00006fc0
htp: 0x00a40088
~# rtcwake -m freeze -s 15
rtcwake: wakeup from "freeze" using /dev/rtc0 at Sat Sep 21 06:54:26 2019
~# cat /sys/kernel/debug/dri/0/i915_dmc_info
fw loaded: yes
path: i915/icl_dmc_ver1_07.bin
version: 1.7
DC3 -> DC5 count: 3
DC5 -> DC6 count: 1
program base: 0x0b004040
ssp base: 0x00006fc0
htp: 0x00a40088
In the example logs shown above, the DC5 count value is smaller than before S2idle entry, which indicates that DC9 was achieved.
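The comparison described above can be scripted by extracting the "DC3 -> DC5 count" field from two captures of i915_dmc_info. This is a minimal sketch with hypothetical function names; it relies only on the field name shown in the example logs.

```shell
#!/bin/sh
# Hypothetical helpers for the DC5 counter check.

# Print the "DC3 -> DC5 count" value from i915_dmc_info output on stdin.
dc5_count() {
    sed -n 's/^[[:space:]]*DC3 -> DC5 count:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Succeeds when the count after resume is smaller than before suspend,
# which the text above interprets as DC9 having been reached.
#   $1 = count before S2idle, $2 = count after resume
dc5_was_reset() {
    [ "$2" -lt "$1" ]
}

# Example use on a real system (run as root; not executed here):
#   f=/sys/kernel/debug/dri/0/i915_dmc_info
#   before=$(dc5_count < "$f"); rtcwake -m freeze -s 15; after=$(dc5_count < "$f")
#   dc5_was_reset "$before" "$after" && echo "DC5 counter reset: DC9 was reached"
```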
If the DC5 value always shows zero during idle, even though the platform display supports Panel Self Refresh (PSR) and the latest DMC firmware is loaded, there may be something wrong with the graphics driver. Please report this graphics bug to freedesktop.
For the integrated graphics card on Coffee Lake, Whiskey Lake, and Comet Lake platforms, the DC6 counter is the key checkpoint for achieving DC9 residency. The DC6 count value is expected to increase after each S2idle entry and exit cycle. If it does not, check the graphics DMC FW load status, or report a bug to freedesktop.
For an external graphics card, there are no DMC FW and DC5/DC6 requirements, but ensure that the external graphics card driver runs properly in the Linux kernel and that there is no graphics driver error in the kernel dmesg log. If you observe any external graphics card driver issues, please report the bug(s) to the third-party graphics card vendor.
Check device LTR value
Latency Tolerance Reporting (LTR) provides the means for devices to dynamically report the amount of delay they can tolerate for access to memory. The LTR value affects Package C-state transitions. If you detect that the LTR value is small, such as when external PCIe or USB devices are attached, use the sysfs file provided by the intel_pmc_core driver to ignore the LTR value, then check for any change in Package C-state residency. The LTR ignore debug method is described in detail in the PMC debug article.
Example commands to ignore the device LTR value are listed below.
First, check which device supports LTR value by executing the following command:
~# cat /sys/kernel/debug/pmc_core/ltr_show
Here is an example of the log:
SOUTHPORT_A LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
SOUTHPORT_B LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
SATA LTR: RAW: 0x9001 Non-Snoop(ns): 0 Snoop(ns): 1048576
GIGABIT_ETHERNET LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
XHCI LTR: RAW: 0x89fc Non-Snoop(ns): 0 Snoop(ns): 520192
Reserved LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
ME LTR: RAW: 0x8000800 Non-Snoop(ns): 0 Snoop(ns): 0
EVA LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
SOUTHPORT_C LTR: RAW: 0x10031003 Non-Snoop(ns): 0 Snoop(ns): 0
HD_AUDIO LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
CNV LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
LPSS LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
SOUTHPORT_D LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
SOUTHPORT_E LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
CAMERA LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
ESPI LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
SCC LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
ISH LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
UFSX2 LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
EMMC LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
WIGIG LTR: RAW: 0x0 Non-Snoop(ns): 0 Snoop(ns): 0
CURRENT_PLATFORM LTR: RAW: 0x40201 Non-Snoop(ns): 0 Snoop(ns): 0
AGGREGATED_SYSTEM LTR: RAW: 0x7fdfeff Non-Snoop(ns): 0 Snoop(ns): 0
Then, check which IP shows the LTR value. If the Non-Snoop or Snoop shows zero, there is no latency requirement. The smaller the value is, the stricter the requirement. You can try to ignore the related LTR value by executing the following command:
~# echo <IP OFFSET> > /sys/kernel/debug/pmc_core/ltr_ignore
Note: For the IP OFFSET mapping, refer to the table below:
IP OFFSET | IP NAME |
---|---|
0 | SPA |
1 | SPB |
2 | SATA |
3 | GBE |
4 | XHCI |
5 | RSVD |
6 | ME |
7 | EVA |
8 | SPC |
9 | Azalia/ADSP |
10 | RSVD |
11 | LPSS |
12 | SPD |
13 | SPE |
14 | Camera |
15 | ESPI |
16 | SCC |
17 | ISH |
18 | UFSX2* |
19 | EMMC* |
20 | WIGIG** |
You may also execute the following script to quickly ignore the LTR values of all devices:
~# for i in {0..20}; do echo $i > /sys/kernel/debug/pmc_core/ltr_ignore; done
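Before ignoring every LTR value, it may be more informative to list only the IPs that currently report a latency requirement. The sketch below filters ltr_show output for entries with a non-zero Non-Snoop or Snoop value; the function name is hypothetical, and the field positions are taken from the example log above.

```shell
#!/bin/sh
# Hypothetical helper: from ltr_show output on stdin, print
# "<IP name> <non-snoop ns> <snoop ns>" for every IP with a non-zero
# latency requirement.
ltr_active() {
    awk '$2 == "LTR:" && ($6 + 0 > 0 || $8 + 0 > 0) { print $1, $6, $8 }'
}

# Example use on a real system (run as root; not executed here):
#   ltr_active < /sys/kernel/debug/pmc_core/ltr_show
```

For the example log above, this reports only SATA and XHCI, which are then the first candidates to try with ltr_ignore.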
Check IP Link PM state
To move from PC8 to PC10 residency, another key requirement is the IP Link Power State. The link power state requirements are:
- PCIe Link is requested to enter L1.1/L1.2 or L2,
- USB 2.0 Link is requested to enter L2,
- USB 3.0 Link is requested to enter U3,
- SATA Link should be in Slumber or DEVSLP.
If additional peripheral devices are connected that may not support D3 state, such as some Thunderbolt™ devices, they may block PCIe Link entering the L1.2 state.
Currently, only the NDA version of the Intel® SoC Watch tool can expose the IP Link Power State. The NDA version of the SoC Watch package is downloaded as part of the Intel® System Studio product; follow the How to get started instructions, then follow the path to the NDA release section for the Intel System Studio product. There, you will be able to download the standalone install package for the NDA SoC Watch package. If you have any questions, please contact your Intel representative.
No S0ix residency, only PC10 residency
If PC10 residency is available, but not S0ix residency, this means the system meets the complex north bridge requirements, but does not meet the south bridge S0ix requirements. Use the PCH troubleshooting tips shown in the orange blocks of the following flow-chart diagram:
Check PCH IP power gating status
When PC10 residency is available, but S0ix residency shows zero after running the following command:
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
Then the first checkpoint is to read PCH IP power gating status by executing the following command:
~# cat /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
Note: The pch_ip_power_gating_status sysfs file lists the power gating status of all devices on the PCH. The following example log from an Ice Lake platform, with comments on the key IP power gating states, is intended to help users understand which devices should be suspected if S0ix is blocked.
PCH IP Name | Power Gating Status | Comment |
---|---|---|
PCH IP: 0 - PMC | State: On | |
PCH IP: 1 - OPI-DMI | State: On | |
PCH IP: 2 - SPI/eSPI | State: Off | |
PCH IP: 3 - XHCI | State: On | # On is expected when reading from the OS. Another way to check XHCI power state is to compile the kernel with |
PCH IP: 4 - SPA | State: Off | # SPA means South Port A; usually SPA maps to PCIe controller 1, and each PCIe controller maps to multiple PCIe root ports. If SPA power gating status “On” is observed, find out which PCIe root port blocks South Port power gating by disabling and enabling the PCIe root ports one by one in the BIOS setup. If disabling a suspected PCIe root port changes the SPA power gating status from “On” to “Off”, report an issue against that PCIe root port. |
PCH IP: 5 - SPB | State: Off | Refer to the comments for SPA |
PCH IP: 6 - SPC | State: Off | Refer to the comments for SPA |
PCH IP: 7 - GBE | State: On | # On is expected when reading from the OS. However, with a Linux kernel older than v5.5-rc1, an enabled PCH LAN controller (e.g. on Intel® Ice Lake or Intel® Comet Lake platforms) may block S0ix entry. Note that the v5.5 kernel does not support S0ix while the Ethernet cable is connected. |
PCH IP: 8 - SATA | State: Off | |
PCH IP: 9 - HDA_PGD0 | State: Off | # HDA means High Definition Audio. If “On” is observed, check whether wake on voice is supported; otherwise “Off” is expected. |
PCH IP: 10 - HDA_PGD1 | State: Off | Refer to the comments for HDA_PGD0 |
PCH IP: 11 - HDA_PGD2 | State: Off | Refer to the comments for HDA_PGD0 |
PCH IP: 12 - HDA_PGD3 | State: Off | Refer to the comments for HDA_PGD0 |
PCH IP: 13 - SPD | State: Off | Refer to the comments for SPA |
PCH IP: 14 - LPSS | State: On | |
PCH IP: 15 - LPC | State: Off | |
PCH IP: 16 - SMB | State: Off | |
PCH IP: 17 - ISH | State: Off | # The Intel sensor hub service subsystem is expected to be power gated for S0ix. The prerequisite is that the latest ISH FW is loaded. |
PCH IP: 18 - P2SB | State: Off | |
PCH IP: 19 - NPK_VNN | State: Off | |
PCH IP: 20 - SDX | State: Off | # The SDx controller is expected to be in D3. If “On” is observed, check the SD card controller Sideband enable status in the BIOS setup. |
PCH IP: 21 - SPE | State: Off | Refer to the comments for SPA |
PCH IP: 22 - Fuse | State: Off | |
PCH IP: 23 - SBR8 | State: Off | |
PCH IP: 24 - CSME_FSC | State: Off | |
PCH IP: 25 - USB3_OTG | State: Off | |
PCH IP: 26 - EXI | State: Off | |
PCH IP: 27 - CSE | State: Off | # The CSME/CSE subsystem is expected to be power gated for S0ix; the prerequisite is that the latest CSMe FW is flashed. |
PCH IP: 28 - CSME_KVM | State: Off | |
PCH IP: 29 - CSME_PMT | State: Off | |
PCH IP: 30 - CSME_CLINK | State: Off | |
PCH IP: 31 - CSME_PTIO | State: Off | |
PCH IP: 32 - CSME_USBR | State: Off | |
PCH IP: 33 - CSME_SUSRAM | State: Off | |
PCH IP: 34 - CSME_SMT1 | State: Off | |
PCH IP: 35 - CSME_SMT4 | State: Off | |
PCH IP: 36 - CSME_SMS2 | State: Off | |
PCH IP: 37 - CSME_SMS1 | State: Off | |
PCH IP: 38 - CSME_RTC | State: Off | |
PCH IP: 39 - CSME_PSF | State: Off | |
PCH IP: 40 - SBR0 | State: On | |
PCH IP: 41 - SBR1 | State: On | |
PCH IP: 42 - SBR2 | State: On | |
PCH IP: 43 - SBR3 | State: Off | |
PCH IP: 44 - SBR4 | State: On | |
PCH IP: 45 - SBR5 | State: On | |
PCH IP: 46 - CSME_PECI | State: Off | |
PCH IP: 47 - PSF1 | State: On | |
PCH IP: 48 - PSF2 | State: On | |
PCH IP: 49 - PSF3 | State: On | |
PCH IP: 50 - PSF4 | State: On | |
PCH IP: 51 - CNVI | State: Off | # Integrated Connectivity (CNVi) is expected to be off for S0ix. If this is “On”, check the network and driver status of CNVi Wi-Fi or CNVi Bluetooth. |
PCH IP: 52 - UFS0 | State: Off | |
PCH IP: 53 - EMMC | State: Off | |
PCH IP: 54 - SPF | State: Off | Refer to the comments for SPA |
PCH IP: 55 - SBR6 | State: On | |
PCH IP: 56 - SBR7 | State: On | |
PCH IP: 57 - NPK_AON | State: On | |
PCH IP: 58 - HDA_PGD4 | State: Off | Refer to the comments for HDA_PGD0 |
PCH IP: 59 - HDA_PGD5 | State: Off | Refer to the comments for HDA_PGD0 |
PCH IP: 60 - HDA_PGD6 | State: Off | Refer to the comments for HDA_PGD0 |
PCH IP: 61 - PSF6 | State: On | |
PCH IP: 62 - PSF7 | State: Off | |
PCH IP: 63 - PSF8 | State: Off | |
PCH IP: 64 - RES_65 | State: Off | |
PCH IP: 65 - RES_66 | State: Off | |
PCH IP: 66 - RES_67 | State: Off | |
PCH IP: 67 - TAM | State: Off | |
PCH IP: 68 - GBETSN | State: Off | |
PCH IP: 69 - TBTLSX | State: Off | |
PCH IP: 70 - RES_71 | State: Off | |
PCH IP: 71 - RES_72 | State: Off | |
Here is an example of how we diagnosed a PCIe controller that was not power gated and therefore blocked S0ix entry:
First, we observe PC10 residency, but no S0ix residency with the following commands:
~# cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 36886463
Package C3 : 13358106
Package C6 : 1379201
Package C7 : 25911
Package C8 : 33625650
Package C9 : 1252442440
Package C10 : 15175263
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
0
Then, we check PCH IP power gating status and notice that the SPB power gating state is “On”.
~# grep On /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
PCH IP: 0 - PMC State: On
PCH IP: 1 - OPI-DMI State: On
PCH IP: 2 - SPI / eSPI State: On
PCH IP: 3 - XHCI State: On
PCH IP: 5 - SPB State: On
PCH IP: 14 - LPSS State: On
PCH IP: 17 - ISH State: On
PCH IP: 20 - SCC State: On
PCH IP: 22 - FUSE State: On
Note: We suggest repeating the grep On /sys/kernel/debug/pmc_core/pch_ip_power_gating_status command several more times because it reports runtime state.
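Because the file reports runtime state, one way to separate persistent blockers from transient activity is to save several samples of the grep output and keep only the IPs that are “On” in every sample. The following is a sketch of that idea; the function name and the sample file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given several files, each holding one sample of
#   grep On /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
# print the lines that appear in every sample.
always_on() {
    n=$#
    # Normalize spacing, then count how many samples contain each line.
    cat "$@" | sed 's/[[:space:]]\{1,\}/ /g' | sort | uniq -c |
        awk -v n="$n" '$1 == n { $1 = ""; sub(/^ /, ""); print }'
}

# Example use on a real system (run as root; not executed here):
#   f=/sys/kernel/debug/pmc_core/pch_ip_power_gating_status
#   for i in 1 2 3; do grep On "$f" > "sample$i.txt"; sleep 1; done
#   always_on sample1.txt sample2.txt sample3.txt
```

Entries such as PMC or XHCI are expected to be “On” when read from the OS, so compare the result against the commented table above before drawing conclusions.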
Based on S0ix requirement and experience, SPB should be selected as a potential blocker to investigate.
Usually the S0ix requirement configuration is defined in the BIOS; you can disable the suspected controller in the BIOS setup to see whether it has an effect. In this case, we determined that PCIe root port 5 maps to SPB. If disabling PCIe root port 5 in the BIOS setup and placing the system into S2idle again results in S0ix residency, then SPB should be suspected as the S0ix blocker.
Note: If the Thunderbolt controller is disabled in the BIOS setup, the PCIe root ports that depend on it should also be disabled; otherwise, an enabled but unused PCIe root port may block S0ix entry.
Check ModPHY core lanes power gating status
If you do not find an obvious clue from pch_ip_power_gating_status, the next step is to check the ModPHY core lanes power gating status through the slp_s0 debug sysfs file.
Note: The slp_s0 debug sysfs file is supported on Cannon Lake PCH and later platforms.
ModPHY core lanes are high-speed I/O controllers, including XHCI, XDCI, SATA (all instances), PCIe (all instances), Gbe, and SCC (UFS). If any high-speed controller does not gate power correctly, ModPHY core lanes will not gate power either.
Check the ModPHY core lanes power gating state with the following commands:
~# echo Y > /sys/kernel/debug/pmc_core/slp_s0_dbg_latch
~# rtcwake -m freeze -s 15
~# grep No /sys/kernel/debug/pmc_core/slp_s0_debug_status
If you observe the ModPHY lane CORE domain power gating state showing “No”, then you should investigate the ModPHY-related high-speed I/O controller list; for example, if the PCIe controller is suspected, try to disable it in the BIOS setup, and repeat the S0ix steps to see if that makes any difference.
SLP_S0_DBG: MPHY_CORE_GATED State: No
Check OC PLL and Main PLL status
OC (Oscillator Crystal) PLL and Main PLL are PLLs that generate reference clocks for the CPU, internal PLLs, and external PCIe devices. If slp_s0_debug_status shows that the Main PLL or OC PLL is not OFF, one or more devices may be keeping the PLL on. If so, submit a bug.
SLP_S0_DBG: MAIN_PLL_OFF State: No
SLP_S0_DBG: OC_PLL_OFF State: No
Check CSMe power gating status
CSMe domains are expected to be power gated for S0ix. If slp_s0_debug_status shows that CSMe is not gated, check whether the CSMe FW is the latest version and report a bug.
SLP_S0_DBG: CSME_GATED State: No
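The ModPHY, PLL, and CSMe checks above all read the same slp_s0_debug_status file and look for entries whose state is “No”. As a small sketch (the function name is hypothetical, and the line format is taken from the examples above), the unmet conditions can be listed by name:

```shell
#!/bin/sh
# Hypothetical helper: from slp_s0_debug_status output on stdin, print the
# name of every SLP_S0 condition whose state is "No".
slp_s0_blockers() {
    awk '$1 == "SLP_S0_DBG:" && $NF == "No" { print $2 }'
}

# Example use on a real system (run as root; not executed here):
#   echo Y > /sys/kernel/debug/pmc_core/slp_s0_dbg_latch
#   rtcwake -m freeze -s 15
#   slp_s0_blockers < /sys/kernel/debug/pmc_core/slp_s0_debug_status
```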
Check ACPI DSM callback
On some platforms, there is a potential ACPI DSM callback issue that may block S0ix residency. You can use the ACPI debug parameter (CONFIG_ACPI_DEBUG=y is needed) to skip the ACPI DSM callback, then repeat the S0ix steps to see if that makes any difference.
~# echo Y > /sys/module/acpi/parameters/sleep_no_lps0
S0ix residency is not good
The goal of high S0ix residency is good battery life. If S0ix residency is too low, the system will drain the battery within a short time. Tips to diagnose poor S0ix residency are listed in the orange blocks of the following flow-chart.
Check PC10 residency
If you observe poor battery life during S2idle, check the PC10 and S0ix residency by using the turbostat tool. For example, investigate the PC10 and S0ix residency that can be achieved in a period using the commands shown below. If the PC10 residency is too low, S0ix residency will be low as well. The expected PC10 residency should be higher than 95%.
~# turbostat --show Pkg%pc2,Pkg%pc3,Pkg%pc6,Pkg%pc7,Pkg%pc8,Pkg%pc9,Pk%pc10,SYS%LPI rtcwake -m freeze -s 60
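If turbostat is not available, a rough PC10 residency percentage can be derived from the low_power_idle_cpu_residency_us counter mentioned earlier, which is reported in microseconds. The sketch below uses hypothetical function names and integer arithmetic, so the result is rounded down.

```shell
#!/bin/sh
# Hypothetical helpers for estimating residency over one suspend cycle.

# Print the residency as an integer percentage of the suspend time.
#   $1 = counter (usec) before suspend, $2 = counter (usec) after resume,
#   $3 = suspend time in seconds
residency_pct() {
    echo $(( ($2 - $1) * 100 / ($3 * 1000000) ))
}

# Succeeds when the residency is higher than the 95% target stated above.
meets_pc10_target() {
    [ "$(residency_pct "$1" "$2" "$3")" -gt 95 ]
}

# Example use on a real system (run as root; not executed here):
#   f=/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
#   before=$(cat "$f"); rtcwake -m freeze -s 60; after=$(cat "$f")
#   echo "PC10 residency: $(residency_pct "$before" "$after" 60)%"
```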
Check any spurious S0ix wake up event
Unexpected S0ix wake up events cause S0ix entry failure or poor S0ix residency during S2idle. One possible source is an interrupt generated by a device itself, such as a wireless or Bluetooth device. You can try to isolate it by disabling networking and other suspected devices one at a time. If this does not improve the situation, report a bug.
Known Linux issues and engineering improvements
You should always update the Linux kernel to the latest version. Engineers continually improve the kernel to make S0ix more stable on Intel® architecture platforms running Linux. Here are some improvements:
- Beginning with v5.3, the Linux kernel handles NVMe devices in a special way in the suspend-to-idle flow, which makes S0ix more stable on systems with NVMe devices.
- Beginning with v5.4-rc1, the suspend-to-idle control flow rework patches fix S0ix failures caused by spurious wakeups from the EC; this has been verified on the Intel® Ice Lake platform.
- Beginning with v5.5-rc1, the PCH LAN e1000e driver supports S0ix when the network cable is disconnected; this has been verified on Intel® Comet Lake and Intel® Ice Lake platforms.
- Always update the BIOS to the latest version. One BIOS supports both Linux and Windows OS, including S0ix, and the BIOS is continually updated to fix compatibility issues.
Automatic tool
There is an automatic tool available from https://github.com/intel/S0ixSelftestTool, that converts the basic Intel® CPU Package C-state and S0ix issue triage methods into a shell script.
If your system experiences S0ix failure, we recommend that you run the tool. It performs the initial debugging automatically, and either shows the potential blockers or collects the debug messages that you can attach to an advanced debugging discussion in the Kernel Bugzilla.
Summary
This post summarized the S0ix troubleshooting tips that we accumulated while enabling and debugging S0ix in Linux OS running on Intel® architecture reference design platforms. This post aims to provide users with basic triage skills when they encounter the S0ix problem in Linux OS, and provide the necessary information for users to submit bugs in Bugzilla with correct bug component selected.
References
How to achieve S0ix state in Linux: https://01.org/blogs/qwang59/2018/how-achieve-s0ix-states-linux
Using power management controller debug: https://01.org/blogs/rajneesh/2019/using-power-management-controller-drivers-debug-low-power-platform-states
Pm-graph tool: https://01.org/pm-graph/
PowerTop tool: https://01.org/powertop
Thunderbolt is a trademark of Intel Corporation or its subsidiaries.