SuspendResume - Hard disk resume optimization, a simpler approach

Proposed Solution

Rather than altering the power management infrastructure in the kernel (as in the former approach), a simpler approach is to remove the wait time in the disk drivers themselves. The essential cause of hard disks' lengthy resume time is the ATA port driver blocking until the ATA port hardware has finished coming online. The kernel isn't really doing anything during all those seconds that the disks are resuming; it's just blocking until the hardware says it's ready to accept commands. This patch changes the ATA port driver to issue the wakeup command and then return immediately. Any commands issued to the hardware are queued and executed once the port is physically online. Thus no information is lost, and although the wait itself isn't removed, it no longer holds up the rest of the system, which can function on what's left in RAM and cache.
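The difference between blocking and asynchronous resume can be illustrated with a small user-space model. This is only a sketch of the idea, not the kernel code in the patch: the `SimulatedPort` class, its methods, the `timed` helper, and the spin-up delay are all hypothetical names and values chosen for illustration, and a Python thread stands in for the hardware coming online.

```python
import threading
import time


class SimulatedPort:
    """Toy stand-in for an ATA port. Hypothetical model, not kernel code."""

    def __init__(self, spinup_seconds):
        self.spinup_seconds = spinup_seconds  # assumed hardware wake-up time
        self.online = False
        self.pending = []     # commands queued while the port is still waking
        self.completed = []   # commands the simulated hardware has executed
        self._lock = threading.Lock()

    def _bring_online(self):
        time.sleep(self.spinup_seconds)  # stands in for the hardware coming online
        with self._lock:
            self.online = True
            # Drain everything that was queued during the wait, in order.
            self.completed.extend(self.pending)
            self.pending.clear()

    def resume_blocking(self):
        """Unpatched model: the caller waits until the port is online."""
        self._bring_online()

    def resume_async(self):
        """Patched model: start the wake-up and return immediately."""
        worker = threading.Thread(target=self._bring_online)
        worker.start()
        return worker  # the caller may join() later, or never

    def submit(self, command):
        """Queue the command if the port is not ready yet; otherwise run it."""
        with self._lock:
            if self.online:
                self.completed.append(command)
            else:
                self.pending.append(command)


def timed(resume):
    """Return (seconds the caller was blocked, whatever resume returned)."""
    start = time.monotonic()
    result = resume()
    return time.monotonic() - start, result
```

In this model, `timed(port.resume_blocking)` reports roughly the full spin-up time, while `timed(port.resume_async)` returns almost immediately. A command passed to `submit` right afterwards sits in `pending` and moves to `completed` once the worker thread finishes, which mirrors the "queued until the port is physically online" behavior described above.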

The patch to the ATA subsystem is here:

https://github.com/01org/suspendresume/blob/master/dev/hard-disk-resume/ata_port_resume_async.patch

The patch to the SCSI subsystem is here:

https://github.com/01org/suspendresume/blob/master/dev/hard-disk-resume/sd_resume_async.patch

Applying these two patches allows SATA disks to resume asynchronously without holding up system resume completion. This means there will be a short period when the system appears to be back from resume but has disks that are still in the process of resuming. In most cases the user shouldn't notice, unless they were in the middle of a disk access when the system suspended. Even in that case the system takes exactly as long to resume as it would have without the patches, but at least the UI is online.

Performance testing (S3 resume)

To demonstrate the performance improvement (7.8X to 12X on the systems below), I've run the AnalyzeSuspend tool on three different platforms, with and without the new behavior. Each is running Ubuntu Raring with a kernel built from the upstream kernel source. There are two graphs for each of the three systems: the first shows the S3 behavior with the unaltered kernel, and the second shows the same test with the above patches applied.

Computer One (10.5X speedup: from 11.6 seconds to 1.1 seconds)

Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz
SATA controller: Intel Corporation C600/X79 series chipset 6-Port SATA AHCI Controller (rev 05)
Has 6 devices attached, of varying types and sizes

  1. ATA1: 240 GB SSD
  2. ATA2: 3 TB Hard Disk
  3. ATA3: 500 GB Hard Disk
  4. ATA4: DVD-ROM (with CD inserted)
  5. ATA5: 2 TB Hard Disk
  6. ATA6: 1 TB Hard Disk

Unpatched 3.11.0-rc7 S3 suspend/resume

PATCHED 3.11.0-rc7 S3 suspend/resume

Computer Two (12X speedup: from 5.4 seconds to 0.45 seconds)

Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
Has 1 disk attached

  1. ATA1: 320 GB Hard Disk
  2. ATA2 - ATA6: Empty slots

Unpatched 3.11.0-rc7 S3 suspend/resume

PATCHED 3.11.0-rc7 S3 suspend/resume

Computer Three (7.8X speedup: from 5.4 seconds to 0.69 seconds)

Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz
SATA controller: Intel Corporation Lynx Point 6-port SATA Controller 1 [AHCI mode] (rev 02)
Has 2 devices attached

  1. ATA1,3,4,6: Empty slots
  2. ATA2: DVD-ROM (empty)
  3. ATA5: 500 GB Hard Disk

Unpatched 3.11.0-rc7 S3 suspend/resume

PATCHED 3.11.0-rc7 S3 suspend/resume
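
The speedup figures quoted for the three systems are simply the unpatched resume time divided by the patched resume time. As a quick check, the arithmetic can be reproduced with the measured values from the headings above (the `speedup` helper and the `measurements` table are illustrative names, not part of AnalyzeSuspend):

```python
def speedup(before_seconds, after_seconds):
    """Ratio of unpatched to patched S3 resume time."""
    return before_seconds / after_seconds


# (unpatched seconds, patched seconds), taken from the section headings
measurements = {
    "Computer One": (11.6, 1.1),
    "Computer Two": (5.4, 0.45),
    "Computer Three": (5.4, 0.69),
}

for name, (before, after) in measurements.items():
    print(f"{name}: {speedup(before, after):.1f}X")
```

Rounded to one decimal place, these ratios match the 10.5X, 12X, and 7.8X figures quoted above.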
