Matthew Garrett (mjg59) wrote,

ACPI general purpose events

ACPI is a confusing place. It's often thought of as a suspend/resume thing, though if you're unlucky you've learned that it's also involved in boot-time configuration because it's screwed up your interrupts again. But ACPI's also heavily involved in the runtime management of the system, and it's necessary for there to be a mechanism for the hardware to alert the OS of events.

ACPI handles this case by providing a set of general purpose events (GPEs). The implementation of these is fairly straightforward - an ACPI table points at a defined system resource (typically an area of system io space, though in principle it could be something like mmio instead), and when the hardware fires an ACPI interrupt the kernel looks at this region to see which GPEs are flagged. Then things get more interesting.
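As a rough illustration of that lookup, here's a minimal Python sketch of picking the flagged GPEs out of a register block. It assumes the usual arrangement of a status register paired with an enable register of the same width, with GPE n living at bit n % 8 of byte n // 8. The function name and the sample values are made up for the example, and a real kernel reads these out of io space rather than a bytes object.

```python
def flagged_gpes(status, enable):
    """Return the numbers of GPEs whose status and enable bits are both set.

    `status` and `enable` are bytes objects standing in for the two halves
    of a GPE register block (an assumption made for this sketch).
    """
    if len(status) != len(enable):
        raise ValueError("status and enable registers must be the same width")

    flagged = []
    for byte_index, (sts, en) in enumerate(zip(status, enable)):
        active = sts & en  # ignore events that are flagged but not enabled
        for bit in range(8):
            if active & (1 << bit):
                flagged.append(byte_index * 8 + bit)
    return flagged


if __name__ == "__main__":
    # Bit 5 of byte 3 is GPE 3 * 8 + 5 = 29 = 0x1D.
    status = bytes([0x00, 0x00, 0x00, 0x20])
    enable = bytes([0xFF, 0xFF, 0xFF, 0xFF])
    print([hex(n) for n in flagged_gpes(status, enable)])
```

Masking status against enable matters: a GPE that's flagged but not enabled shouldn't be treated as the source of the interrupt.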

The majority of GPEs are implemented in the ACPI tables via methods with names like _Lxx or _Exx. The xx is the number of the GPE in hex, while the leading _L or _E indicates whether the GPE is level- or edge-triggered. If an ACPI interrupt is fired and GPE 0x1D is flagged as being the source of the interrupt, the ACPI interpreter will then look for an _L1D or _E1D method. Upon finding one, it'll execute it. What this method does is entirely up to the firmware - on most HP laptops, GPE 0x1D is hooked up to the lid switch[1] and so executing it will send a notification to the OS that the lid switch has changed state. The OS will then evaluate the state of the lid switch (generally by making another ACPI query) and send the event up to userspace.
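The naming scheme is simple enough to sketch in a few lines of Python. This is only an illustration of the lookup described above, not how any real ACPI interpreter is written: the namespace is modelled as a plain dict, and both function names are invented for the example.

```python
def gpe_method_names(gpe):
    """Return the (_Lxx, _Exx) method names for a GPE number."""
    if not 0 <= gpe <= 0xFF:
        # Two hex digits is all the name has room for.
        raise ValueError("GPE number must fit in two hex digits")
    suffix = format(gpe, "02X")
    return ("_L" + suffix, "_E" + suffix)


def find_gpe_method(namespace, gpe):
    """Return the name of the method handling `gpe`, or None if there isn't one.

    `namespace` is a dict keyed by method name, standing in for the
    firmware-provided ACPI namespace (an assumption for this sketch).
    """
    for name in gpe_method_names(gpe):
        if name in namespace:
            return name
    return None


if __name__ == "__main__":
    namespace = {"_L1D": lambda: "notify lid device"}
    print(gpe_method_names(0x1D))
    print(find_gpe_method(namespace, 0x1D))
    print(find_gpe_method(namespace, 0x0B))
```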

How does the lid end up triggering GPE 0x1D? Things get pretty hardware specific at this point. Intel motherboard chipsets have a set of general purpose io (GPIO) lines that can, for the most part[2], be used by the system vendor for anything they want. For a lid switch, one of these lines is hooked to the switch and the BIOS configures the GPIO as an input. Pressing the switch will cause the GPIO line to become active. The GPIO lines are mapped to GPEs in a 1:1 manner, though with an offset of 16 - ie, GPIO 0xd will map to GPE 0x1d. If GPIO 0xd becomes active, GPE 0x1d will be flagged and an ACPI interrupt sent. The ACPI code will then do something to quash the interrupts, such as inverting the polarity of the GPIO[3], as well as send the notification to the OS.
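To make the offset and the polarity trick concrete, here's a small Python toy model. It assumes the Intel-style mapping described above (GPE number = GPIO number + 16) and models the GPIO as nothing more than a level plus an "inverted" flag. The class and function names are hypothetical and real hardware has rather more going on.

```python
GPIO_GPE_OFFSET = 16  # the lower 16 GPEs are taken, so GPIOs start above them


def gpio_to_gpe(gpio):
    """Map a GPIO line number to the GPE it raises."""
    if gpio < 0:
        raise ValueError("GPIO number cannot be negative")
    return gpio + GPIO_GPE_OFFSET


class GpioInput:
    """Toy model of a GPIO input with switchable polarity."""

    def __init__(self):
        self.level = False     # electrical state of the line
        self.inverted = False  # False: high is active, True: low is active

    @property
    def active(self):
        return self.level != self.inverted

    def quash(self):
        """Flip the polarity so the current level no longer counts as active."""
        self.inverted = not self.inverted


if __name__ == "__main__":
    print(hex(gpio_to_gpe(0xD)))

    lid = GpioInput()
    lid.level = True   # lid closed: line goes high, GPE fires
    print(lid.active)
    lid.quash()        # firmware inverts polarity to stop the interrupt
    print(lid.active)
    lid.level = False  # lid opened: line goes low, which is now "active"
    print(lid.active)
```

Flipping the polarity both silences the current interrupt and arms the line for the opposite transition, which is why a single GPIO can report both the close and the open.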

Why are the GPIOs offset by 16 relative to the GPEs? The lower 16 GPEs (again, talking about Intel hardware) have pre-defined purposes[4]. These range from things like "Critically low battery" to "PCIe hotplug event" down to "This device triggered a wakeup". And the latter is what I'm most interested in here.

Various pieces of modern hardware can be placed into power saving states when not in use. The problem is that having to turn hardware on by hand before you can use it makes for a poor user experience, so if we want power saving to be the default behaviour we need the hardware to tell us when something has happened that requires it to be woken up.

There's something of a chicken and egg problem here, but thankfully most of the relevant modern hardware has out of band mechanisms to tell us about things going on. The PCI spec defines something called Power Management Events (PME), which are driven by an additional current that's supplied to the hardware even when it's otherwise turned off. On plug-in PCI Express cards, firing a PME generates an interrupt on the root bridge and a native driver can interpret that, but for legacy PCI devices and integrated chipset devices the notification has to come via ACPI.

The example I've been working on is USB. It's a good choice for various reasons - firstly, there's already support for detecting when the USB controller is idle. Secondly, modern USB host controllers have support for generating PMEs on device insertion, removal or (and this is important) remote wakeup. In other words, as long as the USB bus is idle we can power down the entire USB controller. If the OS tries to access a USB device, we'll power it back up. If the user unplugs or plugs a device, we'll power it back up. If a previously idle device suddenly responds to some external input, we'll power it back up. And it's all nicely invisible to the user.

How does this work? The controller retains a small amount of power even when nominally powered down. This is used to keep the detection circuitry alive. When it receives a wakeup event, it asserts the PME line. The chipset detects this and fires a GPE. The OS runs the method associated with this GPE and receives a device notification on the ACPI representation of the USB controller, telling us to power it back up. We do so and process whatever woke us - if the bus then goes idle again, we can power down once more.
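The sequence can be walked through with a toy model in Python. Everything here is invented for illustration: the Controller class, the handle_gpe function and the GPE number 0x03 are all hypothetical, and the "method" is just a callable that returns the device to notify. It's a sketch of the flow, not of the kernel code.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Toy model of a USB host controller that can raise a PME."""
    name: str
    powered: bool = True
    pme_asserted: bool = False

    def suspend(self):
        self.powered = False

    def wake_event(self):
        # The detection circuitry stays alive while the rest is powered down.
        if not self.powered:
            self.pme_asserted = True

    def resume(self):
        self.powered = True
        self.pme_asserted = False  # waking the device acknowledges the PME


def handle_gpe(gpe, gpe_methods):
    """Run the method for `gpe` and resume the device it notifies.

    Returns the device that was woken, or None if the firmware gave us
    no method for this GPE.
    """
    method = gpe_methods.get(gpe)
    if method is None:
        return None
    device = method()  # stands in for the device notification
    device.resume()
    return device


if __name__ == "__main__":
    usb = Controller("usb")
    gpe_methods = {0x03: lambda: usb}  # hypothetical GPE number

    usb.suspend()
    usb.wake_event()                   # e.g. a device was plugged in
    print(usb.pme_asserted, usb.powered)
    handle_gpe(0x03, gpe_methods)
    print(usb.pme_asserted, usb.powered)
```

Note what happens if the GPE has no entry in gpe_methods: handle_gpe returns None, the PME stays asserted and the controller stays powered down.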

The astonishing thing is that this all works. The only problem we have is that it relies on the machine vendor to have provided the ACPI methods that are associated with the GPEs. If they haven't, we can't enable this functionality - even though the hardware is capable of generating the GPEs, we have no method to execute to let us know which device has to be woken up. The GPE is never answered, we never acknowledge the PME and the hardware keeps on screaming for attention without getting any. And, more to the point, it never gets powered up and your mouse doesn't work.

There's a pretty gross hack to deal with this. In general, we know what the GPE to device mappings are - they're pretty static across Intel chipsets, and while AMD ones can be programmed differently by the BIOS we can read that information back and set up a mapping ourselves. This trick also comes in handy when some vendors (like, say, Dell) manage to implement one of the GPE events wrongly. Everything looks like it should work, but the method never sends a notification because it's buggy. In that case we can unregister the existing method and implement our own instead.
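In outline, the decision looks something like the following Python sketch. It is not the code from the posted patches: the function name, the table contents and the device names are all placeholders, and the static map stands in for whatever GPE to device mapping we know (or have read back) for the chipset.

```python
def choose_wake_source(gpe, firmware_methods, static_map, buggy_gpes=frozenset()):
    """Decide how to find the device behind a GPE.

    Returns ("firmware", method) if the vendor's method can be trusted,
    ("static", device) if we have to fall back on our own mapping, or
    (None, None) if we know nothing about this GPE.
    """
    if gpe in firmware_methods and gpe not in buggy_gpes:
        return ("firmware", firmware_methods[gpe])
    if gpe in static_map:
        # Either no method exists, or it's known to be broken and we
        # replace it with our own handling.
        return ("static", static_map[gpe])
    return (None, None)


if __name__ == "__main__":
    # All of these values are hypothetical placeholders.
    firmware_methods = {0x03: "_L03", 0x0D: "_L0D"}
    static_map = {0x03: "USB1", 0x04: "USB2", 0x0D: "EHCI"}
    buggy_gpes = frozenset({0x0D})

    for gpe in (0x03, 0x04, 0x0D, 0x1F):
        print(hex(gpe), choose_wake_source(gpe, firmware_methods, static_map, buggy_gpes))
```

Preferring the firmware's method when it exists and isn't known to be broken keeps the workaround limited to the machines that actually need it.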

This code isn't upstream yet, but patches have been posted to the linux-acpi mailing list and with luck it'll be there in the 2.6.33 timeframe. My tests suggest about 0.2W saving per machine, which isn't going to save all that many polar bears but seems worth it anyway.

[1] _L1D = lid. Sigh.

[2] There's a few that are reserved for specific purposes

[3] So where before it had to be high to be active, it now has to be low to be active - this means that it'll now trigger on the switch being opened rather than closed, so you'll get another event when you open the lid again.

[4] You can find a list in the documentation for the appropriate ICH chip - the relevant section is "GPE0_STS" under the LPC interface chapter.

Tags: advogato, fedora


November 10 2009, 04:14:00 UTC

0.2W for USB specifically?

What do you expect you can save with other devices?
0.2W for USB, yes. I've no idea about other devices yet - there's certainly some power to be saved there, with the most obvious targets being the sound hardware (we power down the codec but not the hda core itself) and AHCI (though the main concern there would be how much latency it introduces).
This is a great article. Full of information, interesting to read, and none of the acidic ranting that is usually attached to articles like this. Thanks!
Seconded. I love articles like this. (A bit of acidic ranting doesn't hurt either, assuming it's entertaining -- and Matthew's rants are.)

Re: Thanks


May 4 2010, 00:25:57 UTC

Thank you for the nice explanation/article and for your work.
_L1D = lid. Sigh. <-- that is hilarious :D

Seriously though, thanks for the detailed article. Who the hell thought up this system?

Does the variance between machines mean that there has to be some kind of laptop model lookup table, so that the acpi subsystem knows it's on a Dell laptop and has to implement its own GPE function, for example?
Workarounds will require a vendor specific table, yes. With luck there'll be something we can use to identify when the bug is fixed and only do the workarounds when necessary.
Who do you think thought up this system? From the Comes antitrust trial:

From: Bill Gates
Sent: Sunday, January 24, 1999 8:41 AM
To: Jeff Westorinen; Ben Fathi
Cc: Carl Stork (Exchange); Nathan Myhrvold; Eric Rudder
Subject: ACPI extensions

One thing I find myself wondering about is whether we shouldn’t try and make the “ACPI” extensions somehow Windows specific.

It seems unfortunate if we do this work and get our partners to do the work and the results is that Linux works great without having to do the work.

Maybe there is no way to avoid this problem but it does bother me.

Maybe we could define the APIs so that they work well with NT and not
the others even if they are open.

Or maybe we could patent something related to this.

The ACPI code will then do something to quash the interrupts, such as inverting the polarity of the GPIO

Always a nice little race. Hope all the switches are well debounced on those cheapass laptops.
The worst case I've seen here is when Acer managed to forget that the GPIO->GPE mapping is offset by 16, resulting in the lid always claiming to be closed.
What I love is that this means the hardware is entirely capable of knowing the current state of the lid, yet all the OS ever seems to see is "lid state has toggled". The number of times I've managed to get machines confused such that they wake up when closed and sleep when opened...
There's a separate method for querying what the current state is. When an OS receives a "Lid state has changed" notification, it's expected to go and read it back rather than just toggling some internal state.


November 12 2009, 23:33:45 UTC

Confirming that I understood something correctly from the article: PCI Express devices don't need this system, only PCI devices and such?

Ethernet and wifi cards seem like another obvious target for powering down when not in use. Not particularly helpful when watching for link status changes, but great for saving more power when the user disables networking. However, most modern laptops use PCI-E for these.

With enough care taken, the graphics card seems like another vaguely sensible thing to power off completely, when not actively displaying graphics (such as with the screen blanked or the lid closed).

Finally, how insane might it prove to power off the SMBus controller when not in use?
PCI Express doesn't need this in principle, but hardware that's integrated into the motherboard chipset (like USB controllers) will typically still generate PMEs via GPEs rather than natively.

Graphics cards, ethernet and wifi are all on the list of things to do this with. Our ability to do anything about graphics hardware does involve us being able to figure out how to actually power the hardware down, which isn't possible yet - everything else will get there over time. The main problem about powering down the smbus controller is that the firmware tends to make use of it as well, so pulling it out from underneath it may result in issues...
Oh, by the way. Yet again I discover that a work laptop's battery life is better in Linux than XP. This time it's a Toshiba Portege R600 that was apparently very problematic until quite recently, but has worked pretty much flawlessly for me in Ubuntu 9.10. (The webcam doesn't work out the box. I'll live.) And gets 4 hours in XP but presently 4.7 hours in Ubuntu. This comment is to thank you.
