Matthew Garrett ([info]mjg59) wrote,
@ 2008-11-15 18:01:00
Entry tags:advogato, fedora

Hybrid suspend
One often requested feature for suspend support on Linux is hybrid suspend, or "suspend to both". This is where the system suspends to disk, but then puts the machine in S3 rather than powering down. If the user resumes without power having been removed they get the benefit of the fast S3 resume. If not, the system resumes from disk and no data is lost.

This is, clearly, the way suspend should work. We're not planning on adding it by default in Fedora, though, for a couple of reasons. The main reason right now is that the in-kernel suspend to disk is still slow. Triggering a suspend to disk on a machine with gigabytes of RAM (which is a basic laptop configuration these days) will leave you sitting there for an extended period of time when all you actually want to do is pick your machine up and leave. Fixing this properly is less than trivial. TuxOnIce improves the speed somewhat, at the expense of being a >500k patch against upstream that touches all sorts of interesting bits of the kernel such as the vm system. We're not supporting that for fairly obvious reasons. But even then, the suspend to disk process involves discarding some pages. Those need to be pulled in from disk again on resume. With the current implementation, suspend to both is fundamentally slower than suspend to RAM for both the suspend and resume paths.

So, what other approaches are there? One is to resume from RAM some period of time after suspending and then, if the battery is low, suspend to disk. Many recent machines will automatically resume when the battery level becomes critical. If the hardware doesn't support that, we can wake up after a set period, measure the battery consumption, set a new alarm and go to sleep again. The downside of this approach is that your system wakes up and does stuff without you being aware of it, which may be bad if it's inside a neoprene cover at the time. Cooking laptops is generally considered unhelpful.
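
The wake-and-check loop can be sketched as a small decision function. This is only an illustration: the function name, the 5% critical threshold, the one-hour alarm cap and the "wake at the halfway point" rule are all assumptions, not anything Fedora or the kernel implements.

```python
from dataclasses import dataclass


@dataclass
class WakeDecision:
    action: str        # "hibernate" or "sleep"
    next_alarm_s: int  # seconds until the next wake-up alarm (0 when hibernating)


def decide_on_wake(charge_pct, prev_charge_pct, slept_s,
                   critical_pct=5.0, max_alarm_s=3600):
    """Pick the next step after an alarm-triggered resume from S3.

    All thresholds are hypothetical defaults chosen for the example.
    """
    if charge_pct <= critical_pct:
        # Too little battery left to risk staying in S3: suspend to disk.
        return WakeDecision("hibernate", 0)
    drained_pct = max(prev_charge_pct - charge_pct, 0.0)
    if drained_pct == 0:
        # No measurable drain, so sleep for the longest allowed interval.
        return WakeDecision("sleep", max_alarm_s)
    # Time until the critical level at the drain rate we just measured.
    remaining_s = (charge_pct - critical_pct) * slept_s / drained_pct
    # Wake at the halfway point to leave some margin for error.
    return WakeDecision("sleep", min(int(remaining_s / 2), max_alarm_s))
```

Each wake-up costs some power and heat of its own, which is exactly the downside described above, so a real policy would want the alarms to be as infrequent as the battery allows.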

Using the kexec approach to hibernation provides a more straightforward way of handling the problem. The fundamental problem with the existing approach is that it ties suspend into the vm system and involves making atomic copies of RAM into other bits of RAM. kexec would allow us to pre-allocate enough space on disk to save RAM as-is, and then simply kexec into a new kernel and dump RAM to disk without any of the tedious shrinking required first. Resuming from S3 would kexec back into the old kernel, whereas losing power would just fall back to reading off disk. The extra time taken on the S3 path would be minimal.

In an ideal world we'd adopt the Vista approach where "off" is synonymous with suspend. There's still more work to be done on enhancing reliability before that can be achieved, though.


What's good in hybrid?
[info]zdzichubg.jogger.pl
2008-11-15 07:02 pm UTC (link)
Hybrid sleep always struck me as a way to combine the disadvantages of both types: it drains the battery (like suspend) and takes a long time to enter (like hibernation). Why people like this approach is beyond me.

Is there any real possibility for hibernation to be fast? Even if my laptop disk could write at 50MB/s, dumping RAM would still take about a minute. How fast are the solutions you're writing about?
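
A back-of-the-envelope check of that estimate, as a sketch. The 3GB image size is an assumption chosen because it matches "about a minute" at 50MB/s; the comment does not say how much RAM the laptop has.

```python
def dump_seconds(image_mb, write_mb_per_s):
    """Time to write an uncompressed image at a sustained sequential rate."""
    return image_mb / write_mb_per_s


# Assumed: 3GB (3072MB) of RAM written as-is at the quoted 50MB/s.
full_dump = dump_seconds(3072, 50)
```

At these numbers the dump takes roughly 61 seconds, so the only ways to get faster are a faster disk or a smaller image.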

(Reply to this) (Thread)

Re: What's good in hybrid?
[info]mjg59
2008-11-15 07:14 pm UTC (link)
Even a naive compression method is likely to give a significant reduction in the amount of data you have to dump to disk. One benefit of the kexec approach is that you can potentially choose which chunks of data to save in a non-invasive way (ie, don't bother saving any pages which just contain cached data). That's trickier in the approaches which require you to do an atomic copy of your memory.
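
As a rough illustration of the compression point, here is a sketch using zlib at its fastest level on synthetic pages. The 4KiB page size and the test data are assumptions; real memory contents fall somewhere between these two extremes.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size


def compressed_fraction(pages, level=1):
    """Return compressed size divided by raw size for a list of pages."""
    raw = b"".join(pages)
    return len(zlib.compress(raw, level)) / len(raw)


# Best case: zero-filled pages. Worst case: random (incompressible) pages.
zero_pages = [bytes(PAGE_SIZE)] * 64
random_pages = [os.urandom(PAGE_SIZE) for _ in range(64)]
```

Zero-filled pages shrink to a tiny fraction of their size while random data does not shrink at all, so the saving in practice depends entirely on what is in RAM at the time.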

(Reply to this) (Parent)

Re: What's good in hybrid?
[info]jldugger
2008-11-16 03:14 am UTC (link)
Hybrid sleep is not for laptops. It's mainly a way to balance responsiveness and power. If you go away for lunch, you'd like to come back to a rapid resume on your desktop; overnight though, maybe suspend to disk. FWIW, it doesn't always work on desktops either.

(Reply to this) (Parent)(Thread)

It most certainly was first seen on laptops
(Anonymous)
2008-11-16 08:54 am UTC (link)
I remember Mac laptop owners crowing about how they never rebooted their machines (except for security updates) and how they could just close the lid and know that the resume would work. One day I got one of them to leave their laptop in a drawer for a week, suspended with very little battery. When it resumed it did so from hibernation because the battery had gone completely flat, but no work was lost. I believe Apple calls this safe suspend.

I have also seen someone's Vista laptop doing a sort of hybrid too. Normally closing the lid would result in the system doing a suspend to RAM, but at some point after the lid had been closed for a while it seemed to wake up and do a suspend to disk before powering off, which could give better battery life...

(Reply to this) (Parent)(Thread)

Re: It most certainly was first seen on laptops
[info]jldugger
2008-11-16 11:15 pm UTC (link)
Clarification: Vista does not do hybrid sleep on laptops by default, for exactly the reasons I mentioned.

(Reply to this) (Parent)


[info]thaytan
2008-11-15 07:56 pm UTC (link)
Is there any reality where it's useful to use idle time to optimistically prepare for hibernation before the user ever requests it? I'm thinking of, for example, dropping copies of not-recently-touched memory pages into swap, where they can be marked invalid if they subsequently do get touched in RAM.
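
The bookkeeping for that idea might look something like the sketch below. The class and method names are hypothetical; a real implementation would live in the kernel's VM and track this per page frame.

```python
class PresaveTracker:
    """Track which pages already have a valid copy on disk."""

    def __init__(self):
        # Page numbers whose on-disk copy still matches RAM.
        self.presaved = set()

    def presave(self, page):
        """Called during idle time after a page has been written out."""
        self.presaved.add(page)

    def touch(self, page):
        """Called when a page is modified: its disk copy is now stale."""
        self.presaved.discard(page)

    def pages_to_write(self, all_pages):
        """Pages that still need writing when hibernation is requested."""
        return [p for p in all_pages if p not in self.presaved]
```

For example, after presaving pages 0 to 3 and then touching page 2, hibernating a six-page system would only need to write pages 2, 4 and 5.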

(Reply to this) (Thread)


[info]mjg59
2008-11-15 08:06 pm UTC (link)
Yes, that would be beneficial and I believe that Windows does this. But implementing it would involve more VM entertainment than I'm familiar with...

(Reply to this) (Parent)

How far are we?
[info]gsbarbieri
2008-11-15 08:15 pm UTC (link)
Maybe we can have this reliably working by 2009?

I really hate to have suspend to RAM (most useful for my use cases) working and then update something and have it stop working... :-/ Well, at least it made me optimize my boot times a bit :-P

(Reply to this)

how much is stored when linux suspends
(Anonymous)
2008-11-15 09:43 pm UTC (link)
suppose i have 2GB of RAM, 500MB used by apps, and 500MB of disk cache.
if i hibernate, how much of that gets written to disk? just the 500MB that i need, all 1GB of used RAM, or all 2GB?

Quite a few apps use RAM to cache things, eg firefox. would it be possible to tell firefox to empty its RAM cache just before hibernate?

Also: on resume from hibernate, does all the saved data get written back to real RAM before you can do anything? or can it be paged back when needed (as if it were swapped out).

(Reply to this) (Thread)

Re: how much is stored when linux suspends
(Anonymous)
2008-11-16 12:34 am UTC (link)
Depends on what hibernation system you use. None of them will save all 2GB, except perhaps if you have a machine so old that it still uses S4BIOS and thus can't know what the OS has allocated. Some hibernation methods will throw away your disk cache, some will save it, and some save a configurable portion of it.
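
Applied to the numbers in the question, the three behaviours look like this. It is only a sketch: the function and its `cache_fraction_saved` parameter are made up for illustration and do not correspond to a setting in any particular hibernation implementation.

```python
def image_size_mb(app_mb, cache_mb, cache_fraction_saved):
    """Uncompressed image size when only part of the disk cache is kept."""
    if not 0.0 <= cache_fraction_saved <= 1.0:
        raise ValueError("cache_fraction_saved must be between 0 and 1")
    return app_mb + cache_mb * cache_fraction_saved


# 500MB of application memory and 500MB of disk cache, as in the question.
drop_cache = image_size_mb(500, 500, 0.0)   # cache thrown away
keep_cache = image_size_mb(500, 500, 1.0)   # cache saved in full
half_cache = image_size_mb(500, 500, 0.5)   # configurable portion saved
```

That gives 500MB, 1000MB and 750MB respectively, before any compression; the free 1GB is not written in any of the three cases.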

(Reply to this) (Parent)

What about S53?
(Anonymous)
2008-11-15 10:12 pm UTC (link)
S53 is a new made up suspend state that we will all soon be clamouring for. In this magic state, shutdown's meaning will be abused and shutdown will in fact be restart and suspend to disk. I propose that this new approach be named "rebootinate".

(Reply to this) (Thread)

Re: What about S53?
(Anonymous)
2008-11-16 12:32 am UTC (link)
That only applies to operating systems which boot slower than they resume from hibernation. That set no longer includes Linux. Hibernation no longer provides a speed advantage, it just lets you save your current state. Since rebooting throws away all that state, rebooting and then hibernating makes no sense for Linux.

(Reply to this) (Parent)(Thread)

Re: What about S53?
(Anonymous)
2008-11-21 02:30 am UTC (link)
If hibernation is properly implemented (empty pages aren't saved and data is compressed and stored contiguously), that set will always include all operating systems, because a normal boot will involve reading more information and doing more than simply reloading & decompressing pages. If you give me a system that boots in 5 seconds, I'll be able to resume from disk to the same state in 3 or 4 seconds.

(Reply to this) (Parent)

Re: What about S53?
[info]jbailey
2008-11-17 03:34 am UTC (link)
I already only reboot in order to make sure that the system is running in some sort of sane state. Regular Linux system daemons aren't perfect enough yet to never need restarting. I find I usually need to do a prophylactic reboot about once every week or two on a current Ubuntu system (either syslog randomly stops logging, or X goes nuts)

(Reply to this) (Parent)


(Anonymous)
2008-11-16 12:26 am UTC (link)
Hibernation can go pretty fast if you drop all the caches, but of course if you do that then the system runs slowly when you resume until it gets everything off of the disk again. On the other hand, hibernation could save all the caches so the system runs just as well after resume, but then hibernation takes excessively long.

How about taking a cue from the 5-second boot and the future plans in that direction? On hibernate, ask the Linux kernel for an index of its caches in terms of files, write out a sreadahead file, drop the caches, and quickly write the remaining memory as part of hibernation. On resume, you have a small hibernation file which resumes quickly, and then as soon as userspace comes back you can run sreadahead on the file you saved in the background. You end up with a much faster hibernation because you don't write out the caches that just duplicate other data on the disk, and resume takes about the same amount of time except that you can interact with it sooner.

If you want to combine this with hybrid suspend, you just need some way to ensure that the caches don't actually go away until the battery dies and you have to resume from S4 rather than S3. One approach would simply let the kernel keep the caches and just not write them as part of the hibernation image.

- Josh Triplett
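
The flow proposed above can be sketched as a planning step that splits memory into "files to read ahead after resume" and "pages that must go into the image". The `Page` type and `plan_hibernate` function are hypothetical, and the readahead list here is just a list of paths rather than the actual sreadahead file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Page:
    number: int
    backing_file: Optional[str] = None  # None for anonymous memory
    dirty: bool = False


def plan_hibernate(pages):
    """Split pages into (files to read ahead later, page numbers to write now)."""
    readahead, image = [], []
    for page in pages:
        if page.backing_file is not None and not page.dirty:
            # Clean cache duplicates data already on disk: remember the file
            # instead of writing the page into the hibernation image.
            if page.backing_file not in readahead:
                readahead.append(page.backing_file)
        else:
            image.append(page.number)
    return readahead, image
```

For the hybrid case, the same plan applies except that the clean cache pages stay in RAM during S3 and are simply left out of the image.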

(Reply to this)


[info]jldugger
2008-11-16 03:19 am UTC (link)
Cooking laptops is one thing, but it gets worse. If you suspend-to-disk a laptop and transport it, the last thing you want is the damn thing spinning up the drives while you're carrying it. Sure, there may be drive shock protection, but that also means it may take even longer than usual to suspend to disk. Potentially longer than you have battery power.

(Reply to this) (Thread)


[info]lionsphil
2008-11-16 04:04 pm UTC (link)
There may be drive shock protection? Not last I looked, without a lot of manual fiddling. You've got more luck getting your accelerometer supported as a Neverball controller than as a way to park disks.

(Reply to this) (Parent)

Lazy suspend
(Anonymous)
2008-11-16 04:14 am UTC (link)
suspend-to-disk takes too fucking long. The process at the end where memory is written to disk should happen lazily in the background using spare I/O capacity so that writing the final few pages takes less than a second.

(Reply to this)


[info]theraphim
2008-11-16 02:40 pm UTC (link)
In my opinion, the ideal system would not hibernate any of the clean mappings and pagecache since they can be reloaded from disk. What's left is kernel text, data, and of course dirty pages either from some files or swap. If we're saving kernel text, we can save it ahead because it almost always stays the same. Dirty pages can also be pre-saved (controllable by some knob, background lazy batch writeout). What's left is kernel data...

(Reply to this) (Thread)


[info]theraphim
2008-11-16 02:41 pm UTC (link)
And, see, we do not save all those caches because, usually, we will resume from memory. What we really care about is suspend time: if we can bring this down to 1 second it will be a big deal.

(Reply to this) (Parent)


[info]mjg59
2008-11-16 02:43 pm UTC (link)
Remember that failing to save the clean mappings and cache means that it's likely you'll end up pulling that data off disk again when the system's resumed. You speed up suspend and resume, but at the cost of increased latency in the resumed system. If you get multiple applications trying to page themselves back in then you get seek penalties, so the overall time taken to get back to the same state may be significantly longer.

(Reply to this) (Parent)(Thread)


[info]lionsphil
2008-11-16 04:01 pm UTC (link)
I suspect most users would rather have a slow system after one second and a fast one after ten than a fast system after five. I would.

Surely the kernel could lazily page everything back in after resume anyway, leaving the immediate on-resume thrash to just those pages the awaking applications actually need right then?

Although, from the sounds of this thread, Linux's virtual memory management is seriously dumb. And seriously behind Windows.

(Reply to this) (Parent)(Thread)


[info]mjg59
2008-11-16 04:13 pm UTC (link)
Resuming graphical state requires asking every application to do a repaint, which in turn requires every visible application to have a moderately large chunk of its code paged back in. No, saving the contents of video memory is not an option.

(Reply to this) (Parent)(Thread)


[info]lionsphil
2008-11-16 06:55 pm UTC (link)
I bet that's still a lot less than all of every application, although I don't have any experimentation to support that. And is it still true in this new world of composited graphics? I thought handling exposes was now a case of the X server/compositing malarky just throwing its buffer of what the application painted last time at the graphics card (presumably as a texture, so it can be semitransparent, skewed, and on fire).

(Reply to this) (Parent)(Thread)


[info]mjg59
2008-11-16 07:08 pm UTC (link)
The off-screen pixmap is still stored in video memory, so you still need to generate a redraw. While you're still pulling in less of the application, you're also losing the benefit of a single large linear read. Once we're in an SSD world this will make less difference, but it's pretty unavoidable regardless of how good your VM implementation is.

(Reply to this) (Parent)(Thread)


[info]lionsphil
2008-11-16 08:39 pm UTC (link)
Ah, right; I thought it was into system memory.

(Reply to this) (Parent)

screenshot?
[info]gsbarbieri
2008-11-16 10:07 pm UTC (link)
AFAIK Mac OS X saves a screenshot of the screen just before the real hibernate. Usually it looks like the desktop lock, but if the user doesn't want/have one it will look like their desktop.

I'd really want that, it would make the process look less ugly, since instead of some garbage as we usually see now or maybe some text being displayed, we'd have an image and possible cursor moving... it's more user friendly IMO.

(Reply to this) (Parent)(Thread)

Re: screenshot?
[info]stsquad
2008-11-17 07:58 am UTC (link)
It's frustrating as hell though. When my partner's laptop does it I'm often sitting there tapping the touchpad, wondering where my cursor has gone, until it magically appears when things start actually running. It's sleight of hand really, to give the illusion of fast resume.

(Reply to this) (Parent)


[info]theraphim
2008-11-16 11:27 pm UTC (link)
As I said in my followup, usually we'll resume from memory anyway. It's only if/when we run out of battery that we'll need to resume from disk. But that won't happen too often, and in that case I think the user could tolerate an extra N seconds of cache warmup.

(Reply to this) (Parent)

Tuxonice
(Anonymous)
2008-11-17 04:11 pm UTC (link)
Before making any claims I'd suggest at least running a TuxOnIce kernel for a while, just as an input point.

(It works very well, it's fast, and you can choose how much of the cache it saves.)

And resuming without the cache is awful. It reminds me of Windows, which shows you the desktop but is unusable.

(Reply to this) (Thread)

Re: Tuxonice
[info]mjg59
2008-11-17 04:16 pm UTC (link)
While Tuxonice remains outside the kernel, it's fundamentally uninteresting. And design-wise, it's still stuck relying on the freezer.

(Reply to this) (Parent)(Thread)

Re: Tuxonice
[info]mr_figvam
2008-11-19 07:41 am UTC (link)
> While Tuxonice remains outside the kernel, it's fundamentally uninteresting.

What does it take to push Tuxonice into the kernel? Is it so fundamentally invasive that there is no hope of enticing the kernel powers that be to finally integrate it?
Also, if it will always be uninteresting outside the main kernel, isn't it a chicken and egg problem?

(Reply to this) (Parent)(Thread)

Re: Tuxonice
[info]nigelcunningham
2008-11-21 02:48 am UTC (link)
The invasive nonsense is just that.

The vast majority of TuxOnIce is adding new files. Apart from that, there's preliminary work for freezing fuse filesystems, which can be treated completely separately. The ugliest part is hooking into the existing swsusp code, which isn't even _that_ bad (though I could get rid of the four #ifdef CONFIG_TOI instances in kernel/power/disk and kernel/power/snapshot.c). In current head (and for a little while now), the changes to mm/vmscan have been reduced to adding two lines.

Regards,

Nigel

(Reply to this) (Parent)(Thread)

Re: Tuxonice
[info]mr_figvam
2008-11-21 05:38 am UTC (link)
Nigel, thanks for the explanation, and for all your efforts to drag Linux into the 21st century by making suspending "just work".

If only the ordinary Linux users had any say in what features to merge into the mainline tree and what not...

(Reply to this) (Parent)(Thread)

Re: Tuxonice
[info]nigelcunningham
2008-11-21 05:47 am UTC (link)
I think it's mostly my fault that it's not merged. I've consistently made reliability, speed and so on higher priorities than getting it merged. If I'd made different decisions, maybe it would already be in mainline.

In the end, though, we are where we are. I hope to start making a more concerted effort to get it merged soon. With the addition of uswsusp and kexec since 2.6.16 or thereabouts, I think the "we don't want another implementation" argument has lost its teeth. If I do a good job of pointing out the advantages over other implementations and keep listening to feedback, I don't see why it shouldn't get in. Still, we'll just have to wait and see, won't we?

(Reply to this) (Parent)

Re: Tuxonice
[info]nigelcunningham
2008-11-21 02:53 am UTC (link)
I agree about merging. I should really work harder on getting that (and intend to soon).

Regarding kexec though, don't forget that it is just a bigger version of the freezer, and comes with its own problems (eg the need to allocate a larger amount of memory than current hibernation code uses for the second kernel, and the loss of context when switching to the kexec'd kernel causes issues too). I'm not saying the existing freezer is great. What I am trying to say is that kexec isn't going to solve every problem either.

(Reply to this) (Parent)(Thread)

Re: Tuxonice
[info]mjg59
2008-11-21 03:08 am UTC (link)
Well, no. The freezer is required because you want to make an atomic copy of RAM, and in the process of doing that you need to free at least as much memory as each stage of that operation is going to take you. A kexec-based approach requires enough RAM to fit a kernel and the trivial quantity of userspace required. Even 2.6 kernels will run on embedded platforms with 16MB or so, so that's the amount of space we're looking at in an optimal situation.

Of course, generating that optimal situation is an interesting coding project. However, when faced with the problem of "How do I generate an atomic snapshot of existing system state", the obvious answer is "Don't execute any more code in the existing system". Which includes the kernel. Adopting a finer grained approach is intrinsically awkward, and as the kernel and userspace become more tightly entwined there are going to be more corner cases appearing and biting you.

(Reply to this) (Parent)(Thread)

Re: Tuxonice
[info]nigelcunningham
2008-11-21 04:44 am UTC (link)
Thanks for the reply.

TuxOnIce works slightly differently to [u]swsusp, at the cost of relying on the freezer more. The image is saved in two parts, with the first part being all of the page cache and process memory (except the userspace ui program, if used). We then reuse that memory as the target of the atomic copy, and so only need enough memory to do I/O, not enough for the atomic copy. (If the pagecache is smaller than the amount to be atomically copied, we allocate more, but this generally only happens just after boot).

(Reply to this) (Parent)

Fix current state first?
(Anonymous)
2008-11-18 12:57 pm UTC (link)
How about focusing on actually suspending with a reasonable speed? Currently Suspend to disk is SUPER slow. Takes 3+ min to suspend and 10+ min to resume (to full functionality)
This is probably a huge cause of both:
PM: Shrinking memory... done (326983 pages freed)
PM: Freed 1307932 kbytes in 81.72 seconds (16.00 MB/s)
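
For what it's worth, the two log lines agree with each other. A quick check, assuming 4KiB pages and that the rate is truncated rather than rounded (the truncation is an assumption that happens to reproduce the printed figure):

```python
PAGE_KB = 4  # assumption: 4KiB pages


def freed_kbytes(pages_freed):
    """Convert a page count into kilobytes."""
    return pages_freed * PAGE_KB


def rate_string(kbytes, centiseconds):
    """Format kB per second as 'X.YY MB/s' using integer truncation."""
    kb_per_s = kbytes * 100 // centiseconds
    return f"{kb_per_s // 1000}.{(kb_per_s % 1000) // 10:02d} MB/s"


kbytes = freed_kbytes(326983)      # pages freed, from the first log line
rate = rate_string(kbytes, 8172)   # 81.72 seconds, from the second log line
```

So 326983 pages is exactly the 1307932 kbytes reported, and spreading that over 81.72 seconds gives the 16.00 MB/s shown.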

(Reply to this) (Thread)

Re: Fix current state first?
[info]mjg59
2008-11-18 01:05 pm UTC (link)
The current suspend to disk implementation is fundamentally broken in an impossible to fix way. What you're seeing is one aspect of that.

(Reply to this) (Parent)(Thread)

Re: Fix current state first?
[info]nigelcunningham
2008-11-21 02:57 am UTC (link)
That's not necessarily a problem with suspend to disk. If it takes 81.72 seconds to free 1.3GB, the vm is having a hernia. Perhaps it shouldn't be asked to free 1.3GB, but without more info, we can't tell.

This is why TuxOnIce used to have that extra code in vmscan.c - a finer grained approach to freeing memory. The reason I got rid of it was that the vm code was becoming too complicated for me to keep forward porting the existing code and be sure I was doing it right.

(Reply to this) (Parent)

MacOS vs. Linux
[info]zbraniecki.myopenid.com
2008-11-24 12:49 am UTC (link)
I'm using MacBook Pro right now and I have both, MacOS and Linux distros on it. It's my first longer period of using Mac after years with Linux... I'm rather a pro user with some dev skills.

I'm not writing it to convince anyone, I'm just trying to share my experience and observations after 6 months of heavy usage.

What I can say is that the "close the lid and forget" thing from MacOS is the single most important reason why I use MacOS. It simply blows my mind how easy and *natural* it feels. I cannot overestimate it, and I believe it is impossible to explain to anyone who has not tried using it for a month or so. (I had also heard about it earlier and it did not impress me much.)

It improves my productivity, but what's more important is that it changes the role of my laptop. It really becomes a sort of 15" notebook with WiFi. I really can drag it out of my backpack on the bus stop, open the lid and check email (since WiFi is from mcdonalds at the bus stop) or see what is the map I opened before closing with the path I have to take. Or I can start coding, write down my last thoughts or whatever, and when I see the bus I simply close the lid and put my laptop into backpack again.

I do not wait. I do not think of it. It never happened to me for the laptop actually NOT to go into sleep. I do not wait until it suspends before I can turn it 90 degrees vertical and put into backpack. I simply do not care. And this "do not care" is the best blessing I could have. I really have (as we all do) too much on my head to once again start thinking on how my laptop feels and should I wait a minute until it turns off or not. I also don't want to check if my laptop will start doing its fsck while turning on. Overall. Booting on laptop is unpleasant also because of battery waste. It's so good not to think about it. At all...

That's what it looks like from the user POV:
1) MacBook with MacOS in suspend mode does not drain battery. At all. Well, maybe it does but it's totally not noticeable. I'm often putting my lap without charging for a day or a weekend and I turn it on and it *just works*. I do not care if I plugged it in or not. It just works. Always.
2) MacOS turns off the laptop BEFORE it actually reaches 0% and keeps it from turning on for a few minutes after I plug in the charger. I believe it's like that to ensure that the laptop has enough power to survive a few minutes, including the process of bringing it back to life. I don't mind at all.
3) If the battery runs down while in suspend (this happened several times when I drained the battery, then closed the lid and still did not plug it in for a few hours) I just plug in the charger, open the lid, wait like 2 minutes and my desktop with all data is back again.
4) Laptop goes into suspend mode in average 5-8 seconds. I have 4 GB of ram. It takes longer if I have a lot of apps and data loaded but it only goes to, say, 10 sec. I'm not sure how they did it, but it definitely does not dump 4GB on my hard drive during this "hybrid suspend" or whatever it is what MacOS does.

It's really amazing and hard to describe in words. It changes the way you use the laptop. I know that people are saying that it kills battery, may harm laptop itself, what if it doesn't work etc. So my response. It works. Always. You don't think about it. Today I was showing new OpenSUSE 11.1 booting screen and said that it looks so beautiful and that it looks better than MacOS booting screen. She said she does not remember how booting screen looks like on MacOS. She did not see it in a long time.

On my linux distros (opensuse, fedora, ubuntu 8.10) suspend works also reasonably well and pretty fast (I'd say, maybe 2 sec longer), so I believe that the only problem is with hibernation. And I believe, from my POV, that full suspend/hibernate/neverturnoff experience is crucial for the efficiency of laptop using (much more than desktop).

I did not use Vista, but I'm a bit surprised as it's the first time I've seen someone talking about this suspend feature and comparing Linux to Vista, not to MacOS. I have no idea if Vista is copying MacOS features or has some unique set, but I definitely recommend taking a close look at how MacOS works here while you think of approaches for Linux.

(Reply to this) (Thread)

Re: MacOS vs. Linux
[info]zbraniecki.myopenid.com
2008-11-24 12:49 am UTC (link)
btw. thanks for overall working on this. I can't wait the day when I'll be able to fully switch back to Linux without missing features. KDE4, kernel-mode-setting, overall desktop responsiveness and work of people like you on booting/shutdown/suspend/hibernate performance is exactly what I'm waiting for. Thank you! :)

(Reply to this) (Parent)

Really the solution is else where.
(Anonymous)
2008-12-06 10:16 pm UTC (link)
Suspend to disk in cgroups.

How is this different? Current suspend suspends the complete system, so background services like logging also get frozen.

Cgroups allow users to choose what they wish to be suspended to disk and what they wish to be restarted. The problem here is X11 integration. Cgroups currently don't support X11 applications well.

Suspend to RAM + targeted suspend to disk would be far more stable. Generally users stop most of their programs in their general operation. It's not like a user normally has Firefox open 24/7; they will close it at some point. Background services running 24/7 expose bugs, or bad hardware that does not support resume causes failures.

Ask yourself the question: what do users really want from a suspend to disk? The answer is the applications they were running, back in the same state.

http://oiaohm.blogspot.com/

(Reply to this) (Thread)

Re: Really the solution is else where.
[info]mjg59
2008-12-07 12:48 am UTC (link)
I basically agree - this was something I suggested in my LCA talk earlier this year. The problem is ensuring that you can restore all the kernel state associated with an application, which is a lot harder than is immediately obvious. For instance, sound drivers will hold state associated with the requested rate and sample size of the application holding it open. How do you save and restore that?

(Reply to this) (Parent)

