> *Anything else also would be insanely hard.* checkpoint restore of gpu state i...

LadyCailin · on Feb 27, 2022

You mean for the same GPU? Or across GPU brands and stuff?

namibj · on Feb 27, 2022

Even the latter, as long as you're doing something like restricting yourself to a shared subset of OpenGL.

You really just need to dump textures and set up a matching context, which at worst should require an indirection shim to translate handles if they are stored by the application but generated by the driver with no way to force custom values during object creation (like how Linux allows custom PIDs for checkpoint/restore and other replay tech like rr).
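A minimal sketch of such an indirection shim, assuming a hypothetical `driver_create` callback that stands in for driver-side object creation (it is not a real OpenGL call). The application only ever sees the shim's own stable handles, so a restore can rebuild the mapping against whatever handles the new driver hands out:

```python
import itertools


class HandleShim:
    """Maps stable application-visible handles to driver-generated ones."""

    def __init__(self, driver_create):
        # driver_create is a hypothetical stand-in for driver object creation;
        # it returns a fresh handle whose value the shim cannot choose.
        self._create = driver_create
        self._app_to_driver = {}
        self._next_app_handle = 1

    def create(self):
        """Create a driver object and return the stable handle the app stores."""
        app_handle = self._next_app_handle
        self._next_app_handle += 1
        self._app_to_driver[app_handle] = self._create()
        return app_handle

    def resolve(self, app_handle):
        """Translate an app-visible handle to the current driver handle."""
        return self._app_to_driver[app_handle]

    def checkpoint(self):
        """Record only what the application can observe: its own handles."""
        return {"handles": sorted(self._app_to_driver),
                "next": self._next_app_handle}

    @classmethod
    def restore(cls, snapshot, driver_create):
        """Recreate every object on a new driver, keeping app handles fixed."""
        shim = cls(driver_create)
        shim._next_app_handle = snapshot["next"]
        for app_handle in snapshot["handles"]:
            shim._app_to_driver[app_handle] = driver_create()
        return shim


# Two fake "drivers" that hand out unrelated handle values.
old_driver = itertools.count(100)
shim = HandleShim(lambda: next(old_driver))
tex_a = shim.create()
tex_b = shim.create()
snapshot = shim.checkpoint()

new_driver = itertools.count(7000)
restored = HandleShim.restore(snapshot, lambda: next(new_driver))
```

After the restore, `tex_a` and `tex_b` are still valid from the application's point of view, but they resolve to different driver handles. The sketch covers handle translation only; re-uploading texture contents is a separate step.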

account42 · on March 2, 2022

"a matching context" means having exactly the same handles for all objects (since application memory will have references to them) and with modern-enough OpenGL even means having the same (virtual) memory addresses for all GPU-side buffers as well as the same addresses for any host-side mappings. This is probably not something you could build on top of OpenGL or even on top of the existing kernel-side drivers.

Even if you have a perfect wrapper that achieves all that, you now have to save all application-supplied texture and shader data in its original form, even where that memory could normally be freed after translating it to GPU-specific formats.

But "restricting yourself to a shared subset of OpenGL" alone already makes this unviable for anything demanding, as it also means restricting yourself to the lowest common denominator for all limits, including VRAM size and maximum texture sizes, essentially guaranteeing that your solution runs worse on both systems.
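To make the lowest-common-denominator point concrete, here is a small sketch. The limit names and numbers are made up for illustration and do not describe any real GPU:

```python
def shared_limits(*devices):
    """Return the limits a portable checkpoint could rely on across devices.

    Only limits reported by every device are kept, and each one is clamped
    to the smallest value any device reports.
    """
    common_keys = set(devices[0]).intersection(*devices[1:])
    return {key: min(dev[key] for dev in devices) for key in sorted(common_keys)}


# Hypothetical devices: each is better than the other in one dimension.
gpu_a = {"max_texture_size": 16384, "vram_mib": 8192}
gpu_b = {"max_texture_size": 8192, "vram_mib": 16384}

limits = shared_limits(gpu_a, gpu_b)
```

Here `limits` ends up with the smaller texture size of `gpu_b` and the smaller VRAM of `gpu_a`, so the application is constrained below what either device could do on its own.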

namibj · on March 3, 2022

Ehh, no need to save the application-supplied data in RAM. Also, I expect them to be typically supplied pre-compressed, not on-the-fly-converted (at least where performance matters).

The texture size issue I don't see as a problem (I thought we grew out of that being an issue years ago?), and the VRAM issue shouldn't be that hard, as games tend to not use spare VRAM for dynamic time-memory tradeoffs (unlike OS kernels with their pagecache).

On top of that, it's not like you need this functionality to deliver top performance from games that don't want to play their part to make this efficient. Working seamlessly at "just" 30% less FPS should already be very useful.

Hello71 · on Feb 28, 2022

rr doesn't need or use checkpoint/restore functionality. it's insufficient because rr already needs to catch and emulate all syscalls made by a process, and unnecessary because once you're doing that it's (relatively) easy to also emulate getpid. that functionality also requires CAP_SYS_ADMIN or CAP_CHECKPOINT_RESTORE.

account42 · on March 2, 2022

GPU is just one part, there are many more. As soon as you have network connections in play the state you'd need to restore isn't even all on the device.