An annoyance with Debian postinstall scripts during package upgrades (utcc.utoronto.ca)
52 points by zdw on Jan 10, 2022 | hide | past | favorite | 37 comments


It is required by Debian Policy §6.2 (https://www.debian.org/doc/debian-policy/ch-maintainerscript...) that package maintainer scripts be idempotent.

A maintainer could, theoretically, figure out whether it's an upgrade or not by checking whether a version is being passed into it. postinst configure is only called with a version on upgrade. See: https://www.debian.org/doc/debian-policy/ap-flowcharts.html
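A minimal sketch of that check (the mode_for helper is made up for illustration; what's real is that dpkg passes the most-recently-configured version as the second argument to "postinst configure" on upgrade, and no version on a fresh install):

```shell
#!/bin/sh
# Sketch: dpkg invokes "postinst configure <most-recent-version>" on
# upgrade and "postinst configure" with no version argument on a fresh
# install. The mode_for helper is hypothetical, for illustration only.
mode_for() {
    action=$1
    prev_version=$2
    if [ "$action" = configure ] && [ -n "$prev_version" ]; then
        echo upgrade
    else
        echo fresh-install
    fi
}

mode_for configure 1.2.3-1   # upgrade path: a previous version exists
mode_for configure           # fresh install path: no previous version
```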


1) that's actually reasonable and I believe that all Debian-maintained packages follow this, but unfortunately some third-party, usually proprietary, packages don't.

2) Huh, so it's actually there, just isn't explicit, probably because of their requirement of idempotency.


I'm imagining it's also possible that there could be bugs in a postinstall script that make it behave differently during an upgrade.

This is actually somewhat an argument that postinstall scripts are maybe a bad idea. They're quite hard to test. It's better to have the package manager simply manipulate files in a declarative fashion than have it run package-provided code that may be buggy and introduce issues.


Of course it'd be better, but I don't think a simple declarative language could easily cover the breadth of what is currently done in postinstall scripts.


From:

> Linux distributions: Can we do without hooks and triggers? (2019)

https://michael.stapelberg.ch/posts/2019-07-20-hooks-and-tri...

> If we want to get rid of hooks, we need another mechanism to achieve what we currently achieve with hooks.

> If the hook is not specific to the package, it can be moved to the package manager. The desired system state should either be derived from the package contents (e.g. required system users can be discovered from systemd service files) or declaratively specified in the package build instructions—more on that in another blog post. This turns hooks (arbitrary code) into configuration, which allows the package manager to collapse and sequence the required state changes. E.g., when 5 packages are installed which each need a new system user, the package manager could update /etc/passwd just once.

> If the hook is specific to the package, it should be moved into the package contents. This typically means moving the functionality into the program start (or the systemd service file if we are talking about a daemon). If (while?) upstream is not convinced, you can either wrap the program or patch it. Note that this case is relatively rare: I have worked with hundreds of packages and the only package-specific functionality I came across was automatically generating host keys before starting OpenSSH’s sshd(8).
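The "turn hooks into configuration" idea from the first quote can be sketched in a few lines: scan systemd unit files for User= lines and build one deduplicated list of required system users, so the equivalent of /etc/passwd is updated once per transaction instead of once per package. The unit files and user names below are invented for the demo.

```shell
#!/bin/sh
# Sketch of "derive desired state from package contents": collect the
# User= lines from a set of systemd unit files and deduplicate them,
# so the account database could be updated in a single pass.
# Paths and user names are made up for this demo.
UNITDIR=$(mktemp -d)
printf '[Service]\nUser=foo-daemon\n' > "$UNITDIR/foo.service"
printf '[Service]\nUser=bar-daemon\n' > "$UNITDIR/bar.service"
printf '[Service]\nUser=foo-daemon\n' > "$UNITDIR/foo-worker.service"

# Three units, but only two distinct users -> two user creations, once.
needed_users=$(sed -n 's/^User=//p' "$UNITDIR"/*.service | sort -u)
echo "$needed_users"
```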


What you're asking for is Configuration Management, which is even worse than buggy postinstall scripts.



> A maintainer could, theoretically, figure out whether its an upgrade or not by checking to see if a version is being passed into it.

Or use good old lock files in the software's data directory to determine what the current state is (e.g. /var/lib/mysql/debian-10.3.flag) and which updates need to be run.

Lock files aren't rocket science.
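A sketch of that flag-file pattern: run a migration only once, then record completion, so re-running the postinstall is harmless. The state directory and migration name here are made up (MySQL's real flags live under /var/lib/mysql).

```shell
#!/bin/sh
# Sketch of the flag-file pattern: the migration runs only if its
# marker is absent, then the marker is created, making reruns no-ops.
# The state directory is a temp dir for this demo.
STATEDIR=$(mktemp -d)

run_migration_once() {
    flag="$STATEDIR/debian-$1.flag"
    if [ -e "$flag" ]; then
        echo "skip $1"
    else
        echo "run $1"
        # ... real migration work would happen here ...
        touch "$flag"
    fi
}

run_migration_once 10.3   # first call: prints "run 10.3"
run_migration_once 10.3   # rerun: prints "skip 10.3"
```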


And the install process may hang / crash / lose power etc. There are lots of reasons for one to rerun a post install that aren’t an upgrade.


Package management on Linux is a real mess. It works great if you're running a server with standard software from official distro repos that is infrequently updated.

As soon as you go outside that, it quickly goes wrong in my experience (I've been using Linux since around 1999). Recently my Raspberry Pi media server's packages completely blew up because a third-party repo messed something up, which has now left a half-installed, broken package that I cannot figure out how to fix. This has happened basically every time I have used desktop Linux with any third-party PPAs/repos, to varying degrees of brokenness.

This has _never_ happened to me on Windows/Mac (at least since the NT4 days - obviously software has broken things, but I've never managed to break the software installation system this badly on other OSs). Even Homebrew, which I have used in anger, seems to break far far less often than apt/dpkg despite having much more bleeding-edge software.


You mention a server, but just putting it out there: your distro can make the whole difference there. I've gone through various distros and platforms over the years. My personal choice for server/headless - Debian - would not be something I run on a desktop these days. Many times on a desktop you want to run specific versions, apply specific patches, run a specific kernel, etc. Arch, Gentoo or NixOS may be a better pick.

Or just run OCI containers. If you don't want to bother with building your own, linuxserver.io has a decent library.

When running an underserved platform/use-case combo (Plex/Jellyfin on Debian on arm64), I've found the smoothest route is often to build the .deb packages yourself (in a proper automated repo if you want to scale, installed locally with dpkg if it's not too much work). Much easier than chasing dubious PPAs that may or may not work for your setup.


The problem is a lot of 3rd party software requires a 'mainstream' distro, especially commercial software.

Regardless, my point isn't that it's not possible to solve it, which it potentially is with enough work, but the out of the box experience of the most popular Linux distros is very poor for software management for desktop/fast moving software.

This really hasn't improved much in the 20+ years I've used Linux.


If you want software outside the official repo that isn't self contained you should use a container or build it yourself and install into the /usr/local tree. Don't cross the streams.


It works so much better with Arch Linux than with Debian, though.


Rebuilding an official or unofficial package on Arch is so refreshingly simple. On Debian or Ubuntu, it seems significantly harder.

Arch is basically superior to Gentoo for me in that respect. Things are mostly binary, except when I choose to build from source. For someone who doesn't want to tweak compile-time flags, it's a good distro.


> Package management on Linux is a real mess. It works great if you're running a server with standard software from official distro repos that is infrequently updated.

> As soon as you go outside that it quickly goes wrong in my experience (I've been using Linux since around 1999).

My entry to Linux was slightly later, but I have the same opinion. What's worse is that it is maddening when you try to express this to the Linux Desktop community: they will tell you it's 'insecure' to use third-party packages and you shouldn't do it[0], or that you should put everything in a container[1], or explain that because freedom, you can always compile from source[2]!

We are fortunate that parts of the community are finally waking up to the problem and trying to deal with it with things like AppImage[3] and Flatpak[4], at least for desktop software.

[0] There isn't really any greater theoretical insecurity with third-party packages than with first-party ones, but more importantly I'm not going to refrain from running software I need to run because it hasn't been blessed by some volunteer third party who almost certainly hasn't actually reviewed the code anyway. If the alternative is not running the software then I'll take the risk; it is mine to take.

[1] Ever wonder why other desktop operating systems don't need to do this? It's adding a bunch of unnecessary complexity and extra management to what should be one of the primary things one can do with a personal computer!

[2] Leaving aside that that is a pretty ridiculous expectation, build environments are often complicated enough that projects distribute theirs in a container to avoid dealing with all the problems that could be caused because of conflicts and missing or out of date packages. Basically the same problems as compiled software!

[3] AppImage has been around a long time and so had its earlier incarnations (ROX AppDirs and Klik). I love it, but it isn't as popular as Flatpak seems to be.

[4] I have my issues with implementation, but at least it is dealing with a lot of the traditional model's problems including finally allowing me to install software to a different disk. Still has a ways to go to become ubiquitous, but it is making a lot of headway.


Agreed entirely. I was actually expecting to be massively downvoted for this post.


> It works great if you're running a server with standard software from official distro repos that is infrequently updated.

It's not frequent updates that screwed you, it's using third-party repos. openSUSE Tumbleweed is an RPM-based rolling release distro with up-to-date software that I've found to be very suitable, and stable, for desktop use. Before switching to Tumbleweed, I used Debian Sid for a long time. Contrary to Sid's 'unstable' moniker, it was always very stable for me when I wasn't pulling in third-party repos or trying to mix Sid with Stable (the notorious 'frankendebian').


By "infrequently updated", I mean software where it is "ok" to be running versions released months/years ago in the official repos and not having to resort to 3rd party repos to get more up to date software.

Nearly everyone has to use 3rd party software at some point though. There just isn't a good solution for this on Linux. Too many people assume Linux users only use open source software that is in the official package repos. It totally excludes commercial software, niche stuff, etc.


Why is anyone still using packages for bloated software like plex? Pull the docker container and call it a day.


As a Gentoo user, this postinstall behavior on Debian/Fedora/etc was rather shocking to me when I first encountered it. With Gentoo's Portage, either when initially installing a package or when upgrading it, nothing under /var/lib is created or modified, and the daemon itself is never automagically started or stopped. The postinstall behavior actually feels like a regression toward a Windows-style installer.

The Cassandra Debian package is a good example. If I let the postinstall script run, it will use the default config to start a single-node Cassandra cluster, creating a bunch of stuff under /var/lib/cassandra, which then needs to be removed first if I want to build a proper image from the disk. Doing the removal as part of the automation feels too risky to me, as it may only take one typo to accidentally delete /var/lib/cassandra in production. Instead, I usually just put a policy-rc.d script in place to prevent the postinstall scripts from starting services.
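The policy-rc.d interface is simple: invoke-rc.d consults /usr/sbin/policy-rc.d before acting on a service, and exit status 101 means "action forbidden". A sketch of a deny-all policy, written to a temp path here rather than the real location:

```shell
#!/bin/sh
# Sketch: a deny-all policy-rc.d. invoke-rc.d runs this script before
# starting/stopping services from maintainer scripts; exit status 101
# means "action forbidden", so postinst can't start the daemon.
# Written to a temp path for the demo; the real file is
# /usr/sbin/policy-rc.d.
POLICY=$(mktemp)
cat > "$POLICY" <<'EOF'
#!/bin/sh
exit 101
EOF
chmod +x "$POLICY"

"$POLICY" cassandra start
echo "policy-rc.d exit status: $?"   # prints "policy-rc.d exit status: 101"
```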


It's worth noting that Fedora/RHEL's postinstall is generally less intrusive than Debian's, though.

For instance, it doesn't automatically create databases or start/enable services.


Debian also has debconf, which attempts to be a universal configuration management system, so it'll generate and re-generate configuration files from templates. Unfortunately, in my experience it gets in my way more often than it helps since I usually have separate automation that configures the software anyway.

I use Debian-based systems pretty rarely, so I still haven't figured out how you actually make dpkg regenerate your configuration files properly if you use automation to change a debconf selection and need to reconfigure a package. dpkg-reconfigure seems not to do what I want and just restores the previous value (or prompts interactively, which is not useful) for some reason.


People expect `apt install package` to leave the package installed and running. That expectation does change from one distro to another. Slackware was one that wouldn't even create a start script for a daemon.


As a side note, I switched from Debian to Fedora recently and I was amazed at just how much faster dnf (rpm) is compared to apt (deb) in almost every way. Downloading packages is faster and there are fewer broken mirrors. Actually running the transactions is much faster. Oh, and dnf seems to be able to handle the problem where you have multiple kernels updating or installing and each one has to regenerate the initrd.


dnf has support for using Btrfs's copy-on-write to make all the disk I/O more efficient. That might be why you're seeing dnf as more efficient, because my somewhat outdated experience is totally the opposite.


YUM/DNF (and RPM itself) are very different beasts from what they used to be some time in the past. For some reason, updating repository metadata is still slow compared to APT, but installing packages is speedy enough. However, in terms of features it's just... better.

Two things that I especially appreciate: I can install things without knowing the package names, e.g. "dnf install /usr/bin/foot 'perl(XML::Tiny)' 'pkgconfig(wayland-client)' libaudit.so.1" will just work. These are especially useful when packaging stuff yourself, because you can use these indirect references in your dependency list so it doesn't matter which package actually provides them.

I can also "sync" my system to a set of repositories with distro-sync, downgrading and/or replacing packages as necessary. I've had a few times with Ubuntu that I wished APT had this.

I also like that it does not have debconf. I don't have much love for debconf.


As I understand it the CoW support is for (optionally) taking filesystem snapshots before/after applying package upgrades, I don't think it inherently makes disk I/O more efficient. There are some other neat tricks that dnf does though. For example, in my experience dnf/rpm are smarter in the case where N packages need to run the same post-install step and run it once instead of apt/dpkg which might run it N times (examples: updating the man database, updating icon caches, updating ldconfig cache). And as the article points out, rpm packages are smarter about skipping post-install steps altogether for package updates.
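The rpm mechanism behind the "run it once" behavior is file triggers. A hedged sketch of how a man-db-style package might declare one in its spec file (syntax per rpm's file-trigger feature; the actual Fedora scriptlet may differ):

```spec
# Runs once at the end of the whole transaction for all packages that
# installed files under /usr/share/man, instead of once per package.
%transfiletriggerin -- /usr/share/man
/usr/bin/mandb -q || :
```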

In general it's my impression that dnf/rpm have been much more actively developed in the past few years than apt/dpkg. Depending on how long it's been since you last used it, they've made changes to the manifest format, the internal rpm database format, and rewritten a lot of parts of dnf in C for speed, and all of these have made a huge impact on the speed and usability of dnf.


If your outdated experience is with YUM as opposed to DNF, that could be why. DNF is much faster than YUM. However if you have used DNF then idk.


Using deltas for updates is a huge win for Fedora. On distros without delta updates, it's always a real pain in the ass when Wesnoth updates and you have to download half a gig of assets that are probably only a few kilobytes different from the last version.


A related gripe I have with Debian-based distros is that APT shuts down daemons and keeps them down for an unreasonably long time while it updates seemingly unrelated packages.

This is especially annoying on desktop Ubuntu boxes with automatic updates turned on. You're working on some frontend code and bam! nginx is down, the database is down.

It should only take a second to restart foo.service after updating foo. Why shut it down so early and wait while APT is updating the kernel and a whole bunch of other packages? Probably because shutting everything down makes it easier for the postinstall script to do a clean job.

YUM/DNF don't do this. They just update things and let me restart things at my leisure.


> Probably because shutting eveyrthing down makes it easier for the postinstall script to do a clean job.

It's also safer, at least for software that spawns instead of forks child processes - think of php-fpm pools, for example. You might end up with (let's say) nginx having one half of the pool on php8.0.0 and one half on php8.0.10 otherwise.

As for automatic updates... I like the way macOS operates; it asks the user for confirmation before applying updates. Fully automated updates are a bad idea, sigh.


> APT shuts down daemons and keeps them down for an unreasonably long

Are you doing live upgrades on production servers? Don't.

If you are doing updates anywhere else a couple of seconds of downtime is negligible.


Who said it was a live server? And it can take much longer than a couple of seconds. Did you even read the parent comment?


dnf has recently added automatic restart support of systemd system services (and soon user services). It's opt-in per package, but expect things to restart more in the future.


Restart is fine. It only takes a fraction of a second.

But APT doesn't do restarts. It stops services and doesn't start them back up until it's finished with all the other packages. This can take a few minutes if there are kernel updates that trigger multiple initramfs rebuilds, or indefinitely long if the update process throws up a dialog to confirm something.

OP makes it clear why this is happening. Packages have no concept of an upgrade, so APT stops services before removing the old version, and starts them back up after installing the new version.


I have a use case to run both ntpd and chronyd NTP daemons.

Debian will not allow me to install both: installing one removes the other.

