Agreed! However, the amount of maintenance you need to do because of the vulnerabilities is disappointing, especially given how convoluted the upgrade paths are: https://docs.gitlab.com/ee/update/#upgrade-paths
Thus, it might be better to look into a less popular stack that's less likely to be targeted, simply because it isn't such a juicy target.
For example:
Code: Gitea/Gogs/GitBucket
CI: Drone/Jenkins (okay there are probably better options than Jenkins, to be honest)
Registry: Nexus/Artifactory (not just for containers, they support most formats and have better control over cleanup of old data so you don't have to schedule GitLab cleanup yourself)
Of course, at the end of the day all of those still have an attack surface, so I'm leaning more and more into the camp of exposing nothing publicly, since it's a losing battle.
What maintenance?! I just bump the docker-compose.yml version numbers and stop/start the service. It's very painless... My cellphone gets more frequent updates than GitLab does.
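For anyone curious, the flow being described is essentially just editing the image tag in the compose file and recreating the container (the tag and service layout below are illustrative, not a recommendation):

```yaml
# docker-compose.yml fragment; bump the tag, then run `docker compose up -d`
services:
  gitlab:
    image: gitlab/gitlab-ce:14.6.5-ce.0   # <- the only line that changes per update
    restart: unless-stopped
```

After recreating the container, it's worth tailing the logs until the migrations have finished before calling the upgrade done.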
If you do that with minor versions, you should generally be fine. When you need to upgrade across major versions, you'll most likely be met with the following in case you haven't followed the updates closely:
> It seems you are upgrading from major version X to major version Y.
> It is required to upgrade to the latest Y.0.x version first before proceeding.
In addition to that, you should NEVER just bump versions without having backups (which you've hopefully considered). So there is probably another step in there: either validating that your latest automatic backups work, or just manually copying the current GitLab data directory into another folder (in the case of an Omnibus install), or doing the same for each component in the more distributed installation type.
Disclaimer: this has little to do with GitLab in particular, but is something you should consider with any and all software packages that you upgrade, especially the kind with non-trivial dependencies and data storage mechanisms, like PostgreSQL. Of course, you can always dump the DB, but it's easier to back up everything else as well by taking the instance offline and making copies of all container volumes/bind mounts.
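As a concrete sketch of the "copy the current data directory" step: stop the service, make a dated cold copy of the bind mount, then upgrade. A temporary directory stands in for the real data directory (e.g. /srv/gitlab) here, so the commands are safe to run anywhere:

```shell
# Cold-copy backup sketch - run this while the service is stopped.
set -eu
DATA_DIR="$(mktemp -d)"                          # stand-in for e.g. /srv/gitlab
echo "fake repo data" > "$DATA_DIR/data.txt"     # pretend this is GitLab's state
SNAPSHOT="${DATA_DIR}.pre-upgrade-$(date +%F)"   # dated snapshot next to the data dir
cp -a "$DATA_DIR" "$SNAPSHOT"                    # the actual backup step
ls "$SNAPSHOT"
```

If the copy is there and readable, you have a trivial rollback path: stop the new version, swap the directories back, and start the old image again.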
* I have automated backups created every 2 days, stored in S3; I've done a full restore twice in 5+ years of uptime.
* I run Gitlab at home and at work.
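For reference, an every-two-days backup-to-S3 schedule like that can be wired up with a cron fragment along these lines (the paths, bucket name, and SKIP list are all assumptions; `gitlab-backup create` is the Omnibus backup command):

```
# /etc/cron.d/gitlab-backup - illustrative schedule; paths and bucket are made up
0 3 */2 * * root docker compose -f /srv/gitlab/docker-compose.yml exec -T gitlab gitlab-backup create SKIP=builds,artifacts
30 3 */2 * * root aws s3 sync /srv/gitlab/data/backups s3://example-gitlab-backups/
```

One caveat: `gitlab-backup` does not include gitlab.rb or gitlab-secrets.json, so those need to be copied separately for a restore to actually work.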
None of the points touch on a maintenance burden... Just saying. Skipping versions while updating any software is just being a lazy sysadmin and praying it works. Skipping major versions during upgrades typically comes with breaking changes, so operator beware.
> Skipping versions while updating any software is just being a lazy sysadmin and praying it works. Skipping major versions during upgrades typically comes with breaking changes, so operator beware.
It's nice that GitLab actually prevents you from doing that, gives you messages that it's unsupported, and directs you to their documentation, which describes the supported upgrade paths...
However, at the same time one cannot help but wonder why you can't go from version #1 to version #999 in one go. Most of the software at my dayjob (at least the software that I've written) absolutely can do that, since the DB migrations are fully automated - even if I had to push back a lot against other devs going: "Well, it would be easier just to tell the clients who are running this software to do X manually when going from version Y to Z."
But GitLab's updates are largely automated (the Chef scripts and PostgreSQL migrations etc.), it's just that for some reason they either don't include all of them or require that you visit certain key points throughout the process, which cannot be skipped (e.g. certain milestone releases, as described in their docs).
Of course, I acknowledge that it's extremely hard to sustain backwards compatibility, and I've seen numerous projects start out that way only for the devs to give up on the idea at the first sign of difficulty - it's not like they care much for it, and it doesn't always lead to a clear value add. It's a nice-to-have, and they won't earn any less of a salary for making some ops person's life harder down the line.
I also have automated backups with BackupPC; however, I expect software to remain reasonably secure and stable without having to update that often - props to GitLab for disclosing the important releases, but I'm migrating over to Gitea for my personal needs as we speak, even if having someone else manage a GitLab install at work is still like having a superpower (with GitLab CI, GitLab Registry etc.).
I actually wrote an article about how really frequent updates cause problems and lots of churn: https://blog.kronis.dev/articles/never-update-anything (though the title is a bit tongue in cheek, as explained by the disclaimer at the top of the article).
Your DB migrations may support updates from #1 to #99, but your OS does not directly support updates of MySQL 5 to MySQL 8 without issues. For example, there are plenty of examples of deprecated my.cnf configuration values. Similarly, APT on Ubuntu will prompt you on how to handle a my.cnf that differs from the distribution's release when upgrading versions. Oftentimes this is more painful than minor version updates.
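A classic example of this is the query cache, which was removed entirely in MySQL 8.0 - a my.cnf carried over from a 5.x install can stop the new server from starting at all:

```ini
# my.cnf fragment that is valid on MySQL 5.7 but fatal on MySQL 8.0
[mysqld]
query_cache_type = 1
query_cache_size = 64M   # "unknown variable" on 8.0 - mysqld exits on startup
```

So even a flawless schema migration doesn't help if the server process itself refuses to boot with the old configuration.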
I think the version milestones in GitLab are akin to dependency changes for self-hosted GitLab. An example is the GitLab v9 (?) Postgres upgrade to Postgres v11, I think; it was opt-in for a prior version of GitLab, then required at that version milestone. It's difficult to write DB migration scripts for GitLab, as in your example, that may depend on newer Postgres idioms not available in the legacy DB version. So you can't simply support GitLab updates from version X to Y, due to underlying dependency constraints...
> Your DB migrations may support updates from #1 to #99, but your OS does not directly support updates of MySQL 5 to MySQL 8 without issues.
That's just the thing - more software out there should have a clear separation between the files needed to run it (binaries, other libraries), its configuration (either files, environment variables or a mix of both) and the data that's generated by it.
The binaries and libraries can easily have breaking changes and be incompatible with one another (essentially treat them as a blob that fits together, though dynamic linking muddies this). The configuration can also change, though it should be documented, and the binaries should output warnings in the logs in such cases (like GitLab already does!). The data should have extra care taken to keep it compatible between most versions, with at least forward-only migrations available in all other cases (since backwards-compatible migrations are just too hard to do in practice).
Alas, I don't install most software on my servers anymore, merely Docker (or Podman - basically any OCI-compatible technology) containers with specific volumes or bind mounts for the persistent data. GitLab is pretty good in this regard with its Omnibus install, though there are certainly a few problems with it if you try to do too many updates or have a non-standard configuration.
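In container terms, that separation is just a matter of which paths get bind mounts. For the Omnibus image it boils down to something like the following (the container paths follow the image's documented layout; the host paths are examples):

```yaml
# docker-compose.yml sketch for an Omnibus install with externalized state
services:
  gitlab:
    image: gitlab/gitlab-ce:latest
    volumes:
      - /srv/gitlab/config:/etc/gitlab     # configuration (gitlab.rb, secrets)
      - /srv/gitlab/logs:/var/log/gitlab   # logs
      - /srv/gitlab/data:/var/opt/gitlab   # persistent data, incl. the embedded PostgreSQL
```

With that in place, the container itself is disposable - upgrades replace the binaries/libraries blob while configuration and data stay put on the host.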
Of course, I'll still use GitLab at my company, because there it's someone else's job to keep it running, hopefully with an appropriate amount of resources to keep it that way with minimal downtime and all the relevant updates. But at the same time, in certain circumstances (like my memory-constrained homelab setup), it makes sense to look into multiple lightweight integrated solutions.
In my case, what broke while doing updates was seemingly something cgroups-related, with Gitaly not having the write permissions it needed inside of the container, which later led to the embedded PostgreSQL failing catastrophically. In comparison, right now I just have Gitea for similar goals - a single binary that uses an SQLite database - as well as the other aforementioned tools for CI and storage of artefacts, which are similarly decoupled.
It's probably all about constraints, drawbacks and finding what works for you best!
Well, I don't think that there's actually much that can be done here, since that page does contain adequate documentation and a linear example migration path:
It's just that the process itself is troublesome, in that you can't just go from, let's say, 11.11.8 to 14.6.5 in one go and let all of the thousands of changes and migrations be applied automatically with no problems, as some software out there attempts to (with varying degrees of success).
Of course, that's probably not viable due to the significant changes that the actual application undergoes, and therefore one just needs to bite the bullet and deal with either constant small updates or fewer but longer updates for private instances.
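To make the "fewer but longer updates" option concrete, a cross-major upgrade ends up being a loop over required stops rather than one jump. The tags below are purely illustrative stand-ins - consult the official upgrade path documentation for the real list:

```shell
# Hypothetical required stops between two majors; the tags are placeholders,
# not the documented path - check the upgrade docs before doing this for real.
UPGRADE_STOPS="11.11.8 12.0.12 12.10.14 13.0.14 13.12.15 14.0.12 14.6.5"
for TAG in $UPGRADE_STOPS; do
  # For each stop: edit the image tag, `docker compose up -d`, then wait for
  # background migrations to finish before moving on to the next one.
  echo "next stop: gitlab/gitlab-ce:${TAG}-ce.0"
done
```

Each stop has to fully complete its migrations before the next tag is started, which is exactly why the whole thing can't be collapsed into a single pull.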
But hey, thanks for creating the issue and best of luck!
Side question: is the GitLab Docker registry really that useful?
The fact that any non-protected runner can push to it makes it useless for storing images to be used in other CI pipelines (unless I missed something, in which case I would be really glad).
The registry is fine for our use case, which is to manage all the dev artefacts on a private repo before CI might push a release to a production registry.
> any non-protected runner can push to it
My understanding is the job token inherits its permission set from the user causing the job to run. If the user has `write_registry` to a project (developer up), then the job does. Do you see more access than that?
Access can be limited to specific projects by setting a scope [0], but your description sounds like it's access within the same project that is the issue.
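For context, the way a job gets registry access is via the ephemeral CI_JOB_TOKEN, whose registry permissions mirror those of the user who triggered the job - e.g. in a .gitlab-ci.yml along these lines (the CI_REGISTRY* variables are provided by GitLab itself; the job name and image tags are examples):

```yaml
# .gitlab-ci.yml fragment - the job token expires when the job ends, and its
# registry access follows the triggering user's permissions on the project
build-image:
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_JOB_TOKEN" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```

So the push rights aren't really a property of the runner at all, which is why protecting the runner alone doesn't restrict who can write to the registry.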
The whole point of having protected runners is to do things that developers are not allowed to do. If any developer can push images to the registry without any review/approval, and those images are used in other CI pipelines, that's a problem for us.
Having a separate production registry is good indeed, but for images to be used by CI itself, having something self-contained within GitLab would have been nice.