
I have my group's internal Jenkins service hosted on a single-node EC2 instance (t2.medium) running Kubernetes, and I would echo all of the advice you're getting. Kubeadm, definitely. And moreover, don't call it production-ready.

A production-ready cluster has dedicated master(s), period. To get your single-node cluster to schedule "worker" jobs at all, you're going to remove the "dedicated" taint, which signals that the node is no longer reserved for "master" or kube-system pods. That means that if you do your resource planning poorly with limits and requests, you can easily swamp your "production" cluster and put it underwater until a reboot.
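For reference, a sketch of the two pieces involved: untainting the master, and setting default requests/limits so a runaway pod can't starve the control plane. The taint key and the numbers are assumptions; the key changed across versions (1.5-era kubeadm used `dedicated`, later versions `node-role.kubernetes.io/master`).

```shell
# Make the lone master schedulable for regular workloads.
# The trailing "-" removes the taint; the exact key depends on your version.
kubectl taint nodes --all node-role.kubernetes.io/master-

# Give every container in the namespace default requests/limits so one bad
# Jenkins job can't take the API server down with it. Numbers are
# illustrative for a t2.medium, not a recommendation.
kubectl apply -n default -f - <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: container-caps
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 250m
      memory: 256Mi
    default:
      cpu: "1"
      memory: 1Gi
EOF
```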

(The default configuration of a master ensures that worker pods don't get scheduled there, which makes it harder to accidentally swamp your cluster and break the Kube API; the trade-off is that the node won't do anything but basic Kubernetes API stuff.)

If things go south, you're going to run `kubeadm reset` and `kubeadm init` again, because that's far faster than any debugging you might attempt, and you're losing money while you try to figure it out. That's not a production HA disaster-readiness or recovery plan.
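The nuke-and-pave flow mentioned above looks roughly like this; the flags are a sketch and vary across kubeadm versions, so check `kubeadm init --help` on yours:

```shell
# Tear the node down and stand it back up.
sudo kubeadm reset   # wipes /etc/kubernetes, etcd data, and CNI state
sudo kubeadm init    # re-bootstraps the control plane on this node

# Point kubectl at the fresh cluster.
mkdir -p "$HOME/.kube"
sudo cp /etc/kubernetes/admin.conf "$HOME/.kube/config"
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
```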

But it 100% works. Practice it well. Jenkins with the kubernetes-plugin is awesome, and if I have a backup copy of the configuration volume and its contents, I can start from scratch and be back to exactly where I was yesterday in about 15-20 minutes of work.
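A minimal sketch of that backup-and-restore idea, using a throwaway directory to stand in for the real volume (commonly `/var/jenkins_home`); the paths and the exclude list are assumptions:

```shell
set -eu
DEMO="$(mktemp -d)"
JENKINS_HOME="$DEMO/jenkins_home"

# Stand-in for the real volume: one config file, one job, one workspace.
mkdir -p "$JENKINS_HOME/jobs/build-app" "$JENKINS_HOME/workspace/build-app"
echo '<jenkins/>' > "$JENKINS_HOME/config.xml"

# Back up everything except workspaces, which are reproducible and large.
BACKUP="$DEMO/jenkins-home-backup.tar.gz"
tar --exclude='./workspace' -czf "$BACKUP" -C "$JENKINS_HOME" .

# Restore onto a "fresh" volume.
RESTORE="$DEMO/restored"
mkdir -p "$RESTORE"
tar -xzf "$BACKUP" -C "$RESTORE"

ls "$RESTORE"   # config.xml and jobs/, but no workspace/
```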

My 1.5.2 cluster's SSL certificate expired a few weeks ago, on the server's birthday. After several hours spent reconciling how SSL certificate management has changed, hunting for the proper documentation on rotating the certificate in that ancient version, and weighing what an upgrade would actually mean (read: figuring out how to configure or disable RBAC, at the very least)... I conceded that it was easier to execute the "DR-plan Lite" we had discussed, went ahead and reinstalled over the same instance "from scratch" again with v1.5.2, and got back to work in short order.

I've spoken with at least half a dozen people who said administering Jenkins servers is an immeasurable pain in the behind. I don't know if that's what you intend to do, but I can tell you that if it's a Jenkins server you want, this is the best way to do it, and you will be well prepared for the day when you decide it really needs more worker nodes. It was easy to deploy Jenkins from the stable Helm chart.
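For that era (Helm 2, the since-deprecated `stable` repo), the deploy was roughly a one-liner; release name and values here are assumptions, so see the chart's README for the real knobs:

```shell
# Install Jenkins from the stable chart as a release named "ci".
helm install stable/jenkins --name ci \
  --set persistence.existingClaim=jenkins-home
```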



I've done a number of 1.5-to-1.9 migrations. If you need help figuring out which API endpoints etc. have changed, I can give you some guidance if you ping me on k8s Slack: mikej.

Once you get onto 1.8+ with CRDs, you can manage your SSL certs automatically via Jetstack's cert-manager: https://github.com/jetstack/cert-manager/tree/master/contrib...
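The gist is declarative: you create a Certificate resource and cert-manager keeps the underlying Secret issued and renewed. A hedged sketch using the current API group (the group and version have changed over the project's life, so check your installed CRDs); the names and hostname below are placeholders:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: jenkins-tls
  namespace: jenkins
spec:
  secretName: jenkins-tls        # Secret cert-manager writes the keypair into
  dnsNames:
    - jenkins.example.internal   # placeholder hostname
  issuerRef:
    name: selfsigned-issuer      # assumes an Issuer of this name exists
    kind: Issuer
```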


Thanks! I will check it out!

It just hasn't been a priority. I have no need for RBAC at this point, as I am the only cluster admin, and the whole network is fairly well isolated.

I couldn't really think of a good reason not to upgrade when it came time to kubeadm init again, but then I realized I could probably save ten minutes by not upgrading; the cluster was already down, and I didn't know what the immediate consequences of adding RBAC would be for my existing Jenkins deployment and jobs.

Chances are it would have worked.


Honestly, for the situation you presented you'll find very few QoL improvements by upgrading. You could probably sit on 1.5 forever on that system (internal Jenkins).


The biggest driver is actually just to not be behind.

You can tell already from what little conversation we've had that "always be upgrading" is not a cultural practice here (yet).

We have regular meetings about changing that! Had two just yesterday. Chuckle



