
> Shell scripts make starting processes trivial, but it's almost unthinkable that, say, bash, would integrate functionality for starting containers, so that every process is started in a container.

Doooooooo it



Is it me or does this not make sense? Bash glues and pipes together commands, has network access, etc. Making every process a container would require either knowing all commands and being able to ensure containers have proper access (even across pipes), or making containers so open as to defeat the purpose.


Bash may need a high level of authority as it's spawning processes and wiring them together, but those processes don't necessarily need to have as much access as you do themselves.

Take the venerable `cowsay` for example. Currently, running `cowsay` (as with any other program) can cryptolocker your hard drive, delete all your files, reach out to arbitrary servers on the internet, etc. But how much access to your system does it actually need to do its job? Well, mostly... STDIN, and STDOUT, really.

Yes, actually doing this is complicated. Another reply has linked to some broad info about object-capability security, but here's a good introduction to the subject: http://habitatchronicles.com/2017/05/what-are-capabilities/

...and this paper is an excellent deep dive into a capability system: https://mumble.net/~jar/pubs/secureos/secureos.html


The problem is every executable can impersonate the user, it has access to do anything the user can do, including deleting or encrypting all their files, reading ssh private keys etc. Network access is rarely concerning unless the program has access to credentials.


Nothing is stopping you from using namespaces, and containers are just namespaces with cgroups etc.

But containers aren't jails; PID and UID remapping is just remapping.

A huge problem is that a container has to drop capabilities on the honor system. As an example, in the default Docker mode, running as root, anyone who can launch a container can read from any block device if the container doesn't drop the mknod capability.

Actually, a privileged container can update the BIOS, or even load arbitrary kernel modules in the host context or change kernel parameters, since the kernel is shared.

I tried to get the Docker folks to add a config option to disallow privileged containers, but they refused.
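If you want to see what a given process is actually holding, here is a rough sketch that reads the effective capability bitmask from `/proc/self/status`. The helper names `has_cap` and `effective_caps` are made up for illustration; the capability numbers are the ones from `<linux/capability.h>`. Linux only.

```python
# Linux capability numbers from <linux/capability.h>.
CAP_SYS_MODULE = 16  # load and unload kernel modules
CAP_MKNOD = 27       # create device nodes with mknod(2)


def has_cap(mask: int, cap: int) -> bool:
    """Return True if bit `cap` is set in a capability bitmask."""
    return bool((mask >> cap) & 1)


def effective_caps(status_path: str = "/proc/self/status") -> int:
    """Read the CapEff bitmask of the current process (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                # The field is a hex string without a 0x prefix.
                return int(line.split()[1], 16)
    raise RuntimeError("CapEff not found in " + status_path)
```

Run inside a container, `has_cap(effective_caps(), CAP_MKNOD)` tells you whether the mknod capability was dropped or left in place.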

You can run in user mode now but most people want persistence and other features that don't allow for that.

The important point is that if you assume containers are a security feature, you are going to have a bad time. Jails were bad enough, and containers are just one step up from chroots as far as security goes.

Namespace isolation is the main benefit of containers.

SELinux and AppArmor are far more appropriate than containers for these security concerns. While I don't personally like SELinux, AppArmor profiles are pretty easy to write.


Plus, the 'leaks' in the Linux process API are even worse, as each container may run its own tiny-init.

Containers make the first point of the OP far worse by adding way more pid namespaces.


> The problem is every executable can impersonate the user,

Um, what?

What do you mean by "impersonate" here? What does a process that does not impersonate the user look like? Do you just mean "executables that run as the user"?

When you log in, and a shell is started that runs as you, is that shell impersonating the user?

When you execute commands, as yourself, those commands run with your credentials. Because you ran them. Isn't that, like, the point?


Typically, any program I run has the totality of my (regular user) authority, which may let it do things I did not intend.

Related:

https://en.wikipedia.org/wiki/Ambient_authority

https://en.wikipedia.org/wiki/Confused_deputy_problem

https://en.wikipedia.org/wiki/Object-capability_model
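To make the contrast with ambient authority concrete, here is a toy sketch in the capability-passing style. The function `shout` is hypothetical: it is handed exactly the two streams it needs and never opens a path or socket itself. Python does not enforce this (the code could still call `open()`), so this only illustrates the shape of the interface, not a real security boundary.

```python
from typing import TextIO


def shout(source: TextIO, sink: TextIO) -> None:
    """Uppercase everything read from `source` and write it to `sink`.

    The two streams passed in are the only authority this function uses;
    it never looks anything up by name.
    """
    for line in source:
        sink.write(line.upper())
```

The caller decides what `source` and `sink` are (for example `sys.stdin` and `sys.stdout`), which is roughly the "STDIN and STDOUT, really" situation described for `cowsay` upthread.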


> What does a process that does not impersonate the user look like?

A command running inside a virtual machine, maybe?


Or check out how iOS and Android use permissions.


Bubblewrap is a good tool in this space https://github.com/containers/bubblewrap
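For a feel of what that looks like, here is a small sketch that builds a `bwrap` command line for something like `cowsay`. The helper name `bwrap_argv` is made up, and the bind and symlink layout assumes a merged-`/usr` distro, so treat it as a starting point rather than a vetted profile.

```python
from typing import List, Sequence


def bwrap_argv(command: Sequence[str]) -> List[str]:
    """Build a bwrap invocation with a read-only /usr and no home directory."""
    return [
        "bwrap",
        "--unshare-all",       # unshare user, ipc, pid, net, uts, cgroup namespaces
        "--die-with-parent",   # kill the sandbox if the launching process exits
        "--new-session",       # detach from the controlling terminal
        "--ro-bind", "/usr", "/usr",
        # Assumption: merged-/usr layout, so these are plain symlinks.
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        *command,
    ]
```

Passing the result to `subprocess.run(bwrap_argv(["cowsay", "moo"]))` would run the command with stdin and stdout inherited, but without your home directory or network access.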


Use firejail then


Firejail


Maybe cgroups would be better than full containers here.


Which cgroups? Containers are not actually a thing in kernel-land. They're a combination of Process, Network, User, and other namespacing.


No, cgroups are a separate API from namespaces. https://man7.org/linux/man-pages/man7/cgroups.7.html
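You can see the two APIs side by side from any process: namespace membership shows up under `/proc/self/ns`, while cgroup membership is listed in `/proc/self/cgroup`. A quick sketch, with made-up helper names, assuming a Linux system with `/proc` mounted:

```python
import os
from typing import List, Tuple


def my_namespaces(ns_dir: str = "/proc/self/ns") -> List[str]:
    """Namespace types this process belongs to (e.g. 'pid', 'net', 'user')."""
    try:
        return sorted(os.listdir(ns_dir))
    except OSError:
        return []  # not Linux, or /proc is not mounted


def parse_cgroup_line(line: str) -> Tuple[int, List[str], str]:
    """Parse one 'hierarchy-ID:controller-list:cgroup-path' line
    from /proc/<pid>/cgroup, as described in cgroups(7)."""
    hierarchy, controllers, path = line.rstrip("\n").split(":", 2)
    return int(hierarchy), [c for c in controllers.split(",") if c], path
```

On a cgroup v2 system, each process has a single line with hierarchy ID 0 and an empty controller list.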


You're not wrong, but the point remains - Are you going to limit their CPUs? Are you going to limit their RAM? Network Performance?

The collection of cgroups and namespaces is complex and flexible for a reason, even if an absurd proportion of the common cases can be solved with a reasonable set of defaults. (For all that they are different APIs, you almost never use one without the other, so perhaps it's best to refer to the whole group of them as "Containers" or "Containment" to differentiate it from Docker-style containers.)


Done. https://pallissard.net/2022/06/27/limiting_application_resou...

TL;DR: two functions, "dispatch", which calls systemd-run, and "wrap", which takes a command, a memory limit, and a CPU limit.
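For anyone who doesn't want to click through, here is roughly what such a "wrap" could look like. This is not the code from the linked post (which I am not reproducing), just a sketch of the same idea in Python; the function signature is an assumption, while `--user`, `--scope`, `MemoryMax=` and `CPUQuota=` are standard systemd-run options and resource-control properties.

```python
from typing import List, Sequence


def wrap(command: Sequence[str], memory_max: str, cpu_quota: str) -> List[str]:
    """Build a systemd-run invocation that runs `command` in a transient
    scope unit with a memory cap and a CPU quota."""
    return [
        "systemd-run",
        "--user",                         # use the per-user service manager
        "--scope",                        # run in the foreground as a scope unit
        "-p", f"MemoryMax={memory_max}",  # e.g. "256M"
        "-p", f"CPUQuota={cpu_quota}",    # e.g. "50%"
        "--",                             # end of systemd-run options
        *command,
    ]
```

Something like `subprocess.run(wrap(["cowsay", "moo"], "256M", "50%"))` would then start the command under those limits.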


systemd is not bash. Otherwise indeed true.



