> Shell scripts make starting processes trivial, but it's almost unthinkable tha...

edgyquant · on March 22, 2023

Is it me or does this not make sense? Bash glues and pipes together commands, has network access, etc. Every process being a container would require either knowing all commands and being able to ensure containers have proper access (even across pipes), or containers so open as to defeat the purpose.

ryukafalz · on March 23, 2023

Bash may need a high level of authority as it's spawning processes and wiring them together, but those processes don't necessarily need to have as much access as you do themselves.

Take the venerable `cowsay` for example. Currently, running `cowsay` (as with any other program) can cryptolocker your hard drive, delete all your files, reach out to arbitrary servers on the internet, etc. But how much access to your system does it actually need to do its job? Well, mostly... STDIN, and STDOUT, really.

Yes, actually doing this is complicated. Another reply has linked to some broad info about object-capability security, but here's a good introduction to the subject: http://habitatchronicles.com/2017/05/what-are-capabilities/

...and this paper is an excellent deep dive into a capability system: https://mumble.net/~jar/pubs/secureos/secureos.html

lozenge · on March 22, 2023

The problem is every executable can impersonate the user, it has access to do anything the user can do, including deleting or encrypting all their files, reading ssh private keys etc. Network access is rarely concerning unless the program has access to credentials.

nyrikki · on March 22, 2023

Nothing is stopping you from using namespaces, and containers are just namespaces with cgroups etc.

But containers aren't jails, pid and uid remapping is just remapping.

A huge problem is that a container has to drop capabilities on the honor system. For example, in the default Docker mode, running as root, anyone who can launch a container can read from any block device if they don't drop the mknod capability.

Actually, a privileged container can update the BIOS, load arbitrary kernel modules in the host context, or change kernel parameters, since the kernel is shared.
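To make the capability point a bit more checkable, here is a rough Python sketch that decodes the `CapEff` mask from `/proc/self/status` and reports a few of the capabilities discussed here. The bit numbers come from `<linux/capability.h>`; the helper names `risky_caps` and `current_cap_eff`, and the choice of which capabilities to flag, are my own assumptions for illustration, not part of any tool.

```python
# Capability bit numbers as defined in <linux/capability.h>.
# This is a hand-picked subset for illustration, not a complete list.
CAPS = {
    "CAP_SYS_MODULE": 16,  # load and unload kernel modules
    "CAP_SYS_RAWIO": 17,   # raw I/O port style access
    "CAP_SYS_ADMIN": 21,   # broad grab-bag of admin operations
    "CAP_MKNOD": 27,       # create device nodes with mknod(2)
}


def risky_caps(cap_eff_hex):
    """Return the names from CAPS whose bit is set in a CapEff hex mask."""
    mask = int(cap_eff_hex, 16)
    return sorted(name for name, bit in CAPS.items() if mask & (1 << bit))


def current_cap_eff(status_path="/proc/self/status"):
    """Read this process's effective capability mask, or None if unavailable."""
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith("CapEff:"):
                    return line.split()[1]
    except OSError:
        pass
    return None
```

Running `risky_caps(current_cap_eff())` inside and outside a container (when the mask is available) shows which of these the process still holds. It only inspects the mask; it does not drop anything.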

I tried to get the Docker folks to add a config option to disallow privileged containers, but they refused.

You can run in user mode now but most people want persistence and other features that don't allow for that.

The important point is that if you assume containers are a security feature, you are going to have a bad time. Jails were bad enough, and containers are just one step up from chroots as far as security goes.

Namespace isolation is the main benefit of containers.

SELinux and AppArmor are far more appropriate than containers for these security concerns. While I don't personally like SELinux, AppArmor profiles are pretty easy to write.

nyrikki · on March 22, 2023

Plus, the 'leaks' in the Linux process API are even worse, as each container may run its own tiny-init.

Containers make the first point of the OP far worse by adding way more pid namespaces.

Karellen · on March 22, 2023

> The problem is every executable can impersonate the user,

Um, what?

What do you mean by "impersonate" here? What does a process that does not impersonate the user look like? Do you just mean "executables that run as the user"?

When you log in, and a shell is started that runs as you, is that shell impersonating the user?

When you execute commands, as yourself, those commands run with your credentials. Because you ran them. Isn't that, like, the point?

dllthomas · on March 22, 2023

Typically, any program I run has the totality of my (regular user) authority, which may let it do things I did not intend.

https://en.wikipedia.org/wiki/Confused_deputy_problem

https://en.wikipedia.org/wiki/Object-capability_model
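To make those two links slightly more concrete, here is a minimal Python sketch of the difference between ambient authority and capability-style passing. The function names are hypothetical, and Python does not enforce any of this (the second function could still call `open` itself); it only illustrates the shape of the interface.

```python
import io


def count_lines_ambient(path):
    """Ambient-authority style: the function gets a *name* and uses the
    caller's full authority to open whatever that name refers to."""
    with open(path) as f:
        return sum(1 for _ in f)


def count_lines_capability(readable):
    """Capability style: the function gets an already-open readable stream
    and never names or opens anything itself."""
    return sum(1 for _ in readable)


# The caller decides exactly which object to hand over.
example = io.StringIO("moo\nmoo\nmoo\n")
line_count = count_lines_capability(example)
```

In the first form, a bug or a malicious path can point the function at any file I can read. In the second, the only thing it can touch is the one stream it was given.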

derefr · on March 22, 2023

> What does a process that does not impersonate the user look like?

A command running inside a virtual machine, maybe?

eru · on March 23, 2023

Or check out how iOS and Android use permissions.

klooney · on March 23, 2023

Bubblewrap is a good tool in this space https://github.com/containers/bubblewrap
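For a rough idea of what that looks like in practice, here is a small Python sketch that only builds a `bwrap` command line for a program like `cowsay`. The helper name `bwrap_argv` and the list of read-only paths are my assumptions and will need adjusting per distro; the flags used (`--unshare-all`, `--die-with-parent`, `--new-session`, `--ro-bind-try`, `--proc`, `--dev`, `--tmpfs`) are from the bubblewrap man page.

```python
def bwrap_argv(command, ro_paths=("/usr", "/bin", "/lib", "/lib64", "/etc")):
    """Build an argv that runs `command` under bubblewrap with no network,
    a read-only view of a few system directories, and a private /tmp.

    The default ro_paths are a guess at what a typical program needs.
    """
    argv = ["bwrap", "--unshare-all", "--die-with-parent", "--new-session"]
    for path in ro_paths:
        # --ro-bind-try skips paths that do not exist on this system.
        argv += ["--ro-bind-try", path, path]
    argv += ["--proc", "/proc", "--dev", "/dev", "--tmpfs", "/tmp"]
    # bwrap treats the first non-option argument as the command to run.
    argv += list(command)
    return argv
```

Passing the result to `subprocess.run(bwrap_argv(["cowsay", "hello"]))` would run the command with stdin and stdout inherited but without your home directory or network, assuming `bwrap` is installed and unprivileged user namespaces are enabled.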

bombolo · on March 23, 2023

Use firejail then

traverseda · on March 22, 2023

Firejail

wmf · on March 22, 2023

Maybe cgroups would be better than full containers here.

GauntletWizard · on March 22, 2023

Which cgroups? Containers are not actually a thing in kernel-land. They're a combination of Process, Network, User, and other namespacing.

wmf · on March 22, 2023

No, cgroups are a separate API from namespaces. https://man7.org/linux/man-pages/man7/cgroups.7.html
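One way to see that they are separate: the kernel exposes them through different files under `/proc`. Below is a small Python sketch (helper names are mine, purely illustrative) that reads a process's cgroup membership from `/proc/<pid>/cgroup` and its namespace identifiers from the symlinks in `/proc/<pid>/ns`. Both readers return empty results on systems where those files are missing.

```python
import os


def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line: 'hierarchy-ID:controller-list:path'."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return {
        "hierarchy_id": int(hierarchy_id),
        "controllers": [c for c in controllers.split(",") if c],
        "path": path,
    }


def cgroup_membership(pid="self"):
    """Return the parsed cgroup entries for a process, or [] if unavailable."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return [parse_cgroup_line(line) for line in f if line.strip()]
    except OSError:
        return []


def namespace_ids(pid="self"):
    """Map namespace type (e.g. 'pid', 'net') to its identifier link target."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        for name in os.listdir(ns_dir):
            result[name] = os.readlink(os.path.join(ns_dir, name))
    except OSError:
        pass
    return result
```

Two processes can share every namespace and still sit in different cgroups, or the other way around, which is why resource limits alone don't need a full container.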

GauntletWizard · on March 22, 2023

You're not wrong, but the point remains - Are you going to limit their CPUs? Are you going to limit their RAM? Network Performance?

The collection of cgroups and namespaces (and for all that they are different APIs, you almost never use one without the other, so perhaps it's best to refer to the whole group of them as "Containers" or "Containment" to differentiate it from Docker-style containers) is complex and flexible for a reason, even if an absurd proportion of the common cases can be solved with a reasonable set of defaults.

mattpallissard · on March 22, 2023

Done. https://pallissard.net/2022/06/27/limiting_application_resou...

TL;DR: two functions: "dispatch", which calls systemd-run, and "wrap", which takes a command, a memory limit, and a CPU limit.
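For readers who don't want to click through, the general idea can be sketched roughly like this in Python. This is not the code from the post; `systemd_run_argv` is a hypothetical helper that only assembles a `systemd-run` command line using the `MemoryMax=` and `CPUQuota=` resource-control properties, and it assumes a user session with cgroup v2.

```python
def systemd_run_argv(command, memory_max, cpu_quota_percent):
    """Build an argv that runs `command` in a transient systemd scope
    with a memory cap and a CPU quota.

    memory_max is a systemd size string such as "512M";
    cpu_quota_percent is a number, where 100 means one full CPU.
    """
    return [
        "systemd-run", "--user", "--scope",
        "-p", f"MemoryMax={memory_max}",
        "-p", f"CPUQuota={cpu_quota_percent}%",
        "--",  # everything after this is the command, not an option
    ] + list(command)
```

For example, `subprocess.run(systemd_run_argv(["make", "-j4"], "512M", 50))` would run the build limited to 512M of memory and half a CPU, assuming the user manager is allowed to set those properties.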

nine_k · on March 22, 2023

systemd is not bash. Otherwise indeed true.