The short flags are really there for when you're doing things interactively.
For scripts that are read and executed more often than they're written or changed, the long flags really help ensure you don't fat-finger.
I'm accustomed to typing `rm -rf` when really I just need `rm -r` in a lot of cases, and in PRs it's very easy for your eyes to glaze over when there's more than a single short flag.
I guess the differences are how long it takes to detect a typo, and therefore the blast radius in getting it wrong in a script that could be executed many times before you notice.
It's a common thing for IT departments or people making enterprise images to put in place because they think it makes things safer.
I feel it does more harm than good by normalising `rm -f` when you want to recursively delete a folder, but with CD deployments these days it's less of a big deal.
CD, continuous deployment, refers to the process, not an individual instance, much like the term ATM. Simply saying "this deployment failed" doesn't convey that it's a deployment that was made through CD, but "this CD failed" is also ambiguous, as it doesn't indicate whether it was the process or the conditions particular to that deployment that caused the failure. Using them both in combination, as in "CD deployment", seems perfectly valid to me as a way to resolve both ambiguities.
To be fair, though, they were referring to "CD deployment(s)", plural, which is a bit redundant; just "CD" would have done just as well in this case.
Yeah... no. Also definitely not the case. If I actually am remembering it correctly then it's probably from Fedora circa FC5 or Ubuntu circa 6.06. Back when I was in school and had never touched Linux other than on my own computer.
Go do a fresh install of CentOS/RHEL (or just spin up a new docker container) and what do you see in root's bashrc?
alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'
I've also encountered one job which adds this to their base ubuntu image, and my current employer uses Macs which have it added to bashrc by their mdm software on initial install.
So what's your justification to dismiss the idea that this is a common practice?
I wasn't dismissing the idea that it was common practice, I was dismissing the idea that the common practice applied in my case. I did not have an IT department setting anything on my system when I experienced this.
And to add to your point: in my experience I've had "long-lived" scripts whose flags change meaning over the years. This is extremely rare on base-level installs, but frighteningly more common with the in-house programs I've had to deal with.
The long flags are less likely to be changed over time. The script will usually fail gracefully, and I also don't need to relearn what I was thinking 25 years ago when I first wrote it. (Long flags are like in-command comments, too.)
And as others have pointed out -v(ersion) or -v(erbose) can happen too.
The worst (well known) offender is grep -v meaning grep --invert-match just for the sheer bafflement of "verbose" now hiding what you were looking for.
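Since `-v` trips people up constantly, a quick demonstration (`-v` is POSIX; the long spelling is supported by GNU and modern BSD grep):

```shell
# -v selects the lines that do NOT match the pattern:
printf 'error: disk full\nok: all good\n' | grep -v error
# -> ok: all good

# GNU (and modern BSD) grep spell the same flag --invert-match:
printf 'error: disk full\nok: all good\n' | grep --invert-match error
# -> ok: all good
```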
That actually comes from ed and vi. The g command would search for a line that matches a given regular expression and run the following command on that line. The v command was the inverse (search for lines that don't match the given regular expression).
The default command was to just print the line. Hence the name grep
To be fair, the JVM predates GNU-style --long flags being "standard". Lots of older programs use "X11-style" -long flags (that many pre-X11 programs used, like `find`).
Same! Somehow I also start confusing this with somewhat related tools like gcc. It's rare enough that I need to query these that it's all blurry in my memory again by next time. "huh I guess gcc was that goofy one with just one dash.." Nope.
tar is one of the worst. I think I know what those do without looking it up but I think most people look at that with bewilderment. Randall Munroe is always helpful in these matters and has this to say: https://xkcd.com/1168/
Recent (as in, within the last ~3-5 years) versions of GNU tar automatically detect the type of compression and apply the appropriate tool, so you can often get away with `tar xvf`.
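A quick sketch of that behaviour (assuming a reasonably recent GNU tar; the compression is sniffed from the file's magic bytes on extraction):

```shell
cd "$(mktemp -d)"
mkdir demo && echo hello > demo/file.txt
tar czf demo.tar.gz demo     # create a gzip-compressed archive
rm -r demo
tar xf demo.tar.gz           # no -z needed: tar detects the gzip header
cat demo/file.txt            # -> hello
```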
You can decompress things which are not files, such as stdin or device files. In addition, your code does not cover *.tar.xz (more common than .tar.bz2 nowadays), or lzma, or lz4, or .tar.zst, or many many other formats. Further, it's not even consistent: bunzip2, gunzip, and unxz remove the input file, but tar, unrar, 7z, and zstd do not.
There's a utility called dtrx ("do the right extraction") that does basically this, and also makes sure that files are always extracted into a directory.
I can certainly understand the impulse to do something like this, but as other posters pointed out, your case statement is incomplete, and will continue to become more so as new tools become available.
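For context, the kind of helper being critiqued presumably looks something like this sketch (a trimmed-down version of the much-copied `extract` function; the exact shape of the original is an assumption):

```shell
# Sketch of the classic extract() helper. The case list is inherently
# incomplete: .lz4, .tar.zst, etc. are already missing, and more
# formats will keep appearing.
extract() {
  case "$1" in
    *.tar.bz2) tar xjf "$1" ;;
    *.tar.gz)  tar xzf "$1" ;;
    *.tar.xz)  tar xJf "$1" ;;
    *.zip)     unzip -q "$1" ;;
    *) echo "extract: don't know how to handle '$1'" >&2; return 1 ;;
  esac
}
```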
Personally, I consciously avoid using the internal decompression features of tar, both out of habit and to avoid unexpected results.
As such, I generally use command lines like:
bunzip2 -c <tar.bz2 file> | tar xvf -
instead of relying on `tar xvjf`
I don't see it as wrong or bad to use tar's decompression features, it's more about my own preferences and experience. Being able to perform similar actions in multiple ways is one of the things I've always appreciated about the shell and the Unix/GNU userland.
I guess you only ever extract properly named files. But it's still rather amusing that, with the exception of 'case', the only places where $1 is properly quoted are the ones where it's not really necessary, and vice versa (but then again, it's not done for the parameter, but rather for the single quotes).
Anyway, my advice to you and anyone else having trouble with tar is that "tar caf" and "tar xaf" are the only things most people need to remember about tar, with or without compression.
(In this case, most people means people who use tar, but so rarely that they have trouble remembering how to use it; they probably never use it for anything else other than (un)archiving. Also, xkcd isn't gospel.)
These two commands should fit most of your use cases:
tar cf dir.tar dir # mnemonic: 'create file' <tar-file-name> <dir-to-tar>
tar xf dir.tar(.gz|bz2|...) # mnemonic: 'eXtract file' <tar-file-name>
On systems with "modern" versions of tar `-x` is capable of recognizing which compression format is used and doesn't require the explicit `-j/z` flags you usually see.
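The `a` is GNU tar's `--auto-compress`: on creation it picks the compressor from the archive's suffix. A sketch, assuming GNU tar:

```shell
cd "$(mktemp -d)"
mkdir dir && echo data > dir/file
tar caf dir.tar.gz dir    # 'a' picks gzip because of the .gz suffix
# The gzip magic bytes (1f 8b) confirm which compressor was chosen:
head -c2 dir.tar.gz | od -An -tx1
```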
> 0: any program that doesn't support --help is defective, which was rather my point.
Fair enough. My point was that there's a category of operating systems where long options (including --help) isn't really a thing. You may of course consider them defective, though I'm not sure everybody agrees.
I dunno. I may know them but the next person may not. Where I work, there are plenty of people who primarily work w/ other langs and only dabble in bash once in a blue moon.
Case in point, when I was learning, I'd copy and paste snippets I found online without understanding what some flags did. At first I didn't even know how to look up docs, and googling for the meaning of shorthand isn't always fruitful, especially given the "don't know how to find docs" limitation.
I've even run into cases where the next person is myself. For example at one point, I "knew" docker flags while it was fresh in my mind, but then forgot what they meant a few months later...
> If you're working on shell scripts, you probably know certain short flags well enough that long flags just add clutter and don't improve readability.
What about other people who might read the script?
I have more difficulty reading long flags. The people who want to improve by reading my scripts will have to search for the short flags anyway, because nobody uses long flags on the command line. IMHO the benefits of long flags vs short ones are not obvious.
I argue that 'very commonly used' is subjective, and someone new won't know all the ones you or I consider common. One of my goals in writing software is to make my code understandable to complete newcomers. I can't always achieve it. Sometimes other design goals take priority. But from my perspective this one is a no-brainer. I just don't see the down side. If it's too time consuming to type, use them for typing practice and get faster :-)
I would agree. Especially when you have something like https://explainshell.com/ to help you understand a command and its flags. It would be nice to have editor/IDE extensions which can do the equivalent of explainshell.com's functionality.
While I agree these are "common", you now have to define what the common flags are. It's easier for an organization to just have a blanket rule to use long-form expressions.
Yes. Suggestions are good. Following suggestions with purpose is better. Or, rather, blindly following suggestions without understanding the benefits (and costs to) the suggestions doesn't seem ideal to me.
-- It respects the great answer of "it depends".
Gonna admit that I wouldn't be able to tell what grep -i, sed -e do. (I would've used `sed -e` before, but not enough to remember that). Still, some things are clear from context.
That's not the point. The point is that you can discern intent if you are writing against the GNU implementation, so one can trivially reimplement against BSD.
It's rare that I'd use it in a script. Perhaps I'd use it for a script whose main purpose is creating an archive to be widely distributed. Then the info might be of interest to the script's user.
But your comment brings up a good point about the difficulty of deciding what short flags truly are widely known. I assumed anyone who uses gzip regularly would know "-v", but maybe that isn't true.
It's useful when debugging your artifacts. It's a nice shortcut for adding a `find` after the `tar` invocation. Outside of debugging I've never seen the gzip/tar verbose output as anything other than clutter.
It’s not just “easier for a human” but more stable over time as commands evolve, and less likely to do weird things across Unix variants.
Some command-line parsers will auto-complete partial options so you’re less likely to see ambiguity errors in future versions if you picked long-form options. (This isn’t completely foolproof, e.g. a command could have "--foo" and later add "--foobar" but it does help in most cases.)
And unfortunately, some tools with the same name will use the same letter to mean different things across Unix variants. You are asking for trouble if you aren’t being clear about what you want.
> Some command-line parsers will auto-complete partial options
This sort of behaviour is tortuous, and should be against international laws.
Where _adding_ new, unrelated, options to the interface now changes or breaks the behaviour of existing calling scripts. Plus you never know which abbreviations are in use in the wild.
It makes it virtually impossible to maintain a stable interface without just freezing it long-hand.
I found this in Perl code; I believe it might be the default behaviour in the standard parser? Our developer really did like the philosophy that "the user may want 50 different ways to express the same thing".
Unfortunately, this isn't quite true. At least if by Unix you mean POSIX, i.e. you include macOS in your considerations.
1. Sadly, short options are in fact more portable than long options if you're targeting POSIX. For instance, macOS ships with versions of ls, rm, et cetera that only support short flags. This is because long options don't exist at all in POSIX (they're formalized here: https://pubs.opengroup.org/onlinepubs/009695399/functions/ge...).
2. Auto-completing partial options only happens for long flags, and it's a bit more than "some" parsers; the canonical implementation of "long" options, GNU's getopt_long, has it as a documented feature:
> Long option names may be abbreviated if the abbreviation is unique or is an exact match for some defined option.
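The abbreviation behaviour is easy to see with any tool that uses getopt_long; e.g., assuming GNU grep, where `--invert-match` is the only long option beginning with `--inv`:

```shell
# An unambiguous prefix is accepted as if it were the full option name:
printf 'a\nb\n' | grep --inv a    # treated as --invert-match
# -> b
```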
This is how I write PowerShell scripts. I use full cmdlet names, full argument switch names, and I even specify argument switches to positional arguments. I also do my best to honor common flags like `-WhatIf`, `-Verbose`, and `-ErrorAction`. So I end up with scripts like this:
if (-not $(Test-Path -Path "$Path" -PathType Container)) {
Write-Error -Message "Path ($Path) does not exist or is not a directory." -Category InvalidArgument;
return;
}
When you do this properly, it feels like magic. For example, I wrote a script that does local Maven and Docker builds for a bunch of related projects. So I wrote two functions `Build-Maven` and `Build-Docker` with proper common flag support and error handling. Then, when I use them, I just do something like this:
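A hypothetical sketch of the kind of invocation being described (`Build-Maven` and `Build-Docker` are the author's own functions; the `-Path` parameter and project names here are made up for illustration):

```powershell
# The "first clause": with this preference set, every cmdlet and advanced
# function below behaves as if it were called with -ErrorAction Stop.
$ErrorActionPreference = 'Stop';

Build-Maven  -Path .\core;
Build-Docker -Path .\core;
Build-Maven  -Path .\gateway;
```

Because the build functions declare `[CmdletBinding()]` and honor the common parameters, a `-Verbose` passed to the wrapping script typically flows down to each build call as well.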
That first clause automatically amends `Build-Maven` and `Build-Docker` commands with `-ErrorAction Stop`. So if any of those build commands fail, the entire script halts there. And, if I pass in `-Verbose` to this script, that's forwarded to the build commands and I'll see the Maven and Docker build output.
PS has another reason to use full names of parameters in long-lived scripts. If you use short names like `-Ver` expecting it to match `-Verbose`, a future version of the commandlet can add `-Version` and now `-Ver` is ambiguous.
On the other hand, I've seen some people eschew `%` and `?` in favor of writing `ForEach-Object` and `Where-Object`, which in my opinion is too extreme in the other direction.
BTW, in:
if (-not $(Test-Path -Path "$Path" -PathType Container)) {
... you don't need the `$`, and if $Path is already a string you don't need the `""` either.
To me this isn't just another reason, it's THE reason to not use the short form in scripts. Although unlikely, the worst case of this could be pretty nasty. It's possible for someone to remove a flag that was deemed useless and add another flag that matches the same prefix and has a very different effect.
I don't know how to feel about % and ?. The thought with not using them is that they are just aliases and could be changed. But that reasoning sort of breaks down since any command could be aliased to something dumb like `Set-Alias Get-Content Remove-Item`.
Yes, exactly. If your script is running in an environment that has redefined % and ?, it's either because they want your script to use their definitions, or it's busted beyond repair and you shouldn't worry about supporting it.
OK I admit to still using % and ? in pipelines. And yes, I tend to prefer keeping "unnecessary" syntax like you pointed out sometimes. Helps me parse and read the scripts easier.
Yep, the readability of PowerShell puts other shell scripting to shame. The "verbose default + short aliases" design is fantastic for maintaining both script readability and one-liner speed.
While developing a script I write the flags in short form; then, when it's close to done, I let VS Code help me convert them to long form. This way I write fast but leave a nicer product.
If you're trying to stick to the POSIX standard, you have to use short options for standard commands like grep, since the long options are GNU extensions.
Let's be honest: it's outdated, it has bad UX, and by the standards I would hold such a standard to today, it's generally not very good.
Sure, systems do try to be somewhat POSIX compliant, but in my experience not because they care about POSIX itself; it just happens to overlap with the goal of being somewhat compatible with other similar systems, so that porting software/scripts is easier.
So if the systems you use support long options for POSIX commands, go ahead and use them. Furthermore, if the commands you use are not in the standard anyway, you can again use long options, because they are as much standard as the short ones. Let's be honest: this leaves very few use-cases where long options are a problem.
fish shell has been burned in the few places it strictly implements POSIX.
The most vivid example is test, aka /bin/[. POSIX specifies that, if test is invoked with one argument, you must "Exit true (0) if $1 is not null; otherwise, exit false" [1].
This means that `test --help` is forbidden from printing any help. It means that `test -d $argv` expands to `test -d` if $argv is empty, which then must "succeed" because "-d" is not null. It bites users over and over again [2].
Implementing this POSIX piece has resulted in more headaches, not fewer. I regret implementing a POSIX-conformant test, I wish I had just picked mostly-compatible but also-sane semantics.
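The one-argument rule is easy to reproduce in any POSIX shell, where an unquoted empty variable leaves `-d` as the sole (non-null) argument:

```shell
dir=""
# Unquoted, $dir expands to nothing, so this is really just `test -d`;
# POSIX says a one-argument test exits true when the argument is non-null:
if test -d $dir; then echo "claims it is a directory"; fi
# -> claims it is a directory

# Quoted, test receives two arguments and behaves as expected:
if test -d "$dir"; then echo "this is not printed"; fi
```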
> It means that `test -d $argv` expands to `test -d` if $argv is empty, which then must "succeed" because "-d" is not null. It bites users over and over again [2].
To be fair, while the POSIX behaviour is obviously wrong, using `$argv` on a bourne-style shell[0] is also wrong; that should be either `test -d "$argv"` or (if I recall the syntax correctly) `test -d "${argv[@]}"` if you actually intended argv to be a list rather than a single directory ("argv" suggests the former, but `test -d X` only accepts a single argument).
0: more specifically, a shell that does word splitting at the wrong place (after parameter expansion)
> Sure, systems do try to be somewhat POSIX compliant
The mainstream systems you know and use adhere very rigorously to POSIX. GNU Libc, Kernel, Coreutils, ... and their counterparts in BSD Unixes, proprietary Unixes, Cygwin and whatnot all take POSIX seriously.
POSIX is very helpful, and smart coders and sysadmins use it as one of their references for required behavior.
Although I would agree that there's a slight hole there: you still need to port stuff like coreutils and all the dependencies, which has been done of course. But I'd be happier if something like busybox was actually portable.
I think it's easier to write a portable shell script today than 30 years ago.
The Autoconf system is predicated on the idea that writing a portable shell script is hard, and so we hide the shell programming behind a mountain of M4 macros.
A shell script that was not written with portability in mind is not necessarily easier to port today than 22 years ago. There are more shells with more extensions, and beyond the language concerns, the environments have exploded. A shell script can easily depend on all sorts of cruft you've never heard of. Oh, just install these five things from the following github repos ...
Same here — I felt it would detract from the point I was trying to make so I didn’t mention it, especially since I don’t think most people run multiple OS’s :) At the least though, I would assume people regularly have collaborators that run different OS’s.
POSIX is just too limited and not what people use in practice.
If you want a portable shell script, in many cases my (biased) advice would be to make your script work on both bash and Oil. (Obviously there are short shell scripts which you may want to run on BSD, etc. This is more about big scripts, which POSIX falls down for.)
Oil already runs some of the biggest shell scripts in the world, many of them unmodified. Moreover, when there's a patch necessary to run a script, the patch often IMPROVES the program.
I've given up trying to be as pure and standard as possible. I could write all my scripts to use `sh`, but `bash` can make things far easier, and in my docker containers is often required by some other package anyway, so may as well just own it.
I agree, bash vs sh is night and day. Although these days I mostly use python. If I really really need something I can always shell out in python as well.
Unless you count things like Makefiles, I don't think I've ever written or encountered a "script" that was intended for use on multiple *nix flavors. This strikes me as a YAGNI situation: do it if it comes up, but not before.
Yes, most of the time respecting POSIX to the letter is not needed, but it is of course satisfying knowing that your script can run fine on BSDs and other less common systems :)
Though sometimes you don't need to go that far to break stuff, for instance switching from Fedora to Ubuntu.
I've seen many scripts fail on debian derivatives because people think using #!/bin/sh as a shebang is fine since it works on their computer where sh was in fact a symlink to bash.
But on Debian-based distributions /bin/sh is often dash, not bash, and dash is basically the strict POSIX subset + local; all the fancy stuff like [[ ]], &>, arrays, ... will fail.
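The usual fix is to swap the bashism for its POSIX equivalent; e.g. glob matching, which under dash must use `case` instead of `[[`:

```shell
#!/bin/sh
x=foobar
# Bash-only:  if [[ $x == foo* ]]; then echo matches; fi
# POSIX equivalent, works in dash, bash, and busybox sh alike:
case $x in
  foo*) echo "matches" ;;
  *)    echo "no match" ;;
esac
```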
Though this is less about long options here and more about general shell scripting.
This attitude is definitely something that makes my life harder. My OS is usually OpenBSD, and I constantly need to fix other people's scripts.
It's not difficult to do, but it is annoying. I don't blame people for ignorance, but it'd be nice if people thought about portability.
Non-portable scripts can also bite you on Linux: Debian, for example, will swap out the shell from bash to dash for performance, but others will not. So, even within the Linux ecosystem, you can end up with scripts that behave differently across distros.
> Non-portable scripts can also bite you on Linux, where Debian, for example, will swap out the shell from bash to dash for performance, but others will not. So, even within the Linux ecosystem, you can end up with scripts that behave differently across distros.
From the bash man page, "If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well." Also you should not be including bashisms if your shebang is sh and not bash. So using bash as sh shouldn't generally be a problem, if you stick to the published Shell Command Language [0], unless you hit one of the cases where the spec is a bit ambiguous and shells implement it differently [1].
> From the bash man page, "If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well."
From what I remember, it doesn't do a particularly good job.
It's really useful to have a *BSD user or two in your userbase complaining about your non-POSIX linuxisms, it improves the portability of your code no end ...
I do 95% of my work on linux so bash it is. I'm not really concerned about other systems. I can always google later. It would take down my productivity quite a bit if I worried if all my scripting needed to be compatible with BSD or other unix varieties. YAGNI is mostly true.
I think that's a reasonable attitude to have for your own personal scripts that you use on your own machines. But for stuff you do for work it might make sense to go for portability. Where I work most people develop on macOS, and our infra runs on Linux. I happen to develop on Linux, but if I'm writing development scripts that are intended to be run locally, I can't assume that GNU tools will be installed everywhere.
Portable shell scripts probably run on your system all the time, e.g. when installing development tools or packages, especially for cross-platform languages like Node.js, Ruby, Python, etc... If a hunk of software wants a shell script, it's gonna be portable.
Homebrew comes to mind, and any other piece of software that does the installation via a shell script.
Don't use long flags when scripting, if they have short equivalents, and POSIX only specifies the short equivalents.
Even if the stuff will only ever run on one system, the POSIX flags (1) come from a smaller set of options, since POSIX is fairly conservative in its content in this area, and (2) are generally well known (often three decades old or older).
Don't make me read a man page to confirm that "grep --fixed-strings" really is the same thing as the POSIX-standard "grep -F", and not something subtly different.
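(For reference, the two really are the same flag; `-F` turns off regex interpretation entirely, which matters as soon as the pattern contains metacharacters:)

```shell
# As a regex, the dot matches any character -- both lines match:
printf 'a.b\naxb\n' | grep 'a.b'
# With -F (--fixed-strings) the pattern is literal -- only a.b matches:
printf 'a.b\naxb\n' | grep -F 'a.b'
# -> a.b
```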
Wasted time for someone who has all of these memorized. How about all the future wasted time of people having to look up the flags cause they’re not explicitly spelled out? I’m in favor of explicitness because it saves time for future people and gatekeeps less.
-F is explicitly spelled out. Proof: I can see it. Explicit is the opposite of implicit, and implicit doesn't mean "abbreviated to a single character that is plainly visible".
You don't necessarily know what a long option means without looking it up, just because it is long. Firstly, if your native language isn't English, you may have to look it up in a dictionary. The ordinary meanings you find there may not reveal what that word means in the context of the program. So, at that point you're off to the documentation anyway.
I know what "hard" and "soft" are. Therefore, is it obvious what "git reset --soft" means? Hardly.
Once you know what it means, "--soft" jogs your memory by association a lot better than some "-X". So that would seem like it is less cognitive work. But when we have long options, it encourages the option vocabulary of a program to keep growing, which adds to the cognitive load.
I have been following this advice for about five years with my scripts and the best result has been compliments from co-workers and others who have picked up the scripts and felt like they had a much better understanding of what the scripts were doing.
Combined with the use of shellcheck and shfmt, shell scripting has come a long way in the last couple of years. I still feel like Bash is a dead end: the quoting and whitespace issues, combined with the bifurcation caused by Apple not shipping Bash 4.0+, make moving to Python probably the better course in 2020.
It's nice but you cannot comment individual lines. For that you can go one step further and use arrays:
cmd=(
    command
    -x -y
    --bar OPTARG
    ARG ARG
)
$cmd
This works in Zsh. In Bash that would be `"${cmd[@]}"`. I use this with long qemu command lines where I often modify and comment out arguments during testing.
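A runnable Bash version of the same idea; the quoted `"${cmd[@]}"` expansion is what keeps each array element as exactly one argument:

```shell
#!/usr/bin/env bash
cmd=(
  printf          # the command itself
  '%s\n'          # its format string; every line can carry a comment
  "hello world"   # one argument, embedded space preserved
)
"${cmd[@]}"       # runs: printf '%s\n' "hello world"
# -> hello world
```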
Wow. I just realized I'm terrible for this. I usually format my programming pretty verbosely to make it future-readable but I'm terrible for not doing that in my scripts.
I'll add a comment or something explaining it but now I feel guilty and should probably go back and rewrite a few lines...
I mean—for long scripts I'll break them up but I haven't paid much mind to arguments/flags.
I do this if I have to look up the right flag to use. If it's something commonly seen like "grep -o" or "tar -xvzf" I assume whoever is maintaining the script (probably future me) will be more familiar with the abbreviations than the long versions.
Yeah good point. It did strike me, though, because I'd just written several scripts running JMeter commands and sort of breezed over how they're written even though I had to go and look up all the flags.
If I'm splitting across lines, I prefer to put the pipe _after_ the newline. Otherwise, I like the cut of your jib, and will likely do this with arguments in scripts from now on.
Using a line with a single backslash or breaking out the pipe can make grouping clear. Well, as clear as having to use backslashes can be, which to me is a little inelegant. Personally, Python and its indenting strategy have grown on me, especially since editor support makes it painless to use and visually excellent.
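For completeness, both layouts work; a leading pipe just needs the previous line to end in a backslash, while a trailing pipe continues on its own:

```shell
# Pipe at the start of the continuation line (needs the backslash):
printf 'b\na\nb\n' \
  | sort \
  | uniq

# Pipe at the end of the line (no backslash required):
printf 'b\na\nb\n' |
  sort |
  uniq
```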
I wonder a bit that those POSIX comments are so far down in this discussion. I mean, you are losing portability by using long flags (because it is not standard compliant) and yet it seems to be just a footnote in this discussion...
Not a lot of people these days care about portability, unfortunately. You'll be lucky if your project's build scripts run the same way on both Debian and vanilla macOS. And when they don't, the answer people will give you is probably something like "Just install GNU tools!". *BSD? Forget about it.
As the saying goes, code is typically written once and read thousands of times. It takes you less than a second to write the full flag name (add a few seconds if you need to look it up), but it will likely save hundreds of future readers the time to look up the flag if they do not already know it. In addition, full flag names contribute "self-documentation" in many cases, and also potentially make searching easier in certain cases.
You're assuming the long version is more well known than the short version. That's a wildly and provably false assumption.
Do you know, without referring to the manpage, exactly what `cp --no-dereference --preserve=links --recursive --preserve=all` does? I don't. I can take some guesses based on the names of the options, but those guesses could very easily miss an important corner case.
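For what it's worth, that particular pile of options is exactly what the GNU cp man page documents `-a` as expanding to (`-d` = `--no-dereference --preserve=links`, and `-a` = `-dR --preserve=all`). A quick check of the headline behaviour, assuming GNU coreutils:

```shell
cd "$(mktemp -d)"
mkdir src && echo hi > src/file && ln -s file src/link
cp --no-dereference --preserve=links --recursive --preserve=all src dst
# The symlink was copied as a symlink rather than followed:
test -L dst/link && echo "symlink preserved"
# -> symlink preserved
```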
When writing scripts (CI especially) you'll rarely use, please always use long-flags. It'll save so much headache for the next dev who isn't as up to date with the latest short-hand for az-cli or whatever.
I think you are right, for all "exotic" tools, long flags are much better and future proof. For regular tools, short flags are so common that they are probably ok.
Agreed, bash3boilerplate.sh has long suggested that a few extra keystrokes in a script pay off in saving your collaborators and future self trips to manpages. Disclosure: b3bp author.
Great advice. Except a whole bunch of popular CLI utils don't have long versions.
Like kubectl -f: in apply it means file, in logs it means follow. Drives me nuts how they overload short flags with different meanings but offer no long version.
Or gsutil/gcloud.
I get it, Google folks don't really care much about always having a long, readable version of a short flag.
Went to try to update some of my scripts and stuff and found that most of OS X/macOS uses BSD utils, which don't have the long flags found in the Linux versions. For example, check out xargs:
I wanted to say that Microsoft explicitly says to use the full argument label and not omit them when writing PowerShell scripts but I can't find the doc.
I was thinking that the author would really like PowerShell. PowerShell is the most verbose thing I've ever used. I don't really like to type that much for simple things.
I say just learn the flags or look them up, it shouldn't be hard or time consuming. When reading scripts, many times you can infer what the flags mean by understanding the inputs and desired outputs.
Makefiles are one situation where one is essentially writing small shell scripts, but which are not really supposed to be quick and dirty in my opinion.
No, I'm not going to waste keystrokes and increase noise for lazy people. Convince physicists to write force_gravitational instead of Fg, or functional programmers to give up concepts from abstract algebra, and then I'll reconsider. Until then, RTFM just like you do in other fields. curl is so common you should already know what -s means and if not, well, RTFM.
I think I'll continue to write these:
instead of:

> If you're working on shell scripts, you probably know certain short flags well enough that long flags just add clutter and don't improve readability.