Use long flags when scripting (2013) (changelog.com)
468 points by maple3142 on Sept 18, 2020 | hide | past | favorite | 189 comments


Great suggestion, but I'd go for a middle ground: use long flags for the long tail, but short flags are fine for stuff that's very commonly used.

I think I'll continue to write these:

    grep -i
    rm -rf
    ln -s
    gzip -v
    sed -e
instead of:

    grep --ignore-case
    rm --recursive --force
    ln --symbolic
    gzip --verbose
    sed --expression
If you're working on shell scripts, you probably know certain short flags well enough that long flags just add clutter and don't improve readability.


The short flags are really there for when you're doing things interactively.

For scripts that are read and executed more often than they're written or changed, the long flags really help ensure you don't fat-finger.

I'm accustomed to typing `rm -rf` when really I just need `rm -r` in a lot of cases, and in PRs it's very easy for your eyes to glaze over when there's more than a single short flag.


Why are you more likely to typo something in a script you can read, review, and test than an interactive command prompt that executes immediately?


I guess the differences are how long it takes to detect a typo, and therefore the blast radius in getting it wrong in a script that could be executed many times before you notice.


You literally always need -f with -r in a script (unless you want rm to prompt the user), though...


You must have an alias rm='rm -i' lying around. rm doesn't prompt by default (unless you're the owner of the dir, but not its contents).

    mkdir demodir && touch demodir/{foo,bar} && rm -r demodir


IIRC 'rm -r' prompts by default for read-only files as well even if you own them.


... huh. No, no alias, I just misread the manpage.

I could've sworn that -I was the default......


It's a common thing for IT departments or people making enterprise images to put in because they think it makes it safer.

I feel it does more harm than good by normalising rm -f when you want to recursively delete a folder, but with CD deployments these days it's less of a big deal.


"CD deployments" - Ha, like an "ATM machine"? At first in thought you were describing deployments on compact discs!


CD, continuous deployment, refers to the process, not an individual instance, unlike the term ATM. Simply saying "This deployment failed" doesn't convey that the deployment was made through CD, but "This CD failed" is also ambiguous: it doesn't indicate whether the process or the conditions particular to one deployment caused the failure. Using them in combination, "CD deployment", seems perfectly valid to me as it resolves both ambiguities.

To be fair though they were referring to CD deployment(s), plural, which is a bit redundant and just CD would have done just as well in this case.


Yeah... no. Also definitely not the case. If I actually am remembering it correctly then it's probably from Fedora circa FC5 or Ubuntu circa 6.06. Back when I was in school and had never touched Linux other than on my own computer.


Go do a fresh install of CentOS/RHEL (or just spin up a new docker container) and what do you see in root's bashrc?

    alias rm='rm -i'
    alias cp='cp -i'
    alias mv='mv -i'
I've also encountered one job which adds this to their base ubuntu image, and my current employer uses Macs which have it added to bashrc by their mdm software on initial install.

So what's your justification to dismiss the idea that this is a common practice?


Aliases shouldn't be run in non-interactive shells IIRC.
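Right — bash only expands aliases in interactive shells by default, which is easy to verify (a quick sketch, assuming bash is installed):

```shell
# Non-interactive bash does not enable alias expansion, so an
# `alias rm='rm -i'` from a bashrc has no effect inside scripts:
bash -c 'shopt -q expand_aliases && echo "aliases on" || echo "aliases off"'
# prints: aliases off
```

So the `-i` aliases only guard the interactive prompt; scripts see plain `rm`.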


I wasn't dismissing the idea that it was common practice, I was dismissing the idea that the common practice applied in my case. I did not have an IT department setting anything on my system when I experienced this.

Thank you for the info, though.


Long flags aren’t just for readability; they also add entropy, which helps defend against typos.


And to add to your point, in my experience I've had "long-lived" scripts whose flags changed meaning over the years. This is extremely rare on base-level installs, but frighteningly more common with the in-house programs I've had to deal with.

The long flags are less likely to be changed over time. The script will usually fail gracefully and I don't also need to relearn what I was thinking 25 years ago when I first did it. (long flags are like in-command comments too)

And as others have pointed out -v(ersion) or -v(erbose) can happen too.
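The "in-command comment" effect is easy to see side by side; the two lines below are equivalent (GNU grep assumed for the long flag):

```shell
# Short form: fine at an interactive prompt, but every flag is a
# manpage lookup for the next reader.
printf 'Foo\nbar\n' | grep -i foo

# Long form: the intent is spelled out in the command itself.
printf 'Foo\nbar\n' | grep --ignore-case foo
```

Both print `Foo`; only the second tells a reviewer why.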


Related to this, they're less likely to do something unexpected.

Many programs use -v as a short flag for --version, but some (such as curl) use it as short for --verbose

Probably something you'd catch pretty quickly, but still.


The worst (well known) offender is grep -v meaning grep --invert-match just for the sheer bafflement of "verbose" now hiding what you were looking for.


That actually comes from ed and vi. The g command would search for a line that matches a given regular expression and run the following command on that line. The v command was the inverse (search for lines that don't match the given regular expression).

The default command was to just print the line. Hence the name grep

g/re/p
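sed inherits ed's command syntax, so the naming survives almost verbatim today (a small illustration):

```shell
# ed's  g/re/p  ("globally, print lines matching re") maps directly
# onto sed's address/command form; -n suppresses the default print.
printf 'foo\nbar\nfood\n' | sed -n '/foo/p'
# prints:
# foo
# food
```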


Wow, that's funny, I normally expect '-v' to be verbose, but maybe I just use a lot of `tar` and `curl`!


Or maybe cp, or rm, or rsync, or mount, or wget, or netcat (both gnu and bsd) ... it’s really pretty reasonable to expect -v to mean verbose ;-).


As does Python, which drives me crazy. -V is version on Python.


Very annoying. Java HotSpot uses -version, rather than --version, which is even worse. Don't mess with the standard long-form flag!


To be fair, the JVM predates GNU-style --long flags being "standard". Lots of older programs use "X11-style" -long flags (that many pre-X11 programs used, like `find`).


Long flags are a GNUism, not a "standard" by any means. The closest that comes to a standard in this space are POSIX utility conventions [1].

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1...


I get it wrong every damn time. 16 years and counting.


Same! Somehow I also start confusing this with somewhat related tools like gcc. It's rare enough that I need to query these that it's all blurry in my memory again by next time. "huh I guess gcc was that goofy one with just one dash.." Nope.


There's so much variation in flags for version, I've taken the habit of always using the long `--version` flag.


... and it wrote the version to stderr up to 3.3, and to stdout from 3.4; very convenient if you need to switch on version (which you do).


But also contributes to the heat death of the universe.


what doesn't?


Nah, it's already happened, and we all pop up every 10^10^100 eons to pretend this is all real.


Shhh don't share meta content with the NPCs...


I agree. To me, "sed -e" is an expression as a whole, and replacing -e with --expression is on the same level as aliasing "sed" to "stream-editor".

Though I would make the list of "allowed" short flags very short. I can't think of many more than the ones you listed:

  mkdir -p
  sh -c
  cp -r
  tar -xaf
  tar -caf


Is -a a GNU extension? Doesn't seem to be in the man page for BSD tar.


It is, and it's unnecessary for decompression since GNU tar 1.21, according to a comment lower in the thread.


tar is one of the worst. I think I know what those do without looking it up but I think most people look at that with bewilderment. Randall Munroe is always helpful in these matters and has this to say: https://xkcd.com/1168/


I remember 90% of the time which flag I want to use, but the other 10% sends me into a rage! =)

I gave up and added this to my .bashrc:

    function extract() {
        if [ -f $1 ] ; then
            case $1 in
                *.tar.bz2)   tar xvjf $1     ;;
                *.tar.gz)    tar xvzf $1     ;;
                *.bz2)       bunzip2 $1      ;;
                *.rar)       unrar x $1      ;;
                *.gz)        gunzip $1       ;;
                *.tar)       tar xvf $1      ;;
                *.tbz2)      tar xvjf $1     ;;
                *.tgz)       tar xvzf $1     ;;
                *.zip)       unzip $1        ;;
                *.Z)         uncompress $1   ;;
                *.7z)        7z x $1         ;;
                *.zst)       zstd -d $1      ;;
                *.xz)        unxz $1         ;;
                *)           echo "'$1' cannot be extracted via >extract<" ;;
            esac
        else
            echo "'$1' is not a valid file"
        fi
    }


Recent (as in, within the last ~3-5 years) versions of GNU tar automatically detect the type of compression and apply the appropriate tool, so you can often get away with `tar xvf`.


Actually close to 12 years, GNU tar 1.21 https://www.gnu.org/software/tar/

Another "recent" (much more recent IIRC) change is that you don't need the dot anymore in find to search in the current directory.


according to https://git.savannah.gnu.org/cgit/findutils.git/commit/find/..., this functionality has been present since findutils was moved under source control in 1996.


I believe that GNU find used to (in some cases?) show a warning if the directory was omitted. Could be wrong.


I think the dot actually gives you issues on Mac? Or something like that. I remember encountering issues with it.


You have to specify a directory for the find on Mac; it's only a GNUism to omit it.


Oh I see, I got it backwards somehow. Thanks!


  extract() {
    bsdtar -xf "$1"
  }
you can decompress things which are not files, such as stdin or device files. in addition, your code does not cover *.tar.xz (more common than tar.bz2 nowadays), or lzma, or lz4, or tar.zst, or many many other formats. further, it's not even consistent: bunzip2, gunzip, and unxz remove the input file, but tar, unrar, 7z, and zstd do not.


There's a utility called dtrx ("do the right extraction") that does basically this, and also makes sure that files are always extracted into a directory.


I can certainly understand the impulse to something like this, but as other posters pointed out, your case statement is incomplete, and will continue to become more so as new tools become available.

Personally, I consciously avoid using the internal decompression features of tar, both out of habit and to avoid unexpected results.

As such, I generally use command lines like: bunzip2 -c <tar.bz2 file> |tar xvf -

instead of relying on tar xvjf

I don't see it as wrong or bad to use tar's decompression features, it's more about my own preferences and experience. Being able to perform similar actions in multiple ways is one of the things I've always appreciated about the shell and the Unix/GNU userland.


You can use atool (https://www.nongnu.org/atool/) too. It is just some perl scripts wrapper around some commom extraction tools.


I guess you only ever extract punctually named files. But it's still rather amusing that with the exception of 'case', the only places where $1 is properly quoted are the ones where it's not really necessary, and vice versa (but then again, it's not done for the parameter, but rather the single quotes).

Anyway, my advice to you and anyone else having trouble with tar is that "tar caf" and "tar xaf" are the only things most people need to remember about tar, with or without compression.

(In this case, most people means people who use tar, but so rarely that they have trouble remembering how to use it; they probably never use it for anything else other than (un)archiving. Also, xkcd isn't gospel.)


These two commands should fit most of your use cases:

  tar cf dir.tar dir          # mnemonic: 'create file' <tar-file-name> <dir-to-tar>
  tar xf dir.tar(.gz|bz2|...) # mnemonic: 'eXtract file' <tar-file-name>
On systems with "modern" versions of tar `-x` is capable of recognizing which compression format is used and doesn't require the explicit `-j/z` flags you usually see.
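A quick round trip showing both halves (assumes GNU tar and gzip on PATH):

```shell
# Create: -a (the 'a' in caf) picks the compressor from the archive
# name's suffix, here gzip for .tar.gz.
mkdir -p demo && echo hello > demo/file.txt
tar caf demo.tar.gz demo

# Extract: plain -x sniffs the compression from the file contents;
# no -z/-j needed on modern GNU tar.
rm -r demo
tar xf demo.tar.gz
cat demo/file.txt    # hello
```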


Exactly, these are the only ones I know and never get confused. Actually I use cvzf and xvzf.


you might want to add -p for preserving permissions in there as well, handy for archiving


> To disarm the bomb, simply enter a valid tar command on your first try. No Googling. You have ten seconds.

  $ tar --help
Obviously. :)


    $ tar --help
    tar: unknown option -- help
    usage: tar [-]{crtux}[-befhjklmopqvwzHJOPSXZ014578] [archive] [blocksize]
               [-C directory] [-T file] [-s replstr] [file ...]
    $


Okay, so your tar is defective[0], and? I can do that too:

  $ printf '%s\n' '#! /bin/sh' 'echo "tar: invalid command" >&2' 'exit 64' >/tmp/tar
  $ chmod a+x /tmp/tar
  $ PATH="/tmp:$PATH"
  $
  $ tar -cf a.tar a/
  tar: invalid command
  $
0: any program that doesn't support --help is defective, which was rather my point.


> 0: any program that doesn't support --help is defective, which was rather my point.

Fair enough. My point was that there's a category of operating systems where long options (including --help) isn't really a thing. You may of course consider them defective, though I'm not sure everybody agrees.


$ tar -h

Even better. :)


I dunno. I may know them but the next person may not. Where I work, there are plenty of people who primarily work w/ other langs and only dabble in bash once in a blue moon.

Case in point, when I was learning, I'd copy and paste snippets I found online without understanding what some flags did. At first I didn't even know how to look up docs, and googling for the meaning of shorthand isn't always fruitful, especially given the "don't know how to find docs" limitation.

I've even run into cases where the next person is myself. For example, at one point I "knew" docker flags while they were fresh in my mind, but then forgot what they meant a few months later...


I think you're assuming a level of Unix knowledge that less and less new people in the field are going to have.


Someone who's tasked with maintaining shell scripts better be familiar with the shell.


that's kinda his point.

Kids today no longer know the shell. They think just writing some yaml files in ansible is sufficient. Giving them a shell script won't help at all.


>Kids today no longer know the shell.

Sounds like Kochan and Wood[0] need some love (and royalties).

Or is learning stuff deprecated these days?

[0] https://www.amazon.com/Unix-Shell-Programming-Stephen-Kochan...


> If you're working on shell scripts, you probably know certain short flags well enough that long flags just add clutter and don't improve readability.

What about other people who might read the script?


I have more difficulty reading long flags. The people who want to improve by reading my scripts will have to look up the short flags anyway, because nobody uses long flags from the command line. IMHO the benefits of long flags vs short ones are not obvious.


I argue that 'very commonly used' is subjective, and someone new won't know all the ones you or I consider common. One of my goals in writing software is to make my code understandable to complete newcomers. I can't always achieve it. Sometimes other design goals take priority. But from my perspective this one is a no-brainer. I just don't see the down side. If it's too time consuming to type, use them for typing practice and get faster :-)


I would agree. Especially when you have something like https://explainshell.com/ to help you understand a command and its flags. It would be nice to have editor/IDE extensions which can do the equivalent of explainshell.com's functionality.


While I agree these are "common", you now have to define what the common flags are. It's easier for an organization to just have a blanket rule to use long-form flags.

Also, is typing that much of a chore?


Yes. Suggestions are good. Following suggestions with purpose is better. Or, rather, blindly following suggestions without understanding the benefits (and costs to) the suggestions doesn't seem ideal to me. -- It respects the great answer of "it depends".

Gonna admit that I wouldn't be able to tell what grep -i, sed -e do. (I would've used `sed -e` before, but not enough to remember that). Still, some things are clear from context.


Indeed. Almost all arguments are "readability" are only relevant to a given reader. Everything is hard to read until you learn to read it.


I disagree. BSD and GNU implementations differ. Because of this, it's easier to discern intent of long flags.


I assume the intent to discern, then, is that the script isn't supposed to run on BSD :-)


Also, some BSD equivalents of GNU commands don't even have long flags.


That's not the point. The point is that you can discern intent: if the script was written against the GNU implementation, one can trivially reimplement it against BSD.


How often do you use gzip verbose? I don't think I've ever used it. What's it good for?


Personally a fan of '--verbose' to be honest. It's clear. "grep -v" anyone?


Particularly in this case, because `-v` sometimes means "print out the version number".


That's why they mentioned `grep -v`. Go look it up. It doesn't mean verbose OR version.

Doubt you'd be printing out a version number in a shell script though.


You may want to compare a version number of a command to make sure it's bigger than (or equal to) the minimum version required by your script, though.
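One hedged way to do that in a script is to let `sort -V` (GNU coreutils; also in modern BSDs) do the comparison. The `version_ge` helper below is just an illustrative name, not a standard utility:

```shell
# Succeeds when $1 is greater than or equal to $2 as a version number.
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

version_ge "1.10.2" "1.9"  && echo "new enough"
version_ge "1.2" "1.10"    || echo "too old"
```

A plain lexical comparison would get "1.10" vs "1.9" wrong, which is exactly what `-V` fixes.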


>-v, --invert-match

> Selected lines are those not matching any of the specified patterns.

For the lazy.

Edit: I don't know hackernews formatting and I'm among the lazy, so... Marked as "won't fix".


Of all the options, it is so fitting and proper to use the long form for --verbose.


It shows you the percentage reduction in size.

It's rare that I'd use it in a script. Perhaps I'd use it for a script whose main purpose is creating an archive to be widely distributed. Then the info might be of interest to the script's user.

But your comment brings up a good point about the difficulty of deciding what short flags truly are widely known. I assumed anyone who uses gzip regularly would know "-v", but maybe that isn't true.


It's useful when debugging your artifacts. It's a nice shortcut for adding a `find` after the `tar` invocation. Outside of debugging I've never seen the gzip/tar verbose output as anything other than clutter.


Pretty much just to make sure you're not wasting your time with the gzip.

But I also don't use gzip much these days. pbzip2 is a drop-in replacement that is faster and compresses better.


It’s not just “easier for a human” but more stable over time as commands evolve, and less likely to do weird things across Unix variants.

Some command-line parsers will auto-complete partial options so you’re less likely to see ambiguity errors in future versions if you picked long-form options. (This isn’t completely foolproof, e.g. a command could have "--foo" and later add "--foobar" but it does help in most cases.)

And unfortunately, some tools with the same name will use the same letter to mean different things across Unix variants. You are asking for trouble if you aren’t being clear about what you want.


> Some command-line parsers will auto-complete partial options

This sort of behaviour is tortuous, and should be against international laws.

Where _adding_ new, unrelated options to the interface now changes or breaks the behaviour of existing calling scripts. Plus you never know which abbreviations are in use in the wild.

It makes it virtually impossible to maintain a stable interface without just freezing it long-hand.

I found this in Perl code; I believe it might be the default behaviour in the standard parser? Our developer really did like the philosophy that "the user may want 50 different ways to express the same thing".


Unfortunately, this isn't quite true. At least if by Unix you mean POSIX, i.e. you include macOS in your considerations.

1. Sadly, short options are in fact more portable than long options if you're targeting POSIX. For instance, macOS ships with versions of ls, rm, et cetera that only support short flags. This is because long options don't exist at all in POSIX (they're formalized here: https://pubs.opengroup.org/onlinepubs/009695399/functions/ge...).

2. Auto-completing partial options only happens for long flags, and it's a bit more than "some" parsers; the canonical implementation of "long" options, GNU's getopt_long, has it as a documented feature:

> Long option names may be abbreviated if the abbreviation is unique or is an exact match for some defined option.

https://linux.die.net/man/3/getopt_long

I 100% agree it's a poor feature. You basically entirely preclude yourself from adding features in a backwards-compatible way.
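You can see the abbreviation matching with any tool using getopt_long; for example with GNU grep (this will fail on BSD grep):

```shell
# --ignore is an unambiguous prefix of --ignore-case, so GNU
# getopt_long silently accepts it:
printf 'Foo\n' | grep --ignore foo    # prints Foo

# A shorter prefix like --i matches several options (--ignore-case,
# --include, --invert-match, ...) and is rejected as ambiguous.
```

Which is precisely the trap: `--ignore` stops working the day grep grows another `--ignore-*` option.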


This is how I write PowerShell scripts. I use full cmdlet names, full argument switch names, and I even specify argument switches to positional arguments. I also do my best to honor common flags like `-WhatIf`, `-Verbose`, and `-ErrorAction`. So I end up with scripts like this:

    if (-not $(Test-Path -Path "$Path" -PathType Container)) {
        Write-Error -Message "Path ($Path) does not exist or is not a directory." -Category InvalidArgument;
        return;
    }
When you do this properly, it feels like magic. For example, I wrote a script that does local Maven and Docker builds for a bunch of related projects. So I wrote two functions `Build-Maven` and `Build-Docker` with proper common flag support and error handling. Then, when I use them, I just do something like this:

    $PSDefaultParameterValues = @{
        'Build-Maven:ErrorAction' = 'Stop';
        'Build-Docker:ErrorAction' = 'Stop';
    };

    Build-Maven "$Path\A";

    Build-Maven "$Path\B";
    Build-Docker -Path "$Path\B" `
        -Dockerfile "$Path\B\Dockerfile" `
        -Tag "B:$Tag";
That first clause automatically amends `Build-Maven` and `Build-Docker` commands with `-ErrorAction Stop`. So if any of those build commands fail, the entire script halts there. And, if I pass in `-Verbose` to this script, that's forwarded to the build commands and I'll see the Maven and Docker build output.


PS has another reason to use full names of parameters in long-lived scripts. If you use short names like `-Ver` expecting it to match `-Verbose`, a future version of the commandlet can add `-Version` and now `-Ver` is ambiguous.

On the other hand, I've seen some people eschew `%` and `?` in favor of writing `ForEach-Object` and `Where-Object`, which in my opinion is too extreme in the other direction.

BTW, in:

    if (-not $(Test-Path -Path "$Path" -PathType Container)) {
... you don't need the `$`, and if $Path is already a string you don't need the `""` either.


To me this isn't just another reason, it's THE reason to not use the short form in scripts. Although unlikely, the worst case of this could be pretty nasty. It's possible for someone to remove a flag that was deemed useless and add another flag that matches the same prefix and has a very different effect.

I don't know how to feel about % and ?. The thought with not using them is that they are just aliases and could be changed. But that reasoning sort of breaks down since any command could be aliased to something dumb like `Set-Alias Get-Content Remove-Item`.


Yes, exactly. If your script is running in an environment that has redefined % and ?, it's either because they want your script to use their definitions, or it's busted beyond repair and you shouldn't worry about supporting it.


OK I admit to still using % and ? in pipelines. And yes, I tend to prefer keeping "unnecessary" syntax like you pointed out sometimes. Helps me parse and read the scripts easier.


Yep, the readability of PowerShell puts other shell scripting to shame. The "verbose default + short aliases" design is fantastic for maintaining both script readability and one-liner speed.


While developing a script I write them in short form then when it’s close to done I let vs-code help me convert them to long form. This way I write fast but leave a nicer product.


If you're trying to stick to the POSIX standard, you have to use short options for standard commands like grep, since the long options are GNU extensions.


Does anyone care about the standard?

Let's be honest: it's outdated, has bad UX, and by the standards I would hold such a standard to today, it's generally not very good.

Sure, systems try to be somewhat POSIX compliant, but from my experience not because they care about POSIX itself; it just happens to overlap with the goal of being somewhat compatible with other similar systems, so that porting software/scripts is easier.

So if the systems you use support long options for POSIX commands, go ahead and use them. Furthermore, if the commands you call are not in the standard anyway, you can again use long options, because they are as much standard as the short ones. Let's be honest, this leaves very few use cases where long options are a problem.


fish shell has been burned in the few places it strictly implements POSIX.

The most vivid example is test, aka /bin/[. POSIX specifies that, if test is invoked with one argument, you must "Exit true (0) if $1 is not null; otherwise, exit false" [1].

This means that `test --help` is forbidden from printing any help. It means that `test -d $argv` expands to `test -d` if $argv is empty, which then must "succeed" because "-d" is not null. It bites users over and over again [2].

Implementing this POSIX piece has resulted in more headaches, not fewer. I regret implementing a POSIX-conformant test, I wish I had just picked mostly-compatible but also-sane semantics.

1: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/t...

2: https://stackoverflow.com/questions/29635083/fish-shell-chec...
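The one-argument rule is easy to reproduce in any POSIX shell (sketch):

```shell
# With one argument, test exits 0 iff that argument is a non-empty
# string, so a bare `test -d` (what an unquoted empty variable
# collapses to) "succeeds":
test -d && echo "one-arg test: true"    # -d is just a non-empty string

# Quoting keeps the argument count at two and restores the intended check:
dir=""
test -d "$dir" && echo "is a directory" || echo "not a directory"
```

The first line prints "one-arg test: true", the second "not a directory".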


> It means that `test -d $argv` expands to `test -d` if $argv is empty, which then must "succeed" because "-d" is not null. It bites users over and over again [2].

To be fair, while the POSIX behaviour is obviously wrong, using `$argv` on a bourne-style shell[0] is also wrong; that should be either `test -d "$argv"` or (if I recall the syntax correctly) `test -d "${argv[@]}"` if you actually intended argv to be a list rather than a single directory ("argv" suggests the former, but `test -d X` only accepts a single argument).

0: more specifically, a shell that does word splitting at the wrong place (after parameter expansion)


Last published in 2018 (and actively being worked on) is not outdated.

https://pubs.opengroup.org/onlinepubs/9699919799/

> Sure, systems try to be somewhat POSIX compliant

The mainstream systems you know and use adhere very rigorously to POSIX. GNU Libc, Kernel, Coreutils, ... and their counterparts in BSD Unixes, proprietary Unixes, Cygwin and whatnot all take POSIX seriously.

POSIX is very helpful, and smart coders and sysadmins use it as one of their references for required behavior.


Larry Wall said "It's easier to port a shell than a shell script" in 1998... 22 years later I still think that holds.

https://news.ycombinator.com/item?id=10104203

Although I would agree that there's a slight hole there: you still need to port stuff like coreutils and all the dependencies, which has been done of course. But I'd be happier if something like busybox was actually portable.


I think it's easier to write a portable shell script today than 30 years ago.

The Autoconf system is predicated on the idea that writing a portable shells script is hard, and so we hide the shell programming behind a mountain of M4 macros.

It's not necessarily easier to port a shell script today than 22 years ago, if it wasn't written with portability in mind. There are more shells with more extensions, and beyond the language concerns, the environments have exploded. A shell script can easily depend on all sorts of cruft you've never heard of. Oh, just install these five things from the following github repos ...


Yes, if you care about portability. Anything that's not Linux likely won't have GNU grep installed by default. This includes macOS and the BSDs.

If this is your own dotfiles or utilities, sure, but if I'm working on something collaborative, yes, I care about the standard.
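If you do want long options where they're available, one defensive pattern is to feature-detect at runtime instead of assuming a GNU userland (a sketch; the probe command and variable name are arbitrary):

```shell
# Probe once, then branch; BSD/macOS grep rejects the long flag.
if printf 'x\n' | grep --ignore-case x >/dev/null 2>&1; then
    GREP_ICASE='--ignore-case'
else
    GREP_ICASE='-i'
fi

printf 'Foo\n' | grep "$GREP_ICASE" foo    # prints Foo either way
```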


I stick to POSIX in my dotfiles precisely because I carry it around on many different systems and that’s the only thing that can keep it sane.


Same here — I felt it would detract from the point I was trying to make so I didn’t mention it, especially since I don’t think most people run multiple OS’s :) At the least though, I would assume people regularly have collaborators that run different OS’s.


If I were creating something that needed that degree of portability, I would never use a shell script.

It's just way too easy to run into an unexpected gotcha on some platform/configuration.


Yeah I agree with this. I claimed a few years ago that you could view https://www.oilshell.org/ as a "better POSIX", and I still hold that view.

http://www.oilshell.org/blog/2018/01/28.html#limit-to-posix

Recent comment about this:

https://news.ycombinator.com/item?id=24428347

POSIX is just too limited and not what people use in practice.

If you want a portable shell script, in many cases my (biased) advice would be to make your script work on both bash and Oil. (Obviously there are short shell scripts which you may want to run on BSD, etc. This is more about big scripts, which POSIX falls down for.)

Oil already runs some of the biggest shell scripts in the world, many of them unmodified. Moreover, when a patch is necessary to run one, it often IMPROVES the program.

http://www.oilshell.org/blog/2020/06/release-0.8.pre6.html#p...

You'll be less tied to the vagaries of bash.

If anyone's script doesn't run under Oil, I'm interested. See https://github.com/oilshell/oil/wiki/What-Is-Expected-to-Run...


I've given up trying to be as pure and standard as possible. I could write all my scripts to use `sh`, but `bash` can make things far easier, and in my docker containers is often required by some other package anyway, so may as well just own it.


I agree, bash vs sh is night and day. Although these days I mostly use python. If I really really need something I can always shell out in python as well.


I usually test my POSIX scripts in an Alpine docker container. It has busybox by default, whose applets only support a few flags, mostly short ones.


Unless you count things like Makefiles, I don't think I've ever written or encountered a "script" that was intended for use on multiple *nix flavors. This strikes me as a YAGNI situation: do it if it comes up, but not before.


Yes, most of the time respecting POSIX to the letter is not needed, but it is of course satisfying knowing that your script can run fine on BSDs and other less common systems :)

Though sometimes you don't need to go that far to break stuff, for instance switching from Fedora to Ubuntu. I've seen many scripts fail on debian derivatives because people think using #!/bin/sh as a shebang is fine since it works on their computer where sh was in fact a symlink to bash.

But on debian based distributions /bin/sh is often dash, not bash, and dash is basically the strict POSIX subset plus `local`; all the fancy stuff like [[ ]], &>, arrays, ... will fail.

Though this is less about long options here and more about general shell scripting.


You don't know that your script will run fine anywhere, if you've not actually run it there.

But, still, that's no reason to adopt a mindset of actively wrecking the chances of such success.

(Which is what passive ignorance amounts to, effectively).


This attitude is definitely something that makes my life harder. My OS is usually OpenBSD, and I constantly need to fix other people's scripts.

It's not difficult to do, but it is annoying. I don't blame people for ignorance, but it'd be nice if people thought about portability.

Non-portable scripts can also bite you on Linux, where Debian, for example, will swap out the shell from bash to dash for performance, but others will not. So, even within the Linux ecosystem, you can end up with scripts that behave differently across distros.


> Non-portable scripts can also bite you on Linux, where Debian, for example, will swap out the shell from bash to dash for performance, but others will not. So, even within the Linux ecosystem, you can end up with scripts that behave differently across distros.

From the bash man page, "If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well." Also you should not be including bashisms if your shebang is sh and not bash. So using bash as sh shouldn't generally be a problem, if you stick to the published Shell Command Language [0], unless you hit one of the cases where the spec is a bit ambiguous and shells implement it differently [1].

[0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...

[1] https://stackoverflow.com/questions/16069339/different-pipel...


> From the bash man page, "If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well."

From what I remember, it doesn't do a particularly good job.


It's really useful to have a *BSD user or two in your userbase complaining about your non-POSIX linuxisms, it improves the portability of your code no end ...


I do 95% of my work on Linux, so bash it is. I'm not really concerned about other systems; I can always google later. It would take down my productivity quite a bit if I worried about making all my scripting compatible with BSD or other Unix varieties. YAGNI is mostly true.


Sounds to me like your choice of OS is what makes your life harder...


It makes my life easier in most ways. This is one of the costs.

And, again, this still bites whenever using different linux distros.


I never said that wasn't the case. That doesn't mean anyone else's attitude is the cause of problems caused by your own choices.

#!/bin/bash


I think you mean

    #!/usr/bin/env bash


Nope, I don't.


I think that's a reasonable attitude to have for your own personal scripts that you use on your own machines. But for stuff you do for work it might make sense to go for portability. Where I work most people develop on macOS, and our infra runs on Linux. I happen to develop on Linux, but if I'm writing development scripts that are intended to be run locally, I can't assume that GNU tools will be installed everywhere.


If you've ever run a ./configure script before building a program, you have encountered a script intended for use on multiple Unix flavors.


Portable shell scripts probably run on your system all the time, e.g. when installing development tools or packages, especially for cross-platform languages like Node.js, Ruby, Python, etc... If a hunk of software wants a shell script, it's gonna be portable.

Homebrew comes to mind, and any other piece of software that does the installation via a shell script.


There's a good chance you did. Lots of open-source developers use macOS, so there's at least compatibility between that and Linux.


Don't use long flags when scripting, if they have short equivalents, and POSIX only specifies the short equivalents.

Even if the stuff will only ever run on one system, the POSIX flags (1) come from a smaller set of options, since POSIX is fairly conservative in this area, and (2) are generally well known (often three decades old or older).

Don't make me read a man page to confirm that "grep --fixed-strings" really is the same thing as the POSIX-standard "grep -F", and not something subtly different.


Why not just "man grep | grep -- --fixed-strings" real quick?


These real quick moments add up to wasted time real quick.


Wasted time for someone who has all of these memorized. How about all the future wasted time of people having to look up the flags cause they’re not explicitly spelled out? I’m in favor of explicitness because it saves time for future people and gatekeeps less.


-F is explicitly spelled out. Proof: I can see it. Explicit is the opposite of implicit, and implicit doesn't mean "abbreviated to a single character that is plainly visible".

You don't necessarily know what a long option means without looking it up, just because it is long. Firstly, if your native language isn't English, you may have to look it up in a dictionary. The ordinary meanings you find there may not reveal what that word means in the context of the program. So, at that point you're off to the documentation anyway.

I know what "hard" and "soft" are. Therefore, is it obvious what "git reset --soft" means? Hardly.

Once you know what it means, "--soft" jogs your memory by association a lot better than some "-X". So that would seem like it is less cognitive work. But when we have long options, it encourages the option vocabulary of a program to keep growing, which adds to the cognitive load.


I have been following this advice for about five years with my scripts and the best result has been compliments from co-workers and others who have picked up the scripts and felt like they had a much better understanding of what the scripts were doing.

Combined with the use of shellcheck and shfmt, working with shell scripts has come a long way in the last couple of years. I still feel like Bash is a dead end--the quoting and whitespace issues, combined with the bifurcation caused by Apple not shipping Bash 4.0+, make moving to Python probably the better course in 2020.


It's so unfortunate about Bash + Apple.

Bash probably deserves to die out, but Apple has really pulled a bait-and-switch on OSX's *nix/bsd underpinnings over the years.


Aren't Apple helping out here by defaulting to zsh?


Die out and be replaced by what?


Powershell? It's available for linux now


> Powershell? It's available for linux now

I'd rather have my tonsils extracted through my ears.

I went the other way[0] decades ago and never looked back.

[0] https://www.cygwin.com


I was mostly joking. The verbosity of PowerShell just makes me throw up in my mouth a little every time I have to deal with it.


Oh noes! I've been Poe'd![0]

But I really would recommend Cygwin[1] for anyone who needs to use Windows.

Having a real shell makes a big difference.

[0] https://en.wikipedia.org/wiki/Poe's_law

[1] https://cygwin.com


In scripts, I usually do stuff like this

  command \
    --longopt1 \
    --longopt2 arg \
    argument
or

  command | \
    command 2 | \
    command 3 \
      --opt1 \
      --opt2 \
      arg


It's nice but you cannot comment individual lines. For that you can go one step further and use arrays:

  cmd=(
    command
    -x -y
    --bar OPTARG
    ARG ARG
  )
  $cmd
This works in Zsh. In Bash that would be ${cmd[@]} I think. I use this with long qemu command lines where I often modify and comment out arguments during testing.


You can comment using a hack; check this out:

  command `# a comment` \
    --longopt1 `# another comment` \
    --longopt2 arg \
    argument


I didn't mean commenting arguments for documentation but commenting them out while you're writing and testing the script. I.e. this won't work:

  command `# a comment` \
    # --longopt1 \
    --longopt2 arg \
    argument
AFAIK there's no hack for this.

EDIT: Oh, `# a comment`. Didn't notice that. I'm going to explore it, thanks.
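For what it's worth, the array form does allow exactly this: each element sits on its own line, so any argument can be commented out individually. A bash sketch (the command and arguments are just examples):

```shell
# Build the command as an array; disable arguments with a plain comment.
cmd=(
  printf
  '[%s]'
  one
  # two        # commented out while testing
  three
)
out=$("${cmd[@]}")   # out is "[one][three]"
```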


You should probably note that this hack spawns an additional shell process per comment :)


For bash, add quotes: "${cmd[@]}"

Otherwise, you'll get word splitting. E.g.,

  cmd=(
   test
   "!= !="
   !=
   ""
  )
  "${cmd[@]}" && echo Success
  ${cmd[@]} || echo Failure


If you have a very long command you can press ctrl-x ctrl-e and edit it in your $EDITOR.


I wish I had known this 20 years ago. Thanks!


You don't even need the \ if the | is the last character of the line.


Wow. I just realized I'm terrible for this. I usually format my programming pretty verbosely to make it future-readable but I'm terrible for not doing that in my scripts.

I'll add a comment or something explaining it but now I feel guilty and should probably go back and rewrite a few lines...

I mean—for long scripts I'll break them up but I haven't paid much mind to arguments/flags.


I do this if I have to look up the right flag to use. If it's something commonly seen like "grep -o" or "tar -xvzf" I assume whoever is maintaining the script (probably future me) will be more familiar with the abbreviations than the long versions.


Yeah good point. It did strike me, though, because I'd just written several scripts running JMeter commands and sort of breezed over how they're written even though I had to go and look up all the flags.


If I'm splitting across lines, I prefer to put the pipe _after_ the newline. Otherwise, I like the cut of your jib, and will likely do this with arguments in scripts from now on.


sometimes I do something like this:

  foo \
    --group1 \
    \
    --group 2 \
    --more-stuff 2 \
    | \
   bar \
    --bar-stuff
Using a line with a single backslash, or breaking out the pipe, can make the grouping clear. Well, as clear as having to use backslashes gets; to me it's a little inelegant. Personally, Python and its indenting strategy have grown on me, especially since editor support makes them painless to use and visually excellent.


Downside: forgetting a \ will truncate the flags list, leading to unexpected behaviour followed by an error.


This is how I always do it too - I find it much more readable, and it leads to better diffs as well.


I try to write long form now, but every time I encounter short form and don’t know what it means I just use explainshell: https://explainshell.com/

Long form also allows me to find things faster in man pages.


I wonder a bit why the POSIX comments are so far down in this discussion. I mean, you are losing portability by using long flags (because they are not standard compliant), and yet it seems to be just a footnote here...


Not a lot of people these days care about portability, unfortunately. You'll be lucky if your project's build scripts run the same way on both Debian and vanilla macOS. And when they don't, the answer people will give you is probably something like “Just install GNU tools!”. *BSD? Forget about it.


As the saying goes, code is typically written once and read thousands of times. It takes you less than a second to write the full flag name (add a few seconds if you need to look it up), but it will likely save hundreds of future readers the time of looking up the flag if they do not already know it. In addition, full flag names contribute to "self-documentation" in many cases, and also potentially make searching easier.


You're assuming the long version is more well known than the short version. That's a wildly and provably false assumption.

Do you know, without referring to the manpage, exactly what `cp --no-dereference --preserve=links --recursive --preserve=all` does? I don't. I can take some guesses based on the names of the options, but those guesses could very easily miss an important corner case.

Do you know what `cp -a` does? I do.


The long form of `-a` would be `--archive`, for the same reason that you write `-a` and not `-dR --preserve=all`


I'm afraid that doesn't fit with the "be as explicit as you can regardless of logic" espoused in the article.


When writing scripts (CI especially) that you'll rarely touch, please always use long flags. It'll save so much headache for the next dev who isn't as up to date with the latest shorthand for az-cli or whatever.

When live scripting, feel free to use short.


I think you are right, for all "exotic" tools, long flags are much better and future proof. For regular tools, short flags are so common that they are probably ok.


Your regular tool is not my regular tool. Hell, maybe my regular tool today is not going to be all that regular in 3 years.


He's talking about thing like sed/awk/grep/find/xargs/tar not new_hotness_util_2020


I was. Thank you.


Agreed, bash3boilerplate.sh has long suggested that a few extra keystrokes in a script pay off in saving your collaborators and future self trips to manpages. Disclosure: b3bp author.



Great advice. Except a whole bunch of popular CLI utils don't have long versions.

Like kubectl -f. In apply it means file; in logs it means follow. Drives me nuts how they overload short flags with different meanings but provide no long version.

Or gsutil/gcloud.

I get the sense that Google folks don't really care much about always having a long, readable version of every short flag.


Great. Now try typing 'touch --no-create foo' on OS X and let me know how it goes.


Went to try to update some of my scripts and found that most of OS X/macOS uses BSD utils, which don't have the long flags found in the Linux versions. For example, check out xargs:

https://www.freebsd.org/cgi/man.cgi?xargs

https://man7.org/linux/man-pages/man1/xargs.1.html
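One hedged way around the divergence is to probe for a long option at runtime rather than assuming it exists. For example, GNU xargs accepts --no-run-if-empty (-r) while BSD xargs rejects it (and already behaves that way by default); a sketch:

```shell
# Probe: does this xargs understand the GNU long option?
# With empty input the probe is a harmless no-op on GNU.
if xargs --no-run-if-empty </dev/null >/dev/null 2>&1; then
  XARGS='xargs --no-run-if-empty'
else
  XARGS='xargs'
fi

# $XARGS is left unquoted on purpose so the flag word-splits off.
out=$(printf '%s\n' a b | $XARGS echo)   # out="a b" on either platform
```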


I think adding -- at the end of the options, to stop option processing, is advisable as well. Where that's supported, of course.

Some may disagree, but unexpected flags passing through seems dangerous to me.
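A quick demonstration of why the guard matters, using a filename that looks like an option:

```shell
dir=$(mktemp -d) && cd "$dir"
touch -- '-rf'                 # create a file literally named "-rf"

rm '-rf' 2>/dev/null || true   # parsed as the flags -r -f with no operand,
[ -e ./-rf ] && survived=yes   # so the file is untouched

rm -- '-rf'                    # -- stops option processing: file removed
[ -e ./-rf ] || removed=yes

cd / && rmdir "$dir"
```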


Are there editors that use autocomplete to suggest arguments?

That would make finding and typing the long arguments easier. As it is, it’s easier to type the long arguments in a shell than in an editor.


I wanted to say that Microsoft explicitly says to use the full argument label and not omit them when writing PowerShell scripts but I can't find the doc.


I was thinking that the author would really like PowerShell. PowerShell is the most verbose thing I've ever used. I don't really like to type that much for simple things.

I say just learn the flags or look them up, it shouldn't be hard or time consuming. When reading scripts, many times you can infer what the flags mean by understanding the inputs and desired outputs.


Sometimes, but well-written PowerShell scripts can be surprisingly readable, particularly in comparison to bash.


Absolutely not. Long flags are non-standard and non-portable. Don't write GNU scripts - write shell scripts.


Depends on the situation as always. Obviously if it's going to be used cross platform go for it, otherwise YAGNI


Is there not a linter that can autofix this? Seems like it shouldn't be a human responsibility.


Never thought about that. I usually try to keep my code short to try to impress other programmers, but that's really good advice.


I'd say a painless, easy to understand program is more impressive than a short one


`rm -rf` is much more recognizable than `rm --recursive --force`.

`tar -cvf` is much more recognizable than `tar --create --verbose --file`.

Moreover, why are you writing a shell script if it's NOT quick and dirty?


Makefiles are one situation where one is essentially writing small shell scripts, but which are not really supposed to be quick and dirty in my opinion.


No, thanks


    --r<TAB>ecursive


[flagged]


The real shortcut here is the next person wanting to google every answer (instead of taking the time to read part of a manpage).


No, I'm not going to waste keystrokes and increase noise for lazy people. Convince physicists to write force_gravitational instead of Fg, or functional programmers to give up concepts from abstract algebra, and then I'll reconsider. Until then, RTFM just like you do in other fields. curl is so common you should already know what -s means and if not, well, RTFM.



