How To Adopt 10 'Good' Unix Habits

Posted by Zonk on Friday December 15, 2006 @11:10PM from the i-constantly-use-the-grep-command dept.

An anonymous reader writes to mention an article at the IBM site from earlier this week, which purports to offer good Unix 'habits' to learn. The ten simple suggestions may be common sense to the seasoned admin, but users with less experience may find some helpful hints here. From the article: "Quote variables with caution - Always be careful with shell expansion and variable names. It is generally a good idea to enclose variable calls in double quotation marks, unless you have a good reason not to. Similarly, if you are directly following a variable name with alphanumeric text, be sure also to enclose the variable name in square brackets ([]) to distinguish it from the surrounding text. Otherwise, the shell interprets the trailing text as part of your variable name -- and most likely returns a null value."

Square or Curly brackets? by Beolach · 2006-12-15 23:27 · Score: 3, Informative

enclose the variable name in square brackets ([])

~ $ ls tmp/
a b
~ $ VAR="tmp/*"
~ $ echo $VARa

~ $ echo "$VARa"

~ $ echo "${VAR}a"
tmp/*a
~ $ echo ${VAR}a
tmp/a
Their example correctly uses Curly brackets, {}, but their text says square brackets []. That seems like a typo to me.

--
Join moola.com, play games to earn money.
1. Re:Square or Curly brackets? by lmfr · 2006-12-16 00:44 · Score: 5, Informative
  The correct form is {}, not []. There are other things you can use with ${VAR}:
  
  ${VAR:-text if $VAR is empty}
  
  ${VAR:=text if $VAR is empty and set VAR to this}
  
  ${VAR:+text if $VAR is set}
  
  ${#VAR} -> length of $VAR
  
  ${VAR#pattern} or ${VAR##pattern} -> remove match of pattern from beginning of $VAR (## -> longest match)
  
  ${VAR%pattern} or ${VAR%%pattern} -> remove match of pattern from end of $VAR (%% -> longest match)
  
  There are other formats (see the man page), but these are the ones I use the most. Eg:
  for i in *.png; do convert "$i" "${i%.*}.jpg"; done
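
A minimal sketch of a few of the expansions listed above, run against made-up values (the variable names and the file name are assumptions for illustration only, not from the article):

```shell
#!/bin/sh
# Illustrative values only; NAME and file are hypothetical.
unset NAME
file="photo.final.png"

default=${NAME:-anonymous}   # NAME is unset, so the fallback text is used
length=${#file}              # number of characters in $file
stem=${file%.*}              # remove the shortest trailing match of ".*"
base=${file%%.*}             # remove the longest trailing match of ".*"
ext=${file##*.}              # remove the longest leading match of "*."

printf '%s\n' "$default" "$length" "$stem" "$base" "$ext"
```

The difference between % and %% only shows up when the pattern can match more than once, which is why the sample name has two dots.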
Typo by seebs · 2006-12-15 23:31 · Score: 2, Informative

The quoted paragraph from the article is incorrect -- and it is in the article too -- but the example immediately following it correctly shows the use of braces ("curly brackets"), not square brackets, for variable names in shell.

--
My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
This article... by nevali · 2006-12-15 23:43 · Score: 5, Informative

...is so littered with basic errors that it really shouldn't be recommended to anybody. How is 'tar xvf -C tmp/a/b/c newarc.tar.gz' expected to work, for example? Quote variables with square brackets? Running subshell commands using ; instead of && ? No mention of 'xargs -0' ? Don't pipe from cat to grep? Does anybody actually care that people do this (primarily so that the syntax is consistent between a munged- and unmunged-grep, and also such that the order of the command-line is logical from a human point of view)? Plus, of course, it's possible that cat | grep could yield better performance than grep alone: if cat uses mmap() to efficiently read the input files, and the kernel's pipe implementation is good, then it could do better than a grep implementation alone that simply read()s the files.
1. Re:This article... by treat · 2006-12-16 05:08 · Score: 3, Informative
  
  You're the only one who hasn't mentioned xargs -0. I think it's important to elaborate on this. You should never do "find | xargs" or "find | cpio", you should always do "find -print0 | xargs -0" and "find -print0 | cpio -0". The former will break if filenames have spaces or newlines in them. You break xargs if filenames have quotes, backslashes, or spaces in them. I never come across a large data set where you can do find | xargs without the -0 option.
  
  If you are encountering data created by untrusted users, don't forget the strange consequences of filenames that contain newlines.
  
  Failing to use -0 is dangerous malpractice.
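
A small sketch of the space problem described above. It counts how many arguments xargs passes along in each mode; the file names are made up, and it assumes the scratch directory path itself contains no whitespace and that find and xargs support -print0 and -0:

```shell
#!/bin/sh
# Scratch directory with one awkward file name (hypothetical names).
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/two words.txt"

# sh -c 'echo $#' _ ARGS... prints how many arguments it was given.
naive=$(find "$dir" -type f | xargs sh -c 'echo $#' _)
safe=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo $#' _)

echo "plain pipe: $naive arguments; -print0 | -0: $safe arguments"
rm -rf "$dir"
```

There are only two files, so any count other than two means a name was split apart on its way through the pipe.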
Re:Why? by Timothy+Brownawell · 2006-12-15 23:58 · Score: 2, Informative

What is Unix?
*nix is a highly modular component-based software system with a standard interface (flat byte streams) between components, and a basic set of standard components (given in the POSIX standard) that can be relied upon to always be present.
absolute drivel by Anonymous Coward · 2006-12-16 00:35 · Score: 4, Informative

This is, without a doubt, the most worthless article I have ever seen, both on Slashdot and on ibm.com, of which I thought better. It is not that the article is boring, but that it is factually incorrect in some places.
"the only excuse to define directories individually was that your mkdir implementation did not support this option, but this is no longer true on most systems. IBM, AIX®, mkdir, GNU mkdir, and others that conform to the Single UNIX Specification now have this option."
This is nonsense. The expansion of the path components in the {braces} is not a function of mkdir(1), but of the shell, and how its argument expansion is configured. I cannot believe that anyone "with 20 years of experience" is brazenly quoting names of standards in an effort to give his ramblings an air of credibility. Actually, wait a minute...
"Another bad usage pattern is moving a .tar archive file to a certain directory because it happens to be the directory you want to extract it ..."
Better is to check what's in the archive before extracting it, in case some inconsiderate fool has failed to put a top-level directory in it.
"His research interests include digital publishing and the future of the book."
Let me give you a couple of hints.
Re:Don't use shell by Anonymous Coward · 2006-12-16 00:40 · Score: 4, Informative

This code is not pure shellscript : it uses awk and wget to get the job done...

A Python equivalent might be :

#!/usr/bin/python
import os
for a in open('filename').readlines():
    os.system('wget ' + a)

It's not that much longer, it's much easier to read and less error-prone (especially the awk part), and it uses fewer external utilities.

To me, the *only* advantage of shellscript is that it's the only language that you are sure to find on any Unix system.
Eh? There's more to Unix than shell scripting by Taagehornet · 2006-12-16 01:32 · Score: 3, Informative

"10 good habits that improve your UNIX command line efficiency" would probably have been a better title.

The title did however bring back fond memories of Eric Raymond's The Art of Unix Programming. The book is available online, and if you were hoping for something a bit more substantial as well, then the section Basics of the Unix Philosophy might be worth a read.
Re:Don't use shell by duguk · 2006-12-16 01:33 · Score: 4, Informative

Um, whats wrong with

wget -i filename

Or have I missed something?

Monkeyboi
tar comment by thomasa · 2006-12-16 02:18 · Score: 2, Informative

In their example with tar they did

tar xvf

without the dash. (E.g., tar -xvf)

While that does work, I prefer to add
the dash as it makes it more consistent
with the other commands. So I consider
that a bad example. tar is one of the
older commands like dd that have weird
command line syntax.
Actually useful hints by Artraze · 2006-12-16 02:21 · Score: 5, Informative

As has been pointed out, this article is riddled with errors. It's also not very interesting. So in the interest of perhaps actually providing some interesting tips:

In scripts, prefix dangerous commands with an 'echo' for a test run (so you can catch all those rm -rf /).

Single quotes are the best quotes for plain strings. The only reason to use double quotes is if you need to quote a variable or a single quote.

Completion is fun, but using wildcards is more flexible (though you'll only want to use benign commands like cd, less, etc):
nano /etc/modules.autoload.d/kernel-2.6
nano /etc/m*a*d/*6

Note that the use of subpaths reduces the amount of flexibility.
cd /etc/m* -> /etc/mail
cd /etc/m*d -> /etc/modules.d
nano /*/m*/*6 -> /etc/modules.autoload.d/kernel-2.6, and /etc/modules.d/i386 (not quite!)

Finally, as a comment for the article, using:
test -e "$DIR" || mkdir -p "$DIR"
is much better than their suggestion and probably faster anyway. Though I'd just do "mkdir -p $DIR" and maybe "&>/dev/null" under most circumstances anyway.

That's all I can think of at this point. Anyone else have tips?
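
One possible way to package the echo-prefix tip and the mkdir -p remark into a script, as a rough sketch. DRY_RUN and run are hypothetical names invented here, not anything from the article:

```shell
#!/bin/sh
# DRY_RUN and run() are made-up names for this sketch.
DRY_RUN=1

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"     # test run: show the command only
    else
        "$@"                     # real run: execute it unchanged
    fi
}

dir=$(mktemp -d)
target="$dir/a/b/c"

preview=$(run rm -rf "$target")  # nothing is deleted while DRY_RUN=1
echo "$preview"

mkdir -p "$target"               # creates the missing parents
mkdir -p "$target"               # already exists: still not an error
second_status=$?

rm -rf "$dir"
```

Because mkdir -p succeeds when the directory is already there, the separate test -e check is optional rather than required.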
Re:Very helpful by martin-boundary · 2006-12-16 02:33 · Score: 2, Informative

Especially since that example doesn't account for filesystem caching effects. There's no way of knowing if the bulk of the gain is because of the changed command or because the file is already in RAM, some background process was running, etc.
When timing commands, it's best to repeat the command several times and see if the times change significantly.
Re:lowercase uppercase by Scorpio · 2006-12-16 02:50 · Score: 2, Informative

for i in *.JPG ; do mv "$i" "`basename "$i" .JPG`.jpg" ; done
Re:welll.. by hackstraw · 2006-12-16 03:11 · Score: 5, Informative

cat-ing a file and then piping it to grep. surely that is a good point he is making, because grep already takes filenames as an argument?

That list was fairly arbitrary, but the piping cat thing is something that basically only annoys the most anal of anal, and they probably do it sometimes too.

It's common for me to do cat foo and then hit the up arrow and append a pipe to another command instead of editing the whole command line. Computers are pretty fast, and real anal people would use fgrep instead of grep, but again I always use egrep, because I never know when a regular expression will be edited into a more complex one, and to me all of the speeds are the same.

My #1 habit to tell people, although it is not a habit but just where to start, is to learn your shell. No, science guys, csh is not a worthy shell in 2006. If you have to suffer with the wacky behavior of a csh variant, at least use tcsh.

My #2 thing to learn is a text editor.

As far as habits go: first and foremost, unalias cp, mv, and rm if they have been aliased to use the -i flag. In my opinion, relying on that is a BAD habit to start. You WILL lose files sooner or later, and the more painful the loss the better, so that you will think and stop doing it. The -i flag will NOT stop you from redirecting into a file, and most dangerous of all, the -rf flags to rm will override that -i. Remote copies via rcp or scp will not honor the -i flag. Unarchiving an archive will not honor the -i flag. There are tons of ways to lose files, and you will lose them. It's a much better habit to universally save yourself from yourself by testing with -i, working off of a copy, thinking before you hit return, and creating new directories to eliminate clobbering a file. NEVER, EVER do tar cf foo.tar . or tar cf foo.tar *. You will piss off yourself and others by doing that.

Actually, this top 10 list is pretty lame, and should be ignored.
Re:welll.. by bugg · 2006-12-16 03:28 · Score: 3, Informative

I don't think it ever makes sense to use cat with one file - something I have seen far too many people do. To do so, logically, is to tell the commands to run through the file twice.

First you are telling cat to output the entire file, and then you are telling grep to go through the entire output of cat. If you're working with gigabytes of data here, that can quickly be a frustrating exercise! Folks who are in the mentality of using cat | grep and even a visual editor like vi instead of sed are up the creek when they find themselves needing to manipulate and get portions of very large data sets.

--
-bugg
Re:lowercase uppercase by Chandon+Seldon · 2006-12-16 03:46 · Score: 2, Informative

Why is the perl script so hard? The command line would have a bunch of sed going on, whereas the perl script only requires running perl.
I'm guessing... perl -e 'for(`ls`) { chomp; $n = lc $_; system("mv $_ $n"); }'

--
-- The act of censorship is always worse than whatever is being censored. Always.
Re:welll.. by theonetruekeebler · 2006-12-16 04:45 · Score: 2, Informative

The best reason to pipe to grep is to keep filenames out of the output. grep foo ?.txt will produce
a.txt: foo
b.txt: foo
c.txt: foo
whereas cat ?.txt|grep foo produces
foo
foo
foo
I've also seen Unixes where their shells are linked against spectacularly broken libc's. Under Tru64's Bourne and Korn shells, for example, a multithreaded program foo fork bombs when run as foo < z.txt, but works fine as cat z.txt|foo (foo < z.txt under Bash works, though, because Bash is linked against the GNU libc).

--
This is not my sandwich.
Re:lowercase uppercase by multipartmixed · 2006-12-16 06:07 · Score: 3, Informative

Oh, for the love of God, stop bitching about how hard this is to do under UNIX without package x.y.z installed and imagine doing it under Mac OS9 or Windows.

Break it down into its constituent parts: Iterate, rename. Whoo! Simple now, eh?

# cd cameraDir
# find . -type f -prune | while read file
> do
> mv "$file" "`echo \"$file\" | tr '[A-Z]' '[a-z]'`"
> done
#

... should do the trick. If you don't have tr, you can use sed with the y command. But I can't think of the last time I saw a box w/o tr.

--

Do daemons dream of electric sleep()?
Re:lowercase uppercase by newt0311 · 2006-12-16 07:08 · Score: 2, Informative

oh, I ran into this problem too many times so I set up the following function:

function pmv ()
{
    local src dest oifs
    oifs=$IFS
    export IFS=$'\n'
    src="${1:? 'Error: input pattern not specified'}"
    dest="${2:? 'Error: destination pattern not specified'}"

    for i in $src ; do
        mv "$i" $(sed -e "s/$dest/" <<< "$i")
    done
    export IFS=$oifs
}

Now I can move files by regexps by typing pmv <file-match-pattern> <sed-matcher>/<replacement>
very very convenient.
Re:I don't think I ever use xargs by Anonymous Coward · 2006-12-16 07:25 · Score: 1, Informative

xargs can split up the args to execute bar multiple times.
bar `foo` can fail if `foo` happens to make the argument list too large.

Also use find -print0 | xargs -0 instead of find -exec if you prefer not forking a jillion times.
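
To make the batching visible, here is a tiny sketch that forces xargs to use small batches with -n. The five words are made-up input; in real use xargs picks the batch size itself from the system's argument-length limit:

```shell
#!/bin/sh
# Five hypothetical arguments, at most two per invocation of echo.
batches=$(printf '%s\n' one two three four five | xargs -n 2 echo)
echo "$batches"
```

Each output line comes from a separate run of echo, which is the same splitting that keeps a huge file list from overflowing a single command line.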
Re:FP? by lahi · 2006-12-16 07:27 · Score: 2, Informative

Actually it will work with Korn shell as well, and probably zsh too. Not to mention that many systems have a /bin/sh which is Bourne-compatible but enhanced, so many /bin/sh implementations support this, not just those on bash-based Linux systems.

-Lasse
Re:lowercase uppercase by value_added · 2006-12-16 09:01 · Score: 2, Informative

$ for i in *JPG ; do mv $i ${i/JPG/jpg} ; done

This isn't the first time I've seen this, but it will result in a file MYJPG.JPG being called MYjpg.JPG. ${i//JPG/jpg} would be better, as at least it would end up with the .jpg at the end, but ${i%.JPG}.jpg would be best.

Again, there are lots of ways to do it. To use the trivial JPG -> jpg example, yes, you're correct in that using the shortest match at the end would be a better approach (excluding other issues). I just wanted to illustrate the redundant (and typically overused) use of basename with a simple example, and remind folks that using parameter expansion is preferable both in interactive form and in scripts.

Me, I've always relied on Larry Wall's script exclusively to rename files interactively. Scripts, on the other hand, are often best written with /bin/sh in mind, and should as a rule be as simple, clean and efficient as possible.
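
For what it's worth, the suffix-stripping form discussed above can be tried safely in a scratch directory. This is only a sketch with made-up file names; it sticks to ${f%.JPG}, which plain /bin/sh understands:

```shell
#!/bin/sh
# Scratch directory with two hypothetical camera files.
dir=$(mktemp -d)
touch "$dir/MYJPG.JPG" "$dir/beach day.JPG"

for f in "$dir"/*.JPG; do
    [ -e "$f" ] || continue        # the glob matched nothing
    mv -- "$f" "${f%.JPG}.jpg"     # strip only the trailing .JPG
done

count=$(ls "$dir" | grep -c '\.jpg$')
if [ -e "$dir/MYJPG.jpg" ]; then kept_stem=yes; else kept_stem=no; fi
echo "renamed: $count, MYJPG stem untouched: $kept_stem"
rm -rf "$dir"
```

Quoting both arguments to mv is what lets the name with a space in it survive the loop.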
Re:FP? by Mr+Z · 2006-12-16 10:48 · Score: 2, Informative

Proprietary? From the man page:

Bash is intended to be a conformant implementation of the IEEE POSIX Shell and Tools specification (IEEE Working Group 1003.2). Bash can be configured to be POSIX-conformant by default.

--
Program Intellivision!
bad ibm no cookie by illuminatedwax · 2006-12-16 16:28 · Score: 3, Informative

Great, IBM, way to ignore the dreaded "xargs" security bug! Seriously, IBM notices some kind of obscure danger about underscores, but completely ignores the fact that xargs separates arguments by newlines??

Let's say I'm a sysadmin and I'm running as root, trying to remove all the files in the /tmp directory by a certain user for some reason:
find /tmp -user 1001 | xargs rm

User 1001 has a directory in /tmp called "haxor\n". Inside there he puts another directory "etc" and inside there he puts a file called "passwd."

Can you guess what happens?
find prints:
/tmp/tmp43cc91
/tmp/haxor

/tmp/haxor
/etc/passwd
xargs sees: ["/tmp/tmp43cc91","/tmp/haxor","","/tmp/haxor","/etc/passwd"]
Oops!! You just hosed your system!

The correct way to use xargs is to use the -0 switch, which will separate the input by null characters, which cannot appear in filenames. find has a handy -print0 option which will output the correct output:

find /tmp -user 1001 -print0 | xargs -0 rm

And your system is safe.
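
The scenario above can be reproduced harmlessly in a scratch directory. In this sketch nothing is removed outside that directory: the command run by xargs only reports whether it was handed /etc/passwd as an argument. The directory layout mirrors the example, and it assumes find and xargs support -print0 and -0:

```shell
#!/bin/sh
top=$(mktemp -d)
nl='
'
# A directory whose name ends in a newline, as in the example above.
mkdir -p "$top/haxor$nl/etc"
touch "$top/haxor$nl/etc/passwd"

# Report "hit" if any argument is exactly /etc/passwd.
check='for a in "$@"; do if [ "$a" = /etc/passwd ]; then echo hit; fi; done'

naive=$(find "$top" -name passwd | xargs sh -c "$check" _)
safe=$(find "$top" -name passwd -print0 | xargs -0 sh -c "$check" _)

echo "plain pipe: '$naive'   -print0 | -0: '$safe'"
rm -rf "$top"
```

With the plain pipe the embedded newline splits one path into two arguments, and the second one is the real /etc/passwd.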

--
Did you ever notice that *nix doesn't even cover Linux?