How To Adopt 10 'Good' Unix Habits
An anonymous reader writes to mention an article at the IBM site from earlier this week, which purports to offer good Unix 'habits' to learn. The ten simple suggestions may be common sense to the seasoned admin, but users with less experience may find some helpful hints here. From the article: "Quote variables with caution - Always be careful with shell expansion and variable names. It is generally a good idea to enclose variable calls in double quotation marks, unless you have a good reason not to. Similarly, if you are directly following a variable name with alphanumeric text, be sure also to enclose the variable name in square brackets ([]) to distinguish it from the surrounding text. Otherwise, the shell interprets the trailing text as part of your variable name -- and most likely returns a null value."
export POST="first"
www.isoHunt.com
An anonymous reader writes to mention an article at the IBM site from earlier this week, which purports to offer good Unix 'habits' to learn.
I seriously doubt reading this article is going to get anyone to start showering on a regular basis.
Push Button, Receive Bacon
Join moola.com, play games to earn money.
or even better -- use perl.
The quoted paragraph from the article is incorrect -- and it is in the article too -- but the example immediately following it correctly shows the use of braces ("curly brackets"), not square brackets, for variable names in shell.
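For anyone who wants to see the difference, here is a minimal sketch, assuming a POSIX shell. The variable names are made up for illustration and are not from the article.

```sh
#!/bin/sh
# Hypothetical variable, for illustration only.
HOUSE="my house"

# Without braces the shell looks up a variable named HOUSEpet,
# which is unset here, so it expands to nothing.
without_braces="$HOUSEpet"

# With braces the variable name ends at the closing brace.
with_braces="${HOUSE}pet"

printf 'without braces: [%s]\n' "$without_braces"
printf 'with braces:    [%s]\n' "$with_braces"
```

Square brackets do nothing of the kind: in "$HOUSE[pet]" the shell expands $HOUSE and leaves the bracketed text alone.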
My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
$ cd tmp/a/b/c || mkdir -p tmp/a/b/c
If the directory exists you end up in the directory; if it does not, the command creates the directory but leaves you where you started. Hence you don't know which directory you will be in after the command is executed!
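One way around that, as a rough sketch: create first, then change directory, so you land in the same place either way. The function name mkcd is my own invention, not something from the article.

```sh
#!/bin/sh
# mkcd DIR: create DIR (and any missing parents), then enter it.
# mkdir -p succeeds whether or not DIR already exists, so the cd
# always runs against a directory that is there.
mkcd() {
    mkdir -p -- "$1" && cd -- "$1"
}
```

After "mkcd tmp/a/b/c" you are in tmp/a/b/c regardless of whether it existed beforehand.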
There are four sorts of people in the world: fools, lunatics, idiots and morons. - Umberto Eco, Foucaut's pendulum.
(plus he didn't mention my favourite shortcut: shell history)
How about being more inclusive and expanding this to deal with security features (surely the single biggest benefit?) and the ease of working on remote boxes?
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
...is so littered with basic errors that it really shouldn't be recommended to anybody. How is 'tar xvf -C tmp/a/b/c newarc.tar.gz' expected to work, for example? Quote variables with square brackets? Running subshell commands using ; instead of && ? No mention of 'xargs -0' ? Don't pipe from cat to grep? Does anybody actually care that people do this (primarily so that the syntax is consistent between a munged- and unmunged-grep, and also such that the order of the command-line is logical from a human point of view)? Plus, of course, it's possible that cat | grep could yield better performance than grep alone: if cat uses mmap() to efficiently read the input files, and the kernel's pipe implementation is good, then it could do better than a grep implementation alone that simply read()s the files.
OK, I agree. Please provide a concise Python script that unpacks a tarball (a .tar.gz or .tar.bz2 file), copies new files in to a said tarball, patches based on the contents of said new files, runs make from various directories in said extracted tarball, and then changes the name of the top-level directory created by the tarball to a new name and repacks the tarball.
Or a concise Python script that opens up a text file of URLs, and extracts the files listed in the URLs:
#!/bin/sh
for a in $( cat file | awk '{print "'\''" $0 "'\''"}' ) ; do
wget $a
done
Python has its place, and is far better for medium to large projects, and projects where the code needs to be maintainable. Shell, however, works a lot better for automating UNIX tasks than Python does. Not to mention embedded systems: I can compile Busybox to have both a good shell and all of the commands that one would run from shell scripts (including grep, cut, sed, and, yes, awk) in only about 300k. A Python binary is about a megabyte big, and you need about ten megabytes to fit all of the libraries Python 2.4 comes with.
*nix is a highly modular component-based software system with a standard interface (flat byte streams) between components, and a basic set of standard components (given in the POSIX standard) that can be relied upon to always be present.
No, don't mod up anybody in this thread. Perl and Python are abominations. Pure, unadulterated Bourne shell is for the true, seasoned *nix user. Just like Java is an answer to a question nobody asked in the GUI world, so too is Perl and Python in the command line world.
I am so glad that he showed what a difference can make, because I was *really* getting annoyed at having to wait that extra
If I've got a simple task to do (e.g. the text-file-of-URLs example) then I knock it up in shell script. By the time that simple task has feature-creeped up to more than 20 lines I start to wish I'd written it in Perl. So I rewrite. By the time that Perl script has crept up to more than 200 lines I start to wish it was written in Python. So I rewrite. By the time that Python script has crept up to 2000 lines I start to wish I'd farmed the job out to a team of programmers, and I give up caring what language it's written in and make them do it as a web service. Then I write a small shell script to call their web service. When that shell script has feature-creeped up to more than 20 lines...
1. Don't rm with an absolute path because you could easily type
#rm -r -f / tmp/dir
when "all" you wanted was
#rm -r -f /tmp/dir
instead do this:
#(cd /tmp ; rm -r -f dir)
or even better use sudo if you have it:
$(cd /tmp ; sudo rm -r -f dir)
2. When logged on as root or when using sudo on a production system think things over at least twice before hitting enter.
3. Make sure at all times you're on the right machine, logged on as the right user in the right directory.
Set up your shell prompt to look like this user@host /path$
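A possible refinement of tip 1, sketched under the assumption of a POSIX shell: chain the cd and the rm with && so that a failed cd stops the rm from running in whatever directory you happen to be in. The helper name remove_in is hypothetical.

```sh
#!/bin/sh
# remove_in PARENT NAME: delete NAME inside PARENT.
# The "&&" makes rm run only if the cd succeeded, and the
# parentheses keep the directory change inside a subshell,
# so the caller's working directory is untouched.
remove_in() {
    ( cd -- "$1" && rm -r -f -- "./$2" )
}
```

Usage would be something like "remove_in /tmp dir". If /tmp could not be entered, nothing is deleted.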
This code is not pure shellscript : it uses awk and wget to get the job done...
A Python equivalent might be :
It's not that much longer, it's much easier to read and less error-prone (especially the awk part), and it uses fewer external utilities.
To me, the *only* advantage of shellscript is that it's the only language that you are sure to find on any Unix system.
No shit, Sherlock! You have clearly never worked in a large organisation, where - believe it or not - you, as a standard user, do not actually get to insist that the already-overworked IT department jump through bureaucratic hoops to install your favourite bloated scripting language, unless you have a damn good business case for it. And probably not even then.
Hint: if the task you want that scripting language to accomplish is trivial to achieve with a simple shell script, you don't have a good business case.
* This doesn't apply to wget, obviously, but if your platform really has no standard alternative, you are more likely to persuade IT to install something small and simple like wget, fetch, curl, etc. than a complete programming environment like Python.
"10 good habits that improve your UNIX command line efficiency" would probably have been a better title.
The title did however bring back fond memories of Eric Raymond's The Art of Unix Programming. The book is available online, and if you were hoping for something a bit more substantial as well, then the section Basics of the Unix Philosophy might be worth a read.
Um, what's wrong with
wget -i filename
Or have I missed something?
Monkeyboi
Articles on terminating zombie children are always a treat too.
One line blog. I hear that they're called Twitters now.
Yes -- and habits are what people desperately need. The people I know primarily need three habits: RTFM when they don't understand something; adjusting their behavior based on the FM; and managing their use of the current directory (i.e. you don't have to cd into a directory to use a file which lives there).
In their example with tar they did
tar xvf
without the dash. (E.g., tar -xvf)
While that does work, I prefer to add
the dash as it makes it more consistent
with the other commands. So I consider
that a bad example. tar is one of the
older commands like dd that have weird
command line syntax.
As has been pointed out, this article is riddled with errors. It's also not very interesting. So in the interest of perhaps actually providing some interesting tips:
In scripts, prefix dangerous commands with an 'echo' for a test run (so you can catch all those rm -rf /).
Single quotes are the best quotes for plain strings. The only reason to use double quotes is if you need to quote a variable or a single quote.
Completion is fun, but using wildcards is more flexible (though you'll only want to use benign commands like cd, less, etc.):
nano /etc/modules.autoload.d/kernel-2.6
nano /etc/m*a*d/*6
Note that the use of subpaths reduces the amount of flexibility.
cd /etc/m* -> /etc/mail
cd /etc/m*d -> /etc/modules.d
nano /*/m*/*6 -> /etc/modules.autoload.d/kernel-2.6, and /etc/modules.d/i386 (not quite!)
Finally, as a comment for the article, using:
test -e "$DIR" || mkdir -p "$DIR"
is much better than their suggestion and probably faster anyway. Though I'd just do "mkdir -p $DIR" and maybe "&>/dev/null" under most circumstances anyway.
That's all I can think of at this point. Anyone else have tips?
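The echo tip can be made switchable, roughly like this. It is only a sketch: DRY_RUN and run are names I made up, not anything standard.

```sh
#!/bin/sh
# run CMD [ARGS...]: execute the command, or just print it when the
# (hypothetical) DRY_RUN variable is set to 1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}
```

In a script you would write "run rm -rf "$DIR"" everywhere, do the first pass with DRY_RUN=1, read the output, and only then run it for real.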
Yuck, I never use bash scripts. I always use Perl scripts. I just do things like
#!/usr/bin/perl
system("blah");
system("blah");
if(perl code perl code) {
system("blah");
}
etc.
why?
1. because i can't remember the awful syntax of the bash if statement. isn't it something like
if[[""$X$$"" == ""$Y""]];; ... fi ?
2. how about accepting command line arguments in bash? in perl it's just $ARGV[0]. nice and simple and like C++ (except for the offset by one) so i don't want to have to bother learning another one.
3. because i can't bother learning how to do a regular expression in bash. in perl it's simple with =~/.../ and =~s/.../.../ and it was bad enough that PHP isn't like that.
4. because bash seems to think that sometimes you use x and sometimes you use $x
x="hi"
echo $x
i really don't want to learn this language. so i just use Perl everytime i need a script. it works.
for i in *.JPG ; do mv $i `basename $i .JPG`.jpg ; done
May contain traces of nut.
Made from the freshest electrons.
2. how about accepting command line arguments in bash? in perl it's just $ARGV[0]. nice and simple and like C++ (except for the offset by one) so i don't want to have to bother learning another one.
Command line args? $1 $2 etc or $* for all of them.
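One caveat worth a sketch: unquoted $* re-splits arguments that contain spaces, while "$@" passes them through as they were given. The function names below are hypothetical and the file names are just examples.

```sh
#!/bin/sh
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# Forward the caller's arguments two different ways.
forward_star() { count_args $*; }     # unquoted: split again on whitespace
forward_at()   { count_args "$@"; }   # quoted: one word per original argument

echo "first argument is \$1, e.g. in a script: $0"
echo "via \$*:   $(forward_star "Monthly Report.txt" notes.txt) arguments"
echo "via \"\$@\": $(forward_at "Monthly Report.txt" notes.txt) arguments"
```

So for anything that might contain spaces, "$@" is usually the safer default.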
-- "You can lead a yak to water, but you can't teach an old dog to make a silk purse out of a pig in a poke" - Opus
Why is the perl script so hard? The command line would have a bunch of sed going on, whereas the perl script only requires running perl.
I'm guessing... perl -e 'for(`ls`) { chomp; $n = lc $_; system("mv $_ $n"); }'
-- The act of censorship is always worse than whatever is being censored. Always.
Wow, do you think you could be just a little bit more polite next time?
You should be using gzcat, not zcat, anyhow. zcat is only portably able to be compress -d.
gzcat will never be broken in the way described, hence the following is fine and portable IME:
gzcat arc.tar.gz | ssh user@foo 'cd tmp/a/b/c && tar -xvf -'
HOWEVER, I find that even vaguely modern CPUs are much faster at gunzipping than typical internet speeds. So, I would use this myself:
cat arc.tar.gz | ssh user@foo 'cd tmp/a/b/c && gzcat | tar -xvf -'
On the other hand, I would never actually write that, because if I had the archive in place, I'd just transfer it with scp and untar it myself on the remote end. Unless, of course, cat in the example is just a place holder for 'arbitrary cool shit in the pipeline'.
HEY! PYTHON WEENIE! YEAH YOU, UP THERE!
Let's see you do this in your bloatware:
(cd /updateDir && find . -newer timestamp -type f | tar -T - -zcf -) | ssh user@foo 'cd /stagingDir && tar -zxvf -'
Incidentally, dropping the "z" flags and adding "-C" to ssh will make this totally cross platform, even to non-gnu-land, back as far as ssh 0.99 without significant penalty or performance difference. A reasonable alternative, before about 1992, would have been:
(cd /updateDir && find . -newer timestamp -type f | tar -T - -cf -) | compress | rsh -l user foo 'cd /stagingDir && compress -d | tar -xvf -'
Moo hoo hahaha
What would you python weenies do if confronted with an AIX 2 or a SunOS 4 box? Go home and backport the behemoth? Any one of my sysadmins -- who have never used either of those OSs but know shell -- could solve that problem in five minutes flat.
Do daemons dream of electric sleep()?
Oh, for the love of God, stop bitching about how hard this is to do under UNIX without package x.y.z installed and imagine doing it under Mac OS9 or Windows.
... should do the trick. If you don't have tr, you can use sed with the y command. But I can't think of the last time I saw a box w/o tr.
Break it down into its constituent parts: Iterate, rename. Whoo! Simple now, eh?
# cd cameraDir
# find . -type f -prune | while read file
> do
> mv "$file" "`echo \"$file\" | tr '[A-Z]' '[a-z]'`"
> done
#
Do daemons dream of electric sleep()?
Ben Hocking
Need a professional organizer?
I'm evaluating the tips based on them being prescriptions for things to do in interactive shell behavior, since that seems to be the theme. Writing scripts changes the situation to make some tips valuable. My number one tip as a response to these is don't try to be too clever (particularly when the biggest benefit of the approach is to say 'look how clever it is!'). Maybe it's because I don't work in a vacuum and all too many times have been called in to clean up where an administrator tried to do something too complicated for their understanding.
mkdir -p is a convenience people should be aware of, but telling people to start getting overly creative with the shell expansion behavior is asking for mistakes/trouble. Also, having a mkdirhier script in case the example isn't supported on all shells is an indication that you shouldn't get overly cozy if you are going to be dealing with a lot of different systems/users with different default shells. The amount of time a lot of people take to figure out the 'clever' way to phrase the expansion so the shell will expand it right is often longer than just typing the two extra lines that the less clever approach takes. I'm not saying this isn't useful, but in my experience too many people mess things up too frequently, or take too long to think up the expressions, so trying to be clever ends up taking more time than they think they are saving.
Changing the path instead of the archive is not that dire to do normally, but if you avoid it, to me it's just easier to be in the target directory and use the full path to the archive.
On combining commands, I second that ; can be dangerous and && as a default will make the chain more ready to break, but again I advise against trying to be so clever as to put all you want on one line. Some things go wrong that aren't reflected in return codes, and doing it one at a time lets you think of those. True, though, that && never assumes the first command works, while your fingers may keep moving and hit enter on the next command before your brain realizes the command failed, so && may have merit, but then again taking your time may have more merit.
On the quotation thing, true enough, you must understand how quoting works to do remotely complex things, particularly nested circumstances (i.e. ssh to a system to run a command, where the output will be parsed by two shells.)
On the breaking up long lines thing, in a shell script it may be more necessary, but on an interactive command line it could also indicate you are trying too hard to do things in one chunk. I admit sometimes it does get too wide, but particularly less experienced admins should consider if there were a simpler way to do it in smaller chunks they won't screw up.
Grouping commands is important to know, and harmless (better than repeating the same pipe over and over and more powerful).
I will say xargs is way way over-rated. Too many people, particularly dealing with directory trees containing spaces, get into trouble piping the output of find into anything when IFS causes something like "/tmp/Monthly Report" to be parsed as two different files. find has a competent filtering mechanism (-type, -iname, -name, etc...) and its own -exec. find is well aware of the state of each file. You could assign IFS to try to avoid it, but using find's built-ins where possible alleviates it.
When you are talking about interactive shell operation, picking the .01s instead of the .09s operation is a bad example. He could have set up a much larger demonstration that would have been useful, but this just makes people mock the example. In any event, this seems like an okay thing to convey, but I dunno if it would've made my top 10.
Probably a more valid point about using awk, and a common trap I do see people stuck in.
On piping cat, that seems like more an annoyance than anything constructive. Some people use the cat | grep construct because it is so unambiguou
XML is like violence. If it doesn't solve the problem, use more.
oh, I ran into this problem too many times so I set up the following function:
function pmv ()
{
local src dest oifs
oifs=$IFS
export IFS=$'\n'
src="${1:? 'Error: input pattern not specified'}"
dest="${2:? 'Error: destination pattern not specified'}"
for i in $src ; do
mv "$i" "$(sed -e "s/$dest/" <<< "$i")"
done
export IFS=$oifs
}
Now I can move files by regexps by typing pmv <file-match-pattern> <sed-matcher>/<replacement>
very very convenient.
$ for i in *JPG ; do mv $i ${i/JPG/jpg} ; done
This isn't the first time I've seen this, but it will result in a file MYJPG.JPG being called MYjpg.JPG, ${i//JPG/jpg} would be better as at least then it would end up with the .jpg at the end, but ${i%.JPG}.jpg would be best.
Again, there's lots of ways to do it. To use the trivial JPG -> jpg example, yes, you're correct in that using the shortest match at the end would be a better approach (excluding other issues). I just wanted to illustrate the redundant (and typically overused) use of basename with a simple example, and remind the folks that using parameter expansion is preferable both in interactive form, and in scripts.
Me, I've always relied on Larry Wall's script exclusively to rename files interactively. Scripts, on the other hand, are often best written with /bin/sh in mind, and should as a rule be as simple, clean and efficient as possible.
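For what it's worth, here is how the ${i%.JPG}.jpg variant might look as a /bin/sh sketch with the quoting filled in. The function names are mine, purely for illustration.

```sh
#!/bin/sh
# lower_jpg_ext NAME: print NAME with a trailing ".JPG" changed to ".jpg".
# Names without that suffix are printed unchanged.
lower_jpg_ext() {
    case $1 in
        *.JPG) printf '%s\n' "${1%.JPG}.jpg" ;;
        *)     printf '%s\n' "$1" ;;
    esac
}

# rename_jpgs: rename every *.JPG file in the current directory.
rename_jpgs() {
    for f in *.JPG; do
        [ -e "$f" ] || continue   # the pattern matched nothing
        mv -- "$f" "$(lower_jpg_ext "$f")"
    done
}
```

Because % strips only from the end of the name, MYJPG.JPG comes out as MYJPG.jpg, and the quotes keep names with spaces in one piece.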
People who argue that piping a single file via cat is the best method are wrong. The following method has all of the advantages you cite, but is also shorter to type, uses fewer system resources (no cat process, no pipe(2) object), and doesn't require you to "rethink the input method" in case you want to change the grep command:
$ <file.txt grep foobar
Few people know that input-redirection can be established before the command name :)
Great, IBM, way to ignore the dreaded "xargs" security bug! Seriously, IBM notices some kind of obscure danger about underscores, but completely ignores the fact that xargs separates arguments by newlines??
Let's say I'm a sysadmin and I'm running as root, trying to remove all the files in the /tmp directory by a certain user for some reason:
find /tmp -user 1001 | xargs rm
User 1001 has a directory in /tmp called "haxor\n". Inside there he puts another directory "etc" and inside there he puts a file called "passwd."
Can you guess what happens?
find prints:
/tmp/tmp43cc91 /tmp/haxor /tmp/haxor /etc/passwd
xargs sees: ["/tmp/tmp43cc91","/tmp/haxor","","/tmp/haxor","/etc/passwd"]
Oops!! You just hosed your system!
The correct way to use xargs is to use the -0 switch, which will separate the input by null characters, which cannot appear in filenames. find has a handy -print0 option which will output the correct output:
find /tmp -user 1001 -print0 | xargs -0 rm
And your system is safe.
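If your find or xargs lacks -print0 and -0 (I'm assuming they may be missing on older systems), find's own -exec gets the same effect without a pipe. A rough sketch, with a hypothetical helper name:

```sh
#!/bin/sh
# remove_owned_files DIR USER: delete regular files under DIR owned by USER.
# "-exec ... {} +" hands the file names to rm directly as arguments,
# so spaces and newlines in names are never parsed a second time.
remove_owned_files() {
    find "$1" -type f -user "$2" -exec rm -f -- {} +
}
```

Something like "remove_owned_files /tmp 1001" would then leave /etc/passwd alone even with the "haxor\n" directory in place, since each path reaches rm as one intact argument.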
Did you ever notice that *nix doesn't even cover Linux?
Does anybody else notice these benchmarks are flawed? For an article discussing the shell, we should know that in this first benchmark, time is only counting the execution time of grep, and not wc, and is thus undercounting how much CPU time is actually used. How about a neat shell trick to correctly run that benchmark?
> ~ $ time grep and tmp/a/longfile.txt | wc -l
> 2811
>
> real 0m0.097s
> user 0m0.006s
> sys 0m0.032s
> ~ $ time grep -c and tmp/a/longfile.txt
> 2811
>
> real 0m0.013s
> user 0m0.006s
> sys 0m0.005s
> ~ $
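A note on the timing question, hedged because it depends on the shell: in bash, time is a reserved word and covers the whole pipeline, whereas an external time command only sees the first command. Handing the pipeline to a child shell sidesteps the difference. The sketch below uses made-up helper names and only checks that the two ways of counting agree.

```sh
#!/bin/sh
# Both helpers count the lines of FILE that contain PATTERN.
# (Hypothetical names; tr strips the padding some wc versions add.)
count_with_wc() { grep -- "$1" "$2" | wc -l | tr -d ' '; }
count_with_c()  { grep -c -- "$1" "$2"; }

# To time a whole pipeline as one unit with any time implementation,
# run it in a child shell, for example:
#   time sh -c 'grep and tmp/a/longfile.txt | wc -l'
#   time grep -c and tmp/a/longfile.txt
```

On a file this small the difference is mostly process startup, so a bigger input would make for a fairer comparison.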
Right. So 10 *really* good Unix "habits" would be:
1. Never use csh or any derivative thereof.
2. Know the portable behaviour of your Unix tools.
3. Learn to use ed, one day you'll be glad you did. You can also use ed and ex from scripts or from a command.
4. A shell command is a small program. If you are unsure about a command, test it first, like you would any program.
5. Learn to use the standard shell on your system.
6. Learn useful nonstandard extensions of utilities, but use them with care.
7. Never rely on an extension to the point that you forget how to do it portably. The definition of "portably" is up to you.
8. Learn to use csh enough that you can make do in an emergency, and learn *why* you shouldn't use it.
9. If your standard shell is Bash, learn Korn too. And vice versa. Learn both, how they differ, and how they differ from your standard shell.
10. Sometimes a real C program or a script in a different language is better than using shell.
-Lasse