I see resources about Unix text processing utilities, about Bash, about readline shortcuts, etc. etc. submitted very often to HN.
Exactly how useful are they? Would you say they're among the most important skills a programmer could have? Or do we just have a disproportionately large number of sysadmins among HN readers? Isn't some general-purpose language like Python or Ruby almost always much better, much faster than these solutions? Isn't it worth it to rather invest your time learning Python well -- instead of getting familiar with the segregated and messy environment of modern Unix-land? Certainly, the learning curve is much steeper for Unix utilities than, say, Python.
Scripting languages still usually represent a substantial typing overhead compared to throwing together a pipeline on the command line with standard Unix tools for many simple tasks.
The moment I see I'll need to do something many times, I'll consider writing a script, and then I'll often pick Ruby. But for one-off stuff, the command line is often faster once you get comfortable with a handful of Unix tools.
In fact I often find that even when I do things many times, the mental overhead of remembering "yet another script" is often high enough to make it faster to just re-compose the command line I want.
For example, I very frequently do some variation on "grep [some term] | sort | uniq -c | sort -n" to get a list sorted by number of occurrences of [some term], but the key part is "some variation", and that makes adding and remembering an alias less useful.
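A minimal sketch of one such variation, wrapped in a function so it is easy to try out. The name `count_matches` and its two arguments are hypothetical, chosen only for illustration; the pipeline itself is the one quoted above.

```shell
# Hypothetical helper: list each distinct matching line with its count,
# least frequent first (sort -n orders by the count that uniq -c prepends).
# Usage: count_matches PATTERN FILE
count_matches() {
    grep -- "$1" "$2" | sort | uniq -c | sort -n
}

# Typical variations on the same idea:
#   count_matches error app.log | tail -n 5      # five most frequent lines
#   grep -o 'user=[a-z]*' app.log | sort | uniq -c | sort -rn   # count a field
```

The first `sort` is needed because `uniq -c` only collapses adjacent duplicates. The file name `app.log` and the `user=` field in the comments are made-up examples.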
Another big consideration is that these tools are present on all or most machines many of us use.
For larger applications, having to install other packages is often no big deal, but for example I don't want to find myself dealing with an emergency and suddenly having to pull down tons of packages to use Ruby because I'm not comfortable with the tools that are already on the machine.
A number of things make me more inclined to look at writeups on using the command line, and command-line tools, than those on higher-level languages like Python.
First, many of the things I want to automate are most naturally done at the command line. For these, I already know the commands I want to run, and just need to script the logic that glues them together. Second, I mostly want small scripts I can send to colleagues, and not have to worry about whether or not they have Python (or whatever) installed.
I do write scripts in Python and Ruby, but they tend to be longer, since they reflect tasks where the data have to be pulled apart and put back together in multiple ways (for example, a dependency-generator for some custom makefiles I maintain). This sort of task favors building up structures in memory, over pipelines, and I use the appropriate tool accordingly.
As for the question whether any of these tools are the most important skills a programmer could have, no, I don't think so. As a programmer, your most important skills are in the language you use all the time, the language your "deliverable" is written in. Most people probably don't use the unix tools as their primary programming platform. But those tools support and extend the environment in which we get our "real" programming done.
To borrow the woodworking analogy from (I believe) the Pragmatic Programmers, all those articles about the Unix tools are probably the equivalent of articles on keeping your chisels and saws sharp. No, the file you use to keep your chisel sharp isn't your most important tool; your chisel or saw is your most important tool. But the craft of keeping your most important tool sharp can be fun and rewarding in itself.
By the way, speed is never an issue with any of the scripts I'm likely to run. But if it was, I doubt Python would be faster, since those scripts involve a lot of calls on system resources.
I used to have that attitude, and Python is still my main language. But I do more and more stuff in shell now. Every project I write has a whole bunch of shell snippets now, which saves a ton of code and lowers the barrier to automation considerably.
If you're trying to write shell scripts in Python, they're going to be 3-5x longer. Bash is a higher level language than Python. Every tool has its place, and bash and Python are complementary.
Also I realize there is some useless ceremony in Python standard practice. Do you want to know what my Python test runner looks like now?
$ find . -name \*_test.py | sh -x -e
That's it... no BS. I don't know what people are using in the Python world these days but some of it has drifted toward "framework land".
Also, it's not very hard to make this parallel, whereas it is somewhat annoying in Python. And you don't have to worry about global variables polluting each other -- tests stay independent.
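One way the parallel version might look, as a sketch rather than the exact setup described above. It assumes, like the one-liner, that every `*_test.py` file is executable and starts with a shebang line. The function name `run_tests_parallel` is hypothetical, and `xargs -P` is an extension found in GNU and BSD xargs rather than something every system guarantees.

```shell
# Hypothetical helper: run every *_test.py under DIR, up to JOBS at a time.
# Each test runs in its own process, so tests cannot share global state.
# Usage: run_tests_parallel DIR JOBS
run_tests_parallel() {
    # -print0 / -0 keep file names with spaces intact.
    # -n 1 starts one process per test file; -P caps how many run at once.
    find "$1" -name '*_test.py' -print0 |
        xargs -0 -n 1 -P "$2" sh -c '"$1"' sh
}

# Example: run_tests_parallel . 4
```

xargs returns a non-zero exit status if any test exits non-zero, so the function can be used directly in a build script or a CI step.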
I test a big part of my C/C++ code with shell scripts as well.
So basically I find it very helpful to think of yourself as writing shell utilities in Python. Python's not your world. It's part of your world.
Automating stuff at the command line is super important. If it's easier for you to do in Python, then that's fine. I use awk and bash for a lot of that stuff because it's what I happen to know.
Other people I know use perl. The important thing is to be able to automate anything non-trivial that you will do more than a single-digit number of times. One time I spent 30 minutes writing a script for things that a dozen people were doing a dozen times a day. It maybe saved a minute each time, but that's over 2 hours in the first day we had it. It paid off in time for me in 3 days, and probably less than that in terms of "damnit this is boring".
For me, at least, Python has a much higher friction to get from "here's the list of what I do" to "here's an automated version"
I do use Python for some of the more involved things (manipulating timestamps from bash is a pain), and I've converted some bash scripts into "real language" scripts when things started to get hairy, but most of the time, you get 90%+ of the time savings from the first 1% of effort.
Just yesterday I was able to write a one liner on the command line to analyze my skype usage. I wanted to see if it was making financial sense to keep skype alongside my prepaid wireless plan.
You can download your last six months of skype activity from skype.com and they are in the form:
After 15 minutes or so, I came up with the following one liner:
cut -f7 -d";" call_history* | grep -v "Duration" | awk -F: '{ s+=$1*60; s+=$2; if ($3 != 0) { s+=1 } } END {print s " minutes"}'
I could have done the same thing in perl or python in 5 minutes, but it was interesting to "program" only by hooking programs together to achieve the same thing.
After analyzing my skype logs, I found I used 1800 minutes. That would have cost me $180 with prepaid minutes, but only cost $30 with skype.
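To make the counting rule explicit, here is a small sketch of just the awk step. It assumes each input line is a duration in HH:MM:SS form, which is what the one-liner's arithmetic implies; the function name `sum_minutes` is hypothetical. The separator is passed with `-F:` so that it applies from the first record onward (an `FS` assignment inside the action only takes effect for the following line).

```shell
# Hypothetical helper: read one HH:MM:SS duration per line on stdin and
# print the total in whole minutes, rounding each call up to the next minute.
sum_minutes() {
    awk -F: '
        { s += $1 * 60 + $2          # hours as minutes, plus minutes
          if ($3 + 0 != 0) s += 1 }  # any leftover seconds bill a full minute
        END { printf "%d minutes\n", s }
    '
}

# Example:
#   printf "00:05:23\n01:00:00\n00:00:01\n" | sum_minutes
#   prints "67 minutes"  (6 + 60 + 1)
```

The figures quoted above imply a prepaid rate of $0.10 per minute, since $180 / 1800 minutes = $0.10.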
Yeah, it's a good idea to learn Python well. And perl.
But... I'm sitting here in this cubicle, in front of a PuTTY window, working on a production server for a large financial institution to manipulate text files, and I don't have access to either of those. Ruby is not installed.
Most of the time cut, comm, paste, diff, tr and ed|sed do my job. When they don't, there is an old awk binary. So, it depends on the job and the tools you have. IMO every developer should learn the bare minimum about UTP utilities. It won't hurt.
Depends, for one thing, on how frequently you intend to use it. Consider variable names: it is often said that it is desirable to have nice, long, explanatory names. But again, it depends. For variables that have big scopes, it is indeed important to have descriptive names, but for those that disappear after the line, it's OK to use i, j, k. In fact, short names in such cases increase comprehension rather than impede it. The reverse holds as well.
> Certainly, the learning curve is much steeper for Unix
> utilities than, say, Python
I am not certain about that at all. Yes, they may have a truckload of options, but I have never had to memorize them. Many argue that these tools don't strictly adhere to the "do one thing but do it well" philosophy, but they do that to a satisfying degree of approximation. Coreutils, textutils, find and xargs can go a really, really long way.
> Certainly, the learning curve is much steeper for Unix utilities than, say, Python.
Now what makes you think that? UNIX shell tools have a very consistent interface which is really pretty simple (pipes in, pipes out), they are well documented, and they're interactive and easy to play around with, with zero setup.
Also, the UNIX shell has been around for 30 years now and isn't going anywhere. So what's the best investment? Python will change more in the next five years.
I would say they are among the most important skills a programmer could have. I have saved a lot of time by being able to do fast manipulation with chains of things like sort, tr, sed, awk, head, tail, paste, strings, nm, hexdump, xargs, etc., and to get diagnostic information for deeper issues.
It's like anything else; each tool is just a building block. Piping the output of one into the input of another is incredibly powerful: you can do a large number of creative things with very little effort. It's not always computationally efficient, but many times that doesn't matter.
Python is another great tool, but sometimes a machete is quicker and easier. Different tools for different situations.
Honestly, I think it's a little of column A and a little of column B. While sometimes it can be useful and faster to bust out a one line bash shortcut, I think the HN community is also a bit guilty of over-romanticizing Unix/Bash/<Insert non-visual & somewhat geeky tool/language>.
About non-visual stuff I can simply answer that I am more productive with a shell, R, ggplot2 and python than with excel, just because I know this better. This is my stack along with matlab and mathematica in the bank where I work now, both basically at cli level. The majority of people in my division uses excel.
I also think about things in a very Unix way when using computers, largely because of 15 years of using Unix-like systems. It's the same when I need to code more than a couple of lines and do not have vim; my brain has become too used to it. I believe this is the case for many people on HN.
While I can't speak for other texts, UTP is my go-to book, and probably the go-to book, for learning Troff. I owe the typesetting of my ebook to it. Since it occupies that niche area, it is still quite relevant and useful today.
It might be that at any given point of time, there are a lot of people who want or need to learn Unix. I already know Python, but I think there is value in learning the fundamentals of the Unix shell programs. I've spent the first five years of my career on Windows, am now on Linux, and this book looks like a useful resource.