As a follow-up on my post about keeping things simple with AWK, I want to mention another piece of 1970s technology that keeps my scripts simple: sed. More specifically, sed's s command for text substitution.
For example, let's say I wanted to use AWK to process data where every line is a list of key-value pairs formatted like so: id=1, a=42, b=3.
AWK uses spaces as the default field separator, so out of the box the AWK fields would look like this: 'id=1,', 'a=42,' and 'b=3'.
Not very helpful if we want to access keys and values separately.
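To see the problem concretely, here is a small sketch that prints each field AWK finds with its default separator. The variable names (line, default_fields) are only for illustration.

```shell
# Sample line from the example above.
line='id=1, a=42, b=3'

# Print every field AWK sees when splitting on whitespace (the default).
default_fields=$(printf '%s\n' "$line" |
  awk '{ for (i = 1; i <= NF; i++) print i ": " $i }')

printf '%s\n' "$default_fields"
# 1: id=1,
# 2: a=42,
# 3: b=3
```

Each key is still glued to its value, and the commas come along for the ride.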
Changing the field separator to just '=' or just ',' doesn't help here either, because every field would still contain the other character.
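A simple sed 's/[=,]/ /g' converts the input data into id 1 a 42 b 3. Each comma leaves an extra space behind, but that is harmless because AWK treats a run of spaces as a single separator.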
When we feed this into AWK we get keys and values as separate fields: id, 1, a, 42, b, 3. Now it's easier to process the data.
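Put together, the pipeline might look something like this. It is a minimal sketch that assumes keys and values strictly alternate on every line; the variable names and the "key=..., value=..." output format are my own choices for illustration.

```shell
line='id=1, a=42, b=3'

# Step 1: sed turns every '=' and ',' into a space.
# Step 2: AWK splits on whitespace, so odd-numbered fields are keys
#         and even-numbered fields are the matching values.
pairs=$(printf '%s\n' "$line" |
  sed 's/[=,]/ /g' |
  awk '{ for (i = 1; i < NF; i += 2) print "key=" $i ", value=" $(i + 1) }')

printf '%s\n' "$pairs"
# key=id, value=1
# key=a, value=42
# key=b, value=3
```

With the fields separated like this, picking out a single value is just a matter of indexing, for example $4 for the value of a.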
On a side note, AWK supports regex field separators and it has string manipulation functions. However, in most cases I find it easier to pre-process the data with sed and pipe the output into AWK or whatever program I use to process the data. I know the s command by heart so I guess that's one factor that makes it feel so easy. In comparison, I would need to read the manual to be able to use regex in AWK's field separator or use AWK's string manipulation functions.
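For comparison, here is a rough sketch of the AWK-only route using a regex field separator. The pattern '[=,] *' is just one way to write it (an '=' or ',' followed by any number of spaces), and the variable names are again illustrative.

```shell
line='id=1, a=42, b=3'

# -F sets the field separator. A multi-character value is treated as a regex,
# so this splits on '=' or ',' plus any spaces that follow.
regex_pairs=$(printf '%s\n' "$line" |
  awk -F'[=,] *' '{ for (i = 1; i < NF; i += 2) print "key=" $i ", value=" $(i + 1) }')

printf '%s\n' "$regex_pairs"
# key=id, value=1
# key=a, value=42
# key=b, value=3
```

It produces the same fields without sed, at the cost of having to remember how AWK interprets the separator.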
My sed knowledge is pretty basic, but the 's/a/b/' syntax is more or less burned into my brain after 20+ years of vim use. I think the s command is a very simple, concise and beautiful way to substitute text.