Shell Scripting: Bash Arrays

I’m actually not a huge fan of shell scripting, in spite of the fact that I’ve been doing it for years, and am fairly adept at it. I guess because the shell wasn’t really intended to be used for programming per se, it has evolved into something that sorta kinda looks like a programming language from a distance, but gets to be really ugly and full of inconsistencies and spooky weirdness when viewed up close. This is why I now recode in Python where appropriate and practical, and just about all new code I write is in Python as well.

One of my least favorite things about Bash scripting is arrays, so here are a few notes for those who are forced to deal with them in bash. 

First, to declare an array variable, you can assign directly to a variable name, like this: 

myarr=('foo' 'bar' 'baz')

Or, you can use the ‘declare’ bash built-in: 

declare -a myarr=('foo' 'bar' 'baz')

The ‘-a’ flag says you want to declare an array. Notice that when you assign elements to an array like this, you separate the elements with spaces, not commas. 
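
Since spaces separate the elements, any element that itself contains a space has to be quoted. Here is a minimal sketch of that; the array names `quoted` and `unquoted` are made up for illustration and are not used elsewhere in this post:

```shell
#!/bin/bash
# Elements are split on whitespace, so quote any element that contains a space.
declare -a quoted=('foo bar' 'baz')   # two elements
declare -a unquoted=(foo bar baz)     # three elements

echo "${#quoted[@]}"     # number of elements in quoted
echo "${#unquoted[@]}"   # number of elements in unquoted

# declare -p prints the array in a form that shows each index and value.
declare -p quoted
```

The `declare -p` line is handy when debugging, because it shows exactly where bash thinks one element ends and the next begins.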

Arrays in bash are zero-indexed, so to echo the value of the first element of myarr, we do this: 

echo ${myarr[0]}
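
The curly braces are not optional here. A rough sketch of a few indexing forms follows; the variable names (`first`, `last`, `bare`, `nobraces`) are just for illustration, and the "last element" line assumes the array has no gaps in its indexes:

```shell
#!/bin/bash
myarr=('foo' 'bar' 'baz')

first="${myarr[0]}"                # foo
last="${myarr[${#myarr[@]}-1]}"    # baz: index is (element count - 1), assuming no gaps
bare="$myarr"                      # foo: a bare array name means element 0
nobraces="$myarr[0]"               # foo[0]: without braces, [0] is just literal text

echo "$first $last $bare $nobraces"
```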

Now that you have an array with values in it, at some point you’ll want to loop over it and do something with each value. There’s a little bit of confusion for the uninitiated in this area. For whatever reason, there is more than one way to list out all of the elements in an array. What’s more, the two ways act differently when they are used inside of double quotes (wtf?). To illustrate, cut-n-paste this into a script, and then run the script: 

#!/bin/bash
myarr=('foo' 'bar' 'baz')
echo ${myarr[*]}
echo ${myarr[@]}
echo "${myarr[*]}"
echo "${myarr[@]}" # looks just like the previous line's output
for i in "${myarr[*]}"; do # echoes one line containing all three elements
   echo "$i"
done
for i in "${myarr[@]}"; do  # echoes one line for each element of the array.
   echo "$i"
done

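Odd but true. Inside double quotes, the “@” expands each element of the array to its own “word”, while the “*” expands the entire set of elements to a single word, joined by the first character of IFS (a space by default). Without the quotes, both forms behave the same. 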
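
The difference is easier to see when an element contains a space. This is only a sketch: `count_args` is a throwaway helper I made up to count the words an expansion produces, and the array contents are different from the example above:

```shell
#!/bin/bash
myarr=('foo bar' 'baz')

# Hypothetical helper: prints how many arguments it was given.
count_args() { echo "$#"; }

star_words=$(count_args "${myarr[*]}")   # 1: everything joined into one word
at_words=$(count_args "${myarr[@]}")     # 2: one word per element
bare_words=$(count_args ${myarr[@]})     # 3: unquoted, 'foo bar' gets split again

# "${myarr[*]}" joins the elements with the first character of IFS.
joined=$(IFS=,; echo "${myarr[*]}")      # foo bar,baz

echo "$star_words $at_words $bare_words $joined"
```

In practice this means "${myarr[@]}" is the form you want for loops, and "${myarr[*]}" is mostly useful for joining elements into one string.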

Another oddity — to get just a count of the elements in the array, you do this: 

echo ${#myarr[*]} 

Of course, this also works: 

echo ${#myarr[@]}

And the funny thing here is, these two methods do not produce different results when inside of double quotes: both return the number of elements. I’d be hard pressed, of course, to figure out a use for counting the entire set of array elements as “1”, but it still seems a little inconsistent. 
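
One thing worth knowing about the count: it is the number of elements that have been assigned, not the highest index plus one. A small sketch, where the `sparse` array and the other variable names are invented for this example:

```shell
#!/bin/bash
myarr=('foo' 'bar' 'baz')

# The count is the same with * or @, quoted or not.
n_star="${#myarr[*]}"    # 3
n_at="${#myarr[@]}"      # 3

# Bash arrays can have gaps; the count only includes assigned elements.
sparse=()
sparse[0]='a'
sparse[5]='b'
n_sparse="${#sparse[@]}"    # 2
indexes="${!sparse[*]}"     # 0 5: the indexes that are actually set

echo "$n_star $n_at $n_sparse"
echo "$indexes"
```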

Also note that you don’t have to count the elements in the array; you can get the length of any single element, too: 

echo ${#myarr[0]} 

That’ll print 3 for the array we defined above, since ‘foo’ is three characters long. 
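
Here is a quick sketch that prints the length of every element, using a different array than the one above so the lengths differ. It also shows a pitfall: leaving off the index gives you the length of element 0, not the element count. The variable names are mine, for illustration only:

```shell
#!/bin/bash
myarr=('quux' 'foo' 'a longer element')

# Print each element alongside its length in characters.
for item in "${myarr[@]}"; do
    echo "${#item} $item"
done

first_len="${#myarr[0]}"    # 4
third_len="${#myarr[2]}"    # 16
bare_len="${#myarr}"        # 4: length of element 0, not the number of elements
count="${#myarr[@]}"        # 3
```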

Have fun!

  • Keegan

    Hey, maybe you could write a post on tips for sysadmin scripts in python. I much prefer python, but I find myself falling back to bash for shell scripts just cause a lot of things are quicker to do.

    Anyways, nice article :)

  • http://gedmin.as Marius Gedminas

    The thing that was hardest for me was figuring out the syntax for appending an element to the end of an array:

    myarr[${#myarr[*]}]="$newitem"

  • http://holdenweb.com/ Steve Holden

    I believe you’ll find that the myarr[*] vs. myarr[@] notation is merely reflecting the same difference that exists between $* and $@. Traditionally (?) in shell scripting the correct way to pass all your arguments, quoted, to another script has been

    otherscript "$@"

    Or so I seem to remember from about twenty years ago. So it might seem like a wtf, but it’s not unreasonable to differentiate between the two cases.

  • m0j0

    @Steve,

    The rule I learned, just a little over a decade ago now, is “always double quote your variables, period.” I have to say that the rule has served me pretty well in general, and I have been bitten in at least a few cases over the years when I’ve forgotten the rule. I also remember running into a snag when I interpreted the rule to mean “double quote anything that will later be replaced by something else”: double quoting a command substitution can be harmful in certain circumstances. The rule is specific to variables.

    Thanks for making the link in my brain between $* and myarr[*]. I hadn’t made that link for whatever reason. Bash is somewhat unique in that your use of it as a utility unto itself is kind of a prerequisite to using it as a scripting language. My understanding is also that it was never intended to be used for programming, so some of the inconsistencies I see in the shell are, I’m sure, a result of the incremental growth of the shell over the years to support constructs that support more advanced use as a scripting language. I’ll post more of these things in the future and maybe you can help me make sense of them. :-D

  • http://www.hollenback.net Phil Hollenback

    If you want to get even crazier then try using bash arrays to emulate perl split:

    http://advogato.org/person/philiph/diary/0.html

    another argument against using bash as a ‘real’ programming language I guess.

  • exceed

    I know this is a little late, but..

    @Steve, to add an element you can just use myarr=("${myarr[@]}" newelement) ..there is a space between "${myarr[@]}" and newelement.