wc -w - sed, a stream editor

Next: wc -l, Previous: wc -c, Up: Examples

4.10 Counting Words

This script is almost the same as the previous one, once each of the words on the line is converted to a single ‘a’ (in the previous script each letter was changed to an ‘a’).

It is interesting that real wc programs have optimized loops for ‘wc -c’, so they are much slower at counting words rather than characters. This script's bottleneck, instead, is arithmetic, and hence the word-counting one is faster (it has to manage smaller numbers).

Again, the common parts are not commented to show the importance of commenting sed scripts.

     #!/usr/bin/sed -nf
     
     # Convert words to a's
     s/[ tab][ tab]*/ /g
     s/^/ /
     s/ [^ ][^ ]*/a /g
     s/ //g
     
     # Append them to hold space
     H
     x
     s/\n//
     
     # From here on it is the same as in wc -c.
     /aaaaaaaaaa/! bx;   s/aaaaaaaaaa/b/g
     /bbbbbbbbbb/! bx;   s/bbbbbbbbbb/c/g
     /cccccccccc/! bx;   s/cccccccccc/d/g
     /dddddddddd/! bx;   s/dddddddddd/e/g
     /eeeeeeeeee/! bx;   s/eeeeeeeeee/f/g
     /ffffffffff/! bx;   s/ffffffffff/g/g
     /gggggggggg/! bx;   s/gggggggggg/h/g
     s/hhhhhhhhhh//g
     :x
     $! { h; b; }
     :y
     /a/! s/[b-h]*/&0/
     s/aaaaaaaaa/9/
     s/aaaaaaaa/8/
     s/aaaaaaa/7/
     s/aaaaaa/6/
     s/aaaaa/5/
     s/aaaa/4/
     s/aaa/3/
     s/aa/2/
     s/a/1/
     y/bcdefgh/abcdefg/
     /[a-h]/ by
     p