More hints for #6 on assignment 1

Start by pseudo-coding it:

Write out in words what you need to do to solve the problem. What is the goal state? (An ordered list of the words showing how many times each occurs, or just the top of that list.) What is the starting point, or problem state? (The original file.) How could you get from one to the other?

You may need to work backwards:

So:

  1. change all non-alphabetic characters to newline characters (using tr)
    ('\n' is how you specify a newline character)
    (Here is one way to change non-alphabetic characters to newlines: tr -sc '[:alpha:]' '\n'
    In speaking to some of you I incorrectly said this would do it: tr -s '[^a-zA-Z]' '\n'
    I forgot that tr uses the -c option to mean "everything except these characters", rather than a leading ^ as some other commands do. Oops.)
  2. sort the resulting list of words
  3. count how many times each word occurs (using uniq)
  4. sort the resulting list to find the one that occurred the most
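
Here is a rough sketch of how the four steps might string together with pipes. The sample text is made up for illustration (it is not the assignment file), and the options on uniq, sort, and head are just one choice among several that would work:

```shell
# Made-up sample text standing in for the assignment's file (an assumption).
sample='the cat and the dog.
The dog saw the cat!'

# 1. tr -sc: turn each run of non-alphabetic characters into one newline
# 2. sort: put identical words next to each other
# 3. uniq -c: collapse duplicates and prefix each word with its count
# 4. sort -rn: order by count, largest first; head keeps the top line
top=$(printf '%s\n' "$sample" | tr -sc '[:alpha:]' '\n' | sort | uniq -c | sort -rn | head -n 1)
printf '%s\n' "$top"
```

Note that "The" and "the" are counted as different words here. Whether that matters depends on what the question asks for.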

There are probably several ways to accomplish each of these steps, and step 1 could be accomplished in one command or by stringing 2 or more commands together.
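
As one illustration of that point, here is a sketch of step 1 done two ways, assuming a made-up one-line sample: once with a single tr command, and once with two tr commands strung together (the first replaces, the second squeezes the repeated newlines). Both should give the same list of words:

```shell
# Made-up sample line (an assumption, not the assignment file).
sample='one, two -- three'

# One command: -c complements the set, -s squeezes repeated newlines.
one_step=$(printf '%s\n' "$sample" | tr -sc '[:alpha:]' '\n')

# Two commands: replace every non-alphabetic character with a newline,
# then squeeze each run of newlines down to a single one.
two_step=$(printf '%s\n' "$sample" | tr -c '[:alpha:]' '\n' | tr -s '\n')

printf '%s\n' "$two_step"
```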