it.gen.nz

Writings on technology and society from Wellington, New Zealand

Sunday, October 26, 2008

Hitting the target

You’ve seen the ‘Target’ word puzzle that runs in most daily newspapers. It looks like a 3×3 square of letters with the central letter highlighted. Your mission (should you choose to accept it) is to make as many dictionary words as possible out of the letters in the puzzle, and including the central highlighted letter. There’s always one nine-letter word.

I quite enjoy looking at the puzzle and trying to get the long word, but I lack the patience to list out all the others. A couple of years ago I decided to try to automate doing the puzzle – yes, I know it’s cheating – and here are the results. Read on for some geeky Python stuff.

There are essentially two ways to do this problem: start with the letters given in the puzzle and try to construct words from them (which is what us human types do); or start with a long list of words and see how many of them you can make from the target letters. That involves a lot more processing but less actual thought, so it’s ideal for computers.

There a couple of disadvantages of processing the whole dictionary. First of all, you need a word list. Happily, most computers have one of these. Secondly, it could take quite a lot of computing power to go right through the dictionary. I would have been thrown out of programming school in the old mainframe days for even thinking such a thing. And the program I ended up writing did prove a bit of a hog – it ran for nearly a minute on my old Powerbook G4, but its takes just a couple of seconds on my brand new MacBook Pro. There’s a philosophical point in there, but I won’t labour it.

So, to the program. I used Python for this, because it’s a nice clean way of burning CPU cycles, I mean expressing yourself. And because I don’t know C :-).

And the program is surprisingly concise. I’ve put the whole listing of it online, but here I’ll step through it to show how it works.

The core concept I used in this program is a signature. A signature of a word or a string of letters is a list of 26 numbers, to count the number of times each letter a-z occurs in the word. Using signatures you can tell whether you can make word “A” out of word “B”, by checking whether each of the 26 numbers in A’s signature is less than or equal to the corresponding number in B’s signature. The strategy for the program is to check whether each candidate word from the dictionary could be made from the target letters.

Here’s the part of the program that calculates a signature:

def sig(w):
	l = list(w)
	lettercount = range(26)
	for i in range(26):
		lettercount[i]=l.count(alpha[i])
	return lettercount

This works by converting the string w to a list, and exploiting the count property of a Python lists to count occurences of each letter in turn. This is the only explicit loop in the entire program :-).

And here’s the bit that compares two signatures:

def sigin(small,large):
	return min(map(lambda x,y:x<=y,small,large))

This uses a couple of other cool Python features. The lambda construct is a way of inventing a function that you only use in that line, so it never gets a name. The lambda here just compares two values and gives you a True if the second is bigger than or equal to the first.

The other useful feature here is the map construct, which applies the function its given (the lambda, in this case) separately to every member of the lists its given as arguments. The effect here is to return a list whose members is True if the corresponding member of the first argument is smaller than or equal to the corresponding member of the second argument. The min at the end means that the result is False if any of the individual comparisons are. The effect is to compare each of the 26 numbers in the signatures of two words and tell you whether you could make one word from the other.

One more subsidiary piece of the program:

def get_wordlist(letter):
	wl = open('/usr/share/dict/web2','r').readlines()
	return [w.strip('\n') for w in wl if (len(w)>4) and (len(w)<11) and 0<w.count(letter) and w.lower()==w]

This just gets the dictionary – well, list of words that we know (that’s what the readlines() bit does) – the rest of it strips newline characters, remove short words, long words, words with capital letters in, and words which don’t contain the central letter of the target string.

There’s another neat Python feature in use here – it’s called a list comprehension. It’s the bit in square brackets, which has the effect of turning one list (called wl) into another by selecting members of wl (the bit after the if) and transforming the members selected (before the for).

Still with me? Well, here’s the main bit of the program:

def check_target(letters):
	if len(letters)<>9:
		return 'Argument should be exactly nine letters'
	wordlist = get_wordlist(letters[4])
	tarsig = sig(letters)
	result = [w for w in wordlist if sigin(sig(w),tarsig)]
	result = [(len(w)==9) and w.upper() or w for w in result]
	return  reduce(lambda x,y:x+" "+y,result)

After sanity checking the argument, this gets a list of candidate words from the dictionary, calculates the signature of the target letters, then generates a list of candidate words which could be made from the letters by comparing signatures. The final two lines just format the result by capitalising any nine-letter words, and using another lambda function, glues the resulting list together as a single string.

The result? A concise program which does some seriously heavy lifting. And it’s arguably even useful!

To test this program, you need a Python interpreter on your computer. Macs, Linuxes and BSDs usually have this already. Just pull up a terminal window and type python – if you get something like this:

wairua:~ colinjackson$ python
Python 2.5.1 (r251:54863, Jul 23 2008, 11:00:16) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 

…you definitely have Python. (Hitting crtl-D will stop the interpreter and get you back to the usual prompt.) If you are using a Windows machine, there are free Python interpreters available. Google is your friend.

Once you have Python working, copy and paste the program into a file called, say, target.py.

The other thing to check is that you have a system dictionary – a long list of words in the English language – available. On my Mac it’s at

/usr/share/dict/web2

If yours is somewhere else you will need to adjust the line in the program which reads the dictionary.

Then use the following command line:

python target.py "carthorse"

And you should see:

ache acher achor ahorse arch arche archer arches ashet ashore asthore cahot cash cathro chao chaos char chare charer charet charr chart charter chase chaser chaste chat cheat chert chest choate chore chorea chort chose corah cosh cosharer cosher coth cothe crash crasher each earshot earth echo erth eschar etch ethos haec haet hare harr hart haste haster hate hater hear hearst heart hearts heat hector hero hers hest hoar hoarse hoast hoer hora horse horsecar horser horst hose host hoster oath ocher ochrea ocht orach orchat ORCHESTRA oshac other rach rache rash rasher ratch ratcher rath rathe rather reach recash rechaos rechar resh retch rhea rheocrat rhetor roach rocher rochet rotch rother sachet scarth scathe scho scrath seah search sech seth share sharer shat shea shear sheat sher shoat shoe shoer shor shore shorer short shorter shot shote socht stacher starch starcher stech stoach stocah tach tache tahr tash tche teach tech thar theca thoraces thore those thro throe tocher toher torah torch torcher tosh tosher trah trash troche

Cool!

posted by colin at 9:19 pm  

4 Comments

  1. Whenever I see something like a target I find myself thinking about how to write a program to solve it for me…

    Somehow I find that more interesting than simply solving the problem directly.

    Comment by Brian — 27 October 2008 @ 8:39 pm

  2. Brian

    I know exactly what you mean. I bore easily; but writing a program to do something tricky definitely isn’t boring.

    Colin

    Comment by colin — 27 October 2008 @ 9:22 pm

  3. Pretty cool – noticed you omitted something though: the “Target”-type puzzles usually require each of the words to include a specified letter (usually the centre of the “target”), whereas your program identifies all possible words without requiring them to contain this letter. I assume you would have to add another argument to the command and then filter the output words appropriately?

    Interesting working case anyway – got me to have a play with Python!

    Comment by Simon — 30 January 2009 @ 8:24 pm

  4. Simon

    The program is definitely intended to do that. It does so by screening out words that do not contain the centre letter when it is selecting words from the dictionary.

    Cheers

    Colin

    Comment by colin — 5 February 2009 @ 7:06 am

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress