Inspired by Peter Norvig’s essay about writing a spelling corrector, I was recently working on a puzzle and thought that it would make a good essay: I am attempting to write a program that can win Hangman.
The idea behind Hangman is simple: guess letters that are in the hidden word, and if you get all the letters then you win. Every time you get another letter wrong, draw another body part on the hanged man. Once he is fully drawn, you lose. (This man is represented in my program by a counter that goes up to some number. I call this the number of guesses you’re allowed to make, although strictly speaking it’s the number of incorrect guesses.)
Without trying to predict what word your opponent might choose, which would require some serious artificial intelligence programming, the optimal strategy is to select whichever letter is the most likely to be in a word like the one you’re looking at. The simplest way for a computer to do this is to have a list of letters by letter frequency, and then simply guess each letter in turn. The code looks like this:
frequency = "esiarntolcdugpmhbyfkvwxqjz" # This is letter frequency in dictionary words, and doesn't take into account that some words are more common than others.
letter = frequency[1:]
frequency = frequency[:1]
if hangman.guess(letter) == win:
The problem with this method, as you may have guessed, is that it is extremely unreliable. When you’re allowed six guesses, it only gets the word 11% of the time. When you’re allowed four guesses, that figure drops to 3%.
A better method is to consider that letter frequency is relative. Depending on what words you’ve eliminated already, the most probable letter is going to be different from just the next on the list of letter frequency. For example, if you have a word such as “___in_”, it is very probable that the last letter is a ‘g’ — even though the letter g is not very common in general. This simple change bumps up the 11% figure to a surprising 92%.
Next I tried something that is rather easy for humans. It is clear enough to us that all words have at least one vowel, so the program should recognize this too. I modified the program so that it would keep guessing vowels until it found at least one. Perhaps surprisingly, this only improved the program by 1% — barely a statistically significant margin, given how many trials I did. Perhaps the improvement would be more noticeable if the win ratio wasn’t already so high.
The ‘executioner’, as we may call the program that decides upon the word, was as yet not doing anything clever. It was time to change that. I wrote a simple method designed to evaluate how difficult a given word is: take the average of the rarity of each letter in the word. Then I had the executioner look through some number of words and pick whichever one is decided to be the hardest. When it looked through 50 words, the win ratio dropped to 72%; when it looked through 500 words, it went all the way down to 49%.
Given the current program, it seems like the optimal strategy for the executioner is simply to find the hardest-to-guess word and pick it every time. This might fool a computer, but it wouldn’t fool a person. The next step is to add some pattern recognition to the program.
This is the basic idea of how I was able to write a pretty powerful Hangman program for both sides. I was able to collect some data regarding the success of the program. It’s not a whole lot, but it gives you a rough idea of how successful the program is. Players were allowed six guesses: head, body, arm, arm, leg, leg.
human guesser: 2 wins against computer, 2 wins against human; 6 losses against computer, 10 losses against human
computer guesser: 7 wins against human; 2 losses against human
It is worth noting that humans seem to be much better at coming up with words than at guessing them, whereas the computer is about equally good either way when using the more sophisticated algorithms. But on both counts, the computer performed much better than the people did.
Unfortunately, one thing that humans are bad at and computers are not is pattern recognition. If many more trials had been done, the human players would have started guessing rarer letters. The computer assumes that rarer letters such as q and z are harder to guess, and so finds words that contain them (words like “buzz” and “minx”). The humans were already beginning to notice the abundance of rare letters.
Currently, the biggest flaw in the computer is that if you are the executioner and you give it the same word over and over, as long as the word is hard enough, the computer’s guesser will lose every single time. The necessary improvement there is clear enough, but actually implementing it could be tricky. I’ll leave that as an exercise to the reader . . . for now.