How I spent way too much effort to win a game show on national television
December 13, 2019
In my time, I've done a number of side projects — some serious, some decidedly less so. My latest side project is quite definitely in the latter category: trying to win a game show on national television.
The show in question is Lingo. The goal is to guess a five- or six-letter word in at most five guesses. The first letter is given, and after every guess players are told which letters their guess shares with the actual word, and which of those are also in the correct position. An example should clear things up.
Let's say it's my turn. The word to find is aspect — but of course, I don't know that yet. All I know is that the word is six characters long, and starts with the letter a. My first guess is 'actors' which, although not the word we're looking for, does share the letters 'c', 't' and 's' with the one we are looking for, albeit in different positions.
Still clueless about the actual word, my next guess is 'assume'. Now, not only is the letter 'e' part of the word we're looking for, but the first 's' is actually in the same place! Given that I now know that the first two letters are an 'a' and an 's', and that the rest of the word contains a 'c', a 't' and an 'e', I can make a reasonably informed guess that the word we're looking for is, in fact, aspect.
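The feedback rule from this example can be sketched in a few lines of Python. This is a minimal illustration, not the show's official rules or the code of my practice application: the function name `score_guess` and the mark names are made up for this sketch, and the way repeated letters are handled (a letter is only marked as present as often as it still occurs unmatched in the target) is an assumption.

```python
from collections import Counter


def score_guess(guess: str, target: str) -> list:
    """Return one mark per letter: 'correct', 'present' or 'absent'.

    Hypothetical helper; the duplicate-letter handling is an assumption.
    """
    if len(guess) != len(target):
        raise ValueError("guess and target must have the same length")

    marks = ["absent"] * len(guess)

    # Target letters that were not guessed in place are still available
    # to be reported as 'present' elsewhere in the guess.
    unmatched = Counter(t for g, t in zip(guess, target) if g != t)

    # First pass: letters in exactly the right position.
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            marks[i] = "correct"

    # Second pass: right letter, wrong position, limited by what is left.
    for i, g in enumerate(guess):
        if marks[i] == "absent" and unmatched[g] > 0:
            marks[i] = "present"
            unmatched[g] -= 1

    return marks


if __name__ == "__main__":
    for guess in ("actors", "assume"):
        print(guess, score_guess(guess, "aspect"))
```

For 'actors' this marks the 'c', 't' and 's' as present, and for 'assume' it marks the first 's' as correct, the second 's' as absent and the 'e' as present, which matches the walkthrough above.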
Two teams take turns guessing a word. Roughly, the team that guesses the most words correctly wins.
One way to win at Lingo is to have a large vocabulary and a keen eye. I teamed up with my brother, and although we both speak Dutch just fine, we are certainly not good enough to win this game — we barely even made it through the qualifiers.
So we started working on a strategy to increase our odds of winning. Of course, we're not the first ones to try to do so: a commonly adopted strategy is to memorise a list of words that cover most of the vowels. Using those words as the first guesses increases the likelihood of letters overlapping with the word you're looking for. However, we felt there was still room for improvement to this strategy.
By focusing on vowels, you are pretty much guaranteed to find at least one of the target word's letters, and maybe one or two more if you're lucky. But if we were to focus not just on the vowels, but on the most common letters for every possible first letter, we might be able to push that number up quite a bit!
So I set to work. The first step was to find a list of all the possible words in the Dutch language. Luckily, Wikipedia's fantastic sister project Wiktionary has carefully documented their efforts to determine which words they should focus on. That led me to the fantastic work by Hermit Dave, who parsed a large number of subtitles to compile Frequency Word Lists for many languages, including Dutch.
Word list in hand, I wrote a script to count the most frequently occurring letters for every starting letter, for both five- and six-letter words. For example, it enumerated all six-letter words that start with an 's', then tallied how many of them contained an 'a', how many of them contained a 'b', etc.
Then, for each of those words, it added up how often that word's letters were used in the entire list, and gave me one of the words whose letters cover the most occurrences.
It then went on to select the word that contained the most frequently occurring of the remaining letters not covered by the selected word.
That was repeated once more to end up with a list of three words that show significant overlap with most six-letter words starting with an 's'.
The entire process was repeated for every letter in the alphabet, over both five- and six-letter words. By memorising all of them, my brother and I could spend our first three guesses covering as many relevant letters as possible, and then have two more guesses remaining to find the correct word.
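The selection steps above amount to a greedy pick, which can be sketched roughly as follows. This is a reconstruction under assumptions rather than the original script: the function name `pick_opening_words`, the tie-breaking (first word in list order wins) and the toy English word list are all made up for illustration, and a letter is counted once per word that contains it.

```python
from collections import Counter


def pick_opening_words(words, first_letter, length, picks=3):
    """Greedily pick words whose letters cover the most candidate words.

    Hypothetical sketch; scoring and tie-breaking are assumptions.
    """
    candidates = [
        w for w in words if len(w) == length and w.startswith(first_letter)
    ]

    # For every letter, count how many candidate words contain it.
    tally = Counter(letter for w in candidates for letter in set(w))

    # The first letter is given and appears in every candidate,
    # so it never helps to tell candidates apart.
    covered = {first_letter}
    chosen = []

    for _ in range(picks):
        remaining = [w for w in candidates if w not in chosen]
        if not remaining:
            break
        # Score each word by the tallies of its letters not yet covered.
        best = max(
            remaining,
            key=lambda w: sum(tally[l] for l in set(w) - covered),
        )
        chosen.append(best)
        covered |= set(best)

    return chosen


if __name__ == "__main__":
    # Toy list, not the real frequency word list.
    toy = ["stones", "slater", "stripe", "sudoku", "shaman", "apple", "sting"]
    print(pick_opening_words(toy, "s", 6))
```

Running the same function for every starting letter and for both word lengths would give the full set of words to memorise.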
(Fun fact: my brother is a comedian, and in one part of his latest show he asks members of the audience to prompt him with letters, to which he'll respond by rattling off the words we memorised. He's still got it!)
Practice makes perfect
While it wasn't feasible for us to significantly expand our vocabulary in preparation for the show, we could certainly improve our ability to find the right words given a set of letters. Thus, I wrote a small practice application (source code) that would serve us words from the word lists, take our guesses, and then rate them.
This was a great help not only in memorising words, but also in detecting and internalising patterns. For example, we learned that often, if only a few letters were covered by our initial three words, the gaps would be repeat occurrences of the letters we did already find.
Additionally, since the words were also selected from the list, it gave a more realistic picture of the words we'd encounter in practice: we'd be more likely to encounter words starting with an 's' than those that start with a 'z'.
So, how did it go?
Pretty well! Although the game still involves quite a bit of luck, we were certainly aided by our far too extensive preparations in achieving a winning streak lasting a week. But of course, given the incredible geekiness of it all, the most important thing is that we had a lot of fun doing it.
(And for the Dutch readers: yes, the word that went viral was actually one of the best three.)
This work by Vincent Tunru is licensed under a Creative Commons Attribution 4.0 International License.