
Letterpress.app Word List

Here is a little utility to help solve Letterpress games. It comes largely (almost entirely) from a question on Stack Overflow. It uses /usr/share/dict/words for the word list, and requires an initial stage that converts this into an anagram dictionary, in which each word is filed under its own letters sorted alphabetically, for efficiency. Even so, it still takes a minute to run on my recent iMac.

The anagrammer.py:

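#!/usr/bin/env python3

# Build an anagram dictionary: each output line is a sorted-letter key
# followed by every word spelled from exactly those letters.
lets = set('abcdefghijklmnopqrstuvwxyz\n')
d = {}
with open('/usr/share/dict/words') as f:
  for word in f:
    # len(word) still counts the trailing newline at this point, so this
    # keeps all-lowercase words of 2 to 7 letters.
    if len(set(word) - lets) == 0 and len(word) > 2 and len(word) < 9:
      word = word.strip()
      key = ''.join(sorted(word))
      d.setdefault(key, []).append(word)
anadict = [' '.join([key] + value) for key, value in d.items()]
# Sorted output lets solver.py look keys up by binary search.
anadict.sort()
with open('anadict.txt', 'w') as f:
  f.write('\n'.join(anadict))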
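
To make the file format concrete, here is a minimal sketch of the same grouping step applied to a small in-memory word list rather than /usr/share/dict/words. The function name build_anadict and the sample words are assumptions made for illustration; they are not part of the original script.

```python
def build_anadict(words):
    """Group words by their sorted letters; return sorted 'key word ...' lines."""
    groups = {}
    for word in words:
        key = ''.join(sorted(word))
        groups.setdefault(key, []).append(word)
    return sorted(' '.join([key] + group) for key, group in groups.items())


# Hypothetical sample words, chosen only for illustration.
sample = ['tea', 'eat', 'ate', 'tan', 'ant', 'nap']
lines = build_anadict(sample)
for line in lines:
    print(line)
```

Each printed line starts with the key, so every anagram of a given set of letters sits on a single line of anadict.txt.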

And then the solver.py:

#!/usr/bin/env python3
# http://stackoverflow.com/questions/5485654/how-can-this-python-scrabble-word-finder-be-made-faster

import sys
from bisect import bisect_left
from itertools import combinations
from time import time

def loadvars():
  # anadict.txt is written in sorted order by anagrammer.py, so it can be bisected.
  with open('anadict.txt', 'r') as f:
    anadict = f.read().split('\n')
  return anadict

# letterpress scores: every letter is worth one point
scores = {"a": 1, "c": 1, "b": 1, "e": 1, "d": 1, "g": 1,
         "f": 1, "i": 1, "h": 1, "k": 1, "j": 1, "m": 1,
         "l": 1, "o": 1, "n": 1, "q": 1, "p": 1, "s": 1,
         "r": 1, "u": 1, "t": 1, "w": 1, "v": 1, "y": 1,
         "x": 1, "z": 1}

def score_word(word):
  return sum(scores[c] for c in word)

def findwords(rack, anadict):
  # Combinations of a sorted rack are themselves sorted, so each one is a
  # candidate anagram key.
  rack = ''.join(sorted(rack))
  foundwords = []
  for i in range(2, len(rack)+1):
    for comb in combinations(rack, i):
      ana = ''.join(comb)
      # Binary search for the first dictionary line >= the key.
      j = bisect_left(anadict, ana)
      if j == len(anadict):
        continue
      words = anadict[j].split()
      if words[0] == ana:
        foundwords.extend(words[1:])
  return foundwords

if __name__ == "__main__":
  if len(sys.argv) == 2:
    rack = sys.argv[1].strip()
  else:
    print("""Usage: python solver.py """)
    sys.exit()
  t = time()
  anadict = loadvars()
  print("Dictionary loading time:", (time()-t))
  t = time()
  # A rack with repeated letters yields the same key more than once, so dedupe.
  foundwords = set(findwords(rack, anadict))
  scored = [(score_word(word), word) for word in foundwords]
  scored.sort()
  for score, word in scored:
    print("%d\t%s" % (score, word))
  print("Time elapsed:", (time()-t))
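
The lookup can be tried without any files. The following is a rough sketch of the same bisect-based search against a tiny in-memory dictionary. The name find_words, the sample dictionary, and the optional max_len cap are assumptions of this sketch and do not appear in solver.py.

```python
from bisect import bisect_left
from itertools import combinations


def find_words(rack, anadict, max_len=None):
    """Return the set of words spellable from rack.

    anadict is a sorted list of 'key word ...' lines. max_len optionally caps
    the combination size (an assumption of this sketch, not in solver.py).
    """
    rack = ''.join(sorted(rack))
    limit = len(rack) if max_len is None else min(max_len, len(rack))
    found = set()
    for size in range(2, limit + 1):
        for comb in combinations(rack, size):
            key = ''.join(comb)
            # First line that sorts at or after the key, if there is one.
            j = bisect_left(anadict, key)
            if j < len(anadict):
                entry = anadict[j].split()
                if entry[0] == key:
                    found.update(entry[1:])
    return found


# Hypothetical three-line dictionary in the anadict.txt format.
anadict = ['aet ate eat tea', 'anp nap', 'ant ant tan']
print(sorted(find_words('tean', anadict, max_len=7)))
```

The cap is worth considering because the number of combinations grows quickly with the rack. A full 25-letter Letterpress board has about 33.5 million combinations of two or more letters, but only about 0.7 million of them have between 2 and 7 letters. If the dictionary holds no key longer than 7 letters, the larger combinations can never match.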

 

Converting this to use a screen capture from iOS is an exercise left for the reader; QED (which I always translate as: it all goes to show).

{ 1 } Comments

  1. Mark Cogan | November 8, 2012 at 8:26 pm | Permalink

    You can use the “time” Unix command to get timings instead of wiring them into the script yourself (“time python solver.py ..”).

    A minute seems really slow, and the algorithm seems to be overkill. Here’s what I do:

    – Generate a regular expression from the board letters by sorting them and then joining the sorted list with the optional marker. For example, if the board had NVRIPYKTA, the regular expression would be /^A?I?K?N?P?R?T?V?Y?$/.
    – Read each word from the dictionary file and convert it to “anagram dictionary” form (sort its letters). So KINTAR would be AIKNTR.
    – Check if the anagram form of the word matches the regular expression. If it does, you can make the word from the letters. Add it to your list.

    The quick Perl script I threw together does this in ~3 seconds using /usr/share/dict/words, twice that time using my comprehensive 450K word list. Compiled regular expressions tend to be really efficient.
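
The three steps described above can be sketched in Python roughly as follows. This is not the commenter's Perl script. The function names and the candidate words are assumptions made for illustration, and only the board letters NVRIPYKTA come from the comment.

```python
import re


def board_pattern(board):
    """Compile a pattern such as ^a?i?k?$ from the sorted board letters."""
    letters = sorted(board.lower())
    return re.compile('^' + ''.join(re.escape(c) + '?' for c in letters) + '$')


def playable(words, board):
    """Yield the words whose sorted letters match the board pattern."""
    pattern = board_pattern(board)
    for word in words:
        if pattern.match(''.join(sorted(word.lower()))):
            yield word


# Board letters from the comment; the candidate words are illustrative.
candidates = ['pantry', 'trip', 'kart', 'tree', 'zip']
print(list(playable(candidates, 'NVRIPYKTA')))
```

Because both the board and the word are sorted, a repeated board letter simply appears as a repeated optional character in the pattern, so a word can use a letter only as many times as the board contains it.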

    (For Letterpress, the next step toward a useful list is to go through it and remove any word that is a prefix of a longer word in the list.)
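
One possible way to do that prefix filtering, sketched in Python under the assumption that the found words are held in a plain list (the function name drop_prefixes is hypothetical):

```python
def drop_prefixes(words):
    """Remove every word that is a proper prefix of another word in the list."""
    ordered = sorted(set(words))
    kept = []
    for i, word in enumerate(ordered):
        # In sorted order, if any word extends `word`, the very next one does.
        nxt = ordered[i + 1] if i + 1 < len(ordered) else ''
        if not nxt.startswith(word):
            kept.append(word)
    return kept


# Illustrative word list.
print(drop_prefixes(['pant', 'pantry', 'pan', 'trip', 'kart']))
```

Sorting first means each word only has to be compared with its immediate successor rather than with every other word.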
