Exercise: Find the anagrams for all words in a list¶

You are given an English dictionary containing M words (“the dictionary”), and a separate list of N words (“the input”, saved in the file words_to_search.txt)
For each word in the input, find all the anagrams in the dictionary (e.g., for input 'acme' the anagrams are ['acme', 'came', 'mace'])

How to proceed?

1. Load the system dictionary and the input words¶

In [7]:

# Load the system dictionary
with open('/usr/share/dict/words', 'r') as f:
    dict_words = [w.strip() for w in f.readlines()]

In [8]:

# Print the start and end of the dictionary
dict_words[:5] + ['...'] + dict_words[-5:]

Out[8]:

['A',
 'a',
 'aa',
 'aal',
 'aalii',
 '...',
 'zythem',
 'Zythia',
 'zythum',
 'Zyzomys',
 'Zyzzogeton']

In [13]:

# Load the input words
with open('words_to_search.txt', 'r') as f:
    words = [w.strip() for w in f.readlines()]

In [14]:

# Print the start and end of the input list
words[:5] + ['...'] + words[-5:]

Out[14]:

['acer',
 'acers',
 'aces',
 'aches',
 'acme',
 '...',
 'yap',
 'yaw',
 'yea',
 'zendo',
 'zoned']

There are several anagrams, including "groan" and "argon".
What is the Big-O performance oh your algorithm? In terms of M, the number of words in the dictionary, and K, the number of letters in a word

In [15]:

word = 'organ'

In [ ]:

The performance of this implementation is ... .

How does the Big-O performance of your one-word implementation scale to an input list of M words?
Is there a way to pre-process the dictionary words in a data structure that is better suited for this task?

In [ ]: