Exercise: Find the anagrams for all words in a list
- You are given an English dictionary containing M words (“the dictionary”), and a separate list of N words (“the input”, saved in the file words_to_search.txt)
- For each word in the input, find all the anagrams in the dictionary (e.g., for input 'acme' the anagrams are ['acme', 'came', 'mace'])
How to proceed?
- Write an algorithm to find all anagrams for one input word first
- What is the Big-O class of this algorithm when executed on the full N-word input?
- Is there a way to pre-process the dictionary to improve the Big-O performance?
1. Load the system dictionary and the input words
In [7]:
# Load the system dictionary
with open('/usr/share/dict/words', 'r') as f:
dict_words = [w.strip() for w in f.readlines()]
In [8]:
# Print the start and end of the dictionary
dict_words[:5] + ['...'] + dict_words[-5:]
Out[8]:
In [13]:
# Load the input words
with open('words_to_search.txt', 'r') as f:
words = [w.strip() for w in f.readlines()]
In [14]:
# Print the start and end of the input list
words[:5] + ['...'] + words[-5:]
Out[14]:
2. Look for the anagrams of one input word, e.g. "organ"
There are several anagrams, including "groan" and "argon".
What is the Big-O performance of your algorithm, in terms of M, the number of words in the dictionary, and K, the number of letters in a word?
In [15]:
word = 'organ'
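One possible implementation is sketched in the next cell: a plain linear scan over the dictionary, comparing each word's sorted letters against those of the input word. (This is just one way to solve the exercise; the helper name find_anagrams is our own, not given by the problem statement.)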
In [ ]:
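# A minimal sketch (one possible solution, not the only one):
# two words are anagrams exactly when their sorted letters match,
# so scan the whole dictionary and compare sorted-letter signatures.
def find_anagrams(word, dict_words):
    """Return all words in dict_words that are anagrams of word."""
    signature = sorted(word)
    return [w for w in dict_words if sorted(w) == signature]

find_anagrams(word, dict_words)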
The performance of this implementation is O(M · K log K) per input word: the linear scan sorts each of the M dictionary words, and sorting K letters costs O(K log K).
3. Look for the anagrams of the words in the input list
- How does the Big-O performance of your one-word implementation scale to an input list of N words?
- Is there a way to pre-process the dictionary words into a data structure that is better suited for this task?
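One possible approach is sketched in the two cells below: pre-process the dictionary once into a hash table mapping each sorted-letter signature to the list of dictionary words that share it. (The name anagrams_by_signature is our own choice, not part of the exercise.)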
In [ ]:
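# Sketch: build the signature -> words table once, in O(M · K log K).
from collections import defaultdict

anagrams_by_signature = defaultdict(list)
for w in dict_words:
    anagrams_by_signature[''.join(sorted(w))].append(w)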
In [ ]:
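# Sketch: each input word now needs one signature plus one table lookup.
results = {w: anagrams_by_signature.get(''.join(sorted(w)), []) for w in words}
With the table in place, answering all N input words costs O(N · K log K) in total (one O(K log K) signature and an O(1) average-case lookup per word), compared with O(N · M · K log K) for repeating the linear scan.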