211 lines
4.7 KiB
Plaintext
211 lines
4.7 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "245d99ab",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Exercise: Find the anagrams for all words in a list"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "d1e46c4a",
|
|
"metadata": {},
|
|
"source": [
|
|
"* You are given an English dictionary containing M words (“the dictionary”), and a separate list of N words (“the input”, saved in the file `words_to_search.txt`)\n",
|
|
"* For each word in the input, find all the anagrams in the dictionary (e.g., for input 'acme' the anagrams are `['acme', 'came', 'mace']`)\n",
|
|
"\n",
|
|
"How to proceed?\n",
|
|
"1. Write an algorithm to find all anagrams for one input word first\n",
|
|
"2. What is the Big-O class of this algorithm when executed the full N-words input?\n",
|
|
"3. Is there a way to pre-process the dictionary to improve the Big-O performance?"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "b9b9cecd",
|
|
"metadata": {},
|
|
"source": [
|
|
"# 1. Load the system dictionary and the input words"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"id": "24070a26",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Load the system dictionary\n",
|
|
"with open('/usr/share/dict/words', 'r') as f:\n",
|
|
" dict_words = [w.strip() for w in f.readlines()]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"id": "4002fcdd",
|
|
"metadata": {
|
|
"scrolled": true
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"['A',\n",
|
|
" 'a',\n",
|
|
" 'aa',\n",
|
|
" 'aal',\n",
|
|
" 'aalii',\n",
|
|
" '...',\n",
|
|
" 'zythem',\n",
|
|
" 'Zythia',\n",
|
|
" 'zythum',\n",
|
|
" 'Zyzomys',\n",
|
|
" 'Zyzzogeton']"
|
|
]
|
|
},
|
|
"execution_count": 8,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Print the start and end of the dictionary\n",
|
|
"dict_words[:5] + ['...'] + dict_words[-5:]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"id": "823537ef",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Load the input words\n",
|
|
"with open('words_to_search.txt', 'r') as f:\n",
|
|
" words = [w.strip() for w in f.readlines()]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"id": "4ccec6a3",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"['acer',\n",
|
|
" 'acers',\n",
|
|
" 'aces',\n",
|
|
" 'aches',\n",
|
|
" 'acme',\n",
|
|
" '...',\n",
|
|
" 'yap',\n",
|
|
" 'yaw',\n",
|
|
" 'yea',\n",
|
|
" 'zendo',\n",
|
|
" 'zoned']"
|
|
]
|
|
},
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Print the start and end of the input list\n",
|
|
"words[:5] + ['...'] + words[-5:]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "14d91685",
|
|
"metadata": {},
|
|
"source": [
|
|
"# 2. Look for the anagrams of one input word, e.g. \"organ\"\n",
|
|
"\n",
|
|
"* There are several anagrams, including \"groan\" and \"argon\".\n",
|
|
"\n",
|
|
"* What is the Big-O performance oh your algorithm? In terms of M, the number of words in the dictionary, and K, the number of letters in a word"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"id": "badf44c1",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"word = 'organ'"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "ef938d95",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "7cd1196c",
|
|
"metadata": {},
|
|
"source": [
|
|
"The performance of this implementation is ... ."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "115c3219",
|
|
"metadata": {},
|
|
"source": [
|
|
"# 3. Look for the anagrams of the words in the input list\n",
|
|
"\n",
|
|
"* How does the Big-O performance of your one-word implementation scale to an input list of M words?\n",
|
|
"* Is there a way to pre-process the dictionary words in a data structure that is better suited for this task?"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "03ce3e28",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "a2fc5ec4",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.11.3"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|