2024-heraklion-data/exercises/anagrams/.ipynb_checkpoints/anagrams-checkpoint.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "245d99ab",
   "metadata": {},
   "source": [
    "# Exercise: Find the anagrams for all words in a list"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d1e46c4a",
   "metadata": {},
   "source": [
    "* You are given an English dictionary containing M words (“the dictionary”), and a separate list of N words (“the input”, saved in the file `words_to_search.txt`)\n",
    "* For each word in the input, find all the anagrams in the dictionary (e.g., for input 'acme' the anagrams are `['acme', 'came', 'mace']`)\n",
    "\n",
    "How to proceed?\n",
    "1. Write an algorithm to find all anagrams for one input word first\n",
    "2. What is the Big-O class of this algorithm when executed the full N-words input?\n",
    "3. Is there a way to pre-process the dictionary to improve the Big-O performance?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9b9cecd",
   "metadata": {},
   "source": [
    "# 1. Load the system dictionary and the input words"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "24070a26",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the system dictionary\n",
    "with open('/usr/share/dict/words', 'r') as f:\n",
    "    dict_words = [w.strip() for w in f.readlines()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "4002fcdd",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['A',\n",
       " 'a',\n",
       " 'aa',\n",
       " 'aal',\n",
       " 'aalii',\n",
       " '...',\n",
       " 'zythem',\n",
       " 'Zythia',\n",
       " 'zythum',\n",
       " 'Zyzomys',\n",
       " 'Zyzzogeton']"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Print the start and end of the dictionary\n",
    "dict_words[:5] + ['...'] + dict_words[-5:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "823537ef",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the input words\n",
    "with open('words_to_search.txt', 'r') as f:\n",
    "    words = [w.strip() for w in f.readlines()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "4ccec6a3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['acer',\n",
       " 'acers',\n",
       " 'aces',\n",
       " 'aches',\n",
       " 'acme',\n",
       " '...',\n",
       " 'yap',\n",
       " 'yaw',\n",
       " 'yea',\n",
       " 'zendo',\n",
       " 'zoned']"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Print the start and end of the input list\n",
    "words[:5] + ['...'] + words[-5:]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14d91685",
   "metadata": {},
   "source": [
    "# 2. Look for the anagrams of one input word, e.g. \"organ\"\n",
    "\n",
    "* There are several anagrams, including \"groan\" and \"argon\".\n",
    "\n",
    "* What is the Big-O performance oh your algorithm? In terms of M, the number of words in the dictionary, and K, the number of letters in a word"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "badf44c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "word = 'organ'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef938d95",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "7cd1196c",
   "metadata": {},
   "source": [
    "The performance of this implementation is ... ."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "115c3219",
   "metadata": {},
   "source": [
    "# 3. Look for the anagrams of the words in the input list\n",
    "\n",
    "* How does the Big-O performance of your one-word implementation scale to an input list of M words?\n",
    "* Is there a way to pre-process the dictionary words in a data structure that is better suited for this task?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "03ce3e28",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2fc5ec4",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}