Description
Assignment no. 5
Objective To find anagrams using hashing.
Total marks 10
Penalty for violating naming convention(s) 5%
An anagram is a word formed from another by rearranging its letters. For example ‘brainy’ is an anagram of ‘binary’ and vice versa.
Command-line argument:
Your program should receive three command-line arguments: a file containing words, the size of the hash table, and a query file. For example, a typical run of your program can be ./a.out words.txt 25000 query.txt .
Inputs:
An input file containing words: The first command-line argument is the path of a text file containing
English words. The file contains one word per line.
The size of the hash table: The second command-line argument is a positive integer which denotes how many slots should be there in the hash table you create.
Task
Let m be the size of the hash table (given by the second command-line argument). Use the hash function h described below. Given a string, the hash function sums the ASCII values of the characters in the string and takes the remainder obtained by dividing the sum with m. For example h(‘brainy’) =
98+114+97+105+110+121 % m. If m=25000, this value is 645.
Your hash table should resolve collisions by chaining. Every slot in the hash table should contain a pointer to a linked list. Take the words one by one from the input file containing words, apply the hash function, and insert the word in the linked list associated with the slot given by the hash function. The insertion should always be done at the beginning of the linked list.
Take every ‘word’ in the file containing queries: (i) hash it; (ii) list all anagrams of the ‘word’ present in the linked list associated with the slot given by the hash function. Note that the hash function we defined will hash all anagrams of a word to the same slot. So, to obtain the anagrams of a word, it is enough to search in the linked list associated with the slot. Also note that there can be other words, which are not anagrams in the same linked list. Your program should be able to ignore them while collecting all anagrams of the word.
Output:
Submission
● The program you submit should output anagrams.txt when run.
● The main file of your program should be named as <roll no>.<extension>, where roll no. specifies your roll no. and the extension depends on the language you choose (Usage of C is mandatory for this assignment). Ex: 200010001.c.
● Follow some coding style uniformly. Provide proper comments in your code.
● Submit only through moodle. Submit well in advance. Any hiccups in the moodle/internet at the last minute is never acceptable as an excuse for late submission. Submissions through email or any other means will be ignored.
●
Evaluation
● We will do the second evaluation later. This is for those who want to improve their marks obtained in the first evaluation or who do not submit for the first evaluation. There will be a penalty of 20% for those who are submitting for the second evaluation. The details of the second evaluation will be shared later.




Reviews
There are no reviews yet.