For this assignment, we will implement a program that cracks messages encrypted using any one of our substitution ciphers. Our goals include:
In order to crack messages encrypted with a substitution cipher, our strategy will be to find the frequencies of letters in the cipher text, and use those frequencies to help us guess the key that gave rise to the text. To do that, we will also need an estimate on the frequencies of letters in standard English text. With the frequencies in hand, we can establish an initial correspondence between letters in the cipher text and the plain text based on frequencies of those letters in the respective texts. This correspondence represents a candidate key. We can evaluate the plausibility of candidate keys by checking the words that appear in the plain text resulting from the key against a dictionary of English words. You'll want to use the dictionary of words that I provide on our Eureka page.
FrequencyMap, that records mappings between characters and their frequencies in a given document. Your class should have a single constructor that consumes a
Stringargument that is the filename of a document for which we want its frequency map. You should design your class with an eye toward extracting the characters in order from most to least frequent. As a
Map, your implementation can be simplified by not reinventing the wheel; the
TreeMapclass may be helpful; that is, by either implementing or extending a
Map. Consequently, your map will naturally consist of
TreeMap(for example) orders its map entries by a natural ordering that makes comparison of Keys efficient. Once we have determined the frequencies, we want to access the letters in an order that is decreasing with their frequency. Augment your
FrequencyMapclass with a method,
public List<Map.Entry<Character,Integer>> getSortedEntries()that returns a list of character-integer map entries ordered by decreasing integer counts (which represent our frequencies). You should implement a
Comparator<Map.Entry<Character,Integer>>class to support this.
SubstitutionCracker, that will use your
FrequencyMapclass and will eventually handle the cracking of an encrypted message. For now, we will generate and evaluate only a single candidate decrypted message. Your
SubstitutionCrackerclass should take a single command-line argument that represents the filename of an encrypted text that we want to crack.
FrequencyMapclass, find the frequencies of Crime and Punishment, by Fyodor Dostoevsky; we will be using this text as our measure of the frequency of letters in plain English. With another frequency map, determine the frequencies of letters in the encrypted message that is provided as the command-line argument; this encrypted text should be generated using your
SubstitutionCipherclass from part 1 of our project.
On your machine where you are doing your homework, create a folder called <your email name> followed by “P1B”. For example, someone with email address “cjones” would create a folder called “cjonesP1B”. Inside that folder, place plain text file(s) containing your answers to any written questions required of you. Also, place whatever Java and documentation files are necessary for your programming project in the same folder. Finally, either tar or zip the folder so that when I extract it, the folder ”<emailname>P1B“ will be created. Finally, submit via Eureka.