Project Required Files:

  • WordFinder.java
    • Contains the main method and any helper methods that you design. You need to create this file.
  • AnagramDictionary.java
    • Contains all anagram sets from a dictionary. The interface is already provided.
  • Rack.java
    • Used to store the current rack. It is up to you to choose the representation and public methods. A private static allSubsets method is already provided.
  • ScoreTable.java
    • Contains information on how much each letter in Scrabble is worth. You need to create this file.
  • sowpods.txt
    • The Scrabble default dictionary for your program. Note that it is all lower case letters.
  • testFiles folder
    • This folder contains sample data files and the corresponding correct output, so that you can make sure that your program works correctly. Important: Try the same examples in your program and make sure that your output matches the reference outputs CHARACTER BY CHARACTER.

Project Description:

Create a program that when given letters for a Scrabble rack, creates and provides a list of all the legal words that can be formed with those letters along with their score (in decreasing order, if two letters have the same score, they are provided in alphabetical order). A default scrabble dictionary is provided. Note that not all the letters from the rack need to be used to create a word. See information below for more details.

Example of required output for a rack containing "cmal":

Rack? cmal
We can make 11 words from "cmal"
All of the words with their scores (sorted by score):
8: calm
8: clam
7: cam
7: mac
5: lac
5: lam
5: mal
4: am
4: ma
2: al
2: la

Program Required Operation:

The program takes an optional command-line argument for the dictionary file name. If the argument is left off, the program uses the default sowpods.txt dictionary file. If the custom file name or default dictionary does not exist, the program must print an informative error message (which includes the file name) and exit.

For example, to run your program you would enter:

java WordFinder dictionaryFileName

Note that dictionaryFileName is the optional command-line argument.

Then, the initial output is:

Type . to quit.

Then, the program runs in a loop on the console, prompting for "Rack?" and reading and processing the rack that you enter. To exit the program, you simply typing . as stated in the initial output.

Important Points:

1. The program must accept any sequence of non-whitespace characters as a valid rack. Example: A rack given as "abc@" is a valid rack, but only words such as cab will be provided as possible words, no words should contain the @ as that character does not appear in the dictionary.

2. Program must be case-sensitive, so if the dictionary contains only lower-case letters, it will find words such as "cmal", but it won't be able to find any words from CMAL

3. Program is not required to handle blanks like the real Scrabble game.

4. Rack is not required to contain exactly 7 letters like the real Scrabble. The rack can contain any number of characters. For example, the rack can have more than 7 characters, or less.

Program Required Approach (MUST USE APPROACH #2):

Approach #1: One approach would be to read in the dictionary, and then for each rack given, compare each word in the dictionary to that rack to figure out whether that word can be formed from some or all of the letters in that rack, creating a list of the legal words as you go. This is faster to process the dictionary, but slower to process each rack.

Approach #2: You need to preprocess the dictionary, so that you organize the words by the set of letters each one contains (this set is actually a multiset, because letters can appear more than once in a word; the rack itself is also a multiset). Then for each rack you'll generate all the subsets of that multiset of letters, and for each subset add all the words from the dictionary that have exactly the same elements as that subset. This is slower to process the dictionary, but once we do this processing, it's faster to process each rack than the first approach. This approach is explained in more detail in the following two sections. In big-O terms, some of the time spent for processing a rack in the second approach is to get the list of anagrams for each subset; we'll discuss that further in the next section. The rest of the time spent processing a rack is to sort the resulting word list (which we would have to do for either approach). You are required to use this approach.

The AnagramDictionary class:

For the approach we're using, we said that you would organize the dictionary words by the (multi)set of letters a word contains. If two words contain the same exact letters in a different order, they are called anagrams of each other. If a rack (or subset of that rack) has all the same letters (and multiplicity of those letters) as a particular word in the dictionary, that word, plus all of its anagrams from the dictionary should all be added to the list of words reported by our WordFinder.

You are required to create an AnagramDictionary class to handle this. It will have a getAnagramsOf method that finds all anagrams of a particular string efficiently. For, example, suppose we have a variable, dictionary, of type AnagramDictionary, that contains data from the sowpods dictionary. If we did the call

dictionary.getAnagramsOf("rlee")

it would return an ArrayList of words with the contents: ["leer", "lere", "reel"] (not necessarily in that order). Note, "rlee" is not a real word, but the method does not require you to pass it a real word. But the anagrams returned are real English words.

How to do this efficiently? One insight is that if we put two words into some kind of canonical form, then we could figure out if they are anagrams of each other by just comparing the canonical versions of them for equality. This canonical form will be a sorted version of the characters in the word. In the earlier example given the rack contained "cmal". The sorted version of this rack is "aclm". The first two words listed in the output are "calm" and "clam", anagrams of "aclm", or put another way, these first two words are the only dictionary words we can make using all the letters on the rack, and all the other words listed are anagrams of subsets of "cmal".

For full credit your AnagramDictionary is required to find all the anagrams of one String in time linear in the size of the output set (not including the time to sort the letters in the String given).

Finding all the subsets of the rack

The code for finding all subsets of a multiset is already provided (static method allSubsets in Rack.java). The method is static because, like some other recursive methods we have written, it takes all of its data as explicit parameters; also, this means allSubsets works regardless of what representation you choose for your Rack objects (it will not be accessing any Rack instance variables). You will likely have to write a wrapper method that calls allSubsets with the correct starting parameters.

The allSubsets method uses a particular representation for the rack which we'll explain with an example here. Earlier we mentioned that a rack is a multiset of letters (set because we don't care about the order of the letters, and multiset because letters can appear more than once). Suppose our rack is:

a b a d b b

Gathering together the like letters, we could rewrite this as "aabbbd". We could also say that 'a' appears with multiplicity 2, 'b' appears with multiplicity 3, and that 'd' appears with multiplicity 1. allSubsets expects the rack information to be in two parallel arrays: one has the unique letters, and the other has the multiplicity of that letter at the same array index. The array of unique letters is actually a String, so we can do String operations on it. For the example given, we could create this rack representation as follows:

// create variables for the rack "aabbbd"
String unique = "abd";
int[] mult = {2, 3, 1};
// example to show relation between values in unique and mult:
for (int i = 0; i < unique.length(); i++) {
System.out.prinln(unique.charAt(i) + " appears " + multi[i] + " times in the rack");
}

Like other examples of recursion over an array that we've seen, allSubsets will take a third argument, k, which is the starting position of the part of the array that this recursive call will process. So for this code, it's the starting postion from which to find the subsets. So, for example, if we called

allSubsets(unique, mult, 1); // starts at position 1 in unique and mult

it would find all the subsets of the rack "bbbd" (i.e., it wouldn't consider the subsets that included any 'a's in it).

Class design

You are required to have at least the following four classes in your program, with the responsibilities described. You are allowed to add more classes to your design as you see fit. The four, with their overall responsibilities described, are:

1. WordFinder

This contains the main method. This class will have a main that's responsible for processing the command-line argument, and handling any error processing. It will probably also have the main command loop. Most of the other functionality will be delegated to other object(s) created in main and their methods.

2. Rack

This corresponds to the idea of the rack in the problem description. Thus, wherever your program is using a rack, it should be using an object of type Rack. As previously discussed, we have already provided the code for a private static Rack method allSubsets.

3. AnagramDictionary

This will contain the dictionary data organized by anagrams. It is required to have at least the two public methods whose headers are given in the starter file. You are allowed to add other methods to this interface. This class was discussed in more detail in the section about it.

4. ScoreTable

This class has information about Scrabble scores for scrabble letters and words. In scrabble not every letter has the same value. Letters that occur more often in the English language are worth less (e.g., 'e' and 's' are each worth 1 point), and letters that occur less often are worth more (e.g., 'q' and 'z' are worth 10 points each). You may use hard-coded values in its data. Here are all the letter values:

(1 point)-A, E, I, O, U, L, N, S, T, R
(2 points)-D, G
(3 points)-B, C, M, P
(4 points)-F, H, V, W, Y
(5 points)-K
(8 points)- J, X
(10 points)-Q, Z

This class should work for both upper and lower case versions of the letters, e.g., 'a' and 'A' will have the same score. Hint: You can index an array with a char that is a lower case letter by treating it as an int and subtracting 'a' from it (because the internal numeric codes for letters are all sequential). E.g., If your letter is 'd', ('d' - 'a') = 3 and if it's 'e', ('e' - 'a') = 4.

One thing to keep in mind is you want the code that operates on some data to be in the same class that contains that data. One sign that your design doesn't have that feature is if your classes tend to have a lot of get and set methods and not much else. That would indicate that all the code operating on this data is outside of the class itself

Hopefully we've made clear the importance of making all instance variables private. But even if you make your data private there are other ways to expose the implementation of your objects. For example, if you have a class that contains an ArrayList, and also provide an accessor method for this ArrayList, it gives clients the ability to change the contents of that arraylist from outside of the object methods, possibly invalidating the object.

You are welcome to add additional classes as part of your design. These ones would be designed and implemented by you, of course. If a class is just used by one other class, you could put it in the same file as that class, or a separate file. If it is used by multiple classes, it should be in its own file.

Development hints

We recommend creating test drivers for any non-trivial class you implement to make it easier to debug your code. That should be pretty easy here, because the classes are somewhat independent from each other. (WordFinder is an exception since it already is a main program.)

You'll want to test your complete program (and your AnagramDictionary) on a small dictionary file before subjecting it to sowpods.txt. We provided a sample small dictionary and input and corresponding output for some racks in the testFiles directory (more about that in the next paragraph). If you find AnagramDictionary-related bugs, you may want to use an even tinier dictionary for when you are single-stepping, etc.

Once you have all your modules working, you can also check if your program produces the right answers for sowpods.txt with the other test input files and corresponding output in the testFiles directory. Note: The testFiles/README.txt file describes what's what that directory.

Academic Honesty!
It is not our intention to break the school's academic policy. Posted solutions are meant to be used as a reference and should not be submitted as is. We are not held liable for any misuse of the solutions. Please see the frequently asked questions page for further questions and inquiries.
Kindly complete the form. Please provide a valid email address and we will get back to you within 24 hours. Payment is through PayPal, Buy me a Coffee or Cryptocurrency. We are a nonprofit organization however we need funds to keep this organization operating and to be able to complete our research and development projects.