Problem

Loosely speaking, a concordance is a list of all the words in a piece of text. A concordance has many uses, from helping determine the author of a book or play to creating a word cloud (see: https://en.wikipedia.org/wiki/Tag_cloud).

Write a program that reads a file of text and produces a list of all the unique words in the text with their frequency of occurrence, in order from most frequent to least frequent. The output will be to an ASCIIDisplayer. For example, the concordance of the file AlicesAdventuresInWonderland.txt (the Project Gutenberg version) looks like: see image.

Solution

Your program should read the text file and create a sequentially-linked structure of the unique words in the text with their frequency of occurrence. It should then move the words from this list to a new sequentially-linked structure in which the words are in descending order by frequency. Finally, it should write the words in the sorted list to an ASCIIDisplayer, with the frequency followed by the word (as above).
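
The three steps above can be sketched as follows. This is only an illustration under stated assumptions, not the required solution: a String array stands in for WordReader and a StringBuilder stands in for ASCIIDisplayer, since neither library class is reproduced here. The names ConcordanceSketch, Node, buildUnique, sortByFrequency and render are hypothetical, and the node holds the word and count directly instead of a separate Word object to keep the sketch short.

```java
// Sketch only: plain Java stand-ins replace WordReader and ASCIIDisplayer.
class ConcordanceSketch {

    // Hypothetical node holding the word and its count directly.
    static class Node {
        String word;
        int count;
        Node next;

        Node(String word, Node next) {
            this.word = word;
            this.count = 1;
            this.next = next;
        }
    }

    // Step 1: build an unordered list of unique words (case-insensitive).
    static Node buildUnique(String[] words) {
        Node head = null;
        for (String w : words) {
            Node p = head;
            while (p != null && !p.word.equalsIgnoreCase(w)) {
                p = p.next;
            }
            if (p != null) {
                p.count++;                 // word already present
            } else {
                head = new Node(w, head);  // new word: add at the front
            }
        }
        return head;
    }

    // Step 2: unlink each node from the unique list and insert it into a
    // new list that is kept in descending order of count.
    static Node sortByFrequency(Node unique) {
        Node sorted = null;
        while (unique != null) {
            Node moving = unique;
            unique = unique.next;          // remove from the old list
            Node prev = null;
            Node cur = sorted;
            while (cur != null && cur.count >= moving.count) {
                prev = cur;
                cur = cur.next;
            }
            moving.next = cur;
            if (prev == null) {
                sorted = moving;
            } else {
                prev.next = moving;
            }
        }
        return sorted;
    }

    // Step 3: traverse the sorted list, writing frequency then word.
    static String render(Node sorted) {
        StringBuilder out = new StringBuilder();
        for (Node p = sorted; p != null; p = p.next) {
            out.append(p.count).append('\t').append(p.word).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] words = {"The", "cat", "saw", "the", "other", "cat", "the"};
        System.out.print(render(sortByFrequency(buildUnique(words))));
    }
}
```

For the sample input in main, the sketch prints four tab-separated lines: 3 The, 2 cat, 1 other, 1 saw. Each word keeps the capitalization of its first occurrence, and words with equal frequency stay in the order they were removed from the unique list. Both are choices made for this sketch, not requirements stated in the assignment.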

Hints

1. To make reading the words easier, a class called WordReader, part of the package Assign_2, is included as WordReader.java. Include this in your package for the assignment (change the package declaration if you haven't called your package Assign_2). When a new WordReader is created, it opens an ASCIIDataFile. Each time readWord is called, it returns the next word (a sequence of alphabetic characters) as a String. When EOF is reached, the method isEOF returns true. The reader is closed via a call to the method close. This class should not be modified (except, perhaps, changing the package declaration).

2. The words should be represented by a class Word that includes the word itself (i.e. a String) and the frequency of occurrence of the word (an int). The class should have appropriate accessor, updater and mutator methods for the problem.

3. The Node class should be the usual with the "items" being objects of type Word.

4. The unique words list should be a sequentially-linked structure in any order containing only unique words (i.e. no duplicates). Since words at the beginning of a sentence are capitalized but elsewhere are not capitalized, "uniqueness" should not be case sensitive (i.e. The and the should be considered as the same word.) The String method equalsIgnoreCase can be used for this check.

5. The sorted word list should be created by removing the Word objects from the unique list and adding them to the sorted list in descending order by frequency of occurrence.

6. The sorted list should then be traversed to write the frequencies and words to the display.
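
Hints 2 to 4 can be sketched roughly as below. The class and method names (WordListSketch, getText, getFrequency, increment, matches, record, frequencyOf, size) are assumptions made for illustration; the assignment does not prescribe them, and your course's "usual" Node class may differ in detail.

```java
// Sketch only: the names below are assumptions, not a required API.
class WordListSketch {

    // Hint 2: a word together with its frequency of occurrence.
    static class Word {
        private final String text;
        private int frequency;

        Word(String text) {
            this.text = text;
            this.frequency = 1;   // a Word is created on its first occurrence
        }

        String getText() { return text; }          // accessor
        int getFrequency() { return frequency; }   // accessor
        void increment() { frequency++; }          // updater

        // Case-insensitive match, so "The" and "the" count as one word.
        boolean matches(String other) {
            return text.equalsIgnoreCase(other);
        }
    }

    // Hint 3: a singly linked node whose item is a Word.
    static class Node {
        Word item;
        Node next;

        Node(Word item, Node next) {
            this.item = item;
            this.next = next;
        }
    }

    private Node head;   // unique list, in no particular order

    // Hint 4: bump the count if the word is already present, else add it.
    void record(String text) {
        for (Node p = head; p != null; p = p.next) {
            if (p.item.matches(text)) {
                p.item.increment();
                return;
            }
        }
        head = new Node(new Word(text), head);
    }

    // Helper for inspection: returns 0 for a word that was never recorded.
    int frequencyOf(String text) {
        for (Node p = head; p != null; p = p.next) {
            if (p.item.matches(text)) {
                return p.item.getFrequency();
            }
        }
        return 0;
    }

    // Helper for inspection: number of unique words in the list.
    int size() {
        int n = 0;
        for (Node p = head; p != null; p = p.next) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        WordListSketch list = new WordListSketch();
        for (String w : new String[] {"The", "the", "THE", "cat"}) {
            list.record(w);
        }
        System.out.println(list.frequencyOf("the") + " " + list.size());
    }
}
```

Keeping the case-insensitive comparison inside Word means the list code never has to think about capitalization. Adding new words at the front of the list is the cheapest insertion and is acceptable here because hint 4 allows the unique list to be in any order.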
