Introduction

This lab has you finish a utility that counts words in a text file. Word frequency analysis is used by linguists and cryptographers in various applications.

We use the following algorithm. We maintain a Map of words and their frequencies in the analyzed text. A Map is a table that connects a key to a value. We call a (key, value) pair in a Map an entry. There can be at most one entry with a given key. In this application, we use a TreeMap object, which implements the Map interface. The keys are String objects representing words, and the values are Counter objects representing the frequency of each word in the text.

We read the text file a line at a time, and then read each line a word at a time. For each word, we look the word up in the Map. If there is an entry for the word, we increment the associated Counter. Otherwise, this is the first time we have encountered the word, so we add an entry with a Counter whose value is 1.
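The algorithm above can be sketched as follows. This is only an illustration, not the code in Lab9.zip: the Counter class shown here is a hypothetical stand-in (the lab's own Counter may have different method names), and the sketch reads its text from a String rather than from a file.

```java
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

class WordFrequencySketch {

    // Hypothetical stand-in for the lab's Counter class; the real one may differ.
    static class Counter {
        private int count;

        Counter(int initial) {
            count = initial;
        }

        void increment() {
            count++;
        }

        int getCount() {
            return count;
        }
    }

    // Counts the words in the text, reading a line at a time and then a word at a time.
    static Map<String, Counter> countWords(String text) {
        Map<String, Counter> frequencies = new TreeMap<>();
        Scanner lines = new Scanner(text);
        while (lines.hasNextLine()) {
            Scanner words = new Scanner(lines.nextLine());
            while (words.hasNext()) {
                String word = words.next();
                Counter counter = frequencies.get(word);
                if (counter != null) {
                    // There is already an entry for this word: increment it.
                    counter.increment();
                } else {
                    // First time we have seen this word: add an entry with value 1.
                    frequencies.put(word, new Counter(1));
                }
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        Map<String, Counter> frequencies = countWords("the cat saw\nthe dog");
        for (Map.Entry<String, Counter> entry : frequencies.entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue().getCount());
        }
    }
}
```

Because a TreeMap keeps its keys in sorted order, the loop in main visits the words alphabetically.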

Purpose

This lab gives you some experience working with file I/O, exceptions, and the Java Collections framework.

Activities

  • Copy the project. A nearly complete set of Java files for the word count application can be found in Lab9.zip. Extract these files to a folder of your choosing on the student drive. As usual, open DrJava, create a new project, and open the files you just unzipped.
  • In DrJava, go to Edit -> Preferences -> Compiler Options and uncheck the box labeled “Show Unchecked Warnings”.
  • Look at the code. You can make the code work by completing one (or maybe two) lines of code in the WordCount class. This code must create a Scanner object for the file whose name is given in a String. NOTE: You invoke this program from the DrJava Interactions Pane with two command line arguments: one for the file name and one for the minimum frequency count.
  • Write the line of code and run the application. We have provided three text files that you can use as sample data. These sample files contain the Declaration of Independence (doi.txt), Homer’s Odyssey (odyssey.txt), and Dickens’s Great Expectations (ge.txt). Use the application to find the words that occur more than 100 times in the Odyssey. Compare with the sample output below.
  • Test exceptional conditions. In particular, what happens when you give an invalid file name (e.g. you use “odssey.txt” instead of “odyssey.txt”)? What happens when you give an invalid number as the minimum frequency count (e.g. you use “aardvark” instead of “1000”)?
  • Modify the application. In particular, modify the code so that an invalid file name is handled the same way as an invalid frequency count.
  • Test your changes.
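As a hedged illustration of the two exceptional conditions above, the sketch below opens a Scanner on a file and reports both kinds of bad argument in the same style. The class name, method name, and messages are all hypothetical and are not taken from the provided WordCount class, which may handle these cases differently. Note that new Scanner(fileName) with a String argument would scan the characters of the name itself, so the sketch wraps the name in a File.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class ArgumentHandlingSketch {

    // Hypothetical helper: returns a short status message for the two arguments.
    static String checkArguments(String fileName, String minCountText) {
        try {
            // Throws NumberFormatException if the text is not a valid integer.
            int minCount = Integer.parseInt(minCountText);
            // Throws FileNotFoundException if the file cannot be opened.
            Scanner in = new Scanner(new File(fileName));
            in.close();
            return "OK: " + fileName + " with minimum count " + minCount;
        } catch (NumberFormatException e) {
            return "Invalid frequency count: " + minCountText;
        } catch (FileNotFoundException e) {
            return "Invalid file name: " + fileName;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("Usage: java ArgumentHandlingSketch <file name> <minimum count>");
            return;
        }
        System.out.println(checkArguments(args[0], args[1]));
    }
}
```

Here both exceptions are caught in the same try statement and turned into a message of the same form, which is one way (among several) to make the two error cases behave alike.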

Sample Output

You should see the following when you run the WordCount class with application parameters “odyssey.txt” and “1000”: See image.

Lab Report

Write a document describing your experiences.

Answer each of the following questions about the application:

  • In the original application, how are NumberFormatException and IOException handled differently?
  • Find all Java Collection objects used in this application. What are the roles of each object?

Describe what you learned doing this lab. Explain what was difficult and what was easy.

Attach a listing of your completed WordCount classes.
