In this lab, you're going to open a file and read and process data from it in a loop.

The file contains tuition data from some number of states, where every three lines are data for a single state of the form:

State name
2002 public institution tuition
2012 public institution tuition

You will process each record and produce output in the form of a table like the one below for the six New England states, followed by a few statistics: the largest 2012 tuition, the smallest 2012 tuition, the average 2012 tuition, and the largest increase. You may not assume that the file contains data, but you may assume that it exists and that each record in the file is complete. (In other words, if there is a state name, the two lines after it will hold the 2002 and 2012 tuition values.)

State Name          2002 Tuition        2012 Tuition        Change
Connecticut         11066               19842               8776
Massachusetts       9361                20328               10967
Maine               10292               18631               8339
New Hampshire       12343               23314               10971
Rhode Island        11575               20649               9074
Vermont             13426               22504               9078

Think about how you would solve this problem before reading through the step-by-step instructions and hints below, and solve as much as you can on your own. You have learned everything you need to know to solve this problem in lessons 1-4.

1. OPEN YOUR PYTHON TEMPLATE AND RENAME IT.

To begin, open your Python program template and give it a new name, such as lab4.py or state_tuition.py. Write a docstring at the top that describes what the program does and that includes the date. The basic outline of the program will be to open the file, process the file data and print table rows, and print the summary statistics, so write comments in the program for each of these logical blocks.

2. CREATE SEVERAL SAMPLE TEXT FILES IN A TEXT EDITOR.

We will need to have a text file of sample data to work with. Because our program must work with a text file of any length, we should create several text files and test each version of our program on all of them. Below is the data for the six New England states shown in the table above. We can create a text file called six_states.txt, we can create an empty text file called empty.txt, and we can create a text file with just one state's data, called one_state.txt. When we get to the testing step, we will create a few other variations. To create a text file, in IDLE, choose File -> New File, and when you save the file, choose "Text files (*.txt)" from the Save as type box.

Be sure to save the text files in the same directory as your program.

Copy/paste or type the data for the six states exactly as it is below for the six_states file, do not put anything in the empty file, and put only the three lines of data for Connecticut in the one state file.

Connecticut
11066
19842
Massachusetts
9361
20328
Maine
10292
18631
New Hampshire
12343
23314
Rhode Island
11575
20649
Vermont
13426
22504

3. EDIT YOUR PYTHON PROGRAM TO OPEN THE TEXT FILE AND DETERMINE IF IT HAS ANY DATA.

Now that you have some text files in the same directory as your program, you can open a text file with your program and read it.

There are different ways to break this program up into smaller pieces to develop incrementally. You do not have to develop the program in the order it is developed here, but if you take a different development path, be sure that your program meets all of the requirements and gives correct output for all of the final test cases.

We will begin by obtaining a file name from the user. This will make it easier for us to test our program on many test cases, as we can place the different test cases in different text files and simply type the name in at the prompt to test a specific test case. Once we have obtained the file name from the user, we will open the file. (We will assume the user has typed the name correctly and the file exists.)

We will then read the first line from the file and check to see if it's the end of file. If it isn't, we'll print the header for our table. If it is, we'll print the message "<file name> contains no data." For example: empty.txt contains no data.

Follow these steps:

  • Annotate variables for the file name (a string), the first line of the file (a string which will be either the state name or the end of file marker), and the file reference variable (of type typing.TextIO). You will have to import the typing library. Put the code for the import and the annotations under the correct comments in your program. Be sure to give the variables meaningful names and follow Python naming conventions, such as beginning with a lowercase letter and separating words with underscores. For example, you might call the file reference variable tuition_file.
  • Use an input statement to obtain the file name from the user. Use an appropriate prompt, such as "Please enter the file name with the .txt extension: "
  • Open the file and read the first line.
  • Use an if statement to determine if the first line is the end of file marker (""). If it is, print the message that the file is empty.
  • If the file is not empty (use an else), print the table header. The table header is composed of four strings formatted in columns of 20 characters each. This is what the table header should look like:
State Name          2002 Tuition        2012 Tuition        Change
  • If you can't remember how to use the format statement to do this, a hint is below.
  • Be sure to put comments over each block of code that relate the code below to the part of the problem being solved.
  • At the end of your program, close the tuition file.

Formatting hint: This code prints the first two columns of the table header:

print("{:20s}{:20s}".format("State Name","2002 Tuition"))
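To see what the field width does, here is a small sketch that extends the hint to all four columns and checks the resulting length. The variable names header and padded_name are illustrative only and are not required by the lab.

```python
# Each {:20s} field pads its string with spaces on the right, out to 20 characters.
header: str = "{:20s}{:20s}{:20s}{:20s}".format(
    "State Name", "2002 Tuition", "2012 Tuition", "Change")
print(header)
print(len(header))  # four fields of 20 characters each

# A single field: the "|" marks where the 20-character column ends.
padded_name: str = "{:20s}".format("Maine") + "|"
print(padded_name)
```

Strings are left-aligned within the field by default, which is why the column titles line up at the left edge of each column.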

If you run your program on the empty file and then on the file with six states, your executions should look like this:

>>>
RESTART: D:\ lesson4_lab_sample_solution.py
Please enter the file name with the .txt extension: empty.txt
empty.txt contains no data.
>>>
RESTART: D:\ lesson4_lab_sample_solution.py
Please enter the file name with the .txt extension: six_states.txt
State Name          2002 Tuition        2012 Tuition        Change
>>>
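
The empty-file check in this step relies on readline() returning an empty string only at the end of the file. Here is a minimal sketch of that check. It assumes io.StringIO objects as stand-ins for opened files so that it runs without any files on disk, and the function name report_first_line and the "starts with" message are hypothetical. Your program prints the table header in the else branch instead.

```python
import io
import typing


def report_first_line(file_name: str, tuition_file: typing.TextIO) -> str:
    """Return the empty-file message, or a note about the first line read."""
    first_line: str = tuition_file.readline()
    if first_line == "":  # readline() returns "" only at end of file
        return file_name + " contains no data."
    else:
        # Placeholder for illustration; the lab prints the table header here.
        return file_name + " starts with " + first_line.strip()


# io.StringIO("") behaves like an empty file that has just been opened.
print(report_first_line("empty.txt", io.StringIO("")))
print(report_first_line("one_state.txt",
                        io.StringIO("Connecticut\n11066\n19842\n")))
```

Note that a blank line in a file is read as "\n", not "", so it would not be mistaken for the end of file.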

4. IF THE FILE ISN'T EMPTY, READ RECORDS IN A LOOP UNTIL THE END OF FILE

To produce the table with the state name, the two tuition values, and the change between the tuition values, we will have to read the records of state data one at a time. Think about how you read a multi-line record in a loop. A loop has an initializer, a loop entry condition, and an updater. The loop entry condition is whether we have reached the end of file. The initializer and updater read the first line of the record, and within the loop, we must read the rest of the record. See if you can write the loop before reading the hints below.

Hints:

  • The loop control variable is the state name variable, which holds the first line of the record. Either it's a state name or it's the end of file marker, and that controls whether you enter the loop. You've already read it before the loop begins, because you read it to check whether the file is empty. That is your initializer code.
  • Once you know the file isn't empty and have printed the header, that's where you write a loop to read the rest of the record and keep going to the end of the file.
  • The loop entry condition tests the same thing as the if statement: whether the line just read is the end of file marker. Enter the loop only when it is not.
  • Inside the loop, you can now process the state name by stripping off the white space, for example, state_name = state_name.strip()
  • After you've processed the state name, read the 2002 and 2012 tuition. Convert them to integers. Annotate variables for these values in your variable annotation area.
  • Compute the difference by subtracting the 2002 tuition from the 2012 tuition. (Annotate a variable for the difference value.)
  • Print a line of the table after you have all of the values (name, the two tuitions, and the difference). You'll use the same sort of formatting as for the header, but use d for integer instead of s for string.
  • The last thing your loop needs is the updater. You need to read the next line of the file. It will be either the name of the next state or it will be the end of file marker, which will get checked at the top of the loop.
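
The hints above can be sketched as follows. This is one possible shape for the loop, not the required solution. It assumes an io.StringIO object as a stand-in for the opened file, with only two records, and it collects the rows in a hypothetical table variable so the result is easy to inspect. Your program can simply print each row.

```python
import io
import typing

# Variable annotations (the "annotation area" of the program).
tuition_file: typing.TextIO
state_name: str
tuition_2002: int
tuition_2012: int
change: int
table: str = ""  # hypothetical: collects the formatted rows

# Stand-in for open("six_states.txt"); two records keep the example short.
tuition_file = io.StringIO("Connecticut\n11066\n19842\nMaine\n10292\n18631\n")

state_name = tuition_file.readline()          # initializer: first line of a record
while state_name != "":                       # entry condition: not at end of file
    state_name = state_name.strip()           # remove the trailing newline
    tuition_2002 = int(tuition_file.readline())
    tuition_2012 = int(tuition_file.readline())
    change = tuition_2012 - tuition_2002
    table += "{:20s}{:20d}{:20d}{:20d}\n".format(
        state_name, tuition_2002, tuition_2012, change)
    state_name = tuition_file.readline()      # updater: next name, or "" at end

tuition_file.close()
print(table, end="")
```

With the d format code, integers are right-aligned within their 20-character fields by default, so the numbers sit at the right edge of each column. The int function ignores the newline at the end of each line it converts, so no strip is needed for the tuition values.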

This might be the trickiest part of the program, but there are examples in the book and the videos of reading multi-line records, and it's very good practice for learning loops. If you're having trouble, step through your program in the debugger to see what you're reading into each variable.

Once you have this loop written, test on the multi-record file and the one-record file. Your output should look like this for each test case: see image.

5. ADD CODE TO THE LOOP TO STORE DATA FOR THE SUMMARY STATISTICS

After the table, the program must display the largest 2012 tuition, the smallest 2012 tuition, the average 2012 tuition, and largest increase. Refresh your memory on loop patterns / variable roles to see if you can solve this part of the problem without reading the hints.

Hints:

  • Finding the largest or smallest of something means looking for the best-fit value for a property (in this case, largest and smallest). To do that, you need a variable in a most-wanted holder role for each property you're looking for. Annotate and initialize variables for the largest and smallest 2012 tuition and the largest increase.
  • When initializing a value where you're looking for the largest, initialize to a small value. When initializing a value where you're looking for the smallest, initialize to a large value. This way the first value you encounter in the loop will replace the initial value.
  • You can import the sys library and use sys.maxsize as a large value, and -sys.maxsize - 1 as a small value.
  • Inside the loop, use if statements to compare the most-wanted holders to the variables holding the 2012 tuition and the difference, as appropriate, and replace the values in the most-wanted holders if appropriate.
  • To find the average, you'll need to sum (total, add together) the values you're averaging and you'll need to count the values you're averaging. To sum values being processed in a loop, use a variable in the accumulator role. Annotate a variable for the total and a variable for the count, and initialize both to zero.
  • Inside the loop, accumulate the 2012 tuition into your sum variable and add one to the count.
  • After the loop, calculate the average by dividing the sum by the count. Annotate a variable for the average.
  • After the loop, display the four summary statistics in a readable way. (You can use the output below as a guide.)
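
As a rough sketch of the most-wanted holder and accumulator roles described above, the code below processes three records from an io.StringIO stand-in for the opened file. The variable names and the wording of the printed statistics are assumptions made for illustration, and the table row printing is left out to keep the focus on the statistics.

```python
import io
import sys
import typing

# Stand-in for an opened file holding three complete records.
tuition_file: typing.TextIO = io.StringIO(
    "Connecticut\n11066\n19842\nMaine\n10292\n18631\nNew Hampshire\n12343\n23314\n")

# Most-wanted holders: start "largest" small and "smallest" large.
largest_2012: int = -sys.maxsize - 1
smallest_2012: int = sys.maxsize
largest_increase: int = -sys.maxsize - 1
# Accumulator and counter for the average.
total_2012: int = 0
count: int = 0

state_name: str = tuition_file.readline()
while state_name != "":
    tuition_2002: int = int(tuition_file.readline())
    tuition_2012: int = int(tuition_file.readline())
    change: int = tuition_2012 - tuition_2002
    if tuition_2012 > largest_2012:
        largest_2012 = tuition_2012
    if tuition_2012 < smallest_2012:
        smallest_2012 = tuition_2012
    if change > largest_increase:
        largest_increase = change
    total_2012 += tuition_2012
    count += 1
    state_name = tuition_file.readline()
tuition_file.close()

# Safe here because the stand-in file has data; in the lab, keep this
# inside the branch that runs only when the file is not empty.
average_2012: float = total_2012 / count

print("Largest 2012 tuition:", largest_2012)
print("Smallest 2012 tuition:", smallest_2012)
print("Average 2012 tuition: {:.2f}".format(average_2012))
print("Largest increase:", largest_increase)
```

Dividing by the count is only valid when at least one record was read, which is one reason the empty-file check comes first.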

Before we get to more extensive testing, test your program on the three files that you have already created. Your output should look like this: see image.

6. TESTING

Test your program thoroughly! We have tested the empty file and files with just one record and several records. How else might we test the program?

We might check that the maximum tuition, minimum tuition, and maximum change are found correctly wherever they appear in the file, since off-by-one errors are common. (A programmer might accidentally miss checking the first or last record in the file if they didn't write their loop correctly.) We can move the record for the largest tuition and largest change (New Hampshire's record) to the beginning and end of the text file, and we can move the record for the smallest tuition (Maine's record) to the beginning and end of the text file. Create two more text files accordingly, largest_first_smallest_last.txt and largest_last_smallest_first.txt, and test your program on all five test cases:

  • File with data for six states
  • Empty file
  • File with data for one state
  • File with largest tuition and change first and smallest last
  • File with smallest tuition first and largest tuition and change last
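
The two reordered files can be made by hand in a text editor. If you would rather generate them, the sketch below shows one way, using lists and a helper function that the lab itself does not require. The name write_records and the temporary output directory are assumptions; to use the files with your program, change out_dir to the directory that holds your program.

```python
import os
import tempfile

# The six records from six_states.txt as (name, 2002 tuition, 2012 tuition).
records = [
    ("Connecticut", 11066, 19842),
    ("Massachusetts", 9361, 20328),
    ("Maine", 10292, 18631),
    ("New Hampshire", 12343, 23314),
    ("Rhode Island", 11575, 20649),
    ("Vermont", 13426, 22504),
]


def write_records(path, ordered_records):
    """Write each record as three lines: name, 2002 tuition, 2012 tuition."""
    with open(path, "w") as out_file:
        for name, tuition_2002, tuition_2012 in ordered_records:
            out_file.write("{}\n{}\n{}\n".format(name, tuition_2002, tuition_2012))


largest = records[3]   # New Hampshire: largest 2012 tuition and largest change
smallest = records[2]  # Maine: smallest 2012 tuition
others = [r for r in records if r is not largest and r is not smallest]

out_dir = tempfile.mkdtemp()  # hypothetical location for this sketch
first_path = os.path.join(out_dir, "largest_first_smallest_last.txt")
last_path = os.path.join(out_dir, "largest_last_smallest_first.txt")
write_records(first_path, [largest] + others + [smallest])
write_records(last_path, [smallest] + others + [largest])
print("Wrote", first_path, "and", last_path)
```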

Output from the final two test cases will look like: see image.

Paste the output from running your program at the bottom of the program file as a comment, with each test case described, for example: see image.
