You have probably observed randomly occurring events, such as people arriving at the end of a waiting line. To study this behavior scientifically, we need a mathematical model for the arrival events. The Poisson distribution is one such model: it gives the probability, P(X=k), that a specific number of events (k) occurs in an observed time interval. The model has one parameter, lambda, which is the average number of events occurring during each time interval. During any specific time interval, the number of events (k) actually occurring may be more or less than the average. The Poisson distribution gives the probability as a function of lambda and k. We are not going to get into the details of the calculation in this project, but the accompanying figure shows the shape of the function.
For example, if lambda = 1:
- For about 37% of the time intervals, the value of k will be 0.
- For about 37% of the time intervals, the value of k will be 1.
- For about 18% of the time intervals, the value of k will be 2.
- For about 6% of the time intervals, the value of k will be 3.
- For about 1.5% of the time intervals, the value of k will be 4.
- For values of k of 5 or larger, the probability is close to zero: improbable but not impossible.
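Although the project does not require you to derive the formula, it can help to see where percentages like these come from. The standard Poisson probability is P(X=k) = lambda^k * e^(-lambda) / k!. The sketch below evaluates it directly. The class name PoissonPmfSketch and the method pmf are hypothetical names chosen for illustration; this is not the Poisson class supplied with the project.

```java
/** Illustrative only: not the Poisson class provided with the project. */
final class PoissonPmfSketch {

    /** Returns P(X = k) = lambda^k * e^(-lambda) / k! for k >= 0. */
    static double pmf(double lambda, int k) {
        double p = Math.exp(-lambda);      // P(X = 0)
        for (int i = 1; i <= k; i++) {
            p *= lambda / i;               // P(X = i) = P(X = i - 1) * lambda / i
        }
        return p;
    }

    public static void main(String[] args) {
        double lambda = 1.0;
        for (int k = 0; k <= 5; k++) {
            System.out.printf("P(X=%d) = %.4f%n", k, pmf(lambda, k));
        }
    }
}
```

With lambda = 1, the printed probabilities for k = 0 through 5 are 0.3679, 0.3679, 0.1839, 0.0613, 0.0153, and 0.0031. Building each term from the previous one avoids computing large factorials and powers separately.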
We are going to generate a histogram showing this frequency based on a user-specified average rate and a user-specified number of time intervals. We will obtain a randomly generated value of k for each observed time interval, count the number of occurrences of each value, and display the data in the form of two bar graphs, one horizontal and one vertical. This project will require you to write several loops that process the integer elements of an array.
See the UML diagram for this project.
You will use the code provided for the Poisson class. You will not be modifying this code. You should read it and understand how it works. It contains the calculation formula for the probability, P(X=k). You do not need to understand why that formula is correct.
In the EventFrequency.java file, you need to add code to the main method to do the following steps in the following order (where indicated by the comments in the code itself):
- Declare an integer array named “counts” with a size equal to 3 * lambda + 1. Each integer element in this array will contain the count of occurrences of that number of events in the observed time intervals. For example, if 4 was the number of events in 26 of the observed time intervals, the array element “counts[4]” will contain the integer value 26. The lowest element in the array that you will use is “counts[0]” and the highest element in the array that you will use is “counts[counts.length - 1]”.
- Execute a loop to initialize the value of all elements in the array from 0 through counts.length-1 to zero. If we are going to start counting the number of times a particular event occurs, we want to start all of our counts at zero.
- Using the user entered number of time intervals to observe (times), execute a loop. Each time through this loop, your code gets a value of k using the myDist getValue() method and increments the value of the kth element in the counts array.
- Now calculate the frequency of occurrence of zero events and of lambda events. The array element counts[0] contains a count of the number of time intervals in which zero events occurred. Divide that number by the number of time intervals (times) to get the frequency. Do the same for counts[lambda], the element that contains the count of the number of time intervals in which lambda events occurred. Be sure to cast your data from integer to float or double to perform this calculation, since the frequency values will all be between 0 and 1. (If you do the division in integer arithmetic, any frequency less than 1 will come out as 0 due to the truncation of the fractional part.)
- It’s nice to see values for these two numbers of events, but it would also be nice to display the data visibly to allow a user to observe the pattern of the frequencies for the number of events.
- Finally, your EventFrequency code instantiates an object of the Histogram class and passes the array of counts, the limits of the indices to draw, and the maximum length of the bars desired as parameters to the Histogram constructor method. For the maximum length of the bars, calculate the value as 100 times counts[lambda] divided by times. Be careful to avoid integer truncation while doing the division, but remember that you will need to cast the result to an int afterwards. Then, it calls the Histogram object’s two draw methods.
- You must write the rest of the code in the Histogram constructor and the two Histogram class draw methods. Note that this class has nothing to do with arrivals specifically. The draw methods can draw histograms for any kind of data values passed to the class. Hence, we use “neutral” names for all variables in the Histogram methods, not names like “events” that imply a connection to the arrival process we have modeled. The Histogram code needs to do the following steps in the following order (where indicated by the comments in the code itself): In the Histogram constructor method, we need to initialize the instance variables from the supplied parameters. This code has been provided for you. You should study it to understand what it is doing. You need to write the rest of the code for the constructor in steps 6 and 7 (below).
- Steps 6 and 7: We may have very large values in the values array supplied as a parameter. Since we want to limit the size of the histogram bar graph to maxLength, we need to scale the data in the values array when we copy it into the instance copy of the array. This takes two loops. The first loop finds the largest value in the values array. (Declare, initialize, and use a variable named something like “maxValue”.) The second loop multiplies each value in the values array by the maximum length we want for bars (maxLength) and divides by the largest value found (“maxValue”). Remember the limitations of multiplication and division for integer variables. You may need to cast the integer values to double for the calculation and cast the result back to integer to update each element of the instance array with good “resolution”. You can also get good resolution by performing the integer multiplications and divisions in the correct order. See if you can do that, as it will have better run-time performance.
- In the drawHor method, your code must draw a horizontal bar graph of the data in the values array. (See Sample Output) This will require two nested loops. The outer loop scans through each element of the values array, and the inner loop counts from 1 through the value in the current element, printing one asterisk each time. Print the values array integer at the end of each bar of asterisks.
- In the drawVer method, your code must draw a vertical bar graph of the data in the values array. (See Sample Output) This is a little trickier than the horizontal bar graph. You still need two nested loops. The outer loop will count down from maxLength to 1, printing one line for each count. The inner loop will print a piece of that line for each index into the values array. If the value at that index is greater than or equal to the current count of the outer loop, print an asterisk with a space on either side of it. Otherwise, print three spaces to maintain the column alignment without showing the “bar” in this column.
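
The scaling and drawing steps described for the Histogram class can be sketched as follows. This is a minimal illustration under stated assumptions, not the provided class: the names HistogramSketch, horizontal, and vertical are hypothetical, the index limits passed to the real constructor are omitted, and the exact spacing of the real sample output may differ. The methods return strings rather than printing directly so that the result is easy to inspect.

```java
/** Illustrative only: class, field, and method names here are assumptions. */
final class HistogramSketch {
    private final int[] values;    // scaled copy of the caller's data
    private final int maxLength;   // longest bar, in asterisks

    HistogramSketch(int[] data, int maxLength) {
        this.maxLength = maxLength;
        this.values = new int[data.length];

        // First loop: find the largest value in the supplied data.
        int maxValue = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] > maxValue) {
                maxValue = data[i];
            }
        }

        // Second loop: scale each value so the largest becomes maxLength.
        // Multiplying before dividing keeps the resolution that
        // data[i] / maxValue * maxLength would truncate to zero.
        // The long cast guards against overflow in the product.
        for (int i = 0; i < data.length; i++) {
            values[i] = (maxValue == 0)
                    ? 0
                    : (int) ((long) data[i] * maxLength / maxValue);
        }
    }

    /** One row per element: a bar of asterisks, then the scaled value. */
    String horizontal() {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            for (int j = 1; j <= values[i]; j++) {
                out.append('*');
            }
            out.append(' ').append(values[i]).append('\n');
        }
        return out.toString();
    }

    /** One row per bar height, counting down from maxLength to 1. */
    String vertical() {
        StringBuilder out = new StringBuilder();
        for (int level = maxLength; level >= 1; level--) {
            for (int i = 0; i < values.length; i++) {
                // Three characters per column keeps the bars aligned.
                out.append(values[i] >= level ? " * " : "   ");
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        HistogramSketch h = new HistogramSketch(new int[] {10, 20, 5}, 4);
        System.out.print(h.horizontal());
        System.out.print(h.vertical());
    }
}
```

In the example in main, the data {10, 20, 5} with maxLength 4 scales to {2, 4, 1}. Dividing first (10 / 20 * 4) would give 0 for the first element, which is the loss of resolution the assignment warns about.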
See the sample output of the project.
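The counting and frequency steps listed for EventFrequency can be sketched in the same spirit. Several things here are assumptions: the class name EventFrequencySketch is hypothetical, the sample method is a stand-in for the provided myDist getValue() method (it uses Knuth's multiplication method, which may not be what the provided Poisson class does), and the bounds check before incrementing is a precaution in case a generated k exceeds 3 * lambda.

```java
import java.util.Random;

/** Illustrative only: not the EventFrequency class supplied with the project. */
final class EventFrequencySketch {

    /** Stand-in for myDist.getValue(): one Poisson sample by Knuth's method. */
    static int sample(double lambda, Random rng) {
        double limit = Math.exp(-lambda);
        double product = rng.nextDouble();
        int k = 0;
        while (product > limit) {
            product *= rng.nextDouble();
            k++;
        }
        return k;
    }

    /** Counts how many of the observed intervals produced each value of k. */
    static int[] countEvents(int lambda, int times, Random rng) {
        int[] counts = new int[3 * lambda + 1];

        // Start every count at zero. (Java already zero-fills new int arrays;
        // the loop is written out because the assignment asks for it.)
        for (int i = 0; i < counts.length; i++) {
            counts[i] = 0;
        }

        for (int t = 0; t < times; t++) {
            int k = sample(lambda, rng);
            if (k < counts.length) {   // assumption: ignore rare values past the array
                counts[k]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int lambda = 4;
        int times = 1000;
        int[] counts = countEvents(lambda, times, new Random());

        // Cast before dividing so the fractional part is not truncated.
        double freqZero = (double) counts[0] / times;
        double freqLambda = (double) counts[lambda] / times;

        // 100.0 forces floating-point arithmetic; cast to int at the end.
        int maxLength = (int) (100.0 * counts[lambda] / times);

        System.out.println("Frequency of 0 events: " + freqZero);
        System.out.println("Frequency of " + lambda + " events: " + freqLambda);
        System.out.println("Maximum bar length: " + maxLength);
    }
}
```

Because the samples are random, the printed values change from run to run, so no fixed output is shown here.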
For this project and all projects in this course, the memo.txt file that you upload to the turn-in system MUST BE A PLAIN TEXT FILE - NOT A WORD (.DOC or .RTF) FILE. On a Windows PC, I recommend that you use Notepad to create this file. On a Mac, you must use a suitable editor to create a plain text file. Write a report that answers the following questions:
- Lambda is the mean of the Poisson distribution, but it is also the variance of the distribution. (The variance is a measure of the “spread” for the distribution.) Run your EventFrequency program with three different values for lambda: 1, 4, and 10. What do you observe in the shape (the mean and the variance) of the three different distributions? Discuss and try to explain your observations.
- Each time you run the program with a large number of observed time intervals (e.g. >=1000), you get values close to the ideal Poisson distribution and the distribution shows very similar values from one run to the next. What happens when you run the program several different times with a small number of observed time intervals (e.g. <= 100 or <= 10)? Explain what is happening.