In the workspace is a file called "text.txt" which contains a piece of text (it's the opening passage of Charles Darwin's On the Origin of Species).

Write a program that reads the text in the file and lists each unique word, along with its frequency (i.e., how many times it occurs).

Example (these numbers might not be correct; they're just a guide to the sort of output you should generate):

when: 3
we: 3
look: 1
to: 9
the: 11
… etc.

Your program should work not only for this file but also for variations. You may assume the following about the text in a file:

First. Words are separated by whitespace. In the sample text given all of the whitespace is just a single space. But sometimes it might be multiple spaces, or tabs, or line breaks, etc. Make sure your program can handle all of these different kinds of whitespace. You will find the string method split very helpful.
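As a small illustration of the first point, the sketch below calls split with no argument on a made-up string (not the sample text) that mixes single spaces, repeated spaces, a tab and line breaks.

```python
# A made-up string mixing single spaces, repeated spaces, a tab and line breaks.
sample = "When we  look\tto the\nindividuals \n\n of the same variety"

# With no argument, split() breaks on any run of whitespace and
# never produces empty strings.
words = sample.split()
print(words)
```

Note that sample.split(" ") behaves differently: it splits on single spaces only, so it leaves empty strings and embedded tabs or line breaks in the result.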

Second. The text might contain punctuation marks (as the sample text does). Make sure that you don't include punctuation marks in words. You could remove them from the text altogether, using the programming task from Week 1 as a guide. It will be good enough if your program can handle punctuation marks:

. ? ! , ; : ( ) [ ] { } "

Don't worry about dashes and single quote marks:

- '

These are tricky, because sometimes they are used as parts of words (e.g., hyphenated words, such as "sub-variety", or contractions, such as "aren't") and sometimes they are not (e.g., as dashes or quote marks). If you feel like a challenge then you could get your program to deal with them correctly, but you're not expected to, and you won't lose marks if you don't.
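One possible way to handle the listed punctuation marks is sketched below. The Week 1 task is not reproduced here, so this is only an assumption about how it might be done; the names PUNCTUATION and strip_punctuation are made up for the example. Each mark is replaced by a space rather than deleted, so that two words joined only by a punctuation mark do not run together.

```python
# The punctuation marks the task asks you to handle (dashes and single
# quote marks are deliberately left alone).
PUNCTUATION = '.?!,;:()[]{}"'

def strip_punctuation(text):
    """Return text with each listed punctuation mark replaced by a space."""
    for mark in PUNCTUATION:
        text = text.replace(mark, " ")
    return text

cleaned = strip_punctuation("Hello, world (hello)!")
print(cleaned.split())
```

The extra spaces this leaves behind do no harm, because split treats any run of whitespace as a single separator.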

Third. Words might occur both with a capital first letter and without a capital first letter (e.g., the sample text contains both "When" and "when"). You should consider these to be the same word. You could make all words lower case, turning "When" into "when". Or you could make them all upper case, turning both "When" and "when" into "WHEN". It's up to you.
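For instance, a minimal sketch of the lower-case option, using a short made-up list of words:

```python
# A made-up list containing the same word with and without a capital letter.
words = ["When", "we", "look", "when"]

# lower() returns a new string; the original strings are not changed.
lowered = [word.lower() for word in words]
print(lowered)
```

Using upper() in place of lower() gives the upper-case option instead.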

Checking your Work

Note that you can add your own files to the workspace. So you could create your own text file, called, for example, "my_text.txt", put whatever text you like in that file, and then check your program by getting it to read that file instead of the sample file. You could add something like the following text, which contains things your program should be able to handle:

Hello, world, hello!
World: hello?

GOODBYE.

A nice thing about this is that you know what answers you should get: "hello" occurs three times, "world" occurs twice, and "goodbye" occurs once. So, if you're converting words to lower case then your output should be something like this:

hello: 3
world: 2
goodbye: 1
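A check along these lines might look like the sketch below. It is one possible approach rather than the expected solution: count_words is a hypothetical helper name, and the test text is held in a string so the example is self-contained, whereas in the task itself the text would be read from the file.

```python
def count_words(text):
    """Return a dictionary mapping each lower-case word to its frequency."""
    # Replace the listed punctuation marks with spaces.
    for mark in '.?!,;:()[]{}"':
        text = text.replace(mark, " ")
    counts = {}
    # split() with no argument copes with spaces, tabs and line breaks.
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

# The same text as in the suggested test file.
test_text = "Hello, world, hello!\nWorld: hello?\n\nGOODBYE."
for word, count in count_words(test_text).items():
    print(f"{word}: {count}")
```

To run the same check against a file, the string could be replaced by the result of calling read() on the opened file.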

Optional Extra

If you're feeling up to it, get the words to appear in alphabetical order. Even better, get them to appear in order of frequency, from the most frequent down to the least frequent. Again, you're not expected to, and you won't lose marks if you don't do either of these things.
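Both orderings can be produced with the built-in sorted function, as in the sketch below. It assumes the frequencies are already stored in a dictionary; the counts shown are the illustrative figures from the example above, not the real ones for the sample text.

```python
# Illustrative counts only (taken from the example output, which may not be correct).
counts = {"the": 11, "to": 9, "when": 3, "we": 3, "look": 1}

# Sorting a dictionary sorts its keys, so this gives the words alphabetically.
alphabetical = sorted(counts)

# Sort the (word, count) pairs by count, largest first.
by_frequency = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

for word, count in by_frequency:
    print(f"{word}: {count}")
```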

Questions

1. What is the difference between a list and a tuple?

Choices:

  • A list is ordered but a tuple is not.
  • A list can contain elements of different types but a tuple cannot.
  • A list is mutable but a tuple is not.
  • A list can contain duplicate elements but a tuple cannot.

2. An ordinary playing card has two attributes: a rank ('A', 2, 3, 4, 5, 6, 7, 8, 9, 10, 'J', 'Q', 'K') and a suit ('Hearts', 'Spades', 'Diamonds', 'Clubs'). When considering all 52 playing cards, which of the following collections would be the most appropriate for representing a single playing card?

Choices:

  • A list, e.g. ['J', 'Spades'].
  • A tuple, e.g. ('J', 'Spades').
  • A set, e.g. {'J', 'Spades'}.
  • A dictionary, e.g. {'J': 'Spades'}.

3. Suppose you're writing a program to work with the grades of students in a class. Each student has a unique id, but different students might get the same grade. Which one of the following collections would best allow you to store and modify this data?

Choices:

  • A list of tuples of the form [(2345, 'CR'), (4567, 'HD'), ...].
  • A tuple of dictionaries of the form ({2345: 'CR'}, {4567: 'HD'}, ...).
  • A tuple of sets of the form ({2345, 4567, ...}, {'CR', 'HD', ...}).
  • A dictionary of the form {2345: 'CR', 4567: 'HD', ...}.

4. Which one of the following expressions is a list literal?

Choices:

  • list(1, 3, 9)
  • [x**0, x**1, x**2]
  • [3**0, 3**1, 3**2]
  • None of them are

5. The following code was written to show the intersection of the empty set with another set, but it generates an error. Why?

s = {'a', 'b', 'c'}
print({}.intersection(s))

Choices:

  • Python does not support empty sets.
  • Sets do not have an intersection method.
  • The intersection method of sets cannot be used with the empty set.
  • {} is not the empty set - it's the empty dictionary.

6. Which one of the following statements is true?

Choices:

  • You can use the len function to find the number of elements in a list, tuple, set, or dictionary.
  • You can use the len function to find the number of elements in a list or tuple, but not in a set or dictionary because they are not ordered.
  • You can use the len function to find the number of elements in a list, tuple or set, but not in a dictionary because their elements are key-value pairs.
  • You can use the len function to find the number of elements in a list, tuple, or dictionary, but not in a set because their elements must be unique.

7. Which one of the following statements could you use to extend a list a by appending the elements of a list b?

Choices:

  • a.append(b)
  • a.extend(b)
  • a + b
  • You could use any of the above.

8. Which one of the following statements could you use to add the letter 'a' as an item to the end of a tuple t?

Choices:

  • t.add('a')
  • t.append('a')
  • t = t + 'a'
  • You can't use any of the above.

9. Suppose that letters is a list of ten letters. Which one of the following pieces of code could you use to change the first two elements to 'a' and 'b'?

Choices:

  • letters[0:1] = ['a', 'b']
  • letters[0:1] = ('a', 'b')
  • letters[0] = 'a'
    letters[1] = 'b'
  • You could use any of the above.

10. Why does the following piece of code print b?

x = ['b', 'b']
print(x.pop())

Choices:

  • The pop method removes the first element and returns the remaining second element.
  • The pop method removes the second element and returns the remaining first element.
  • The pop method removes and returns the first element.
  • The pop method removes and returns the second element.

11. Which one of the following is true?

Choices:

  • The sort method, when called on a list, and the sorted function, when applied to a list, both return the same thing - the sorted list.
  • The sorted function cannot be applied to tuples because tuples are immutable.
  • Sets have a sort method but there is little point in using it because sets are not ordered.
  • None of the above are true.

12. Suppose you have a list called "names" which contains a number of names. Suppose you want to join them into a string, separated by commas. Which one of the following expressions could you use?

Choices:

  • names.join(",")
  • join(names, ",")
  • join(",", names)
  • ",".join(names)

13. Suppose you want to loop through a dictionary d and print each of its values. Which one of the following pieces of code could you use?

Choices:

  • for x in d:
        print(d[x])
  • for x, y in d.items():
        print(y)
  • for x in d.values():
        print(x)
  • You could use any of the above.

14. Which one of the following is true?

Choices:

  • Lists can contain lists but tuples cannot contain lists.
  • Sets can contain lists, tuples, or dictionaries but they cannot contain sets.
  • Dictionaries can have sets as their values but not as their keys.
  • None of the above are true.

15. In the code below, line 2 will produce an error but line 3 will not. Why?

x = ([1], [2], [3])
x[0] = 4 # Error
x[0][0] = 4 # No error

Choices:

  • x[0] does not exist, but x[0][0] does.
  • You can't assign a number to x[0] because the other items in the tuple are lists. But you can assign a number to x[0][0].
  • You can't change which items are in the tuple, so you can't replace the first item by 4. But you can change an item internally, if it is mutable. So you can replace the first element of the first item by 4.
  • None of the above is correct.

16. Suppose that word is a variable that refers to a word. Suppose you want an expression that returns the set of consonants in word. Which one of the following expressions could you use?

Choices:

  • {x for x in word if x not in {'a', 'e', 'i', 'o', 'u'}}
  • {x for x in word if not x in ('a', 'e', 'i', 'o', 'u')}
  • {x for x in word}.difference({'a', 'e', 'i', 'o', 'u'})
  • You could use any of the above.

17. Consider the following code template:

for i in < expression >:
    < statement >

Suppose you want i to loop through the numbers 0, 2, 4, 6, 8 and 10. Which one of the following could you use to replace < expression >?

Choices:

  • (0, 2, 4, 6, 8, 10)
  • range(0, 11, 2)
  • [2*x for x in range(0, 6)]
  • You could use any of the above.

18. Why does the following piece of code generate an error?

word = 'Expediensy'
word[-2] = 'c'

Choices:

  • word is a string and the indexing operator [] cannot be used with strings.
  • When using the indexing operator [] with strings, only positive indices can be used.
  • The indexing operator [] can be used to select string elements but not to modify them.
  • None of the above are true.

19. Suppose the following piece of code executes without error:

with open('myfile', 'r') as file:
    lines = file.readlines()

What will be the value of lines?

Choices:

  • A string containing the text in the file.
  • A list of characters in the file.
  • A list of lines in the file, not including their newline characters.
  • A list of lines in the file, including their newline characters.

20. Suppose you're creating a datetime object from the string "June 24, 1968, at 05:30" using datetime.strptime. Which one of the following format strings should you use?

Choices:

  • '%b %d, %Y, at %H:%M'
  • '%B %d, %Y, at %H:%M'
  • '%B %d, %y, at %H:%M'
  • '%B %d, %y, at %h:%m'