A text concordance is an alphabetical listing of all the distinct words in a piece of text, and in this project we consider the problem of constructing one concordance for a document stored in a file. Because the words in a concordance are arranged in alphabetical order, a concordance is an ordered list(sorted list). To construct such a concordance, we begin with an empty list. As each word is read, we will check the list whether it is in one of list or not. If the word is a new word, it is inserted into the list in the appropriate place. If the word you read appears in the list already, the frequency of the word will be increased by one. Obviously, it may be necessary to insert words at any point in this list, and thus a sorted list using double-linked implementation is appropriate. Each node consists of four fields, one for word, one for its frequency, and two for prev/next node link(reference) fields.
Each time a word is read from a document, this Sorted list as a doubly linked list must be searched sequentially, always beginning with the first node. The doubly-linked list would be in the alphabetically ascending order. (The data type of the word field would be a string. If the word starts with non-alphabet(number or special symbol) do not insert it in the list. ALGORITHM:
(* Input : A Text file (concordinput.txt)
Function : Construct a text concordance from a document stored in a file. The concordance is stored in a data structure composed of a double linked list.
Output : A list of distinct words with their frequencies on the screen and into the output file in an ascending alphabetic order. *)
Output (to an output file and on the screen simultaneously)
Write a program to evaluate empirically the following strategies for removing nodes with two children:
Input: use a file, BSTinput.txt
+ 3 means inserting 3 into the current BST.
- 5 means removing 5 from the current BST if 5 is in the BST.