You are asked to write a program that checks whether the HTML tags in a given file are nested correctly. This problem is similar to the example of properly matching { ( [ ] ) }, but now you will deal with HTML tags and read them from a file, so you need to use the stack ADT as we discussed in class.

[ First get booklib.zip and 08-Abstract-Data-Types.zip from the class web page. To do this, click the link "programs from the textbook" under online materials, then click 00-zip-files. Get README.txt and read it, then get the other files as it describes. ]

The implementation in 08-Abstract... has the stack library implementation and a driver program (rpncal.c) for an RPN calculator. During the recitation you will modify it. For this homework, however, you need to write a new application program (say htmlchecker.c) that uses the stack library the same way rpncal.c does. You will use the existing implementation of the stack library, but since your new application pushes and pops HTML tags (strings), you need to make sure stack.h has

typedef void *stackElementT; // or typedef char *stackElementT;
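To illustrate what a stack of strings involves, here is a minimal sketch. It does not use the class stack library: tagStackT, PushTag, PopTag, and MAX_DEPTH are hypothetical names made up for this example, and only the stackElementT typedef comes from the text above. The sketch copies each string before storing it, on the assumption that the buffer a tag is read into will be reused for the next tag.

```c
#include <stdlib.h>
#include <string.h>

typedef char *stackElementT;   /* element type for a stack of tag names */

#define MAX_DEPTH 100          /* assumed capacity for this sketch */

/* Hypothetical stand-in for the class stack library. */
typedef struct {
    stackElementT items[MAX_DEPTH];
    int count;
} tagStackT;

/* Store a private copy of name. Returns 0 on success, -1 if full or out of memory. */
int PushTag(tagStackT *s, const char *name)
{
    char *copy;

    if (s->count >= MAX_DEPTH) return -1;
    copy = malloc(strlen(name) + 1);
    if (copy == NULL) return -1;
    strcpy(copy, name);
    s->items[s->count++] = copy;
    return 0;
}

/* Remove and return the top string, or NULL if the stack is empty.
   The caller owns the returned string and must free() it. */
stackElementT PopTag(tagStackT *s)
{
    if (s->count == 0) return NULL;
    return s->items[--s->count];
}
```

Because every pushed string is allocated with malloc, every popped string has to be freed, and anything still on the stack when the program stops early has to be popped and freed as well.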

Background:

HTML files consist of regular text and tags enclosed in angle brackets, the < and > symbols. Tags are used to identify the structure of a document.

Most tags come in pairs: a beginning tag and a closing tag. For example, the tags <title> and </title> are the beginning and closing tags of a title. There are several such tags, including <b> </b>, <i> </i>, <h1> </h1>, etc. Tags may have several attributes, and such attributes appear in the beginning tag, for example <a href="ff.html"> link </a>.
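Since attributes appear only in the beginning tag, a beginning tag and its closing tag can be compared by name alone. The following is a rough sketch of extracting that name. TagName is a hypothetical helper, not part of the stack library, and lower-casing the name is a design choice made here so that a tag written as B matches one written as b.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy the lower-cased tag name from text such as "<a href=...>" or "</a>"
   into out (at most outSize - 1 characters plus the terminating '\0').
   Returns 1 if the text is a closing tag (a '/' follows the '<'), else 0. */
int TagName(const char *tag, char *out, size_t outSize)
{
    int closing = 0;
    size_t n = 0;

    if (outSize == 0) return 0;                    /* no room to store anything */
    if (*tag == '<') tag++;
    while (isspace((unsigned char)*tag)) tag++;    /* tolerate "< b>" */
    if (*tag == '/') { closing = 1; tag++; }
    while (isspace((unsigned char)*tag)) tag++;
    /* The name ends at whitespace, '/', '>' or the end of the string. */
    while (*tag != '\0' && *tag != '>' && *tag != '/' &&
           !isspace((unsigned char)*tag) && n + 1 < outSize) {
        out[n++] = (char)tolower((unsigned char)*tag);
        tag++;
    }
    out[n] = '\0';
    return closing;
}
```

With this helper, <a href="ff.html"> yields the name a with a return value of 0, and </a> yields the same name with a return value of 1.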

HTML allows two-sided tags to be nested without overlapping, as in the example of proper matching of { ( [ ] ) }.

Some tags are single-sided (e.g., <img src="a.jpg" />, <br />, <hr />) and they appear alone. They do not affect the nesting, but you still need to process them since they might be in the file. You can simply ignore the tags that start with < and end with />. The file may also contain HTML comments such as

<!-- comments contain <p> </p> -->

Your program should ignore everything between <!-- and -->.
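The rules above amount to sorting each tag into one of four kinds. Here is a small sketch of that decision, assuming the tag text has already been isolated from its < through its final >. The names tagKindT and ClassifyTag are hypothetical. Note that a comment may itself contain > characters, so whatever scans the file should look for --> to find the end of a comment rather than the first >.

```c
#include <string.h>

typedef enum { TAG_COMMENT, TAG_SINGLE, TAG_CLOSING, TAG_OPENING } tagKindT;

/* tag is the full tag text, from its '<' through its final '>'. */
tagKindT ClassifyTag(const char *tag)
{
    size_t len = strlen(tag);

    if (strncmp(tag, "<!--", 4) == 0) return TAG_COMMENT;    /* ignore entirely */
    if (len >= 3 && tag[len - 2] == '/' && tag[len - 1] == '>')
        return TAG_SINGLE;                                   /* e.g. <br />, ignore */
    if (len >= 2 && tag[1] == '/') return TAG_CLOSING;       /* e.g. </b>, pop and compare */
    return TAG_OPENING;                                      /* e.g. <b class="c1">, push */
}
```

Only the last two kinds touch the stack; comments and single-sided tags are skipped.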

For example, if an HTML file contains

<title><b> THIS FILE </b> USES CORRECTLY NESTED TAGS
</title>
<h1><i> First <b class="c1"> header </b> text <img src="pic.jpg" /> </i></h1>
<!-- comments contain <p> <b> </p> -->
<p id="par1"> Some other text </p>

Then YES, all the tags are nested correctly. But if an HTML file contains

<title> <b> THIS FILE </title> IS </b> NOT NESTED CORRECTLY.
<p> <b> some text is not nested correctly </b>

Then NO, the tag pair <b> </title> violates the proper nesting: </title> appears while <b> is still open!

Your program must accept the HTML input file name as a command-line argument and process the file for proper nesting of HTML tags. So we will run your program as follows:

main212> driver sample1.html
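One possible way to handle the command-line argument is sketched below. InputFileName and ReadWholeFile are hypothetical helpers, and reading the whole file into memory is only one option; processing the file character by character with getc would work equally well. A main function would call InputFileName(argc, argv), print a usage message if it returns NULL, and otherwise pass the name on.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the file name if exactly one argument was given, else NULL. */
const char *InputFileName(int argc, char *argv[])
{
    return (argc == 2) ? argv[1] : NULL;
}

/* Read the whole file at path into a newly allocated, '\0'-terminated buffer.
   Returns NULL if the file cannot be opened or memory runs out.
   The caller must free() the result. */
char *ReadWholeFile(const char *path)
{
    size_t cap = 4096, len = 0;
    char *buf;
    int ch;
    FILE *fp = fopen(path, "r");

    if (fp == NULL) return NULL;
    buf = malloc(cap);
    while (buf != NULL && (ch = getc(fp)) != EOF) {
        if (len + 1 == cap) {                      /* keep one byte free for '\0' */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) { free(buf); buf = NULL; break; }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    if (buf != NULL) buf[len] = '\0';
    fclose(fp);
    return buf;
}
```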

You can stop the program when you detect the first tag that violates the proper nesting structure and print NO and the tag that violates nesting on the screen. Otherwise, your program will print YES, all the tags are nested correctly.
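The overall check can be sketched as a single pass over the text that pushes each beginning tag and compares each closing tag with the top of the stack. This is only an outline under several assumptions: it works on a string already in memory, it uses a fixed-size local array instead of the class stack library, it assumes no attribute value contains a > character, and the names CheckNesting, MAX_DEPTH, and MAX_NAME are made up for the example.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 256   /* assumed maximum nesting depth */
#define MAX_NAME  32    /* assumed maximum tag-name length, including '\0' */

/* Returns 1 if the tags in html nest correctly, 0 otherwise. On failure,
   bad receives the offending tag name ("/name" for a closing tag). */
int CheckNesting(const char *html, char *bad, size_t badSize)
{
    char stack[MAX_DEPTH][MAX_NAME];
    int depth = 0;
    const char *p = html;

    if (badSize > 0) bad[0] = '\0';
    while ((p = strchr(p, '<')) != NULL) {
        const char *end, *q;
        char name[MAX_NAME];
        size_t n = 0;
        int closing = 0;

        if (strncmp(p, "<!--", 4) == 0) {          /* comment: skip to "-->" */
            end = strstr(p + 4, "-->");
            if (end == NULL) break;                /* unterminated: ignore the rest */
            p = end + 3;
            continue;
        }
        end = strchr(p, '>');
        if (end == NULL) break;                    /* no '>': stop scanning */
        if (end > p + 1 && end[-1] == '/') {       /* single-sided, e.g. <br /> */
            p = end + 1;
            continue;
        }
        q = p + 1;
        if (*q == '/') { closing = 1; q++; }
        while (q < end && !isspace((unsigned char)*q) && n + 1 < MAX_NAME)
            name[n++] = (char)tolower((unsigned char)*q++);
        name[n] = '\0';

        if (!closing) {
            if (depth == MAX_DEPTH) {              /* sketch limit reached */
                snprintf(bad, badSize, "%s", name);
                return 0;
            }
            strcpy(stack[depth++], name);          /* push beginning tag */
        } else if (depth == 0 || strcmp(stack[depth - 1], name) != 0) {
            snprintf(bad, badSize, "/%s", name);   /* first violation: stop */
            return 0;
        } else {
            depth--;                               /* matched: pop */
        }
        p = end + 1;
    }
    if (depth > 0) {                               /* a beginning tag was never closed */
        snprintf(bad, badSize, "%s", stack[depth - 1]);
        return 0;
    }
    return 1;
}
```

A caller would print YES when the function returns 1, and NO together with the contents of bad otherwise. Treating a beginning tag that is still open at the end of the file as a violation is a choice made in this sketch; the assignment text does not spell out that case.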

As always, make sure you release (free) any dynamically allocated memory in your programs. Before submitting your program, run it with valgrind to check for memory leaks. Also, if you need to debug your program, compile it with the -g option and then run it with gdb and/or ddd.
