The C programming language, as well as C++ and Objective-C, uses the C preprocessor as the first pass of compilation. The preprocessor does many things, but its main functions are processing include files, expanding macros, and removing comments. Most of what the preprocessor does is handle “preprocessor directives”; that is, lines that begin with the # character.

For this assignment you are to write a C++ program to implement a small subset of the C preprocessor.

Your program should take one argument. The argument is the name of a file, which must have the suffix “.c”. Your program should create a file for output with the same name, replacing the suffix “.c” with the suffix “.i”. The program should process the input file and put its output in the output file.
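As an illustration of the file-naming rule above, here is a minimal sketch of deriving the output name from the input name. The helper name outputNameFor and its bool-plus-out-parameter shape are assumptions made for this example, not part of the assignment; the sketch also assumes that a name consisting only of “.c” should be rejected.

```cpp
#include <string>

// Hypothetical helper (name and signature are assumptions for illustration).
// If `input` ends in ".c", stores the matching ".i" name in `output` and
// returns true; otherwise leaves `output` unchanged and returns false.
bool outputNameFor(const std::string& input, std::string& output) {
    const std::string suffix = ".c";
    // Assumption: a bare ".c" with nothing before the suffix is not accepted.
    if (input.size() <= suffix.size()) {
        return false;
    }
    // Compare only the last two characters, so "prog.cpp" is rejected and
    // a ".c" that appears earlier in the path is left alone.
    if (input.compare(input.size() - suffix.size(), suffix.size(), suffix) != 0) {
        return false;
    }
    output = input.substr(0, input.size() - suffix.size()) + ".i";
    return true;
}
```

A main function could call this helper on its single argument, report an error when it returns false, and otherwise open the input and output files with std::ifstream and std::ofstream.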

The program should copy everything from the input to the output, with the following changes:

  • The program should remove all comments, both comments that are enclosed in /* and */, and comments that begin with //. You should assume that comments do NOT nest. A comment that does not end is an error: it should be detected and the error message “Unterminated comment that began at line N” should be generated.
  • The program must recognize and process the #define directive. You need only recognize the simple case in which a symbol is defined to have a value. In other words, you only need to recognize lines of the following form:
    #define SYMBOL value
    When your program recognizes a line in this form, it should remember the SYMBOL and its value and not generate any output for that line. Redefining a SYMBOL that has already been defined is an error: your program should generate the error message “Duplicate definition of SYMBOL at line N ignored”, and should ignore the duplicate definition.
  • The program must recognize a simplified form of #ifdef and #endif: if your program finds the line #ifdef SYMBOL, and if SYMBOL is NOT defined, it should replace each line of input with a blank line until it reads the matching #endif. Note that #ifdef can be nested, that an #endif belongs with the most recently read #ifdef that has not yet been closed, and that a missing #endif should generate the error message “Missing #endif for #ifdef that began at line N”.
  • If a SYMBOL has been defined, and if that SYMBOL is found as a token somewhere in the input, then SYMBOL should be replaced with the defined value in the output. Symbols inside of quoted strings should NOT be replaced!
  • ANY preprocessor directives formatted in ANY other way should simply be copied to the output, unchanged.
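The symbol-replacement rule in the list above is the easiest one to get subtly wrong, so here is a minimal sketch of one way to approach it. The function name substituteSymbols and the use of std::unordered_map as the symbol table are assumptions made for this example. The sketch also assumes that comments have already been removed, that string literals do not span lines, and that character literals such as 'A' need no special handling.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper (name and signature are assumptions for illustration).
// Returns a copy of `line` in which every defined symbol that appears as a
// whole token is replaced by its value. Text inside double-quoted strings
// is copied unchanged.
std::string substituteSymbols(
    const std::string& line,
    const std::unordered_map<std::string, std::string>& defines) {
    std::string out;
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (c == '"') {
            // Copy the string literal verbatim, honoring backslash escapes.
            out += line[i++];
            while (i < line.size()) {
                char s = line[i++];
                out += s;
                if (s == '\\' && i < line.size()) {
                    out += line[i++];  // escaped character, e.g. \"
                } else if (s == '"') {
                    break;             // closing quote
                }
            }
        } else if (std::isalpha(c) || c == '_') {
            // Collect the whole identifier so MAX does not match inside MAXIMUM.
            std::size_t start = i;
            while (i < line.size() &&
                   (std::isalnum(static_cast<unsigned char>(line[i])) ||
                    line[i] == '_')) {
                ++i;
            }
            std::string token = line.substr(start, i - start);
            auto it = defines.find(token);
            out += (it != defines.end()) ? it->second : token;
        } else if (std::isdigit(c)) {
            // Copy a number together with any trailing letters (e.g. 10L),
            // so the letters are not mistaken for the start of a symbol.
            while (i < line.size() &&
                   (std::isalnum(static_cast<unsigned char>(line[i])) ||
                    line[i] == '_')) {
                out += line[i++];
            }
        } else {
            out += line[i++];
        }
    }
    return out;
}
```

Scanning character by character, rather than using a plain search-and-replace, is what lets the function tell a token apart from a substring and skip over quoted text. The same symbol table can serve the #define and #ifdef rules: a #define line inserts into it (after checking for a duplicate), and an #ifdef line looks the symbol up.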