All programs should compile with no warnings when compiled with the -Wall option, e.g., gcc -Wall taxes.c. Beginning this week we will be taking points off for warnings. You should put your name(s) in a comment on the first line of each file. When you are to write your own functions, since main() should be the first function in each file, you will need to provide a prototype above main() for each function you write. You will find my executables as well as testing files in ~ssdavis/30/p7. The prompts and output format of each program must match the examples exactly. User inputs are in bold.

This is the first of a series of assignments that interact with genealogy files. You are to write a program that reads a GEDCOM (an acronym standing for GEnealogical Data COMmunication) file, and then provides the names of the children of a person named by the user. “A GEDCOM file consists of a header section, records, and a trailer section. Within these sections, records represent people (INDI record), families (FAM records), sources of information (SOUR records), and other miscellaneous records, including notes. Every line of a GEDCOM file begins with a level number where all top-level records (HEAD, TRLR, SUBN, and each INDI, FAM, OBJE, NOTE, REPO, SOUR, and SUBM) begin with a line with level 0, while other level numbers are positive integers.” From http://en.wikipedia.org/wiki/GEDCOM.

For this assignment you will only be dealing with the INDI and FAM top-level records. The information for a given record ends when the tag line of another top-level record occurs in the file. Thus, everything between two lines that begin with zero deals with a single record. An INDI record provides information about an individual, including their name (NAME tag). A FAM record provides information about a family, including its spouses (HUSB and WIFE tags) and children (CHIL tags). Each individual and family has a unique ID that has an ‘@’ at each end. These IDs are established in each INDI and FAM tag line, and used in the HUSB, WIFE, and CHIL tag lines. Note that of all the tag lines with which your program interacts, only the NAME tag line does not have an ID as its data. Other than the INDI and FAM tag lines, all of the tag lines your program will interact with begin with a ‘1’, because they need no further elaboration, unlike a BIRTH, which would have a date and a place.

For this first genealogy assignment, we will be streamlining the information we store. This will make our searching inefficient, but will minimize the number of data arrays you will need to maintain. We will be using “parallel” arrays for this assignment. All of the elements at a given index of parallel arrays contain information about the same object. In this case, the indiIDs and names arrays will be parallel arrays that deal with individuals, and the spousesIDs and childIDs arrays will be parallel arrays that deal with families. For example, indiIDs[2] and names[2] will both refer to the same individual, and spousesIDs[5] and childIDs[5] will both refer to the same family. You will be storing multiple IDs separated by spaces in each element of the spousesIDs and childIDs arrays.

Since GEDCOM files have different lengths, your arrays will have to be dynamically allocated. Since we are using parallel arrays, the number of elements in indiIDs and names will be the same, and the number of elements in the spousesIDs and childIDs will be the same. A simple way to determine the size needed is to quickly read the file, and only count how many INDI tags and FAM tags there are, and then allocate accordingly. After reading the file, you will rewind() it so you can read through it again and this time process its data into the arrays.

The search process relies on linear search and the parallel arrays. When given a name by the user, search the names array for that name. If it is found, the individual’s ID can be found in the indiIDs array at the same index as the name. Next, search the spousesIDs array for that ID. Once you find the family that has the individual as a spouse, loop through the IDs in that family’s corresponding childIDs element and look for each one in the indiIDs array. Once you find a child’s ID in indiIDs, print their name from the names array at the same index.

Specifications

  • main.c will only contain main()
    • main() should act as the manager of a program. It declares the variables that are shared among the top level functions, ensures that the program command is proper, and calls user functions.
    • The variables will be four char** for your arrays, two ints to hold the sizes of the two sets of parallel arrays, and a FILE*.
    • The name of the GEDCOM file will be passed as the command line parameter to main().
  • vector.c
    • initialize() will allocate and initialize the values of the elements of the four arrays.
    • deallocate() will free all dynamically allocated memory.
  • vector.h will contain the prototypes of vector.c as well as macros for constants used in vector.c.
  • file.c
    • count_records() will try to open the file, count the families and individuals in the file, rewind the file, and dynamically allocate the four arrays. If there is an error, then count_records() should notify the user of the problem, and call exit().
    • read_file() will loop through the file calling read_indi(), read_family(), or read_other() based on the record tag, until the end of the file is reached.
    • read_indi() will process an INDI record to copy an individual’s ID into indiIDs, dynamically allocate enough room for their name in names, copy their name into names, and increment indi_count.
    • read_family() will process a FAM record to append the ID of each spouse to the family’s spousesIDs, and append the ID of each child to the family’s childIDs. There should be a space character between each ID.
    • read_other() will simply loop through any record that is not an INDI or FAM record until it comes upon another top-level record.
    • get_ID() parses an ID out of a tagged line, and returns a pointer to it.
  • file.h will contain the prototypes of file.c.
  • search.c
    • find_children() conducts the searching process that is explained above.
    • find_ID() finds an ID within an array of char*s, and returns its index.
    • find_name() will parse a name entered by the user, search the names array for it, and return its index.
  • search.h will contain the prototype for find_children().
  • Makefile
    • It will create an executable named family.out.
    • It must create an object file for each source code file.
    • You must use the -Wall and -g options on all lines invoking gcc.
    • It must have a clean: option that uses rm -f to explicitly remove the files created by the Makefile, i.e., family.out, family.o, file.o, vector.o, and search.o.