Assignment #4: Indexing a Textbook

Generate an index for a book.

Compiler Notes

You will use the STL for this assignment.

  1. On Unix, this can be done with the g++ 2.8.1 compiler. Older versions do not support the STL. g++ does not cleanly support separate compilation of templates, but that should not be a problem, because you'll only need one C++ source file for this assignment.
  2. Visual C++ 5.0 has two implementation errors that you should be aware of. First, if you declare a vector<Object>, the Visual C++ implementation requires that Object provide an operator< (and perhaps some others). It will be clear from the linker errors what you need. Second, the string class has a bug that surfaces intermittently when you copy a shorter string into an already existing string. To be safe, before you do copies over an existing string, use erase:
        string str = "junk";
        str.erase( );
        str = "the";
    

The Input

The input file is specified as a command-line argument (use argc and argv). Be certain to check the validity of the file.

The input file consists of a set of index entries. Each line consists of the string IX:, followed by an index entry name enclosed in braces, followed by a page number that is enclosed in braces. Each ! in an index entry name represets a sub-level. A |( represents the start of a range and a |) represents the end of the range. Occasionally, this will be the same page, and in that case, only output a single page number. Otherwise, do not collapse or expand ranges on your own. As an example, here's a sample input:

IX: {Series|(}        	{2}
IX: {Series!geometric|(}        	{4}
IX: {Euler's constant}       	{4}
IX: {Series!geometric|)}        	{4}
IX: {Series!arithmetic|(}        	{4}
IX: {Series!arithmetic|)}        	{5}
IX: {Series!harmonic|(}        	{5}
IX: {Euler's constant}       	{5}
IX: {Series!harmonic|)}        	{5}
IX: {Series|)}        	{5}
The corresponding output is
Euler's constant: 4, 5
Series: 2-5
   arithmetic: 4-5
   geometric: 4
   harmonic: 5

Strategy

Use a map to map the index entry to a list (or vector) that stores information about the page numbers.

This problem is a slightly more involved variation of the concordance problem that I discussed in class. Here, each entry in the list is not an int, but rather is an entity that stores a page number and an indication of whether the page number starts a range, finishes a range, or is on its own.

For extra credit, ignore case distinctions so that the sort is case-independent. To do this, define a function object that performs a case-insensitive comparison. If that function object is Comp, then your map is declared as

map<string,vector<Entry>,Comp> m;

You may assume that every line of the input is properly formatted. You must verify that |( and |) are properly matched, and print a warning message if they are not. Any strange-looking characters on the line can be accepted as part of the index entry (they are likely text-formatting commands).

What To Submit

Submit complete source code and generate an index for my first textbook. (This file can be copied from ~weiss/www/cop3530_f98/assignments/dsaa.idx).

Due Date

This program is due on Monday, November 2.