Assignment #5: Sorting Program
[Modified Exercise 8.28]: Write a simple sorting utility, sort.
The sort command takes a file name as a parameter.
The file contains a sequence of lines; each line contains
fields. By default, the fields are separated by white space
(space or tab, in any amount),
but the -t option allows you to use any one character
as a field separator.
By default fields are to be interpreted as strings, but the -n
option allows you to interpret a field as a (double)
number.
Finally, by default lines are
to be sorted as if the entire line was a single string
(breaking ties arbitrarily).
However, the -k option can be used to specify that
the sort is performed using a specified field.
Options may appear in any order.
The file name may appear anywhere; if it is not present,
use cin.
You may assume a limit of nine fields.
The sorted output is written to cout.
Examples
If the command is
sort -n -t: -k2 fileName
and the data file fileName is
10:20:30:40
20:100:40:30
30:40:10:20
Then the output is sorted on the basis of the second field, which is interpreted as a number, and is thus
10:20:30:40
30:40:10:20
20:100:40:30
Without the -n option, the output would be
20:100:40:30
10:20:30:40
30:40:10:20
Without any options, each line would be viewed as a giant string.
The resulting output would be
10:20:30:40
20:100:40:30
30:40:10:20
Unusual Circumstances
If the file doesn't open,
a line does not have enough fields, or a field
that is to be a number can't be a number, then
you have to do something reasonable.
I will accept anything reasonable, but you have
to document the behavior.
You also must perform other checks, including verification
that the command-line arguments make sense.
For instance, a character must follow -t and
-k.
Algorithms
You already have the quicksort algorithm in the online
code and I expect you to use it UNMODIFIED.
The idea is that quicksort is a template
and can work with anything.
So define a class that stores the complete line (as a
String) and the key, on which the sort is
based. Since the key could be either a double
or a String, the logical declaration is:
template <class Comparable>
struct Line
{
String theLine;
Comparable theKey;
int operator< ( const Line & rhs ) const
{ return theKey < rhs.theKey; }
};
At this point, you need to hook everything together,
and much of the work is parsing the command-line arguments
and then the lines.
I will sketch some of the ideas in class on Monday, but that will be
all that I have to say.
C++ Stuff
Submission Details
I will provide
four good test cases of various sizes, using various options,
and two additional test cases that are degenerate.
Look for them on November 9.
-
Submit all your source code, and the results of running the program
on the test cases.
-
You must also provide a "man" page that explains what
your program does. In particular, tell me how you
handle degenerate cases.
Try man sort to see what the Unix page looks like;
yours should be more complete.
I expect a professional job -- type it!
- You must tell me if the bad operator[] for strings
was a problem, and if so, how you worked around it.
The man page, and operator[] decision are important
components of your grade.
Extra Credit
-
The data movement of all the strings can be excessive.
Use pointers in the Line class to avoid the extra cost.
-
Efficiently
add a -r option that reverses the order of the comparisons.
Thus the sort proceeds from largest to smallest.
-
Efficiently add a
-f that makes the sort case-insensitive.