CS 1724, Data Structures
Assignment 5, Concordance Program
Due March 28, 1997

On pages 139-143 (Section 6.5), your text gives the C code for a program that prints each word in an input text along with the number of times the word occurred. Using the program source itself as input, the book's program produces output that starts as follows:

       2 EOF
       3 MAXWORD
       5 NULL
       3 a
       1 add
       6 addtree
       . . .

For this assignment you are to modify the text's program so that it produces a concordance: a list of each word in the input text and, for each word, a list of the line numbers where the word occurred. (This is essentially just Exercise 6-3 on page 143.) Using the same program as source, your modified program should produce output like the following. (Much of it is deleted below.)

    EOF: 27 77
    MAXWORD: 6 25 27
    NULL: 26 39 43 57 97
    a: 35 64 92
    add: 35
    addtree: 15 29 35 36 48 50
    at: 35
    below: 35
    break: 86
    builtin: 92
    c: 73 75 77 78 79 81
    char: 9 15 18 19 25 36 71 74 93 95 96
    ...
    ... (a number of lines deleted)
    order: 54
    p: 35 36 39 40 41 42 43 45 46 48 50 51 54 55
       57 58 59 60 95 96 97 98 99
    print: 54
    ...
    ... (a number of lines deleted)

Details:

1. Relatively simple changes are needed to the supplied source. (It was all taken from the text.)

2. The getword() function should return the current line number through an additional parameter. You can keep track of the current line number by starting a static variable at 1 and incrementing it each time you see a '\n' on input.

3. Inside the main struct for the tree node (tnode) you must use a pointer to a linked list of integers, which will hold the line numbers. This will require creating another struct and writing another allocation function.

4. The addtree() function will take another parameter, the line number where the word occurred. addtree() will insert this line number into the linked list associated with the node for the word.

5. Fix up the output so that for a given word it prints a given line number only once, as shown with the entry for p above.

6. Have your program skip to a new line after printing 14 numbers (or so), again as shown with the entry for p above.

7. Print the line numbers in increasing order. There are several ways to arrange this.

8. Inside the directory "~wagner/pub/CS1723" is the text of the assignment as "assign5.text", the source file as "tree.c", and a non-recursive version of the source as "treealt.c".

The Book's Program:

Here is the book's program, using "cat -n tree.c" to show the line numbers:

     1  #include <stdio.h>
     2  #include <ctype.h>
     3  #include <string.h>
     4  #include <stdlib.h>
     5
     6  #define MAXWORD 100
     7
     8  struct tnode {
     9      char *word;
    10      int count;
    11      struct tnode *left;
    12      struct tnode *right;
    13  };
    14
    15  struct tnode *addtree(struct tnode *, char *);
    16  void treeprint(struct tnode *);
    17  struct tnode *talloc(void);
    18  int getword(char *, int);
    19  char *strdupl(char *);
    20
    21  /* word frequency count */
    22  void main(void)
    23  {
    24      struct tnode *root;
    25      char word[MAXWORD];
    26      root = NULL;
    27      while (getword(word, MAXWORD) != EOF)
    28          if (isalpha(word[0]))
    29              root = addtree(root, word);
    30      treeprint(root);
    31      exit(0);
    32  }
    33
    34
    35  /* addtree: add a node with w, at or below p */
    36  struct tnode *addtree(struct tnode *p, char *w)
    37  {
    38      int cond;
    39      if (p == NULL) {
    40          p = talloc();
    41          p -> word = strdupl(w);
    42          p -> count = 1;
    43          p -> left = p -> right = NULL;
    44      }
    45      else if ((cond = strcmp(w, p -> word)) == 0)
    46          (p -> count)++;
    47      else if (cond < 0)
    48          p -> left = addtree(p -> left, w);
    49      else
    50          p -> right = addtree(p -> right, w);
    51      return p;
    52  }
    53
    54  /* treeprint: in-order print of tree p */
    55  void treeprint(struct tnode *p)
    56  {
    57      if (p != NULL) {
    58          treeprint(p -> left);
    59          printf("%4d %s\n", p -> count, p -> word);
    60          treeprint(p -> right);
    61      }
    62  }
    63
    64  /* talloc: make a tnode */
    65  struct tnode *talloc(void)
    66  {
    67      return (struct tnode *) malloc(sizeof(struct tnode));
    68  }
    69
    70  /* getword: get next word or character from input */
    71  int getword(char *word, int lim)
    72  {
    73      int c;
    74      char *w = word;
    75      while (isspace(c = getchar()))
    76          ;
    77      if (c != EOF)
    78          *w++ = c;
    79      if (!isalpha(c)) {
    80          *w = '\0';
    81          return c;
    82      }
    83      for ( ; --lim > 0; w++)
    84          if (!isalpha(*w = getchar())) {
    85              ungetc(*w, stdin);
    86              break;
    87          }
    88      *w = '\0';
    89      return word[0];
    90  }
    91
    92  /* strdupl: make a duplicate of s. strdup is builtin */
    93  char *strdupl(char *s)
    94  {
    95      char *p;
    96      p = (char *) malloc(strlen(s)+1);
    97      if (p != NULL)
    98          strcpy(p, s);
    99      return p;
   100  }
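
A possible starting point for the line-number list:

The fragment below is a minimal sketch of one way to handle items 3, 5, 6, and 7 of the Details list. It is not part of the supplied source. The names lnode, lalloc, addline, printlines, and PER_ROW are all hypothetical, chosen only to echo the book's tnode and talloc. It keeps each word's line numbers in an ascending linked list and refuses duplicates at insertion time, which is only one of the "several ways" mentioned in item 7. Because getword() reads the input front to back, line numbers arrive in nondecreasing order, so a simpler append-at-tail list would probably work as well.

```c
#include <stdio.h>
#include <stdlib.h>

#define PER_ROW 14  /* numbers printed per row before wrapping (item 6) */

/* lnode: one cell in a word's list of line numbers (hypothetical name) */
struct lnode {
    int line;
    struct lnode *next;
};

/* lalloc: make an lnode, in the style of the book's talloc() */
struct lnode *lalloc(void)
{
    return (struct lnode *) malloc(sizeof(struct lnode));
}

/* addline: insert line into the ascending list that starts at head,
   skipping it if already present; returns the (possibly new) head */
struct lnode *addline(struct lnode *head, int line)
{
    struct lnode **pp = &head;  /* link that may need to be redirected */
    struct lnode *n;

    while (*pp != NULL && (*pp)->line < line)
        pp = &(*pp)->next;
    if (*pp != NULL && (*pp)->line == line)
        return head;            /* duplicate: print each line only once */
    n = lalloc();
    if (n == NULL)
        return head;            /* out of memory: drop this entry */
    n->line = line;
    n->next = *pp;
    *pp = n;
    return head;
}

/* printlines: print "word: n n n ...", starting a new row after
   every PER_ROW numbers */
void printlines(FILE *out, const char *word, const struct lnode *p)
{
    int n = 0;

    fprintf(out, "%s:", word);
    for ( ; p != NULL; p = p->next) {
        if (n > 0 && n % PER_ROW == 0)
            fprintf(out, "\n   ");
        fprintf(out, " %d", p->line);
        n++;
    }
    fprintf(out, "\n");
}
```

Under these assumptions, tnode would carry a struct lnode * member in place of count, addtree() would call addline() on that member with the line number it receives, and treeprint() would call printlines(stdout, ...) for each node. The indentation of continuation rows is a guess; the assignment only asks for a new line after 14 numbers or so.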