This is my first try at a mass email. If you don't receive this message please let me know. (Supposed to be a joke.) I am also going to post these answers on the course web site.
Recently a student asked:
> I've got a couple questions about the lexical analyzer project in your
> programming languages class. The first has to do with integers. Say, for
> example, that we have the integer 50 somewhere in the source code. Since
> we're reading the source one character at a time, wouldn't we read the 5 and
> the 0 separately? If that's the case, then how do we put them back together
> again to make the single integer 50?

Yes, you are correct: you get the integer one character at a time. You could just save the integer as a string and convert it to an int later. However, it would be better to change the string to an int right away. There are several ways to do this:

Low-tech method: convert to an int as you go ("on the fly").
    /* read_int2.c: read digits and convert to an int on the fly */
    #include <stdio.h>
    #include <ctype.h>

    int main() {
       int ch;            /* storage for each char; int so EOF is detectable */
       int decoded_int;   /* final decoded int */
       while ( !isdigit(ch = getchar()))   /* skip to first digit */
          if (ch == EOF) return 1;         /* input ended with no digit */
       decoded_int = ch - '0';             /* decode first digit as an int */
       while ( isdigit(ch = getchar()))    /* decode remaining digits */
          decoded_int = decoded_int*10 + ch - '0';
       printf("Decoded int: %i\n", decoded_int);
       return 0;
    }

    ten42% cc -o read_int2 read_int2.c
    ten42% read_int2
    47532
    Decoded int: 47532
    ten42% read_int2
    xyz47532xyz
    Decoded int: 47532

High-tech method, using a string buffer and sscanf:
    /* read_int.c: read digits into a buffer and convert to an int */
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>

    #define BUF_SIZE 20

    int main() {
       char* buf = (char* ) malloc(BUF_SIZE); /* storage for string of digits */
       int ch;            /* storage for each char; int so EOF is detectable */
       int buf_ind = 0;   /* index into the array buf */
       int decoded_int;   /* final decoded int */
       if (buf == NULL) return 1;
       while ( !isdigit(ch = getchar()))   /* skip to first digit */
          if (ch == EOF) return 1;         /* input ended with no digit */
       buf[buf_ind++] = ch;                /* insert first digit into buf */
       /* insert other digits into buf, leaving room for the null char */
       while ( isdigit(ch = getchar()) && buf_ind < BUF_SIZE - 1)
          buf[buf_ind++] = ch;
       buf[buf_ind] = '\0';                /* must have null char at the end */
       /* sscanf reads from a string rather than a file. */
       /* Otherwise it's just like scanf. Use %d here: with %i, */
       /* a leading 0 would make sscanf read the digits as octal. */
       sscanf(buf, "%d", &decoded_int);
       printf("Original string: \"%s\"\n", buf);
       printf("Decoded int: %i\n", decoded_int);
       free(buf);
       return 0;
    }

    ten42% cc -o read_int read_int.c
    ten42% read_int
    47532
    Original string: "47532"
    Decoded int: 47532
    ten42% read_int
    xyz47532xyz
    Original string: "47532"
    Decoded int: 47532

> My other question was in regards to the symbol tables. I understand
> the need to have a symbol table with the reserved words in it.

Let's not call a short table of reserved words a "symbol table". It just lets you recognize the various reserved words and return a unique integer code for each one.
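A reserved-word table of this kind can be quite small. The following is only a sketch: the token codes (IF_TOK and so on), the particular list of words, and the function name reserved_code are all made up for illustration and are not the names the project requires.

```c
#include <string.h>

/* Hypothetical token codes; the project's actual codes may differ. */
enum { NOT_RESERVED = 0, IF_TOK, ELSE_TOK, WHILE_TOK, RETURN_TOK };

struct reserved {
   const char *word;   /* spelling of the reserved word */
   int code;           /* unique integer code returned for it */
};

/* Hypothetical reserved-word list, for illustration only. */
static const struct reserved reserved_words[] = {
   { "if", IF_TOK },       { "else", ELSE_TOK },
   { "while", WHILE_TOK }, { "return", RETURN_TOK }
};

/* Return the code for a reserved word, or NOT_RESERVED otherwise. */
int reserved_code(const char *s) {
   size_t i;
   size_t n = sizeof reserved_words / sizeof reserved_words[0];
   for (i = 0; i < n; i++)
      if (strcmp(s, reserved_words[i].word) == 0)
         return reserved_words[i].code;
   return NOT_RESERVED;
}
```

With this table, reserved_code("while") returns WHILE_TOK and reserved_code("count") returns NOT_RESERVED, so the caller knows to treat "count" as an ordinary identifier. A linear search is fine for a handful of words.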
> What I don't
> understand is the need for a symbol table to hold all the identifiers we come
> across. If we have a string of characters that is not a reserved word, don't
> we just put the string into the nameval of the Token? Does it matter if
> we've already read that particular identifier before? The token will still
> be a nametok with the string as its nameval. I feel like I'm missing
> something here.

You're right. For this part of the project you don't need a symbol table, and you can just return the characters of the identifier, regardless of whether you've seen the identifier before or not. For later parts of the project it will be essential to know which identifier is being used, and then a symbol table will be needed.
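To give a rough idea of what that later symbol table has to do, here is a minimal sketch. It assumes a fixed-size table and a linear search; the names sym_lookup, sym_names, and MAX_SYMS are invented for this example and are not part of the project handout. The point is only that the same identifier always maps to the same entry.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SYMS 100   /* hypothetical capacity, chosen for illustration */

static char *sym_names[MAX_SYMS];  /* identifier spellings seen so far */
static int sym_count = 0;          /* number of entries in use */

/* Return the index of name in the table, adding it on first sight.
   Returns -1 if the table is full or memory runs out. */
int sym_lookup(const char *name) {
   int i;
   char *copy;
   for (i = 0; i < sym_count; i++)
      if (strcmp(sym_names[i], name) == 0)
         return i;                  /* seen before: same index again */
   if (sym_count == MAX_SYMS)
      return -1;
   copy = (char *) malloc(strlen(name) + 1);
   if (copy == NULL)
      return -1;
   strcpy(copy, name);              /* keep a private copy of the spelling */
   sym_names[sym_count] = copy;
   return sym_count++;
}
```

Starting from an empty table, looking up "x", then "y", then "x" again gives the indices 0, 1, 0. The second lookup of "x" finds the existing entry rather than creating a new one, which is exactly the information a bare string in the token does not carry.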
-- Neal Wagner
*------------------------------------------------------------------*
* Neal R. Wagner, Assoc. Prof., Division of Computer Science       *
* University of Texas at San Antonio, San Antonio, Texas 78249     *
* Tel:(210)458-5550, Fax:(210)458-4437, E-mail:wagner@cs.utsa.edu  *
* Web page: http://www.cs.utsa.edu/~wagner/                        *
*------------------------------------------------------------------*
* Dialog from the famous "Soup Nazi" episode of Seinfeld:          *
* (Note: this episode is about a temperamental owner of a soup     *
* kitchen. It is almost exactly based on a real restaurant:        *
* Soup Kitchen International, 8th Ave. and 55th St., NYC, and      *
* the real owner: Ali "Al" Yeganeh. I have eaten at the real       *
* soup kitchen many times; all the soups are wonderful --the       *
* lobster bisque is the best soup ever created.)                   *
*                                                                  *
* Soup Nazi: You are the only one who understands me.              *
* Kraemer: You suffer for your soup.                               *
* Soup Nazi: Yes, that is right.                                   *
* Kraemer: You demand perfection from yourself, from your soup.    *
* Soup Nazi: How can I tolerate any less from my customers?        *
*                                                                  *
* (See: http://members.aol.com/rynocub/soupnazi.htm                *
*       http://home.earthlink.net/~asena/soup.html                 *
*       http://www.toptown.com/dorms/rick/SOUPPAGE.HTM)            *
*------------------------------------------------------------------*