CS 3723 Programming Languages
Review for Exam 1, Monday, 23 October 2000
Assignments:
Handouts:
- A scanner in C for C-style comments,
based on an FSM.
Also a similar scanner in Java.
- Parser for arithmetic expressions, along with debug output:
Text version,
Postscript version, for printing,
C source text
- Parser and Reverse Polish Notation (RPN) generator:
Text version,
Postscript version, for printing,
C source text
- Evaluator for arithmetic expressions, with output:
Text version,
Postscript version, for printing,
C source text
- Java program to convert an arithmetic expression to
fully parenthesized prefix notation (as in Lisp):
Text version,
Postscript version, for printing,
Java source text
Topics:
- Contrasts:
- Syntax versus semantics.
- Run time versus compile time.
- Translation versus interpretation.
- Overview of a compiler (text, pages 10 and 13):
- Lexical analysis.
- Syntax analysis.
- Semantic analysis.
- Intermediate code generation.
- Code optimization.
- Code generation.
- Quadruples:
- Basic form: operator, argument 1, argument 2, result.
- Coded as four integers.
- The types of instructions:
- Arithmetic: ADD, SUB, MUL, DIV, MOD, ABS, CHS.
- Jumps: JMP, JEQ, JNE, JGE, JGT, JLE, JLT.
- Assignment: ASG.
- I/O: WRT, RDM.
- Halt execution: HLT.
- Literal (for constants): LIT.
- The form of a quadruple interpreter and of a quadruple translator.
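The form of a quadruple interpreter can be sketched roughly as below. This is only an illustration under stated assumptions, not the course's interpreter: the opcode numbering, the function name run_quads, the convention that arguments are indexes into a memory array, and the choice to collect WRT output in an array are all hypothetical. Only a subset of the instructions is shown.

```c
/* Hypothetical opcode numbering; the course's actual codes may differ. */
enum { ADD, SUB, MUL, ASG, LIT, JMP, JLT, WRT, HLT };

/* A quadruple: operator, argument 1, argument 2, result (four integers). */
typedef struct { int op, arg1, arg2, result; } Quad;

/* Executes quads starting at index 0 until HLT is reached.
   Assumption: arg1, arg2 and result are indexes into mem, except that
   LIT carries its constant in arg1 and jumps carry a quad index in result.
   Each value given to WRT is appended to out (up to max_out values).
   Returns the number of values written, or -1 on an unknown opcode. */
int run_quads(const Quad *code, int *mem, int *out, int max_out)
{
    int pc = 0;   /* index of the next quad to execute */
    int n = 0;    /* number of values written so far */

    for (;;) {
        Quad q = code[pc++];          /* fetch, then advance */
        switch (q.op) {
        case ADD: mem[q.result] = mem[q.arg1] + mem[q.arg2]; break;
        case SUB: mem[q.result] = mem[q.arg1] - mem[q.arg2]; break;
        case MUL: mem[q.result] = mem[q.arg1] * mem[q.arg2]; break;
        case ASG: mem[q.result] = mem[q.arg1]; break;
        case LIT: mem[q.result] = q.arg1; break;
        case JMP: pc = q.result; break;
        case JLT: if (mem[q.arg1] < mem[q.arg2]) pc = q.result; break;
        case WRT: if (n < max_out) out[n++] = mem[q.arg1]; break;
        case HLT: return n;
        default:  return -1;
        }
    }
}
```

A quadruple translator would have the same loop-and-switch shape, but each case would emit target code for the quad instead of executing it.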
- Finite-state machines (FSMs):
- Diagrams with states (circles), transitions (arrows),
and labels on arrows (character absorbed in
following the transition).
There is also exactly one start state and one or more
final (or terminal) states.
- Non-deterministic Finite Automata (NFAs). This is an
FSM in which at least one state has several outgoing arrows
labeled with the same symbol.
- Deterministic Finite Automata (DFAs). Here the arrow
leaving a given state for a given symbol is unique.
- Algorithm for converting an NFA to a DFA. The new states in
the constructed DFA are sets of states of the old NFA.
- The language described by a DFA (= all possible sequences
of characters absorbed by the machine when it ends up in
a final state). We also say the language recognized
by a DFA.
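The NFA-to-DFA conversion can be sketched as follows. This is a minimal illustration, assuming an NFA without empty (epsilon) moves, at most 32 NFA states so that a set of states fits in one unsigned bit mask, and a two-symbol alphabet. The names nfa_to_dfa, NSYM and MAXDFA are hypothetical, not taken from the course handouts.

```c
#define NSYM   2     /* assumed alphabet of two symbols, coded 0 and 1 */
#define MAXDFA 64    /* assumed limit on the number of DFA states */

/* nfa[s][c] is the set (bit mask) of NFA states reachable from NFA state s
   on symbol c. start_set is the set holding just the NFA start state.
   On return, dstate[i] is the set of NFA states that makes up DFA state i,
   and dtrans[i][c] is the DFA state reached from DFA state i on symbol c.
   Returns the number of DFA states, or -1 if MAXDFA is exceeded.
   A DFA state is final if its set contains any final NFA state. */
int nfa_to_dfa(int nstates, unsigned nfa[][NSYM], unsigned start_set,
               unsigned dstate[MAXDFA], int dtrans[MAXDFA][NSYM])
{
    int count = 0;

    dstate[count++] = start_set;
    for (int i = 0; i < count; i++) {       /* DFA states not yet expanded */
        for (int c = 0; c < NSYM; c++) {
            unsigned target = 0;
            int j;

            /* Union of the moves of every NFA state in this set. */
            for (int s = 0; s < nstates; s++)
                if (dstate[i] & (1u << s))
                    target |= nfa[s][c];

            /* Reuse an existing DFA state with the same set, if any. */
            for (j = 0; j < count; j++)
                if (dstate[j] == target)
                    break;
            if (j == count) {
                if (count == MAXDFA)
                    return -1;
                dstate[count++] = target;
            }
            dtrans[i][c] = j;
        }
    }
    return count;
}
```

In this sketch the empty set, when it arises, simply becomes an ordinary DFA state with all arrows leading back to itself (a dead state).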
- Use of FSMs in constructing a scanner:
- Simulate the FSM for a scanner. Usually based on a DFA,
though it is possible to simulate an NFA. (Just keep track
of a set of states at each step.)
- Construct a simulator for a DFA by letting each state
be a labeled location in the code, and each transition
a goto, conditional on the character (or type of symbol)
labeling the transition.
- Simulate the same DFA using a state variable, with a
while loop around a switch that selects the
code for the current state at each step.
- Examples, especially for C-style comments and for
the language of our term project.
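The state-variable style of DFA simulation might look like the sketch below, which accepts a string only if it is exactly one C-style comment. It is not the handout scanner: the state names and the function name is_c_comment are hypothetical, and the DEAD state is an added convenience for rejecting early.

```c
/* Returns 1 if s consists of exactly one C-style comment, otherwise 0. */
int is_c_comment(const char *s)
{
    enum { START, SLASH, BODY, STAR, DONE, DEAD } state = START;

    while (*s != '\0' && state != DEAD) {
        char c = *s++;                 /* absorb one character */
        switch (state) {
        case START:                    /* nothing seen yet */
            state = (c == '/') ? SLASH : DEAD;
            break;
        case SLASH:                    /* seen the opening slash */
            state = (c == '*') ? BODY : DEAD;
            break;
        case BODY:                     /* inside the comment */
            state = (c == '*') ? STAR : BODY;
            break;
        case STAR:                     /* inside, last character was a star */
            state = (c == '/') ? DONE : (c == '*') ? STAR : BODY;
            break;
        case DONE:                     /* anything after the close rejects */
            state = DEAD;
            break;
        default:
            state = DEAD;
            break;
        }
    }
    return state == DONE;              /* DONE is the only final state */
}
```

The goto style described above would use one labeled block of code per state in place of the switch cases.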
- Formal grammars and formal languages:
- Terminology: Terminal (symbol), non-terminal (symbol),
meta-symbol, start symbol,
grammar rule, left side
(= single non-terminal) and right
side (= sequence of terminals and non-terminals),
a formal grammar (= finite set of rules).
- The formal grammars used here are also called BNF
(Backus-Naur Form) grammars or context-free grammars.
(``Context-free'' because a non-terminal can be replaced
regardless of the context in which it occurs.)
- Derivation sequences, starting with the start symbol
and ending with a sequence of terminals (called a sentence).
At each step of the derivation, a single non-terminal is
replaced using a grammar rule with that non-terminal on its
left side. The non-terminal is replaced by the right side.
- Language associated with a grammar (= all strings of
terminals that are generated by some derivation sequence).
We also say language recognized by the grammar.
- Leftmost derivations (leftmost non-terminal replaced
at each stage). Rightmost derivations.
- Parse tree for a sentence.
Each derivation determines one tree. There can
be several derivations for a given tree, but only one
leftmost and one rightmost derivation.
- Ambiguity of a grammar. (There is a sentence with two
distinct parse trees, or equivalently, with two distinct
leftmost derivations.)
- Unambiguous grammars. (Each sentence has a unique parse tree,
or equivalently a unique leftmost derivation.)
- Disambiguating a grammar. Two ways: use special rules,
or change the grammar.
- Example: simple sentences in English, with noun, verb, etc.
- Example: various grammars for simple forms of arithmetic
expressions.
- Example: simple grammar for if-then with optional else.
In its simple form it is ambiguous, but a special rule
disambiguates it.
- Parsers:
- A program that constructs a derivation sequence
for an input sentence, or that
constructs a parse tree (or that goes through the motions of doing
these things).
- Top-down parser: constructs leftmost derivation from the
start symbol, aiming toward the given sentence.
- Recursive descent parser: can be coded by hand and is
useful for compilers. One writes a recursive function for each
grammar rule. The code of the function mirrors the right side
of the rule. Not all grammars can be handled by recursive
descent, so one might have to rewrite the grammar.
- Bottom-up parser: constructs a rightmost derivation
in reverse, from the input sentence, aiming toward the start symbol.
These parsers are usually driven by a table that is constructed
separately.
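A recursive descent recognizer can be sketched as below, with one function per non-terminal. The grammar is an assumed one (single-digit operands, no blanks), not necessarily the grammar from the handouts, and the function names are hypothetical. The repetition written with braces stands for the left-recursive rules of the usual expression grammar, rewritten so that recursive descent can handle them.

```c
/* Assumed grammar:
     E -> T { ('+' | '-') T }
     T -> F { ('*' | '/') F }
     F -> digit | '(' E ')'                                        */

static const char *p;   /* next unread input character */
static int ok;          /* cleared when a syntax error is found */

static void expr(void);

static void factor(void)            /* F -> digit | '(' E ')' */
{
    if (*p >= '0' && *p <= '9') {
        p++;
    } else if (*p == '(') {
        p++;
        expr();
        if (*p == ')')
            p++;
        else
            ok = 0;                 /* missing right parenthesis */
    } else {
        ok = 0;                     /* no factor can start here */
    }
}

static void term(void)              /* T -> F { ('*' | '/') F } */
{
    factor();
    while (ok && (*p == '*' || *p == '/')) {
        p++;
        factor();
    }
}

static void expr(void)              /* E -> T { ('+' | '-') T } */
{
    term();
    while (ok && (*p == '+' || *p == '-')) {
        p++;
        term();
    }
}

/* Returns 1 if input is a sentence of the grammar, otherwise 0. */
int parse(const char *input)
{
    p = input;
    ok = 1;
    expr();
    return ok && *p == '\0';        /* all input must be consumed */
}
```

For instance, parse("(1+2)*3") returns 1, while parse("2+") returns 0.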
- Use of parsers to construct translators:
- Syntax-directed translation:
Add code to a parser so as to carry out a translation
from one language to another.
- Example that translates from arithmetic expressions
to RPN.
- Example that evaluates arithmetic expressions.
- Example (in Java) that translates to Lisp (prefix) notation.
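Syntax-directed translation to RPN might be sketched as follows: a recursive descent parser with output actions added. This is not the handout program. The grammar (single-digit operands, no blanks), the function names, and the lack of error checking are simplifying assumptions, so the input is assumed to be a well-formed expression.

```c
/* Assumed grammar:
     E -> T { ('+' | '-') T }
     T -> F { ('*' | '/') F }
     F -> digit | '(' E ')'                                        */

static const char *in;   /* next unread input character */
static char *out;        /* next free position in the output buffer */

static void rpn_expr(void);

static void rpn_factor(void)
{
    if (*in == '(') {
        in++;
        rpn_expr();
        if (*in == ')')
            in++;
    } else if (*in >= '0' && *in <= '9') {
        *out++ = *in++;          /* action: emit the operand at once */
    }
}

static void rpn_term(void)
{
    rpn_factor();
    while (*in == '*' || *in == '/') {
        char op = *in++;         /* remember the operator */
        rpn_factor();
        *out++ = op;             /* action: emit it after both operands */
    }
}

static void rpn_expr(void)
{
    rpn_term();
    while (*in == '+' || *in == '-') {
        char op = *in++;
        rpn_term();
        *out++ = op;
    }
}

/* Translates input to RPN in buf, which must have room for
   strlen(input) + 1 characters. */
void to_rpn(const char *input, char *buf)
{
    in = input;
    out = buf;
    rpn_expr();
    *out = '\0';
}
```

The only additions to the plain parser are the lines marked as actions. Holding each operator back until its second operand has been parsed is what turns infix order into postfix order.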
- The symbol table in compilers:
- Insert when a variable is declared, look up afterward.
- Assumes declaration before use.
- The implied one-pass compiler for C, C++, Java.
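The insert-then-look-up discipline can be sketched with a very simple table. This is an illustration only: the fixed sizes, the linear search, and the names sym_insert and sym_lookup are assumptions, and a real compiler would store more than the name (type, address, and so on).

```c
#include <string.h>

#define MAX_SYMS 100     /* assumed table capacity */
#define MAX_NAME 32      /* assumed maximum name length, with terminator */

static char names[MAX_SYMS][MAX_NAME];
static int nsyms = 0;

/* Called when a variable is used.
   Returns its index, or -1 if it has not been declared. */
int sym_lookup(const char *name)
{
    for (int i = 0; i < nsyms; i++)
        if (strcmp(names[i], name) == 0)
            return i;
    return -1;
}

/* Called when a variable is declared.
   Returns the new index, or -1 if the name is already declared,
   is too long, or the table is full. */
int sym_insert(const char *name)
{
    if (sym_lookup(name) >= 0 || nsyms == MAX_SYMS
            || strlen(name) >= MAX_NAME)
        return -1;
    strcpy(names[nsyms], name);
    return nsyms++;
}
```

Because every name is inserted before it is first looked up, a failed lookup can be reported immediately as an undeclared variable, which is what allows a single pass over the source.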
Sample Questions:
- About FSMs:
- Given a diagram for a specific DFA or NFA, describe the
language recognized by the machine.
- Given a diagram for a specific NFA, construct the
corresponding DFA.
- Quadruples:
- Explain how execution of a specific sequence of
quadruples would work.
- Scanners:
- Examples of the kind of DFA that would be used to implement
a scanner.
- Given a simple DFA, write a simulator for it.
- Some specific question about your particular scanner code
written for the course assignment.
- Formal grammars:
- Given a specific grammar, say whether or not it is ambiguous.
- Given a specific grammar, describe the language recognized by it.
- Given a specific grammar and a specific sentence, construct:
- A leftmost derivation for the sentence.
- A parse tree for the sentence.
- Given a description of a language, construct a grammar for it.
- Parsers:
- Given a specific grammar, write a recursive descent parser
for the language described by the grammar.
- Given a specific recursive descent parser,
- answer questions about it.
- supply extra code to do some kind of translation.
- comment on extra code provided to do some kind
of translation.
Revision date: 2000-10-20.
(Also known as 20 October 2000, or even
10/20/00.
Please use ISO 8601,
the International Standard Date and Time Notation.)