Spring Semester 2026
image
line decor
  
line decor
image
image

image

Course Project

Homework Description Deadline
1 Implementation of a LL(1) parser for a string evaluator and a translator to Java for a simple language 03/04/2026
2

For both course assignments you’ll be required to show your incremental development effort in the form of version control history on github. You will create a private github repository and invite our compilersdi user as a collaborator.

You are expected to show multiple commits for every day you worked on the project and the commits are expected to be consistent with normal development effort, i.e., intermediate stages are likely to be executable, new commits are likely to remove/change existing code and not just add new code, etc. Don’t stress too much about forgetting to commit often on some day—the assignments will take you enough time that incremental effort will be easy to document, overall.

For submitting the assignment, you need to email the assistants with the commit hash of your assignment’s submission commit. NO SUBMISSION IS COMPLETE WITHOUT EMAILING A COMMIT HASH

HW 1

Part 1

For the first part of this homework you should implement a simple string expression evaluator. The evaluator should accept expressions with the ** and / operators, as well as parentheses.

The operator ** is left-associative and is described as follows: a**b=abb a \textrm{**} b = a \circ b \circ b

That is, it concatenates its second operand twice at the end of its first operand.

The operator / is right-associative and is described by the following formula: a/b={aif b is not a suffix of aqif a=qb a / b = \begin{cases} a & \text{if } b \text{ is not a suffix of } a \\ q & \text{if } a = q \circ b \end{cases}

Operator ** has higher priority than operator /.

The grammar is summarized in:

exp -> str | exp op exp | (exp)

op -> / | **

str -> char | char str

char -> a-z | A-Z

You need to change this grammar to support priority between the operators, to remove the left recursion for LL parsing, etc.

This part of the homework is divided in two tasks:

  1. For practice, you can write the FIRST+ & FOLLOW sets for the LL(1) version of the above grammar. In the end you will summarize them in a single lookahead table (include a row for every derivation in your final grammar). This part will not be graded.

  2. You have to write a recursive descent parser in Java that reads expressions and computes the values or prints “parse error” if there is a syntax error. You don’t need to identify blank spaces. You can read the symbols one-by-one (as in the C getchar() function). The expression must end with a newline or EOF.

Your parser should read its input from the standard input (e.g., via an InputStream on System.in) and write the computed values of expressions to the standard output (System.out). Parse errors should be reported on standard error (System.err).

Examples:

a**b**c     --> abbcc
aaaaa/aa/a  --> aaaa
(aaaa/aa)/a --> a
abb**cc/c   --> abbccc

Part 2

In the second part of this homework you will implement a parser and translator for a language supporting string operations. The language supports the concatenation (+) operator over strings, function definitions and calls, conditionals (if-else i.e, every “if” must be followed by an “else”), and the following logical expressions:

  • is-prefix-of (string1 prefix string2): Whether string1 is a prefix of string2.
  • is-suffix-of (string1 suffix string2): Whether string1 is a suffix of string2.

All values in the language are strings.

The precedence of the operator expressions is defined as: precedence(if) < precedence(concat).

Your parser, based on a context-free grammar, will translate the input language into Java. You will use JavaCUP for the generation of the parser combined either with a hand-written lexer or a generated-one (e.g., using JFlex, which is encouraged).

You will infer the desired syntax of the input and output languages from the examples below. The output language is a subset of Java so it can be compiled using javac and executed using Java or online Java compilers like this, if you want to test your output.

There is no need to perform type checking for the argument types or a check for the number of function arguments. You can assume that the program input will always be semantically correct.

Note that each file of Java source code you produce must have the same name as the public Java class in it. For your own convenience you can name the public class “Main” and the generated files “Main.java”. In order to compile a file named Main.java you need to execute the command: javac Main.java. In order to execute the produced Main.class file you need to execute: java Main.

To execute the program successfully, the “Main” class of your Java program must have a method with the following signature: public static void main(String[] args), which will be the main method of your program, containing all the translated statements of the input program. Moreover, for each function declaration of the input program, the translated Java program must contain an equivalent static method of the same name. Finally, keep in mind that in the input language the function declarations must precede all statements.

As with the first part of this assignment, you should accept input programs from stdin and print output Java programs to stdout.

Example #1

Input:

name()  {
    "John"
}

surname() {
    "Doe"
}

fullname(first_name, sep, last_name) {
    first_name + sep + last_name
}

name()
surname()
fullname(name(), " ", surname())

Output (Java):

public class Main {
    public static void main(String[] args) {
        System.out.println(name());
        System.out.println(surname());
        System.out.println(fullname(name(), " ", surname()));
    }
    
    public static String name() {
        return "John";
    }
    
    public static String surname() {
        return "Doe";
    }
    
    public static String fullname(String first_name, String sep, String last_name) {
        return first_name + sep + last_name;
    }
}

Example #2

Input:

name() {
    "John"
}

repeat(x) {
    x + x
}

cond_repeat(c, x) {
    if (c prefix "yes")
        if("yes" prefix c)
            repeat(x)
        else
            x
    else
        x
}

cond_repeat("yes", name())
cond_repeat("no", "Jane")

Example #3

Input:

findLangType(langName) {
    if ("Java" prefix langName)
        if(langName prefix "Java")
            "Static"
        else
            if("script" suffix langName)
                "Dynamic"
            else
                "Unknown"
    else
        if ("script" suffix langName)
            "Probably Dynamic"
        else
            "Unknown"
}

findLangType("Java")
findLangType("Javascript")
findLangType("Typescript")