ParseIT: A Question-Answer based Tool to Learn Parsing Techniques



Parsing (also called syntax analysis) techniques cover a substantial portion of any undergraduate Compiler Design course. We present ParseIT, a tool to help students understand the parsing techniques through question-answering. ParseIT automates the generation of tutorial questions based on the Context Free Grammar provided by the student and generates feedback for the student solutions. The tool generates multiple-choice questions (MCQs) and fill in the blank type questions, and evaluates students’ attempts. It provides hints for incorrect attempts, again in terms of MCQs. The hints questions are generated for any correct choice that is missed or any incorrect choice that is selected. Another interesting form of hint generated is an input string that helps the students identify incorrectly filled cells of a parsing table. We also present results of a user study conducted to measure the effectiveness of ParseIT.


Amey Karkare
Indian Institute of Technology Kanpur
Kanpur, UP, India
Nimisha Agarwal
Indian Institute of Technology Kanpur
Kanpur, UP, India



Intelligent Tutoring; Education; Programming; Compilers; E-Learning

Compiler design is an important subject in the undergraduate computer science curriculum [?]. Compilers are one of the success stories of computer science, where sound theoretical concepts (automata, grammars, graph theory, lattice theory, etc.) are backed by practical implementations (lexical analyzers, parsers, code optimizers, etc.) to solve the real-world problem of fast and resource-efficient compilation. Most existing compiler courses [?, ?, ?, ?] divide the curriculum into modules corresponding to the phases of compilation. Instructors discuss the theory in lectures while students typically work on a semester-long project implementing a compiler for some small language.

In a typical course, about 15%-22% of the total time is spent on the syntax analysis phase (also called parsing; see Table 1). A number of concepts are introduced to explain the internals of parsers, for example FIRST sets, FOLLOW sets, item sets, GOTO and CLOSURE sets, parse tables, and the parsing algorithms [?], making the topic hard to master. While parser generators (YACC and its variants) allow students to experiment with grammars, the working of the parser generated by these tools remains opaque. (The generated parsers do produce debugging information when used with appropriate options, but this is of little didactic value, as one needs to know the parsing algorithms to understand it.)

Institute     Course Name               Parsing               Total                   %
Stanford      Intro. to Compilers [?]   4 lectures            18 lectures             22%
IIT Kanpur    Compiler Design [?]       6 lectures            35 lectures             17%
Coursera      Compilers [?]             4 modules (4 hours)   18 modules (19 hours)   22% (21%)
Saylor        Compilers [?]             28 hours              146 hours               19%

Table 1: Time Spent on Teaching Parsing.

Recent developments in technology have enabled institutions to offer courses to a large number of students. These massive open online courses (MOOCs) [?, ?, ?] digitize the course content (lecture videos, notes, etc.) and allow students to access it beyond the physical boundaries of classrooms. The increase in the number of students has added tutoring challenges for the instructor, such as the creation of new problems for assignments, solving these problems, grading, and helping students master a concept through hands-on exercises. These challenges have prompted researchers to develop automated tutoring systems that help students explore a course according to their skills and learning speed [?, ?, ?, ?, ?].

In this paper, we present ParseIT, a tool for teaching parsing techniques. ParseIT helps students understand parsing concepts through automatically generated problems and hints. Problems are generated based on a Context Free Grammar (CFG) given as input, and the tool evaluates the solutions attempted by the user. If a solution is incorrect, the tool generates hint questions. The problems follow a general Multiple Choice Question (MCQ) pattern, where the user is given a problem with a set of possible choices, one or more of which are correct. A solution is incorrect when a correct option is not chosen, an incorrect option is chosen, or both. The hints are generated in the form of (simplified) questions that direct the student toward the correct solution. Hint generation uses several algorithms, of which the input string generation algorithm is notable: given an incorrect parse table filled in by the user, it creates an input string whose parse distinguishes the incorrect table from the correct one.

The rest of the paper is organized as follows. We first describe some of the systems developed by others for teaching compiler concepts, and then describe the tool itself, followed by the input string generation algorithms for LL and LR parsers. Finally, we present a summary of the user study and conclude.

Several efforts exist to automate the teaching of compiler phases and to help students develop a compiler as a course project. LISA [?] helps students learn compiler technology through animations and visualizations. The tool uses animations to explain the working of three phases of compilation: lexical analysis, syntax analysis, and semantic analysis. Lexical analysis is taught using animations of DFAs; for syntax analysis, animations show the construction of syntax trees; and for semantic analysis, animations show the visits to nodes of the semantic tree and the evaluation of attributes. Students understand the working of the phases by modifying the specification and observing the corresponding changes in the animation.

Lorenzo et al. [?] present a system for test-case based automated evaluation of compiler projects. Test cases (inputs and corresponding desired outputs) designed by the instructor are given as input to students' compilers. The tool then assesses each compiler in three distinct steps: compilation, execution, and correction. The system automatically generates different reports (for instructors and students) by analyzing the logs generated at each of these steps.

Demaille et al. [?, ?] introduce several tools to improve the teaching of compiler construction projects and make it relevant to the core curriculum. They modified Bison [?] to provide detailed textual and graphical descriptions of the LALR automata, to allow the use of named symbols in actions (instead of $1, $2, etc.), and to use Generalized LR (GLR) as the backend. Waite [?] proposed three strategies for teaching compilers: as a software project, as an application of theory, and as support for communicating with a computer. Various other tools are available to teach different phases of a compiler, such as understanding code generation [?] and understanding symbol tables through animations [?].

Our work is different in that we use question-answering as a means to explain the working of parsing technology and to guide students toward the construction of a correct parse table.

ParseIT takes as input a context free grammar and uses it as a basis for generating questions. These questions are in the form of MCQs and deal with various concepts related to parsing. (MCQs have their advantages as well as disadvantages [?]; we chose MCQs because it is easier for the system to evaluate a student's choices than free-form text answers.) The normal workflow involves the following steps:

  1. The user provides an input grammar and the choice of topic. The topics refer to the concepts related to parsing such as FIRST set, FOLLOW set, LL Parsing Table, LL Parsing Moves, LR(0) Item-sets, SLR Parsing Table, SLR Parsing Moves, etc.

  2. A primary multiple choice question is generated based on the above two pieces of information.

  3. If the user answers the problem incorrectly, then hints, themselves in the form of questions, are generated for it.

  4. When a correct solution to the problem is received, another question for the same topic is generated and presented to the user.

In the preprocessing step, the system takes a grammar as input and generates the information required for correct solutions. In particular, the tool generates the FIRST set and the FOLLOW set for all non-terminals, LL Parsing Table, LR(0) items, canonical set of items for SLR parser, and SLR parsing table.
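These sets are computed by standard fixed-point iteration. As an illustration, the FIRST-set computation can be sketched as follows; the grammar encoding and function names are ours, not ParseIT's internals:

```python
# A sketch of the fixed-point computation of FIRST sets. A grammar is a
# dict mapping each non-terminal to its list of productions; a production
# is a tuple of symbols, with () denoting epsilon.
EPS = "eps"

def first_sets(grammar, terminals):
    first = {x: set() for x in grammar}
    first.update({t: {t} for t in terminals})
    changed = True
    while changed:                       # iterate until no set grows
        changed = False
        for lhs, prods in grammar.items():
            for prod in prods:
                nullable = True          # is the prefix seen so far nullable?
                for sym in prod:
                    add = first[sym] - {EPS}
                    if not add <= first[lhs]:
                        first[lhs] |= add
                        changed = True
                    if EPS not in first[sym]:
                        nullable = False
                        break
                if nullable and EPS not in first[lhs]:
                    first[lhs].add(EPS)  # whole right-hand side is nullable
                    changed = True
    return first

# Classic example: E -> T E',  E' -> + T E' | eps,  T -> id
grammar = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("id",)],
}
fs = first_sets(grammar, {"+", "id"})
print(sorted(fs["E"]))   # ['id']
print(sorted(fs["E'"]))  # ['+', 'eps']
```

The FOLLOW sets, CLOSURE, and GOTO computations follow the same fixed-point pattern over the precomputed grammar.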

For the primary problem on the selected topic, ParseIT uses these data structures to form MCQs having multiple correct answers. Users have to select all valid options, and no invalid option, for the answer to be deemed correct. The options, too, are generated using the preprocessed data.
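One plausible way to assemble such an MCQ from the precomputed data is to take all correct options and pad them with distractors drawn from the remaining grammar symbols. The option-selection policy below is our illustration; the paper does not specify ParseIT's exact policy:

```python
import random

def make_mcq(question, correct, universe, k=4, seed=0):
    """Build an MCQ: every correct option plus up to k distractors
    drawn from the remaining grammar symbols, shuffled together.
    (Illustrative sketch, not ParseIT's actual option generator.)"""
    rng = random.Random(seed)
    distractors = sorted(set(universe) - set(correct))
    options = sorted(correct) + rng.sample(distractors, min(k, len(distractors)))
    rng.shuffle(options)
    return {"question": question, "options": options, "answer": set(correct)}

q = make_mcq("Which symbols are in FIRST(E)?",
             correct={"(", "id"},
             universe={"(", ")", "id", "+", "*", "$"})
print(q["options"])
```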

In the answer evaluation step, the solution given by the user is compared with the solution computed by the tool in the preprocessing step. If the solutions match, control transfers back to the primary problem generation step to generate the next question. If the solution is wrong, the tool collects: a) the incorrect options that are selected, and b) the correct options that are not selected, and passes them to the hint generation step. (In the rest of the paper, unless specified otherwise, we use the term incorrect choice for both types of mistakes, i.e., a missing valid choice and a selected invalid choice.)
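The classification of an attempt into missing and spurious choices amounts to simple set operations; a minimal sketch (the names are ours):

```python
def evaluate_attempt(correct, selected):
    """Compare a student's selected options against the correct set.
    Returns (missing, spurious): correct options not selected, and
    incorrect options selected. Both empty means the attempt is right."""
    correct, selected = set(correct), set(selected)
    missing = correct - selected
    spurious = selected - correct
    return missing, spurious

# Example: the correct options are {')', '$'}; the student picked {'$', '+'}.
missing, spurious = evaluate_attempt({")", "$"}, {"$", "+"})
print(sorted(missing), sorted(spurious))  # [')'] ['+']
```

Both resulting sets feed the hint generation step, one hint question per element.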

For hints, ParseIT generates multiple hint questions for each incorrect choice. These are MCQs with a single correct choice, and they help the user revise the concepts needed to arrive at the correct solution to the primary question.

Parsing techniques require solving three main types of problems: a) computation of sets of elements, for example FIRST, FOLLOW, LR items, GOTO, and CLOSURE; b) computation of entries in a parse table; and c) the steps of a parser on a given input string.
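For questions of the third kind, the moves follow the standard table-driven LL(1) algorithm; a compact sketch under an illustrative grammar (the table encoding and symbol names are ours):

```python
def ll1_parse(table, start, tokens):
    """Table-driven LL(1) parse. table[(A, a)] gives the production
    (a tuple of symbols) used to expand non-terminal A on lookahead a.
    Returns the list of moves; raises KeyError on a missing table entry."""
    stack = ["$", start]
    tokens = list(tokens) + ["$"]
    moves, i = [], 0
    while stack:
        top, look = stack.pop(), tokens[i]
        if top == look:                      # match a terminal (or $)
            moves.append(f"match {look}")
            i += 1
        else:                                # expand via the parse table
            prod = table[(top, look)]
            moves.append(f"{top} -> {' '.join(prod) or 'eps'}")
            stack.extend(reversed(prod))
    return moves

# Illustrative grammar: S -> ( S ) | id, with its LL(1) table.
table = {
    ("S", "("):  ("(", "S", ")"),
    ("S", "id"): ("id",),
}
for m in ll1_parse(table, "S", ["(", "id", ")"]):
    print(m)
```

Each move in the returned trace corresponds to one step a student would be quizzed on.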

Since all the sets and tables are computed by ParseIT in the preprocessing step, the generation of questions is straightforward. The details are given in a technical report [?].
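The idea behind the distinguishing input string can be conveyed with a simple (if naive) realization: simulate the LL(1) parser with both the correct table and the student's table on candidate strings, and report one on which the two disagree. The tables, candidate list, and names below are our illustration, not the algorithm from the technical report:

```python
def parses(table, start, tokens):
    """True iff the table-driven LL(1) parse with `table` accepts `tokens`.
    Assumes the table does not cause unbounded expansion on the candidates."""
    stack, toks, i = ["$", start], list(tokens) + ["$"], 0
    while stack:
        top = stack.pop()
        if top == toks[i]:
            i += 1                                # matched terminal (or $)
        elif (top, toks[i]) in table:
            stack.extend(reversed(table[(top, toks[i])]))
        else:
            return False                          # table miss: reject
    return i == len(toks)

def distinguishing_input(good, bad, start, candidates):
    """Return a candidate string on which the correct table `good` and
    the student's table `bad` disagree, or None if none is found."""
    for s in candidates:
        if parses(good, start, s) != parses(bad, start, s):
            return s
    return None

# Grammar S -> ( S ) | id; the student's table drops the closing ')'.
good = {("S", "("): ("(", "S", ")"), ("S", "id"): ("id",)}
bad = dict(good)
bad[("S", "(")] = ("(", "S")
cands = [["id"], ["(", "id", ")"], ["(", "(", "id", ")", ")"]]
print(distinguishing_input(good, bad, "S", cands))  # ['(', 'id', ')']
```

Showing the student such a string pinpoints which parse-table cell is wrong, since the two parses first diverge exactly where the tables differ.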

To verify the effectiveness of ParseIT, we implemented the tool in Java. The prototype implementation is available as a JAR file from an anonymous Dropbox link [?]. A web interface was created for the user study.

The user study was conducted with 16 students who had already taken an introductory course on Compiler Design. We used two grammars and created 22 questions of 1 mark each, related to various sub-topics in parsing. The question papers are code-named P1 and P2. The students were randomly divided into 4 groups of 4 students each, G1–G4.

Each group solved one question paper using ParseIT and the other using pen and paper (offline mode). To keep the two approaches comparable, we provided each student with a cheat sheet containing the required rules for the offline mode. Further, the order of the ParseIT and offline modes was alternated. In particular, the groups solved the papers in the following order:

G1: P2 using offline followed by P1 using ParseIT
G2: P1 using ParseIT followed by P2 using offline
G3: P2 using ParseIT followed by P1 using offline
G4: P1 using offline followed by P2 using ParseIT

The students were asked to fill a survey about the effectiveness of ParseIT after solving both the papers.

Fig. 1 shows the average marks for the groups, while Fig. 2 shows average marks for individuals with and without ParseIT. Comparing the average marks across sessions, we found that the average for G4 remained unchanged, while for G1 it dropped by 0.5; for both G2 and G3, the average increased by 1. If we count an answer as correct when it is corrected after a hint in ParseIT mode, most students could get nearly full marks across the groups (the average over all students improved to 21.75, from 18.50 for ParseIT without hints and 19.18 for offline). The biggest improvement was 7 marks, achieved by 3 students.

Even though the data set is small, it suggests that the online platform by itself does not make a big difference in the understanding of parsing concepts; rather, it is the hint mechanism that leads to the improvement in marks. The hints allow students to correct their mistakes early. They also make it easy to identify the sources of student confusion, which can help the instructor. The post-study survey corroborated our inference: 15 students agreed that the hints provided a better understanding of parsing and helped them reach the correct solution. One student gave negative feedback, commenting that “Hints are produced as questions, which increases confusion as the user has already answered it wrong.” However, generating hints in other forms (say, natural language sentences) is an area of future long-term research.

In this paper, we described ParseIT, a tool for teaching parsing techniques. Our approach is based on question-answering: problems are generated automatically and given to students to explain the working of a parser. Further, the hints provided by the tool are also in the form of targeted questions that help a student discover her mistake and revise the concept at the same time. ParseIT allows students to learn the techniques at their own pace and convenience. The user study shows that the interactive nature of ParseIT helps users learn from their own mistakes through experimentation, and reduces the burden on teachers and teaching assistants.

Similar tools exist to teach a few other phases of a compiler. In the future, we plan to integrate these tools with ParseIT and to develop new tools to automate tutoring for all the phases of a compiler. We also plan to build animations around these concepts to improve student experience and understanding. An interesting question, which will require a user study over a longer period, is whether the hints merely help students select the correct answer during an exam, or have a lasting learning effect. Our plan is to deploy ParseIT in a large Compilers class to understand its impact on learning.

  • 1 N. Agrawal. A tool for teaching parsing techniques. Master’s thesis, IIT Kanpur, 2015. karkare/MTP/2014-15/nimisha2015parsing.pdf.
  • 2 A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Pearson Education, Inc, 2006.
  • 3 R. Alur, L. D’Antoni, S. Gulwani, D. Kini, and M. Viswanathan. Automated grading of DFA constructions. In International Joint Conference on Artificial Intelligence, IJCAI, pages 1976–1982, 2013.
  • 4 GNU Bison.
  • 5 Coursera.
  • 6 Compilers.
  • 7 Introduction to Compilers.
  • 8 Computer Science Curricula 2013, December 2013.
  • 9 Compilers.
  • 10 Principles of Compiler Design. karkare/courses/2011/cs335.
  • 11 L. D’Antoni, D. Kini, R. Alur, S. Gulwani, M. Viswanathan, and B. Hartmann. How can automatic feedback help students construct automata? ACM Trans. Comput.-Hum. Interact., 22(2):9:1–9:24, 2015.
  • 12 A. Demaille. Making compiler construction projects relevant to core curriculums. In Innovation and Technology in Computer Science Education, ITiCSE, 2005.
  • 13 A. Demaille, R. Levillain, and B. Perrot. A set of tools to teach compiler construction. In Innovation and Technology in Computer Science Education, ITiCSE, 2008.
  • 14 edX.
  • 15 S. Gulwani. Example-based learning in computer- aided STEM education. Commun. ACM, 57(8):70–80, 2014.
  • 16 E. J. Lorenzo, J. Velez, and A. Peñas. A proposal for automatic evaluation in a compiler construction course. In Innovation and Technology in Computer Science Education, ITiCSE, pages 308–312, 2011.
  • 17 Computer Science Curricula 2013, Last accessed October 2016.
  • 18 M. Mernik and V. Zumer. An educational tool for teaching compiler construction. IEEE Trans. Education, 46(1):61–68, 2003.
  • 19 NPTEL: National Programme on Technology Enhanced Learning.
  • 20 ParseIT Implementation.
  • 21 R. Singh, S. Gulwani, and A. Solar-Lezama. Automated feedback generation for introductory programming assignments. In Programming Language Design and Implementation, PLDI, pages 15–26, 2013.
  • 22 T. Sondag, K. L. Pokorny, and H. Rajan. Frances: A tool for understanding code generation. In ACM Technical Symposium on Computing Science Education, SIGCSE, 2010.
  • 23 J. Urquiza-Fuentes, F. Manso, J. A. Velázquez-Iturbide, and M. Rubio-Sánchez. Improving compilers education through symbol tables animations. In Innovation and Technology in Computer Science Education, ITiCSE, 2011.
  • 24 W. M. Waite. The compiler course in today’s curriculum: Three strategies. SIGCSE Bull., 38(1):87–91, Mar. 2006.