Compiler-compiler
In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a language and machine. The earliest and still most common form of compiler-compiler is a parser generator, whose input is a grammar (usually in BNF) of a programming language, and whose generated output is the source code of a parser often used as a component of a compiler.
The ideal compiler-compiler takes a description of a programming language and a target instruction set architecture, and automatically generates a usable compiler from them. In practice, the state of the art has yet to reach this degree of sophistication and most compiler generators are not capable of handling semantic or target architecture information.
Variants
A typical parser generator associates executable code with each of the rules of the grammar that should be executed when these rules are applied by the parser. These pieces of code are sometimes referred to as semantic action routines since they define the semantics of the syntactic structure that is analyzed by the parser. Depending upon the type of parser that should be generated, these routines may construct a parse tree (or abstract syntax tree), or generate executable code directly.
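As an illustrative sketch only, the following Python fragment shows how semantic actions attach to grammar rules. It is hand-written rather than generated and does not reproduce the output of any particular tool. The parser is a small recursive-descent parser for arithmetic expressions, and every name in it (Parser, AST_ACTIONS, EVAL_ACTIONS, parse) is hypothetical. The same grammar is driven by two interchangeable action tables: one constructs an abstract syntax tree, the other computes a value directly while parsing.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+|[-+*()])")
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Hypothetical action tables: one callable per kind of grammar rule.
AST_ACTIONS = {
    "number": lambda text: ("num", int(text)),
    "binop": lambda op, left, right: (op, left, right),
}
EVAL_ACTIONS = {
    "number": lambda text: int(text),
    "binop": lambda op, left, right: OPS[op](left, right),
}


def tokenize(text):
    """Split the input into number, operator and parenthesis tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """Grammar:
        expr   -> term (('+' | '-') term)*
        term   -> factor ('*' factor)*
        factor -> NUMBER | '(' expr ')'
    """

    def __init__(self, tokens, actions):
        self.tokens, self.pos, self.actions = tokens, 0, actions

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expr(self):
        left = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            # Semantic action runs each time the rule is applied.
            left = self.actions["binop"](op, left, self.term())
        return left

    def term(self):
        left = self.factor()
        while self.peek() == "*":
            op = self.advance()
            left = self.actions["binop"](op, left, self.factor())
        return left

    def factor(self):
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok.isdigit():
            return self.actions["number"](tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text, actions):
    parser = Parser(tokenize(text), actions)
    result = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result
```

Calling parse("1 + 2 * 3", AST_ACTIONS) builds a nested tuple representing the tree, while parse("1 + 2 * 3", EVAL_ACTIONS) computes the value during parsing without building any tree. A generated parser would typically embed such actions next to the grammar rules in the tool's input file rather than in a separate table.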
META II (1964), one of the earliest compiler-compilers and a surprisingly powerful one, accepts grammars and code generation rules and is able to compile itself and other languages.
Some experimental compiler-compilers take as input a formal description of programming language semantics, typically using denotational semantics. This approach is often called 'semantics-based compiling', and was pioneered by Peter Mosses' Semantic Implementation System (SIS) in 1978.[1] However, both the generated compiler and the code it produced were inefficient in time and space. No production compilers are currently built in this way, but research continues.
The Production Quality Compiler-Compiler project at Carnegie-Mellon University does not formalize semantics, but does have a semi-formal framework for machine description.
Compiler-compilers exist in many flavors, including bottom-up rewrite machine generators (see JBurg), which tile syntax trees according to a rewrite grammar for code generation, and attribute grammar parser generators (e.g. ANTLR, which can be used for simultaneous type checking, constant propagation, and more during the parsing stage).
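The idea of tiling a syntax tree with rewrite rules can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how JBurg or any other generator works internally: the rule set, costs and instruction names (li, add, addi, mul) are invented for the example, there is a single nonterminal "reg", and the cheapest cover is found by plain recursion rather than by the precomputed tables that a real bottom-up rewrite generator would emit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    op: str
    kids: tuple = ()
    value: Optional[int] = None


# Hypothetical rewrite rules: (tree pattern, cost, instruction name).
# The string "reg" in a pattern stands for any subtree reduced to a register.
RULES = [
    (("CONST",), 1, "li"),
    (("ADD", "reg", "reg"), 1, "add"),
    (("ADD", "reg", ("CONST",)), 1, "addi"),  # add with immediate operand
    (("MUL", "reg", "reg"), 3, "mul"),
]


def match(pattern, node):
    """Return the subtrees bound to 'reg' leaves, or None if no match."""
    if pattern == "reg":
        return [node]
    op, *kid_patterns = pattern
    if op != node.op or len(kid_patterns) != len(node.kids):
        return None
    bound = []
    for kid_pattern, kid in zip(kid_patterns, node.kids):
        sub = match(kid_pattern, kid)
        if sub is None:
            return None
        bound.extend(sub)
    return bound


def best_cover(node):
    """Return (cost, instruction names in emission order) of the cheapest tiling."""
    best = None
    for pattern, cost, name in RULES:
        bound = match(pattern, node)
        if bound is None:
            continue
        total, code = cost, []
        for sub in bound:  # operands are covered before the rule's own instruction
            sub_cost, sub_code = best_cover(sub)
            total += sub_cost
            code += sub_code
        code.append(name)
        if best is None or total < best[0]:
            best = (total, code)
    if best is None:
        raise ValueError(f"no rule covers {node.op}")
    return best
```

For the tree ADD(MUL(CONST 2, CONST 3), CONST 4), the search prefers the addi tile, which absorbs the constant 4 into the instruction, over a separate li followed by add, because the combined tile has the lower total cost.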
History
The first compiler-compiler to use that name was written by Tony Brooker in 1960 and was used to create compilers for the Atlas computer at the University of Manchester, including the Atlas Autocode compiler. However, it was rather different from modern compiler-compilers, and today would probably be described as being somewhere between a highly customisable generic compiler and an extensible-syntax language. The name 'compiler-compiler' was far more appropriate for Brooker's system than it is for most modern compiler-compilers, which are more accurately described as parser generators. The name almost certainly entered common use because of Yacc rather than because Brooker's work was remembered.
Other examples of parser generators in the Yacc vein are ANTLR, Coco/R, CUP, GNU Bison, Eli, FSL, SableCC, SID (Syntax Improving Device) and JavaCC. While useful, pure parser generators address only the parsing part of the problem of building a compiler. Tools with broader scope, such as PQCC, Coco/R and DMS Software Reengineering Toolkit, provide considerable support for more difficult post-parsing activities such as semantic analysis, code optimization and generation.
Several compiler-compilers
- ANTLR
- Bison
- Coco/R
- DMS Software Reengineering Toolkit, a program transformation system with parser generators.
- ELI, an integrated toolset for compiler construction.
- Grako, a Python EBNF-to-PEG parser generator.
- Lemon
- META II
- parboiled, a Java library for building parsers.
- Packrat parser
- PackCC, a packrat parser with left recursion support.
- PQCC, a compiler-compiler that is more than a parser generator.
- SID, Syntax Improving Device by J. M. Foster.
- SYNTAX, an integrated toolset for compiler construction.
- TREEMETA
- Yacc
- XPL
- JavaCC
Notes
- Peter Mosses, "SIS: A Compiler-Generator System Using Denotational Semantics," Report 78-4-3, Dept. of Computer Science, University of Aarhus, Denmark, June 1978
References
This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
Further reading
- Brooker, R. A.; MacCallum, I. R.; Morris, D.; Rohl, J. S. (1963), "The compiler-compiler", Annual Review in Automatic Programming, 3: 229–275
- Brooker, R. A., Morris, D. and Rohl, J. S., Experience with the Compiler Compiler, Computer Journal, Vol. 9, p. 350. (February 1967).
- Johnson, Stephen C., Yacc—yet another compiler-compiler, Computer Science Technical Report 32, Bell Laboratories, Murray Hill, NJ, July 1975
- McKeeman, William M.; Horning, James J.; Wortman, David B. (1970). A Compiler Generator. Englewood Cliffs, N.J.: Prentice-Hall. ISBN 0-13-155077-2. Retrieved 13 December 2012.
External links
- Computer50.org, Brooker Autocodes
- Catalog.compilertools.net, The Catalog of Compiler Construction Tools
- Labraj.uni-mb.si, Lisa
- Skenz.it, Jflex and Cup resources (Italian)
- Gentle.compilertools.net, The Gentle Compiler Construction System
- Accent.compilertools.net, Accent: a Compiler for the Entire Class of Context-Free Languages
- Grammatica.percederberg.net, an open-source parser-generator for .NET and Java