
Data-dependent grammars [JMW10] support the declaration of constraints to restrict the applicability of a production. However, constraints in data-dependent grammars must be context-insensitive [JMW10, Lemma 4], and therefore cannot be used to describe languages with context-sensitive layout such as Haskell.

4.8 Chapter summary

We have presented a declarative mechanism for specifying layout-sensitive languages based on layout constraints in context-free grammars. We have developed a parser for these grammars based on SGLR. Our parser enforces constraints at parse time when possible but fully validates parse trees at disambiguation time. We have empirically shown that our parser is correct and the performance penalty is acceptable compared to layout-insensitive generalized parsing. While our parser implementation is based on a scannerless parser, the ideas presented in this chapter are applicable to parsers with separate lexers as well. We believe that this work will enable language implementors to specify the grammar of their layout-sensitive languages in a high-level, declarative way.

Our original motivation for this work was to develop a syntactically extensible variant of Haskell in the style of SugarJ, where regular programmers write syntactic language extensions. This requires a declarative and composable syntax formalism as provided by SDF, but supplemented with support for layout-sensitive languages. Based on the work presented in this chapter, we have been able to implement SugarHaskell, a syntactically extensible programming language based on Haskell, which we present in the following chapter.

5 A Framework for Library-based Language Extensibility

This chapter shares material with the HASKELL’12 paper “Layout-sensitive Language Extensibility with SugarHaskell” [ERRO12].

The core idea explored in this thesis is to use library-based language extensibility for flexible and principled domain abstraction. In Chapter 2 and Chapter 3, we investigated library-based language extensibility in SugarJ, an extensible programming language that supports domain-specific syntax, domain-specific static analyses, and domain-specific editor services. In this chapter, we generalize SugarJ to a framework for library-based language extensibility.

SugarJ is based on Java, in which application code is written by the user or generated by desugarings. However, the ideas behind SugarJ do not depend on Java. Instead, we hypothesize that library-based language extensibility can be made available for any programming language that has a notion of libraries.

To validate this claim, we have developed a framework for library-based language extensibility that can be instantiated for different base languages.

The framework is based on the SugarJ compiler, but abstracts over the Java-specific fragments of the compiler using an abstract class [Rie12]. The resulting compiler framework can be instantiated for different base languages. So far, we have instantiated the framework to build support for library-based language extensibility based on Java, Prolog, Fω, and Haskell.

In this chapter, we present the extensible programming language SugarHaskell that uses Haskell as a base language. SugarHaskell satisfies the same design goals as SugarJ: domain-specific syntax, domain-specific static analysis, domain-specific editor services, modular reasoning, implementation reuse, declarativity, composability, and uniform self-application. However, in contrast to Java and as discussed in the previous chapter, Haskell is a layout-sensitive programming language. SugarHaskell embraces the layout-sensitivity of Haskell and also supports layout-sensitive language extensions using the layout constraints introduced in the previous chapter. Building on our previous work on syntactic extensibility for Java, SugarHaskell integrates syntactic extensions as sugar libraries into Haskell’s module system. Syntax extensions in SugarHaskell can declare arbitrary context-free and layout-sensitive syntax. SugarHaskell modules are compiled into Haskell modules and further processed by a Haskell compiler. We provide an Eclipse-based IDE for SugarHaskell that is extensible through editor libraries and automatically provides syntax coloring for all syntax extensions imported into a module.

We have validated SugarHaskell with several case studies, including arrow notation (as implemented in GHC) and EBNF as a concise syntax for the declaration of algebraic data types with associated concrete syntax. EBNF declarations also show how to extend the extension mechanism itself: They introduce syntactic sugar for using the declared concrete syntax in other SugarHaskell modules.

5.1 Introduction

Many papers on Haskell programming propose some form of domain-specific syntax for Haskell. For instance, consider the following code excerpt from a paper about applicative functors [MP08]:

instance Traversable Tree where
  traverse f Leaf = [|Leaf|]
  traverse f (Node l x r) = [|Node (traverse f l) (f x) (traverse f r)|]

The idiom brackets [|...|] used in this listing are not supported by the actual Haskell compiler; rather, the paper explains that they are a shorthand notation for writing this more elaborate code:

instance Traversable Tree where
  traverse f Leaf = pure Leaf
  traverse f (Node l x r) = pure Node <*> (traverse f l) <*> (f x) <*> (traverse f r)

Such domain-specific syntax is quite common. Sometimes it is eventually supported by the compiler (such as do notation for monads); sometimes preprocessors are written to desugar the code to standard Haskell (such as the Strathclyde Haskell Enhancement preprocessor¹, which supports, among other notations, the idiom brackets mentioned above); and sometimes such notations are only used in papers but not in actual program texts. Extending a compiler or writing a preprocessor is not declarative and not modular, and independently developed compiler extensions or preprocessors are hard to compose.

¹ http://personal.cis.strath.ac.uk/conor.mcbride/pub/she
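The expanded instance above can be tried out in plain Haskell. The following is a minimal sketch, assuming a binary `Tree` type with the constructors `Leaf` and `Node` used in the excerpt (the excerpt itself does not show the type declaration). The `Functor` and `Foldable` instances are added only because current versions of the `base` library require them as superclasses of `Traversable`, and `halveEven` is a hypothetical effectful function introduced for illustration.

```haskell
module Main where

-- Assumed tree type; the excerpt only shows its constructors.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq)

-- Superclass instances required by Traversable in current base.
instance Functor Tree where
  fmap _ Leaf         = Leaf
  fmap g (Node l x r) = Node (fmap g l) (g x) (fmap g r)

instance Foldable Tree where
  -- In-order right fold: right subtree first, then the element, then the left subtree.
  foldr _ z Leaf         = z
  foldr g z (Node l x r) = foldr g (g x (foldr g z r)) l

-- The desugared form of the idiom-bracket instance.
instance Traversable Tree where
  traverse _ Leaf         = pure Leaf
  traverse f (Node l x r) = pure Node <*> traverse f l <*> f x <*> traverse f r

-- Hypothetical effectful function: succeeds only on even numbers.
halveEven :: Int -> Maybe Int
halveEven n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

main :: IO ()
main = do
  -- All elements are even, so the traversal succeeds.
  print (traverse halveEven (Node Leaf 2 (Node Leaf 4 Leaf)))
  -- One element is odd, so the whole traversal fails.
  print (traverse halveEven (Node Leaf 2 (Node Leaf 3 Leaf)))
```

The sketch uses `Maybe` as the applicative functor; any other `Applicative` instance would work with the same `traverse` definition.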


Another practical problem of syntactic language extension is that integrated development environments (IDEs) should know how to deal with the new syntax and provide domain-specific editor services, for example, for syntax coloring, auto completion, or reference resolving. IDEs can be extended, of course, but this again is not declarative, not modular, and does not support composition.

We propose a generic extension to Haskell, SugarHaskell, with which arbitrary syntax extensions can be defined, used, and composed as needed. In SugarHaskell, a syntactic extension is activated by importing a library which exports the syntax extension and defines a desugaring of the extension to SugarHaskell. Using SugarHaskell, we can realize the code example from above as follows:²

import Control.Applicative
import Control.Applicative.IdiomBrackets

instance Traversable Tree where
  traverse f Leaf = (|Leaf|)
  traverse f (Node l x r) = (|Node (traverse f l) (f x) (traverse f r)|)

The syntactic extension and its desugaring are defined in the library IdiomBrackets. By importing this library, the notation and its desugaring are activated within the remainder of the current module. When the SugarHaskell compiler is invoked, it desugars the brackets to the code using pure and <*> from above. Modules that do not import IdiomBrackets are not affected by the syntactic extension.
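To make the rewriting step concrete, the desugaring of idiom brackets can be sketched as a transformation on a toy expression tree. This is only an illustration in plain Haskell under stated assumptions: the `Expr` type, `desugar`, and `pretty` are hypothetical names introduced here, and the sketch does not reflect how the IdiomBrackets library itself defines its desugaring.

```haskell
module Main where

-- Hypothetical miniature expression syntax (not SugarHaskell's actual AST).
data Expr
  = Var String         -- identifiers and operators
  | App Expr Expr      -- function application
  | Idiom Expr [Expr]  -- idiom bracket (| f a1 ... an |)
  deriving (Show, Eq)

-- Rewrite (| f a1 ... an |) to pure f <*> a1 <*> ... <*> an, recursively.
desugar :: Expr -> Expr
desugar (Var x)        = Var x
desugar (App f a)      = App (desugar f) (desugar a)
desugar (Idiom f args) =
  foldl ap (App (Var "pure") (desugar f)) (map desugar args)
  where
    -- Left-nested application of the <*> operator.
    ap acc arg = App (App (Var "<*>") acc) arg

-- Small pretty printer so the result can be compared with the listings.
pretty :: Expr -> String
pretty (Var x)                     = x
pretty (App (App (Var "<*>") l) r) = pretty l ++ " <*> " ++ atom r
pretty (App f a)                   = pretty f ++ " " ++ atom a
pretty (Idiom f args)              = "(|" ++ unwords (map atom (f : args)) ++ "|)"

-- Parenthesize everything except plain identifiers.
atom :: Expr -> String
atom e@(Var _) = pretty e
atom e         = "(" ++ pretty e ++ ")"

-- The right-hand side of the Node case from the example.
nodeCase :: Expr
nodeCase =
  Idiom (Var "Node")
    [ App (App (Var "traverse") (Var "f")) (Var "l")
    , App (Var "f") (Var "x")
    , App (App (Var "traverse") (Var "f")) (Var "r")
    ]

main :: IO ()
main = do
  putStrLn (pretty (desugar (Idiom (Var "Leaf") [])))
  putStrLn (pretty (desugar nodeCase))
```

Running `main` prints the two right-hand sides in the same shape as the hand-expanded instance, which shows that the bracket notation carries no information beyond the function and its arguments.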

If more than one syntax extension is required in the same file, the extensions are composed by importing all of them. Conflicts can arise if the extensions overlap syntactically, but this is rare for real-world examples and can usually be disambiguated easily.

SugarHaskell also comes with an Eclipse-based development environment specifically tailored to support syntactic extensions. By importing the IdiomBrackets library, syntax coloring for the extended syntax is automatically provided. More advanced IDE services can be defined in and imported from editor libraries (see Chapter 3).

It makes a significant difference that the target of the desugaring is SugarHaskell and not Haskell, because this means that the syntax extension mechanism is itself syntactically extensible. We illustrate this issue with a case study that allows the definition of EBNF grammars in Haskell. Besides desugaring an EBNF grammar into an algebraic data type (the abstract syntax) and a Parsec parser (the concrete syntax), we generate yet another syntactic extension that enables using the concrete syntax in Haskell expressions and patterns directly.

² To avoid syntactic overlap with Template Haskell, we follow Strathclyde Haskell Enhancement and implement rounded idiom brackets.

SugarHaskell builds on our earlier work on SugarJ, a syntactically extensible version of Java presented in Chapter 2. The research contributions of this chapter are as follows:

• SugarHaskell demonstrates that flexible and principled domain abstraction is not confined to Java-based languages, but similar extensibility is feasible for other base languages, too.

• To create SugarHaskell, we developed a framework for library-based language extensibility that decouples the syntax-extension mechanism of SugarJ from the underlying programming language. To this end, we generalized the SugarJ compiler by creating an interface that abstracts over the base language. We describe the design of this interface and how we used it to implement SugarHaskell.

• Haskell presents a new technical challenge not present in Java: layout-sensitive parsing [Mar10, Sec. 2.7]. SugarHaskell allows the definition of layout-sensitive syntactic extensions and is, to the best of our knowledge, the first declaratively extensible parser for Haskell with layout-sensitive syntax. We validate the extensibility of our parser by developing a layout-sensitive language extension of Haskell, namely arrow notation [Pat01].

In addition to these research contributions, we believe that this work can also contribute very practically to the Haskell community. Haskell programmers often strive to express programs elegantly and concisely, using built-in features such as user-defined infix notation and layout-sensitive do notation. But since these built-in features are not always enough to express the desired syntax, Haskell compiler writers add language extensions to their compilers to support additional syntactic sugar. The Haskell community can benefit from SugarHaskell in two ways:

• SugarHaskell empowers ordinary library authors to provide appropriate notation for the use of their libraries without having to change a Haskell compiler.

• SugarHaskell assists language designers by providing a framework for prototyping and thoroughly experimenting with language extensions that affect Haskell’s syntax.