Writing a Parser in Haskell
It is common to use a family of libraries known as parser combinators for parsing in Haskell. These parser combinators permit us to compose higher-order functions to generate parsers. They are a particularly expressive pattern that allows programmers to prototype language grammars in a small embedded domain language in Haskell. Most significantly, it is possible to embed custom Haskell logic inside of the parser. A parser is structurally a function that takes a stream of characters as input and yields a parse tree. Parsing takes a list of tokens and transforms it into an abstract syntax tree. The syntax tree has parameters as children and some form of meaning as its node.
In this article, our experts will explain how parsing is done in Haskell using monads. They have assumed that a list of tokens can be generated from a lexical analysis.
Parsec is a library that defines a set of common combinators. The table below shows the combinators and their description:
|String||It matches the given string|
|Char||It matches the given character|
|Many||This combinator matches an arbitrary number of patterns, matches the patterns and returns them as a list|
|Many1||It is just like the many combinator but it requires at least one match|
|<\>||It can be chained sequentially to generate a sequence of options. It is known as the choice operator. It tries to parse the first argument before proceeding to the second|
|SepBy||It matches an arbitrary length sequence of patterns. It is delimited by a given pattern|
|parens||It is used to parse the given pattern surrounded by parenthesis|
|Optional||It parses a given pattern optionally, returning its value as maybe|
|Try||The backtracking operator will allow us to parse ambiguous matching expressions and restart with a different pattern|
A tree is usually represented by its nodes and operands. The nodes represent the operation while the operands as its children. For example, in the expression a+b+c, ‘+’ is the root and identifier, a is the left child, and the other ‘+’ subtree is the right child. Now let us discuss the simple definition of tree nodes for the operations ‘/’, ‘*’, ‘+’, and ‘-‘. In arithmetic, division and multiplication have priority over subtraction and addition. Also, brackets have priority over multiplication and division. As a result, the expressions in brackets are always a subtree of division and multiplication. As a result, since we have three levels of priority, we will have three types of nodes. In addition and subtraction, expressions should have the operator + or – and two operands.
Suppose we want to parse somehow a list of tokens. The question will be what should the operation return? The operation should return a given tree node and the rest of tokens which are not parsed. This can be modeled as a state monad where:
- State is the list of tokens
- Output is the resulting tree node or something else
The result of parsing is something that returns results and a list of remaining tokens to be passed or some kind of error. Do you need a code example of monadic parsing? Avail our Haskell programming assignment help.
How is Parsing done?
Implementing parsing is easy once you are familiar with the definition of nodes. When we define two functions:
- GetToken -One which represents computation. It returns the current token as result and doesn’t change the state of tokens (the token remains in the list of tokens)
- Expect – This function returns parsing error if the given token doesn’t exist in the list. It will remove that token from the list if the token exists. This function can also be defined by other names. For example, in other articles or tutorials, the experts may call it “eat function” because it eats token if it exists.
Additionally, we will have some function which always returns an error with a given message for producing an output during parsing.
We can easily do parsing from definitions of tree nodes. Take an example of factor parsing where:
- we start by getting the current token and checking various possibilities
- Then, we handle these possibilities
- We output and error if nothing is satisfied
Also, we can use the expression method to easily parse an expression inside the brackets.
Do you want to learn more about parsing? Hire our programming experts today.