Understanding Terminals
Terminals are leaf nodes in the syntax tree. For example, in the "textbook" grammar:
E => T E'
E' => + T E' | ε
T => F T'
T' => * F T' | ε
F => ( E ) | id
The syntax ... => ... is called a production. The left-hand side is the symbol to produce, and the right-hand side is what can produce it. A symbol that appears on the left-hand side of a production is called a non-terminal, and every symbol on the right-hand side that never appears on the left-hand side of any production is a terminal. (ε denotes the empty string and is not a symbol.)
In this example, +, *, (, ), and id are terminals. The other symbols are non-terminals.
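The classification rule above can be checked mechanically. The following is a minimal sketch in plain Rust that does not use teleparse at all; the helper names parse_productions and classify are hypothetical, and it assumes one production per line with whitespace-separated symbols, as in the grammar shown above.

```rust
use std::collections::BTreeSet;

/// Splits grammar text into (left-hand symbol, right-hand symbols) pairs.
/// Assumes one production per line, `=>` as the separator, and
/// whitespace-separated symbols (hypothetical helper, not part of teleparse).
fn parse_productions(grammar: &str) -> Vec<(String, Vec<String>)> {
    grammar
        .lines()
        .filter_map(|line| line.split_once("=>"))
        .map(|(lhs, rhs)| {
            let symbols: Vec<String> = rhs
                .split_whitespace()
                // `|` separates alternatives and `ε` is the empty string;
                // neither is a grammar symbol.
                .filter(|s| *s != "|" && *s != "ε")
                .map(String::from)
                .collect();
            (lhs.trim().to_string(), symbols)
        })
        .collect()
}

/// Returns (non_terminals, terminals) for the given grammar text.
fn classify(grammar: &str) -> (BTreeSet<String>, BTreeSet<String>) {
    let productions = parse_productions(grammar);
    // Every symbol on a left-hand side is a non-terminal.
    let non_terminals: BTreeSet<String> =
        productions.iter().map(|(lhs, _)| lhs.clone()).collect();
    // Every right-hand symbol that is never on a left-hand side is a terminal.
    let terminals: BTreeSet<String> = productions
        .iter()
        .flat_map(|(_, rhs)| rhs.iter())
        .filter(|s| !non_terminals.contains(*s))
        .cloned()
        .collect();
    (non_terminals, terminals)
}

const TEXTBOOK_GRAMMAR: &str = "\
E => T E'
E' => + T E' | ε
T => F T'
T' => * F T' | ε
F => ( E ) | id";

fn main() {
    let (non_terminals, terminals) = classify(TEXTBOOK_GRAMMAR);
    println!("non-terminals: {:?}", non_terminals);
    println!("terminals: {:?}", terminals);
}
```

Running this on the textbook grammar yields the five terminals listed above and the five non-terminals E, E', T, T', and F.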
The terminal structs derived by #[derive_lexicon] are the building blocks to define the syntax tree. For example:
use teleparse::prelude::*;
use teleparse::GrammarError;

#[derive_lexicon]
#[teleparse(ignore(r"\s+"), terminal_parse)]
pub enum TokenType {
    #[teleparse(regex(r"\w+"), terminal(Id))]
    Ident,
}

#[derive_syntax]
#[teleparse(root)]
#[derive(Debug, PartialEq)] // for assert_eq!
struct ThreeIdents(Id, Id, Id);

#[test]
fn main() -> Result<(), GrammarError> {
    let t = ThreeIdents::parse("a b c")?;
    assert_eq!(
        t,
        Some(ThreeIdents(
            Id::from_span(0..1),
            Id::from_span(2..3),
            Id::from_span(4..5),
        ))
    );

    let pizza = Id::parse("pizza")?;
    assert_eq!(pizza, Some(Id::from_span(0..5)));

    Ok(())
}
Here:
- We are generating an Id terminal to parse a token of type Ident, matching the regex r"\w+".
- We are creating a non-terminal ThreeIdents with the production ThreeIdents => Id Id Id.
  - More about #[derive_syntax] in later sections.
- We are also using the terminal_parse attribute to derive a parse method for the terminals for testing purposes.
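To see where the spans 0..1, 2..3, 4..5, and 0..5 in the assertions come from, it can help to tokenize the inputs by hand. The sketch below is plain Rust and is not how teleparse works internally; word_spans is a hypothetical helper, it assumes spans are byte offsets into the input (which coincide with character positions for ASCII text), and it approximates the regex \w with "alphanumeric or underscore".

```rust
use std::ops::Range;

/// Returns the byte range of every word token in `input`, skipping everything else.
/// `is_alphanumeric() || '_'` is a rough stand-in for the regex `\w`
/// (hypothetical helper, not part of teleparse).
fn word_spans(input: &str) -> Vec<Range<usize>> {
    let mut spans = Vec::new();
    let mut start: Option<usize> = None;
    for (pos, ch) in input.char_indices() {
        let is_word = ch.is_alphanumeric() || ch == '_';
        match (is_word, start) {
            // A word begins here.
            (true, None) => start = Some(pos),
            // A word just ended; record its span.
            (false, Some(s)) => {
                spans.push(s..pos);
                start = None;
            }
            // Still inside a word, or still between words.
            _ => {}
        }
    }
    // Close a word that runs to the end of the input.
    if let Some(s) = start {
        spans.push(s..input.len());
    }
    spans
}

fn main() {
    println!("{:?}", word_spans("a b c")); // [0..1, 2..3, 4..5]
    println!("{:?}", word_spans("pizza")); // [0..5]
}
```

Unlike a real lexer, this sketch silently treats any non-word character as a separator rather than reporting it as an invalid token; it is only meant to illustrate how each terminal maps to a range of the source text.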