Handling Comments (Extracted Tokens)

Comments are special tokens that have no syntactic meaning; they exist purely for human readers.

You may recall that you can skip unwanted patterns between tokens with #[teleparse(ignore(...))]. This works for comments as well, but sometimes you do want to parse the comments, for example in a transpiler that preserves comments in its output.

In this scenario, you can define a token type that is not used by any terminal. The lexer will still produce those tokens, but instead of passing them to the parser, they will be kept aside ("extracted"). You can query them later through a Parser object.

use teleparse::prelude::*;

#[derive_lexicon]
pub enum MyToken {
    // \*+ allows runs of stars, so comments ending in "**/" also match
    #[teleparse(regex(r"/\*([^\*]|(\*+[^\*/]))*\*+/"))]
    Comment,
}

fn main() {
    let input = "/* This is a comment */";
    // you can call `lexer` to use a standalone lexer without a Parser
    let mut lexer = MyToken::lexer(input).unwrap();
    // the lexer will not ignore comments
    assert_eq!(
        lexer.next(),
        (None, Some(Token::new(0..23, MyToken::Comment)))
    );
    // `should_extract` is true for extracted tokens: the lexer keeps
    // them aside instead of passing them to the Parser
    assert!(MyToken::Comment.should_extract());
}