/*
* Copyright 2020 Parsley Contributors
*
* SPDX-License-Identifier: BSD-3-Clause
*/
package parsley.errors
/** This typeclass specifies how to construct a parser's error message
* as a value of a specified type.
*
* An instance of this trait is required when calling `parse`
* (or similar). By default, Parsley defines its own instance for
* `ErrorBuilder[String]` found in the `ErrorBuilder` companion object.
*
* To implement this trait, a number of methods must be defined,
* as well as the representation types for a variety of different components;
* the relation between the various methods is closely linked
* to the types that they both produce and consume. To only change
* the basics of formatting without having to define the entire instance,
* inherit from `DefaultErrorBuilder`: this will mean, however, that the
* representation types cannot be overridden.
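*
* As a hedged sketch of the inheritance route (not taken from the Parsley sources), the
* instance below inherits from `DefaultErrorBuilder` and overrides only `pos`. It assumes
* that `DefaultErrorBuilder` represents `Position` as a `String` and leaves `pos`
* overridable; `ShortPositions` and `shortPosBuilder` are hypothetical names:
* {{{
* import parsley.character.digit
* import parsley.errors.{DefaultErrorBuilder, ErrorBuilder, tokenextractors}
*
* object ShortPositions {
*     // Same token extractor as the default instance; only the position text differs.
*     implicit val shortPosBuilder: ErrorBuilder[String] =
*         new DefaultErrorBuilder with tokenextractors.TillNextWhitespace {
*             override val trimToParserDemand = true
*             // Assumption: `Position` is `String` in `DefaultErrorBuilder`.
*             override def pos(line: Int, col: Int): Position = line.toString + ":" + col.toString
*         }
*
*     // The implicit in lexical scope is picked up here in place of `ErrorBuilder.stringError`.
*     val result = digit.parse("a")
* }
* }}}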
*
* =How an Error is Structured=
* There are two kinds of error messages that are generated by Parsley:
* ''Specialized'' and ''Vanilla''. These are produced by different combinators
* and can be merged with other errors of the same type if both errors appear
* at the same offset. However, ''Specialized'' errors will take precedence
* over ''Vanilla'' errors if they appear at the same offset. The most
* common form of error is the ''Vanilla'' variant, which is generated by
* most combinators, except for some in [[combinator `errors.combinator`]].
*
* Both types of error share some common structure, namely:
*
* - The error preamble, which has the file and the position.
* - The content lines, the specifics of which differ between the two types of error.
* - The context lines, which have the surrounding lines of input for contextualisation.
*
* ==''Vanilla'' Errors==
* There are three kinds of content line found in a ''Vanilla'' error:
*
* 1. Unexpected info: this contains information about the kind of token that caused the error.
* 1. Expected info: this contains the information about what kinds of token could have avoided the error.
* 1. Reasons: these are the bespoke reasons that an error has occurred (as generated by [[combinator.ErrorMethods.explain `explain`]]).
*
* There can be at most one unexpected line, at most one expected line, and zero or more reasons.
* Both the unexpected and expected info are built up of ''error items'', each of which is one of:
* the end of input, a named token, or raw input taken from the parser definition. These can all be
* constructed separately.
*
* The overall structure of a ''Vanilla'' error is given in the following diagram:
* {{{
* ┌───────────────────────────────────────────────────────────────────────┐
* │ Vanilla Error │
* │ ┌────────────────┐◄──────── position │
* │ source │ │ │
* │ │ │ line col│ │
* │ ▼ │ │ ││ │
* │ ┌─────┐ │ ▼ ▼│ end of input │
* │ In foo.txt (line 1, column 5): │ │
* │ ┌─────────────────────┐ │ │
* │unexpected ─────►│ │ │ ┌───── expected │
* │ │ ┌──────────┐ ◄──────────┘ │ │
* │ unexpected end of input ▼ │
* │ ┌──────────────────────────────────────┐ │
* │ expected "(", "negate", digit, or letter │
* │ │ └──────┘ └───┘ └────┘ ◄────── named│
* │ │ ▲ └──────────┘ │ │
* │ │ │ │ │
* │ │ raw │ │
* │ └─────────────────┬───────────┘ │
* │ '-' is a binary operator │ │
* │ └──────────────────────┘ │ │
* │ ┌──────┐ ▲ │ │
* │ │>3+4- │ │ expected items │
* │ │ ^│ │ │
* │ └──────┘ └───────────────── reason │
* │ ▲ │
* │ │ │
* │ line info │
* └───────────────────────────────────────────────────────────────────────┘
* }}}
*
* ==''Specialized'' Errors==
* There is only one kind of content found in a ''Specialized'' error:
* a message. These are completely free-form, and are generated by the
* [[combinator.fail(caretWidth:Int,msg0:String,msgs:String*)* `fail`]] combinator, as well as its derived combinators.
* There can be one or more messages in a ''Specialized'' error.
*
* The overall structure of a ''Specialized'' error is given in the following diagram:
*
* {{{
* ┌───────────────────────────────────────────────────────────────────────┐
* │ Specialized Error │
* │ ┌────────────────┐◄──────── position │
* │ source │ │ │
* │ │ │ line col │
* │ ▼ │ │ │ │
* │ ┌─────┐ │ ▼ ▼ │
* │ In foo.txt (line 1, column 5): │
* │ │
* │ ┌───► something went wrong │
* │ │ │
* │ message ──┼───► it looks like a binary operator has no argument │
* │ │ │
* │ └───► '-' is a binary operator │
* │ ┌──────┐ │
* │ │>3+4- │ │
* │ │ ^│ │
* │ └──────┘ │
* │ ▲ │
* │ │ │
* │ line info │
* └───────────────────────────────────────────────────────────────────────┘
* }}}
*
* @tparam Err The final result type of the error message
* @since 3.0.0
* @group formatting
*
* @groupprio outer 0
* @groupname outer Top-Level Construction
* @groupdesc outer
* These methods help assemble the final products of the error messages. The `build` method will return the desired `Err` type,
* whereas `specializedError` and `vanillaError` both assemble an `ErrorInfoLines` that the `build` method can consume.
*
* @groupprio contextlines 10
* @groupname contextlines Contextual Input Lines
* @groupdesc contextlines
* These methods control how many lines of input surrounding the error are requested, and direct how these should be put
* together to form a `LineInfo`.
*
* @groupprio preamble 5
* @groupname preamble Error Preamble
* @groupdesc preamble
* These methods control the construction of the preamble of an error message, which is the position and source info.
* These are then consumed by `build` itself.
*
* @groupprio item 18
* @groupname item Error Items
* @groupdesc item
* These methods control how error items within ''Vanilla'' errors are constructed. These are either
* the end of input, a named label generated by the [[combinator.ErrorMethods.label `label`]] combinator,
* or a raw piece of input intrinsically associated with a combinator.
*
* @groupprio vanilla 16
* @groupname vanilla Vanilla-Specific Components
* @groupdesc vanilla
* These methods control the ''Vanilla''-specific error components, namely how expected error items
* should be combined, how to represent the unexpected line, and how to represent reasons generated from
* [[combinator.ErrorMethods.explain `explain`]].
*
* @groupprio spec 15
* @groupname spec Specialized-Specific Components
* @groupdesc spec
* These methods control the ''Specialized''-specific components, namely the construction of a bespoke
* error message.
*
* @groupprio shared 12
* @groupname shared Shared Components
* @groupdesc shared
* These methods control any components or structure shared by both types of messages. In particular,
* the representation of reasons and messages is shared, as well as how they are combined together
* to form a unified block of content lines.
*/
trait ErrorBuilder[+Err] {
/* This is the top level function, which finally compiles all the built
* sub-parts into a finished value of type `Err`.
*
* @param pos this is the representation of the position of the error in the input (see the [[pos `pos`]] method).
* @param source this is the representation of the filename (if it exists) (see the [[source `source`]] method).
* @param ctxs this is the representation of any additional contextual information in the error (see the [[nestContexts `nestContexts`]] method).
* @param lines this is the main body of the error message (see [[vanillaError `vanillaError`]] or [[specializedError `specializedError`]] methods).
* @return The final assembled error message
* @since 5.0.0
* @inheritdoc
*/
//def build(pos: Position, source: Source, ctxs: NestedContexts, lines: ErrorInfoLines): Err
/** This is the top level function, which finally compiles all the built
* sub-parts into a finished value of type `Err`.
*
* @param pos this is the representation of the position of the error in the input (see the [[pos `pos`]] method).
* @param source this is the representation of the filename, if it exists (see the [[source `source`]] method).
* @param lines this is the main body of the error message (see the [[vanillaError `vanillaError`]] or [[specializedError `specializedError`]] methods).
* @return the final assembled error message.
* @since 3.0.0
* @group outer
*/
def build(pos: Position, source: Source, lines: ErrorInfoLines): Err
/** The representation type of position information within the generated message.
*
* @since 3.0.0
* @group preamble
*/
type Position
/** The representation of the file information.
*
* @since 3.0.0
* @group preamble
*/
type Source
/* The representation of contextual information, including the file and additional
* context
* @since 5.0.0
* @group preamble
*/
//type Context
/** Formats a position into the representation type given by `Position`.
*
* @param line the line the error occurred at.
* @param col the column the error occurred at.
* @return a representation of the position.
* @since 3.0.0
* @group preamble
*/
def pos(line: Int, col: Int): Position
/** Formats the name of the file, if it exists, into the type given by `Source`.
*
* @param sourceName the source name of the file, if any.
* @since 3.0.0
* @group preamble
*/
def source(sourceName: Option[String]): Source
/* Formats any additional contextual information from the parser. This might,
* for instance, include function or class names.
*
* @param context the context information produced by the parser.
* @return a representation of the context.
* @since 5.0.0
* @group preamble
*/
//def contextualScope(context: String): Context
/* The representation of collapsed nested contextual information.
* @since 5.0.0
* @group preamble
*/
//type NestedContexts
/* Contextual information produced by [[contextualScope `contextualScope`]] is combined by this
* method into a single piece of information. This does not include information
* about the source file.
*
* @param contexts the nested contexts to be collapsed, most general first (produced by [[contextualScope `contextualScope`]]).
* @since 5.0.0
* @group preamble
*/
//def nestContexts(contexts: List[Context]): NestedContexts
/** The representation type of the main body within the error message.
* @since 3.0.0
* @group outer
*/
type ErrorInfoLines
/** Vanilla errors are those produced such that they have information about
* both `expected` and `unexpected` tokens. These are usually the default,
* and are not produced by `fail` (or any derivative) combinators.
*
* @param unexpected information about which token(s) caused the error (see the [[unexpected `unexpected`]] method).
* @param expected information about which token(s) would have avoided the error (see the [[expected `expected`]] method).
* @param reasons additional information about why the error occurred (see the [[combineMessages `combineMessages`]] method).
* @param line representation of the line of input that this error occurred on (see the [[lineInfo `lineInfo`]] method).
* @since 3.0.0
* @group outer
*/
def vanillaError(unexpected: UnexpectedLine, expected: ExpectedLine, reasons: Messages, line: LineInfo): ErrorInfoLines
/** Specialized errors are triggered by `fail` and any combinators that are
* implemented in terms of `fail`. These errors take precedence over
* the vanilla errors, and contain less, but more specialized, information.
*
* @param msgs information detailing the error (see the [[combineMessages `combineMessages`]] method).
* @param line representation of the line of input that this error occurred on (see the [[lineInfo `lineInfo`]] method).
* @since 3.0.0
* @group outer
*/
def specializedError(msgs: Messages, line: LineInfo): ErrorInfoLines
/** The representation of all the different possible tokens that could
* have prevented an error.
* @since 3.0.0
* @group vanilla
*/
type ExpectedItems
/** The representation of the combined reasons or failure messages from
* the parser.
* @since 3.0.0
* @group shared
*/
type Messages
/** Details how to combine the various expected items into a single
* representation.
*
* @param alts the possible items that could fix the error.
* @since 3.0.0
* @group vanilla
*/
def combineExpectedItems(alts: Set[Item]): ExpectedItems
/** Details how to combine any reasons or messages generated within a
* single error. Reasons are used by ''Vanilla'' errors and messages
* are used by ''Specialized'' errors.
*
* @param alts the messages to combine (see the [[message `message`]] or [[reason `reason`]] methods).
* @since 3.0.0
* @group shared
*/
def combineMessages(alts: Seq[Message]): Messages
/** The representation of the information regarding the problematic token.
* @since 3.0.0
* @group vanilla
*/
type UnexpectedLine
/** The representation of the information regarding the solving tokens.
* @since 3.0.0
* @group vanilla
*/
type ExpectedLine
/** The representation of a reason or a message generated by the parser.
* @since 3.0.0
* @group shared
*/
type Message
/** The representation of the line of input where the error occurred.
* @since 3.0.0
* @group contextlines
*/
type LineInfo
/** Describes how to handle the (potentially missing) information
* about what token(s) caused the error.
*
* @param item the `Item` that caused this error, if known.
* @since 3.0.0
* @group vanilla
*/
def unexpected(item: Option[Item]): UnexpectedLine
/** Describes how to handle the information about the tokens that
* could have avoided the error.
*
* @param alts the tokens that could have prevented the error (see the [[combineExpectedItems `combineExpectedItems`]] method).
* @since 3.0.0
* @group vanilla
*/
def expected(alts: ExpectedItems): ExpectedLine
/** Describes how to represent the reasons behind a parser failure.
* These reasons originate from the `explain` combinator.
*
* @param reason the reason produced by the parser.
* @since 3.0.0
* @group vanilla
*/
def reason(reason: String): Message
/** Describes how to represent the messages produced by the
* `fail` combinator (or any that are implemented using it).
*
* @param msg the message produced by the parser.
* @since 3.0.0
* @group spec
*/
def message(msg: String): Message
/** Describes how to represent the information about the line that
* the error occurred on.
*
* @param line the full line of input that produced this error message.
* @param linesBefore the lines of input just before the one that produced this message (up to [[numLinesBefore `numLinesBefore`]]).
* @param linesAfter the lines of input just after the one that produced this message (up to [[numLinesAfter `numLinesAfter`]]).
* @param lineNum the line number of the error message.
* @param errorPointsAt the offset into the line that the error points at.
* @param errorWidth how wide the caret in the message should be.
* @since 5.0.0
* @group contextlines
*/
def lineInfo(line: String, linesBefore: Seq[String], linesAfter: Seq[String], lineNum: Int, errorPointsAt: Int, errorWidth: Int): LineInfo
/** The number of lines of input to request before the line an error occurred on.
* @since 3.1.0
* @group contextlines
*/
val numLinesBefore: Int
/** The number of lines of input to request after the line an error occurred on.
* @since 3.1.0
* @group contextlines
*/
val numLinesAfter: Int
/** The base type of `Raw`, `Named` and `EndOfInput` that represents the individual items within the error.
* @since 3.0.0
* @group item
*/
type Item
/** This represents "raw" tokens, which are those without labels: they come
* directly from the input, or from the characters that the parser is trying to read.
* @since 3.0.0
* @group item
*/
type Raw <: Item
/** This represents "named" tokens, which have been provided with a label.
* @since 3.0.0
* @group item
*/
type Named <: Item
/** Represents the end of the input.
* @since 3.0.0
* @group item
*/
type EndOfInput <: Item
/** Formats a raw item generated by either the input string or an input-reading combinator without a label.
*
* @param item the raw, unprocessed input.
* @since 3.0.0
* @group item
*/
def raw(item: String): Raw
/** Formats a named item generated by a label.
*
* @param item the name given to the label.
* @since 3.0.0
* @group item
*/
def named(item: String): Named
/** Value that represents the end of the input in the error message.
* @since 3.0.0
* @group item
*/
val endOfInput: EndOfInput
/** Extracts an unexpected token from the remaining input.
*
* When a parser fails, by default an error reports an unexpected token of a specific width.
* This works well for some parsers, but often it is nice to have the illusion of a dedicated
* lexing pass: instead of reporting the next few characters as unexpected, an unexpected token
* can be reported instead. This can take many forms, for instance trimming the token to the
* next whitespace, only taking one character, or even trying to lex a token out of the stream.
*
* This method can be easily implemented by mixing in an appropriate ''token extractor'' from
* `parsley.errors.tokenextractors` into this builder.
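*
* For example, a builder that always reports a single character as the unexpected token
* could be sketched as follows. This assumes the `SingleChar` extractor provided by
* `parsley.errors.tokenextractors`; `singleCharBuilder` is a hypothetical name:
* {{{
* import parsley.errors.{DefaultErrorBuilder, ErrorBuilder, tokenextractors}
*
* // `SingleChar` supplies `unexpectedToken`, so nothing else needs to be defined here.
* val singleCharBuilder: ErrorBuilder[String] =
*     new DefaultErrorBuilder with tokenextractors.SingleChar
* }}}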
*
* @param cs the remaining input at point of failure (this is '''guaranteed to be non-empty''')
* @param amountOfInputParserWanted the input the parser tried to read when it failed
* (this is '''not''' guaranteed to be smaller than the length of `cs`, but is '''guaranteed to be greater than 0''')
* @param lexicalError was this error generated as part of "lexing", or in a wider parser (see [[parsley.errors.combinator$.markAsToken `markAsToken`]])
* @return a token extracted from `cs` that will be used as part of the unexpected message.
* @since 4.0.0
* @group item
*/
def unexpectedToken(cs: Iterable[Char], amountOfInputParserWanted: Int, lexicalError: Boolean): Token
}
/** Contains the default instance for the `ErrorBuilder` typeclass, which will be automatically available without import.
*
* @group formatting
*/
object ErrorBuilder {
// $COVERAGE-OFF$
/** The default error builder used by Parsley, which produces
* an error as a String. An instance of `DefaultErrorBuilder`.
*
* It ''currently'' uses `TillNextWhitespace` as the token extractor, with
* the extracted token trimmed to parser demand. This may change without notice.
*/
implicit val stringError: ErrorBuilder[String] = new DefaultErrorBuilder with tokenextractors.TillNextWhitespace {
override val trimToParserDemand = true
}
// $COVERAGE-ON$
}