/* SPDX-FileCopyrightText: © 2021 Parsley Contributors 
 * SPDX-License-Identifier: BSD-3-Clause
 */
package parsley.errors

/** This typeclass specifies how to format an error from a parser
  * as a specified type.
  *
  * An instance of this trait is required when calling `parse`
  * (or similar). By default, Parsley defines its own instance for
  * `ErrorBuilder[String]` found in the `ErrorBuilder` companion object.
  *
  * To implement this trait, a number of methods must be defined,
  * as well as the representation types for a variety of different components;
  * the relation between the various methods is closely linked
  * to the types that they both produce and consume. To only change
  * the basics of formatting without having to define the entire instance,
  * inherit from `DefaultErrorBuilder`: this will mean, however, that the
  * representation types cannot be overridden.
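  *
  * As a minimal sketch of the `DefaultErrorBuilder` route, the builder below changes only
  * how positions are printed. It assumes that `DefaultErrorBuilder` fixes `Position` to
  * `String` and that `TillNextWhitespace` is mixed in the same way as for the default
  * instance in the companion object; the name `compactBuilder` is hypothetical.
  * {{{
  * import parsley.errors.{DefaultErrorBuilder, ErrorBuilder, tokenextractors}
  *
  * // Hypothetical instance: prints positions as "line:col", everything else is inherited.
  * implicit val compactBuilder: ErrorBuilder[String] =
  *     new DefaultErrorBuilder with tokenextractors.TillNextWhitespace {
  *         override val trimToParserDemand = true
  *         // Assumption: `Position` is `String` in `DefaultErrorBuilder`.
  *         override def pos(line: Int, col: Int): Position = s"$line:$col"
  *     }
  * }}}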
  *
  * =How an Error is Structured=
  * There are two kinds of error messages that are generated by Parsley:
  * ''Specialised'' and ''Vanilla''. These are produced by different combinators
  * and can be merged with other errors of the same type if both errors appear
  * at the same offset. However, ''Specialised'' errors will take precedence
  * over ''Vanilla'' errors if they appear at the same offset. The most
  * common form of error is the ''Vanilla'' variant, which is generated by
  * most combinators, except for some in [[combinator `errors.combinator`]].
  *
  * Both types of error share some common structure, namely:
  *
  *   - The error preamble, which has the file and the position.
  *   - The content lines, the specifics of which differ between the two types of error.
  *   - The context lines, which hold the surrounding lines of input for contextualisation.
  *
  * ==''Vanilla'' Errors==
  * There are three kinds of content line found in a ''Vanilla'' error:
  *
  *   1. Unexpected info: this contains information about the kind of token that caused the error.
  *   1. Expected info: this contains the information about what kinds of token could have avoided the error.
  *   1. Reasons: these are the bespoke reasons that an error has occurred (as generated by [[combinator.ErrorMethods.explain `explain`]]).
  *
  * There can be at most one unexpected line, at most one expected line, and zero or more reasons.
  * Both the unexpected and expected info are built up of ''error items'', each of which is either
  * the end of input, a named token, or raw input taken from the parser definition. These can all be
  * formatted separately.
  *
  * The overall structure of a ''Vanilla'' error is given in the following diagram:
  * {{{
  * ┌───────────────────────────────────────────────────────────────────────┐
  * │   Vanilla Error                                                       │
  * │                          ┌────────────────┐◄──────── position         │
  * │                  source  │                │                           │
  * │                     │    │   line      col│                           │
  * │                     ▼    │     │         ││                           │
  * │                  ┌─────┐ │     ▼         ▼│   end of input            │
  * │               In foo.txt (line 1, column 5):       │                  │
  * │                 ┌─────────────────────┐            │                  │
  * │unexpected ─────►│                     │            │  ┌───── expected │
  * │                 │          ┌──────────┐ ◄──────────┘  │               │
  * │                 unexpected end of input               ▼               │
  * │                 ┌──────────────────────────────────────┐              │
  * │                 expected "(", "negate", digit, or letter              │
  * │                          │    └──────┘  └───┘     └────┘ ◄────── named│
  * │                          │       ▲        └──────────┘ │              │
  * │                          │       │                     │              │
  * │                          │      raw                    │              │
  * │                          └─────────────────┬───────────┘              │
  * │                 '-' is a binary operator   │                          │
  * │                 └──────────────────────┘   │                          │
  * │                ┌──────┐        ▲           │                          │
  * │                │>3+4- │        │           expected items             │
  * │                │     ^│        │                                      │
  * │                └──────┘        └───────────────── reason              │
  * │                   ▲                                                   │
  * │                   │                                                   │
  * │                   line info                                           │
  * └───────────────────────────────────────────────────────────────────────┘
  * }}}
  *
  * ==''Specialised'' Errors==
  * There is only one kind of content found in a ''Specialised'' error:
  * a message. These are completely free-form, and are generated by the
  * [[combinator.fail(caretWidth:Int,msg0:String,msgs:String*)* `fail`]] combinator, as well as its derived combinators.
  * There can be one or more messages in a ''Specialised'' error.
  *
  * The overall structure of a ''Specialised'' error is given in the following diagram:
  *
  * {{{
  * ┌───────────────────────────────────────────────────────────────────────┐
  * │   Specialised Error                                                   │
  * │                          ┌────────────────┐◄──────── position         │
  * │                  source  │                │                           │
  * │                     │    │   line       col                           │
  * │                     ▼    │     │         │                            │
  * │                  ┌─────┐ │     ▼         ▼                            │
  * │               In foo.txt (line 1, column 5):                          │
  * │                                                                       │
  * │           ┌───► something went wrong                                  │
  * │           │                                                           │
  * │ message ──┼───► it looks like a binary operator has no argument       │
  * │           │                                                           │
  * │           └───► '-' is a binary operator                              │
  * │                ┌──────┐                                               │
  * │                │>3+4- │                                               │
  * │                │     ^│                                               │
  * │                └──────┘                                               │
  * │                   ▲                                                   │
  * │                   │                                                   │
  * │                   line info                                           │
  * └───────────────────────────────────────────────────────────────────────┘
  * }}}
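  *
  * =A Minimal Custom Instance=
  * The following sketch shows how the representation types and methods fit together
  * when an error is built as data instead of as a string. `StructuredError`, `Body`,
  * `VanillaBody`, `SpecialisedBody` and `StructuredBuilder` are hypothetical names
  * that are not part of Parsley, and the line info is simply discarded. The token
  * extractor is mixed in the same way as for the default instance.
  * {{{
  * import parsley.errors.{ErrorBuilder, tokenextractors}
  *
  * // Hypothetical result types, chosen only for illustration.
  * sealed trait Body
  * case class VanillaBody(unexpected: Option[String], expected: Set[String], reasons: Seq[String]) extends Body
  * case class SpecialisedBody(msgs: Seq[String]) extends Body
  * case class StructuredError(line: Int, col: Int, file: Option[String], body: Body)
  *
  * class StructuredBuilder extends ErrorBuilder[StructuredError] with tokenextractors.TillNextWhitespace {
  *     override val trimToParserDemand = true
  *
  *     // Preamble: consumed by `format`.
  *     type Position = (Int, Int)
  *     type Source = Option[String]
  *     def pos(line: Int, col: Int): Position = (line, col)
  *     def source(sourceName: Option[String]): Source = sourceName
  *
  *     // Error items: all three kinds collapse to plain strings here.
  *     type Item = String
  *     type Raw = String
  *     type Named = String
  *     type EndOfInput = String
  *     def raw(item: String): Raw = item
  *     def named(item: String): Named = item
  *     val endOfInput: EndOfInput = "end of input"
  *
  *     // Vanilla-specific components.
  *     type ExpectedItems = Set[String]
  *     type UnexpectedLine = Option[String]
  *     type ExpectedLine = Set[String]
  *     def combineExpectedItems(alts: Set[Item]): ExpectedItems = alts
  *     def unexpected(item: Option[Item]): UnexpectedLine = item
  *     def expected(alts: ExpectedItems): ExpectedLine = alts
  *
  *     // Reasons and messages share one representation.
  *     type Message = String
  *     type Messages = Seq[String]
  *     def reason(reason: String): Message = reason
  *     def message(msg: String): Message = msg
  *     def combineMessages(alts: Seq[Message]): Messages = alts
  *
  *     // Context lines are not kept in this sketch.
  *     type LineInfo = Unit
  *     val numLinesBefore: Int = 0
  *     val numLinesAfter: Int = 0
  *     def lineInfo(line: String, linesBefore: Seq[String], linesAfter: Seq[String],
  *                  errorPointsAt: Int, errorWidth: Int): LineInfo = ()
  *
  *     // Top-level assembly.
  *     type ErrorInfoLines = Body
  *     def vanillaError(unexpected: UnexpectedLine, expected: ExpectedLine, reasons: Messages, line: LineInfo): ErrorInfoLines =
  *         VanillaBody(unexpected, expected, reasons)
  *     def specialisedError(msgs: Messages, line: LineInfo): ErrorInfoLines = SpecialisedBody(msgs)
  *     def format(pos: Position, source: Source, lines: ErrorInfoLines): StructuredError =
  *         StructuredError(pos._1, pos._2, source, lines)
  * }
  * }}}
  * Each abstract type is produced by exactly one group of methods and consumed by the
  * next one up, so the compiler checks that the pieces of a custom instance line up.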
  *
  * @tparam Err The final result type of the error message
  * @since 3.0.0
  * @group formatting
  *
  * @groupprio outer 0
  * @groupname outer Top-Level Formatting
  * @groupdesc outer
  *     These methods help assemble the final products of the error messages. The `format` method will return the desired `Err` type,
  *     whereas `specialisedError` and `vanillaError` both assemble an `ErrorInfoLines` that the `format` method can consume.
  *
  * @groupprio contextlines 10
  * @groupname contextlines Contextual Input Lines
  * @groupdesc contextlines
  *     These methods control how many lines of input surrounding the error are requested, and direct how these should be put
  *     together to form a `LineInfo`.
  *
  * @groupprio preamble 5
  * @groupname preamble Error Preamble
  * @groupdesc preamble
  *     These methods control the formatting of the preamble of an error message, which is the position and source info.
  *     These are then consumed by `format` itself.
  *
  * @groupprio item 18
  * @groupname item Error Items
  * @groupdesc item
  *     These methods control how error items within ''Vanilla'' errors are formatted. These are either
  *     the end of input, a named label generated by the [[combinator.ErrorMethods.label `label`]] combinator,
  *     or a raw piece of input intrinsically associated with a combinator.
  *
  * @groupprio vanilla 16
  * @groupname vanilla Vanilla-Specific Components
  * @groupdesc vanilla
  *     These methods control the ''Vanilla''-specific error components, namely how expected error items
  *     should be combined, how to format the unexpected line, and how to format reasons generated from
  *     [[combinator.ErrorMethods.explain `explain`]].
  *
  * @groupprio spec 15
  * @groupname spec Specialised-Specific Components
  * @groupdesc spec
  *     These methods control the ''Specialised''-specific components, namely the formatting of a bespoke
  *     error message.
  *
  * @groupprio shared 12
  * @groupname shared Shared Components
  * @groupdesc shared
  *     These methods control any components or structure shared by both types of messages. In particular,
  *     the representation of reasons and messages is shared, as well as how they are combined together
  *     to form a unified block of content lines.
  */
trait ErrorBuilder[+Err] {
    /* This is the top level function, which finally compiles all the formatted
      * sub-parts into a finished value of type `Err`.
      *
      * @param pos this is the representation of the position of the error in the input (see the [[pos `pos`]] method).
      * @param source this is the representation of the filename (if it exists) (see the [[source `source`]] method).
      * @param ctxs this is the representation of any additional contextual information in the error (see the [[nestedContexts `nestedContexts`]] method).
      * @param lines this is the main body of the error message (see [[vanillaError `vanillaError`]] or [[specialisedError `specialisedError`]] methods).
      * @return The final assembled error message
      * @since 5.0.0
      * @inheritdoc
      */
    //def format(pos: Position, source: Source, ctxs: NestedContexts, lines: ErrorInfoLines): Err

    /** This is the top level function, which finally compiles all the formatted
      * sub-parts into a finished value of type `Err`.
      *
      * @param pos this is the representation of the position of the error in the input (see the [[pos `pos`]] method).
      * @param source this is the representation of the filename, if it exists (see the [[source `source`]] method).
      * @param lines this is the main body of the error message (see [[vanillaError `vanillaError`]] or [[specialisedError `specialisedError`]] methods).
      * @return the final assembled error message.
      * @since 3.0.0
      * @group outer
      */
    def format(pos: Position, source: Source, lines: ErrorInfoLines): Err

    /** The representation type of position information within the generated message.
      *
      * @since 3.0.0
      * @group preamble
      */
    type Position
    /** The representation of the file information.
      *
      * @since 3.0.0
      * @group preamble
      */
    type Source
    /* The representation of contextual information, including the file and additional
      * context
      * @since 5.0.0
      * @group preamble
      */
    //type Context
    /** Formats a position into the representation type given by `Position`.
      *
      * @param line the line the error occurred at.
      * @param col the column the error occurred at.
      * @return a representation of the position.
      * @since 3.0.0
      * @group preamble
      */
    def pos(line: Int, col: Int): Position
    /** Formats the name of the file, if it exists, into the type given by `Source`.
      *
      * @param sourceName the source name of the file, if any.
      * @since 3.0.0
      * @group preamble
      */
    def source(sourceName: Option[String]): Source
    /* Formats any additional contextual information from the parser. This might,
      * for instance, include function or class names.
      *
      * @param context the context information produced by the parser.
      * @return a representation of the context.
      * @since 5.0.0
      * @group preamble
      */
    //def contextualScope(context: String): Context

    /* The representation of collapsed nested contextual information.
      * @since 5.0.0
      * @group preamble
      */
    //type NestedContexts
    /* Contextual information produced by [[contextualScope `contextualScope`]] is combined by this
      * method into a single piece of information. This does not include information
      * about the source file.
      *
      * @param contexts the nested contexts to be collapsed, most general first (produced by [[contextualScope `contextualScope`]]).
      * @since 5.0.0
      * @group preamble
      */
    //def nestContexts(contexts: List[Context]): NestedContexts

    /** The representation type of the main body within the error message.
      * @since 3.0.0
      * @group outer
      */
    type ErrorInfoLines
    /** Vanilla errors are those produced such that they have information about
      * both `expected` and `unexpected` tokens. These are usually the default,
      * and are not produced by `fail` (or any derivative) combinators.
      *
      * @param unexpected information about which token(s) caused the error (see the [[unexpected `unexpected`]] method).
      * @param expected information about which token(s) would have avoided the error (see the [[expected `expected`]] method).
      * @param reasons additional information about why the error occurred (see the [[combineMessages `combineMessages`]] method).
      * @param line representation of the line of input that this error occurred on (see the [[lineInfo `lineInfo`]] method).
      * @since 3.0.0
      * @group outer
      */
    def vanillaError(unexpected: UnexpectedLine, expected: ExpectedLine, reasons: Messages, line: LineInfo): ErrorInfoLines
    /** Specialised errors are triggered by `fail` and any combinators that are
      * implemented in terms of `fail`. These errors take precedence over
      * the vanilla errors, and contain less, but more specialised, information.
      *
      * @param msgs information detailing the error (see the [[combineMessages `combineMessages`]] method).
      * @param line representation of the line of input that this error occurred on (see the [[lineInfo `lineInfo`]] method).
      * @since 3.0.0
      * @group outer
      */
    def specialisedError(msgs: Messages, line: LineInfo): ErrorInfoLines

    /** The representation of all the different possible tokens that could
      * have prevented an error.
      * @since 3.0.0
      * @group vanilla
      */
    type ExpectedItems
    /** The representation of the combined reasons or failure messages from
      * the parser.
      * @since 3.0.0
      * @group shared
      */
    type Messages
    /** Details how to combine the various expected items into a single
      * representation.
      *
      * @param alts The possible items that fix the error
      * @since 3.0.0
      * @group vanilla
      */
    def combineExpectedItems(alts: Set[Item]): ExpectedItems
    /** Details how to combine any reasons or messages generated within a
      * single error. Reasons are used by `vanilla` messages and messages
      * are used by `specialised` messages.
      *
      * @param alts the messages to combine (see the [[message `message`]] or [[reason `reason`]] methods).
      * @since 3.0.0
      * @group shared
      */
    def combineMessages(alts: Seq[Message]): Messages

    /** The representation of the information regarding the problematic token.
      * @since 3.0.0
      * @group vanilla
      */
    type UnexpectedLine
    /** The representation of the information regarding the solving tokens.
      * @since 3.0.0
      * @group vanilla
      */
    type ExpectedLine
    /** The representation of a reason or a message generated by the parser.
      * @since 3.0.0
      * @group shared
      */
    type Message
    /** The representation of the line of input where the error occurred.
      * @since 3.0.0
      * @group contextlines
      */
    type LineInfo
    /** Describes how to handle the (potentially missing) information
      * about what token(s) caused the error.
      *
      * @param item The `Item` that caused this error
      * @since 3.0.0
      * @group vanilla
      */
    def unexpected(item: Option[Item]): UnexpectedLine
    /** Describes how to handle the information about the tokens that
      * could have avoided the error.
      *
      * @param alts the tokens that could have prevented the error (see the [[combineExpectedItems `combineExpectedItems`]] method).
      * @since 3.0.0
      * @group vanilla
      */
    def expected(alts: ExpectedItems): ExpectedLine
    /** Describes how to represent the reasons behind a parser failure.
      * These reasons originate from the `explain` combinator.
      *
      * @param reason the reason produced by the parser.
      * @since 3.0.0
      * @group vanilla
      */
    def reason(reason: String): Message
    /** Describes how to represent the messages produced by the
      * `fail` combinator (or any that are implemented using it).
      *
      * @param msg the message produced by the parser.
      * @since 3.0.0
      * @group spec
      */
    def message(msg: String): Message
    /** Describes how to format the information about the line that
      * the error occurred on.
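      *
      * As an illustration only, the helper below renders the offending line with a row of
      * carets beneath it, in the style of the diagrams in the trait documentation. The name
      * `renderLine` is hypothetical and not part of Parsley, and the sketch assumes that
      * `errorPointsAt` is a zero-based offset.
      * {{{
      * // Hypothetical helper: one row for the input line, one row for the carets.
      * def renderLine(line: String, errorPointsAt: Int, errorWidth: Int): Seq[String] = {
      *     // Always draw at least one caret, even for a zero-width error.
      *     val carets = "^" * math.max(errorWidth, 1)
      *     // The extra space accounts for the ">" prefix on the input row.
      *     Seq(s">$line", " " * (errorPointsAt + 1) + carets)
      * }
      *
      * renderLine("3+4-", 4, 1) // Seq(">3+4-", "     ^")
      * }}}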
      *
      * @param line the full line of input that produced this error message.
      * @param linesBefore the lines of input just before the one that produced this message (up to [[numLinesBefore `numLinesBefore`]]).
      * @param linesAfter the lines of input just after the one that produced this message (up to [[numLinesAfter `numLinesAfter`]]).
      * @param errorPointsAt the offset into the line that the error points at.
      * @param errorWidth the number of characters the error spans, starting at `errorPointsAt`.
      * @since 3.1.0
      * @group contextlines
      */
    def lineInfo(line: String, linesBefore: Seq[String], linesAfter: Seq[String], errorPointsAt: Int, errorWidth: Int): LineInfo

    /** The number of lines of input to request before an error occurred.
      * @since 3.1.0
      * @group contextlines
      */
    val numLinesBefore: Int
    /** The number of lines of input to request after an error occurred.
      * @since 3.1.0
      * @group contextlines
      */
    val numLinesAfter: Int

    /** The base type of `Raw`, `Named` and `EndOfInput` that represents the individual items within the error.
      * @since 3.0.0
      * @group item
      */
    type Item
    /** This represents "raw" tokens, which are those without labels: they come
      * directly from the input, or from the characters that the parser is trying to read.
      * @since 3.0.0
      * @group item
      */
    type Raw <: Item
    /** This represents "named" tokens, which have been provided with a label.
      * @since 3.0.0
      * @group item
      */
    type Named <: Item
    /** Represents the end of the input.
      * @since 3.0.0
      * @group item
      */
    type EndOfInput <: Item
    /** Formats a raw item generated by either the input string or an input-reading combinator without a label.
      *
      * @param item the raw, unprocessed input.
      * @since 3.0.0
      * @group item
      */
    def raw(item: String): Raw
    /** Formats a named item generated by a label.
      *
      * @param item the name given to the label.
      * @since 3.0.0
      * @group item
      */
    def named(item: String): Named
    /** Value that represents the end of the input in the error message.
      * @since 3.0.0
      * @group item
      */
    val endOfInput: EndOfInput

    /** Extracts an unexpected token from the remaining input.
      *
      * When a parser fails, by default an error reports an unexpected token of a specific width.
      * This works well for some parsers, but often it is nice to have the illusion of a dedicated
      * lexing pass: instead of reporting the next few characters as unexpected, an unexpected token
      * can be reported instead. This can take many forms, for instance trimming the token to the
      * next whitespace, only taking one character, or even trying to lex a token out of the stream.
      *
      * This method can be easily implemented by mixing in an appropriate ''token extractor'' from
      * `parsley.errors.tokenextractors` into this builder.
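      *
      * Where none of the provided extractors fit, the method can also be written by hand.
      * The sketch below trims the token to the next whitespace and to the parser's demand,
      * falling back to a single character. It assumes that `parsley.errors.Token` offers a
      * `Token.Raw(tok: String)` constructor; the name `handRolled` is hypothetical.
      * {{{
      * import parsley.errors.{DefaultErrorBuilder, ErrorBuilder, Token}
      *
      * val handRolled: ErrorBuilder[String] = new DefaultErrorBuilder {
      *     def unexpectedToken(cs: Iterable[Char], amountOfInputParserWanted: Int, lexicalError: Boolean): Token = {
      *         // Stop at the first whitespace, and take no more than the parser asked for (at least one).
      *         val trimmed = cs.takeWhile(!_.isWhitespace).take(math.max(amountOfInputParserWanted, 1))
      *         // `cs` is non-empty, so `head` is safe when the input starts with whitespace.
      *         Token.Raw(if (trimmed.isEmpty) cs.head.toString else trimmed.mkString)
      *     }
      * }
      * }}}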
      *
      * @param cs the remaining input at point of failure (this is '''guaranteed to be non-empty''')
      * @param amountOfInputParserWanted the input the parser tried to read when it failed
      *                                  (this is '''not''' guaranteed to be smaller than the length of `cs`)
      * @param lexicalError was this error generated as part of "lexing", or in a wider parser (see [[parsley.errors.combinator$.markAsToken `markAsToken`]])
      * @return a token extracted from `cs` that will be used as part of the unexpected message.
      * @since 4.0.0
      * @group item
      */
    def unexpectedToken(cs: Iterable[Char], amountOfInputParserWanted: Int, lexicalError: Boolean): Token
}

/** Contains the default instance for the `ErrorBuilder` typeclass, which will be automatically available without import.
  *
  * @group formatting
  */
object ErrorBuilder {
    // $COVERAGE-OFF$
    /** The default error builder used by Parsley, which produces
      * an error as a String. An instance of `DefaultErrorBuilder`.
      *
      * It ''currently'' uses `TillNextWhitespace` as the token extractor, which
      * is trimmed to parser demand. This can be changed without notice.
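      *
      * A short sketch of how this instance is picked up without an import, assuming the
      * `parsley.character.char` combinator and the `parsley.Result` type with its
      * `Success` and `Failure` cases:
      * {{{
      * import parsley.{Failure, Result, Success}
      * import parsley.character.char
      *
      * // No builder is passed explicitly: `stringError` is found in the companion object.
      * val result: Result[String, Char] = char('a').parse("b")
      * result match {
      *     case Success(c)   => println(s"parsed $c")
      *     case Failure(msg) => println(msg) // the formatted error, as a String
      * }
      * }}}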
      */
    implicit val stringError: ErrorBuilder[String] = new DefaultErrorBuilder with tokenextractors.TillNextWhitespace {
        override val trimToParserDemand = true
    }
    // $COVERAGE-ON$
}



