All Downloads are FREE. Search and download functionalities are using the official Maven repository.

commonMain.org.antlr.v4.kotlinruntime.atn.ParserATNSimulator.kt Maven / Gradle / Ivy

There is a newer version: 1.0.1
Show newest version
// Copyright 2017-present Strumenta and contributors, licensed under Apache 2.0.
// Copyright 2024-present Strumenta and contributors, licensed under BSD 3-Clause.

package org.antlr.v4.kotlinruntime.atn

import com.strumenta.antlrkotlin.runtime.BitSet
import com.strumenta.antlrkotlin.runtime.System
import com.strumenta.antlrkotlin.runtime.assert
import com.strumenta.antlrkotlin.runtime.synchronized
import org.antlr.v4.kotlinruntime.*
import org.antlr.v4.kotlinruntime.dfa.DFA
import org.antlr.v4.kotlinruntime.dfa.DFAState
import org.antlr.v4.kotlinruntime.misc.DoubleKeyMap
import org.antlr.v4.kotlinruntime.misc.Interval
import org.antlr.v4.kotlinruntime.misc.IntervalSet

/**
 * The embodiment of the adaptive LL(*), ALL(*), parsing strategy.
 *
 * The basic complexity of the adaptive strategy makes it harder to understand.
 * We begin with ATN simulation to build paths in a DFA. Subsequent prediction
 * requests go through the DFA first. If they reach a state without an edge for
 * the current symbol, the algorithm fails over to the ATN simulation to
 * complete the DFA path for the current input (until it finds a conflict state
 * or uniquely predicting state).
 *
 * All of that is done without using the outer context because we want to create
 * a DFA that is not dependent upon the rule invocation stack when we do a
 * prediction. One DFA works in all contexts. We avoid using context not
 * necessarily because it's slower, although it can be, but because of the DFA
 * caching problem. The closure routine only considers the rule invocation stack
 * created during prediction beginning in the decision rule. For example, if
 * prediction occurs without invoking another rule's ATN, there are no context
 * stacks in the configurations. When lack of context leads to a conflict, we
 * don't know if it's an ambiguity or a weakness in the strong LL(*) parsing
 * strategy (versus full LL(*)).
 *
 * When SLL yields a configuration set with conflict, we rewind the input and
 * retry the ATN simulation, this time using full outer context without adding
 * to the DFA. Configuration context stacks will be the full invocation stacks
 * from the start rule. If we get a conflict using full context, then we can
 * definitively say we have a true ambiguity for that input sequence. If we
 * don't get a conflict, it implies that the decision is sensitive to the outer
 * context. (It is not context-sensitive in the sense of context-sensitive
 * grammars.)
 *
 * The next time we reach this DFA state with an SLL conflict, through DFA
 * simulation, we will again retry the ATN simulation using full context mode.
 * This is slow because we can't save the results and have to "interpret" the
 * ATN each time we get that input.
 *
 * ### CACHING FULL CONTEXT PREDICTIONS
 *
 * We could cache results from full context to predicted alternative easily and
 * that saves a lot of time but doesn't work in presence of predicates. The set
 * of visible predicates from the ATN start state changes depending on the
 * context, because closure can fall off the end of a rule. I tried to cache
 * tuples (stack context, semantic context, predicted alt) but it was slower
 * than interpreting and much more complicated. Also required a huge amount of
 * memory. The goal is not to create the world's fastest parser anyway. I'd like
 * to keep this algorithm simple. By launching multiple threads, we can improve
 * the speed of parsing across a large number of files.
 *
 * There is no strict ordering between the amount of input used by SLL vs LL,
 * which makes it really hard to build a cache for full context. Let's say that
 * we have input A B C that leads to an SLL conflict with full context X. That
 * implies that using X we might only use A B, but we could also use A B C D to
 * resolve conflict. Input A B C D could predict alternative 1 in one position
 * in the input and A B C E could predict alternative 2 in another position in
 * input. The conflicting SLL configurations could still be non-unique in the
 * full context prediction, which would lead us to requiring more input than the
 * original A B C.	To make a	prediction cache work, we have to track	the exact
 * input	used during the previous prediction. That amounts to a cache that maps
 * X to a specific DFA for that context.
 *
 * Something should be done for left-recursive expression predictions. They are
 * likely LL(1) + pred eval. Easier to do the whole SLL unless error and retry
 * with full LL thing Sam does.
 *
 * ### AVOIDING FULL CONTEXT PREDICTION
 *
 * We avoid doing full context retry when the outer context is empty, we did not
 * dip into the outer context by falling off the end of the decision state rule,
 * or when we force SLL mode.
 *
 * As an example of the not dip into outer context case, consider as super
 * constructor calls versus function calls. One grammar might look like
 * this:
 *
 * ```
 * ctorBody
 * : '{' superCall? stat* '}'
 * ;
 * ```
 *
 * Or, you might see something like
 *
 * ```
 * stat
 * : superCall ';'
 * | expression ';'
 * | ...
 * ;
 * ```
 *
 * In both cases I believe that no closure operations will dip into the outer
 * context. In the first case ctorBody in the worst case will stop at the '}'.
 * In the 2nd case it should stop at the ';'. Both cases should stay within the
 * entry rule and not dip into the outer context.
 *
 * ### PREDICATES
 *
 * Predicates are always evaluated if present in either SLL or LL both. SLL and
 * LL simulation deals with predicates differently. SLL collects predicates as
 * it performs closure operations like ANTLR v3 did. It delays predicate
 * evaluation until it reaches and accept state. This allows us to cache the SLL
 * ATN simulation whereas, if we had evaluated predicates on-the-fly during
 * closure, the DFA state configuration sets would be different, and we couldn't
 * build up a suitable DFA.
 *
 * When building a DFA accept state during ATN simulation, we evaluate any
 * predicates and return the sole semantically valid alternative. If there is
 * more than 1 alternative, we report an ambiguity. If there are 0 alternatives,
 * we throw an exception. Alternatives without predicates act like they have
 * true predicates. The simple way to think about it is to strip away all
 * alternatives with false predicates and choose the minimum alternative that
 * remains.
 *
 * When we start in the DFA and reach an accept state that's predicated, we test
 * those and return the minimum semantically viable alternative. If no
 * alternatives are viable, we throw an exception.
 *
 * During full LL ATN simulation, closure always evaluates predicates and
 * on-the-fly. This is crucial to reducing the configuration set size during
 * closure. It hits a landmine when parsing with the Java grammar, for example,
 * without this on-the-fly evaluation.
 *
 * ### SHARING DFA
 *
 * All instances of the same parser share the same decision DFAs through a
 * static field. Each instance gets its own ATN simulator, but they share the
 * same [.decisionToDFA] field. They also share a
 * [PredictionContextCache] object that makes sure that all
 * [PredictionContext] objects are shared among the DFA states. This makes
 * a big size difference.
 *
 * ### THREAD SAFETY
 *
 * The [ParserATNSimulator] locks on the [decisionToDFA] field when
 * it adds a new DFA object to that array. [addDFAEdge]
 * locks on the DFA for the current decision when setting the
 * [DFAState.edges] field. [addDFAState] locks on
 * the DFA for the current decision when looking up a DFA state to see if it
 * already exists. We must make sure that all requests to add DFA states that
 * are equivalent result in the same shared DFA object. This is because lots of
 * threads will be trying to update the DFA at once. The
 * [addDFAState] method also locks inside the DFA lock
 * but this time on the shared context cache when it rebuilds the
 * configurations' [PredictionContext] objects using cached
 * sub-graphs/nodes. No other locking occurs, even during DFA simulation. This is
 * safe as long as we can guarantee that all threads referencing
 * `s.edge[t]` get the same physical target [DFAState], or
 * `null`. Once into the DFA, the DFA simulation does not reference the
 * [DFA.states] map. It follows the [DFAState.edges] field to new
 * targets. The DFA simulator will either find [DFAState.edges] to be
 * `null`, to be non-`null` and `dfa.edges[t]` null, or
 * `dfa.edges[t]` to be non-null.
 * The [addDFAEdge] method could be racing to set the field
 * but in either case the DFA simulator works; if `null`, and requests ATN
 * simulation. It could also race trying to get `dfa.edges[t]`, but either
 * way it will work because it's not doing a test and set operation.
 *
 * ### Starting with SLL then failing to combined SLL/LL (Two-Stage Parsing)
 *
 * Sam pointed out that if SLL does not give a syntax error, then there is no
 * point in doing full LL, which is slower. We only have to try LL if we get a
 * syntax error. For maximum speed, Sam starts the parser set to pure SLL
 * mode with the [BailErrorStrategy]:
 *
 * ```
 * parser.interpreter.predictionMode = PredictionMode.SLL
 * parser.errorHandler = BailErrorStrategy()
 * ```
 *
 * If it does not get a syntax error, then we're done. If it does get a syntax
 * error, we need to retry with the combined SLL/LL strategy.
 *
 * The reason this works is as follows. If there are no SLL conflicts, then the
 * grammar is SLL (at least for that input set). If there is an SLL conflict,
 * the full LL analysis must yield a set of viable alternatives which is a
 * subset of the alternatives reported by SLL. If the LL set is a singleton,
 * then the grammar is LL but not SLL. If the LL set is the same size as the SLL
 * set, the decision is SLL. If the LL set has size < 1, then that decision
 * is truly ambiguous on the current input. If the LL set is smaller, then the
 * SLL conflict resolution might choose an alternative that the full LL would
 * rule out as a possibility based upon better context information. If that's
 * the case, then the SLL parse will definitely get an error because the full LL
 * analysis says it's not viable. If SLL conflict resolution chooses an
 * alternative within the LL set, them both SLL and LL would choose the same
 * alternative because they both choose the minimum of multiple conflicting
 * alternatives.
 *
 * Let's say we have a set of SLL conflicting alternatives `{1, 2, 3}` and
 * a smaller LL set called *s*. If *s* is `{2, 3}`, then SLL
 * parsing will get an error because SLL will pursue alternative 1. If
 * *s* is `{1, 2}` or `{1, 3}` then both SLL and LL will
 * choose the same alternative because alternative one is the minimum of either
 * set. If *s* is `{2}` or `{3}` then SLL will get a syntax
 * error. If *s* is `{1}` then SLL will succeed.
 *
 * Of course, if the input is invalid, then we will get an error for sure in
 * both SLL and LL parsing. Erroneous input will therefore require 2 passes over
 * the input.
 */
@Suppress("MemberVisibilityCanBePrivate", "PropertyName")
public open class ParserATNSimulator(
  public val parser: Parser?,
  atn: ATN,
  public val decisionToDFA: Array,
  sharedContextCache: PredictionContextCache,
) : ATNSimulator(atn, sharedContextCache) {
  public companion object {
    public var debug: Boolean = false
    public var trace_atn_sim: Boolean = false
    public var dfa_debug: Boolean = false
    public var retry_debug: Boolean = false

    /**
     * Just in case this optimization is bad, add an ENV variable to turn it off.
     */
    public val TURN_OFF_LR_LOOP_ENTRY_BRANCH_OPT: Boolean =
      getSafeEnv("TURN_OFF_LR_LOOP_ENTRY_BRANCH_OPT", "false").toBoolean()

    protected fun getUniqueAlt(configs: ATNConfigSet): Int {
      var alt = ATN.INVALID_ALT_NUMBER

      for (c in configs) {
        if (alt == ATN.INVALID_ALT_NUMBER) {
          // Found first alt
          alt = c.alt
        } else if (c.alt != alt) {
          return ATN.INVALID_ALT_NUMBER
        }
      }

      return alt
    }

    private fun getSafeEnv(envName: String, defaultValue: String? = null): String? =
      try {
        System.getenv(envName, defaultValue)
      } catch (e: Exception) {
        System.err.println(e.toString())
        null
      }
  }

  /**
   * SLL, LL, or LL + exact ambig detection?
   */
  public var predictionMode: PredictionMode = PredictionMode.LL

  /**
   * Each prediction operation uses a cache for merge of prediction contexts.
   *
   * Don't keep around as it wastes huge amounts of memory. [DoubleKeyMap]
   * isn't synchronized, but we're ok since two threads shouldn't reuse same
   * parser/atnsim object because it can only handle one input at a time.
   *
   * This maps graphs a and b to merged result c. (a,b)c. We can avoid
   * the merge if we ever see a and b again. Note that (b,a)c should
   * also be examined during cache lookup.
   */
  protected var mergeCache: DoubleKeyMap? = null

  // LAME globals to avoid parameters!!!!! I need these down deep in predTransition
  protected var _input: TokenStream? = null
  protected var _startIndex: Int = 0
  protected var _outerContext: ParserRuleContext? = null
  protected var _dfa: DFA? = null

  /**
   * Testing only!
   */
  public constructor(
    atn: ATN,
    decisionToDFA: Array,
    sharedContextCache: PredictionContextCache,
  ) : this(null, atn, decisionToDFA, sharedContextCache)

  override fun reset() {
    // Noop
  }

  override fun clearDFA() {
    for (d in decisionToDFA.indices) {
      decisionToDFA[d] = DFA(atn.getDecisionState(d)!!, d)
    }
  }

  public open fun adaptivePredict(
    input: TokenStream,
    decision: Int,
    outerContext: ParserRuleContext?,
  ): Int {
    var tempOuterContext = outerContext

    if (debug || trace_atn_sim) {
      System.out.println(
        "adaptivePredict decision $decision" +
          " exec LA(1)==${getLookaheadName(input)}" +
          " line ${input.LT(1)!!.line}:${input.LT(1)!!.charPositionInLine}"
      )
    }

    _input = input
    _startIndex = input.index()
    _outerContext = tempOuterContext

    val dfa = decisionToDFA[decision]
    _dfa = dfa

    val m = input.mark()
    val index = _startIndex

    // Now we are certain to have a specific decision's DFA
    // But, do we still need an initial state?
    try {
      var s0 = if (dfa.isPrecedenceDfa) {
        // the start state for a precedence DFA depends on the current
        // parser precedence, and is provided by a DFA method.
        dfa.getPrecedenceStartState(parser!!.precedence)
      } else {
        // the start state for a "regular" DFA is just s0
        dfa.s0
      }

      if (s0 == null) {
        if (tempOuterContext == null) {
          tempOuterContext = ParserRuleContext.EMPTY
        }

        val fullCtx = false
        var s0Closure = computeStartState(dfa.atnStartState, ParserRuleContext.EMPTY, fullCtx)

        if (dfa.isPrecedenceDfa) {
          // If this is a precedence DFA, we use applyPrecedenceFilter
          // to convert the computed start state to a precedence start
          // state. We then use DFA.setPrecedenceStartState to set the
          // appropriate start state for the precedence level rather
          // than simply setting DFA.s0.

          // Not used for prediction but useful to know start configs anyway
          dfa.s0!!.configs = s0Closure
          s0Closure = applyPrecedenceFilter(s0Closure)
          s0 = addDFAState(dfa, DFAState(s0Closure))
          dfa.setPrecedenceStartState(parser!!.precedence, s0)
        } else {
          s0 = addDFAState(dfa, DFAState(s0Closure))
          dfa.s0 = s0
        }
      }

      val alt = execATN(dfa, s0, input, index, tempOuterContext!!)

      if (debug) {
        System.out.println("DFA after predictATN: ${dfa.toString(parser!!.vocabulary)}")
      }

      return alt
    } finally {
      // Wack cache after each prediction
      mergeCache = null
      _dfa = null
      input.seek(index)
      input.release(m)
    }
  }

  /**
   * Performs ATN simulation to compute a predicted alternative based
   * upon the remaining input, but also updates the DFA cache to avoid
   * having to traverse the ATN again for the same input sequence.
   *
   * There are some key conditions we're looking for after computing a new
   * set of ATN configs (proposed DFA state):
   *
   * - if the set is empty, there is no viable alternative for current symbol
   * - does the state uniquely predict an alternative?
   * - does the state have a conflict that would prevent us from
   *   putting it on the work list?
   *
   * We also have some key operations to do:
   *
   * - add an edge from previous DFA state to potentially new DFA state, D,
   *   upon current symbol but only if adding to work list, which means in all
   *   cases except no viable alternative (and possibly non-greedy decisions?)
   * - collecting predicates and adding semantic context to DFA accept states
   * - adding rule context to context-sensitive DFA accept states
   * - consuming an input symbol
   * - reporting a conflict
   * - reporting an ambiguity
   * - reporting a context sensitivity
   * - reporting insufficient predicates
   *
   * Cover these cases:
   *
   * - dead end
   * - single alt
   * - single alt + preds
   * - conflict
   * - conflict + preds
   */
  protected open fun execATN(
    dfa: DFA,
    s0: DFAState,
    input: TokenStream,
    startIndex: Int,
    outerContext: ParserRuleContext,
  ): Int {
    if (debug || trace_atn_sim) {
      System.out.println(
        "execATN decision ${dfa.decision}" +
          ", DFA state $s0" +
          ", LA(1)==${getLookaheadName(input)}" +
          ", line ${input.LT(1)!!.line}:${input.LT(1)!!.charPositionInLine}"
      )
    }

    var previousD = s0
    var t = input.LA(1)

    while (true) {
      @Suppress("LocalVariableName")
      val D = getExistingTargetState(previousD, t) ?: computeTargetState(dfa, previousD, t)

      if (D === ERROR) {
        // If any configs in previous dipped into outer context, that
        // means that input up to t actually finished entry rule
        // at least for SLL decision. Full LL doesn't dip into outer
        // so don't need special case.
        // We will get an error no matter what so delay until after
        // decision; better error message. Also, no reachable target
        // ATN states in SLL implies LL will also get nowhere.
        // If conflict in states that dip out, choose min since we
        // will get error no matter what.
        val e = noViableAlt(input, outerContext, previousD.configs, startIndex)
        input.seek(startIndex)

        val alt = getSynValidOrSemInvalidAltThatFinishedDecisionEntryRule(previousD.configs, outerContext)

        if (alt != ATN.INVALID_ALT_NUMBER) {
          return alt
        }

        throw e
      }

      if (D!!.requiresFullContext && predictionMode != PredictionMode.SLL) {
        // IF PREDS, MIGHT RESOLVE TO SINGLE ALT => SLL (or syntax error)
        var conflictingAlts = D.configs.conflictingAlts

        if (D.predicates != null) {
          if (debug) {
            System.out.println("DFA state has preds in DFA sim LL failover")
          }

          val conflictIndex = input.index()

          if (conflictIndex != startIndex) {
            input.seek(startIndex)
          }

          conflictingAlts = evalSemanticContext(D.predicates!!, outerContext, true)

          if (conflictingAlts.cardinality() == 1) {
            if (debug) {
              System.out.println("Full LL avoided")
            }

            return conflictingAlts.nextSetBit(0)
          }

          if (conflictIndex != startIndex) {
            // restore the index so reporting the fallback to full
            // context occurs with the index at the correct spot
            input.seek(conflictIndex)
          }
        }

        if (dfa_debug) {
          System.out.println("ctx sensitive state $outerContext in $D")
        }

        val fullCtx = true
        val s0Closure = computeStartState(dfa.atnStartState, outerContext, fullCtx)
        reportAttemptingFullContext(dfa, conflictingAlts!!, D.configs, startIndex, input.index())

        return execATNWithFullContext(
          dfa = dfa,
          D = D,
          s0 = s0Closure,
          input = input,
          startIndex = startIndex,
          outerContext = outerContext,
        )
      }

      if (D.isAcceptState) {
        if (D.predicates == null) {
          return D.prediction
        }

        val stopIndex = input.index()
        input.seek(startIndex)

        val alts = evalSemanticContext(D.predicates!!, outerContext, true)

        when (alts.cardinality()) {
          0 -> throw noViableAlt(input, outerContext, D.configs, startIndex)
          1 -> return alts.nextSetBit(0)
          else -> {
            // Report ambiguity after predicate evaluation to make sure the correct
            // set of ambig alts is reported.
            reportAmbiguity(dfa, D, startIndex, stopIndex, false, alts, D.configs)
            return alts.nextSetBit(0)
          }
        }
      }

      previousD = D

      if (t != IntStream.EOF) {
        input.consume()
        t = input.LA(1)
      }
    }
  }

  /**
   * Get an existing target state for an edge in the DFA.
   *
   * If the target state for the edge has not yet been computed
   * or is otherwise not available, this method returns `null`.
   *
   * @param previousD The current DFA state
   * @param t The next input symbol
   * @return The existing target DFA state for the given input symbol
   *   [t], or `null` if the target state for this edge is not already cached
   */
  protected open fun getExistingTargetState(previousD: DFAState, t: Int): DFAState? {
    val edges = previousD.edges

    if (edges == null || t + 1 < 0 || t + 1 >= edges.size) {
      return null
    }

    return edges[t + 1]
  }

  /**
   * Compute a target state for an edge in the DFA, and attempt to add the
   * computed state and corresponding edge to the DFA.
   *
   * @param dfa The DFA
   * @param previousD The current DFA state
   * @param t The next input symbol
   * @return The computed target DFA state for the given input symbol [t].
   *   If [t] does not lead to a valid DFA state, this method returns [ATNSimulator.ERROR]
   */
  protected open fun computeTargetState(dfa: DFA, previousD: DFAState, t: Int): DFAState? {
    val reach = computeReachSet(previousD.configs, t, false)

    if (reach == null) {
      addDFAEdge(dfa, previousD, t, ERROR)
      return ERROR
    }

    // Create new target state, we'll add to DFA after it's complete
    @Suppress("LocalVariableName")
    val D = DFAState(reach)
    val predictedAlt = getUniqueAlt(reach)

    if (debug) {
      val altSubSets = PredictionMode.getConflictingAltSubsets(reach)
      System.out.println(
        "SLL altSubSets=$altSubSets" +
          ", configs=$reach" +
          ", predict=$predictedAlt" +
          ", allSubsetsConflict=${PredictionMode.allSubsetsConflict(altSubSets)}" +
          ", conflictingAlts=${getConflictingAlts(reach)}"
      )
    }

    if (predictedAlt != ATN.INVALID_ALT_NUMBER) {
      // NO CONFLICT, UNIQUELY PREDICTED ALT
      D.isAcceptState = true
      D.configs.uniqueAlt = predictedAlt
      D.prediction = predictedAlt
    } else if (PredictionMode.hasSLLConflictTerminatingPrediction(predictionMode, reach)) {
      // MORE THAN ONE VIABLE ALTERNATIVE
      D.configs.conflictingAlts = getConflictingAlts(reach)
      D.requiresFullContext = true

      // In SLL-only mode, we will stop at this state and return the minimum alt
      D.isAcceptState = true
      D.prediction = D.configs.conflictingAlts!!.nextSetBit(0)
    }

    if (D.isAcceptState && D.configs.hasSemanticContext) {
      predicateDFAState(D, atn.getDecisionState(dfa.decision)!!)

      if (D.predicates != null) {
        D.prediction = ATN.INVALID_ALT_NUMBER
      }
    }

    // All adds to dfa are done after we've created full D state
    return addDFAEdge(dfa, previousD, t, D)
  }

  protected open fun predicateDFAState(dfaState: DFAState, decisionState: DecisionState) {
    // We need to test all predicates, even in DFA states that
    // uniquely predict alternative.
    val nAlts = decisionState.numberOfTransitions

    // Update DFA so reach becomes accept state with (predicate,alt)
    // pairs if preds found for conflicting alts
    val altsToCollectPredsFrom = getConflictingAltsOrUniqueAlt(dfaState.configs)
    val altToPred = getPredsForAmbigAlts(altsToCollectPredsFrom, dfaState.configs, nAlts)

    if (altToPred != null) {
      dfaState.predicates = getPredicatePredictions(altsToCollectPredsFrom, altToPred)
      dfaState.prediction = ATN.INVALID_ALT_NUMBER // Make sure we use preds
    } else {
      // There are preds in configs, but they might go away
      // when OR'd together like {p}? || NONE == NONE.
      // If neither alt has preds, resolve to min alt
      dfaState.prediction = altsToCollectPredsFrom.nextSetBit(0)
    }
  }

  // Comes back with reach.uniqueAlt set to a valid alt
  @Suppress("LocalVariableName")
  protected open fun execATNWithFullContext(
    dfa: DFA,
    D: DFAState, // How far we got in SLL DFA before failing over
    s0: ATNConfigSet,
    input: TokenStream,
    startIndex: Int,
    outerContext: ParserRuleContext,
  ): Int {
    if (debug || trace_atn_sim) {
      System.out.println("execATNWithFullContext $s0")
    }

    val fullCtx = true
    var foundExactAmbig = false
    var reach: ATNConfigSet?
    var previous = s0
    input.seek(startIndex)

    var t = input.LA(1)
    var predictedAlt: Int

    while (true) {
      reach = computeReachSet(previous, t, fullCtx)

      if (reach == null) {
        // If any configs in previous dipped into outer context, that
        // means that input up to t actually finished entry rule
        // at least for LL decision. Full LL doesn't dip into outer
        // so don't need special case.
        // We will get an error no matter what so delay until after
        // decision; better error message. Also, no reachable target
        // ATN states in SLL implies LL will also get nowhere.
        // If conflict in states that dip out, choose min since we
        // will get error no matter what.
        val e = noViableAlt(input, outerContext, previous, startIndex)
        input.seek(startIndex)

        val alt = getSynValidOrSemInvalidAltThatFinishedDecisionEntryRule(previous, outerContext)

        if (alt != ATN.INVALID_ALT_NUMBER) {
          return alt
        }

        throw e
      }

      val altSubSets = PredictionMode.getConflictingAltSubsets(reach)

      if (debug) {
        System.out.println(
          "LL altSubSets=$altSubSets" +
            ", predict=${PredictionMode.getUniqueAlt(altSubSets)}" +
            ", resolvesToJustOneViableAlt=${PredictionMode.resolvesToJustOneViableAlt(altSubSets)}"
        )
      }

      reach.uniqueAlt = getUniqueAlt(reach)

      // Unique prediction?
      if (reach.uniqueAlt != ATN.INVALID_ALT_NUMBER) {
        predictedAlt = reach.uniqueAlt
        break
      }

      if (predictionMode != PredictionMode.LL_EXACT_AMBIG_DETECTION) {
        predictedAlt = PredictionMode.resolvesToJustOneViableAlt(altSubSets)

        if (predictedAlt != ATN.INVALID_ALT_NUMBER) {
          break
        }
      } else {
        // In exact ambiguity mode, we never try to terminate early.
        // Just keeps scarfing until we know what the conflict is
        if (PredictionMode.allSubsetsConflict(altSubSets) && PredictionMode.allSubsetsEqual(altSubSets)) {
          foundExactAmbig = true
          predictedAlt = PredictionMode.getSingleViableAlt(altSubSets)
          break
        }

        // Else there are multiple non-conflicting subsets, or
        // we're not sure what the ambiguity is yet.
        // So, keep going.
      }

      previous = reach

      if (t != IntStream.EOF) {
        input.consume()
        t = input.LA(1)
      }
    }

    // If the configuration set uniquely predicts an alternative,
    // without conflict, then we know that it's a full LL decision
    // not SLL.
    if (reach!!.uniqueAlt != ATN.INVALID_ALT_NUMBER) {
      reportContextSensitivity(dfa, predictedAlt, reach, startIndex, input.index())
      return predictedAlt
    }

    // We do not check predicates here because we have checked them
    // on-the-fly when doing full context prediction.

    // In non-exact ambiguity detection mode, we might	actually be able to
    // detect an exact ambiguity, but I'm not going to spend the cycles
    // needed to check. We only emit ambiguity warnings in exact ambiguity
    // mode.
    //
    // For example, we might know that we have conflicting configurations.
    // But, that does not mean that there is no way forward without a
    // conflict. It's possible to have non-conflicting alt subsets as in:
    //
    //    LL altSubSets=[{1, 2}, {1, 2}, {1}, {1, 2}]
    //
    // from
    //
    //    [(17,1,[5 $]), (13,1,[5 10 $]), (21,1,[5 10 $]), (11,1,[$]),
    //     (13,2,[5 10 $]), (21,2,[5 10 $]), (11,2,[$])]
    //
    // In this case, (17,1,[5 $]) indicates there is some next sequence that
    // would resolve this without conflict to alternative 1. Any other viable
    // next sequence, however, is associated with a conflict. We stop
    // looking for input because no amount of further lookahead will alter
    // the fact that we should predict alternative 1. We just can't say for
    // sure that there is an ambiguity without looking further.
    reportAmbiguity(
      dfa = dfa,
      D = D,
      startIndex = startIndex,
      stopIndex = input.index(),
      exact = foundExactAmbig,
      ambigAlts = reach.alts,
      configs = reach,
    )

    return predictedAlt
  }

  protected open fun computeReachSet(closure: ATNConfigSet, t: Int, fullCtx: Boolean): ATNConfigSet? {
    if (debug) {
      System.out.println("in computeReachSet, starting closure: $closure")
    }

    if (mergeCache == null) {
      mergeCache = DoubleKeyMap()
    }

    val intermediate = ATNConfigSet(fullCtx)

    // Configurations already in a rule stop state indicate reaching the end
    // of the decision rule (local context) or end of the start rule (full
    // context). Once reached, these configurations are never updated by a
    // closure operation, so they are handled separately for the performance
    // advantage of having a smaller intermediate set when calling closure.
    //
    // For full-context reach operations, separate handling is required to
    // ensure that the alternative matching the longest overall sequence is
    // chosen when multiple such configurations can match the input.
    var skippedStopStates: MutableList? = null

    // First figure out where we can reach on input t
    for (c in closure) {
      if (debug) {
        System.out.println("testing ${getTokenName(t)} at $c")
      }

      if (c.state is RuleStopState) {
        assert(c.context!!.isEmpty)

        if (fullCtx || t == IntStream.EOF) {
          if (skippedStopStates == null) {
            skippedStopStates = ArrayList()
          }

          skippedStopStates.add(c)
        }

        continue
      }

      val n = c.state.numberOfTransitions

      // For each transition
      for (ti in 0..()
      val treatEofAsEpsilon = t == Token.EOF

      for (c in intermediate) {
        closure(c, reach, closureBusy, false, fullCtx, treatEofAsEpsilon)
      }
    }

    if (t == IntStream.EOF) {
      // After consuming EOF no additional input is possible, so we are
      // only interested in configurations which reached the end of the
      // decision rule (local context) or end of the start rule (full
      // context). Update reach to contain only these configurations. This
      // handles both explicit EOF transitions in the grammar and implicit
      // EOF transitions following the end of the decision or start rule.
      //
      // When reach==intermediate, no closure operation was performed. In
      // this case, removeAllConfigsNotInRuleStopState needs to check for
      // reachable rule stop states as well as configurations already in
      // a rule stop state.
      //
      // This is handled before the configurations in skippedStopStates,
      // because any configurations potentially added from that list are
      // already guaranteed to meet this condition whether it's
      // required.
      reach = removeAllConfigsNotInRuleStopState(reach, reach === intermediate)
    }

    // If skippedStopStates is not null, then it contains at least one
    // configuration. For full-context reach operations, these
    // configurations reached the end of the start rule, in which case we
    // only add them back to reach if no configuration during the current
    // closure operation reached such a state. This ensures adaptivePredict
    // chooses an alternative matching the longest overall sequence when
    // multiple alternatives are viable.
    if (skippedStopStates != null && (!fullCtx || !PredictionMode.hasConfigInRuleStopState(reach))) {
      assert(skippedStopStates.isNotEmpty())

      for (c in skippedStopStates) {
        reach.add(c, mergeCache)
      }
    }

    if (trace_atn_sim) {
      System.out.println("computeReachSet $closure -> $reach")
    }

    return if (reach.isEmpty()) {
      null
    } else {
      reach
    }
  }

  /**
   * Return a configuration set containing only the configurations from
   * [configs] which are in a [RuleStopState].
   *
   * If all configurations in [configs] are already in a rule stop state, this
   * method simply returns [configs].
   *
   * When [lookToEndOfRule] is `true`, this method uses
   * [ATN.nextTokens] for each configuration in [configs] which is
   * not already in a rule stop state to see if a rule stop state is reachable
   * from the configuration via epsilon-only transitions.
   *
   * @param configs The configuration set to update
   * @param lookToEndOfRule When `true`, this method checks for rule stop states
   *   reachable by epsilon-only transitions from each configuration in [configs]
   * @return [configs] if all configurations in [configs] are in a
   *   rule stop state, otherwise return a new configuration set containing only
   *   the configurations from [configs] which are in a rule stop state
   */
  protected open fun removeAllConfigsNotInRuleStopState(configs: ATNConfigSet, lookToEndOfRule: Boolean): ATNConfigSet {
    if (PredictionMode.allConfigsInRuleStopStates(configs)) {
      return configs
    }

    val result = ATNConfigSet(configs.fullCtx)

    for (config in configs) {
      if (config.state is RuleStopState) {
        result.add(config, mergeCache)
        continue
      }

      if (lookToEndOfRule && config.state.onlyHasEpsilonTransitions()) {
        val nextTokens = atn.nextTokens(config.state)

        if (nextTokens.contains(Token.EPSILON)) {
          val endOfRuleState = atn.ruleToStopState!![config.state.ruleIndex]
          result.add(ATNConfig(config, endOfRuleState!!), mergeCache)
        }
      }
    }

    return result
  }

  protected open fun computeStartState(p: ATNState, ctx: RuleContext, fullCtx: Boolean): ATNConfigSet {
    // Always at least the implicit call to start rule
    val initialContext = PredictionContext.fromRuleContext(atn, ctx)
    val configs = ATNConfigSet(fullCtx)

    if (trace_atn_sim) {
      System.out.println("computeStartState from ATN state $p initialContext=${initialContext.toString(parser!!)}")
    }

    for (i in 0..()
      closure(c, configs, closureBusy, true, fullCtx, false)
    }

    return configs
  }

  // parrt internal source brain dump that doesn't mess up
  // external API spec.
  // context-sensitive in that they can only be properly evaluated
  // in the context of the proper prec argument. Without pruning,
  // these predicates are normal predicates evaluated when we reach
  // conflict state (or unique prediction). As we cannot evaluate
  // these predicates out of context, the resulting conflict leads
  // to full LL evaluation and nonlinear prediction which shows up
  // very clearly with fairly large expressions.
  //
  // Example grammar:
  //
  // e : e '*' e
  //   | e '+' e
  //   | INT
  //   ;
  //
  // We convert that to the following:
  //
  // e[int prec]
  //   :   INT
  //     ( {3>=prec}? '*' e[4]
  //     | {2>=prec}? '+' e[3]
  //     )*
  //   ;
  //
  // The (..)* loop has a decision for the inner block as well as
  // an enter or exit decision, which is what concerns us here. At
  // the 1st + of input 1+2+3, the loop entry sees both predicates
  // and the loop exit also sees both predicates by falling off the
  // edge of e. This is because we have no stack information with
  // SLL and find the follow of e, which will hit the return states
  // inside the loop after e[4] and e[3], which brings it back to
  // the enter or exit decision. In this case, we know that we
  // cannot evaluate those predicates because we have fallen off
  // the edge of the stack and will in general not know which prec
  // parameter is the right one to use in the predicate.
  //
  // Because we have special information, that these are precedence
  // predicates, we can resolve them without failing over to full
  // LL despite their context-sensitive nature. We make an
  // assumption that prec[-1] <= prec[0], meaning that the current
  // precedence level is greater than or equal to the precedence
  // level of recursive invocations above us in the stack. For
  // example, if predicate {3>=prec}? is true of the current prec,
  // then one option is to enter the loop to match it now. The
  // other option is to exit the loop and the left recursive rule
  // to match the current operator in rule invocation further up
  // the stack. But, we know that all of those prec are lower or
  // the same value, and so we can decide to enter the loop instead
  // of matching it later. That means we can strip out the other
  // configuration for the exit branch.
  //
  // So imagine we have (14,1,$,{2>=prec}?) and then
  // (14,2,$-dipsIntoOuterContext,{2>=prec}?). The optimization
  // allows us to collapse these two configurations. We know that
  // if {2>=prec}? is true for the current prec parameter, it will
  // also be true for any prec from an invoking e call, indicated
  // by dipsIntoOuterContext. As the predicates are both true, we
  // have the option to evaluate them early in the decision start
  // state. We do this by stripping both predicates and choosing to
  // enter the loop as it is consistent with the notion of operator
  // precedence. It's also how the full LL conflict resolution
  // would work.
  //
  // The solution requires a different DFA start state for each
  // precedence level.
  //
  // The basic filter mechanism is to remove configurations of the
  // form (p, 2, pi) if (p, 1, pi) exists for the same p and pi. In
  // other words, for the same ATN state and predicate context,
  // remove any configuration associated with an exit branch if
  // there is a configuration associated with the enter branch.
  //
  // It's also the case that the filter evaluates precedence
  // predicates and resolves conflicts according to precedence
  // levels. For example, for input 1+2+3 at the first +, we see
  // prediction filtering
  //
  // [(11,1,[$],{3>=prec}?), (14,1,[$],{2>=prec}?), (5,2,[$],up=1),
  //  (11,2,[$],up=1), (14,2,[$],up=1)],hasSemanticContext=true,dipsIntoOuterContext
  //
  // to
  //
  // [(11,1,[$]), (14,1,[$]), (5,2,[$],up=1)],dipsIntoOuterContext
  //
  // This filters because {3>=prec}? evals to true and collapses
  // (11,1,[$],{3>=prec}?) and (11,2,[$],up=1) since early conflict
  // resolution based upon rules of operator precedence fits with
  // our usual match first alt upon conflict.
  //
  // We noticed a problem where a recursive call resets precedence
  // to 0. Sam's fix: each config has flag indicating if it has
  // returned from an expr[0] call. then just don't filter any
  // config with that flag set. flag is carried along in
  // closure(). So to avoid adding field, set bit just under sign
  // bit of dipsIntoOuterContext (SUPPRESS_PRECEDENCE_FILTER).
  // With the change you filter "unless (p, 2, pi) was reached
  // after leaving the rule stop state of the LR rule containing
  // state p, corresponding to a rule invocation with precedence
  // level 0"

  /**
   * This method transforms the start state computed by
   * [computeStartState] to the special start state used by a
   * precedence DFA for a particular precedence value.
   *
   * The transformation process applies the following changes to
   * the start state's configuration set.
   *
   * 1. Evaluate the precedence predicates for each configuration using
   *    [SemanticContext.evalPrecedence].
   * 2. When [ATNConfig.isPrecedenceFilterSuppressed] is `false`,
   *    remove all configurations which predict an alternative greater than `1`,
   *    for which another configuration that predicts alternative `1` is in the
   *    same ATN state with the same prediction context. This transformation is
   *    valid for the following reasons:
   *
   *    * The closure block cannot contain any epsilon transitions which bypass
   *      the body of the closure, so all states reachable via alternative `1` are
   *      part of the precedence alternatives of the transformed left-recursive
   *      rule
   *    * The "primary" portion of a left recursive rule cannot contain an
   *      epsilon transition, so the only way an alternative other than `1` can exist
   *      in a state that is also reachable via alternative `1` is by nesting calls
   *      to the left-recursive rule, with the outer calls not being at the
   *      preferred precedence level.
   *      The [ATNConfig.isPrecedenceFilterSuppressed] property marks ATN
   *      configurations which do not meet this condition, and therefore are not
   *      eligible for elimination during the filtering process.
   *
   * The prediction context must be considered by this filter to address
   * situations like the following.
   *
   * ```
   * grammar TA;
   * prog: statement* EOF;
   * statement: letterA | statement letterA 'b' ;
   * letterA: 'a';
   * ```
   *
   * If the above grammar, the ATN state immediately before the token
   * reference `'a'` in `letterA` is reachable from the left edge
   * of both the primary and closure blocks of the left-recursive rule
   * `statement`. The prediction context associated with each of these
   * configurations distinguishes between them, and prevents the alternative
   * which stepped out to `prog` (and then back in to `statement`
   * from being eliminated by the filter).
   *
   * @param configs The configuration set computed by
   *   [computeStartState] as the start state for the DFA
   * @return The transformed configuration set representing the start state
   *   for a precedence DFA at a particular precedence level (determined by
   *   calling [Parser.precedence])
   */
  public open fun applyPrecedenceFilter(configs: ATNConfigSet): ATNConfigSet {
    val statesFromAlt1 = HashMap()
    val configSet = ATNConfigSet(configs.fullCtx)

    for (config in configs) {
      // Handle alt 1 first
      if (config.alt != 1) {
        continue
      }

      val updatedContext = config.semanticContext.evalPrecedence(parser!!, _outerContext!!)
        ?: continue // The configuration was eliminated

      statesFromAlt1[config.state.stateNumber] = config.context!!

      if (updatedContext !== config.semanticContext) {
        configSet.add(ATNConfig(config, updatedContext), mergeCache)
      } else {
        configSet.add(config, mergeCache)
      }
    }

    for (config in configs) {
      if (config.alt == 1) {
        // Already handled
        continue
      }

      if (!config.isPrecedenceFilterSuppressed) {
        // In the future, this elimination step could be updated to also
        // filter the prediction context for alternatives predicting alt>1
        // (basically a graph subtraction algorithm).
        val context = statesFromAlt1[config.state.stateNumber]

        if (context != null && context == config.context) {
          // Eliminated
          continue
        }
      }

      configSet.add(config, mergeCache)
    }

    return configSet
  }

  protected open fun getReachableTarget(trans: Transition, ttype: Int): ATNState? =
    if (trans.matches(ttype, 0, atn.maxTokenType)) {
      trans.target
    } else {
      null
    }

  public open fun getPredsForAmbigAlts(
    ambigAlts: BitSet,
    configs: ATNConfigSet,
    nAlts: Int,
  ): Array? {
    // REACH=[1|1|[]|0:0, 1|2|[]|0:1]
    //
    // altToPred starts as an array of all null contexts. The entry at index i
    // corresponds to alternative i. altToPred[i] may have one of three values:
    //   1. null: no ATNConfig c is found such that c.alt==i
    //   2. SemanticContext.NONE: At least one ATNConfig c exists such that
    //      c.alt==i and c.semanticContext==SemanticContext.NONE. In other words,
    //      alt i has at least one unpredicated config.
    //   3. Non-NONE Semantic Context: There exists at least one, and for all
    //      ATNConfig c such that c.alt==i, c.semanticContext!=SemanticContext.NONE.
    //
    // From this, it is clear that NONE||anything==NONE.
    val altToPred = arrayOfNulls(nAlts + 1)

    for (c in configs) {
      if (ambigAlts.get(c.alt)) {
        altToPred[c.alt] = SemanticContext.or(altToPred[c.alt], c.semanticContext)
      }
    }

    var nPredAlts = 0

    for (i in 1..nAlts) {
      if (altToPred[i] == null) {
        altToPred[i] = SemanticContext.Empty.Instance
      } else if (altToPred[i] !== SemanticContext.Empty.Instance) {
        nPredAlts++
      }
    }

    // non-ambig alts are null in altToPred
    if (nPredAlts == 0) {
      if (debug) {
        System.out.println("getPredsForAmbigAlts result null")
      }

      return null
    }

    if (debug) {
      System.out.println("getPredsForAmbigAlts result ${altToPred.joinToString()}")
    }

    return altToPred
  }

  protected open fun getPredicatePredictions(
    ambigAlts: BitSet?,
    altToPred: Array,
  ): Array? {
    val pairs = ArrayList()
    var containsPredicate = false

    for (i in 1.. 0) {
      alt = getAltThatFinishedDecisionEntryRule(semInvalidConfigs)

      if (alt != ATN.INVALID_ALT_NUMBER) {
        // Syntactically viable path exists
        return alt
      }
    }

    return ATN.INVALID_ALT_NUMBER
  }

  protected open fun getAltThatFinishedDecisionEntryRule(configs: ATNConfigSet): Int {
    val alts = IntervalSet()

    for (c in configs) {
      if (c.outerContextDepth > 0 || c.state is RuleStopState && c.context!!.hasEmptyPath()) {
        alts.add(c.alt)
      }
    }

    if (alts.size() == 0) {
      return ATN.INVALID_ALT_NUMBER
    }

    return alts.minElement
  }

  /**
   * Walk the list of configurations and split them according to
   * those that have preds evaluating to true/false.
   *
   * If no pred, assume true pred and include in succeeded set.
   * Returns a pair of sets.
   *
   * Create a new set so as not to alter the incoming parameter.
   *
   * Assumption: the input stream has been restored to the starting point
   * prediction, which is where predicates need to evaluate.
   */
  protected fun splitAccordingToSemanticValidity(
    configs: ATNConfigSet,
    outerContext: ParserRuleContext,
  ): Pair {
    val succeeded = ATNConfigSet(configs.fullCtx)
    val failed = ATNConfigSet(configs.fullCtx)

    for (c in configs) {
      if (c.semanticContext !== SemanticContext.Empty.Instance) {
        val predicateEvaluationResult = evalSemanticContext(c.semanticContext, outerContext, c.alt, configs.fullCtx)

        if (predicateEvaluationResult) {
          succeeded.add(c)
        } else {
          failed.add(c)
        }
      } else {
        succeeded.add(c)
      }
    }

    return Pair(succeeded, failed)
  }

  /**
   * Look through a list of predicate/alt pairs, returning alts for the
   * pairs that win.
   *
   * A `NONE` predicate indicates an alt containing an
   * unpredicated config which behaves as "always true".
   *
   * If [complete] is `false`, we stop at the first predicate that evaluates to `true`.
   * This includes pairs with `null` predicates.
   */
  public open fun evalSemanticContext(
    predPredictions: Array,
    outerContext: ParserRuleContext,
    complete: Boolean,
  ): BitSet {
    val predictions = BitSet()

    for (pair in predPredictions) {
      if (pair.pred === SemanticContext.Empty.Instance) {
        predictions.set(pair.alt)

        if (!complete) {
          break
        }

        continue
      }

      val fullCtx = false // In dfa
      val predicateEvaluationResult = evalSemanticContext(pair.pred, outerContext, pair.alt, fullCtx)

      if (debug || dfa_debug) {
        System.out.println("eval pred $pair=$predicateEvaluationResult")
      }

      if (predicateEvaluationResult) {
        if (debug || dfa_debug) {
          System.out.println("PREDICT ${pair.alt}")
        }

        predictions.set(pair.alt)

        if (!complete) {
          break
        }
      }
    }

    return predictions
  }

  /**
   * Evaluate a semantic context within a specific parser context.
   *
   * This method might not be called for every semantic context evaluated
   * during the prediction process. In particular, we currently do not
   * evaluate the following, but it may change in the future:
   *
   * * Precedence predicates (represented by [SemanticContext.PrecedencePredicate])
   *   are not currently evaluated through this method
   * * Operator predicates (represented by [SemanticContext.AND] and
   *   [SemanticContext.OR]) are evaluated as a single semantic
   *   context, rather than evaluating the operands individually.
   *   Implementations which require evaluation results from individual
   *   predicates should override this method to explicitly handle evaluation of
   *   the operands within operator predicates.
   *
   * @param pred The semantic context to evaluate
   * @param parserCallStack The parser context in which to evaluate the
   *   semantic context
   * @param alt The alternative which is guarded by [pred]
   * @param fullCtx `true` if the evaluation is occurring during LL
   *   prediction, otherwise `false` if the evaluation is occurring
   *   during SLL prediction
   *
   * @since 4.3
   */
  protected open fun evalSemanticContext(
    pred: SemanticContext,
    parserCallStack: ParserRuleContext?,
    alt: Int,
    fullCtx: Boolean,
  ): Boolean =
    pred.eval(parser!!, parserCallStack!!)

  // TODO: If we are doing predicates, there is no point in pursuing
  //  closure operations if we reach a DFA state that uniquely predicts
  //  alternative. We will not be caching that DFA state and it is a
  //  waste to pursue the closure. Might have to advance when we do
  //  ambig detection thought :(

  protected open fun closure(
    config: ATNConfig,
    configs: ATNConfigSet,
    closureBusy: MutableSet,
    collectPredicates: Boolean,
    fullCtx: Boolean,
    treatEofAsEpsilon: Boolean,
  ) {
    val initialDepth = 0
    closureCheckingStopState(
      config = config,
      configs = configs,
      closureBusy = closureBusy,
      collectPredicates = collectPredicates,
      fullCtx = fullCtx,
      depth = initialDepth,
      treatEofAsEpsilon = treatEofAsEpsilon,
    )

    assert(!fullCtx || !configs.dipsIntoOuterContext)
  }

  protected open fun closureCheckingStopState(
    config: ATNConfig,
    configs: ATNConfigSet,
    closureBusy: MutableSet,
    collectPredicates: Boolean,
    fullCtx: Boolean,
    depth: Int,
    treatEofAsEpsilon: Boolean,
  ) {
    if (trace_atn_sim) {
      System.out.println("closure(${config.toString(parser, true)})")
    }

    if (config.state is RuleStopState) {
      // We hit rule end. If we have context info, use it
      // run through all possible stack tops in ctx
      if (!config.context!!.isEmpty) {
        for (i in 0.. Int.MIN_VALUE)
          closureCheckingStopState(
            config = c,
            configs = configs,
            closureBusy = closureBusy,
            collectPredicates = collectPredicates,
            fullCtx = fullCtx,
            depth = depth - 1,
            treatEofAsEpsilon = treatEofAsEpsilon,
          )
        }

        return
      } else if (fullCtx) {
        // Reached end of start rule
        configs.add(config, mergeCache)
        return
      } else {
        // Else if we have no context info, just chase follow links (if greedy)
        if (debug) {
          System.out.println("FALLING off rule ${getRuleName(config.state.ruleIndex)}")
        }
      }
    }

    closure_(
      config = config,
      configs = configs,
      closureBusy = closureBusy,
      collectPredicates = collectPredicates,
      fullCtx = fullCtx,
      depth = depth,
      treatEofAsEpsilon = treatEofAsEpsilon,
    )
  }

  /**
   * Do the actual work of walking epsilon edges.
   */
  @Suppress("FunctionName")
  protected open fun closure_(
    config: ATNConfig,
    configs: ATNConfigSet,
    closureBusy: MutableSet,
    collectPredicates: Boolean,
    fullCtx: Boolean,
    depth: Int,
    treatEofAsEpsilon: Boolean,
  ) {
    val p = config.state

    // Optimization
    if (!p.onlyHasEpsilonTransitions()) {
      configs.add(config, mergeCache)
      // Make sure to not return here, because EOF transitions can act as
      // both epsilon transitions and non-epsilon transitions.
    }

    for (i in 0.. 0.

          if (_dfa != null && _dfa!!.isPrecedenceDfa) {
            val outermostPrecedenceReturn = (t as EpsilonTransition).outermostPrecedenceReturn

            if (outermostPrecedenceReturn == _dfa!!.atnStartState.ruleIndex) {
              c.isPrecedenceFilterSuppressed = true
            }
          }

          c.reachesIntoOuterContext++

          if (!closureBusy.add(c)) {
            // Avoid infinite recursion for right-recursive rules
            continue
          }

          // TODO: can remove? only care when we add to set per middle of this method
          configs.dipsIntoOuterContext = true
          assert(newDepth > Int.MIN_VALUE)
          newDepth--

          if (debug) {
            System.out.println("dips into outer ctx: $c")
          }
        } else {
          if (!t.isEpsilon && !closureBusy.add(c)) {
            // Avoid infinite recursion for EOF* and EOF+
            continue
          }

          if (t is RuleTransition) {
            // Latch when newDepth goes negative - once we step out of the entry context we can't return
            if (newDepth >= 0) {
              newDepth++
            }
          }
        }

        closureCheckingStopState(
          config = c,
          configs = configs,
          closureBusy = closureBusy,
          collectPredicates = continueCollecting,
          fullCtx = fullCtx,
          depth = newDepth,
          treatEofAsEpsilon = treatEofAsEpsilon,
        )
      }
    }
  }

  /**
   * Implements first-edge (loop entry) elimination as an optimization
   * during closure operations.
   *
   * See [antlr4#1398](https://github.com/antlr/antlr4/issues/1398).
   *
   * The optimization is to avoid adding the loop entry config when
   * the exit path can only lead back to the same
   * [StarLoopEntryState] after popping context at the rule end state
   * (traversing only epsilon edges, so we're still in closure, in
   * this same rule).
   *
   * We need to detect any state that can reach loop entry on
   * epsilon w/o exiting rule. We don't have to look at `FOLLOW`
   * links, just ensure that all stack tops for config refer to key
   * states in `LR` rule.
   *
   * To verify we are in the right situation we must first check
   * closure is at a [StarLoopEntryState] generated during `LR` removal.
   * Then we check that each stack top of context is a return state
   * from one of these cases:
   *
   * 1. 'not' expr, '(' type ')' expr. The return state points at loop entry state
   * 2. expr op expr. The return state is the block end of internal block of `(...)*`
   * 3. 'between' expr 'and' expr. The return state of 2nd expr reference.
   *    That state points at block end of internal block of (...)*
   * 4. expr '?' expr ':' expr. The return state points at block end,
   *    which points at loop entry state
   *
   * If any is true for each stack top, then [closure] does not add a
   * config to the current config set for `edge[0]`, the loop entry branch.
   *
   * Conditions fail if any context for the current config is:
   *
   * - empty (we'd fall out of expr to do a global `FOLLOW` which could
   *   even be to some weird spot in expr) or,
   * - lies outside expr or,
   * - lies within expr but at a state not the [BlockEndState]
   *   generated during LR removal
   *
   * Do we need to evaluate predicates ever in closure for this case?
   *
   * No. Predicates, including precedence predicates, are only
   * evaluated when computing a DFA start state. I.e., only before
   * the lookahead (but not parser) consumes a token.
   *
   * There are no epsilon edges allowed in `LR` rule alt blocks or in
   * the "primary" part (`ID` here). If closure is in
   * [StarLoopEntryState] any lookahead operation will have consumed a
   * token as there are no epsilon-paths that lead to
   * [StarLoopEntryState]. We do not have to evaluate predicates
   * therefore if we are in the generated [StarLoopEntryState] of an `LR`
   * rule. Note that when making a prediction starting at that
   * decision point, decision `d=2`, compute-start-state performs
   * closure starting at `edges[0]`, `edges[1]` emanating from
   * [StarLoopEntryState]. That means it is not performing closure on
   * [StarLoopEntryState] during compute-start-state.
   *
   * How do we know this always gives same prediction answer?
   *
   * Without predicates, loop entry and exit paths are ambiguous
   * upon remaining input `+b` (in, say, `a+b`). Either paths lead to
   * valid parses. Closure can lead to consuming `+` immediately or by
   * falling out of this call to expr back into expr and loop back
   * again to [StarLoopEntryState] to match `+b`. In this special case,
   * we choose the more efficient path, which is to take the bypass
   * path.
   *
   * The lookahead language has not changed because closure chooses
   * one path over the other. Both paths lead to consuming the same
   * remaining input during a lookahead operation. If the next token
   * is an operator, lookahead will enter the choice block with
   * operators. If it is not, lookahead will exit expr. Same as if
   * closure had chosen to enter the choice block immediately.
   *
   * Closure is examining one config (some loopentrystate, some alt,
   * context) which means it is considering exactly one alt. Closure
   * always copies the same alt to any derived configs.
   *
   * How do we know this optimization doesn't mess up precedence in
   * our parse trees?
   *
   * Looking through expr from left edge of stat only has to confirm
   * that an input, say, `a+b+c`, begins with any valid interpretation
   * of an expression. The precedence actually doesn't matter when
   * making a decision in stat seeing through expr. It is only when
   * parsing rule expr that we must use the precedence to get the
   * right interpretation and, hence, parse tree.
   *
   * @since 4.6
   */
  protected open fun canDropLoopEntryEdgeInLeftRecursiveRule(config: ATNConfig): Boolean {
    if (TURN_OFF_LR_LOOP_ENTRY_BRANCH_OPT) {
      return false
    }

    val p = config.state

    // First check to see if we are in StarLoopEntryState generated during
    // left-recursion elimination. For efficiency, also check if
    // the context has an empty stack case. If so, it would mean
    // global FOLLOW, so we can't perform optimization
    if (
      p.stateType != ATNState.STAR_LOOP_ENTRY ||
      !(p as StarLoopEntryState).isPrecedenceDecision ||  // Are we the special loop entry/exit state?
      config.context!!.isEmpty ||                         // If SLL wildcard
      config.context!!.hasEmptyPath()
    ) {
      return false
    }

    // Require all return states to return back to the same rule
    // that p is in.
    val numCtxs = config.context!!.size()

    // For each stack context
    for (i in 0..= 0) {
      return parser.ruleNames[index]
    }

    return ""
  }

  protected open fun getEpsilonTarget(
    config: ATNConfig,
    t: Transition,
    collectPredicates: Boolean,
    inContext: Boolean,
    fullCtx: Boolean,
    treatEofAsEpsilon: Boolean,
  ): ATNConfig? {
    when (t.serializationType) {
      Transition.RULE -> {
        return ruleTransition(config, t as RuleTransition)
      }
      Transition.PRECEDENCE -> {
        return precedenceTransition(
          config,
          t as PrecedencePredicateTransition,
          collectPredicates,
          inContext,
          fullCtx,
        )
      }
      Transition.PREDICATE -> {
        return predTransition(
          config, t as PredicateTransition,
          collectPredicates,
          inContext,
          fullCtx,
        )
      }
      Transition.ACTION -> {
        return actionTransition(config, t as ActionTransition)
      }
      Transition.EPSILON -> {
        return ATNConfig(config, t.target)
      }
      Transition.ATOM,
      Transition.RANGE,
      Transition.SET -> {
        // EOF transitions act like epsilon transitions after the first EOF
        // transition is traversed
        if (treatEofAsEpsilon) {
          if (t.matches(Token.EOF, 0, 1)) {
            return ATNConfig(config, t.target)
          }
        }

        return null
      }
      else -> return null
    }
  }

  protected open fun actionTransition(config: ATNConfig, t: ActionTransition): ATNConfig {
    if (debug) {
      System.out.println("ACTION edge ${t.ruleIndex}:${t.actionIndex}")
    }

    return ATNConfig(config, t.target)
  }

  public open fun precedenceTransition(
    config: ATNConfig,
    pt: PrecedencePredicateTransition,
    collectPredicates: Boolean,
    inContext: Boolean,
    fullCtx: Boolean,
  ): ATNConfig? {
    if (debug) {
      System.out.println("PRED (collectPredicates=$collectPredicates) ${pt.precedence}>=_p, ctx dependent=true")

      if (parser != null) {
        System.out.println("context surrounding pred is ${parser.getRuleInvocationStack()}")
      }
    }

    var c: ATNConfig? = null

    if (collectPredicates && inContext) {
      if (fullCtx) {
        // In full context mode, we can evaluate predicates on-the-fly
        // during closure, which dramatically reduces the size of
        // the config sets. It also obviates the need to test predicates
        // later during conflict resolution.
        val currentPosition = _input!!.index()
        _input!!.seek(_startIndex)

        val predSucceeds = evalSemanticContext(pt.predicate, _outerContext, config.alt, fullCtx)
        _input!!.seek(currentPosition)

        if (predSucceeds) {
          // No pred context
          c = ATNConfig(config, pt.target)
        }
      } else {
        val newSemCtx = SemanticContext.and(config.semanticContext, pt.predicate)
        c = ATNConfig(config, pt.target, newSemCtx!!)
      }
    } else {
      c = ATNConfig(config, pt.target)
    }

    if (debug) {
      System.out.println("config from pred transition=${c}")
    }

    return c
  }

  protected open fun predTransition(
    config: ATNConfig,
    pt: PredicateTransition,
    collectPredicates: Boolean,
    inContext: Boolean,
    fullCtx: Boolean,
  ): ATNConfig? {
    if (debug) {
      System.out.println(
        "PRED (collectPredicates=$collectPredicates)" +
          " ${pt.ruleIndex}:${pt.predIndex}" +
          ", ctx dependent=${pt.isCtxDependent}"
      )

      if (parser != null) {
        System.out.println("context surrounding pred is ${parser.getRuleInvocationStack()}")
      }
    }

    var c: ATNConfig? = null

    if (collectPredicates && (!pt.isCtxDependent || pt.isCtxDependent && inContext)) {
      if (fullCtx) {
        // In full context mode, we can evaluate predicates on-the-fly
        // during closure, which dramatically reduces the size of
        // the config sets. It also obviates the need to test predicates
        // later during conflict resolution.
        val currentPosition = _input!!.index()
        _input!!.seek(_startIndex)

        val predSucceeds = evalSemanticContext(pt.predicate, _outerContext, config.alt, fullCtx)
        _input!!.seek(currentPosition)

        if (predSucceeds) {
          // No pred context
          c = ATNConfig(config, pt.target)
        }
      } else {
        val newSemCtx = SemanticContext.and(config.semanticContext, pt.predicate)
        c = ATNConfig(config, pt.target, newSemCtx!!)
      }
    } else {
      c = ATNConfig(config, pt.target)
    }

    if (debug) {
      System.out.println("config from pred transition=${c}")
    }

    return c
  }

  protected open fun ruleTransition(config: ATNConfig, t: RuleTransition): ATNConfig {
    if (debug) {
      System.out.println("CALL rule ${getRuleName(t.target.ruleIndex)}, ctx=${config.context}")
    }

    val returnState = t.followState
    val newContext = SingletonPredictionContext.create(config.context, returnState.stateNumber)
    return ATNConfig(config, t.target, newContext)
  }

  /**
   * Gets a [BitSet] containing the alternatives in [configs]
   * which are part of one or more conflicting alternative subsets.
   *
   * @param configs The [ATNConfigSet] to analyze
   * @return The alternatives in [configs] which are part of one or more
   *   conflicting alternative subsets. If [configs] does not contain any
   *   conflicting subsets, this method returns an empty [BitSet]
   */
  protected open fun getConflictingAlts(configs: ATNConfigSet): BitSet {
    val altSets = PredictionMode.getConflictingAltSubsets(configs)
    return PredictionMode.getAlts(altSets)
  }

  /**
   * Sam pointed out a problem with the previous definition, v3, of
   * ambiguous states. If we have another state associated with conflicting
   * alternatives, we should keep going. For example, the following grammar:
   *
   * ```
   * s : (ID | ID ID?) ';' ;
   * ```
   *
   * When the ATN simulation reaches the state before `';'`, it has a DFA
   * state that looks like: `[12|1|[], 6|2|[], 12|2|[]]`.
   *
   * Naturally `12|1|[]` and `12|2|[]` conflict, but we cannot stop processing
   * this node because alternative to has another way to continue, via `[6|2|[]]`.
   *
   * The key is that we have a single state that has config's only associated
   * with a single alternative, `2`, and crucially the state transitions
   * among the configurations are all non-epsilon transitions. That means
   * we don't consider any conflicts that include alternative `2`. So, we
   * ignore the conflict between alts `1` and `2`. We ignore a set of
   * conflicting alts when there is an intersection with an alternative
   * associated with a single alt state in the state-config-list map.
   *
   * It's also the case that we might have two conflicting configurations but
   * also a 3rd non-conflicting configuration for a different alternative:
   * `[1|1|[], 1|2|[], 8|3|[]]`. This can come about from grammar:
   *
   * ```
   * a : A | A | A B ;
   * ```
   *
   * After matching input `A`, we reach the stop state for rule `A`, state `1`.
   * State `8` is the state right before `B`. Clearly alternatives `1` and `2`
   * conflict and no amount of further lookahead will separate the two.
   * However, alternative `3` will be able to continue, and so we do not
   * stop working on this state. In the previous example, we're concerned
   * with states associated with the conflicting alternatives. Here alt
   * `3` is not associated with the conflicting configs, but since we can continue
   * looking for input reasonably, I don't declare the state done. We
   * ignore a set of conflicting alts when we have an alternative
   * that we still need to pursue.
   */
  protected open fun getConflictingAltsOrUniqueAlt(configs: ATNConfigSet): BitSet {
    val conflictingAlts: BitSet

    if (configs.uniqueAlt != ATN.INVALID_ALT_NUMBER) {
      conflictingAlts = BitSet()
      conflictingAlts.set(configs.uniqueAlt)
    } else {
      conflictingAlts = configs.conflictingAlts!!
    }

    return conflictingAlts
  }

  public open fun getTokenName(t: Int): String {
    if (t == Token.EOF) {
      return "EOF"
    }

    val vocabulary = parser?.vocabulary ?: VocabularyImpl.EMPTY_VOCABULARY
    val displayName = vocabulary.getDisplayName(t)

    if (displayName == t.toString()) {
      return displayName
    }

    return "$displayName<$t>"
  }

  public open fun getLookaheadName(input: TokenStream): String =
    getTokenName(input.LA(1))

  /**
   * Used for debugging in [adaptivePredict] around [execATN] but I cut
   * it out for clarity now that alg. works well. We can leave this
   * "dead" code for a bit.
   */
  public open fun dumpDeadEndConfigs(ex: NoViableAltException) {
    System.err.println("dead end configs: ")

    for (c in ex.deadEndConfigs!!) {
      var trans = "no edges"

      if (c.state.numberOfTransitions > 0) {
        val t = c.state.transition(0)

        if (t is AtomTransition) {
          trans = "Atom ${getTokenName(t.label)}"
        } else if (t is SetTransition) {
          val not = t is NotSetTransition
          trans = "${if (not) "~" else ""}Set ${t.set}"
        }
      }

      System.err.println("${c.toString(parser, true)}:$trans")
    }
  }

  protected open fun noViableAlt(
    input: TokenStream,
    outerContext: ParserRuleContext,
    configs: ATNConfigSet,
    startIndex: Int,
  ): NoViableAltException =
    NoViableAltException(
      recognizer = parser!!,
      input = input,
      startToken = input[startIndex],
      offendingToken = input.LT(1),
      deadEndConfigs = configs,
      ctx = outerContext,
    )

  /**
   * Add an edge to the DFA, if possible.
   *
   * This method calls [addDFAState] to ensure the [to] state is present
   * in the DFA. If [from] is `null`, or if [t] is outside the
   * range of edges that can be represented in the DFA tables, this method
   * returns without adding the edge to the DFA.
   *
   * If [to] is `null`, this method returns `null`.
   * Otherwise, this method returns the [DFAState] returned by
   * calling [addDFAState] for the [to] state.
   *
   * @param dfa The DFA
   * @param from The source state for the edge
   * @param t The input symbol
   * @param to The target state for the edge
   * @return If [to] is `null` this method returns `null`,
   *   otherwise this method returns the result of calling
   *   [addDFAState] on [to]
   */
  protected fun addDFAEdge(dfa: DFA, from: DFAState?, t: Int, to: DFAState?): DFAState? {
    if (debug) {
      System.out.println("EDGE $from -> $to upon ${getTokenName(t)}")
    }

    if (to == null) {
      return null
    }

    // Used existing if possible not incoming
    val tto = addDFAState(dfa, to)

    if (from == null || t < -1 || t > atn.maxTokenType) {
      return tto
    }

    synchronized(from) {
      if (from.edges == null) {
        from.edges = arrayOfNulls(atn.maxTokenType + 1 + 1)
      }

      // Connect
      from.edges!![t + 1] = tto
    }

    if (debug) {
      System.out.println("DFA=\n${dfa.toString(parser?.vocabulary ?: VocabularyImpl.EMPTY_VOCABULARY)}")
    }

    return tto
  }

  /**
   * Add state [D] to the DFA if it is not already present, and return
   * the actual instance stored in the DFA.
   *
   * If a state equivalent to [D] is already in the DFA,
   * the existing state is returned. Otherwise, this
   * method returns [D] after adding it to the DFA.
   *
   * If [D] is [ATNSimulator.ERROR], this method returns [ATNSimulator.ERROR]
   * and does not change the DFA.
   *
   * @param dfa The DFA
   * @param D The DFA state to add
   * @return The state stored in the DFA. This will be either the existing
   *   state if [D] is already in the DFA, or [D] itself if the
   *   state was not already present
   */
  @Suppress("LocalVariableName")
  protected open fun addDFAState(dfa: DFA, D: DFAState): DFAState {
    if (D === ERROR) {
      return D
    }

    synchronized(dfa.states) {
      val existing = dfa.states[D]

      if (existing != null) {
        if (trace_atn_sim) {
          System.out.println("addDFAState $D exists")
        }

        return existing
      }

      D.stateNumber = dfa.states.size

      if (!D.configs.isReadonly) {
        D.configs.optimizeConfigs(this)
        D.configs.isReadonly = true
      }

      if (trace_atn_sim) {
        System.out.println("addDFAState new $D")
      }

      dfa.states[D] = D
      return D
    }
  }

  protected open fun reportAttemptingFullContext(
    dfa: DFA,
    conflictingAlts: BitSet,
    configs: ATNConfigSet,
    startIndex: Int,
    stopIndex: Int,
  ) {
    if (debug || retry_debug) {
      val interval = Interval.of(startIndex, stopIndex)
      System.out.println(
        "reportAttemptingFullContext" +
          " decision=${dfa.decision}:$configs" +
          ", input=${parser?.tokenStream?.getText(interval)}"
      )
    }

    parser?.errorListenerDispatch?.reportAttemptingFullContext(
      recognizer = parser,
      dfa = dfa,
      startIndex = startIndex,
      stopIndex = stopIndex,
      conflictingAlts = conflictingAlts,
      configs = configs,
    )
  }

  protected open fun reportContextSensitivity(
    dfa: DFA,
    prediction: Int,
    configs: ATNConfigSet,
    startIndex: Int,
    stopIndex: Int,
  ) {
    if (debug || retry_debug) {
      val interval = Interval.of(startIndex, stopIndex)
      System.out.println(
        "reportContextSensitivity" +
          " decision=${dfa.decision}:$configs" +
          ", input=${parser?.tokenStream?.getText(interval)}"
      )
    }

    parser?.errorListenerDispatch?.reportContextSensitivity(
      recognizer = parser,
      dfa = dfa,
      startIndex = startIndex,
      stopIndex = stopIndex,
      prediction = prediction,
      configs = configs,
    )
  }

  /**
   * If context-sensitive parsing, we know it's ambiguity not conflict.
   */
  @Suppress("LocalVariableName")
  protected open fun reportAmbiguity(
    dfa: DFA,
    D: DFAState, // The DFA state from execATN() that had SLL conflicts
    startIndex: Int, stopIndex: Int,
    exact: Boolean,
    ambigAlts: BitSet,
    configs: ATNConfigSet, // Configs that LL not SLL considered conflicting
  ) {
    if (debug || retry_debug) {
      val interval = Interval.of(startIndex, stopIndex)
      System.out.println("reportAmbiguity $ambigAlts:$configs, input=${parser?.tokenStream?.getText(interval)}")
    }

    parser?.errorListenerDispatch?.reportAmbiguity(
      recognizer = parser,
      dfa = dfa,
      startIndex = startIndex,
      stopIndex = stopIndex,
      exact = exact,
      ambigAlts = ambigAlts,
      configs = configs,
    )
  }
}




© 2015 - 2024 Weber Informatics LLC | Privacy Policy