All Downloads are FREE. Search and download functionalities are using the official Maven repository.

com.google.gwt.dev.util.editdistance.GeneralEditDistance Maven / Gradle / Ivy

/*
 * Copyright 2010 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"); you may not
 * use this file except in compliance with the License. You may obtain a copy of
 * the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations under
 * the License.
 */
package com.google.gwt.dev.util.editdistance;

/**
 * An engine definition for computing string edit distances, and
 * implementation generators.
 *
 * Generally, edit distance is the minimum number of edit operations required
 * to transform one string into another.  Being a minimum transform, it
 * forms a metric space (one that is non-negative, symmetric, and obeys
 * triangle inequality).
 *
 * The most common form is Levenshtein distance, where the set of operations
 * consists of:
 *   deleting one character;
 *   inserting one character; or,
 *   substituting one character for another,
 * each having a "cost" of 1.
 *
 * Another common form is Damerau-Levenshtein distance, which adds
 *   transpose two adjacent characters
 * to the set of operations.  Note that many implementations of
 * Damerau distance do not account for insertion after transposition;
 * this "restricted" Damerau distance does not obey triangle inequality,
 * but is substantially easier to compute than the unrestricted metric.
 *
 * Another variation is to apply differential costs for the various
 * operations.  For example, substitutions could be weighted by
 * character type (e.g., vowel<=>vowel being a cheaper edit than
 * vowel<=>consonant).
 *
 * This class defines an engine for computing edit distance from
 * a fixed ("pattern") string to other ("target") strings.  The pattern
 * is fixed at instantiation time; this allows an implementation to
 * perform precomputations to make individual target computations faster.
 *
 * To allow further performance optimization, the GeneralEditDistance
 * contract does not require thread-safety.  A duplicate method is
 * defined to allow additional equivalent instances to be created
 * for separate threads; it is essentially a type-knowledgeable clone.
 */
public interface GeneralEditDistance {
  /**
   * Creates a duplicate (clone) of this distance engine, typically
   * in order to use it in another thread (as the GeneralEditDistance
   * contract does not specify thread safety).
   *
   * This method differs from Object.clone() in two ways:
   * it is typed (to GeneralEditDistance); and, it must not throw
   * an exception.
   *
   * @return     a clone of this instance.
   */
  GeneralEditDistance duplicate();

  /**
   * Computes edit distance from the pattern string (used to
   * construct this GeneralEditDistance instance) to a given target
   * string, bounded by a limit of interest.
   *
   * The specified limit must be non-negative.  If the actual distance
   * exceeds the limit, this method may return *any* value above the limit;
   * it need not be the actual distance.  The behavior of
   * illegal limit values is undefined; an implementation may
   * return an incorrect result or raise a runtime exception.
   *
   * @param target string for which distance is to be compared
   *        against a predefined pattern
   * @param limit maximum distance for which exact result is
   *        required (and beyond which any over-limit result
   *        can be returned)
   * @return edit distance from the predefined pattern to the
   *        target (or any value above limit, if the actual distance
   *        exceeds the prescribed limit)
   */
  int getDistance(CharSequence target, int limit);
}




© 2015 - 2024 Weber Informatics LLC | Privacy Policy