
com.optimaize.langdetect.i18n.LdLocale

package com.optimaize.langdetect.i18n;

import com.google.common.base.Optional;
import com.google.common.base.Splitter;
import org.jetbrains.annotations.NotNull;

import java.util.List;

/**
 * A language-detector implementation of a Locale, similar to the java.util.Locale.
 *
 * <p>It represents an IETF BCP 47 tag, but does not implement all the features. Features can be added as needed.</p>
 *
 * <p>It is constructed through the {@link #fromString} factory method. The {@link #toString()} method
 * produces a parseable and persistable string.</p>
 *
 * <p>The class is immutable.</p>
 *
 * <p>The java.util.Locale cannot be used because it has issues for historical reasons, notably the
 * legacy language code conversion for Hebrew, Yiddish and Indonesian, and more. If one needs a Locale,
 * it is simple to create one based on this object.<br/>
 * The ICU ULocale cannot be used because a) it has issues too (for our use case) and b) we're not
 * using ICU in here [yet].</p>
 *
 * <p>This class does not perform any modifications on the input. The input is used as is, and the getters
 * return it in exactly the same way. No standardization, canonicalization, cleaning.</p>
 *
 * <p>The input is validated syntactically, but not for code existence. For example the script code must
 * have the form of an ISO 15924 code like "Latn" or "Cyrl", in correct case. But whether the code exists or not
 * is not checked. These code standards are not fixed, simply because regional entities like countries can change
 * for political reasons, and languages are living entities. Therefore certain codes may exist at some point in
 * time only (be introduced late, or be deprecated or removed, or even be re-assigned another meaning).
 * It is not up to us to decide whether Kosovo is a country in 2015 or not.
 * If one needs to only work with a certain range of acceptable codes, one can validate the codes through other
 * classes that have knowledge about the codes.</p>
 *
 * <p>Language: as for BCP 47, the ISO 639-1 code must be used if there is one. For example "fr" for French.
 * If not, the ISO 639-3 code should be used. It is highly discouraged to use 639-2.
 * Right now this class enforces a 2 or 3 char code, but this may be relaxed in the future.</p>
 *
 * <p>Script: Only ISO 15924, no discussion.</p>
 *
 * <p>Region: same as for BCP 47. That means ISO 3166-1 alpha-2 and "UN M.49".
 * I can imagine relaxing it in the future to also allow 3166-2 codes.
 * In most cases the "region" is a "country".</p>
 *
 * @author fabian kessler
 */
public final class LdLocale {

    @NotNull
    private final String language;
    @NotNull
    private final Optional<String> script;
    @NotNull
    private final Optional<String> region;

    private LdLocale(@NotNull String language, @NotNull Optional<String> script, @NotNull Optional<String> region) {
        this.language = language;
        this.script = script;
        this.region = region;
    }

    /**
     * @param string The output of the toString() method.
     * @return either a new or possibly a cached (immutable) instance.
     */
    @NotNull
    public static LdLocale fromString(@NotNull String string) {
        if (string == null || string.isEmpty()) throw new IllegalArgumentException("At least a language is required!");

        String language = null;
        Optional<String> script = null;
        Optional<String> region = null;

        List<String> strings = Splitter.on('-').splitToList(string);
        for (int i=0; i [...] >>"+chunk+"<< [...] >>"+s+"<< [...]

    @NotNull
    public Optional<String> getScript() {
        return script;
    }

    /**
     * @return ISO 3166-1 or UN M.49 code, e.g. "DE" or 150, see class header.
     */
    @NotNull
    public Optional<String> getRegion() {
        return region;
    }

    @Override //generated-code
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        LdLocale ldLocale = (LdLocale) o;

        if (!language.equals(ldLocale.language)) return false;
        if (!region.equals(ldLocale.region)) return false;
        if (!script.equals(ldLocale.script)) return false;

        return true;
    }

    @Override //generated-code
    public int hashCode() {
        int result = language.hashCode();
        result = 31 * result + script.hashCode();
        result = 31 * result + region.hashCode();
        return result;
    }
}
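The class comment describes a purely syntactic check: a 2 or 3 letter language, an optional ISO 15924 script, and an optional ISO 3166-1 alpha-2 or UN M.49 region, joined by '-'. The following is a minimal stand-alone sketch of that idea, not the library's code. The class name `TagSyntaxSketch`, its regular expressions, and the use of `java.util.Optional` instead of Guava's `Optional` are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical sketch of syntax-only tag parsing; NOT the library's LdLocale. */
final class TagSyntaxSketch {
    // Assumed shapes, derived from the rules stated in the class comment.
    private static final Pattern LANGUAGE = Pattern.compile("[a-z]{2,3}");        // ISO 639-1 / 639-3 shape
    private static final Pattern SCRIPT = Pattern.compile("[A-Z][a-z]{3}");       // ISO 15924 shape, e.g. "Latn"
    private static final Pattern REGION = Pattern.compile("[A-Z]{2}|[0-9]{3}");   // ISO 3166-1 alpha-2 or UN M.49

    final String language;
    final Optional<String> script;
    final Optional<String> region;

    private TagSyntaxSketch(String language, Optional<String> script, Optional<String> region) {
        this.language = language;
        this.script = script;
        this.region = region;
    }

    /** Parses "language[-Script][-REGION]"; checks syntax only, never whether a code exists. */
    static TagSyntaxSketch parse(String tag) {
        if (tag == null || tag.isEmpty()) {
            throw new IllegalArgumentException("At least a language is required!");
        }
        String[] parts = tag.split("-", -1); // limit -1 keeps trailing empty chunks so "de-" is rejected
        if (!LANGUAGE.matcher(parts[0]).matches()) {
            throw new IllegalArgumentException("Invalid language code syntax: " + parts[0]);
        }
        Optional<String> script = Optional.empty();
        Optional<String> region = Optional.empty();
        for (int i = 1; i < parts.length; i++) {
            String chunk = parts[i];
            if (!script.isPresent() && !region.isPresent() && SCRIPT.matcher(chunk).matches()) {
                script = Optional.of(chunk);      // a script may only come before the region
            } else if (!region.isPresent() && REGION.matcher(chunk).matches()) {
                region = Optional.of(chunk);
            } else {
                throw new IllegalArgumentException("Unknown part: " + chunk);
            }
        }
        return new TagSyntaxSketch(parts[0], script, region);
    }

    /** Produces a string that parse() accepts again, with the input left unmodified. */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(language);
        script.ifPresent(s -> sb.append('-').append(s));
        region.ifPresent(r -> sb.append('-').append(r));
        return sb.toString();
    }
}
```

With these assumed patterns, `TagSyntaxSketch.parse("de-Latn-CH")` and `TagSyntaxSketch.parse("en-150")` succeed, while `"de-CH-Latn"` is rejected because the script follows the region. A made-up but well-formed tag such as `"zz-Xxxx"` is accepted, which matches the stated policy of not checking code existence.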

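The class comment notes that a java.util.Locale is simple to create from this object. One possible way, sketched here under assumptions, is the JDK's `Locale.Builder`. The helper `LocaleBridgeSketch.toJavaLocale` is hypothetical and takes the three parts as plain values with `java.util.Optional`; code working with the real class would read them from `getLanguage()`, `getScript()` and `getRegion()`, which return Guava `Optional` values for script and region.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper; not part of the library. */
final class LocaleBridgeSketch {

    /**
     * Builds a java.util.Locale from the three tag parts.
     * Locale.Builder throws IllformedLocaleException if a part is not well-formed.
     */
    static Locale toJavaLocale(String language, Optional<String> script, Optional<String> region) {
        Locale.Builder builder = new Locale.Builder().setLanguage(language);
        script.ifPresent(builder::setScript);   // e.g. "Latn"
        region.ifPresent(builder::setRegion);   // e.g. "CH" or "150"
        return builder.build();
    }
}
```

For example, `LocaleBridgeSketch.toJavaLocale("de", Optional.of("Latn"), Optional.of("CH")).toLanguageTag()` yields `"de-Latn-CH"`. Depending on the JDK version, `Locale.getLanguage()` may still report the legacy codes for Hebrew, Yiddish and Indonesian, which is the issue the class comment gives for not using java.util.Locale directly.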


