Example 16 with Token

use of org.omegat.util.Token in project omegat by omegat-org.

the class GlossarySearcher method tokenize.

private Token[] tokenize(String str, List<Tag> tags) {
    Token[] tokens = tokenize(str);
    if (tags.isEmpty()) {
        return tokens;
    }
    // Keep only the tokens that do not fall inside a tag
    List<Token> result = new ArrayList<>(tokens.length);
    for (Token tok : tokens) {
        if (!tokenInTag(tok, tags)) {
            result.add(tok);
        }
    }
    return result.toArray(new Token[result.size()]);
}
Also used : ArrayList(java.util.ArrayList) Token(org.omegat.util.Token)
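
The helper tokenInTag is not shown in this snippet. The following is a minimal, self-contained sketch of what such a filter might look like. It assumes that tokens and tags can both be modelled as plain offset/length spans and that a token counts as "in a tag" when the tag's span fully covers it. The Span class, the containment rule, and the name dropTokensInTags are hypothetical and are not taken from OmegaT.

```java
import java.util.ArrayList;
import java.util.List;

final class TagFilterSketch {
    /** Hypothetical stand-in for both Token and Tag: the span [offset, offset + length). */
    static final class Span {
        final int offset;
        final int length;

        Span(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }

        int end() {
            return offset + length;
        }
    }

    /** Assumed rule: a token is "in" a tag when some tag's span fully covers it. */
    static boolean tokenInTag(Span tok, List<Span> tags) {
        for (Span tag : tags) {
            if (tag.offset <= tok.offset && tok.end() <= tag.end()) {
                return true;
            }
        }
        return false;
    }

    /** Returns the tokens that do not fall inside any tag. */
    static List<Span> dropTokensInTags(List<Span> tokens, List<Span> tags) {
        if (tags.isEmpty()) {
            return tokens;
        }
        List<Span> result = new ArrayList<>(tokens.size());
        for (Span tok : tokens) {
            if (!tokenInTag(tok, tags)) {
                result.add(tok);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Text "see <b>this</b>": tokens "see", "b", "this", "b"; tags "<b>" and "</b>".
        List<Span> tokens = List.of(new Span(0, 3), new Span(5, 1), new Span(7, 4), new Span(13, 1));
        List<Span> tags = List.of(new Span(4, 3), new Span(11, 4));
        for (Span s : dropTokensInTags(tokens, tags)) {
            System.out.println(s.offset + "+" + s.length);
        }
    }
}
```

In the usage above, the two "b" tokens lie inside the tag spans and are dropped, while "see" and "this" are kept.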

Example 17 with Token

use of org.omegat.util.Token in project omegat by omegat-org.

the class TransTipsPopup method addItems.

public void addItems(final JPopupMenu menu, JTextComponent comp, final int mousepos, boolean isInActiveEntry, boolean isInActiveTranslation, SegmentBuilder sb) {
    if (!Core.getEditor().getSettings().isMarkGlossaryMatches()) {
        return;
    }
    if (!isInActiveEntry || isInActiveTranslation) {
        return;
    }
    // is mouse in active entry's source ?
    final int startSource = sb.getStartSourcePosition();
    int len = sb.getSourceText().length();
    if (mousepos < startSource || mousepos > startSource + len) {
        return;
    }
    Set<String> added = new HashSet<>();
    for (GlossaryEntry ge : GlossaryTextArea.nowEntries) {
        for (Token[] toks : Core.getGlossaryManager().searchSourceMatchTokens(sb.getSourceTextEntry(), ge)) {
            for (Token tok : toks) {
                // is inside found word ?
                if (startSource + tok.getOffset() <= mousepos && mousepos <= startSource + tok.getOffset() + tok.getLength()) {
                    // Create the MenuItems
                    for (String s : ge.getLocTerms(true)) {
                        if (!added.contains(s)) {
                            JMenuItem it = menu.add(s);
                            it.addActionListener(e -> Core.getEditor().insertText(s));
                            added.add(s);
                        }
                    }
                }
            }
        }
    }
}
Also used : Token(org.omegat.util.Token) JMenuItem(javax.swing.JMenuItem) HashSet(java.util.HashSet)
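
The "is inside found word" condition above is an inclusive range check: the token offset is relative to the source segment, so it is shifted by the segment's start position before being compared with the mouse position. A minimal sketch of that check in isolation follows. The class and method names are hypothetical, and plain ints stand in for the Token and SegmentBuilder accessors.

```java
final class TokenHitTestSketch {
    /**
     * Returns true when a position in document coordinates falls on a token whose
     * offset is relative to the start of the source segment. Both ends of the
     * range are inclusive, mirroring the condition in addItems.
     */
    static boolean isOverToken(int startSource, int tokenOffset, int tokenLength, int mousepos) {
        int tokenStart = startSource + tokenOffset;
        int tokenEnd = tokenStart + tokenLength;
        return tokenStart <= mousepos && mousepos <= tokenEnd;
    }

    public static void main(String[] args) {
        // Segment starts at document position 100; the token spans positions 104..109.
        System.out.println(isOverToken(100, 4, 5, 104)); // true: first character of the token
        System.out.println(isOverToken(100, 4, 5, 109)); // true: position just after the last character
        System.out.println(isOverToken(100, 4, 5, 110)); // false: past the token
        System.out.println(isOverToken(100, 4, 5, 103)); // false: before the token
    }
}
```

Because the upper bound is inclusive, a position immediately after the last character of the token still counts as a hit.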

Example 18 with Token

use of org.omegat.util.Token in project omegat by omegat-org.

the class LevenshteinDistance method compute.

/**
 * Compute Levenshtein distance between two lists.
 * <p> The difference between this impl. and the canonical one is that,
 * rather than creating and retaining a matrix of size s.length()+1 by
 * t.length()+1, we maintain two single-dimensional arrays of length
 * s.length()+1.
 * <p> The first, d, is the 'current working' distance array that maintains
 * the newest distance cost counts as we iterate through the characters of
 * String s. Each time we increment the index of String t we are comparing,
 * d is copied to p, the second int[]. Doing so allows us to retain the
 * previous cost counts as required by the algorithm (taking the minimum of
 * the cost count to the left, up one, and diagonally up and to the left of
 * the current cost count being calculated).
 * <p> (Note that the arrays aren't really copied anymore, just switched...
 * this is clearly much better than cloning an array or doing a
 * System.arraycopy() each time through the outer loop.)
 * <p> Effectively, the difference between the two implementations is this
 * one does not cause an out of memory condition when calculating the LD
 * over two very large strings.
 * <p> For performance reasons the maximal number of compared items is {@link
 * #MAX_N}.
 */
public int compute(Token[] s, Token[] t) {
    if (s == null || t == null) {
        throw new IllegalArgumentException(OStrings.getString("LD_NULL_ARRAYS_ERROR"));
    }
    // length of s
    int n = s.length;
    // length of t
    int m = t.length;
    if (n == 0) {
        return m;
    } else if (m == 0) {
        return n;
    }
    if (n > MAX_N) {
        n = MAX_N;
    }
    if (m > MAX_N) {
        m = MAX_N;
    }
    // placeholder to assist in swapping p and d
    short[] swap;
    // indexes into strings s and t
    // iterates through s
    short i;
    // iterates through t
    short j;
    // jth object of t
    Token t_j = null;
    // cost
    short cost;
    for (i = 0; i <= n; i++) {
        p[i] = i;
    }
    for (j = 1; j <= m; j++) {
        t_j = t[j - 1];
        d[0] = j;
        // ith object of s
        Token s_i = null;
        for (i = 1; i <= n; i++) {
            s_i = s[i - 1];
            cost = s_i.equals(t_j) ? (short) 0 : (short) 1;
            // minimum of cell to the left+1, to the top+1, diagonally left
            // and up +cost
            d[i] = minimum(d[i - 1] + 1, p[i] + 1, p[i - 1] + cost);
        }
        // copy current distance counts to 'previous row' distance counts
        swap = p;
        p = d;
        d = swap;
    }
    // our last action in the above loop was to switch d and p, so p now
    // actually has the most recent cost counts
    return p[n];
}
Also used : Token(org.omegat.util.Token)
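
The snippet above relies on the arrays p and d, the constant MAX_N, and the helper minimum, none of which are shown here. The following is a self-contained sketch of the same two-row technique that can be run on its own. It deliberately differs from the method above: it allocates the two rows on each call, uses int instead of short, works on any element type via equals, and applies no MAX_N cap. The class and method names are hypothetical.

```java
final class TwoRowLevenshteinSketch {
    /** Edit distance between two sequences, keeping only the previous and current rows. */
    static <T> int distance(T[] s, T[] t) {
        if (s == null || t == null) {
            throw new IllegalArgumentException("arrays must not be null");
        }
        int n = s.length;
        int m = t.length;
        if (n == 0) {
            return m;
        }
        if (m == 0) {
            return n;
        }
        int[] p = new int[n + 1]; // 'previous row' of cost counts
        int[] d = new int[n + 1]; // 'current row' of cost counts
        for (int i = 0; i <= n; i++) {
            p[i] = i;
        }
        for (int j = 1; j <= m; j++) {
            d[0] = j;
            for (int i = 1; i <= n; i++) {
                int cost = s[i - 1].equals(t[j - 1]) ? 0 : 1;
                // minimum of: cell to the left + 1, cell above + 1, diagonal + cost
                d[i] = Math.min(Math.min(d[i - 1] + 1, p[i] + 1), p[i - 1] + cost);
            }
            // swap the rows instead of copying them
            int[] swap = p;
            p = d;
            d = swap;
        }
        // after the final swap, p holds the most recent cost counts
        return p[n];
    }

    public static void main(String[] args) {
        String[] a = {"the", "quick", "brown", "fox"};
        String[] b = {"the", "slow", "brown", "fox", "jumps"};
        // One substitution (quick -> slow) plus one insertion (jumps).
        System.out.println(distance(a, b));
    }
}
```

Memory use is proportional to the length of the first sequence rather than to the product of both lengths, which is the point made in the Javadoc above.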

Example 19 with Token

use of org.omegat.util.Token in project omegat by omegat-org.

the class AutoCompleterListView method getLastToken.

protected String getLastToken(String text) {
    String token = "";
    ITokenizer tokenizer = getTokenizer();
    Token[] tokens = tokenizer.tokenizeVerbatim(text);
    if (tokens.length != 0) {
        // Take everything from the start of the last token to the end of the text
        Token lastToken = tokens[tokens.length - 1];
        String lastString = text.substring(lastToken.getOffset()).trim();
        if (!lastString.isEmpty()) {
            token = lastString;
        }
    }
    return token;
}
Also used : ITokenizer(org.omegat.tokenizer.ITokenizer) Token(org.omegat.util.Token)
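
To try the same idea without an OmegaT tokenizer, the sketch below substitutes java.text.BreakIterator for ITokenizer. This is an assumption made for illustration: it treats every span between two word boundaries, whitespace included, as a token, which may not match what tokenizeVerbatim returns for a given language. The class and method names are hypothetical.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class LastTokenSketch {
    /** Returns the trimmed text of the last word-boundary span, or "" if there is none. */
    static String lastToken(String text) {
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        words.setText(text);
        words.last();                 // move to the end of the text
        int start = words.previous(); // start of the last span, or DONE for empty text
        if (start == BreakIterator.DONE) {
            return "";
        }
        return text.substring(start).trim();
    }

    public static void main(String[] args) {
        System.out.println("[" + lastToken("hello wor") + "]");
        System.out.println("[" + lastToken("hello ") + "]");
        System.out.println("[" + lastToken("") + "]");
    }
}
```

As in getLastToken, a text that ends in whitespace yields an empty string, because the last span is the whitespace itself and trim() removes it.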

