Search in sources :

Example 46 with LinkedVector

use of edu.illinois.cs.cogcomp.lbjava.parse.LinkedVector in project cogcomp-nlp by CogComp.

the class WordSplitter method main.

/**
     * Run this program on a file containing plain text, and it will produce
     * the same text on <code>STDOUT</code> rearranged so that each line
     * contains exactly one sentence, and so that character sequences deemed to
     * be "words" are delimited by whitespace.
     * <p/>
     * <p> Usage:
     * <code> java edu.illinois.cs.cogcomp.lbjava.edu.illinois.cs.cogcomp.lbjava.nlp.WordSplitter &lt;file name&gt; </code>
     *
     * @param args The command line arguments.
     **/
public static void main(String[] args) {
    String filename = null;
    try {
        filename = args[0];
        if (args.length > 1)
            throw new Exception();
    } catch (Exception e) {
        System.err.println("usage: java edu.illinois.cs.cogcomp.lbjava.edu.illinois.cs.cogcomp.lbjava.nlp.WordSplitter <file name>");
        System.exit(1);
    }
    WordSplitter splitter = new WordSplitter(new SentenceSplitter(filename));
    for (LinkedVector s = (LinkedVector) splitter.next(); s != null; s = (LinkedVector) splitter.next()) {
        if (s.size() > 0) {
            Word w = (Word) s.get(0);
            System.out.print(w.form);
            for (w = (Word) w.next; w != null; w = (Word) w.next) System.out.print(" " + w.form);
        }
        System.out.println();
    }
}
Also used : LinkedVector(edu.illinois.cs.cogcomp.lbjava.parse.LinkedVector)

Aggregations

LinkedVector (edu.illinois.cs.cogcomp.lbjava.parse.LinkedVector)46 Word (edu.illinois.cs.cogcomp.lbjava.nlp.Word)9 NEWord (edu.illinois.cs.cogcomp.ner.LbjTagger.NEWord)9 ArrayList (java.util.ArrayList)8 Vector (java.util.Vector)8 IntPair (edu.illinois.cs.cogcomp.core.datastructures.IntPair)3 Sentence (edu.illinois.cs.cogcomp.lbjava.nlp.Sentence)3 SentenceSplitter (edu.illinois.cs.cogcomp.lbjava.nlp.SentenceSplitter)3 NERDocument (edu.illinois.cs.cogcomp.ner.LbjTagger.NERDocument)3 File (java.io.File)3 HashMap (java.util.HashMap)3 Sentence (edu.illinois.cs.cogcomp.core.datastructures.textannotation.Sentence)2 Token (edu.illinois.cs.cogcomp.lbjava.nlp.seg.Token)2 OutFile (edu.illinois.cs.cogcomp.ner.IO.OutFile)2 Matcher (java.util.regex.Matcher)2 ChunkLabel (edu.illinois.cs.cogcomp.chunker.main.lbjava.ChunkLabel)1 Chunker (edu.illinois.cs.cogcomp.chunker.main.lbjava.Chunker)1 Pair (edu.illinois.cs.cogcomp.core.datastructures.Pair)1 SpanLabelView (edu.illinois.cs.cogcomp.core.datastructures.textannotation.SpanLabelView)1 TextAnnotation (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation)1