Example 11 with StreamSource

Use of org.opensolaris.opengrok.analysis.StreamSource in project OpenGrok by OpenGrok.

From the class SourceSplitterTest, method shouldHandleStreamedDocsOfLongerLength.

@Test
public void shouldHandleStreamedDocsOfLongerLength() throws IOException {
    // Line-start offsets in INPUT: 0, 4, 9, 15, 20 (total length 22).
    final String INPUT = "ab\r\ncde\r\nefgh\r\nijk\r\nlm";
    StreamSource src = StreamSource.fromString(INPUT);
    SourceSplitter splitter = new SourceSplitter();
    splitter.reset(src);
    assertEquals("split count", 5, splitter.count());
    assertEquals("split position", 0, splitter.getPosition(0));
    assertEquals("split position", 4, splitter.getPosition(1));
    assertEquals("split position", 9, splitter.getPosition(2));
    assertEquals("split position", 15, splitter.getPosition(3));
    assertEquals("split position", 20, splitter.getPosition(4));
    assertEquals("split position", 22, splitter.getPosition(5));
    /*
     * Test findLineOffset() for every character against an alternate
     * computation that counts every LF.
     */
    for (int i = 0; i < splitter.originalLength(); ++i) {
        char c = INPUT.charAt(i);
        int off = splitter.findLineOffset(i);
        long numLF = INPUT.substring(0, i + 1).chars().filter(ch -> ch == '\n').count();
        long exp = numLF - (c == '\n' ? 1 : 0);
        assertEquals("split find-offset of " + i, exp, off);
    }
}
Also used : Assert.assertArrayEquals(org.junit.Assert.assertArrayEquals) StreamSource(org.opensolaris.opengrok.analysis.StreamSource) IOException(java.io.IOException) Test(org.junit.Test) Assert.assertEquals(org.junit.Assert.assertEquals)
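
The positions asserted in the test above can be reproduced with a small standalone sketch. It is not OpenGrok's SourceSplitter: the class LineOffsetSketch and its methods lineStarts and findLineIndex are hypothetical names introduced here for illustration, and the sketch assumes that only LF terminates a line (so a CRLF pair ends its line at the LF) and that a final entry equal to the input length is appended.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of OpenGrok: records where each line starts. */
final class LineOffsetSketch {

    /**
     * Returns the starting offset of every line, followed by one final entry
     * equal to the input length. Assumption: only LF ends a line, so an input
     * that ends in LF yields an empty last line.
     */
    static int[] lineStarts(String input) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == '\n') {
                starts.add(i + 1); // the next line begins right after the LF
            }
        }
        starts.add(input.length()); // end-of-input sentinel
        return starts.stream().mapToInt(Integer::intValue).toArray();
    }

    /**
     * Returns the zero-based index of the line containing the given character
     * offset, or -1 if the offset is outside the input.
     */
    static int findLineIndex(int[] starts, int offset) {
        int lineCount = starts.length - 1;
        if (offset < 0 || offset >= starts[lineCount]) {
            return -1;
        }
        // Binary search for the last line whose start is <= offset.
        int lo = 0;
        int hi = lineCount - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= offset) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] starts = lineStarts("ab\r\ncde\r\nefgh\r\nijk\r\nlm");
        System.out.println(Arrays.toString(starts)); // prints [0, 4, 9, 15, 20, 22]
        System.out.println(findLineIndex(starts, 8)); // prints 1 (the LF ending "cde")
    }
}
```

In this sketch a line terminator belongs to the line it ends, which matches the test's expectation that the offset of a '\n' maps to the number of LFs before it.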

Example 12 with StreamSource

Use of org.opensolaris.opengrok.analysis.StreamSource in project OpenGrok by OpenGrok.

From the class StreamUtils, method readTagsFromResource.

public static Definitions readTagsFromResource(String tagsResourceName, String rawResourceName, int tabSize) throws IOException {
    InputStream res = StreamUtils.class.getClassLoader().getResourceAsStream(tagsResourceName);
    assertNotNull(tagsResourceName + " as resource", res);
    BufferedReader in = new BufferedReader(new InputStreamReader(res, "UTF-8"));
    CtagsReader rdr = new CtagsReader();
    rdr.setTabSize(tabSize);
    if (rawResourceName != null) {
        rdr.setSplitterSupplier(() -> {
            /*
             * This should return truly raw content, as the CtagsReader will
             * expand tabs according to its setting.
             */
            SourceSplitter splitter = new SourceSplitter();
            StreamSource src = sourceFromEmbedded(rawResourceName);
            try {
                splitter.reset(src);
            } catch (IOException ex) {
                System.err.println(ex.toString());
                return null;
            }
            return splitter;
        });
    }
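    String line;
    try {
        while ((line = in.readLine()) != null) {
            rdr.readLine(line);
        }
    } finally {
        // Close the reader (and the underlying resource stream) when done.
        in.close();
    }
    return rdr.getDefinitions();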
}
Also used : InputStreamReader(java.io.InputStreamReader) BufferedInputStream(java.io.BufferedInputStream) InputStream(java.io.InputStream) StreamSource(org.opensolaris.opengrok.analysis.StreamSource) BufferedReader(java.io.BufferedReader) IOException(java.io.IOException) CtagsReader(org.opensolaris.opengrok.analysis.CtagsReader)
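
The comment in the supplier above says the splitter must receive raw content because CtagsReader expands tabs itself. As a rough illustration of why expanding twice would shift column positions, here is a minimal sketch of tab expansion to fixed tab stops. It is not CtagsReader's implementation: the class TabExpansionSketch and the method expandTabs are hypothetical, and the sketch assumes tab stops at every multiple of tabSize.

```java
/** Hypothetical helper, not OpenGrok's implementation: expands tabs to tab stops. */
final class TabExpansionSketch {

    /**
     * Replaces each tab with enough spaces to reach the next tab stop.
     * Assumption: tab stops sit at every multiple of tabSize columns.
     */
    static String expandTabs(String line, int tabSize) {
        if (tabSize <= 0) {
            throw new IllegalArgumentException("tabSize must be positive");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\t') {
                // Pad from the current output column up to the next tab stop.
                int pad = tabSize - (out.length() % tabSize);
                for (int k = 0; k < pad; k++) {
                    out.append(' ');
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "a" occupies column 0, so the tab pads to column 4 before "b".
        System.out.println("[" + expandTabs("a\tb", 4) + "]"); // prints [a   b]
    }
}
```

Because the expanded line is longer than the raw one, offsets computed on expanded text no longer line up with offsets in the raw source, which is one reason to hand the reader unexpanded content.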

Aggregations

StreamSource (org.opensolaris.opengrok.analysis.StreamSource): 12
InputStream (java.io.InputStream): 6
Test (org.junit.Test): 6
BufferedInputStream (java.io.BufferedInputStream): 4
IOException (java.io.IOException): 3
Field (org.apache.lucene.document.Field): 2
BufferedReader (java.io.BufferedReader): 1
ByteArrayInputStream (java.io.ByteArrayInputStream): 1
InputStreamReader (java.io.InputStreamReader): 1
Reader (java.io.Reader): 1
StringWriter (java.io.StringWriter): 1
GZIPInputStream (java.util.zip.GZIPInputStream): 1
CharTermAttribute (org.apache.lucene.analysis.tokenattributes.CharTermAttribute): 1
OffsetAttribute (org.apache.lucene.analysis.tokenattributes.OffsetAttribute): 1
Document (org.apache.lucene.document.Document): 1
CBZip2InputStream (org.apache.tools.bzip2.CBZip2InputStream): 1
Assert.assertArrayEquals (org.junit.Assert.assertArrayEquals): 1
Assert.assertEquals (org.junit.Assert.assertEquals): 1
CtagsReader (org.opensolaris.opengrok.analysis.CtagsReader): 1
Definitions (org.opensolaris.opengrok.analysis.Definitions): 1