Example 1 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class SerializationFactoryFinder method quadSerializationFactory.

public static SerializationFactory<Quad> quadSerializationFactory() {
    return new SerializationFactory<Quad>() {

        @Override
        public Sink<Quad> createSerializer(OutputStream out) {
            return new SinkQuadOutput(out, null, NodeToLabel.createBNodeByLabelEncoded());
        }

        @Override
        public Iterator<Quad> createDeserializer(InputStream in) {
            Tokenizer tokenizer = TokenizerFactory.makeTokenizerASCII(in);
            ParserProfile profile = RiotLib.createParserProfile(RiotLib.factoryRDF(LabelToNode.createUseLabelEncoded()), ErrorHandlerFactory.errorHandlerNoWarnings, IRIResolver.createNoResolve(), false);
            LangNQuads parser = new LangNQuads(tokenizer, profile, null);
            return parser;
        }

        @Override
        public long getEstimatedMemorySize(Quad item) {
            // TODO
            return 0;
        }
    };
}
Also used : Quad(org.apache.jena.sparql.core.Quad) BindingInputStream(org.apache.jena.sparql.engine.binding.BindingInputStream) InputStream(java.io.InputStream) OutputStream(java.io.OutputStream) BindingOutputStream(org.apache.jena.sparql.engine.binding.BindingOutputStream) SinkQuadOutput(org.apache.jena.riot.out.SinkQuadOutput) LangNQuads(org.apache.jena.riot.lang.LangNQuads) SerializationFactory(org.apache.jena.atlas.data.SerializationFactory) Tokenizer(org.apache.jena.riot.tokens.Tokenizer)

Example 2 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class RiotLib method parse.

/** Parse a string to get one Node (the first token in the string) */
public static Node parse(String string) {
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerString(string);
    if (!tokenizer.hasNext())
        return null;
    Token t = tokenizer.next();
    Node n = profile.create(null, t);
    if (tokenizer.hasNext())
        Log.warn(RiotLib.class, "String has more than one token in it: " + string);
    return n;
}
Also used : LabelToNode(org.apache.jena.riot.lang.LabelToNode) Node(org.apache.jena.graph.Node) Token(org.apache.jena.riot.tokens.Token) Tokenizer(org.apache.jena.riot.tokens.Tokenizer)
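
The control flow in Example 2 (take the first token, return null when there is none, warn when input is left over) can be tried without Jena on the classpath. The following is a minimal sketch under stated assumptions: `FirstTokenSketch`, `tokens`, and `firstToken` are hypothetical names, and the whitespace splitter is only a stand-in for the RIOT `Tokenizer`, which applies real RDF lexical rules.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

class FirstTokenSketch {

    /** Hypothetical stand-in for a tokenizer: splits the input on whitespace. */
    static Iterator<String> tokens(String input) {
        String trimmed = input.trim();
        // Guard against "".split(...), which would yield one empty token.
        if (trimmed.isEmpty())
            return Collections.emptyIterator();
        return Arrays.asList(trimmed.split("\\s+")).iterator();
    }

    /** Returns the first token, or null if the input holds no tokens. */
    static String firstToken(String input) {
        Iterator<String> it = tokens(input);
        if (!it.hasNext())
            return null;
        String first = it.next();
        // Same policy as Example 2: extra tokens are reported, not rejected.
        if (it.hasNext())
            System.err.println("String has more than one token in it: " + input);
        return first;
    }
}
```

Under these assumptions, `FirstTokenSketch.firstToken("<x> <p> <q>")` returns `<x>` and writes the warning to standard error, while a blank string returns null.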

Example 3 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class TestParserFactory method nquads_01.

@Test
public void nquads_01() {
    {
        String s = "<x> <p> <q> <g> .";
        CatchParserOutput sink = parseCapture(s, Lang.NQ);
        assertEquals(1, sink.startCalled);
        assertEquals(1, sink.finishCalled);
        assertEquals(0, sink.triples.size());
        assertEquals(1, sink.quads.size());
        Quad q = SSE.parseQuad("(<g> <x> <p> <q>)");
        assertEquals(q, last(sink.quads));
    }
    // Old style, deprecated.
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerString("<x> <p> <q> <g>.");
    CatchParserOutput sink = new CatchParserOutput();
    ParserProfile x = makeParserProfile(IRIResolver.createNoResolve(), null, false);
    LangRIOT parser = RiotParsers.createParserNQuads(tokenizer, sink, x);
    parser.parse();
    assertEquals(1, sink.startCalled);
    assertEquals(1, sink.finishCalled);
    assertEquals(0, sink.triples.size());
    assertEquals(1, sink.quads.size());
    Quad q = SSE.parseQuad("(<g> <x> <p> <q>)");
    assertEquals(q, last(sink.quads));
}
Also used : Quad(org.apache.jena.sparql.core.Quad) Tokenizer(org.apache.jena.riot.tokens.Tokenizer) Test(org.junit.Test) BaseTest(org.apache.jena.atlas.junit.BaseTest)

Example 4 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class TestParserFactory method turtle_01.

@Test
public void turtle_01() {
    // Verify the expected output works.
    {
        String s = "<x> <p> <q> .";
        CatchParserOutput sink = parseCapture(s, Lang.TTL);
        assertEquals(1, sink.startCalled);
        assertEquals(1, sink.finishCalled);
        assertEquals(1, sink.triples.size());
        assertEquals(0, sink.quads.size());
        Triple t = SSE.parseTriple("(<http://base/x> <http://base/p> <http://base/q>)");
        assertEquals(t, last(sink.triples));
    }
    // Old style, deprecated.
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerString("<x> <p> <q> .");
    CatchParserOutput sink = new CatchParserOutput();
    ParserProfile maker = makeParserProfile(IRIResolver.create("http://base/"), null, true);
    LangRIOT parser = RiotParsers.createParserTurtle(tokenizer, sink, maker);
    parser.parse();
    assertEquals(1, sink.startCalled);
    assertEquals(1, sink.finishCalled);
    assertEquals(1, sink.triples.size());
    assertEquals(0, sink.quads.size());
    assertEquals(SSE.parseTriple("(<http://base/x> <http://base/p> <http://base/q>)"), last(sink.triples));
}
Also used : Triple(org.apache.jena.graph.Triple) Tokenizer(org.apache.jena.riot.tokens.Tokenizer) Test(org.junit.Test) BaseTest(org.apache.jena.atlas.junit.BaseTest)

Example 5 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class TestTurtleTerms method parse.

public static void parse(String testString) {
    // Need to access the prefix mapping.
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerString(testString);
    StreamRDF sink = StreamRDFLib.sinkNull();
    LangTurtle parser = RiotParsers.createParserTurtle(tokenizer, sink, RiotLib.dftProfile());
    PrefixMap prefixMap = parser.getProfile().getPrefixMap();
    prefixMap.add("a", "http://host/a#");
    prefixMap.add("x", "http://host/a#");
    // Unicode 00E9 is e-acute
    // Unicode 03B1 is alpha
    prefixMap.add("é", "http://host/e-acute/");
    prefixMap.add("α", "http://host/alpha/");
    prefixMap.add("", "http://host/");
    prefixMap.add("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
    prefixMap.add("xsd", "http://www.w3.org/2001/XMLSchema#");
    parser.parse();
    tokenizer.close();
}
Also used : Tokenizer(org.apache.jena.riot.tokens.Tokenizer)

Aggregations

Tokenizer (org.apache.jena.riot.tokens.Tokenizer): 16
BaseTest (org.apache.jena.atlas.junit.BaseTest): 4
Token (org.apache.jena.riot.tokens.Token): 4
Test (org.junit.Test): 4
ByteArrayInputStream (java.io.ByteArrayInputStream): 3
InputStream (java.io.InputStream): 3
Node (org.apache.jena.graph.Node): 3
Triple (org.apache.jena.graph.Triple): 3
OutputStream (java.io.OutputStream): 2
SerializationFactory (org.apache.jena.atlas.data.SerializationFactory): 2
ErrorHandlerEx (org.apache.jena.riot.ErrorHandlerTestLib.ErrorHandlerEx): 2
RiotException (org.apache.jena.riot.RiotException): 2
Quad (org.apache.jena.sparql.core.Quad): 2
BindingInputStream (org.apache.jena.sparql.engine.binding.BindingInputStream): 2
BindingOutputStream (org.apache.jena.sparql.engine.binding.BindingOutputStream): 2
Timer (org.apache.jena.atlas.lib.Timer): 1
LabelToNode (org.apache.jena.riot.lang.LabelToNode): 1
LangNQuads (org.apache.jena.riot.lang.LangNQuads): 1
LangNTriples (org.apache.jena.riot.lang.LangNTriples): 1
SinkQuadOutput (org.apache.jena.riot.out.SinkQuadOutput): 1