Example 11 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class SerializationFactoryFinder method tripleSerializationFactory.

public static SerializationFactory<Triple> tripleSerializationFactory() {
    return new SerializationFactory<Triple>() {

        @Override
        public Sink<Triple> createSerializer(OutputStream out) {
            return new SinkTripleOutput(out, null, NodeToLabel.createBNodeByLabelEncoded());
        }

        @Override
        public Iterator<Triple> createDeserializer(InputStream in) {
            Tokenizer tokenizer = TokenizerFactory.makeTokenizerASCII(in);
            ParserProfile profile = RiotLib.createParserProfile(
                    RiotLib.factoryRDF(LabelToNode.createUseLabelEncoded()),
                    ErrorHandlerFactory.errorHandlerNoWarnings,
                    IRIResolver.createNoResolve(),
                    false);
            LangNTriples parser = new LangNTriples(tokenizer, profile, null);
            return parser;
        }

        @Override
        public long getEstimatedMemorySize(Triple item) {
            // TODO: memory size estimation is not implemented; 0 is a placeholder.
            return 0;
        }
    };
}
Also used : Triple(org.apache.jena.graph.Triple) LangNTriples(org.apache.jena.riot.lang.LangNTriples) SinkTripleOutput(org.apache.jena.riot.out.SinkTripleOutput) BindingInputStream(org.apache.jena.sparql.engine.binding.BindingInputStream) InputStream(java.io.InputStream) OutputStream(java.io.OutputStream) BindingOutputStream(org.apache.jena.sparql.engine.binding.BindingOutputStream) SerializationFactory(org.apache.jena.atlas.data.SerializationFactory) Tokenizer(org.apache.jena.riot.tokens.Tokenizer)

Example 12 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class TestLangRdfJson method rdfjson_invalid_tokenizer.

@Test(expected = IllegalArgumentException.class)
public void rdfjson_invalid_tokenizer() {
    byte[] b = StrUtils.asUTF8bytes("");
    ByteArrayInputStream in = new ByteArrayInputStream(b);
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerUTF8(in);
    StreamRDFCounting sink = StreamRDFLib.count();
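    // The test passes only if creating the parser with this tokenizer throws IllegalArgumentException.
    LangRDFJSON parser = RiotParsers.createParserRdfJson(tokenizer, sink, RiotLib.dftProfile());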
}
Also used : ByteArrayInputStream(java.io.ByteArrayInputStream) Tokenizer(org.apache.jena.riot.tokens.Tokenizer) Test(org.junit.Test) BaseTest(org.apache.jena.atlas.junit.BaseTest)

Example 13 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class TestParserFactory method ntriples_01.

@Test
public void ntriples_01() {
    {
        String s = "<x> <p> <q> .";
        CatchParserOutput sink = parseCapture(s, Lang.NT);
        assertEquals(1, sink.startCalled);
        assertEquals(1, sink.finishCalled);
        assertEquals(1, sink.triples.size());
        assertEquals(0, sink.quads.size());
        Triple t = SSE.parseTriple("(<x> <p> <q>)");
        assertEquals(t, last(sink.triples));
    }
    // Old style, direct to LangRIOT -- very deprecated.
    // NQ version tests that relative URIs remain relative. 
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerString("<x> <p> <q> .");
    CatchParserOutput sink = new CatchParserOutput();
    ParserProfile profile = makeParserProfile(IRIResolver.createNoResolve(), null, false);
    LangRIOT parser = RiotParsers.createParserNTriples(tokenizer, sink, profile);
    parser.parse();
    assertEquals(1, sink.startCalled);
    assertEquals(1, sink.finishCalled);
    assertEquals(1, sink.triples.size());
    assertEquals(0, sink.quads.size());
    assertEquals(SSE.parseTriple("(<x> <p> <q>)"), last(sink.triples));
}
Also used : Triple(org.apache.jena.graph.Triple) Tokenizer(org.apache.jena.riot.tokens.Tokenizer) Test(org.junit.Test) BaseTest(org.apache.jena.atlas.junit.BaseTest)

Example 14 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class NodeFactoryExtra method parseNode.

/**
 * Parse a string into a node.
 * <p>
 * Allows surrounding white space.
 * </p>
 *
 * @param nodeString Node string to parse
 * @param pmap Prefix Map, null to use no prefix mappings
 * @return Parsed Node
 * @throws RiotException Thrown if a valid node cannot be parsed
 */
public static Node parseNode(String nodeString, PrefixMap pmap) {
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerString(nodeString);
    if (!tokenizer.hasNext())
        throw new RiotException("Empty RDF term");
    Token token = tokenizer.next();
    Node node = token.asNode(pmap);
    if (node == null)
        throw new RiotException("Bad RDF Term: " + nodeString);
    if (tokenizer.hasNext())
        throw new RiotException("Trailing characters in string: " + nodeString);
    if (node.isURI()) {
        // Lightly test for bad URIs.
        String x = node.getURI();
        if (x.indexOf(' ') >= 0)
            throw new RiotException("Space(s) in  IRI: " + nodeString);
    }
    return node;
}
Also used : RiotException(org.apache.jena.riot.RiotException) Node(org.apache.jena.graph.Node) Token(org.apache.jena.riot.tokens.Token) Tokenizer(org.apache.jena.riot.tokens.Tokenizer)
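The control flow of parseNode (reject empty input, take one token, reject anything left over) can be illustrated without Jena. The following is only a minimal sketch of that pattern: the class SingleTermCheck, its whitespace-splitting tokens helper, and the use of IllegalArgumentException are all hypothetical stand-ins, not part of the Jena API, and the real Tokenizer recognises RDF term syntax rather than splitting on white space.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical, JDK-only illustration of the "exactly one token" check.
final class SingleTermCheck {

    // Stand-in for a real tokenizer: splits on white space only (assumption).
    static Iterator<String> tokens(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty())
            return Collections.emptyIterator();
        return Arrays.asList(trimmed.split("\\s+")).iterator();
    }

    // Mirrors the order of checks in parseNode: empty input, then trailing input.
    static String parseSingleTerm(String input) {
        Iterator<String> it = tokens(input);
        if (!it.hasNext())
            throw new IllegalArgumentException("Empty term");
        String term = it.next();
        if (it.hasNext())
            throw new IllegalArgumentException("Trailing characters in string: " + input);
        return term;
    }

    public static void main(String[] args) {
        // Surrounding white space is allowed; a second token is not.
        System.out.println(parseSingleTerm("  <http://example/x>  "));
    }
}
```

Checking hasNext() both before and after next() is what lets the method distinguish "nothing to parse" from "more than one term", which is why the original reports two different error messages.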

Example 15 with Tokenizer

use of org.apache.jena.riot.tokens.Tokenizer in project jena by apache.

the class TestLangNTuples method tokenizer.

protected static Tokenizer tokenizer(String string) {
    // UTF-8
    byte[] b = StrUtils.asUTF8bytes(string);
    ByteArrayInputStream in = new ByteArrayInputStream(b);
    Tokenizer tokenizer = TokenizerFactory.makeTokenizerUTF8(in);
    return tokenizer;
}
Also used : ByteArrayInputStream(java.io.ByteArrayInputStream) Tokenizer(org.apache.jena.riot.tokens.Tokenizer)
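The string-to-stream step in this helper can be tried with the JDK alone. The sketch below assumes that StrUtils.asUTF8bytes performs a plain UTF-8 encode, so it substitutes String.getBytes(StandardCharsets.UTF_8); the class Utf8Input and its methods are hypothetical names used only for illustration, and the Jena tokenizer itself is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical, JDK-only illustration of feeding a String to a byte-based reader.
final class Utf8Input {

    // Stand-in for StrUtils.asUTF8bytes + ByteArrayInputStream (assumed equivalent).
    static InputStream utf8Stream(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new ByteArrayInputStream(bytes);
    }

    // Reads the whole stream back as UTF-8 to show that the text round-trips.
    static String readUtf8(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1)
                out.write(buf, 0, n);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readUtf8(utf8Stream("<x> <p> <q> .")));
    }
}
```

Naming the charset explicitly avoids depending on the platform default encoding, which matters when a test string contains non-ASCII characters.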

Aggregations

Tokenizer (org.apache.jena.riot.tokens.Tokenizer): 16
BaseTest (org.apache.jena.atlas.junit.BaseTest): 4
Token (org.apache.jena.riot.tokens.Token): 4
Test (org.junit.Test): 4
ByteArrayInputStream (java.io.ByteArrayInputStream): 3
InputStream (java.io.InputStream): 3
Node (org.apache.jena.graph.Node): 3
Triple (org.apache.jena.graph.Triple): 3
OutputStream (java.io.OutputStream): 2
SerializationFactory (org.apache.jena.atlas.data.SerializationFactory): 2
ErrorHandlerEx (org.apache.jena.riot.ErrorHandlerTestLib.ErrorHandlerEx): 2
RiotException (org.apache.jena.riot.RiotException): 2
Quad (org.apache.jena.sparql.core.Quad): 2
BindingInputStream (org.apache.jena.sparql.engine.binding.BindingInputStream): 2
BindingOutputStream (org.apache.jena.sparql.engine.binding.BindingOutputStream): 2
Timer (org.apache.jena.atlas.lib.Timer): 1
LabelToNode (org.apache.jena.riot.lang.LabelToNode): 1
LangNQuads (org.apache.jena.riot.lang.LangNQuads): 1
LangNTriples (org.apache.jena.riot.lang.LangNTriples): 1
SinkQuadOutput (org.apache.jena.riot.out.SinkQuadOutput): 1