Use of org.apache.tika.parser.ParsingReader in project tika by apache.
Class Tika, method parse.
/**
 * Parses the given document and returns the extracted text content.
 * Input metadata like a file name or a content type hint can be passed
 * in the given metadata instance. Metadata information extracted from
 * the document is returned in that same metadata instance.
 * <p>
 * The returned reader will be responsible for closing the given stream.
 * The stream and any associated resources will be closed at or before
 * the time when the {@link Reader#close()} method is called.
 *
 * @param stream the document to be parsed
 * @param metadata where document's metadata will be populated
 * @return extracted text content
 * @throws IOException if the document cannot be read or parsed
 */
public Reader parse(InputStream stream, Metadata metadata) throws IOException {
    ParseContext context = new ParseContext();
    // Make the parser available to components that look it up in the context,
    // for example when handling embedded documents.
    context.set(Parser.class, parser);
    // The ParsingReader takes ownership of the stream from here on.
    return new ParsingReader(parser, stream, metadata, context);
}
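The ownership rule in the Javadoc above (the caller closes only the returned reader, and the reader closes the stream) can be illustrated without Tika on the classpath. The following is a minimal sketch under stated assumptions, not Tika's implementation: `ReaderOwnershipSketch`, `TrackingStream`, and the local `parse` helper are hypothetical names introduced here, and a plain `InputStreamReader` stands in for `ParsingReader` because it also closes its underlying stream on `close()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReaderOwnershipSketch {

    /** Hypothetical stream that records whether close() was called. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /**
     * Hypothetical stand-in for Tika.parse(InputStream, Metadata).
     * The returned reader owns the stream: closing the reader closes it.
     */
    static Reader parse(InputStream stream) {
        return new InputStreamReader(stream, StandardCharsets.UTF_8);
    }

    /** Drains the reader and closes only the reader, never the stream directly. */
    static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try (Reader r = reader) {
            int n;
            while ((n = r.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        TrackingStream stream =
                new TrackingStream("extracted text".getBytes(StandardCharsets.UTF_8));
        String text = readAll(parse(stream));
        System.out.println(text);
        System.out.println("stream closed: " + stream.closed);
    }
}
```

With the real API the calling pattern would look the same: wrap the reader returned by `parse` in try-with-resources and do not close the `InputStream` separately.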