Search in sources :

Example 31 with CSVParser

use of org.apache.commons.csv.CSVParser in project tika by apache.

the class ISATabUtils method parseAssay.

public static void parseAssay(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context) throws IOException, TikaException, SAXException {
    TikaInputStream tis = TikaInputStream.get(stream);
    // Automatically detect the character encoding
    TikaConfig tikaConfig = context.get(TikaConfig.class);
    if (tikaConfig == null) {
        tikaConfig = TikaConfig.getDefaultConfig();
    }
    try (AutoDetectReader reader = new AutoDetectReader(new CloseShieldInputStream(tis), metadata, tikaConfig.getEncodingDetector());
        CSVParser csvParser = new CSVParser(reader, CSVFormat.TDF)) {
        xhtml.startElement("table");
        Iterator<CSVRecord> iterator = csvParser.iterator();
        xhtml.startElement("thead");
        if (iterator.hasNext()) {
            CSVRecord record = iterator.next();
            for (int i = 0; i < record.size(); i++) {
                xhtml.startElement("th");
                xhtml.characters(record.get(i));
                xhtml.endElement("th");
            }
        }
        xhtml.endElement("thead");
        xhtml.startElement("tbody");
        while (iterator.hasNext()) {
            CSVRecord record = iterator.next();
            xhtml.startElement("tr");
            for (int j = 0; j < record.size(); j++) {
                xhtml.startElement("td");
                xhtml.characters(record.get(j));
                xhtml.endElement("td");
            }
            xhtml.endElement("tr");
        }
        xhtml.endElement("tbody");
        xhtml.endElement("table");
    }
}
Also used : TikaConfig(org.apache.tika.config.TikaConfig) AutoDetectReader(org.apache.tika.detect.AutoDetectReader) CSVParser(org.apache.commons.csv.CSVParser) TikaInputStream(org.apache.tika.io.TikaInputStream) CSVRecord(org.apache.commons.csv.CSVRecord) CloseShieldInputStream(org.apache.commons.io.input.CloseShieldInputStream)

Aggregations

CSVParser (org.apache.commons.csv.CSVParser)31 CSVRecord (org.apache.commons.csv.CSVRecord)19 StringReader (java.io.StringReader)17 Test (org.junit.Test)16 PhoenixConnection (org.apache.phoenix.jdbc.PhoenixConnection)15 CSVCommonsLoader (org.apache.phoenix.util.CSVCommonsLoader)15 PreparedStatement (java.sql.PreparedStatement)11 ResultSet (java.sql.ResultSet)11 IOException (java.io.IOException)5 File (java.io.File)3 FileReader (java.io.FileReader)3 ArrayList (java.util.ArrayList)3 SQLException (java.sql.SQLException)2 HashMap (java.util.HashMap)2 CloseShieldInputStream (org.apache.commons.io.input.CloseShieldInputStream)2 TikaConfig (org.apache.tika.config.TikaConfig)2 AutoDetectReader (org.apache.tika.detect.AutoDetectReader)2 TikaInputStream (org.apache.tika.io.TikaInputStream)2 ImmutableTable (com.google.common.collect.ImmutableTable)1 JsonArray (com.google.gson.JsonArray)1