Example 6 with TsvParser

Use of com.univocity.parsers.tsv.TsvParser in project eol-globi-data by jhpoelen.

Class StudyImporterForHurlbert, method importStudy.

@Override
public void importStudy() throws StudyImporterException {
    InputStream resource;
    try {
        resource = getDataset().getResource(RESOURCE);
    } catch (IOException e) {
        throw new StudyImporterException("failed to access [" + RESOURCE + "]", e);
    }
    Set<String> regions = new HashSet<String>();
    Set<String> locales = new HashSet<String>();
    Set<String> habitats = new HashSet<String>();
    TsvParserSettings settings = new TsvParserSettings();
    settings.getFormat().setLineSeparator("\n");
    // Treat the first row as column headers so fields can be read by name.
    settings.setHeaderExtractionEnabled(true);
    TsvParser parser = new TsvParser(settings);
    parser.beginParsing(resource, CharsetConstant.UTF8);
    String columnNameSource = "Source";
    try {
        Record record;
        while ((record = parser.parseNextRecord()) != null) {
            String sourceCitation = columnValueOrNull(record, columnNameSource);
            if (StringUtils.isBlank(sourceCitation)) {
                LOG.warn("failed to extract source from column [" + columnNameSource + "] in [" + RESOURCE + "] on line [" + (parser.getContext().currentLine() + 1) + "]");
            } else {
                importRecords(regions, locales, habitats, record, sourceCitation);
            }
        }
    } finally {
        // Release the input if an exception interrupts the loop before the end of the stream.
        parser.stopParsing();
    }
}
Also used: InputStream (java.io.InputStream), TsvParserSettings (com.univocity.parsers.tsv.TsvParserSettings), Record (com.univocity.parsers.common.record.Record), IOException (java.io.IOException), TsvParser (com.univocity.parsers.tsv.TsvParser), HashSet (java.util.HashSet)

Example 7 with TsvParser

Use of com.univocity.parsers.tsv.TsvParser in project eol-globi-data by jhpoelen.

Class StudyImporterForFishbase3, method handleTsvInputStream.

public static void handleTsvInputStream(RecordListener listener, InputStream is) throws StudyImporterException {
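    TsvParserSettings settings = new TsvParserSettings();
    settings.getFormat().setLineSeparator("\n");
    // Raise the per-column character limit to 32,768 to allow long field values.
    settings.setMaxCharsPerColumn(4096 * 8);
    // Treat the first row as column headers so fields can be read by name.
    settings.setHeaderExtractionEnabled(true);
    TsvParser parser = new TsvParser(settings);
    parser.beginParsing(is, CharsetConstant.UTF8);
    try {
        Record record;
        while ((record = parser.parseNextRecord()) != null) {
            listener.onRecord(record);
        }
    } finally {
        // Release the input if the listener throws before the end of the stream.
        parser.stopParsing();
    }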
}
Also used: TsvParserSettings (com.univocity.parsers.tsv.TsvParserSettings), Record (com.univocity.parsers.common.record.Record), TsvParser (com.univocity.parsers.tsv.TsvParser)
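
For comparison, the header-extraction and per-record callback pattern of Examples 6 and 7 can be sketched with the Java standard library only. This is not how univocity-parsers works internally: the class TsvStreamSketch and the method forEachRecord are hypothetical names, and the sketch assumes plain tab-separated lines with no escape sequences and does not skip empty lines.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stdlib-only stand-in for the TsvParser loop shown above.
class TsvStreamSketch {

    // Reads the first line as headers, then passes each later line to the
    // listener as a header-to-value map. Missing trailing columns map to null.
    static void forEachRecord(InputStream is, Consumer<Map<String, String>> listener) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(is, StandardCharsets.UTF_8))) {
            String headerLine = reader.readLine();
            if (headerLine == null) {
                return; // empty input: no headers, no records
            }
            String[] headers = headerLine.split("\t", -1);
            String line;
            while ((line = reader.readLine()) != null) {
                // limit -1 keeps trailing empty columns
                String[] values = line.split("\t", -1);
                Map<String, String> record = new LinkedHashMap<>();
                for (int i = 0; i < headers.length; i++) {
                    record.put(headers[i], i < values.length ? values[i] : null);
                }
                listener.accept(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative input only; not taken from any of the projects above.
        byte[] tsv = "Source\tRegion\ncitation A\tregion A\n".getBytes(StandardCharsets.UTF_8);
        forEachRecord(new ByteArrayInputStream(tsv), record -> System.out.println(record));
    }
}
```

The try-with-resources block closes the stream even when the listener throws, which is the same guarantee a try/finally around parser.stopParsing() gives in the univocity version.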

Example 8 with TsvParser

Use of com.univocity.parsers.tsv.TsvParser in project QueryAnalysis by Wikidata.

Class Main, method loadPropertyGroupMapping.

/**
 * Loads the mapping of property to groups.
 */
private static void loadPropertyGroupMapping() {
    TsvParserSettings parserSettings = new TsvParserSettings();
    parserSettings.setLineSeparatorDetectionEnabled(true);
    parserSettings.setHeaderExtractionEnabled(true);
    parserSettings.setSkipEmptyLines(true);
    parserSettings.setReadInputOnSeparateThread(true);
    ObjectRowProcessor rowProcessor = new ObjectRowProcessor() {

        @Override
        public void rowProcessed(Object[] row, ParsingContext parsingContext) {
            if (row.length <= 1) {
                logger.warn("Ignoring line without tab while parsing.");
                return;
            }
            if (row.length > 2) {
                logger.warn("Line with row length " + row.length + " found. Is the formatting of propertyGroupMapping.tsv correct?");
                return;
            }
            // Exactly two columns: skip rows with a missing property or group list.
            if (row[0] == null || row[1] == null) {
                return;
            }
            // The second column holds a comma-separated list of groups.
            propertyGroupMapping.put(row[0].toString(), new HashSet<String>(Arrays.asList(row[1].toString().split(","))));
        }
    };
    parserSettings.setProcessor(rowProcessor);
    TsvParser parser = new TsvParser(parserSettings);
    File file = new File("propertyClassification/propertyGroupMapping.tsv");
    parser.parse(file);
}
Also used: ParsingContext (com.univocity.parsers.common.ParsingContext), TsvParserSettings (com.univocity.parsers.tsv.TsvParserSettings), ObjectRowProcessor (com.univocity.parsers.common.processor.ObjectRowProcessor), TsvParser (com.univocity.parsers.tsv.TsvParser)
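
The row checks in Example 8 can be tried in isolation, without univocity-parsers on the classpath. The following is a minimal sketch under stated assumptions: PropertyGroupMappingSketch and addRow are hypothetical names, the map is passed in rather than held in a static field, and the null check on the first column is an added guard.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; mirrors the checks made in rowProcessed above.
class PropertyGroupMappingSketch {

    // Stores a two-column row as property -> set of groups.
    // Returns true if the row was stored, false if it was skipped.
    static boolean addRow(Object[] row, Map<String, Set<String>> mapping) {
        // Skip rows that do not have exactly two non-null columns.
        if (row.length != 2 || row[0] == null || row[1] == null) {
            return false;
        }
        // The second column is a comma-separated list of groups.
        Set<String> groups = new HashSet<>(Arrays.asList(row[1].toString().split(",")));
        mapping.put(row[0].toString(), groups);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> mapping = new HashMap<>();
        // Illustrative rows only; not taken from propertyGroupMapping.tsv.
        addRow(new Object[] {"propertyA", "group1,group2"}, mapping);
        addRow(new Object[] {"propertyB", null}, mapping); // skipped: no groups
        System.out.println(mapping);
    }
}
```

Returning a boolean instead of logging keeps the sketch free of a logger dependency; the caller can decide whether a skipped row deserves a warning.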

Aggregations

TsvParser (com.univocity.parsers.tsv.TsvParser): 8
TsvParserSettings (com.univocity.parsers.tsv.TsvParserSettings): 8
ParsingContext (com.univocity.parsers.common.ParsingContext): 5
ObjectRowProcessor (com.univocity.parsers.common.processor.ObjectRowProcessor): 5
Record (com.univocity.parsers.common.record.Record): 2
JSONElement (com.eden.common.json.JSONElement): 1
CsvParser (com.univocity.parsers.csv.CsvParser): 1
CsvParserSettings (com.univocity.parsers.csv.CsvParserSettings): 1
IOException (java.io.IOException): 1
InputStream (java.io.InputStream): 1
Path (java.nio.file.Path): 1
HashSet (java.util.HashSet): 1
Entry (java.util.Map.Entry): 1
JSONArray (org.json.JSONArray): 1
JSONObject (org.json.JSONObject): 1
ParsedQuery (org.openrdf.query.parser.ParsedQuery): 1
OpenRDFQueryHandler (query.OpenRDFQueryHandler): 1
QueryHandler (query.QueryHandler): 1