Example 6 with CsvParserSettings

use of com.univocity.parsers.csv.CsvParserSettings in project dble by actiontech.

the class ServerLoadDataInfileHandler method parseFileByLine.

private void parseFileByLine(String file, String encode, String split) {
    List<SQLExpr> columns = statement.getColumns();
    CsvParserSettings settings = new CsvParserSettings();
    settings.setMaxColumns(65535);
    settings.setMaxCharsPerColumn(65535);
    settings.getFormat().setLineSeparator(loadData.getLineTerminatedBy());
    settings.getFormat().setDelimiter(loadData.getFieldTerminatedBy().charAt(0));
    if (loadData.getEnclose() != null) {
        settings.getFormat().setQuote(loadData.getEnclose().charAt(0));
    }
    settings.getFormat().setNormalizedNewline(loadData.getLineTerminatedBy().charAt(0));
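    /*
     * Fix #1074: every Boolean value imported via LOAD DATA LOCAL INFILE became false.
     * CsvParser treats invisible characters as whitespace and strips them; settings.trimValues(false) prevents that.
     * FIXME: with trimValues(false), leading and trailing whitespace in field values is no longer stripped!
     */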
    settings.trimValues(false);
    CsvParser parser = new CsvParser(settings);
    InputStreamReader reader = null;
    FileInputStream fileInputStream = null;
    try {
        fileInputStream = new FileInputStream(file);
        reader = new InputStreamReader(fileInputStream, encode);
        parser.beginParsing(reader);
        String[] row = null;
        int ignoreNumber = 0;
        if (statement.getIgnoreLinesNumber() != null && !"".equals(statement.getIgnoreLinesNumber().toString())) {
            ignoreNumber = Integer.parseInt(statement.getIgnoreLinesNumber().toString());
        }
        while ((row = parser.parseNext()) != null) {
            if (ignoreNumber == 0) {
                parseOneLine(columns, tableName, row, true, loadData.getLineTerminatedBy());
            } else {
                ignoreNumber--;
            }
        }
    } catch (FileNotFoundException | UnsupportedEncodingException e) {
        throw new RuntimeException(e);
    } finally {
        parser.stopParsing();
        if (fileInputStream != null) {
            try {
                fileInputStream.close();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
}
Also used : SQLExpr(com.alibaba.druid.sql.ast.SQLExpr) CsvParserSettings(com.univocity.parsers.csv.CsvParserSettings) CsvParser(com.univocity.parsers.csv.CsvParser)
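
The comment about fix #1074 in the method above says that invisible characters were stripped as whitespace. The sketch below illustrates the same effect with the JDK's own `String.trim()`, which removes every leading and trailing character at or below U+0020. It assumes that the parser's trimming behaves similarly, and the `toBoolean` helper and the use of the control character U+0001 as a "true" value are hypothetical, chosen only for illustration.

```java
public class TrimDemo {

    // Hypothetical helper: treats a single control character U+0001 as "true".
    static boolean toBoolean(String field) {
        return field.length() == 1 && field.charAt(0) == 1;
    }

    // Mimics whitespace trimming with String.trim(), which strips all
    // leading and trailing characters whose code is <= U+0020.
    static String trimLikeWhitespace(String field) {
        return field.trim();
    }

    public static void main(String[] args) {
        String raw = String.valueOf((char) 1);   // invisible "true" marker
        String trimmed = trimLikeWhitespace(raw); // becomes the empty string

        System.out.println("untrimmed is true: " + toBoolean(raw));
        System.out.println("trimmed is true:   " + toBoolean(trimmed));

        // The trade-off named in the FIXME: without trimming, padding survives.
        System.out.println("padding kept:      [" + "  a " + "]");
    }
}
```

Disabling trimming keeps the control character, at the cost of also keeping ordinary leading and trailing spaces, which is the trade-off the FIXME points out.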

Example 7 with CsvParserSettings

use of com.univocity.parsers.csv.CsvParserSettings in project engine by Lumeer.

the class ImportFacade method parseCSVFile.

private void parseCSVFile(Collection collection, String data) {
    if (data == null || data.trim().isEmpty()) {
        return;
    }
    CsvParserSettings settings = new CsvParserSettings();
    settings.detectFormatAutomatically();
    settings.setHeaderExtractionEnabled(true);
    CsvParser parser = new CsvParser(settings);
    parser.beginParsing(new StringReader(data));
    String[] headers = Arrays.stream(parser.getRecordMetadata().headers()).filter(Objects::nonNull).toArray(String[]::new);
    if (headers.length == 0) {
        return;
    }
    int[] counts = new int[headers.length];
    int documentsCount = 0;
    List<Document> documents = new ArrayList<>();
    String[] row;
    while ((row = parser.parseNext()) != null) {
        Document d = createDocumentFromRow(headers, row, counts);
        addDocumentMetadata(collection, d);
        documents.add(d);
        if (documents.size() >= MAX_PARSED_DOCUMENTS) {
            addDocumentsToDb(collection.getId(), documents);
            documents.clear();
        }
        documentsCount++;
    }
    if (!documents.isEmpty()) {
        addDocumentsToDb(collection.getId(), documents);
    }
    parser.stopParsing();
    addCollectionMetadata(collection, headers, counts, documentsCount);
}
Also used : CsvParserSettings(com.univocity.parsers.csv.CsvParserSettings) StringReader(java.io.StringReader) ArrayList(java.util.ArrayList) CsvParser(com.univocity.parsers.csv.CsvParser) DataDocument(io.lumeer.engine.api.data.DataDocument) JsonDocument(io.lumeer.api.dto.JsonDocument) Document(io.lumeer.api.model.Document)
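
The loop in parseCSVFile above buffers parsed rows and flushes them once the buffer reaches MAX_PARSED_DOCUMENTS, with one final flush for the remainder. A minimal, library-free sketch of that batching pattern follows. The class name, `processInBatches`, and the `sink` callback are hypothetical stand-ins for the project's `addDocumentsToDb`, not part of the Lumeer code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class BatchFlushDemo {

    // Buffers rows and hands them to the sink in chunks of at most maxBatch.
    // Returns the total number of rows seen.
    static <T> int processInBatches(Iterable<T> rows, int maxBatch, Consumer<List<T>> sink) {
        List<T> buffer = new ArrayList<>();
        int total = 0;
        for (T row : rows) {
            buffer.add(row);
            if (buffer.size() >= maxBatch) {
                sink.accept(new ArrayList<>(buffer)); // pass a copy so the buffer can be reused
                buffer.clear();
            }
            total++;
        }
        if (!buffer.isEmpty()) {
            sink.accept(new ArrayList<>(buffer));     // flush the final partial batch
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> batches = new ArrayList<>();
        List<String> rows = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        int total = processInBatches(rows, 3, batches::add);
        System.out.println(total + " rows in " + batches.size() + " batches");
    }
}
```

Passing a copy to the sink is a defensive choice for this sketch: it stays correct even if the sink keeps a reference to the list after the buffer is cleared.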

Example 8 with CsvParserSettings

use of com.univocity.parsers.csv.CsvParserSettings in project drill by apache.

the class PhoenixBaseTest method parseTblFile.

private static List<String[]> parseTblFile(String path) throws Exception {
    CsvParserSettings settings = new CsvParserSettings();
    settings.getFormat().setDelimiter("|");
    settings.getFormat().setLineSeparator("\n");
    CsvParser parser = new CsvParser(settings);
    return parser.parseAll(getReader(path));
}
Also used : CsvParserSettings(com.univocity.parsers.csv.CsvParserSettings) CsvParser(com.univocity.parsers.csv.CsvParser)

Example 9 with CsvParserSettings

use of com.univocity.parsers.csv.CsvParserSettings in project symja_android_library by axkr.

the class CsvReader method csvParser.

private CsvParser csvParser(CsvReadOptions options) {
    CsvParserSettings settings = new CsvParserSettings();
    settings.setLineSeparatorDetectionEnabled(options.lineSeparatorDetectionEnabled());
    settings.setFormat(csvFormat(options));
    settings.setMaxCharsPerColumn(options.maxCharsPerColumn());
    if (options.maxNumberOfColumns() != null) {
        settings.setMaxColumns(options.maxNumberOfColumns());
    }
    return new CsvParser(settings);
}
Also used : CsvParserSettings(com.univocity.parsers.csv.CsvParserSettings) CsvParser(com.univocity.parsers.csv.CsvParser)

Example 10 with CsvParserSettings

use of com.univocity.parsers.csv.CsvParserSettings in project Mycat-Server by MyCATApache.

the class ServerLoadDataInfileHandler method parseFileByLine.

private void parseFileByLine(String file, String encode, String split) {
    List<SQLExpr> columns = statement.getColumns();
    CsvParserSettings settings = new CsvParserSettings();
    settings.setMaxColumns(65535);
    settings.setMaxCharsPerColumn(65535);
    settings.getFormat().setLineSeparator(loadData.getLineTerminatedBy());
    settings.getFormat().setDelimiter(loadData.getFieldTerminatedBy().charAt(0));
    if (loadData.getEnclose() != null) {
        settings.getFormat().setQuote(loadData.getEnclose().charAt(0));
    }
    if (loadData.getEscape() != null) {
        settings.getFormat().setQuoteEscape(loadData.getEscape().charAt(0));
    }
    settings.getFormat().setNormalizedNewline(loadData.getLineTerminatedBy().charAt(0));
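    /*
     * Fix #1074: every Boolean value imported via LOAD DATA LOCAL INFILE became false.
     * CsvParser treats invisible characters as whitespace and strips them; settings.trimValues(false) prevents that.
     * TODO: with trimValues(false), leading and trailing whitespace in field values is no longer stripped!
     */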
    settings.trimValues(false);
    CsvParser parser = new CsvParser(settings);
    InputStreamReader reader = null;
    FileInputStream fileInputStream = null;
    try {
        fileInputStream = new FileInputStream(file);
        reader = new InputStreamReader(fileInputStream, encode);
        parser.beginParsing(reader);
        String[] row = null;
        while ((row = parser.parseNext()) != null) {
            parseOneLine(columns, tableName, row, true, loadData.getLineTerminatedBy());
        }
    } catch (FileNotFoundException | UnsupportedEncodingException e) {
        throw new RuntimeException(e);
    } finally {
        parser.stopParsing();
        if (fileInputStream != null) {
            try {
                fileInputStream.close();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
}
Also used : SQLExpr(com.alibaba.druid.sql.ast.SQLExpr) CsvParserSettings(com.univocity.parsers.csv.CsvParserSettings) CsvParser(com.univocity.parsers.csv.CsvParser)
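
The finally block in the method above closes the stream and the reader by hand, so an exception thrown while closing the stream would skip closing the reader. As a possible alternative, not the project's code, the sketch below uses try-with-resources, which closes resources in reverse declaration order even when the body or a later constructor throws. The class and method names are hypothetical, and a plain `BufferedReader` stands in for the CSV parser.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReaderCloseDemo {

    // Reads all lines from the stream using the named encoding.
    // Both the reader and the underlying stream are closed on every path,
    // including an unsupported encoding name.
    static List<String> readLines(InputStream in, String encoding) {
        List<String> lines = new ArrayList<>();
        try (InputStream stream = in;
             BufferedReader reader = new BufferedReader(new InputStreamReader(stream, encoding))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // UnsupportedEncodingException is an IOException, so it lands here too.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "1|alice\n2|bob\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(data), "UTF-8"));
    }
}
```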

Aggregations

CsvParserSettings (com.univocity.parsers.csv.CsvParserSettings) 22
CsvParser (com.univocity.parsers.csv.CsvParser) 15
SQLExpr (com.alibaba.druid.sql.ast.SQLExpr) 6
ParsingContext (com.univocity.parsers.common.ParsingContext) 5
Reader (java.io.Reader) 5
Benchmark (org.openjdk.jmh.annotations.Benchmark) 4
AbstractRowProcessor (com.univocity.parsers.common.processor.AbstractRowProcessor) 2
ConcurrentRowProcessor (com.univocity.parsers.common.processor.ConcurrentRowProcessor) 2
RowListProcessor (com.univocity.parsers.common.processor.RowListProcessor) 2
RouteResultset (io.mycat.route.RouteResultset) 2
ArrayList (java.util.ArrayList) 2
RouteResultset (com.actiontech.dble.route.RouteResultset) 1
JSONElement (com.eden.common.json.JSONElement) 1
TsvParser (com.univocity.parsers.tsv.TsvParser) 1
TsvParserSettings (com.univocity.parsers.tsv.TsvParserSettings) 1
JsonDocument (io.lumeer.api.dto.JsonDocument) 1
Document (io.lumeer.api.model.Document) 1
DataDocument (io.lumeer.engine.api.data.DataDocument) 1
InputStreamReader (java.io.InputStreamReader) 1
StringReader (java.io.StringReader) 1