
Example 1 with CsvFileFormat

Use of com.hazelcast.jet.pipeline.file.CsvFileFormat in project hazelcast by hazelcast.

Class CsvReadFileFnProvider, method createReadFileFn.

@SuppressWarnings("unchecked")
@Nonnull
@Override
public <T> FunctionEx<Path, Stream<T>> createReadFileFn(@Nonnull FileFormat<T> format) {
    CsvFileFormat<T> csvFileFormat = (CsvFileFormat<T>) format;
    // Format is not Serializable
    Class<?> formatClazz = csvFileFormat.clazz();
    return path -> {
        FileInputStream fis = new FileInputStream(path.toFile());
        MappingIterator<T> iterator;
        Function<T, T> projection = identity();
        if (formatClazz == String[].class) {
            // Read each row as a raw String[]; the first row is consumed below as the header
            ObjectReader reader = new CsvMapper()
                    .enable(Feature.WRAP_AS_ARRAY)
                    .readerFor(String[].class)
                    .with(CsvSchema.emptySchema().withSkipFirstDataRow(false));
            iterator = reader.readValues(fis);
            if (!iterator.hasNext()) {
                throw new JetException("Header row missing in " + path);
            }
            String[] header = (String[]) iterator.next();
            List<String> fieldNames = csvFileFormat.fieldNames();
            if (fieldNames != null) {
                projection = (Function<T, T>) createFieldProjection(header, fieldNames);
            }
        } else {
            // Bind columns to the target class by header name, ignoring unknown columns
            iterator = new CsvMapper()
                    .readerFor(formatClazz)
                    .withoutFeatures(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES)
                    .with(CsvSchema.emptySchema().withHeader())
                    .readValues(fis);
        }
        // Stream rows lazily; closing the stream closes the underlying file
        return StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, ORDERED), false)
                .map(projection)
                .onClose(() -> uncheckRun(fis::close));
    };
}
Also used: FunctionEx(com.hazelcast.function.FunctionEx) Util.uncheckRun(com.hazelcast.jet.impl.util.Util.uncheckRun) Spliterators(java.util.Spliterators) MappingIterator(com.fasterxml.jackson.databind.MappingIterator) CsvMapper(com.fasterxml.jackson.dataformat.csv.CsvMapper) CsvSchema(com.fasterxml.jackson.dataformat.csv.CsvSchema) ORDERED(java.util.Spliterator.ORDERED) FileInputStream(java.io.FileInputStream) Function(java.util.function.Function) DeserializationFeature(com.fasterxml.jackson.databind.DeserializationFeature) ObjectReader(com.fasterxml.jackson.databind.ObjectReader) JetException(com.hazelcast.jet.JetException) CsvFileFormat(com.hazelcast.jet.pipeline.file.CsvFileFormat) Feature(com.fasterxml.jackson.dataformat.csv.CsvParser.Feature) FileFormat(com.hazelcast.jet.pipeline.file.FileFormat) List(java.util.List) Stream(java.util.stream.Stream) Util.createFieldProjection(com.hazelcast.jet.impl.util.Util.createFieldProjection) ReadFileFnProvider(com.hazelcast.jet.pipeline.file.impl.ReadFileFnProvider) Function.identity(java.util.function.Function.identity) StreamSupport(java.util.stream.StreamSupport) Nonnull(javax.annotation.Nonnull) Path(java.nio.file.Path) SuppressFBWarnings(edu.umd.cs.findbugs.annotations.SuppressFBWarnings)
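
In the String[] branch, the method reads the header row and, when field names are configured, builds a projection with Util.createFieldProjection. That helper's source is not shown here, so the following is only a minimal JDK-only sketch of what such a projection could look like. The class name FieldProjectionSketch, the method fieldProjection, and the choice to emit null for a column missing from the header are all assumptions made for illustration, not the Hazelcast implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a header-based field projection (not Hazelcast code).
final class FieldProjectionSketch {

    // Resolves each requested field name to its column index in the header once,
    // then returns a function that reorders every row accordingly.
    static Function<String[], String[]> fieldProjection(String[] header, List<String> fieldNames) {
        List<String> headerList = Arrays.asList(header);
        int[] indices = new int[fieldNames.size()];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = headerList.indexOf(fieldNames.get(i)); // -1 if the column is absent
        }
        return row -> {
            String[] projected = new String[indices.length];
            for (int i = 0; i < indices.length; i++) {
                int idx = indices[i];
                // Assumption: a missing or out-of-range column yields null
                projected[i] = (idx >= 0 && idx < row.length) ? row[idx] : null;
            }
            return projected;
        };
    }

    public static void main(String[] args) {
        String[] header = {"id", "name", "age"};
        Function<String[], String[]> projection = fieldProjection(header, Arrays.asList("age", "id"));
        String[] row = {"1", "Alice", "30"};
        System.out.println(Arrays.toString(projection.apply(row))); // prints [30, 1]
    }
}
```

Resolving the indices once, outside the returned function, keeps the per-row work to a simple array copy.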

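
The return statement of createReadFileFn wraps an iterator in a lazily evaluated Stream and registers an onClose handler that releases the file. The sketch below shows the same pattern with JDK classes only, as an illustration rather than a replacement for the Jackson-based code. The class LazyRowStreamSketch, the method rows, and the naive comma split (no quoting or escaping) are assumptions introduced for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical illustration of the iterator-to-Stream pattern with resource cleanup.
final class LazyRowStreamSketch {

    // Returns a lazy stream of rows; closing the stream closes the reader.
    static Stream<String[]> rows(BufferedReader reader) {
        Iterator<String[]> iterator = new Iterator<String[]>() {
            // Reads one line ahead so hasNext() can answer without extra I/O
            private String next = readLine();

            private String readLine() {
                try {
                    return reader.readLine();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public String[] next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                // Assumption: plain comma split, no support for quoted fields
                String[] row = next.split(",", -1);
                next = readLine();
                return row;
            }
        };
        return StreamSupport
                .stream(Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED), false)
                .onClose(() -> {
                    try {
                        reader.close();
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("a,b\n1,2\n"));
        // try-with-resources triggers the onClose handler and thus closes the reader
        try (Stream<String[]> stream = rows(reader)) {
            stream.forEach(row -> System.out.println(String.join("|", row)));
        }
    }
}
```

The onClose handler only runs if the consumer closes the stream, so callers of such a function would typically use try-with-resources as in main above.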