
Example 1 with FileInputInputStream

Use of org.embulk.spi.util.FileInputInputStream in project embulk by embulk.

From the class JsonParserPlugin, method run.

@Override
public void run(TaskSource taskSource, Schema schema, FileInput input, PageOutput output) {
    PluginTask task = taskSource.loadTask(PluginTask.class);
    final boolean stopOnInvalidRecord = task.getStopOnInvalidRecord();
    // record column
    final Column column = schema.getColumn(0);
    try (PageBuilder pageBuilder = newPageBuilder(schema, output);
        FileInputInputStream in = new FileInputInputStream(input)) {
        while (in.nextFile()) {
            boolean evenOneJsonParsed = false;
            try (JsonParser.Stream stream = newJsonStream(in, task)) {
                Value value;
                while ((value = stream.next()) != null) {
                    try {
                        if (!value.isMapValue()) {
                            throw new JsonRecordValidateException(String.format("A JSON record must be a map value, but it is %s", value.getValueType().name()));
                        }
                        pageBuilder.setJson(column, value);
                        pageBuilder.addRecord();
                        evenOneJsonParsed = true;
                    } catch (JsonRecordValidateException e) {
                        if (stopOnInvalidRecord) {
                            throw new DataException(String.format("Invalid record: %s", value.toJson()), e);
                        }
                        log.warn(String.format("Skipped record (%s): %s", e.getMessage(), value.toJson()));
                    }
                }
            } catch (IOException | JsonParseException e) {
                if (Exec.isPreview() && evenOneJsonParsed) {
                    // ignore in preview if at least one JSON is already parsed.
                    break;
                }
                throw new DataException(e);
            }
        }
        pageBuilder.finish();
    }
}
Also used : PageBuilder(org.embulk.spi.PageBuilder) IOException(java.io.IOException) JsonParseException(org.embulk.spi.json.JsonParseException) DataException(org.embulk.spi.DataException) FileInputInputStream(org.embulk.spi.util.FileInputInputStream) Column(org.embulk.spi.Column) Value(org.msgpack.value.Value) JsonParser(org.embulk.spi.json.JsonParser)
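
The record handling in Example 1 follows a stop-or-skip policy: an invalid record either aborts the run or is logged and skipped, depending on a flag. The sketch below isolates that policy using only JDK classes. It is a simplified illustration, not Embulk's implementation: InvalidRecordPolicySketch, InvalidRecordException and collectMaps are hypothetical names, and plain java.util.Map objects stand in for msgpack map values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class InvalidRecordPolicySketch {
    // Hypothetical exception standing in for Embulk's DataException.
    static class InvalidRecordException extends RuntimeException {
        InvalidRecordException(String message) {
            super(message);
        }
    }

    // Keeps records that are maps. A non-map record either aborts the run
    // (stopOnInvalidRecord == true) or is reported and skipped.
    public static List<Map<?, ?>> collectMaps(List<Object> records, boolean stopOnInvalidRecord) {
        List<Map<?, ?>> accepted = new ArrayList<>();
        for (Object record : records) {
            if (record instanceof Map) {
                accepted.add((Map<?, ?>) record);
            } else if (stopOnInvalidRecord) {
                throw new InvalidRecordException("Invalid record: " + record);
            } else {
                System.err.println("Skipped record: " + record);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Object> records = List.of(Map.of("id", 1), "not a map", Map.of("id", 2));
        // With the flag off, the string is skipped and the two maps are kept.
        System.out.println(collectMaps(records, false).size()); // prints 2
    }
}
```

Calling collectMaps(records, true) on the same input throws InvalidRecordException at the second record instead of skipping it.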

Example 2 with FileInputInputStream

Use of org.embulk.spi.util.FileInputInputStream in project embulk by embulk.

From the class TestFileInputInputStream, method testSkipReturnsZeroForNoData.

@Test
public void testSkipReturnsZeroForNoData() {
    FileInputInputStream in = new FileInputInputStream(new MockFileInput());
    assertEquals("Verify skip() returns 0 when there is no data.", 0L, in.skip(1));
}
Also used : FileInputInputStream(org.embulk.spi.util.FileInputInputStream) Test(org.junit.Test)

Example 3 with FileInputInputStream

Use of org.embulk.spi.util.FileInputInputStream in project embulk by embulk.

From the class TestFileInputInputStream, method newInputStream.

private void newInputStream() {
    fileInput = new ListFileInput(fileOutput.getFiles());
    in = new FileInputInputStream(fileInput);
}
Also used : FileInputInputStream(org.embulk.spi.util.FileInputInputStream) ListFileInput(org.embulk.spi.util.ListFileInput)

Example 4 with FileInputInputStream

Use of org.embulk.spi.util.FileInputInputStream in project embulk by embulk.

From the class Bzip2FileDecoderPlugin, method open.

@Override
public FileInput open(TaskSource taskSource, FileInput fileInput) {
    PluginTask task = taskSource.loadTask(PluginTask.class);
    final FileInputInputStream files = new FileInputInputStream(fileInput);
    return new InputStreamFileInput(task.getBufferAllocator(), new InputStreamFileInput.Provider() {

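        // Returns the decompressed stream for the next file, or null when no files remain.
        @Override
        public InputStream openNext() throws IOException {
            if (!files.nextFile()) {
                return null;
            }
            // The second argument enables decompression of concatenated bzip2 streams.
            return new BZip2CompressorInputStream(files, true);
        }

        // Closes the underlying multi-file stream.
        @Override
        public void close() throws IOException {
            files.close();
        }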
    });
}
Also used : FileInputInputStream(org.embulk.spi.util.FileInputInputStream) BZip2CompressorInputStream(org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream) InputStream(java.io.InputStream) InputStreamFileInput(org.embulk.spi.util.InputStreamFileInput) IOException(java.io.IOException)

Example 5 with FileInputInputStream

Use of org.embulk.spi.util.FileInputInputStream in project embulk by embulk.

From the class GzipFileDecoderPlugin, method open.

@Override
public FileInput open(TaskSource taskSource, FileInput fileInput) {
    PluginTask task = taskSource.loadTask(PluginTask.class);
    final FileInputInputStream files = new FileInputInputStream(fileInput);
    return new InputStreamFileInput(task.getBufferAllocator(), new InputStreamFileInput.Provider() {

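        // Returns the decompressed stream for the next file, or null when no files remain.
        @Override
        public InputStream openNext() throws IOException {
            if (!files.nextFile()) {
                return null;
            }
            // The second argument is the input buffer size in bytes (8 KiB).
            return new GZIPInputStream(files, 8 * 1024);
        }

        // Closes the underlying multi-file stream.
        @Override
        public void close() throws IOException {
            files.close();
        }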
    });
}
Also used : GZIPInputStream(java.util.zip.GZIPInputStream) FileInputInputStream(org.embulk.spi.util.FileInputInputStream) InputStream(java.io.InputStream) InputStreamFileInput(org.embulk.spi.util.InputStreamFileInput) IOException(java.io.IOException)
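
Examples 4 and 5 share one shape: a provider hands back a decompressing stream for each file in turn and returns null once no files remain. The sketch below reproduces that shape with JDK classes only, so it can be run without Embulk. It rests on assumptions: StreamProvider is a hypothetical stand-in for InputStreamFileInput.Provider, in-memory byte arrays replace FileInputInputStream, and GzipProviderSketch, gzip, gzipProvider and decodeAll are illustrative names that do not exist in Embulk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipProviderSketch {
    // Hypothetical stand-in for InputStreamFileInput.Provider (not Embulk's interface).
    interface StreamProvider extends AutoCloseable {
        // Returns the next decoded stream, or null when no files remain.
        InputStream openNext() throws IOException;

        @Override
        void close() throws IOException;
    }

    // Helper that gzip-compresses a string so the sketch has input to decode.
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Wraps each compressed "file" in a GZIPInputStream, one per openNext() call.
    static StreamProvider gzipProvider(List<byte[]> compressedFiles) {
        final Iterator<byte[]> files = compressedFiles.iterator();
        return new StreamProvider() {
            @Override
            public InputStream openNext() throws IOException {
                if (!files.hasNext()) {
                    return null;
                }
                return new GZIPInputStream(new ByteArrayInputStream(files.next()), 8 * 1024);
            }

            @Override
            public void close() {
                // Nothing to release for in-memory data.
            }
        };
    }

    // Drains the provider and returns the decoded text of every file in order.
    public static List<String> decodeAll(List<byte[]> compressedFiles) throws IOException {
        List<String> decoded = new ArrayList<>();
        try (StreamProvider provider = gzipProvider(compressedFiles)) {
            InputStream next;
            while ((next = provider.openNext()) != null) {
                try (InputStream in = next) {
                    decoded.add(new String(in.readAllBytes(), StandardCharsets.UTF_8));
                }
            }
        }
        return decoded;
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> files = List.of(gzip("first file"), gzip("second file"));
        System.out.println(decodeAll(files)); // prints [first file, second file]
    }
}
```

Returning null as the end-of-input signal mirrors the openNext() bodies in Examples 4 and 5. The sketch requires Java 9 or later because it uses InputStream.readAllBytes and List.of.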

Aggregations

FileInputInputStream (org.embulk.spi.util.FileInputInputStream): 5
IOException (java.io.IOException): 3
InputStream (java.io.InputStream): 2
InputStreamFileInput (org.embulk.spi.util.InputStreamFileInput): 2
GZIPInputStream (java.util.zip.GZIPInputStream): 1
BZip2CompressorInputStream (org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream): 1
Column (org.embulk.spi.Column): 1
DataException (org.embulk.spi.DataException): 1
PageBuilder (org.embulk.spi.PageBuilder): 1
JsonParseException (org.embulk.spi.json.JsonParseException): 1
JsonParser (org.embulk.spi.json.JsonParser): 1
ListFileInput (org.embulk.spi.util.ListFileInput): 1
Test (org.junit.Test): 1
Value (org.msgpack.value.Value): 1