Example 1 with SelfWritableConverter

Use of org.datavec.api.io.converters.SelfWritableConverter in project deeplearning4j by deeplearning4j.

Class StringToDataSetExportFunction, method processBatchIfRequired.

private void processBatchIfRequired(List<List<Writable>> list, boolean finalRecord) throws Exception {
    // Nothing buffered, so there is nothing to export.
    if (list.isEmpty())
        return;
    // Wait for a full batch unless this is the final record.
    if (list.size() < batchSize && !finalRecord)
        return;
    // Convert the buffered records into a single DataSet.
    RecordReader rr = new CollectionRecordReader(list);
    RecordReaderDataSetIterator iter = new RecordReaderDataSetIterator(rr, new SelfWritableConverter(), batchSize, labelIndex, numPossibleLabels, regression);
    DataSet ds = iter.next();
    // Write the DataSet to a uniquely named file through the Hadoop FileSystem API.
    String filename = "dataset_" + uid + "_" + (outputCount++) + ".bin";
    URI uri = new URI(outputDir.getPath() + "/" + filename);
    FileSystem file = FileSystem.get(uri, conf);
    try (FSDataOutputStream out = file.create(new Path(uri))) {
        ds.save(out);
    }
    // Reset the buffer for the next batch.
    list.clear();
}
Also used: Path (org.apache.hadoop.fs.Path), SelfWritableConverter (org.datavec.api.io.converters.SelfWritableConverter), DataSet (org.nd4j.linalg.dataset.DataSet), RecordReader (org.datavec.api.records.reader.RecordReader), CollectionRecordReader (org.datavec.api.records.reader.impl.collection.CollectionRecordReader), FileSystem (org.apache.hadoop.fs.FileSystem), RecordReaderDataSetIterator (org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator), FSDataOutputStream (org.apache.hadoop.fs.FSDataOutputStream), URI (java.net.URI)
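
The buffering pattern in processBatchIfRequired (skip an empty buffer, wait for a full batch unless the final record has arrived, write one numbered file, then clear the buffer) can be tried without any deeplearning4j or Hadoop dependency. The following is a minimal sketch under stated assumptions, not the project's code: the class BatchedExporter, its methods add, finish and filesWritten, the String record type, and the .txt output are all hypothetical stand-ins, and java.nio.file replaces the Hadoop FileSystem.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the batching logic above; not part of deeplearning4j. */
public class BatchedExporter {
    private final Path outputDir;
    private final int batchSize;
    private final String uid;
    private final List<String> buffer = new ArrayList<>();
    private int outputCount = 0;

    public BatchedExporter(Path outputDir, int batchSize, String uid) {
        this.outputDir = outputDir;
        this.batchSize = batchSize;
        this.uid = uid;
    }

    /** Buffers one record and flushes if a full batch has accumulated. */
    public void add(String record) throws IOException {
        buffer.add(record);
        flushIfRequired(false);
    }

    /** Flushes whatever is left; call once after the last record. */
    public void finish() throws IOException {
        flushIfRequired(true);
    }

    /** Number of batch files written so far. */
    public int filesWritten() {
        return outputCount;
    }

    // Same gating as processBatchIfRequired: empty buffer or partial batch means no output,
    // unless the caller signals that no more records will arrive.
    private void flushIfRequired(boolean finalRecord) throws IOException {
        if (buffer.isEmpty())
            return;
        if (buffer.size() < batchSize && !finalRecord)
            return;
        String filename = "dataset_" + uid + "_" + (outputCount++) + ".txt";
        // One record per line; the original saves a binary DataSet instead.
        Files.write(outputDir.resolve(filename), buffer, StandardCharsets.UTF_8);
        buffer.clear();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("export");
        BatchedExporter exporter = new BatchedExporter(dir, 2, "demo");
        for (String record : new String[] {"a,0", "b,1", "c,0", "d,1", "e,0"}) {
            exporter.add(record);
        }
        exporter.finish();
        System.out.println(exporter.filesWritten() + " files written to " + dir);
    }
}
```

With a batch size of 2 and five records, the sketch writes two full batches as records arrive and one partial batch when finish is called, which mirrors the role of the finalRecord flag in the original method.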

Aggregations

URI (java.net.URI): 1, FSDataOutputStream (org.apache.hadoop.fs.FSDataOutputStream): 1, FileSystem (org.apache.hadoop.fs.FileSystem): 1, Path (org.apache.hadoop.fs.Path): 1, SelfWritableConverter (org.datavec.api.io.converters.SelfWritableConverter): 1, RecordReader (org.datavec.api.records.reader.RecordReader): 1, CollectionRecordReader (org.datavec.api.records.reader.impl.collection.CollectionRecordReader): 1, RecordReaderDataSetIterator (org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator): 1, DataSet (org.nd4j.linalg.dataset.DataSet): 1