
Example 1 with IExternalIndexer

Use of org.apache.asterix.external.api.IExternalIndexer in project asterixdb by apache.

Class HDFSDataSourceFactory, method createRecordReader.

/**
 * The HDFS data source is a special case in two ways:
 * 1. It supports indexing.
 * 2. It returns input as a set of Writable objects that we sometimes internally transform into a byte stream.
 * Hence, it can produce:
 * 1. StreamRecordReader: when we transform the input into a byte stream.
 * 2. IndexingStreamRecordReader: when we transform the input into a byte stream and perform indexing.
 * 3. HDFSRecordReader: when we simply pass the Writable objects as they are to the parser.
 */
@Override
public IRecordReader<? extends Object> createRecordReader(IHyracksTaskContext ctx, int partition) throws HyracksDataException {
    try {
        // An indexer is requested only when a file list was supplied.
        IExternalIndexer indexer = files == null ? null : ExternalIndexerProvider.getIndexer(configuration);
        if (recordReaderClazz != null) {
            // Byte-stream path: instantiate the configured stream reader reflectively.
            StreamRecordReader streamReader = (StreamRecordReader) recordReaderClazz.getConstructor().newInstance();
            streamReader.configure(createInputStream(ctx, partition, indexer), configuration);
            if (indexer != null) {
                // Wrap the stream reader so that records are indexed as they are read.
                return new IndexingStreamRecordReader(streamReader, indexer);
            } else {
                return streamReader;
            }
        }
        // Writable path: hand the Writable objects to the parser unchanged.
        restoreConfig(ctx);
        return new HDFSRecordReader<>(read, inputSplits, readSchedule, nodeName, conf, files, indexer);
    } catch (Exception e) {
        throw new HyracksDataException(e);
    }
}
Also used: StreamRecordReader (org.apache.asterix.external.input.record.reader.stream.StreamRecordReader), IndexingStreamRecordReader (org.apache.asterix.external.input.record.reader.IndexingStreamRecordReader), IExternalIndexer (org.apache.asterix.external.api.IExternalIndexer), HyracksDataException (org.apache.hyracks.api.exceptions.HyracksDataException), AsterixException (org.apache.asterix.common.exceptions.AsterixException), IOException (java.io.IOException), HDFSRecordReader (org.apache.asterix.external.input.record.reader.hdfs.HDFSRecordReader)
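
The three-way choice described in the Javadoc of createRecordReader can be reduced to a small decision table. The following is a minimal, self-contained sketch of that branching only, not AsterixDB code: the class ReaderSelectionSketch, the enum ReaderKind and the method select are hypothetical names introduced for illustration, and the two boolean parameters stand in for the checks `recordReaderClazz != null` and `indexer != null` in the method above.

```java
// Hypothetical illustration of the branching in createRecordReader.
// None of these names exist in AsterixDB; they only model the decision.
final class ReaderSelectionSketch {

    /** The three reader kinds named in the Javadoc. */
    enum ReaderKind { STREAM, INDEXING_STREAM, HDFS }

    /**
     * @param hasStreamReaderClass stands in for recordReaderClazz != null
     * @param hasIndexer           stands in for indexer != null
     */
    static ReaderKind select(boolean hasStreamReaderClass, boolean hasIndexer) {
        if (hasStreamReaderClass) {
            // Byte-stream path: the indexer decides whether the stream reader is wrapped.
            return hasIndexer ? ReaderKind.INDEXING_STREAM : ReaderKind.STREAM;
        }
        // Writable path: the HDFS reader is used whether or not an indexer is present.
        return ReaderKind.HDFS;
    }

    public static void main(String[] args) {
        // Print the full decision table for all four input combinations.
        for (boolean stream : new boolean[] {true, false}) {
            for (boolean indexed : new boolean[] {true, false}) {
                System.out.println("streamReaderClass=" + stream
                        + ", indexer=" + indexed
                        + " -> " + select(stream, indexed));
            }
        }
    }
}
```

Note that in the original method the indexer is also passed to the HDFSRecordReader constructor, so the Writable path can index as well; the sketch only models which reader type is returned.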

Aggregations

IOException (java.io.IOException): 1
AsterixException (org.apache.asterix.common.exceptions.AsterixException): 1
IExternalIndexer (org.apache.asterix.external.api.IExternalIndexer): 1
IndexingStreamRecordReader (org.apache.asterix.external.input.record.reader.IndexingStreamRecordReader): 1
HDFSRecordReader (org.apache.asterix.external.input.record.reader.hdfs.HDFSRecordReader): 1
StreamRecordReader (org.apache.asterix.external.input.record.reader.stream.StreamRecordReader): 1
HyracksDataException (org.apache.hyracks.api.exceptions.HyracksDataException): 1