
Example 1 with StreamRecordReader

Use of org.apache.asterix.external.input.record.reader.stream.StreamRecordReader in project asterixdb by apache.

Class StreamRecordReaderProvider, method initRecordReaders.

protected static Map<String, List<Pair<String[], Class>>> initRecordReaders() throws AsterixException {
    Map<String, List<Pair<String[], Class>>> recordReaders = new HashMap<>();
    ClassLoader cl = StreamRecordReaderProvider.class.getClassLoader();
    final Charset encoding = Charset.forName("UTF-8");
    try {
        Enumeration<URL> urls = cl.getResources(RESOURCE);
        for (URL url : Collections.list(urls)) {
            String config;
            // try-with-resources closes the stream even if reading it fails
            try (InputStream is = url.openStream()) {
                config = IOUtils.toString(is, encoding);
            }
            String[] classNames = config.split("\n");
            for (String className : classNames) {
                if (className.startsWith("#")) {
                    continue;
                }
                final Class<?> clazz = Class.forName(className);
                StreamRecordReader newInstance = (StreamRecordReader) clazz.getConstructor().newInstance();
                List<String> formats = newInstance.getRecordReaderFormats();
                String[] configs = newInstance.getRequiredConfigs().split(":");
                for (String format : formats) {
                    if (!recordReaders.containsKey(format)) {
                        recordReaders.put(format, new ArrayList<>());
                    }
                    recordReaders.get(format).add(Pair.of(configs, clazz));
                }
            }
        }
    } catch (IOException | ClassNotFoundException | InvocationTargetException | IllegalAccessException | NoSuchMethodException | InstantiationException e) {
        throw new AsterixException(e);
    }
    return recordReaders;
}
Also used : HashMap(java.util.HashMap) StreamRecordReader(org.apache.asterix.external.input.record.reader.stream.StreamRecordReader) AsterixInputStream(org.apache.asterix.external.api.AsterixInputStream) InputStream(java.io.InputStream) Charset(java.nio.charset.Charset) IOException(java.io.IOException) URL(java.net.URL) InvocationTargetException(java.lang.reflect.InvocationTargetException) AsterixException(org.apache.asterix.common.exceptions.AsterixException) ArrayList(java.util.ArrayList) List(java.util.List)
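
The registration pattern in initRecordReaders can be tried without AsterixDB on the classpath. The following is a minimal sketch, not the project's code: ReaderRegistrySketch, FormatAware, LineReader, QuotedReader and Entry are hypothetical names, and the config text is passed in as a string instead of being loaded from a classpath resource. Unlike the method above, the sketch also trims each line and skips blank ones, which is an assumption about what a tolerant parser might do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReaderRegistrySketch {
    /** Hypothetical stand-in for the two StreamRecordReader methods used during registration. */
    interface FormatAware {
        List<String> getRecordReaderFormats();

        String getRequiredConfigs();
    }

    /** Hypothetical reader that handles two formats and needs no extra config keys. */
    public static class LineReader implements FormatAware {
        public List<String> getRecordReaderFormats() {
            return Arrays.asList("csv", "adm");
        }

        public String getRequiredConfigs() {
            return "";
        }
    }

    /** Hypothetical reader that handles one format and requires two config keys. */
    public static class QuotedReader implements FormatAware {
        public List<String> getRecordReaderFormats() {
            return Arrays.asList("csv");
        }

        public String getRequiredConfigs() {
            return "quote:delimiter";
        }
    }

    /** One registry entry: the required config keys plus the implementing class. */
    static final class Entry {
        final String[] requiredConfigs;
        final Class<?> readerClass;

        Entry(String[] requiredConfigs, Class<?> readerClass) {
            this.requiredConfigs = requiredConfigs;
            this.readerClass = readerClass;
        }
    }

    /** Parses one class name per line and groups the instantiated readers by format. */
    static Map<String, List<Entry>> register(String config) {
        Map<String, List<Entry>> registry = new HashMap<>();
        try {
            for (String line : config.split("\n")) {
                String className = line.trim();
                // Skip blank lines and comment lines.
                if (className.isEmpty() || className.startsWith("#")) {
                    continue;
                }
                Class<?> clazz = Class.forName(className);
                FormatAware reader = (FormatAware) clazz.getConstructor().newInstance();
                // Note: "".split(":") yields a one-element array holding the empty string.
                String[] configs = reader.getRequiredConfigs().split(":");
                for (String format : reader.getRecordReaderFormats()) {
                    registry.computeIfAbsent(format, k -> new ArrayList<>()).add(new Entry(configs, clazz));
                }
            }
        } catch (ReflectiveOperationException e) {
            // Wrap all reflection failures in one unchecked exception for brevity.
            throw new IllegalStateException(e);
        }
        return registry;
    }
}
```

Passing a config that lists LineReader and then QuotedReader gives a registry in which "csv" maps to two entries and "adm" to one, mirroring how a single format can be served by several reader classes that differ in their required config keys.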

Example 2 with StreamRecordReader

Use of org.apache.asterix.external.input.record.reader.stream.StreamRecordReader in project asterixdb by apache.

Class HDFSDataSourceFactory, method createRecordReader.

/**
 * The HDFS data source is a special case in two ways:
 * 1. It supports indexing.
 * 2. It returns input as a set of Writable objects that we sometimes internally transform into a byte stream.
 * Hence, it can produce:
 * 1. StreamRecordReader: when we transform the input into a byte stream.
 * 2. IndexingStreamRecordReader: when we transform the input into a byte stream and perform indexing.
 * 3. HDFSRecordReader: when we simply pass the Writable object as is to the parser.
 */
@Override
public IRecordReader<? extends Object> createRecordReader(IHyracksTaskContext ctx, int partition) throws HyracksDataException {
    try {
        IExternalIndexer indexer = files == null ? null : ExternalIndexerProvider.getIndexer(configuration);
        if (recordReaderClazz != null) {
            StreamRecordReader streamReader = (StreamRecordReader) recordReaderClazz.getConstructor().newInstance();
            streamReader.configure(createInputStream(ctx, partition, indexer), configuration);
            if (indexer != null) {
                return new IndexingStreamRecordReader(streamReader, indexer);
            } else {
                return streamReader;
            }
        }
        restoreConfig(ctx);
        return new HDFSRecordReader<>(read, inputSplits, readSchedule, nodeName, conf, files, indexer);
    } catch (Exception e) {
        throw new HyracksDataException(e);
    }
}
Also used : StreamRecordReader(org.apache.asterix.external.input.record.reader.stream.StreamRecordReader) IndexingStreamRecordReader(org.apache.asterix.external.input.record.reader.IndexingStreamRecordReader) IExternalIndexer(org.apache.asterix.external.api.IExternalIndexer) HyracksDataException(org.apache.hyracks.api.exceptions.HyracksDataException) AsterixException(org.apache.asterix.common.exceptions.AsterixException) IOException(java.io.IOException) HDFSRecordReader(org.apache.asterix.external.input.record.reader.hdfs.HDFSRecordReader)

Aggregations

IOException (java.io.IOException): 2
AsterixException (org.apache.asterix.common.exceptions.AsterixException): 2
StreamRecordReader (org.apache.asterix.external.input.record.reader.stream.StreamRecordReader): 2
InputStream (java.io.InputStream): 1
InvocationTargetException (java.lang.reflect.InvocationTargetException): 1
URL (java.net.URL): 1
Charset (java.nio.charset.Charset): 1
ArrayList (java.util.ArrayList): 1
HashMap (java.util.HashMap): 1
List (java.util.List): 1
AsterixInputStream (org.apache.asterix.external.api.AsterixInputStream): 1
IExternalIndexer (org.apache.asterix.external.api.IExternalIndexer): 1
IndexingStreamRecordReader (org.apache.asterix.external.input.record.reader.IndexingStreamRecordReader): 1
HDFSRecordReader (org.apache.asterix.external.input.record.reader.hdfs.HDFSRecordReader): 1
HyracksDataException (org.apache.hyracks.api.exceptions.HyracksDataException): 1