
Example 1 with PeekingIterator

Use of org.apache.accumulo.core.util.PeekingIterator in the Apache Hive project.

From the class HiveAccumuloTableInputFormat, the method getRecordReader:

/**
 * Sets up the Accumulo input format from conf properties. Delegates to the final RecordReader
 * from the mapred package.
 *
 * @param inputSplit
 * @param jobConf
 * @param reporter
 * @return RecordReader
 * @throws IOException
 */
@Override
public RecordReader<Text, AccumuloHiveRow> getRecordReader(InputSplit inputSplit, final JobConf jobConf, final Reporter reporter) throws IOException {
    final ColumnMapper columnMapper;
    try {
        columnMapper = getColumnMapper(jobConf);
    } catch (TooManyAccumuloColumnsException e) {
        throw new IOException(e);
    }
    try {
        final List<IteratorSetting> iterators = predicateHandler.getIterators(jobConf, columnMapper);
        HiveAccumuloSplit hiveSplit = (HiveAccumuloSplit) inputSplit;
        RangeInputSplit rangeSplit = hiveSplit.getSplit();
        log.info("Split: " + rangeSplit);
        // The split may come back without its iterators populated, so re-compute and re-set
        // them here. Should be fixed in Accumulo 1.5.2 and 1.6.1.
        if (null == rangeSplit.getIterators() || (rangeSplit.getIterators().isEmpty() && !iterators.isEmpty())) {
            log.debug("Re-setting iterators on InputSplit due to Accumulo bug.");
            rangeSplit.setIterators(iterators);
        }
        // Similarly, the RangeInputSplit should already carry the table name but may not,
        // so just re-set it if it's null.
        if (null == getTableName(rangeSplit)) {
            final AccumuloConnectionParameters accumuloParams = new AccumuloConnectionParameters(jobConf);
            log.debug("Re-setting table name on InputSplit due to Accumulo bug.");
            setTableName(rangeSplit, accumuloParams.getAccumuloTableName());
        }
        final RecordReader<Text, PeekingIterator<Map.Entry<Key, Value>>> recordReader = accumuloInputFormat.getRecordReader(rangeSplit, jobConf, reporter);
        return new HiveAccumuloRecordReader(recordReader, iterators.size());
    } catch (SerDeException e) {
        throw new IOException(StringUtils.stringifyException(e));
    }
}
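
The value type wrapped by the inner reader above is PeekingIterator<Map.Entry<Key, Value>>, which lets HiveAccumuloRecordReader look at the next Key/Value pair without consuming it, e.g. to tell whether the upcoming entry still belongs to the row it is currently assembling. Below is a minimal, self-contained sketch of that peek-then-consume pattern; the sample entries are hypothetical, and it assumes the usual PeekingIterator contract of a constructor taking an Iterator plus hasNext()/peek()/next().

import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.util.PeekingIterator;

public class PeekingIteratorSketch {
    public static void main(String[] args) {
        // Hypothetical entries standing in for what a scan over an Accumulo table would yield.
        List<Map.Entry<Key, Value>> entries = Arrays.<Map.Entry<Key, Value>>asList(
            new SimpleImmutableEntry<>(new Key("row1", "cf", "cq1"), new Value("a".getBytes())),
            new SimpleImmutableEntry<>(new Key("row1", "cf", "cq2"), new Value("b".getBytes())),
            new SimpleImmutableEntry<>(new Key("row2", "cf", "cq1"), new Value("c".getBytes())));

        Iterator<Map.Entry<Key, Value>> source = entries.iterator();
        PeekingIterator<Map.Entry<Key, Value>> peeking = new PeekingIterator<>(source);

        while (peeking.hasNext()) {
            // peek() inspects the next entry without advancing the iterator...
            Map.Entry<Key, Value> upcoming = peeking.peek();
            System.out.println("about to consume row " + upcoming.getKey().getRow());
            // ...and next() then actually consumes that same entry.
            peeking.next();
        }
    }
}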
Also used:

- Text (org.apache.hadoop.io.Text)
- IOException (java.io.IOException)
- PeekingIterator (org.apache.accumulo.core.util.PeekingIterator)
- TooManyAccumuloColumnsException (org.apache.hadoop.hive.accumulo.serde.TooManyAccumuloColumnsException)
- RangeInputSplit (org.apache.accumulo.core.client.mapred.RangeInputSplit)
- IteratorSetting (org.apache.accumulo.core.client.IteratorSetting)
- Value (org.apache.accumulo.core.data.Value)
- AccumuloConnectionParameters (org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters)
- Map (java.util.Map)
- Key (org.apache.accumulo.core.data.Key)
- SerDeException (org.apache.hadoop.hive.serde2.SerDeException)
- ColumnMapper (org.apache.hadoop.hive.accumulo.columns.ColumnMapper)
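
These classes come together when the returned RecordReader<Text, AccumuloHiveRow> is driven through the standard mapred contract. A rough consumption sketch, assuming an already-configured HiveAccumuloTableInputFormat (inputFormat), split, jobConf, and reporter:

RecordReader<Text, AccumuloHiveRow> reader = inputFormat.getRecordReader(split, jobConf, reporter);
try {
    // Reusable holders per the mapred RecordReader contract: the key is the Accumulo row id,
    // the value is the row assembled by HiveAccumuloRecordReader from the peeked Key/Value entries.
    Text rowId = reader.createKey();
    AccumuloHiveRow row = reader.createValue();
    while (reader.next(rowId, row)) {
        System.out.println(rowId + " -> " + row);
    }
} finally {
    reader.close();
}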

Aggregations

- IOException (java.io.IOException): 1
- Map (java.util.Map): 1
- IteratorSetting (org.apache.accumulo.core.client.IteratorSetting): 1
- RangeInputSplit (org.apache.accumulo.core.client.mapred.RangeInputSplit): 1
- Key (org.apache.accumulo.core.data.Key): 1
- Value (org.apache.accumulo.core.data.Value): 1
- PeekingIterator (org.apache.accumulo.core.util.PeekingIterator): 1
- AccumuloConnectionParameters (org.apache.hadoop.hive.accumulo.AccumuloConnectionParameters): 1
- ColumnMapper (org.apache.hadoop.hive.accumulo.columns.ColumnMapper): 1
- TooManyAccumuloColumnsException (org.apache.hadoop.hive.accumulo.serde.TooManyAccumuloColumnsException): 1
- SerDeException (org.apache.hadoop.hive.serde2.SerDeException): 1
- Text (org.apache.hadoop.io.Text): 1