Example 1 with HiveInputSplit

Use of org.apache.hadoop.hive.ql.io.HiveInputFormat.HiveInputSplit in the Apache Hive project.

From the class SplitFilter, method filter.

public List<HiveInputSplit> filter(HiveInputSplit[] splits) throws IOException {
    long sumSplitLengths = 0;
    List<HiveInputSplit> newSplits = new ArrayList<>();
    // Sort the input array in place so the splits are visited in comparator order.
    Arrays.sort(splits, new HiveInputSplitComparator());
    for (HiveInputSplit split : splits) {
        LOG.info("split start : {}", split.getStart());
        LOG.info("split end : {}", split.getStart() + split.getLength());
        try {
            // Keep only the splits that the index result says contain matching data.
            if (indexResult.contains(split)) {
                HiveInputSplit newSplit = split;
                if (isAdjustmentRequired(newSplits, split)) {
                    newSplit = adjustSplit(split);
                }
                // Enforce the configured cap on the total number of bytes to read.
                sumSplitLengths += newSplit.getLength();
                if (sumSplitLengths > maxInputSize) {
                    String messageTemplate = "Size of data to read during a compact-index-based query exceeded the maximum of %d set in %s";
                    throw new IOException(String.format(messageTemplate, maxInputSize, HiveConf.ConfVars.HIVE_INDEX_COMPACT_QUERY_MAX_SIZE.varname));
                }
                newSplits.add(newSplit);
            }
        } catch (HiveException e) {
            throw new RuntimeException("Unable to get metadata for input table split " + split.getPath(), e);
        }
    }
    LOG.info("Number of input splits: {}, new input splits: {}, sum of split lengths: {}", splits.length, newSplits.size(), sumSplitLengths);
    return newSplits;
}
Also used: HiveException (org.apache.hadoop.hive.ql.metadata.HiveException), ArrayList (java.util.ArrayList), HiveInputSplit (org.apache.hadoop.hive.ql.io.HiveInputFormat.HiveInputSplit), IOException (java.io.IOException)
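
The filter method above depends on Hive internals (indexResult, adjustSplit, HiveConf), so it cannot be run on its own. As a rough illustration of its core pattern (sort the splits, keep the selected ones, and fail once the accumulated length exceeds a cap), here is a minimal dependency-free sketch. The names SplitFilterSketch, Split, and the selected predicate are hypothetical stand-ins, not Hive APIs. The path-then-start ordering is an assumption about what HiveInputSplitComparator does, and the split adjustment step is omitted. Unlike the original, the sketch sorts a copy rather than the caller's array.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, dependency-free stand-in for the Hive classes used above.
class SplitFilterSketch {

    // Simplified split: a file path plus a byte range (not the real HiveInputSplit).
    static final class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    // Keeps the splits accepted by `selected` (a stand-in for indexResult.contains)
    // and throws once their accumulated length exceeds maxInputSize.
    static List<Split> filter(Split[] splits, Predicate<Split> selected, long maxInputSize)
            throws IOException {
        Split[] sorted = splits.clone();
        // Assumed ordering: by path, then by start offset.
        Arrays.sort(sorted,
                Comparator.comparing((Split s) -> s.path).thenComparingLong(s -> s.start));

        long sumSplitLengths = 0;
        List<Split> kept = new ArrayList<>();
        for (Split split : sorted) {
            if (!selected.test(split)) {
                continue;
            }
            sumSplitLengths += split.length;
            if (sumSplitLengths > maxInputSize) {
                throw new IOException(String.format(
                        "Size of data to read exceeded the maximum of %d", maxInputSize));
            }
            kept.add(split);
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        Split[] splits = {
            new Split("b", 0, 100), new Split("a", 100, 50), new Split("a", 0, 100)
        };
        // Select only the splits of file "a"; their total (150 bytes) stays under the cap.
        List<Split> kept = filter(splits, s -> s.path.equals("a"), 200);
        System.out.println(kept.size()); // prints 2
    }
}
```

Checking the cap inside the loop, as the original does, stops the scan at the first split that pushes the total over the limit instead of after all splits have been collected.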

Example 2 with HiveInputSplit

Use of org.apache.hadoop.hive.ql.io.HiveInputFormat.HiveInputSplit in the Apache Hive project.

From the class SplitFilterTestCase, method assertSplits.

private void assertSplits(Collection<HiveInputSplit> expectedSplits, Collection<HiveInputSplit> actualSplits) {
    SplitFilter.HiveInputSplitComparator hiveInputSplitComparator = new SplitFilter.HiveInputSplitComparator();
    List<HiveInputSplit> sortedExpectedSplits = new ArrayList<>(expectedSplits);
    Collections.sort(sortedExpectedSplits, hiveInputSplitComparator);
    List<HiveInputSplit> sortedActualSplits = new ArrayList<>(actualSplits);
    Collections.sort(sortedActualSplits, hiveInputSplitComparator);
    assertEquals("Number of selected splits.", sortedExpectedSplits.size(), sortedActualSplits.size());
    for (int i = 0; i < sortedExpectedSplits.size(); i++) {
        HiveInputSplit expectedSplit = sortedExpectedSplits.get(i);
        HiveInputSplit actualSplit = sortedActualSplits.get(i);
        String splitName = "Split #" + i;
        assertEquals(splitName + " path.", expectedSplit.getPath(), actualSplit.getPath());
        assertEquals(splitName + " start.", expectedSplit.getStart(), actualSplit.getStart());
        assertEquals(splitName + " length.", expectedSplit.getLength(), actualSplit.getLength());
    }
}
Also used: ArrayList (java.util.ArrayList), HiveInputSplit (org.apache.hadoop.hive.ql.io.HiveInputFormat.HiveInputSplit)
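
The assertSplits helper above makes the comparison independent of collection order by sorting copies of both collections with the same comparator before comparing them element by element. The following sketch shows that idea without JUnit or Hive on the classpath. SortedCompareSketch, Split, and sameSplits are hypothetical names, and the path-then-start ordering is an assumption about the real comparator. The sketch returns a boolean where the original reports each mismatch through assertEquals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of an order-independent comparison of two split collections.
class SortedCompareSketch {

    // Simplified split: a file path plus a byte range (not the real HiveInputSplit).
    static final class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    // Assumed ordering: by path, then by start offset.
    static final Comparator<Split> ORDER =
            Comparator.comparing((Split s) -> s.path).thenComparingLong(s -> s.start);

    // Returns true when both collections hold the same splits, regardless of order.
    static boolean sameSplits(Collection<Split> expected, Collection<Split> actual) {
        // Sort copies so the callers' collections are left untouched.
        List<Split> sortedExpected = new ArrayList<>(expected);
        sortedExpected.sort(ORDER);
        List<Split> sortedActual = new ArrayList<>(actual);
        sortedActual.sort(ORDER);

        if (sortedExpected.size() != sortedActual.size()) {
            return false;
        }
        for (int i = 0; i < sortedExpected.size(); i++) {
            Split e = sortedExpected.get(i);
            Split a = sortedActual.get(i);
            if (!e.path.equals(a.path) || e.start != a.start || e.length != a.length) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Split> expected = Arrays.asList(new Split("a", 0, 10), new Split("a", 10, 5));
        List<Split> actual = Arrays.asList(new Split("a", 10, 5), new Split("a", 0, 10));
        System.out.println(sameSplits(expected, actual)); // prints true
    }
}
```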

Aggregations

ArrayList (java.util.ArrayList): 2
HiveInputSplit (org.apache.hadoop.hive.ql.io.HiveInputFormat.HiveInputSplit): 2
IOException (java.io.IOException): 1
HiveException (org.apache.hadoop.hive.ql.metadata.HiveException): 1