
Example 1 with CompactionPartitionId

use of org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId in project hbase by apache.

the class PartitionedMobCompactor method compactMobFiles.

/**
   * Compacts the selected small mob files and all the del files.
   * @param request The compaction request.
   * @return The paths of new mob files after compactions.
   * @throws IOException if IO failure is encountered
   */
protected List<Path> compactMobFiles(final PartitionedMobCompactionRequest request) throws IOException {
    Collection<CompactionPartition> partitions = request.compactionPartitions;
    if (partitions == null || partitions.isEmpty()) {
        LOG.info("No partitions of mob files");
        return Collections.emptyList();
    }
    List<Path> paths = new ArrayList<>();
    final Connection c = ConnectionFactory.createConnection(conf);
    final Table table = c.getTable(tableName);
    try {
        Map<CompactionPartitionId, Future<List<Path>>> results = new HashMap<>();
        // compact the mob files by partitions in parallel.
        for (final CompactionPartition partition : partitions) {
            // How can we efficiently come up with the list of delFiles for one partition?
            // Search the delPartitions and collect all the delFiles for the partition.
            // A possible optimization: if there is no del file, there is no need to
            // compute startKey/endKey.
            List<StoreFile> delFiles = getListOfDelFilesForPartition(partition, request.getDelPartitions());
            results.put(partition.getPartitionId(), pool.submit(new Callable<List<Path>>() {

                @Override
                public List<Path> call() throws Exception {
                    LOG.info("Compacting mob files for partition " + partition.getPartitionId());
                    return compactMobFilePartition(request, partition, delFiles, c, table);
                }
            }));
        }
        // wait for each partition to finish and collect the paths of the new files.
        List<CompactionPartitionId> failedPartitions = new ArrayList<>();
        for (Entry<CompactionPartitionId, Future<List<Path>>> result : results.entrySet()) {
            try {
                paths.addAll(result.getValue().get());
            } catch (Exception e) {
                // just log the error
                LOG.error("Failed to compact the partition " + result.getKey(), e);
                failedPartitions.add(result.getKey());
            }
        }
        if (!failedPartitions.isEmpty()) {
            // if any partition fails in the compaction, directly throw an exception.
            throw new IOException("Failed to compact the partitions " + failedPartitions);
        }
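    } finally {
        try {
            table.close();
        } catch (IOException e) {
            LOG.error("Failed to close the Table", e);
        }
        // also close the connection created above so that it is not leaked
        try {
            c.close();
        } catch (IOException e) {
            LOG.error("Failed to close the Connection", e);
        }
    }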
    return paths;
}
Also used : Path(org.apache.hadoop.fs.Path) CompactionPartition(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartition) Table(org.apache.hadoop.hbase.client.Table) HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) Connection(org.apache.hadoop.hbase.client.Connection) CompactionPartitionId(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId) IOException(java.io.IOException) Callable(java.util.concurrent.Callable) FileNotFoundException(java.io.FileNotFoundException) StoreFile(org.apache.hadoop.hbase.regionserver.StoreFile) Future(java.util.concurrent.Future)
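
The method above submits one task per partition, waits for every Future, and reports all failed partitions in a single IOException. A minimal sketch of that fan-out/collect pattern using only java.util.concurrent follows. The class PartitionFanOut and the method runAll are hypothetical names chosen for illustration and are not part of HBase.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of the submit-then-collect pattern; not HBase code.
class PartitionFanOut {

    /**
     * Submits one task per key, waits for all of them, and throws a single
     * IOException that names every key whose task failed.
     */
    static <K, V> List<V> runAll(Map<K, Callable<List<V>>> tasks, ExecutorService pool)
            throws IOException {
        // Submit everything first so the tasks can run in parallel.
        Map<K, Future<List<V>>> futures = new LinkedHashMap<>();
        for (Map.Entry<K, Callable<List<V>>> task : tasks.entrySet()) {
            futures.put(task.getKey(), pool.submit(task.getValue()));
        }
        // Then wait for each result, remembering which keys failed.
        List<V> results = new ArrayList<>();
        List<K> failed = new ArrayList<>();
        for (Map.Entry<K, Future<List<V>>> entry : futures.entrySet()) {
            try {
                results.addAll(entry.getValue().get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                failed.add(entry.getKey());
            } catch (ExecutionException e) {
                failed.add(entry.getKey());
            }
        }
        if (!failed.isEmpty()) {
            throw new IOException("Failed tasks: " + failed);
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Map<String, Callable<List<String>>> tasks = new LinkedHashMap<>();
            tasks.put("p1", () -> Arrays.asList("file-a"));
            tasks.put("p2", () -> Arrays.asList("file-b", "file-c"));
            System.out.println(runAll(tasks, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting on every Future before throwing means no task is still running when the caller releases shared resources in its finally block.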

Example 2 with CompactionPartitionId

use of org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId in project hbase by apache.

the class TestPartitionedMobCompactionRequest method testCompactedPartitionId.

@Test
public void testCompactedPartitionId() {
    String startKey1 = "startKey1";
    String startKey2 = "startKey2";
    String date1 = "date1";
    String date2 = "date2";
    CompactionPartitionId partitionId1 = new CompactionPartitionId(startKey1, date1);
    CompactionPartitionId partitionId2 = new CompactionPartitionId(startKey2, date2);
    CompactionPartitionId partitionId3 = new CompactionPartitionId(startKey1, date2);
    Assert.assertTrue(partitionId1.equals(partitionId1));
    Assert.assertFalse(partitionId1.equals(partitionId2));
    Assert.assertFalse(partitionId1.equals(partitionId3));
    Assert.assertFalse(partitionId2.equals(partitionId3));
    Assert.assertEquals(startKey1, partitionId1.getStartKey());
    Assert.assertEquals(date1, partitionId1.getDate());
}
Also used : CompactionPartitionId(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId) Test(org.junit.Test)
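
The test above checks equality on the (startKey, date) pair. Because the other examples use the id as a HashMap key, a class in that role also needs a hashCode that is consistent with equals. The sketch below shows the general Java contract with a hypothetical class PartitionKey. It is an illustration only and makes no claim about how the real CompactionPartitionId is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class for illustration; not the HBase CompactionPartitionId.
final class PartitionKey {
    private final String startKey;
    private final String date;

    PartitionKey(String startKey, String date) {
        this.startKey = startKey;
        this.date = date;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof PartitionKey)) {
            return false;
        }
        PartitionKey other = (PartitionKey) o;
        return Objects.equals(startKey, other.startKey) && Objects.equals(date, other.date);
    }

    @Override
    public int hashCode() {
        // Built from the same fields as equals, so equal keys share a hash bucket.
        return Objects.hash(startKey, date);
    }

    public static void main(String[] args) {
        Map<PartitionKey, Integer> fileCounts = new HashMap<>();
        // Two distinct instances with the same fields hit the same map entry.
        fileCounts.merge(new PartitionKey("startKey1", "date1"), 1, Integer::sum);
        fileCounts.merge(new PartitionKey("startKey1", "date1"), 1, Integer::sum);
        fileCounts.merge(new PartitionKey("startKey1", "date2"), 1, Integer::sum);
        System.out.println(fileCounts.size());
    }
}
```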

Example 3 with CompactionPartitionId

use of org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId in project hbase by apache.

the class PartitionedMobCompactor method select.

/**
   * Selects the compacted mob/del files.
   * Iterates the candidates to find out all the del files and small mob files.
   * @param candidates All the candidates.
   * @param allFiles Whether to add all mob files to the compaction.
   * @return A compaction request.
   * @throws IOException if IO failure is encountered
   */
protected PartitionedMobCompactionRequest select(List<FileStatus> candidates, boolean allFiles) throws IOException {
    final Map<CompactionPartitionId, CompactionPartition> filesToCompact = new HashMap<>();
    final CompactionPartitionId id = new CompactionPartitionId();
    final NavigableMap<CompactionDelPartitionId, CompactionDelPartition> delFilesToCompact = new TreeMap<>();
    final CompactionDelPartitionId delId = new CompactionDelPartitionId();
    final ArrayList<CompactionDelPartition> allDelPartitions = new ArrayList<>();
    int selectedFileCount = 0;
    int irrelevantFileCount = 0;
    int totalDelFiles = 0;
    MobCompactPartitionPolicy policy = column.getMobCompactPartitionPolicy();
    Calendar calendar = Calendar.getInstance();
    Date currentDate = new Date();
    Date firstDayOfCurrentMonth = null;
    Date firstDayOfCurrentWeek = null;
    if (policy == MobCompactPartitionPolicy.MONTHLY) {
        firstDayOfCurrentMonth = MobUtils.getFirstDayOfMonth(calendar, currentDate);
        firstDayOfCurrentWeek = MobUtils.getFirstDayOfWeek(calendar, currentDate);
    } else if (policy == MobCompactPartitionPolicy.WEEKLY) {
        firstDayOfCurrentWeek = MobUtils.getFirstDayOfWeek(calendar, currentDate);
    }
    // First check whether there are any del files, so that the following processing can be
    // optimized. If there are del files, each partition needs to read its startKey and endKey
    // from its files. If there are none, those reads are skipped.
    boolean withDelFiles = false;
    for (FileStatus file : candidates) {
        if (!file.isFile()) {
            continue;
        }
        // resolve HFileLinks before checking for del files.
        FileStatus linkedFile = file;
        if (HFileLink.isHFileLink(file.getPath())) {
            HFileLink link = HFileLink.buildFromHFileLinkPattern(conf, file.getPath());
            linkedFile = getLinkedFileStatus(link);
            if (linkedFile == null) {
                continue;
            }
        }
        if (StoreFileInfo.isDelFile(linkedFile.getPath())) {
            withDelFiles = true;
            break;
        }
    }
    for (FileStatus file : candidates) {
        if (!file.isFile()) {
            irrelevantFileCount++;
            continue;
        }
        // group the del files and small files.
        FileStatus linkedFile = file;
        if (HFileLink.isHFileLink(file.getPath())) {
            HFileLink link = HFileLink.buildFromHFileLinkPattern(conf, file.getPath());
            linkedFile = getLinkedFileStatus(link);
            if (linkedFile == null) {
                // If the linked file cannot be found, count it as an irrelevant file
                irrelevantFileCount++;
                continue;
            }
        }
        if (withDelFiles && StoreFileInfo.isDelFile(linkedFile.getPath())) {
            // File in the Del Partition List
            // Get delId from the file
            Reader reader = HFile.createReader(fs, linkedFile.getPath(), CacheConfig.DISABLED, conf);
            try {
                delId.setStartKey(reader.getFirstRowKey());
                delId.setEndKey(reader.getLastRowKey());
            } finally {
                reader.close();
            }
            CompactionDelPartition delPartition = delFilesToCompact.get(delId);
            if (delPartition == null) {
                CompactionDelPartitionId newDelId = new CompactionDelPartitionId(delId.getStartKey(), delId.getEndKey());
                delPartition = new CompactionDelPartition(newDelId);
                delFilesToCompact.put(newDelId, delPartition);
            }
            delPartition.addDelFile(file);
            totalDelFiles++;
        } else {
            String fileName = linkedFile.getPath().getName();
            String date = MobFileName.getDateFromName(fileName);
            boolean skipCompaction = MobUtils.fillPartitionId(id, firstDayOfCurrentMonth, firstDayOfCurrentWeek, date, policy, calendar, mergeableSize);
            if (allFiles || (!skipCompaction && (linkedFile.getLen() < id.getThreshold()))) {
                // add all files if allFiles is true,
                // otherwise add the small files to the merge pool
                // filter out files which are not supposed to be compacted with the
                // current policy
                id.setStartKey(MobFileName.getStartKeyFromName(fileName));
                CompactionPartition compactionPartition = filesToCompact.get(id);
                if (compactionPartition == null) {
                    CompactionPartitionId newId = new CompactionPartitionId(id.getStartKey(), id.getDate());
                    compactionPartition = new CompactionPartition(newId);
                    compactionPartition.addFile(file);
                    filesToCompact.put(newId, compactionPartition);
                    newId.updateLatestDate(date);
                } else {
                    compactionPartition.addFile(file);
                    compactionPartition.getPartitionId().updateLatestDate(date);
                }
                if (withDelFiles) {
                    // get startKey and endKey from the file and update partition
                    // TODO: is it possible to skip read of most hfiles?
                    Reader reader = HFile.createReader(fs, linkedFile.getPath(), CacheConfig.DISABLED, conf);
                    try {
                        compactionPartition.setStartKey(reader.getFirstRowKey());
                        compactionPartition.setEndKey(reader.getLastRowKey());
                    } finally {
                        reader.close();
                    }
                }
                selectedFileCount++;
            }
        }
    }
    /*
     * Merge del files so there are only non-overlapped del file lists
     */
    for (Map.Entry<CompactionDelPartitionId, CompactionDelPartition> entry : delFilesToCompact.entrySet()) {
        if (allDelPartitions.size() > 0) {
            // check if the current key range overlaps the previous one
            CompactionDelPartition prev = allDelPartitions.get(allDelPartitions.size() - 1);
            if (Bytes.compareTo(prev.getId().getEndKey(), entry.getKey().getStartKey()) >= 0) {
                // merge them together
                prev.getId().setEndKey(entry.getValue().getId().getEndKey());
                prev.addDelFileList(entry.getValue().listDelFiles());
            } else {
                allDelPartitions.add(entry.getValue());
            }
        } else {
            allDelPartitions.add(entry.getValue());
        }
    }
    PartitionedMobCompactionRequest request = new PartitionedMobCompactionRequest(filesToCompact.values(), allDelPartitions);
    if (candidates.size() == (totalDelFiles + selectedFileCount + irrelevantFileCount)) {
        // all the files are selected
        request.setCompactionType(CompactionType.ALL_FILES);
    }
    LOG.info("The compaction type is " + request.getCompactionType() + ", the request has " + totalDelFiles + " del files, " + selectedFileCount + " selected files, and " + irrelevantFileCount + " irrelevant files");
    return request;
}
Also used : CompactionDelPartitionId(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionDelPartitionId) HFileLink(org.apache.hadoop.hbase.io.HFileLink) CompactionPartition(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartition) FileStatus(org.apache.hadoop.fs.FileStatus) HashMap(java.util.HashMap) Calendar(java.util.Calendar) ArrayList(java.util.ArrayList) Reader(org.apache.hadoop.hbase.io.hfile.HFile.Reader) CompactionPartitionId(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId) TreeMap(java.util.TreeMap) Date(java.util.Date) CompactionDelPartition(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionDelPartition) MobCompactPartitionPolicy(org.apache.hadoop.hbase.client.MobCompactPartitionPolicy) Map(java.util.Map) NavigableMap(java.util.NavigableMap)
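
The last loop of select walks the del partitions in key order and folds each one into its predecessor when their key ranges overlap. The sketch below shows that merge step in isolation, under simplifying assumptions: keys are Strings instead of byte arrays, and a merged range keeps the larger of the two end keys. The names RangeMerge, Range and merge are hypothetical and are not HBase APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of merging overlapping key ranges; not HBase code.
class RangeMerge {

    static final class Range {
        final String start;
        String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + "]";
        }
    }

    /** Returns non-overlapping ranges that cover the same keys as the input. */
    static List<Range> merge(List<Range> input) {
        // Sort by start key (then end key) so overlaps are always with the previous range.
        List<Range> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Range r) -> r.start).thenComparing(r -> r.end));

        List<Range> merged = new ArrayList<>();
        for (Range cur : sorted) {
            Range prev = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (prev != null && prev.end.compareTo(cur.start) >= 0) {
                // Overlap: extend the previous range if the current one ends later.
                if (cur.end.compareTo(prev.end) > 0) {
                    prev.end = cur.end;
                }
            } else {
                // No overlap: start a new range (copied so the input is not mutated).
                merged.add(new Range(cur.start, cur.end));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(
            new Range("f", "g"), new Range("a", "c"), new Range("b", "d"));
        System.out.println(merge(ranges));
    }
}
```

Sorting first is what allows a single pass: once ranges are ordered by start key, any overlap can only involve the most recently emitted range.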

Example 4 with CompactionPartitionId

use of org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId in project hbase by apache.

the class TestPartitionedMobCompactionRequest method testCompactedPartition.

@Test
public void testCompactedPartition() {
    CompactionPartitionId partitionId = new CompactionPartitionId("startKey1", "date1");
    CompactionPartition partition = new CompactionPartition(partitionId);
    FileStatus file = new FileStatus(1, false, 1, 1024, 1, new Path("/test"));
    partition.addFile(file);
    Assert.assertEquals(file, partition.listFiles().get(0));
}
Also used : Path(org.apache.hadoop.fs.Path) CompactionPartition(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartition) FileStatus(org.apache.hadoop.fs.FileStatus) CompactionPartitionId(org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId) Test(org.junit.Test)

Aggregations

CompactionPartitionId (org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartitionId): 4
CompactionPartition (org.apache.hadoop.hbase.mob.compactions.PartitionedMobCompactionRequest.CompactionPartition): 3
ArrayList (java.util.ArrayList): 2
HashMap (java.util.HashMap): 2
FileStatus (org.apache.hadoop.fs.FileStatus): 2
Path (org.apache.hadoop.fs.Path): 2
Test (org.junit.Test): 2
FileNotFoundException (java.io.FileNotFoundException): 1
IOException (java.io.IOException): 1
Calendar (java.util.Calendar): 1
Date (java.util.Date): 1
Map (java.util.Map): 1
NavigableMap (java.util.NavigableMap): 1
TreeMap (java.util.TreeMap): 1
Callable (java.util.concurrent.Callable): 1
Future (java.util.concurrent.Future): 1
Connection (org.apache.hadoop.hbase.client.Connection): 1
MobCompactPartitionPolicy (org.apache.hadoop.hbase.client.MobCompactPartitionPolicy): 1
Table (org.apache.hadoop.hbase.client.Table): 1
HFileLink (org.apache.hadoop.hbase.io.HFileLink): 1