Example 1 with NotServingRegionException

use of org.apache.hadoop.hbase.NotServingRegionException in project hbase by apache.

the class AsyncHBaseAdmin method closeRegion.

@Override
public CompletableFuture<Void> closeRegion(byte[] regionName, String serverName) {
    CompletableFuture<Void> future = new CompletableFuture<>();
    getRegion(regionName).whenComplete((p, err) -> {
        if (err != null) {
            future.completeExceptionally(err);
            return;
        }
        if (p == null || p.getFirst() == null) {
            future.completeExceptionally(new UnknownRegionException(Bytes.toStringBinary(regionName)));
            return;
        }
        if (serverName != null) {
            closeRegion(ServerName.valueOf(serverName), p.getFirst()).whenComplete((p2, err2) -> {
                if (err2 != null) {
                    future.completeExceptionally(err2);
                } else {
                    future.complete(null);
                }
            });
        } else {
            if (p.getSecond() == null) {
                future.completeExceptionally(new NotServingRegionException(regionName));
            } else {
                closeRegion(p.getSecond(), p.getFirst()).whenComplete((p2, err2) -> {
                    if (err2 != null) {
                        future.completeExceptionally(err2);
                    } else {
                        future.complete(null);
                    }
                });
            }
        }
    });
    return future;
}
Also used : CompletableFuture(java.util.concurrent.CompletableFuture) NotServingRegionException(org.apache.hadoop.hbase.NotServingRegionException) UnknownRegionException(org.apache.hadoop.hbase.UnknownRegionException)
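
The nested whenComplete callbacks above forward the outcome of an inner future into an outer one by hand. Purely as a point of comparison, the same forwarding can be sketched with thenCompose on plain java.util.concurrent types. The names lookup, close, closeVia, and describe are hypothetical stand-ins rather than HBase APIs, and IllegalStateException stands in for NotServingRegionException so the sketch runs without HBase on the classpath.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class ComposeSketch {
    // Hypothetical stand-in for getRegion(): resolves a region name to the server
    // hosting it, or to null when no server currently hosts it.
    public static CompletableFuture<String> lookup(String region) {
        return CompletableFuture.completedFuture("r1".equals(region) ? "server-a" : null);
    }

    // Hypothetical stand-in for closeRegion(ServerName, RegionInfo); always succeeds here.
    public static CompletableFuture<Void> close(String server, String region) {
        return CompletableFuture.completedFuture(null);
    }

    // Chains the lookup and the close; a failure in either step fails the returned future.
    public static CompletableFuture<Void> closeVia(String region) {
        return lookup(region).thenCompose(server -> {
            if (server == null) {
                // Assumption: IllegalStateException plays the role of NotServingRegionException.
                CompletableFuture<Void> failed = new CompletableFuture<>();
                failed.completeExceptionally(new IllegalStateException("not serving: " + region));
                return failed;
            }
            return close(server, region);
        });
    }

    // Blocks on the future and renders its outcome as a string.
    public static String describe(CompletableFuture<Void> f) {
        try {
            f.join();
            return "closed";
        } catch (CompletionException e) {
            return "failed: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(closeVia("r1")));
        System.out.println(describe(closeVia("r2")));
    }
}
```

One difference to keep in mind: callbacks attached to a composed future see the failure wrapped in a CompletionException, whereas the hand-written forwarding in Example 1 passes the original throwable through unchanged.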

Example 2 with NotServingRegionException

use of org.apache.hadoop.hbase.NotServingRegionException in project hbase by apache.

the class TestAdmin2 method testCloseRegionWhenEncodedRegionNameIsNotGiven.

@Test(timeout = 300000)
public void testCloseRegionWhenEncodedRegionNameIsNotGiven() throws Exception {
    final byte[] tableName = Bytes.toBytes(name.getMethodName());
    createTableWithDefaultConf(tableName);
    HRegionInfo info = null;
    HRegionServer rs = TEST_UTIL.getRSForFirstRegionInTable(TableName.valueOf(tableName));
    List<HRegionInfo> onlineRegions = ProtobufUtil.getOnlineRegions(rs.getRSRpcServices());
    for (HRegionInfo regionInfo : onlineRegions) {
        if (!regionInfo.isMetaTable()) {
            if (regionInfo.getRegionNameAsString().contains(name.getMethodName())) {
                info = regionInfo;
                try {
                    admin.closeRegionWithEncodedRegionName(regionInfo.getRegionNameAsString(), rs.getServerName().getServerName());
                } catch (NotServingRegionException nsre) {
                    // expected, ignore it.
                }
            }
        }
    }
    onlineRegions = ProtobufUtil.getOnlineRegions(rs.getRSRpcServices());
    assertTrue("The region should be present in online regions list.", onlineRegions.contains(info));
}
Also used : HRegionInfo(org.apache.hadoop.hbase.HRegionInfo) NotServingRegionException(org.apache.hadoop.hbase.NotServingRegionException) HRegionServer(org.apache.hadoop.hbase.regionserver.HRegionServer) Test(org.junit.Test)

Example 3 with NotServingRegionException

use of org.apache.hadoop.hbase.NotServingRegionException in project hbase by apache.

the class RSRpcServices method getRegionScanner.

private RegionScannerHolder getRegionScanner(ScanRequest request) throws IOException {
    String scannerName = toScannerName(request.getScannerId());
    RegionScannerHolder rsh = this.scanners.get(scannerName);
    if (rsh == null) {
        // just ignore the next or close request if scanner does not exist.
        if (closedScanners.getIfPresent(scannerName) != null) {
            throw SCANNER_ALREADY_CLOSED;
        } else {
            LOG.warn("Client tried to access missing scanner " + scannerName);
            throw new UnknownScannerException("Unknown scanner '" + scannerName + "'. This can happen due to any of the following " + "reasons: a) Scanner id given is wrong, b) Scanner lease expired because of " + "long wait between consecutive client checkins, c) Server may be closing down, " + "d) RegionServer restart during upgrade.\nIf the issue is due to reason (b), a " + "possible fix would be increasing the value of " + "'hbase.client.scanner.timeout.period' configuration.");
        }
    }
    rejectIfInStandByState(rsh.r);
    RegionInfo hri = rsh.s.getRegionInfo();
    // Yes, should be the same instance
    if (server.getOnlineRegion(hri.getRegionName()) != rsh.r) {
        String msg = "Region has changed on the scanner " + scannerName + ": regionName=" + hri.getRegionNameAsString() + ", scannerRegionName=" + rsh.r;
        LOG.warn(msg + ", closing...");
        scanners.remove(scannerName);
        try {
            rsh.s.close();
        } catch (IOException e) {
            LOG.warn("Getting exception closing " + scannerName, e);
        } finally {
            try {
                server.getLeaseManager().cancelLease(scannerName);
            } catch (LeaseException e) {
                LOG.warn("Getting exception closing " + scannerName, e);
            }
        }
        throw new NotServingRegionException(msg);
    }
    return rsh;
}
Also used : NotServingRegionException(org.apache.hadoop.hbase.NotServingRegionException) RegionInfo(org.apache.hadoop.hbase.client.RegionInfo) ByteString(org.apache.hbase.thirdparty.com.google.protobuf.ByteString) IOException(java.io.IOException) DoNotRetryIOException(org.apache.hadoop.hbase.DoNotRetryIOException) HBaseIOException(org.apache.hadoop.hbase.HBaseIOException) UncheckedIOException(java.io.UncheckedIOException) UnknownScannerException(org.apache.hadoop.hbase.UnknownScannerException)

Example 4 with NotServingRegionException

use of org.apache.hadoop.hbase.NotServingRegionException in project hbase by apache.

the class RSRpcServices method replay.

/**
 * Replay the given changes when distributedLogReplay WAL edits from a failed RS. The guarantee is
 * that the given mutations will be durable on the receiving RS if this method returns without any
 * exception.
 * @param controller the RPC controller
 * @param request the request
 * @deprecated Since 3.0.0, will be removed in 4.0.0. Not used any more, put here only for
 *             compatibility with old region replica implementation. Now we will use
 *             {@code replicateToReplica} method instead.
 */
@Deprecated
@Override
@QosPriority(priority = HConstants.REPLAY_QOS)
public ReplicateWALEntryResponse replay(final RpcController controller, final ReplicateWALEntryRequest request) throws ServiceException {
    long before = EnvironmentEdgeManager.currentTime();
    CellScanner cells = getAndReset(controller);
    try {
        checkOpen();
        List<WALEntry> entries = request.getEntryList();
        if (entries == null || entries.isEmpty()) {
            // empty input
            return ReplicateWALEntryResponse.newBuilder().build();
        }
        ByteString regionName = entries.get(0).getKey().getEncodedRegionName();
        HRegion region = server.getRegionByEncodedName(regionName.toStringUtf8());
        // do not invoke coprocessors if this is a secondary region replica
        RegionCoprocessorHost coprocessorHost = ServerRegionReplicaUtil.isDefaultReplica(region.getRegionInfo()) ? region.getCoprocessorHost() : null;
        List<Pair<WALKey, WALEdit>> walEntries = new ArrayList<>();
        // Skip adding the edits to WAL if this is a secondary region replica
        boolean isPrimary = RegionReplicaUtil.isDefaultReplica(region.getRegionInfo());
        Durability durability = isPrimary ? Durability.USE_DEFAULT : Durability.SKIP_WAL;
        for (WALEntry entry : entries) {
            if (!regionName.equals(entry.getKey().getEncodedRegionName())) {
                throw new NotServingRegionException("Replay request contains entries from multiple " + "regions. First region:" + regionName.toStringUtf8() + " , other region:" + entry.getKey().getEncodedRegionName());
            }
            if (server.nonceManager != null && isPrimary) {
                long nonceGroup = entry.getKey().hasNonceGroup() ? entry.getKey().getNonceGroup() : HConstants.NO_NONCE;
                long nonce = entry.getKey().hasNonce() ? entry.getKey().getNonce() : HConstants.NO_NONCE;
                server.nonceManager.reportOperationFromWal(nonceGroup, nonce, entry.getKey().getWriteTime());
            }
            Pair<WALKey, WALEdit> walEntry = (coprocessorHost == null) ? null : new Pair<>();
            List<MutationReplay> edits = WALSplitUtil.getMutationsFromWALEntry(entry, cells, walEntry, durability);
            if (coprocessorHost != null) {
                // Start coprocessor replay here. The coprocessor is invoked for each WALEdit
                // rather than for each KeyValue.
                if (coprocessorHost.preWALRestore(region.getRegionInfo(), walEntry.getFirst(), walEntry.getSecond())) {
                    // if bypass this log entry, ignore it ...
                    continue;
                }
                walEntries.add(walEntry);
            }
            if (edits != null && !edits.isEmpty()) {
                // HBASE-17924
                // sort to improve lock efficiency
                Collections.sort(edits, (v1, v2) -> Row.COMPARATOR.compare(v1.mutation, v2.mutation));
                long replaySeqId = (entry.getKey().hasOrigSequenceNumber()) ? entry.getKey().getOrigSequenceNumber() : entry.getKey().getLogSequenceNumber();
                OperationStatus[] result = doReplayBatchOp(region, edits, replaySeqId);
                // check if it's a partial success
                for (int i = 0; result != null && i < result.length; i++) {
                    if (result[i] != OperationStatus.SUCCESS) {
                        throw new IOException(result[i].getExceptionMsg());
                    }
                }
            }
        }
        // sync wal at the end because ASYNC_WAL is used above
        WAL wal = region.getWAL();
        if (wal != null) {
            wal.sync();
        }
        if (coprocessorHost != null) {
            for (Pair<WALKey, WALEdit> entry : walEntries) {
                coprocessorHost.postWALRestore(region.getRegionInfo(), entry.getFirst(), entry.getSecond());
            }
        }
        return ReplicateWALEntryResponse.newBuilder().build();
    } catch (IOException ie) {
        throw new ServiceException(ie);
    } finally {
        final MetricsRegionServer metricsRegionServer = server.getMetrics();
        if (metricsRegionServer != null) {
            metricsRegionServer.updateReplay(EnvironmentEdgeManager.currentTime() - before);
        }
    }
}
Also used : WAL(org.apache.hadoop.hbase.wal.WAL) ByteString(org.apache.hbase.thirdparty.com.google.protobuf.ByteString) ArrayList(java.util.ArrayList) MutationReplay(org.apache.hadoop.hbase.wal.WALSplitUtil.MutationReplay) CellScanner(org.apache.hadoop.hbase.CellScanner) WALKey(org.apache.hadoop.hbase.wal.WALKey) WALEdit(org.apache.hadoop.hbase.wal.WALEdit) Pair(org.apache.hadoop.hbase.util.Pair) NameInt64Pair(org.apache.hadoop.hbase.shaded.protobuf.generated.HBaseProtos.NameInt64Pair) NameBytesPair(org.apache.hadoop.hbase.shaded.protobuf.generated.HBaseProtos.NameBytesPair) NotServingRegionException(org.apache.hadoop.hbase.NotServingRegionException) Durability(org.apache.hadoop.hbase.client.Durability) IOException(java.io.IOException) DoNotRetryIOException(org.apache.hadoop.hbase.DoNotRetryIOException) HBaseIOException(org.apache.hadoop.hbase.HBaseIOException) UncheckedIOException(java.io.UncheckedIOException) ServiceException(org.apache.hbase.thirdparty.com.google.protobuf.ServiceException) WALEntry(org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos.WALEntry) QosPriority(org.apache.hadoop.hbase.ipc.QosPriority)

Example 5 with NotServingRegionException

use of org.apache.hadoop.hbase.NotServingRegionException in project hbase by apache.

the class SplitLogWorker method splitLog.

/**
 * @return Result, one of DONE, RESIGNED, PREEMPTED, or ERR.
 */
static Status splitLog(String filename, CancelableProgressable p, Configuration conf, RegionServerServices server, LastSequenceId sequenceIdChecker, WALFactory factory) {
    Path walDir;
    FileSystem fs;
    try {
        walDir = CommonFSUtils.getWALRootDir(conf);
        fs = walDir.getFileSystem(conf);
    } catch (IOException e) {
        LOG.warn("Resigning, could not find root dir or fs", e);
        return Status.RESIGNED;
    }
    try {
        if (!processSyncReplicationWAL(filename, conf, server, fs, walDir)) {
            return Status.DONE;
        }
    } catch (IOException e) {
        LOG.warn("failed to process sync replication wal {}", filename, e);
        return Status.RESIGNED;
    }
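    // TODO have to correctly figure out when log splitting has been
    // interrupted or has encountered a transient error and when it has
    // encountered a bad non-retry-able persistent error.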
    try {
        SplitLogWorkerCoordination splitLogWorkerCoordination = server.getCoordinatedStateManager() == null ? null : server.getCoordinatedStateManager().getSplitLogWorkerCoordination();
        if (!WALSplitter.splitLogFile(walDir, fs.getFileStatus(new Path(walDir, filename)), fs, conf, p, sequenceIdChecker, splitLogWorkerCoordination, factory, server)) {
            return Status.PREEMPTED;
        }
    } catch (InterruptedIOException iioe) {
        LOG.warn("Resigning, interrupted splitting WAL {}", filename, iioe);
        return Status.RESIGNED;
    } catch (IOException e) {
        if (e instanceof FileNotFoundException) {
            // A wal file may not exist anymore. Nothing can be recovered so move on
            LOG.warn("Done, WAL {} does not exist anymore", filename, e);
            return Status.DONE;
        }
        Throwable cause = e.getCause();
        if (e instanceof RetriesExhaustedException && (cause instanceof NotServingRegionException || cause instanceof ConnectException || cause instanceof SocketTimeoutException)) {
            LOG.warn("Resigning, can't connect to target regionserver splitting WAL {}", filename, e);
            return Status.RESIGNED;
        } else if (cause instanceof InterruptedException) {
            LOG.warn("Resigning, interrupted splitting WAL {}", filename, e);
            return Status.RESIGNED;
        }
        LOG.warn("Error splitting WAL {}", filename, e);
        return Status.ERR;
    }
    LOG.debug("Done splitting WAL {}", filename);
    return Status.DONE;
}
Also used : Path(org.apache.hadoop.fs.Path) InterruptedIOException(java.io.InterruptedIOException) SplitLogWorkerCoordination(org.apache.hadoop.hbase.coordination.SplitLogWorkerCoordination) SocketTimeoutException(java.net.SocketTimeoutException) RetriesExhaustedException(org.apache.hadoop.hbase.client.RetriesExhaustedException) NotServingRegionException(org.apache.hadoop.hbase.NotServingRegionException) FileSystem(org.apache.hadoop.fs.FileSystem) FileNotFoundException(java.io.FileNotFoundException) IOException(java.io.IOException) ConnectException(java.net.ConnectException)
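
The catch blocks in splitLog map an IOException and its cause to a worker status. A minimal, self-contained sketch of that mapping follows. RetriesExhausted and NotServingRegion are hypothetical stand-ins for the HBase exception classes so the code runs with the JDK alone, classify is an illustrative name, and the PREEMPTED path (which does not involve an exception) is left out.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

public class SplitStatusSketch {
    public enum Status { DONE, RESIGNED, ERR }

    // Hypothetical stand-in for org.apache.hadoop.hbase.client.RetriesExhaustedException.
    public static class RetriesExhausted extends IOException {
        public RetriesExhausted(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    // Hypothetical stand-in for org.apache.hadoop.hbase.NotServingRegionException.
    public static class NotServingRegion extends IOException {
        public NotServingRegion(String msg) {
            super(msg);
        }
    }

    // Mirrors the order of checks in Example 5: interruption first, then a missing
    // file, then connectivity failures after exhausted retries, then everything else.
    public static Status classify(IOException e) {
        if (e instanceof InterruptedIOException) {
            return Status.RESIGNED; // interrupted while splitting
        }
        if (e instanceof FileNotFoundException) {
            return Status.DONE; // the file is gone, nothing left to recover
        }
        Throwable cause = e.getCause();
        boolean unreachable = cause instanceof NotServingRegion
            || cause instanceof ConnectException
            || cause instanceof SocketTimeoutException;
        if (e instanceof RetriesExhausted && unreachable) {
            return Status.RESIGNED; // target server could not be reached
        }
        if (cause instanceof InterruptedException) {
            return Status.RESIGNED;
        }
        return Status.ERR;
    }

    public static void main(String[] args) {
        System.out.println(classify(new FileNotFoundException("wal-1")));
        System.out.println(classify(new RetriesExhausted("gave up", new NotServingRegion("r1"))));
        System.out.println(classify(new IOException("disk error")));
    }
}
```

The order of the checks matters: java.net.SocketTimeoutException is itself a subclass of InterruptedIOException, so a socket timeout thrown directly (rather than carried as a cause) is handled by the first branch.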

Aggregations

NotServingRegionException (org.apache.hadoop.hbase.NotServingRegionException) 22
IOException (java.io.IOException) 11
DoNotRetryIOException (org.apache.hadoop.hbase.DoNotRetryIOException) 7
HBaseIOException (org.apache.hadoop.hbase.HBaseIOException) 5
HRegionInfo (org.apache.hadoop.hbase.HRegionInfo) 5
HRegionLocation (org.apache.hadoop.hbase.HRegionLocation) 4
ServerName (org.apache.hadoop.hbase.ServerName) 4
Test (org.junit.Test) 4
InterruptedIOException (java.io.InterruptedIOException) 3
UncheckedIOException (java.io.UncheckedIOException) 3
TableName (org.apache.hadoop.hbase.TableName) 3
RegionInfo (org.apache.hadoop.hbase.client.RegionInfo) 3
Pair (org.apache.hadoop.hbase.util.Pair) 3
ByteString (org.apache.hbase.thirdparty.com.google.protobuf.ByteString) 3
FileNotFoundException (java.io.FileNotFoundException) 2
List (java.util.List) 2
CompletableFuture (java.util.concurrent.CompletableFuture) 2
ExecutionException (java.util.concurrent.ExecutionException) 2
AtomicBoolean (java.util.concurrent.atomic.AtomicBoolean) 2
CellScanner (org.apache.hadoop.hbase.CellScanner) 2