Use of org.apache.hadoop.hdfs.DFSClient in project hadoop by apache.
The class TestBalancer, method testBalancerWithRamDisk.
/*
* Test Balancer with Ram_Disk configured
* One DN has two files on RAM_DISK, other DN has no files on RAM_DISK.
* Then verify that the balancer does not migrate files on RAM_DISK across DN.
*/
@Test(timeout = 300000)
public void testBalancerWithRamDisk() throws Exception {
final int SEED = 0xFADED;
final short REPL_FACT = 1;
Configuration conf = new Configuration();
final int defaultRamDiskCapacity = 10;
final long ramDiskStorageLimit = ((long) defaultRamDiskCapacity * DEFAULT_RAM_DISK_BLOCK_SIZE) + (DEFAULT_RAM_DISK_BLOCK_SIZE - 1);
final long diskStorageLimit = ((long) defaultRamDiskCapacity * DEFAULT_RAM_DISK_BLOCK_SIZE) + (DEFAULT_RAM_DISK_BLOCK_SIZE - 1);
initConfWithRamDisk(conf, ramDiskStorageLimit);
cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).storageCapacities(new long[] { ramDiskStorageLimit, diskStorageLimit }).storageTypes(new StorageType[] { RAM_DISK, DEFAULT }).build();
cluster.waitActive();
// Create a few files on RAM_DISK
final String METHOD_NAME = GenericTestUtils.getMethodName();
final Path path1 = new Path("/" + METHOD_NAME + ".01.dat");
final Path path2 = new Path("/" + METHOD_NAME + ".02.dat");
DistributedFileSystem fs = cluster.getFileSystem();
DFSClient client = fs.getClient();
DFSTestUtil.createFile(fs, path1, true, DEFAULT_RAM_DISK_BLOCK_SIZE, 4 * DEFAULT_RAM_DISK_BLOCK_SIZE, DEFAULT_RAM_DISK_BLOCK_SIZE, REPL_FACT, SEED, true);
DFSTestUtil.createFile(fs, path2, true, DEFAULT_RAM_DISK_BLOCK_SIZE, 1 * DEFAULT_RAM_DISK_BLOCK_SIZE, DEFAULT_RAM_DISK_BLOCK_SIZE, REPL_FACT, SEED, true);
// Sleep for a short time to allow the lazy writer thread to do its job
Thread.sleep(6 * 1000);
// Add another fresh DN with the same type/capacity without files on RAM_DISK
StorageType[][] storageTypes = new StorageType[][] { { RAM_DISK, DEFAULT } };
long[][] storageCapacities = new long[][] { { ramDiskStorageLimit, diskStorageLimit } };
cluster.startDataNodes(conf, REPL_FACT, storageTypes, true, null, null, null, storageCapacities, null, false, false, false, null);
cluster.triggerHeartbeats();
Collection<URI> namenodes = DFSUtil.getInternalNsRpcUris(conf);
// Run Balancer
final BalancerParameters p = BalancerParameters.DEFAULT;
final int r = Balancer.run(namenodes, p, conf);
// Validate no RAM_DISK block should be moved
assertEquals(ExitStatus.NO_MOVE_PROGRESS.getExitCode(), r);
// Verify files are still on RAM_DISK
DFSTestUtil.verifyFileReplicasOnStorageType(fs, client, path1, RAM_DISK);
DFSTestUtil.verifyFileReplicasOnStorageType(fs, client, path2, RAM_DISK);
}
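For reference, a minimal standalone sketch of the DFSClient access pattern used in this test: the client is obtained from a DistributedFileSystem via getClient() and used to report the storage type of each block replica. The cluster URI and file path below are illustrative assumptions, not values from the test.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;

public class StorageTypeCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // hypothetical NameNode URI; replace with your cluster's fs.defaultFS
    try (FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:8020"), conf)) {
      DFSClient client = ((DistributedFileSystem) fs).getClient();
      // list each located block of an (illustrative) file and its storage types
      for (LocatedBlock blk : client.getLocatedBlocks("/example.dat", 0).getLocatedBlocks()) {
        for (StorageType type : blk.getStorageTypes()) {
          System.out.println(blk.getBlock() + " stored on " + type);
        }
      }
    }
  }
}
Closing the FileSystem also closes the client obtained from it, so no separate close() call is needed here.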
Use of org.apache.hadoop.hdfs.DFSClient in project hadoop by apache.
The class TestBlockTokenWithDFS, method doTestRead.
protected void doTestRead(Configuration conf, MiniDFSCluster cluster, boolean isStriped) throws Exception {
final int numDataNodes = cluster.getDataNodes().size();
final NameNode nn = cluster.getNameNode();
final NamenodeProtocols nnProto = nn.getRpcServer();
final BlockManager bm = nn.getNamesystem().getBlockManager();
final BlockTokenSecretManager sm = bm.getBlockTokenSecretManager();
// set a short token lifetime (1 second) initially
SecurityTestUtil.setBlockTokenLifetime(sm, 1000L);
Path fileToRead = new Path(FILE_TO_READ);
FileSystem fs = cluster.getFileSystem();
byte[] expected = generateBytes(FILE_SIZE);
createFile(fs, fileToRead, expected);
/*
* setup for testing expiration handling of cached tokens
*/
// read using blockSeekTo(). Acquired tokens are cached in in1
FSDataInputStream in1 = fs.open(fileToRead);
assertTrue(checkFile1(in1, expected));
// read using blockSeekTo(). Acquired tokens are cached in in2
FSDataInputStream in2 = fs.open(fileToRead);
assertTrue(checkFile1(in2, expected));
// read using fetchBlockByteRange(). Acquired tokens are cached in in3
FSDataInputStream in3 = fs.open(fileToRead);
assertTrue(checkFile2(in3, expected));
/*
* testing READ interface on DN using a BlockReader
*/
DFSClient client = null;
try {
client = new DFSClient(new InetSocketAddress("localhost", cluster.getNameNodePort()), conf);
} finally {
if (client != null)
client.close();
}
List<LocatedBlock> locatedBlocks = nnProto.getBlockLocations(FILE_TO_READ, 0, FILE_SIZE).getLocatedBlocks();
// first block
LocatedBlock lblock = locatedBlocks.get(0);
// verify token is not expired
assertFalse(isBlockTokenExpired(lblock));
// read with valid token, should succeed
tryRead(conf, lblock, true);
while (!isBlockTokenExpired(lblock)) {
try {
Thread.sleep(10);
} catch (InterruptedException ignored) {
}
}
/*
* continue testing READ interface on DN using a BlockReader
*/
// verify token is expired
assertTrue(isBlockTokenExpired(lblock));
// read should fail
tryRead(conf, lblock, false);
// use a valid new token
bm.setBlockToken(lblock, BlockTokenIdentifier.AccessMode.READ);
// read should succeed
tryRead(conf, lblock, true);
// use a token with wrong blockID
long rightId = lblock.getBlock().getBlockId();
long wrongId = rightId + 1;
lblock.getBlock().setBlockId(wrongId);
bm.setBlockToken(lblock, BlockTokenIdentifier.AccessMode.READ);
lblock.getBlock().setBlockId(rightId);
// read should fail
tryRead(conf, lblock, false);
// use a token with wrong access modes
bm.setBlockToken(lblock, BlockTokenIdentifier.AccessMode.WRITE);
// read should fail
tryRead(conf, lblock, false);
// set a long token lifetime for future tokens
SecurityTestUtil.setBlockTokenLifetime(sm, 600 * 1000L);
/*
* testing that when cached tokens are expired, DFSClient will re-fetch
* tokens transparently for READ.
*/
// confirm all tokens cached in in1 are expired by now
List<LocatedBlock> lblocks = DFSTestUtil.getAllBlocks(in1);
for (LocatedBlock blk : lblocks) {
assertTrue(isBlockTokenExpired(blk));
}
// verify blockSeekTo() is able to re-fetch token transparently
in1.seek(0);
assertTrue(checkFile1(in1, expected));
// confirm all tokens cached in in2 are expired by now
List<LocatedBlock> lblocks2 = DFSTestUtil.getAllBlocks(in2);
for (LocatedBlock blk : lblocks2) {
assertTrue(isBlockTokenExpired(blk));
}
// verify blockSeekTo() is able to re-fetch token transparently (testing via another interface method)
if (isStriped) {
// striped block doesn't support seekToNewSource
in2.seek(0);
} else {
assertTrue(in2.seekToNewSource(0));
}
assertTrue(checkFile1(in2, expected));
// confirm all tokens cached in in3 are expired by now
List<LocatedBlock> lblocks3 = DFSTestUtil.getAllBlocks(in3);
for (LocatedBlock blk : lblocks3) {
assertTrue(isBlockTokenExpired(blk));
}
// verify fetchBlockByteRange() is able to re-fetch token transparently
assertTrue(checkFile2(in3, expected));
/*
* testing that after datanodes are restarted on the same ports, cached
* tokens should still work and there is no need to fetch new tokens from
* namenode. This test should run while namenode is down (to make sure no
* new tokens can be fetched from namenode).
*/
// restart datanodes on the same ports that they currently use
assertTrue(cluster.restartDataNodes(true));
cluster.waitActive();
assertEquals(numDataNodes, cluster.getDataNodes().size());
cluster.shutdownNameNode(0);
// confirm tokens cached in in1 are still valid
lblocks = DFSTestUtil.getAllBlocks(in1);
for (LocatedBlock blk : lblocks) {
assertFalse(isBlockTokenExpired(blk));
}
// verify blockSeekTo() still works (forced to use cached tokens)
in1.seek(0);
assertTrue(checkFile1(in1, expected));
// confirm tokens cached in in2 are still valid
lblocks2 = DFSTestUtil.getAllBlocks(in2);
for (LocatedBlock blk : lblocks2) {
assertFalse(isBlockTokenExpired(blk));
}
// verify blockSeekTo() still works (forced to use cached tokens)
if (isStriped) {
in2.seek(0);
} else {
in2.seekToNewSource(0);
}
assertTrue(checkFile1(in2, expected));
// confirm tokens cached in in3 are still valid
lblocks3 = DFSTestUtil.getAllBlocks(in3);
for (LocatedBlock blk : lblocks3) {
assertFalse(isBlockTokenExpired(blk));
}
// verify fetchBlockByteRange() still works (forced to use cached tokens)
assertTrue(checkFile2(in3, expected));
/*
* testing that when namenode is restarted, cached tokens should still
* work and there is no need to fetch new tokens from namenode. Like the
* previous test, this test should also run while namenode is down. The
* setup for this test depends on the previous test.
*/
// restart the namenode and then shut it down for test
cluster.restartNameNode(0);
cluster.shutdownNameNode(0);
// verify blockSeekTo() still works (forced to use cached tokens)
in1.seek(0);
assertTrue(checkFile1(in1, expected));
// verify again blockSeekTo() still works (forced to use cached tokens)
if (isStriped) {
in2.seek(0);
} else {
in2.seekToNewSource(0);
}
assertTrue(checkFile1(in2, expected));
// verify fetchBlockByteRange() still works (forced to use cached tokens)
assertTrue(checkFile2(in3, expected));
/*
* testing that after both namenode and datanodes got restarted (namenode
* first, followed by datanodes), DFSClient can't access DN without
* re-fetching tokens and is able to re-fetch tokens transparently. The
* setup of this test depends on the previous test.
*/
// restore the cluster and restart the datanodes for test
cluster.restartNameNode(0);
assertTrue(cluster.restartDataNodes(true));
cluster.waitActive();
assertEquals(numDataNodes, cluster.getDataNodes().size());
// shutdown namenode so that DFSClient can't get new tokens from namenode
cluster.shutdownNameNode(0);
// verify blockSeekTo() fails (cached tokens become invalid)
in1.seek(0);
assertFalse(checkFile1(in1, expected));
// verify fetchBlockByteRange() fails (cached tokens become invalid)
assertFalse(checkFile2(in3, expected));
// restart the namenode to allow DFSClient to re-fetch tokens
cluster.restartNameNode(0);
// verify blockSeekTo() works again (by transparently re-fetching
// tokens from namenode)
in1.seek(0);
assertTrue(checkFile1(in1, expected));
if (isStriped) {
in2.seek(0);
} else {
in2.seekToNewSource(0);
}
assertTrue(checkFile1(in2, expected));
// verify fetchBlockByteRange() works again (by transparently
// re-fetching tokens from namenode)
assertTrue(checkFile2(in3, expected));
/*
* testing that when datanodes are restarted on different ports, DFSClient
* is able to re-fetch tokens transparently to connect to them
*/
// restart datanodes on newly assigned ports
assertTrue(cluster.restartDataNodes(false));
cluster.waitActive();
assertEquals(numDataNodes, cluster.getDataNodes().size());
// verify blockSeekTo() is able to re-fetch token transparently
in1.seek(0);
assertTrue(checkFile1(in1, expected));
// verify blockSeekTo() is able to re-fetch token transparently
if (isStriped) {
in2.seek(0);
} else {
in2.seekToNewSource(0);
}
assertTrue(checkFile1(in2, expected));
// verify fetchBlockByteRange() is able to re-fetch token transparently
assertTrue(checkFile2(in3, expected));
}
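A minimal sketch of the direct construction pattern this test uses for DFSClient, outside the MiniDFSCluster harness; the NameNode host and port are assumptions.
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSClient;

public class DirectDfsClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DFSClient client = null;
    try {
      // connect straight to the NameNode RPC address (illustrative host/port)
      client = new DFSClient(new InetSocketAddress("localhost", 8020), conf);
      System.out.println("root exists: " + (client.getFileInfo("/") != null));
    } finally {
      if (client != null) {
        client.close();
      }
    }
  }
}
A directly constructed DFSClient owns its own RPC connection and must be closed explicitly, which is why the test wraps construction in try/finally.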
Use of org.apache.hadoop.hdfs.DFSClient in project hadoop by apache.
The class TestDFSAdmin, method testReportCommand.
@Test(timeout = 120000)
public void testReportCommand() throws Exception {
redirectStream();
/* init conf */
final Configuration dfsConf = new HdfsConfiguration();
dfsConf.setInt(DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY, 500); // 0.5s
dfsConf.setLong(DFS_HEARTBEAT_INTERVAL_KEY, 1);
final Path baseDir = new Path(PathUtils.getTestDir(getClass()).getAbsolutePath(), GenericTestUtils.getMethodName());
dfsConf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, baseDir.toString());
final int numDn = 3;
/* init cluster */
try (MiniDFSCluster miniCluster = new MiniDFSCluster.Builder(dfsConf).numDataNodes(numDn).build()) {
miniCluster.waitActive();
assertEquals(numDn, miniCluster.getDataNodes().size());
/* local vars */
final DFSAdmin dfsAdmin = new DFSAdmin(dfsConf);
final DFSClient client = miniCluster.getFileSystem().getClient();
/* run and verify report command */
resetStream();
assertEquals(0, ToolRunner.run(dfsAdmin, new String[] { "-report" }));
verifyNodesAndCorruptBlocks(numDn, numDn, 0, client);
/* shut down one DN */
final List<DataNode> datanodes = miniCluster.getDataNodes();
final DataNode last = datanodes.get(datanodes.size() - 1);
last.shutdown();
miniCluster.setDataNodeDead(last.getDatanodeId());
/* run and verify report command */
assertEquals(0, ToolRunner.run(dfsAdmin, new String[] { "-report" }));
verifyNodesAndCorruptBlocks(numDn, numDn - 1, 0, client);
/* corrupt one block */
final short replFactor = 1;
final long fileLength = 512L;
final FileSystem fs = miniCluster.getFileSystem();
final Path file = new Path(baseDir, "/corrupted");
DFSTestUtil.createFile(fs, file, fileLength, replFactor, 12345L);
DFSTestUtil.waitReplication(fs, file, replFactor);
final ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, file);
final int blockFilesCorrupted = miniCluster.corruptBlockOnDataNodes(block);
assertEquals("Fail to corrupt all replicas for block " + block, replFactor, blockFilesCorrupted);
try {
IOUtils.copyBytes(fs.open(file), new IOUtils.NullOutputStream(), conf, true);
fail("Should have failed to read the file with corrupted blocks.");
} catch (ChecksumException ignored) {
// expected exception reading corrupt blocks
}
/*
* Increase replication factor, this should invoke transfer request.
* Receiving datanode fails on checksum and reports it to namenode
*/
fs.setReplication(file, (short) (replFactor + 1));
/* get block details and check if the block is corrupt */
GenericTestUtils.waitFor(new Supplier<Boolean>() {
@Override
public Boolean get() {
LocatedBlocks blocks = null;
try {
miniCluster.triggerBlockReports();
blocks = client.getNamenode().getBlockLocations(file.toString(), 0, Long.MAX_VALUE);
} catch (IOException e) {
return false;
}
return blocks != null && blocks.get(0).isCorrupt();
}
}, 1000, 60000);
BlockManagerTestUtil.updateState(miniCluster.getNameNode().getNamesystem().getBlockManager());
/* run and verify report command */
resetStream();
assertEquals(0, ToolRunner.run(dfsAdmin, new String[] { "-report" }));
verifyNodesAndCorruptBlocks(numDn, numDn - 1, 1, client);
}
}
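A minimal sketch of the corrupt-block check this test performs through the DFSClient's NameNode proxy; the file path is illustrative and the FileSystem is assumed to point at an HDFS fs.defaultFS.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;
import org.apache.hadoop.hdfs.protocol.LocatedBlocks;

public class CorruptBlockCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      DFSClient client = ((DistributedFileSystem) fs).getClient();
      // ask the NameNode for all block locations of an (illustrative) file
      LocatedBlocks blocks =
          client.getNamenode().getBlockLocations("/corrupted", 0, Long.MAX_VALUE);
      for (LocatedBlock blk : blocks.getLocatedBlocks()) {
        System.out.println(blk.getBlock() + " corrupt=" + blk.isCorrupt());
      }
    }
  }
}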
Use of org.apache.hadoop.hdfs.DFSClient in project hadoop by apache.
The class WebHdfsHandler, method onGetFileChecksum.
private void onGetFileChecksum(ChannelHandlerContext ctx) throws IOException {
MD5MD5CRC32FileChecksum checksum = null;
final String nnId = params.namenodeId();
DFSClient dfsclient = newDfsClient(nnId, conf);
try {
checksum = dfsclient.getFileChecksum(path, Long.MAX_VALUE);
dfsclient.close();
dfsclient = null;
} finally {
IOUtils.cleanup(LOG, dfsclient);
}
final byte[] js = JsonUtil.toJsonString(checksum).getBytes(StandardCharsets.UTF_8);
resp = new DefaultFullHttpResponse(HTTP_1_1, OK, Unpooled.wrappedBuffer(js));
resp.headers().set(CONTENT_TYPE, APPLICATION_JSON_UTF8);
resp.headers().set(CONTENT_LENGTH, js.length);
resp.headers().set(CONNECTION, CLOSE);
ctx.writeAndFlush(resp).addListener(ChannelFutureListener.CLOSE);
}
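A minimal sketch of the getFileChecksum() call the handler makes, here using a client obtained from the FileSystem rather than the handler's per-request newDfsClient(...); the path is illustrative.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ChecksumExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      DFSClient client = ((DistributedFileSystem) fs).getClient();
      // checksum computed over the whole file, matching the Long.MAX_VALUE length above
      FileChecksum checksum = client.getFileChecksum("/data/file.dat", Long.MAX_VALUE);
      System.out.println(checksum);
    }
  }
}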
Use of org.apache.hadoop.hdfs.DFSClient in project hadoop by apache.
The class WebHdfsHandler, method onCreate.
private void onCreate(ChannelHandlerContext ctx) throws IOException, URISyntaxException {
writeContinueHeader(ctx);
final String nnId = params.namenodeId();
final int bufferSize = params.bufferSize();
final short replication = params.replication();
final long blockSize = params.blockSize();
final FsPermission unmaskedPermission = params.unmaskedPermission();
final FsPermission permission = unmaskedPermission == null ? params.permission() : FsCreateModes.create(params.permission(), unmaskedPermission);
final boolean createParent = params.createParent();
EnumSet<CreateFlag> flags = params.createFlag();
if (flags.equals(EMPTY_CREATE_FLAG)) {
flags = params.overwrite() ? EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE) : EnumSet.of(CreateFlag.CREATE);
} else {
if (params.overwrite()) {
flags.add(CreateFlag.OVERWRITE);
}
}
final DFSClient dfsClient = newDfsClient(nnId, confForCreate);
OutputStream out = dfsClient.createWrappedOutputStream(dfsClient.create(path, permission, flags, createParent, replication, blockSize, null, bufferSize, null), null);
resp = new DefaultHttpResponse(HTTP_1_1, CREATED);
final URI uri = new URI(HDFS_URI_SCHEME, nnId, path, null, null);
resp.headers().set(LOCATION, uri.toString());
resp.headers().set(CONTENT_LENGTH, 0);
resp.headers().set(ACCESS_CONTROL_ALLOW_ORIGIN, "*");
ctx.pipeline().replace(this, HdfsWriter.class.getSimpleName(), new HdfsWriter(dfsClient, out, resp));
}
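A minimal sketch of the create-and-wrap pattern in onCreate(), with illustrative path, permission, replication, block size and buffer size; the client again comes from the FileSystem instead of newDfsClient(...).
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DFSClient;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class CreateExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      DFSClient client = ((DistributedFileSystem) fs).getClient();
      EnumSet<CreateFlag> flags = EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE);
      // 1 replica, 128 MB blocks, 4 KB buffer; progress and checksum options left null as above
      try (OutputStream out = client.createWrappedOutputStream(
          client.create("/tmp/webhdfs-demo.txt", FsPermission.getFileDefault(), flags,
              true, (short) 1, 128L * 1024 * 1024, null, 4096, null), null)) {
        out.write("hello".getBytes(StandardCharsets.UTF_8));
      }
    }
  }
}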