Example 1 with HarFileSystem

Use of org.apache.hadoop.fs.HarFileSystem in the Apache Hadoop project.

From the class TestHadoopArchives, method testCopyToLocal:

/*
 * Tests copying from the archive file system to a local file system.
 */
@Test
public void testCopyToLocal() throws Exception {
    final String fullHarPathStr = makeArchive();
    // make path to copy the file to:
    final String tmpDir = System.getProperty("test.build.data", "build/test/data") + "/work-dir/har-fs-tmp";
    final Path tmpPath = new Path(tmpDir);
    final LocalFileSystem localFs = FileSystem.getLocal(new Configuration());
    localFs.delete(tmpPath, true);
    localFs.mkdirs(tmpPath);
    assertTrue(localFs.exists(tmpPath));
    // Create fresh HarFs:
    final HarFileSystem harFileSystem = new HarFileSystem(fs);
    try {
        final URI harUri = new URI(fullHarPathStr);
        harFileSystem.initialize(harUri, fs.getConf());
        final Path sourcePath = new Path(fullHarPathStr + Path.SEPARATOR + "a");
        final Path targetPath = new Path(tmpPath, "straus");
        // copy the Har file to a local file system:
        harFileSystem.copyToLocalFile(false, sourcePath, targetPath);
        FileStatus straus = localFs.getFileStatus(targetPath);
        // the file should contain just 1 character:
        assertEquals(1, straus.getLen());
    } finally {
        harFileSystem.close();
        localFs.delete(tmpPath, true);
    }
}
Also used: Path (org.apache.hadoop.fs.Path), FileStatus (org.apache.hadoop.fs.FileStatus), LocatedFileStatus (org.apache.hadoop.fs.LocatedFileStatus), CapacitySchedulerConfiguration (org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration), Configuration (org.apache.hadoop.conf.Configuration), LocalFileSystem (org.apache.hadoop.fs.LocalFileSystem), HarFileSystem (org.apache.hadoop.fs.HarFileSystem), URI (java.net.URI), Test (org.junit.Test)
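
The same pattern can be reduced to a standalone sketch. This is not code from the Hadoop test suite: the archive URI and file names are hypothetical, FileSystem.get(conf) stands in for the test class's fs field, and the archive is assumed to already exist (in the test it is produced by the makeArchive() helper).

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.HarFileSystem;
import org.apache.hadoop.fs.Path;

public class HarCopyToLocalSketch {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Stand-in for the test class's `fs` field: the file system that
        // physically holds the .har file.
        FileSystem underlyingFs = FileSystem.get(conf);

        // Hypothetical archive location; in the test this URI comes from makeArchive().
        URI harUri = new URI("har:///user/hadoop/data.har");

        HarFileSystem harFs = new HarFileSystem(underlyingFs);
        try {
            harFs.initialize(harUri, conf);
            // Copy one archived entry out to the local disk. delSrc is false;
            // HAR archives are read-only, so the source cannot be deleted anyway.
            Path source = new Path("har:///user/hadoop/data.har" + Path.SEPARATOR + "a");
            Path target = new Path("/tmp/har-copy/a");
            harFs.copyToLocalFile(false, source, target);
        } finally {
            harFs.close();
        }
    }
}

As in the test above, the HarFileSystem wraps an existing FileSystem and must be initialized with the har:// URI before any reads; copyToLocalFile is inherited unchanged from FileSystem.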

Example 2 with HarFileSystem

Use of org.apache.hadoop.fs.HarFileSystem in the Apache Hadoop project.

From the class TestHadoopArchives, method testReadFileContent:

@Test
public void testReadFileContent() throws Exception {
    fileList.add(createFile(inputPath, fs, "c c"));
    final Path sub1 = new Path(inputPath, "sub 1");
    fs.mkdirs(sub1);
    fileList.add(createFile(inputPath, fs, sub1.getName(), "file x y z"));
    fileList.add(createFile(inputPath, fs, sub1.getName(), "file"));
    fileList.add(createFile(inputPath, fs, sub1.getName(), "x"));
    fileList.add(createFile(inputPath, fs, sub1.getName(), "y"));
    fileList.add(createFile(inputPath, fs, sub1.getName(), "z"));
    final Path sub2 = new Path(inputPath, "sub 1 with suffix");
    fs.mkdirs(sub2);
    fileList.add(createFile(inputPath, fs, sub2.getName(), "z"));
    // Generate a big binary file content:
    final byte[] binContent = prepareBin();
    fileList.add(createFile(inputPath, fs, binContent, sub2.getName(), "bin"));
    fileList.add(createFile(inputPath, fs, new byte[0], sub2.getName(), "zero-length"));
    final String fullHarPathStr = makeArchive();
    // Create fresh HarFs:
    final HarFileSystem harFileSystem = new HarFileSystem(fs);
    try {
        final URI harUri = new URI(fullHarPathStr);
        harFileSystem.initialize(harUri, fs.getConf());
        // now read the file content and compare it against the expected:
        int readFileCount = 0;
        for (final String pathStr0 : fileList) {
            final Path path = new Path(fullHarPathStr + Path.SEPARATOR + pathStr0);
            final String baseName = path.getName();
            final FileStatus status = harFileSystem.getFileStatus(path);
            if (status.isFile()) {
                // read the file:
                final byte[] actualContentSimple = readAllSimple(harFileSystem.open(path), true);
                final byte[] actualContentBuffer = readAllWithBuffer(harFileSystem.open(path), true);
                assertArrayEquals(actualContentSimple, actualContentBuffer);
                final byte[] actualContentFully = readAllWithReadFully(actualContentSimple.length, harFileSystem.open(path), true);
                assertArrayEquals(actualContentSimple, actualContentFully);
                final byte[] actualContentSeek = readAllWithSeek(actualContentSimple.length, harFileSystem.open(path), true);
                assertArrayEquals(actualContentSimple, actualContentSeek);
                final byte[] actualContentRead4 = readAllWithRead4(harFileSystem.open(path), true);
                assertArrayEquals(actualContentSimple, actualContentRead4);
                final byte[] actualContentSkip = readAllWithSkip(actualContentSimple.length, harFileSystem.open(path), harFileSystem.open(path), true);
                assertArrayEquals(actualContentSimple, actualContentSkip);
                if ("bin".equals(baseName)) {
                    assertArrayEquals(binContent, actualContentSimple);
                } else if ("zero-length".equals(baseName)) {
                    assertEquals(0, actualContentSimple.length);
                } else {
                    String actual = new String(actualContentSimple, "UTF-8");
                    assertEquals(baseName, actual);
                }
                readFileCount++;
            }
        }
        assertEquals(fileList.size(), readFileCount);
    } finally {
        harFileSystem.close();
    }
}
Also used: Path (org.apache.hadoop.fs.Path), FileStatus (org.apache.hadoop.fs.FileStatus), LocatedFileStatus (org.apache.hadoop.fs.LocatedFileStatus), HarFileSystem (org.apache.hadoop.fs.HarFileSystem), URI (java.net.URI), Test (org.junit.Test)
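
A condensed variant of the same read pattern, as a hedged sketch rather than the test's code: it lists the archive's top-level entries with listStatus and drains each file through a plain read loop, showing only one of the several read strategies (readAllSimple, readAllWithSeek, etc.) the test exercises. The archive URI and buffer size are assumptions.

import java.io.ByteArrayOutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.HarFileSystem;
import org.apache.hadoop.fs.Path;

public class HarReadSketch {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem underlyingFs = FileSystem.get(conf);

        HarFileSystem harFs = new HarFileSystem(underlyingFs);
        try {
            // Hypothetical archive URI; the test builds it via makeArchive().
            harFs.initialize(new URI("har:///user/hadoop/data.har"), conf);

            // Walk the archive's top-level entries and read each file fully.
            for (FileStatus status : harFs.listStatus(new Path("har:///user/hadoop/data.har"))) {
                if (!status.isFile()) {
                    continue;
                }
                try (FSDataInputStream in = harFs.open(status.getPath())) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buffer = new byte[4096];
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                    System.out.println(status.getPath() + ": " + out.size() + " bytes");
                }
            }
        } finally {
            harFs.close();
        }
    }
}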

Aggregations

URI (java.net.URI): 2
FileStatus (org.apache.hadoop.fs.FileStatus): 2
HarFileSystem (org.apache.hadoop.fs.HarFileSystem): 2
LocatedFileStatus (org.apache.hadoop.fs.LocatedFileStatus): 2
Path (org.apache.hadoop.fs.Path): 2
Test (org.junit.Test): 2
Configuration (org.apache.hadoop.conf.Configuration): 1
LocalFileSystem (org.apache.hadoop.fs.LocalFileSystem): 1
CapacitySchedulerConfiguration (org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration): 1