
Example 1 with SeekableFSInputStream

Use of org.apache.gobblin.util.io.SeekableFSInputStream in project incubator-gobblin by apache.

Class ParallelRunnerTest, method testMovePath.

@Test
public void testMovePath() throws IOException, URISyntaxException {
    String expected = "test";
    ByteArrayOutputStream actual = new ByteArrayOutputStream();
    Path src = new Path("/src/file.txt");
    Path dst = new Path("/dst/file.txt");
    // Source file system: src exists, is a file, and opens onto an in-memory stream.
    FileSystem fs1 = Mockito.mock(FileSystem.class);
    Mockito.when(fs1.exists(src)).thenReturn(true);
    Mockito.when(fs1.isFile(src)).thenReturn(true);
    Mockito.when(fs1.getUri()).thenReturn(new URI("fs1:////"));
    Mockito.when(fs1.getFileStatus(src)).thenReturn(new FileStatus(1, false, 1, 1, 1, src));
    Mockito.when(fs1.open(src)).thenReturn(new FSDataInputStream(new SeekableFSInputStream(new ByteArrayInputStream(expected.getBytes()))));
    Mockito.when(fs1.delete(src, true)).thenReturn(true);
    // Destination file system: dst does not exist yet; writes are captured in the "actual" buffer.
    FileSystem fs2 = Mockito.mock(FileSystem.class);
    Mockito.when(fs2.exists(dst)).thenReturn(false);
    Mockito.when(fs2.getUri()).thenReturn(new URI("fs2:////"));
    Mockito.when(fs2.getConf()).thenReturn(new Configuration());
    Mockito.when(fs2.create(dst, false)).thenReturn(new FSDataOutputStream(actual, null));
    // The assertion below runs only after the try block has closed the runner.
    try (ParallelRunner parallelRunner = new ParallelRunner(1, fs1)) {
        parallelRunner.movePath(src, fs2, dst, Optional.<String>absent());
    }
    // TestNG argument order is (actual, expected).
    Assert.assertEquals(actual.toString(), expected);
}
Also used : Path(org.apache.hadoop.fs.Path) FileStatus(org.apache.hadoop.fs.FileStatus) Configuration(org.apache.hadoop.conf.Configuration) ByteArrayInputStream(java.io.ByteArrayInputStream) FileSystem(org.apache.hadoop.fs.FileSystem) FSDataInputStream(org.apache.hadoop.fs.FSDataInputStream) SeekableFSInputStream(org.apache.gobblin.util.io.SeekableFSInputStream) ByteArrayOutputStream(java.io.ByteArrayOutputStream) FSDataOutputStream(org.apache.hadoop.fs.FSDataOutputStream) URI(java.net.URI) Test(org.testng.annotations.Test)
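
The stubs in the test above (open on the source file system, create without overwrite on the destination, delete on the source) suggest that a move between two different file systems amounts to copy, then delete. The following is a minimal sketch of that idea using only java.nio.file on the local disk. The class name CopyThenDeleteMoveSketch and its move method are hypothetical and are not part of Gobblin or Hadoop; this is not ParallelRunner's implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration of a copy-then-delete move; not Gobblin code. */
public class CopyThenDeleteMoveSketch {

    /** Copies src to dst without overwriting, then deletes src. */
    public static void move(Path src, Path dst) throws IOException {
        if (!Files.isRegularFile(src)) {
            throw new IOException("Source is not a file: " + src);
        }
        // CREATE_NEW fails if dst already exists, similar in spirit to create(dst, false) in the test.
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst, StandardOpenOption.CREATE_NEW)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        // Remove the source only after the copy has finished and both streams are closed.
        Files.delete(src);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("move-sketch");
        Path src = dir.resolve("src.txt");
        Path dst = dir.resolve("dst.txt");
        Files.write(src, "test".getBytes(StandardCharsets.UTF_8));
        move(src, dst);
        System.out.println(new String(Files.readAllBytes(dst), StandardCharsets.UTF_8));
        System.out.println(Files.exists(src));
    }
}
```

Deleting the source only after the streams are closed means a failed copy leaves the original file in place.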

Example 2 with SeekableFSInputStream

Use of org.apache.gobblin.util.io.SeekableFSInputStream in project incubator-gobblin by apache.

Class SimpleHadoopFilesystemConfigStore, method deploy.

/**
 * Deploy configs provided by {@link FsDeploymentConfig#getDeployableConfigSource()} to HDFS.
 * For each {@link ConfigStream} returned by {@link DeployableConfigSource#getConfigStreams()}, creates a resource on HDFS.
 * <br>
 * <ul> Does the following:
 * <li> Read {@link ConfigStream}s and write them to HDFS
 * <li> Create parent directories of {@link ConfigStream#getConfigPath()} if required
 * <li> Apply {@link FsDeploymentConfig#getStorePermissions()} to all resources created on HDFS
 * <li> Update current active version in the store metadata file.
 * </ul>
 *
 * <p>
 *  For example: If "test-root" is a resource on the classpath and all resources under it need to be deployed,
 * <br>
 * <br>
 * <b>In Classpath:</b><br>
 * <blockquote> <code>
 *       test-root<br>
 *       &emsp;/data<br>
 *       &emsp;&emsp;/set1<br>
 *       &emsp;&emsp;&emsp;/main.conf<br>
 *       &emsp;/tag<br>
 *       &emsp;&emsp;/tag1<br>
 *       &emsp;&emsp;&emsp;/main.conf<br>
 *     </code> </blockquote>
 * </p>
 *
 * <p>
 *  A new version 2.0.0 {@link FsDeploymentConfig#getNewVersion()} is created on HDFS under <code>this.physicalStoreRoot/_CONFIG_STORE</code>
 * <br>
 * <br>
 * <b>On HDFS after deploy:</b><br>
 * <blockquote> <code>
 *       /_CONFIG_STORE<br>
 *       &emsp;/2.0.0<br>
 *       &emsp;&emsp;/data<br>
 *       &emsp;&emsp;&emsp;/set1<br>
 *       &emsp;&emsp;&emsp;&emsp;/main.conf<br>
 *       &emsp;&emsp;/tag<br>
 *       &emsp;&emsp;&emsp;/tag1<br>
 *       &emsp;&emsp;&emsp;&emsp;/main.conf<br>
 *     </code> </blockquote>
 * </p>
 */
@Override
public void deploy(FsDeploymentConfig deploymentConfig) throws IOException {
    log.info("Deploying with config : " + deploymentConfig);
    Path hdfsconfigStoreRoot = new Path(this.physicalStoreRoot.getPath(), CONFIG_STORE_NAME);
    if (!this.fs.exists(hdfsconfigStoreRoot)) {
        throw new IOException("Config store root not present at " + this.physicalStoreRoot.getPath());
    }
    Path hdfsNewVersionPath = new Path(hdfsconfigStoreRoot, deploymentConfig.getNewVersion());
    if (!this.fs.exists(hdfsNewVersionPath)) {
        this.fs.mkdirs(hdfsNewVersionPath, deploymentConfig.getStorePermissions());
        Set<ConfigStream> confStreams = deploymentConfig.getDeployableConfigSource().getConfigStreams();
        for (ConfigStream confStream : confStreams) {
            String confAtPath = confStream.getConfigPath();
            log.info("Copying resource at : " + confAtPath);
            Path hdsfConfPath = new Path(hdfsNewVersionPath, confAtPath);
            if (!this.fs.exists(hdsfConfPath.getParent())) {
                this.fs.mkdirs(hdsfConfPath.getParent());
            }
            // If an empty directory needs to be created, it may not have a stream.
            if (confStream.getInputStream().isPresent()) {
                // Read the resource as a stream from the classpath and write it to HDFS
                try (SeekableFSInputStream inputStream = new SeekableFSInputStream(confStream.getInputStream().get());
                    FSDataOutputStream os = this.fs.create(hdsfConfPath, false)) {
                    StreamUtils.copy(inputStream, os);
                }
            }
        }
        // Set permission for newly copied files
        for (FileStatus fileStatus : FileListUtils.listPathsRecursively(this.fs, hdfsNewVersionPath, FileListUtils.NO_OP_PATH_FILTER)) {
            this.fs.setPermission(fileStatus.getPath(), deploymentConfig.getStorePermissions());
        }
    } else {
        log.warn(String.format("STORE WITH VERSION %s ALREADY EXISTS. NEW RESOURCES WILL NOT BE COPIED. ONLY STORE METADATA FILE WILL BE UPDATED TO %s", deploymentConfig.getNewVersion(), deploymentConfig.getNewVersion()));
    }
    this.storeMetadata.setCurrentVersion(deploymentConfig.getNewVersion());
    log.info(String.format("New version %s of config store deployed at %s", deploymentConfig.getNewVersion(), hdfsconfigStoreRoot));
}
Also used : SingleLinkedListConfigKeyPath(org.apache.gobblin.config.common.impl.SingleLinkedListConfigKeyPath) Path(org.apache.hadoop.fs.Path) ConfigKeyPath(org.apache.gobblin.config.store.api.ConfigKeyPath) ConfigStream(org.apache.gobblin.config.store.deploy.ConfigStream) FileStatus(org.apache.hadoop.fs.FileStatus) SeekableFSInputStream(org.apache.gobblin.util.io.SeekableFSInputStream) IOException(java.io.IOException) FSDataOutputStream(org.apache.hadoop.fs.FSDataOutputStream)
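
The deploy steps listed in the Javadoc above (check the store root, skip a version that already exists, create parent directories, copy each stream) can be sketched against the local file system with java.nio.file. This is only an analogue under stated assumptions: the class LocalConfigDeploySketch, its deploy method, and the use of a Map from relative path to InputStream in place of ConfigStream are all hypothetical. Permissions and the store metadata update are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical local-filesystem analogue of the deploy flow; not Gobblin code. */
public class LocalConfigDeploySketch {

    /**
     * Copies each (relative path -> stream) entry under storeRoot/version.
     * A null stream means there is nothing to copy; only the parent directory is created.
     * Returns false, copying nothing, if the version directory already exists.
     */
    public static boolean deploy(Path storeRoot, String version, Map<String, InputStream> configs)
            throws IOException {
        if (!Files.isDirectory(storeRoot)) {
            throw new IOException("Config store root not present at " + storeRoot);
        }
        Path versionDir = storeRoot.resolve(version);
        if (Files.exists(versionDir)) {
            return false; // an existing version is never overwritten
        }
        Files.createDirectories(versionDir);
        for (Map.Entry<String, InputStream> entry : configs.entrySet()) {
            Path target = versionDir.resolve(entry.getKey());
            Files.createDirectories(target.getParent());
            if (entry.getValue() != null) {
                try (InputStream in = entry.getValue()) {
                    Files.copy(in, target); // throws if target already exists
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("config-store");
        Map<String, InputStream> configs = new LinkedHashMap<>();
        configs.put("data/set1/main.conf", new ByteArrayInputStream("a=1".getBytes(StandardCharsets.UTF_8)));
        configs.put("tag/tag1/main.conf", new ByteArrayInputStream("b=2".getBytes(StandardCharsets.UTF_8)));
        System.out.println(deploy(root, "2.0.0", configs)); // first call copies the files
        System.out.println(deploy(root, "2.0.0", configs)); // second call finds the version and skips
    }
}
```

As in the original method, the existence check on the version directory is what makes a repeated deploy of the same version a no-op for the files themselves.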

Aggregations

SeekableFSInputStream (org.apache.gobblin.util.io.SeekableFSInputStream) 2
FSDataOutputStream (org.apache.hadoop.fs.FSDataOutputStream) 2
FileStatus (org.apache.hadoop.fs.FileStatus) 2
Path (org.apache.hadoop.fs.Path) 2
ByteArrayInputStream (java.io.ByteArrayInputStream) 1
ByteArrayOutputStream (java.io.ByteArrayOutputStream) 1
IOException (java.io.IOException) 1
URI (java.net.URI) 1
SingleLinkedListConfigKeyPath (org.apache.gobblin.config.common.impl.SingleLinkedListConfigKeyPath) 1
ConfigKeyPath (org.apache.gobblin.config.store.api.ConfigKeyPath) 1
ConfigStream (org.apache.gobblin.config.store.deploy.ConfigStream) 1
Configuration (org.apache.hadoop.conf.Configuration) 1
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream) 1
FileSystem (org.apache.hadoop.fs.FileSystem) 1
Test (org.testng.annotations.Test) 1