Example 56 with Location

use of org.apache.twill.filesystem.Location in project cdap by caskdata.

the class IntegrationTestManager method addPluginArtifact.

@Override
public ArtifactManager addPluginArtifact(ArtifactId artifactId, Set<ArtifactRange> parents, @Nullable Set<PluginClass> additionalPlugins, Class<?> pluginClass, Class<?>... pluginClasses) throws Exception {
    Manifest manifest = createManifest(pluginClass, pluginClasses);
    final Location appJar = PluginJarHelper.createPluginJar(locationFactory, manifest, pluginClass, pluginClasses);
    artifactClient.add(Ids.namespace(artifactId.getNamespace()), artifactId.getArtifact(), new InputSupplier<InputStream>() {

        @Override
        public InputStream getInput() throws IOException {
            return appJar.getInputStream();
        }
    }, artifactId.getVersion(), parents, additionalPlugins);
    appJar.delete();
    return new RemoteArtifactManager(clientConfig, restClient, artifactId);
}
Also used: FileInputStream(java.io.FileInputStream) InputStream(java.io.InputStream) RemoteArtifactManager(co.cask.cdap.test.remote.RemoteArtifactManager) IOException(java.io.IOException) Manifest(java.util.jar.Manifest) Location(org.apache.twill.filesystem.Location)
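
Note the pattern here: the generated plugin jar is exposed to the artifact client as a fresh stream via Location.getInputStream(), then deleted once uploaded. A minimal, self-contained sketch of that write-stream-delete cycle over the Twill Location API (the /tmp root and file name are illustrative, not from the CDAP source):

import java.io.File;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import org.apache.twill.filesystem.LocalLocationFactory;
import org.apache.twill.filesystem.Location;
import org.apache.twill.filesystem.LocationFactory;

public class LocationStreamSketch {
    public static void main(String[] args) throws Exception {
        // illustrative local root; IntegrationTestManager gets its factory injected
        LocationFactory factory = new LocalLocationFactory(new File("/tmp"));
        Location jar = factory.create("example-plugin.jar");
        // write some bytes so there is content to stream back
        try (OutputStream os = jar.getOutputStream()) {
            os.write("stub jar content".getBytes(StandardCharsets.UTF_8));
        }
        // hand the content out as a stream, as the InputSupplier above does
        try (InputStream is = jar.getInputStream()) {
            System.out.println(is.available() + " bytes available");
        }
        // remove the temporary jar once it has been consumed
        jar.delete();
    }
}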

Example 57 with Location

use of org.apache.twill.filesystem.Location in project cdap by caskdata.

the class SparkCredentialsUpdaterTest method testCleanup.

@Test
public void testCleanup() throws IOException, InterruptedException {
    Location credentialsDir = Locations.toLocation(TEMPORARY_FOLDER.newFolder());
    // Create an updater that doesn't do any auto-update within the test time
    SparkCredentialsUpdater updater = new SparkCredentialsUpdater(createCredentialsSupplier(), credentialsDir, "credentials", TimeUnit.DAYS.toMillis(1), TimeUnit.SECONDS.toMillis(3), 3) {

        @Override
        long getNextUpdateDelay(Credentials credentials) throws IOException {
            return TimeUnit.DAYS.toMillis(1);
        }
    };
    updater.startAndWait();
    try {
        // Expect this loop to finish within 3 seconds, since we don't want to sleep for too long when testing cleanup
        for (int i = 1; i <= 5; i++) {
            Assert.assertEquals(i, credentialsDir.list().size());
            updater.run();
        }
        // Sleep for 3 seconds, which is the cleanup expire time
        TimeUnit.SECONDS.sleep(3);
        // Run the updater again; only three files should remain (2 older than the expiry time, 1 new)
        updater.run();
        Assert.assertEquals(3, credentialsDir.list().size());
    } finally {
        updater.stopAndWait();
    }
}
Also used: Credentials(org.apache.hadoop.security.Credentials) Location(org.apache.twill.filesystem.Location) Test(org.junit.Test)
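
The cleanup behavior being tested reduces to listing the credentials directory and deleting files past an expiry window. A standalone sketch of that idea against the Location API (an illustration of the concept, not the updater's actual implementation):

import java.io.IOException;
import org.apache.twill.filesystem.Location;

public final class ExpiredFileCleanup {

    // Delete every file in dir whose last-modified time is older than expireMillis.
    static void cleanup(Location dir, long expireMillis) throws IOException {
        long cutoff = System.currentTimeMillis() - expireMillis;
        for (Location file : dir.list()) {
            if (file.lastModified() < cutoff) {
                file.delete();
            }
        }
    }
}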

Example 58 with Location

use of org.apache.twill.filesystem.Location in project cdap by caskdata.

the class SparkCredentialsUpdaterTest method testUpdater.

@Test
public void testUpdater() throws Exception {
    Location credentialsDir = Locations.toLocation(TEMPORARY_FOLDER.newFolder());
    // Create an updater that doesn't do any auto-update within the test time and doesn't clean up
    SparkCredentialsUpdater updater = new SparkCredentialsUpdater(createCredentialsSupplier(), credentialsDir, "credentials", TimeUnit.DAYS.toMillis(1), TimeUnit.DAYS.toMillis(1), Integer.MAX_VALUE) {

        @Override
        long getNextUpdateDelay(Credentials credentials) throws IOException {
            return TimeUnit.DAYS.toMillis(1);
        }
    };
    // Before the updater starts, the directory is empty
    Assert.assertTrue(credentialsDir.list().isEmpty());
    UserGroupInformation.getCurrentUser().addToken(new Token<>(Bytes.toBytes("id"), Bytes.toBytes("pass"), new Text("kind"), new Text("service")));
    updater.startAndWait();
    try {
        List<Location> expectedFiles = new ArrayList<>();
        expectedFiles.add(credentialsDir.append("credentials-1"));
        for (int i = 1; i <= 10; i++) {
            Assert.assertEquals(expectedFiles, listAndSort(credentialsDir));
            // Read the credentials from the last file
            Credentials newCredentials = new Credentials();
            try (DataInputStream is = new DataInputStream(expectedFiles.get(expectedFiles.size() - 1).getInputStream())) {
                newCredentials.readTokenStorageStream(is);
            }
            // Should contain all tokens of the current user
            Credentials userCredentials = UserGroupInformation.getCurrentUser().getCredentials();
            for (Token<? extends TokenIdentifier> token : userCredentials.getAllTokens()) {
                Assert.assertEquals(token, newCredentials.getToken(token.getService()));
            }
            UserGroupInformation.getCurrentUser().addToken(new Token<>(Bytes.toBytes("id" + i), Bytes.toBytes("pass" + i), new Text("kind" + i), new Text("service" + i)));
            updater.run();
            expectedFiles.add(credentialsDir.append("credentials-" + (i + 1)));
        }
    } finally {
        updater.stopAndWait();
    }
}
Also used: ArrayList(java.util.ArrayList) Text(org.apache.hadoop.io.Text) DataInputStream(java.io.DataInputStream) Credentials(org.apache.hadoop.security.Credentials) Location(org.apache.twill.filesystem.Location) Test(org.junit.Test)
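
The round trip the test depends on — writing the current tokens to a file and reading them back with readTokenStorageStream — can be shown in isolation. A hedged sketch using plain java.io streams (the file name mimics the updater's naming; "kind" and "service" are illustrative values, as in the test):

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;

public class CredentialsRoundTrip {
    public static void main(String[] args) throws Exception {
        Credentials credentials = new Credentials();
        credentials.addToken(new Text("service"),
            new Token<>("id".getBytes(), "pass".getBytes(), new Text("kind"), new Text("service")));
        File file = new File("credentials-1");
        // write the tokens in Hadoop's token storage format
        try (DataOutputStream os = new DataOutputStream(new FileOutputStream(file))) {
            credentials.writeTokenStorageToStream(os);
        }
        // read them back, as the test does for the newest credentials file
        Credentials restored = new Credentials();
        try (DataInputStream is = new DataInputStream(new FileInputStream(file))) {
            restored.readTokenStorageStream(is);
        }
        System.out.println(restored.numberOfTokens()); // prints 1
    }
}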

Example 59 with Location

use of org.apache.twill.filesystem.Location in project cdap by caskdata.

the class PartitionRollbackTestRun method testPFSRollback.

/*
   * This tests all the following cases:
   *
   *  1. addPartition(location) fails because partition already exists
   *  2. addPartition(location) fails because Hive partition already exists
   *  3. addPartition(location) succeeds but transaction fails
   *  4. partitionOutput.addPartition() fails because partition already exists
   *  5. partitionOutput.addPartition() fails because Hive partition already exists
   *  6. partitionOutput.addPartition() succeeds but transaction fails
   *  7. mapreduce writing partition fails because location already exists
   *  8. mapreduce writing partition fails because partition already exists
   *  9. mapreduce writing partition fails because Hive partition already exists
   *  10. mapreduce writing dynamic partition fails because location already exists
   *  11. mapreduce writing dynamic partition fails because partition already exists
   *  12. mapreduce writing dynamic partition fails because Hive partition already exists
   *  13. multi-output mapreduce writing partition fails because location already exists
   *  13a. first output fails, other output must roll back partitions 0 and 5
   *  13b. second output fails, first output must roll back partitions 0 and 5
   *  14. multi-output mapreduce writing partition fails because partition already exists
   *  14a. first output fails, other output must roll back partition 5
   *  14b. second output fails, first output must roll back partition 5
   *  15. multi-output mapreduce writing partition fails because Hive partition already exists
   *  15a. first output fails, other output must roll back partitions 0 and 5
   *  15b. second output fails, first output must roll back partitions 0 and 5
   *
   * For all these cases, we validate that existing files and partitions are preserved, and newly
   * added files and partitions are rolled back.
   */
@Test
public void testPFSRollback() throws Exception {
    ApplicationManager appManager = deployApplication(AppWritingToPartitioned.class);
    MapReduceManager mrManager = appManager.getMapReduceManager(MAPREDUCE);
    int numRuns = 0;
    Validator pfsValidator = new Validator(PFS);
    Validator otherValidator = new Validator(OTHER);
    final UnitTestManager.UnitTestDatasetManager<PartitionedFileSet> pfsManager = pfsValidator.getPfsManager();
    final PartitionedFileSet pfs = pfsManager.get();
    final PartitionedFileSet other = otherValidator.getPfsManager().get();
    final String path3 = pfsValidator.getRelativePath3();
    // 1. addPartition(location) fails because partition already exists
    try {
        pfsManager.execute(new Runnable() {

            @Override
            public void run() {
                pfs.addPartition(KEY_1, path3);
            }
        });
        Assert.fail("Expected tx to fail because partition for number=1 already exists");
    } catch (TransactionFailureException e) {
    // expected
    }
    pfsValidator.validate();
    // 2. addPartition(location) fails because Hive partition already exists
    try {
        pfsManager.execute(new Runnable() {

            @Override
            public void run() {
                pfs.addPartition(KEY_4, path3);
            }
        });
        Assert.fail("Expected tx to fail because hive partition for number=1 already exists");
    } catch (TransactionFailureException e) {
    // expected
    }
    pfsValidator.validate();
    // 3. addPartition(location) succeeds but transaction fails
    try {
        pfsManager.execute(new Runnable() {

            @Override
            public void run() {
                pfs.addPartition(KEY_3, path3);
                throw new RuntimeException("fail the tx");
            }
        });
        Assert.fail("Expected tx to fail because it threw a runtime exception");
    } catch (TransactionFailureException e) {
    // expected
    }
    pfsValidator.validate();
    // 4. partitionOutput.addPartition() fails because partition already exists
    final PartitionOutput output2x = pfs.getPartitionOutput(KEY_2);
    final Location location2x = output2x.getLocation();
    try (Writer writer = new OutputStreamWriter(location2x.append("file").getOutputStream())) {
        writer.write("2x,2x\n");
    }
    try {
        pfsManager.execute(new Runnable() {

            @Override
            public void run() {
                output2x.addPartition();
            }
        });
        Assert.fail("Expected tx to fail because partition for number=2 already exists");
    } catch (TransactionFailureException e) {
    // expected
    }
    pfsValidator.validate();
    Assert.assertFalse(location2x.exists());
    // 5. partitionOutput.addPartition() fails because Hive partition already exists
    final PartitionOutput output4x = pfs.getPartitionOutput(KEY_4);
    final Location location4x = output4x.getLocation();
    try (Writer writer = new OutputStreamWriter(location4x.append("file").getOutputStream())) {
        writer.write("4x,4x\n");
    }
    try {
        pfsManager.execute(new Runnable() {

            @Override
            public void run() {
                output4x.addPartition();
            }
        });
        Assert.fail("Expected tx to fail because hive partition for number=4 already exists");
    } catch (TransactionFailureException e) {
    // expected
    }
    pfsValidator.validate();
    Assert.assertFalse(location4x.exists());
    // 6. partitionOutput.addPartition() succeeds but transaction fails
    final PartitionOutput output5x = pfs.getPartitionOutput(KEY_5);
    final Location location5x = output5x.getLocation();
    try (Writer writer = new OutputStreamWriter(location5x.append("file").getOutputStream())) {
        writer.write("5x,5x\n");
    }
    try {
        pfsManager.execute(new Runnable() {

            @Override
            public void run() {
                output5x.addPartition();
                throw new RuntimeException("fail the tx");
            }
        });
        Assert.fail("Expected tx to fail because it threw a runtime exception");
    } catch (TransactionFailureException e) {
    // expected
    }
    pfsValidator.validate();
    Assert.assertFalse(location5x.exists());
    // 7. mapreduce writing partition fails because location already exists
    mrManager.start(ImmutableMap.of(PFS_OUT, "1", "input.text", "1x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    // 8. mapreduce writing partition fails because partition already exists
    mrManager.start(ImmutableMap.of(PFS_OUT, "2", "input.text", "2x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_2).getLocation().exists());
    // 9. mapreduce writing partition fails because Hive partition already exists
    mrManager.start(ImmutableMap.of(PFS_OUT, "4", "input.text", "4x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_4).getLocation().exists());
    // 10. mapreduce writing dynamic partition fails because location already exists
    mrManager.start(ImmutableMap.of("input.text", "3x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_5).getLocation().exists());
    // 11. mapreduce writing dynamic partition fails because partition already exists
    mrManager.start(ImmutableMap.of("input.text", "2x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_2).getLocation().exists());
    Assert.assertFalse(pfs.getPartitionOutput(KEY_5).getLocation().exists());
    // 12. mapreduce writing dynamic partition fails because Hive partition already exists
    mrManager.start(ImmutableMap.of("input.text", "0x 4x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_0).getLocation().exists());
    Assert.assertFalse(pfs.getPartitionOutput(KEY_4).getLocation().exists());
    Assert.assertFalse(pfs.getPartitionOutput(KEY_5).getLocation().exists());
    // 13. multi-output mapreduce writing partition fails because location already exists
    // 13a. first output fails, other output must roll back partitions 0 and 5
    mrManager.start(ImmutableMap.of("output.datasets", BOTH, PFS_OUT, "1", "input.text", "0x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    otherValidator.validate();
    Assert.assertFalse(other.getPartitionOutput(KEY_0).getLocation().exists());
    Assert.assertFalse(other.getPartitionOutput(KEY_5).getLocation().exists());
    // 13b. second output fails, first output must roll back partitions 0 and 5
    mrManager.start(ImmutableMap.of("output.datasets", BOTH, OTHER_OUT, "1", "input.text", "0x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    otherValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_0).getLocation().exists());
    Assert.assertFalse(pfs.getPartitionOutput(KEY_5).getLocation().exists());
    // 14. multi-output mapreduce writing partition fails because partition already exists
    // 14a. first output fails, other output must roll back partition 5
    // TODO: bring this back when CDAP-8766 is fixed
    /*
    mrManager.start(ImmutableMap.of("output.datasets", BOTH, PFS_OUT, "2", OTHER_OUT, "5", "input.text", "2x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    otherValidator.validate();
    Assert.assertFalse(other.getPartitionOutput(KEY_5).getLocation().exists());
    */
    // 14b. second output fails, first output must roll back partition 5
    mrManager.start(ImmutableMap.of("output.datasets", BOTH, PFS_OUT, "5", OTHER_OUT, "2", "input.text", "2x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    otherValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_5).getLocation().exists());
    // 15. multi-output mapreduce writing partition fails because Hive partition already exists
    // 15a. first output fails, other output must roll back partitions 0 and 5
    // TODO: bring this back when CDAP-8766 is fixed
    /*
    mrManager.start(ImmutableMap.of("output.datasets", BOTH, PFS_OUT, "4", "input.text", "0x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    otherValidator.validate();
    Assert.assertFalse(pfs.getPartitionOutput(KEY_4).getLocation().exists());
    Assert.assertFalse(other.getPartitionOutput(KEY_0).getLocation().exists());
    Assert.assertFalse(other.getPartitionOutput(KEY_5).getLocation().exists());
    // 15b. second output fails, first output must rollback partitions 0 and 5
    mrManager.start(ImmutableMap.of("output.datasets", BOTH, OTHER_OUT, "4", "input.text", "0x 5x"));
    mrManager.waitForRuns(ProgramRunStatus.FAILED, ++numRuns, 2, TimeUnit.MINUTES);
    pfsValidator.validate();
    otherValidator.validate();
    Assert.assertFalse(other.getPartitionOutput(KEY_4).getLocation().exists());
    Assert.assertFalse(pfs.getPartitionOutput(KEY_0).getLocation().exists());
    Assert.assertFalse(pfs.getPartitionOutput(KEY_5).getLocation().exists());
    */
}
Also used: ApplicationManager(co.cask.cdap.test.ApplicationManager) MapReduceManager(co.cask.cdap.test.MapReduceManager) PartitionedFileSet(co.cask.cdap.api.dataset.lib.PartitionedFileSet) TransactionFailureException(org.apache.tephra.TransactionFailureException) PartitionOutput(co.cask.cdap.api.dataset.lib.PartitionOutput) UnitTestManager(co.cask.cdap.test.UnitTestManager) OutputStreamWriter(java.io.OutputStreamWriter) Writer(java.io.Writer) Location(org.apache.twill.filesystem.Location) Test(org.junit.Test)
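
For contrast with the failure cases above, the successful partitionOutput.addPartition() flow — the one these scenarios roll back — looks roughly like this (a sketch reusing the test's names; the key value 6 and the file content are illustrative, and the fragment is assumed to run inside a transaction via pfsManager.execute(...)):

// Inside a transaction, e.g. the body of a Runnable passed to pfsManager.execute(...)
PartitionKey key = PartitionKey.builder().addIntField("number", 6).build();
PartitionOutput output = pfs.getPartitionOutput(key);
Location location = output.getLocation();
// write the partition's data file under the output location
try (Writer writer = new OutputStreamWriter(location.append("file").getOutputStream())) {
    writer.write("6,6\n");
}
// register the partition; if the transaction later fails, the location is removed
output.addPartition();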

Example 60 with Location

use of org.apache.twill.filesystem.Location in project cdap by caskdata.

the class LogFileManager method getLocation.

private TimeStampLocation getLocation(LogPathIdentifier logPathIdentifier) throws IOException {
    ensureDirectoryCheck(logsDirectoryLocation);
    long currentTime = System.currentTimeMillis();
    String date = new SimpleDateFormat("yyyy-MM-dd").format(new Date(currentTime));
    Location contextLocation = logsDirectoryLocation.append(logPathIdentifier.getNamespaceId()).append(date).append(logPathIdentifier.getPathId1()).append(logPathIdentifier.getPathId2());
    ensureDirectoryCheck(contextLocation);
    String fileName = String.format("%s.avro", currentTime);
    return new TimeStampLocation(contextLocation.append(fileName), currentTime);
}
Also used: SimpleDateFormat(java.text.SimpleDateFormat) Date(java.util.Date) Location(org.apache.twill.filesystem.Location)
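
The chained append calls build a layout of &lt;logs-dir&gt;/&lt;namespace&gt;/&lt;yyyy-MM-dd&gt;/&lt;pathId1&gt;/&lt;pathId2&gt;/&lt;timestamp&gt;.avro. A self-contained sketch of the same path construction (the /tmp root and the identifier values are illustrative):

import java.io.File;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.twill.filesystem.LocalLocationFactory;
import org.apache.twill.filesystem.Location;

public class LogPathSketch {
    public static void main(String[] args) throws Exception {
        Location logsDir = new LocalLocationFactory(new File("/tmp")).create("logs");
        long now = System.currentTimeMillis();
        String date = new SimpleDateFormat("yyyy-MM-dd").format(new Date(now));
        // same chaining as getLocation(): namespace, then date, then the two path ids
        Location context = logsDir.append("default").append(date).append("app").append("program");
        // e.g. file:/tmp/logs/default/2017-05-04/app/program/1493900000000.avro
        System.out.println(context.append(now + ".avro").toURI());
    }
}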

Aggregations

Location (org.apache.twill.filesystem.Location): 246
Test (org.junit.Test): 104
IOException (java.io.IOException): 57
File (java.io.File): 39
LocalLocationFactory (org.apache.twill.filesystem.LocalLocationFactory): 29
LocationFactory (org.apache.twill.filesystem.LocationFactory): 29
FileSet (co.cask.cdap.api.dataset.lib.FileSet): 28
StreamEvent (co.cask.cdap.api.flow.flowlet.StreamEvent): 27
PartitionedFileSet (co.cask.cdap.api.dataset.lib.PartitionedFileSet): 23
CConfiguration (co.cask.cdap.common.conf.CConfiguration): 19
NamespaceId (co.cask.cdap.proto.id.NamespaceId): 19
Manifest (java.util.jar.Manifest): 18
HashMap (java.util.HashMap): 17
StreamId (co.cask.cdap.proto.id.StreamId): 16
OutputStream (java.io.OutputStream): 15
DatasetFramework (co.cask.cdap.data2.dataset2.DatasetFramework): 13
TimePartitionedFileSet (co.cask.cdap.api.dataset.lib.TimePartitionedFileSet): 11
StreamConfig (co.cask.cdap.data2.transaction.stream.StreamConfig): 10
ArrayList (java.util.ArrayList): 9
StreamAdmin (co.cask.cdap.data2.transaction.stream.StreamAdmin): 8