Search in sources :

Example 6 with ApplicationWithPrograms

use of co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms in project cdap by caskdata.

the class FlowTest method testFlow.

@Test
public void testFlow() throws Exception {
    final ApplicationWithPrograms app = AppFabricTestHelper.deployApplicationWithManager(WordCountApp.class, TEMP_FOLDER_SUPPLIER);
    List<ProgramController> controllers = Lists.newArrayList();
    for (ProgramDescriptor programDescriptor : app.getPrograms()) {
        // running mapreduce is out of scope of this tests (there's separate unit-test for that)
        if (programDescriptor.getProgramId().getType() == ProgramType.MAPREDUCE) {
            continue;
        }
        controllers.add(AppFabricTestHelper.submit(app, programDescriptor.getSpecification().getClassName(), new BasicArguments(), TEMP_FOLDER_SUPPLIER));
    }
    TimeUnit.SECONDS.sleep(1);
    TransactionSystemClient txSystemClient = AppFabricTestHelper.getInjector().getInstance(TransactionSystemClient.class);
    QueueName queueName = QueueName.fromStream(app.getApplicationId().getNamespace(), "text");
    QueueClientFactory queueClientFactory = AppFabricTestHelper.getInjector().getInstance(QueueClientFactory.class);
    QueueProducer producer = queueClientFactory.createProducer(queueName);
    // start tx to write in queue in tx
    Transaction tx = txSystemClient.startShort();
    ((TransactionAware) producer).startTx(tx);
    StreamEventCodec codec = new StreamEventCodec();
    for (int i = 0; i < 10; i++) {
        String msg = "Testing message " + i;
        StreamEvent event = new StreamEvent(ImmutableMap.<String, String>of(), ByteBuffer.wrap(msg.getBytes(Charsets.UTF_8)));
        producer.enqueue(new QueueEntry(codec.encodePayload(event)));
    }
    // commit tx
    ((TransactionAware) producer).commitTx();
    txSystemClient.commit(tx);
    // Query the service for at most 10 seconds for the expected result
    Gson gson = new Gson();
    DiscoveryServiceClient discoveryServiceClient = AppFabricTestHelper.getInjector().getInstance(DiscoveryServiceClient.class);
    ServiceDiscovered serviceDiscovered = discoveryServiceClient.discover(String.format("service.%s.%s.%s", DefaultId.NAMESPACE.getNamespace(), "WordCountApp", "WordFrequencyService"));
    EndpointStrategy endpointStrategy = new RandomEndpointStrategy(serviceDiscovered);
    int trials = 0;
    while (trials++ < 10) {
        Discoverable discoverable = endpointStrategy.pick(2, TimeUnit.SECONDS);
        URL url = new URL(String.format("http://%s:%d/v3/namespaces/default/apps/%s/services/%s/methods/%s/%s", discoverable.getSocketAddress().getHostName(), discoverable.getSocketAddress().getPort(), "WordCountApp", "WordFrequencyService", "wordfreq", "text:Testing"));
        try {
            HttpURLConnection urlConn = (HttpURLConnection) url.openConnection();
            Map<String, Long> responseContent = gson.fromJson(new InputStreamReader(urlConn.getInputStream(), Charsets.UTF_8), new TypeToken<Map<String, Long>>() {
            }.getType());
            LOG.info("Service response: " + responseContent);
            if (ImmutableMap.of("text:Testing", 10L).equals(responseContent)) {
                break;
            }
        } catch (Throwable t) {
            LOG.info("Exception when trying to query service.", t);
        }
        TimeUnit.SECONDS.sleep(1);
    }
    Assert.assertTrue(trials < 10);
    for (ProgramController controller : controllers) {
        controller.stop().get();
    }
}
Also used : DiscoveryServiceClient(org.apache.twill.discovery.DiscoveryServiceClient) Gson(com.google.gson.Gson) URL(java.net.URL) TransactionSystemClient(org.apache.tephra.TransactionSystemClient) StreamEventCodec(co.cask.cdap.common.stream.StreamEventCodec) HttpURLConnection(java.net.HttpURLConnection) QueueProducer(co.cask.cdap.data2.queue.QueueProducer) EndpointStrategy(co.cask.cdap.common.discovery.EndpointStrategy) RandomEndpointStrategy(co.cask.cdap.common.discovery.RandomEndpointStrategy) ApplicationWithPrograms(co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms) ProgramDescriptor(co.cask.cdap.app.program.ProgramDescriptor) BasicArguments(co.cask.cdap.internal.app.runtime.BasicArguments) QueueName(co.cask.cdap.common.queue.QueueName) ProgramController(co.cask.cdap.app.runtime.ProgramController) Discoverable(org.apache.twill.discovery.Discoverable) InputStreamReader(java.io.InputStreamReader) StreamEvent(co.cask.cdap.api.flow.flowlet.StreamEvent) ServiceDiscovered(org.apache.twill.discovery.ServiceDiscovered) QueueEntry(co.cask.cdap.data2.queue.QueueEntry) Transaction(org.apache.tephra.Transaction) TransactionAware(org.apache.tephra.TransactionAware) TypeToken(com.google.common.reflect.TypeToken) QueueClientFactory(co.cask.cdap.data2.queue.QueueClientFactory) RandomEndpointStrategy(co.cask.cdap.common.discovery.RandomEndpointStrategy) Test(org.junit.Test)

Example 7 with ApplicationWithPrograms

use of co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms in project cdap by caskdata.

the class DynamicPartitionerWithAvroTest method runDynamicPartitionerMapReduce.

private void runDynamicPartitionerMapReduce(final List<? extends GenericRecord> records, boolean allowConcurrentWriters, boolean expectedStatus) throws Exception {
    ApplicationWithPrograms app = deployApp(AppWithMapReduceUsingAvroDynamicPartitioner.class);
    final long now = System.currentTimeMillis();
    final Multimap<PartitionKey, GenericRecord> keyToRecordsMap = groupByPartitionKey(records, now);
    // write values to the input kvTable
    final KeyValueTable kvTable = datasetCache.getDataset(INPUT_DATASET);
    Transactions.createTransactionExecutor(txExecutorFactory, kvTable).execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() {
            // the keys are not used; it matters that they're unique though
            for (int i = 0; i < records.size(); i++) {
                kvTable.write(Integer.toString(i), records.get(i).toString());
            }
        }
    });
    String allowConcurrencyKey = "dataset." + OUTPUT_DATASET + "." + PartitionedFileSetArguments.DYNAMIC_PARTITIONER_ALLOW_CONCURRENCY;
    // run the partition writer m/r with this output partition time
    ImmutableMap<String, String> arguments = ImmutableMap.of(OUTPUT_PARTITION_KEY, Long.toString(now), allowConcurrencyKey, Boolean.toString(allowConcurrentWriters));
    long startTime = System.currentTimeMillis();
    boolean status = runProgram(app, AppWithMapReduceUsingAvroDynamicPartitioner.DynamicPartitioningMapReduce.class, new BasicArguments(arguments));
    Assert.assertEquals(expectedStatus, status);
    if (!expectedStatus) {
        // if we expect the program to fail, no need to check the output data for expected results
        return;
    }
    // Verify notifications
    List<Notification> notifications = getDataNotifications(startTime);
    Assert.assertEquals(1, notifications.size());
    Assert.assertEquals(NamespaceId.DEFAULT.dataset(OUTPUT_DATASET), DatasetId.fromString(notifications.get(0).getProperties().get("datasetId")));
    // this should have created a partition in the pfs
    final PartitionedFileSet pfs = datasetCache.getDataset(OUTPUT_DATASET);
    final Location pfsBaseLocation = pfs.getEmbeddedFileSet().getBaseLocation();
    Transactions.createTransactionExecutor(txExecutorFactory, (TransactionAware) pfs).execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws IOException {
            Map<PartitionKey, PartitionDetail> partitions = new HashMap<>();
            for (PartitionDetail partition : pfs.getPartitions(null)) {
                partitions.put(partition.getPartitionKey(), partition);
                // check that the mapreduce wrote the output partition metadata to all the output partitions
                Assert.assertEquals(AppWithMapReduceUsingAvroDynamicPartitioner.DynamicPartitioningMapReduce.METADATA, partition.getMetadata().asMap());
            }
            Assert.assertEquals(3, partitions.size());
            Assert.assertEquals(keyToRecordsMap.keySet(), partitions.keySet());
            // Check relative paths of the partitions. Also check that their location = pfs baseLocation + relativePath
            for (Map.Entry<PartitionKey, PartitionDetail> partitionKeyEntry : partitions.entrySet()) {
                PartitionDetail partitionDetail = partitionKeyEntry.getValue();
                String relativePath = partitionDetail.getRelativePath();
                int zip = (int) partitionKeyEntry.getKey().getField("zip");
                Assert.assertEquals(Long.toString(now) + Path.SEPARATOR + zip, relativePath);
                Assert.assertEquals(pfsBaseLocation.append(relativePath), partitionDetail.getLocation());
            }
            for (Map.Entry<PartitionKey, Collection<GenericRecord>> keyToRecordsEntry : keyToRecordsMap.asMap().entrySet()) {
                Set<GenericRecord> genericRecords = new HashSet<>(keyToRecordsEntry.getValue());
                Assert.assertEquals(genericRecords, readOutput(partitions.get(keyToRecordsEntry.getKey()).getLocation()));
            }
        }
    });
}
Also used : HashSet(java.util.HashSet) PartitionedFileSet(co.cask.cdap.api.dataset.lib.PartitionedFileSet) Set(java.util.Set) PartitionDetail(co.cask.cdap.api.dataset.lib.PartitionDetail) Notification(co.cask.cdap.proto.Notification) ApplicationWithPrograms(co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms) BasicArguments(co.cask.cdap.internal.app.runtime.BasicArguments) GenericRecord(org.apache.avro.generic.GenericRecord) TransactionExecutor(org.apache.tephra.TransactionExecutor) PartitionedFileSet(co.cask.cdap.api.dataset.lib.PartitionedFileSet) IOException(java.io.IOException) KeyValueTable(co.cask.cdap.api.dataset.lib.KeyValueTable) TransactionAware(org.apache.tephra.TransactionAware) PartitionKey(co.cask.cdap.api.dataset.lib.PartitionKey) HashMap(java.util.HashMap) Map(java.util.Map) ImmutableMap(com.google.common.collect.ImmutableMap) Location(org.apache.twill.filesystem.Location)

Example 8 with ApplicationWithPrograms

use of co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms in project cdap by caskdata.

the class MapReduceProgramRunnerTest method testWordCount.

@Test
public void testWordCount() throws Exception {
    // deploy to namespace default by default
    final ApplicationWithPrograms app = deployApp(AppWithMapReduce.class);
    final String inputPath = createInput();
    final java.io.File outputDir = new java.io.File(TEMP_FOLDER.newFolder(), "output");
    try {
        datasetCache.getDataset("someOtherNameSpace", "jobConfig");
        Assert.fail("getDataset() should throw an exception when accessing a non-existing dataset.");
    } catch (DatasetInstantiationException e) {
    // expected
    }
    // Should work if explicitly specify the default namespace
    final KeyValueTable jobConfigTable = datasetCache.getDataset(NamespaceId.DEFAULT.getNamespace(), "jobConfig");
    // write config into dataset
    Transactions.createTransactionExecutor(txExecutorFactory, jobConfigTable).execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() {
            jobConfigTable.write(Bytes.toBytes("inputPath"), Bytes.toBytes(inputPath));
            jobConfigTable.write(Bytes.toBytes("outputPath"), Bytes.toBytes(outputDir.getPath()));
        }
    });
    runProgram(app, AppWithMapReduce.ClassicWordCount.class, false, true);
    Assert.assertEquals("true", System.getProperty("partitioner.initialize"));
    Assert.assertEquals("true", System.getProperty("partitioner.destroy"));
    Assert.assertEquals("true", System.getProperty("partitioner.set.conf"));
    Assert.assertEquals("true", System.getProperty("comparator.initialize"));
    Assert.assertEquals("true", System.getProperty("comparator.destroy"));
    Assert.assertEquals("true", System.getProperty("comparator.set.conf"));
    File[] outputFiles = outputDir.listFiles(new FilenameFilter() {

        @Override
        public boolean accept(File dir, String name) {
            return name.startsWith("part-r-") && !name.endsWith(".crc");
        }
    });
    Assert.assertNotNull("no output files found", outputFiles);
    int lines = 0;
    for (File file : outputFiles) {
        lines += Files.readLines(file, Charsets.UTF_8).size();
    }
    // dummy check that output file is not empty
    Assert.assertTrue(lines > 0);
}
Also used : TransactionExecutor(org.apache.tephra.TransactionExecutor) File(java.io.File) FilenameFilter(java.io.FilenameFilter) ApplicationWithPrograms(co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms) KeyValueTable(co.cask.cdap.api.dataset.lib.KeyValueTable) File(java.io.File) DatasetInstantiationException(co.cask.cdap.api.data.DatasetInstantiationException) Test(org.junit.Test)

Example 9 with ApplicationWithPrograms

use of co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms in project cdap by caskdata.

the class MapReduceProgramRunnerTest method testFailureInOutputCommitter.

@Test
public void testFailureInOutputCommitter() throws Exception {
    final ApplicationWithPrograms app = deployApp(AppWithMapReduce.class);
    // We want to verify that when a mapreduce fails when committing the dataset outputs,
    // the destroy method is still called and committed.
    // (1) setup the datasets we use
    datasetCache.newTransactionContext();
    final KeyValueTable kvTable = datasetCache.getDataset("recorder");
    Transactions.createTransactionExecutor(txExecutorFactory, datasetCache.getTransactionAwares()).execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() {
            // the table should not have initialized=true
            kvTable.write("initialized", "false");
        }
    });
    // 2) run job
    runProgram(app, AppWithMapReduce.MapReduceWithFailingOutputCommitter.class, new HashMap<String, String>(), false);
    // 3) verify results
    Transactions.createTransactionExecutor(txExecutorFactory, datasetCache.getTransactionAwares()).execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() {
            // the destroy() method should have recorded FAILED status in the kv table
            Assert.assertEquals(ProgramStatus.FAILED.name(), Bytes.toString(kvTable.read("status")));
        }
    });
    datasetCache.dismissTransactionContext();
}
Also used : ApplicationWithPrograms(co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms) KeyValueTable(co.cask.cdap.api.dataset.lib.KeyValueTable) TransactionExecutor(org.apache.tephra.TransactionExecutor) Test(org.junit.Test)

Example 10 with ApplicationWithPrograms

use of co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms in project cdap by caskdata.

the class MapReduceProgramRunnerTest method testMapReduceDriverResources.

@Test
public void testMapReduceDriverResources() throws Exception {
    final ApplicationWithPrograms app = deployApp(AppWithMapReduce.class);
    MapReduceSpecification mrSpec = app.getSpecification().getMapReduce().get(AppWithMapReduce.ClassicWordCount.class.getSimpleName());
    Assert.assertEquals(AppWithMapReduce.ClassicWordCount.MEMORY_MB, mrSpec.getDriverResources().getMemoryMB());
}
Also used : ApplicationWithPrograms(co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms) MapReduceSpecification(co.cask.cdap.api.mapreduce.MapReduceSpecification) Test(org.junit.Test)

Aggregations

ApplicationWithPrograms (co.cask.cdap.internal.app.deploy.pipeline.ApplicationWithPrograms)30 Test (org.junit.Test)23 BasicArguments (co.cask.cdap.internal.app.runtime.BasicArguments)18 TransactionExecutor (org.apache.tephra.TransactionExecutor)11 KeyValueTable (co.cask.cdap.api.dataset.lib.KeyValueTable)10 ProgramController (co.cask.cdap.app.runtime.ProgramController)8 ProgramDescriptor (co.cask.cdap.app.program.ProgramDescriptor)7 Location (org.apache.twill.filesystem.Location)6 IOException (java.io.IOException)5 TransactionAware (org.apache.tephra.TransactionAware)5 Table (co.cask.cdap.api.dataset.table.Table)4 File (java.io.File)4 RandomEndpointStrategy (co.cask.cdap.common.discovery.RandomEndpointStrategy)3 AppDeploymentInfo (co.cask.cdap.internal.app.deploy.pipeline.AppDeploymentInfo)3 Discoverable (org.apache.twill.discovery.Discoverable)3 DiscoveryServiceClient (org.apache.twill.discovery.DiscoveryServiceClient)3 DatasetInstantiationException (co.cask.cdap.api.data.DatasetInstantiationException)2 FileSet (co.cask.cdap.api.dataset.lib.FileSet)2 StreamEvent (co.cask.cdap.api.flow.flowlet.StreamEvent)2 ApplicationNotFoundException (co.cask.cdap.common.ApplicationNotFoundException)2