Example 1 with JobStoryProducer

Use of org.apache.hadoop.tools.rumen.JobStoryProducer in the Apache hadoop project.

From the class TestGridMixClasses, method testSerialReaderThread.

/*
 * Test SerialJobFactory: the reader thread must not submit any job
 * until the start-flag latch is released.
 */
@Test(timeout = 120000)
public void testSerialReaderThread() throws Exception {
    Configuration conf = new Configuration();
    File fin = new File("src" + File.separator + "test" + File.separator + "resources" + File.separator + "data" + File.separator + "wordcount2.json");
    // read a couple of jobs from wordcount2.json
    JobStoryProducer jobProducer = new ZombieJobProducer(new Path(fin.getAbsolutePath()), null, conf);
    CountDownLatch startFlag = new CountDownLatch(1);
    UserResolver resolver = new SubmitterUserResolver();
    FakeJobSubmitter submitter = new FakeJobSubmitter();
    File ws = new File("target" + File.separator + this.getClass().getName());
    if (!ws.exists()) {
        Assert.assertTrue(ws.mkdirs());
    }
    SerialJobFactory jobFactory = new SerialJobFactory(submitter, jobProducer, new Path(ws.getAbsolutePath()), conf, startFlag, resolver);
    Path ioPath = new Path(ws.getAbsolutePath());
    jobFactory.setDistCacheEmulator(new DistributedCacheEmulator(conf, ioPath));
    Thread test = jobFactory.createReaderThread();
    test.start();
    Thread.sleep(1000);
    // SerialReaderThread waits for startFlag; nothing is submitted yet
    assertEquals(0, submitter.getJobs().size());
    // start!
    startFlag.countDown();
    while (test.isAlive()) {
        Thread.sleep(1000);
        jobFactory.update(null);
    }
    // the submitter was called twice
    assertEquals(2, submitter.getJobs().size());
}
Also used: JobStoryProducer (org.apache.hadoop.tools.rumen.JobStoryProducer), Path (org.apache.hadoop.fs.Path), ZombieJobProducer (org.apache.hadoop.tools.rumen.ZombieJobProducer), Configuration (org.apache.hadoop.conf.Configuration), CountDownLatch (java.util.concurrent.CountDownLatch), File (java.io.File), Test (org.junit.Test)
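The latch-gated startup that testSerialReaderThread exercises can be sketched without Hadoop at all. The following stand-alone Java sketch (class and method names are invented for illustration, not part of GridMix) shows why zero jobs are observed before startFlag.countDown(): the reader thread is parked on the latch, so submission deterministically happens only after release.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Minimal sketch of the start-flag pattern used by SerialReaderThread:
// the reader thread blocks on a CountDownLatch until the caller releases it.
public class LatchGatedReader {
    static List<String> runGatedReader(List<String> jobs) throws InterruptedException {
        CountDownLatch startFlag = new CountDownLatch(1);
        List<String> submitted = Collections.synchronizedList(new ArrayList<>());
        Thread reader = new Thread(() -> {
            try {
                startFlag.await();      // park until the latch is released
                submitted.addAll(jobs); // then "submit" every job in the trace
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        // Like assertEquals(0, ...) in the test above: nothing has been
        // submitted yet, because the reader is parked on startFlag.await().
        if (!submitted.isEmpty()) {
            throw new IllegalStateException("submitted before start flag");
        }
        startFlag.countDown();
        reader.join();
        return submitted;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runGatedReader(List.of("job1", "job2")).size()); // prints 2
    }
}
```

The same deterministic ordering is what lets the real test assert an empty submitter before countDown() without a race.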

Example 2 with JobStoryProducer

Use of org.apache.hadoop.tools.rumen.JobStoryProducer in the Apache hadoop project.

From the class TestGridmixSubmission, method testTraceReader.

/**
   * Tests the reading of traces in GridMix3. These traces are generated by
   * Rumen and are in the JSON format. The traces can optionally be compressed
   * and uncompressed traces can also be passed to GridMix3 via its standard
   * input stream. The testing is performed via JUnit assertions.
   *
   * @throws Exception if there was an error.
   */
@Test(timeout = 20000)
public void testTraceReader() throws Exception {
    Configuration conf = new Configuration();
    FileSystem lfs = FileSystem.getLocal(conf);
    Path rootInputDir = new Path(System.getProperty("src.test.data"));
    rootInputDir = rootInputDir.makeQualified(lfs.getUri(), lfs.getWorkingDirectory());
    Path rootTempDir = new Path(System.getProperty("test.build.data", System.getProperty("java.io.tmpdir")), "testTraceReader");
    rootTempDir = rootTempDir.makeQualified(lfs.getUri(), lfs.getWorkingDirectory());
    Path inputFile = new Path(rootInputDir, "wordcount.json.gz");
    Path tempFile = new Path(rootTempDir, "gridmix3-wc.json");
    InputStream origStdIn = System.in;
    InputStream tmpIs = null;
    try {
        DebugGridmix dgm = new DebugGridmix();
        JobStoryProducer jsp = dgm.createJobStoryProducer(inputFile.toString(), conf);
        LOG.info("Verifying JobStory from compressed trace...");
        verifyWordCountJobStory(jsp.getNextJob());
        expandGzippedTrace(lfs, inputFile, tempFile);
        jsp = dgm.createJobStoryProducer(tempFile.toString(), conf);
        LOG.info("Verifying JobStory from uncompressed trace...");
        verifyWordCountJobStory(jsp.getNextJob());
        tmpIs = lfs.open(tempFile);
        System.setIn(tmpIs);
        LOG.info("Verifying JobStory from trace in standard input...");
        jsp = dgm.createJobStoryProducer("-", conf);
        verifyWordCountJobStory(jsp.getNextJob());
    } finally {
        System.setIn(origStdIn);
        if (tmpIs != null) {
            tmpIs.close();
        }
        lfs.delete(rootTempDir, true);
    }
}
Also used: Path (org.apache.hadoop.fs.Path), JobStoryProducer (org.apache.hadoop.tools.rumen.JobStoryProducer), Configuration (org.apache.hadoop.conf.Configuration), GZIPInputStream (java.util.zip.GZIPInputStream), InputStream (java.io.InputStream), FileSystem (org.apache.hadoop.fs.FileSystem), Test (org.junit.Test)
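The helper expandGzippedTrace is called above but not shown. A plausible implementation, sketched here with plain java.util.zip and java.nio instead of the Hadoop FileSystem streams the real helper presumably uses, just decompresses the .gz trace into the target file:

```java
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hedged sketch of what expandGzippedTrace likely does: stream the
// compressed trace through a GZIPInputStream into the uncompressed file.
public class GzipExpand {
    static void expandGzippedTrace(Path in, Path out) throws IOException {
        try (InputStream is = new GZIPInputStream(Files.newInputStream(in));
             OutputStream os = Files.newOutputStream(out)) {
            is.transferTo(os); // copy all decompressed bytes
        }
    }

    // Round-trips a small payload through gzip to demonstrate the helper.
    static String roundTrip(String payload) throws IOException {
        Path gz = Files.createTempFile("trace", ".json.gz");
        Path plain = Files.createTempFile("trace", ".json");
        try (OutputStream os = new GZIPOutputStream(Files.newOutputStream(gz))) {
            os.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        expandGzippedTrace(gz, plain);
        return new String(Files.readAllBytes(plain), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("{\"jobID\":\"wordcount\"}"));
    }
}
```

In the test itself the uncompressed copy is then fed back both as a file path and via System.setIn, exercising all three trace-input routes.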

Example 3 with JobStoryProducer

Use of org.apache.hadoop.tools.rumen.JobStoryProducer in the Apache hadoop project.

From the class Gridmix, method setupDistCacheEmulation.

/**
   * Setup gridmix for emulation of distributed cache load. This includes
   * generation of distributed cache files, if needed.
   * @param conf gridmix configuration
   * @param traceIn trace file path (if it is '-', the trace is read from
   *                standard input)
   * @param ioPath <ioPath>/input/ is the dir where input data (a) exists
   *               or (b) is generated. <ioPath>/distributedCache/ is the
   *               folder where distributed cache data (a) exists or (b) is to be
   *               generated by gridmix.
   * @param generate true if -generate option was specified
   * @return exit code
   * @throws IOException
   * @throws InterruptedException
   */
private int setupDistCacheEmulation(Configuration conf, String traceIn, Path ioPath, boolean generate) throws IOException, InterruptedException {
    distCacheEmulator.init(traceIn, factory.jobCreator, generate);
    int exitCode = 0;
    if (distCacheEmulator.shouldGenerateDistCacheData() || distCacheEmulator.shouldEmulateDistCacheLoad()) {
        JobStoryProducer jsp = createJobStoryProducer(traceIn, conf);
        exitCode = distCacheEmulator.setupGenerateDistCacheData(jsp);
        if (exitCode == 0) {
            // If there are files to be generated, run a MapReduce job to generate
            // these distributed cache files of all the simulated jobs of this trace.
            writeDistCacheData(conf);
        }
    }
    return exitCode;
}
Also used: JobStoryProducer (org.apache.hadoop.tools.rumen.JobStoryProducer)
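A consumer such as setupGenerateDistCacheData presumably drains the producer by calling getNextJob() until it returns null, the end-of-trace signal also relied on implicitly in the examples above. A self-contained sketch with a stand-in producer interface (the real Rumen type is not used here):

```java
import java.util.Iterator;
import java.util.List;

// Hedged sketch of the drain loop over a JobStoryProducer-style source:
// getNextJob() yields one simulated job per call and null at end of trace.
public class TraceDrain {
    interface Producer<T> { T getNextJob(); } // stand-in, not the Rumen interface

    static <T> int countJobs(Producer<T> p) {
        int n = 0;
        for (T job = p.getNextJob(); job != null; job = p.getNextJob()) {
            n++; // each non-null job would be inspected for dist-cache files
        }
        return n;
    }

    // Adapts a fixed list into a producer, for demonstration.
    static int countFromList(List<String> jobs) {
        Iterator<String> it = jobs.iterator();
        return countJobs(() -> it.hasNext() ? it.next() : null);
    }

    public static void main(String[] args) {
        System.out.println(countFromList(List.of("j1", "j2", "j3"))); // prints 3
    }
}
```

This also explains why a fresh producer is created here rather than reused: once drained, the trace has to be reopened to be read again.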

Aggregations

JobStoryProducer (org.apache.hadoop.tools.rumen.JobStoryProducer): 3
Configuration (org.apache.hadoop.conf.Configuration): 2
Path (org.apache.hadoop.fs.Path): 2
Test (org.junit.Test): 2
File (java.io.File): 1
InputStream (java.io.InputStream): 1
CountDownLatch (java.util.concurrent.CountDownLatch): 1
GZIPInputStream (java.util.zip.GZIPInputStream): 1
FileSystem (org.apache.hadoop.fs.FileSystem): 1
ZombieJobProducer (org.apache.hadoop.tools.rumen.ZombieJobProducer): 1