Example 71 with LocalFileSystem

use of org.apache.hadoop.fs.LocalFileSystem in project druid by druid-io.

the class JobHelper method setupClasspath.

/**
   * Uploads jar files to HDFS and configures the job classpath.
   * Snapshot jar files are uploaded to the intermediateClassPath and are not shared across jobs.
   * Non-snapshot jar files are uploaded to the distributedClassPath and shared across multiple jobs.
   *
   * @param distributedClassPath  classpath shared across multiple jobs
   * @param intermediateClassPath classpath exclusive to this job; used to upload SNAPSHOT jar files
   * @param job                   job to run
   *
   * @throws IOException
   */
public static void setupClasspath(final Path distributedClassPath, final Path intermediateClassPath, final Job job) throws IOException {
    String classpathProperty = System.getProperty("druid.hadoop.internal.classpath");
    if (classpathProperty == null) {
        classpathProperty = System.getProperty("java.class.path");
    }
    String[] jarFiles = classpathProperty.split(File.pathSeparator);
    final Configuration conf = job.getConfiguration();
    final FileSystem fs = distributedClassPath.getFileSystem(conf);
    // Nothing to upload when the distributed classpath lives on the local filesystem (e.g. local-mode tests).
    if (fs instanceof LocalFileSystem) {
        return;
    }
    for (String jarFilePath : jarFiles) {
        final File jarFile = new File(jarFilePath);
        if (jarFile.getName().endsWith(".jar")) {
            try {
                RetryUtils.retry(new Callable<Boolean>() {

                    @Override
                    public Boolean call() throws Exception {
                        if (isSnapshot(jarFile)) {
                            addSnapshotJarToClassPath(jarFile, intermediateClassPath, fs, job);
                        } else {
                            addJarToClassPath(jarFile, distributedClassPath, intermediateClassPath, fs, job);
                        }
                        return true;
                    }
                }, shouldRetryPredicate(), NUM_RETRIES);
            } catch (Exception e) {
                throw Throwables.propagate(e);
            }
        }
    }
}
Also used : Configuration(org.apache.hadoop.conf.Configuration) FileSystem(org.apache.hadoop.fs.FileSystem) LocalFileSystem(org.apache.hadoop.fs.LocalFileSystem) File(java.io.File) URISyntaxException(java.net.URISyntaxException) JsonProcessingException(com.fasterxml.jackson.core.JsonProcessingException) IOException(java.io.IOException)
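The isSnapshot and addJarToClassPath helpers are not reproduced on this page. A minimal sketch of what the snapshot check amounts to, assuming the standard Maven "-SNAPSHOT" jar naming convention (the actual Druid helper may differ):

private static boolean isSnapshot(File jarFile) {
    // Maven snapshot builds conventionally produce artifacts named *-SNAPSHOT.jar.
    return jarFile.getName().endsWith("-SNAPSHOT.jar");
}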

Example 72 with LocalFileSystem

use of org.apache.hadoop.fs.LocalFileSystem in project druid by druid-io.

the class HadoopDruidIndexerConfigTest method shouldMakeDefaultSegmentOutputPathIfNotHDFS.

@Test
public void shouldMakeDefaultSegmentOutputPathIfNotHDFS() {
    final HadoopIngestionSpec schema;
    try {
        schema = jsonReadWriteRead(
            "{\n"
            + "    \"dataSchema\": {\n"
            + "        \"dataSource\": \"the:data:source\",\n"
            + "        \"metricsSpec\": [],\n"
            + "        \"granularitySpec\": {\n"
            + "            \"type\": \"uniform\",\n"
            + "            \"segmentGranularity\": \"hour\",\n"
            + "            \"intervals\": [\"2012-07-10/P1D\"]\n"
            + "        }\n"
            + "    },\n"
            + "    \"ioConfig\": {\n"
            + "        \"type\": \"hadoop\",\n"
            + "        \"segmentOutputPath\": \"/tmp/dru:id/data:test\"\n"
            + "    }\n"
            + "}",
            HadoopIngestionSpec.class
        );
    } catch (Exception e) {
        throw Throwables.propagate(e);
    }
    HadoopDruidIndexerConfig cfg = new HadoopDruidIndexerConfig(schema.withTuningConfig(schema.getTuningConfig().withVersion("some:brand:new:version")));
    Bucket bucket = new Bucket(4711, new DateTime(2012, 07, 10, 5, 30), 4712);
    // The same DataSegment is used for all three path computations, so build it once.
    DataSegment segment = new DataSegment(
        cfg.getSchema().getDataSchema().getDataSource(),
        cfg.getSchema().getDataSchema().getGranularitySpec().bucketInterval(bucket.time).get(),
        cfg.getSchema().getTuningConfig().getVersion(),
        null, null, null,
        new NumberedShardSpec(bucket.partitionNum, 5000),
        -1, -1);
    Path outputPath = new Path(cfg.getSchema().getIOConfig().getSegmentOutputPath());
    Path path = JobHelper.makeFileNamePath(outputPath, new LocalFileSystem(), segment, JobHelper.INDEX_ZIP);
    Assert.assertEquals("file:/tmp/dru:id/data:test/the:data:source/2012-07-10T05:00:00.000Z_2012-07-10T06:00:00.000Z/" + "some:brand:new:version/4712/index.zip", path.toString());
    path = JobHelper.makeFileNamePath(outputPath, new LocalFileSystem(), segment, JobHelper.DESCRIPTOR_JSON);
    Assert.assertEquals("file:/tmp/dru:id/data:test/the:data:source/2012-07-10T05:00:00.000Z_2012-07-10T06:00:00.000Z/" + "some:brand:new:version/4712/descriptor.json", path.toString());
    path = JobHelper.makeTmpPath(outputPath, new LocalFileSystem(), segment, new TaskAttemptID("abc", 123, TaskType.REDUCE, 1, 0));
    Assert.assertEquals("file:/tmp/dru:id/data:test/the:data:source/2012-07-10T05:00:00.000Z_2012-07-10T06:00:00.000Z/" + "some:brand:new:version/4712/4712_index.zip.0", path.toString());
}
Also used : Path(org.apache.hadoop.fs.Path) LocalFileSystem(org.apache.hadoop.fs.LocalFileSystem) TaskAttemptID(org.apache.hadoop.mapreduce.TaskAttemptID) DataSegment(io.druid.timeline.DataSegment) DateTime(org.joda.time.DateTime) NumberedShardSpec(io.druid.timeline.partition.NumberedShardSpec) HashBasedNumberedShardSpec(io.druid.timeline.partition.HashBasedNumberedShardSpec) Test(org.junit.Test)
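The "file:" prefix in the expected paths comes from LocalFileSystem's URI scheme. As a small illustration (hypothetical path, standard Hadoop imports assumed), qualifying a bare local path attaches that scheme:

Configuration conf = new Configuration();
LocalFileSystem localFs = FileSystem.getLocal(conf);
// makeQualified attaches the "file" scheme, which is why the assertions
// above compare against strings starting with "file:/tmp/...".
Path qualified = localFs.makeQualified(new Path("/tmp/example"));
System.out.println(qualified); // prints file:/tmp/example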

Example 73 with LocalFileSystem

use of org.apache.hadoop.fs.LocalFileSystem in project druid by druid-io.

the class IndexGeneratorJobTest method writeDataToLocalSequenceFile.

private void writeDataToLocalSequenceFile(File outputFile, List<String> data) throws IOException {
    Configuration conf = new Configuration();
    LocalFileSystem fs = FileSystem.getLocal(conf);
    Writer fileWriter = SequenceFile.createWriter(
        fs, conf, new Path(outputFile.getAbsolutePath()),
        BytesWritable.class, BytesWritable.class,
        SequenceFile.CompressionType.NONE, (CompressionCodec) null);
    try {
        // Keys are monotonically increasing ints; values are the raw UTF-8 lines.
        int keyCount = 10;
        for (String line : data) {
            ByteBuffer buf = ByteBuffer.allocate(4);
            buf.putInt(keyCount);
            BytesWritable key = new BytesWritable(buf.array());
            BytesWritable value = new BytesWritable(line.getBytes(Charsets.UTF_8));
            fileWriter.append(key, value);
            keyCount += 1;
        }
    } finally {
        fileWriter.close();
    }
}
Also used : Path(org.apache.hadoop.fs.Path) Configuration(org.apache.hadoop.conf.Configuration) LocalFileSystem(org.apache.hadoop.fs.LocalFileSystem) BytesWritable(org.apache.hadoop.io.BytesWritable) ByteBuffer(java.nio.ByteBuffer) Writer(org.apache.hadoop.io.SequenceFile.Writer)
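As a hedged counterpart (not part of the test class; method name is illustrative), the file written above could be read back with the matching classic SequenceFile.Reader API:

private void readDataFromLocalSequenceFile(File inputFile) throws IOException {
    Configuration conf = new Configuration();
    LocalFileSystem fs = FileSystem.getLocal(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(inputFile.getAbsolutePath()), conf);
    try {
        BytesWritable key = new BytesWritable();
        BytesWritable value = new BytesWritable();
        // Iterate the (key, value) pairs in the order they were appended.
        while (reader.next(key, value)) {
            int id = ByteBuffer.wrap(key.copyBytes()).getInt();
            String line = new String(value.copyBytes(), Charsets.UTF_8);
            // process (id, line) ...
        }
    } finally {
        reader.close();
    }
}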

Example 74 with LocalFileSystem

use of org.apache.hadoop.fs.LocalFileSystem in project hadoop by apache.

the class TestDelegationTokenFetcher method testDelegationTokenWithoutRenewerViaRPC.

@Test
public void testDelegationTokenWithoutRenewerViaRPC() throws Exception {
    conf.setBoolean(DFS_NAMENODE_DELEGATION_TOKEN_ALWAYS_USE_KEY, true);
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(0).build();
    try {
        cluster.waitActive();
        DistributedFileSystem fs = cluster.getFileSystem();
        // Should be able to fetch token without renewer.
        LocalFileSystem localFileSystem = FileSystem.getLocal(conf);
        Path p = new Path(f.getRoot().getAbsolutePath(), tokenFile);
        p = localFileSystem.makeQualified(p);
        DelegationTokenFetcher.saveDelegationToken(conf, fs, null, p);
        Credentials creds = Credentials.readTokenStorageFile(p, conf);
        Iterator<Token<?>> itr = creds.getAllTokens().iterator();
        assertTrue("Token file should contain a token", itr.hasNext());
        final Token<?> token = itr.next();
        assertNotNull("Token should be there without renewer", token);
        // Test compatibility of DelegationTokenFetcher.printTokensToString
        String expectedNonVerbose = "Token (HDFS_DELEGATION_TOKEN token 1 for " + System.getProperty("user.name") + " with renewer ) for";
        String resNonVerbose = DelegationTokenFetcher.printTokensToString(conf, p, false);
        assertTrue("The non verbose output is expected to start with \"" + expectedNonVerbose + "\"", resNonVerbose.startsWith(expectedNonVerbose));
        LOG.info(resNonVerbose);
        LOG.info(DelegationTokenFetcher.printTokensToString(conf, p, true));
        try {
            // Without renewer renewal of token should fail.
            DelegationTokenFetcher.renewTokens(conf, p);
            fail("Should have failed to renew");
        } catch (AccessControlException e) {
            GenericTestUtils.assertExceptionContains("tried to renew a token (" + token.decodeIdentifier() + ") without a renewer", e);
        }
    } finally {
        cluster.shutdown();
    }
}
Also used : Path(org.apache.hadoop.fs.Path) MiniDFSCluster(org.apache.hadoop.hdfs.MiniDFSCluster) LocalFileSystem(org.apache.hadoop.fs.LocalFileSystem) AccessControlException(org.apache.hadoop.security.AccessControlException) Token(org.apache.hadoop.security.token.Token) Matchers.anyString(org.mockito.Matchers.anyString) DistributedFileSystem(org.apache.hadoop.hdfs.DistributedFileSystem) Credentials(org.apache.hadoop.security.Credentials) Test(org.junit.Test)
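For contrast, a sketch of the renewer-present path, written as if it sat inside the same try block (it is not part of the test; it uses the caller's user name as the renewer, since HDFS only allows renewal when the caller matches the recorded renewer):

// Requires org.apache.hadoop.security.UserGroupInformation.
String renewer = UserGroupInformation.getCurrentUser().getShortUserName();
Path p2 = localFileSystem.makeQualified(new Path(f.getRoot().getAbsolutePath(), "tokenWithRenewer"));
DelegationTokenFetcher.saveDelegationToken(conf, fs, renewer, p2);
// The caller matches the renewer, so renewal should succeed here instead of
// throwing AccessControlException as in the renewer-less case above.
DelegationTokenFetcher.renewTokens(conf, p2);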

Example 75 with LocalFileSystem

use of org.apache.hadoop.fs.LocalFileSystem in project hadoop by apache.

the class TestMapFile method setup.

@Before
public void setup() throws Exception {
    LocalFileSystem fs = FileSystem.getLocal(conf);
    if (fs.exists(TEST_DIR) && !fs.delete(TEST_DIR, true)) {
        Assert.fail("Can't clean up test root dir");
    }
    fs.mkdirs(TEST_DIR);
}
Also used : LocalFileSystem(org.apache.hadoop.fs.LocalFileSystem) Before(org.junit.Before)
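The setup above only prepares TEST_DIR. As a hypothetical companion snippet (not in the test class; org.apache.hadoop.io.MapFile and Text imports assumed), writing a small MapFile into that directory might look like this; MapFile keys must be appended in ascending order:

LocalFileSystem fs = FileSystem.getLocal(conf);
// MapFile.Writer(conf, fs, dirName, keyClass, valueClass) creates the
// data/index file pair under the given directory.
MapFile.Writer writer = new MapFile.Writer(
    conf, fs, new Path(TEST_DIR, "example.map").toString(), Text.class, Text.class);
try {
    writer.append(new Text("apple"), new Text("1"));   // keys in sorted order
    writer.append(new Text("banana"), new Text("2"));
} finally {
    writer.close();
}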

Aggregations

LocalFileSystem (org.apache.hadoop.fs.LocalFileSystem) 120
Path (org.apache.hadoop.fs.Path) 77
Test (org.junit.Test) 63
Configuration (org.apache.hadoop.conf.Configuration) 56
FileSystem (org.apache.hadoop.fs.FileSystem) 35
IOException (java.io.IOException) 33
File (java.io.File) 23
NewTableConfiguration (org.apache.accumulo.core.client.admin.NewTableConfiguration) 23
SamplerConfiguration (org.apache.accumulo.core.client.sample.SamplerConfiguration) 23
SummarizerConfiguration (org.apache.accumulo.core.client.summary.SummarizerConfiguration) 23
DefaultConfiguration (org.apache.accumulo.core.conf.DefaultConfiguration) 23
Key (org.apache.accumulo.core.data.Key) 22
Value (org.apache.accumulo.core.data.Value) 22
ArrayList (java.util.ArrayList) 19
ExecutorService (java.util.concurrent.ExecutorService) 15
Future (java.util.concurrent.Future) 15
Scanner (org.apache.accumulo.core.client.Scanner) 14
DataSegment (org.apache.druid.timeline.DataSegment) 13
DataSegmentPusher (org.apache.druid.segment.loading.DataSegmentPusher) 8
HdfsDataSegmentPusher (org.apache.druid.storage.hdfs.HdfsDataSegmentPusher) 8