Example 1 with MapReduceConfiguration

Use of datawave.webservice.mr.configuration.MapReduceConfiguration in project datawave by NationalSecurityAgency.

In the class MapReduceBeanTest, method testInvalidInputFormat:

@Test(expected = DatawaveWebApplicationException.class)
public void testInvalidInputFormat() throws Exception {
    Job mockJob = createMock(Job.class);
    bean.setJob(mockJob);
    MapReduceConfiguration mrConfig = applicationContext.getBean(MapReduceConfiguration.class);
    mrConfig.getJobConfiguration().clear();
    mrConfig.getJobConfiguration().put("TestJob", new MapReduceJobConfiguration());
    // BulkResultsJob uses AccumuloInputFormat; MapReduceJobs.xml in
    // src/test/resources specifies something else
    expect(ctx.getCallerPrincipal()).andReturn(principal);
    replayAll();
    bean.submit("TestJob", "queryId:1243;format:XML");
    verifyAll();
}
Also used: MapReduceJobConfiguration (datawave.webservice.mr.configuration.MapReduceJobConfiguration), MapReduceConfiguration (datawave.webservice.mr.configuration.MapReduceConfiguration), Job (org.apache.hadoop.mapreduce.Job), PrepareForTest (org.powermock.core.classloader.annotations.PrepareForTest), Test (org.junit.Test)
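
The tests in these examples rely on fixture fields (bean, ctx, applicationContext, principal) that are initialized outside the snippets. For context, a minimal harness sketch, assuming an EasyMock/PowerMock fixture and the Spring context file named in the test comments; the class name and field wiring here are assumptions, not the actual MapReduceBeanTest code:

import javax.ejb.EJBContext;

import org.easymock.EasyMockSupport;
import org.junit.Before;
import org.powermock.reflect.Whitebox;
import org.springframework.context.support.ClassPathXmlApplicationContext;

import datawave.security.authorization.DatawavePrincipal;
import datawave.webservice.mr.MapReduceBean;

// Hypothetical harness; shows only the wiring the tests depend on.
public class MapReduceBeanTestHarness extends EasyMockSupport {

    protected ClassPathXmlApplicationContext applicationContext;
    protected EJBContext ctx;
    protected DatawavePrincipal principal;
    protected MapReduceBean bean;

    @Before
    public void setup() {
        // MapReduceJobs.xml (src/test/resources) supplies the MapReduceConfiguration
        // bean that the tests fetch and mutate via applicationContext.getBean(...).
        applicationContext = new ClassPathXmlApplicationContext("classpath:MapReduceJobs.xml");
        ctx = createMock(EJBContext.class);
        principal = createMock(DatawavePrincipal.class);
        bean = new MapReduceBean();
        // Stand-in for container injection of the EJBContext; the real test may
        // use a setter or a different reflection utility instead.
        Whitebox.setInternalState(bean, EJBContext.class, ctx);
    }
}

Extending EasyMockSupport is what makes the replayAll()/verifyAll() calls in the tests cover every mock created through createMock().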

Example 2 with MapReduceConfiguration

Use of datawave.webservice.mr.configuration.MapReduceConfiguration in project datawave by NationalSecurityAgency.

In the class MapReduceBeanTest, method testNoResults:

@Test(expected = NoResultsException.class)
public void testNoResults() throws Exception {
    Job mockJob = createMock(Job.class);
    bean.setJob(mockJob);
    MapReduceJobConfiguration cfg = new MapReduceJobConfiguration() {

        @Override
        public final void initializeConfiguration(String jobId, Job job, Map<String, String> runtimeParameters, DatawavePrincipal serverPrincipal) throws Exception {
            throw new NoResultsException(new QueryException(DatawaveErrorCode.NO_RANGES));
        }
    };
    MapReduceConfiguration mrConfig = applicationContext.getBean(MapReduceConfiguration.class);
    mrConfig.getJobConfiguration().clear();
    mrConfig.getJobConfiguration().put("TestJob", cfg);
    // BulkResultsJob uses AccumuloInputFormat; MapReduceJobs.xml in
    // src/test/resources specifies something else
    expect(ctx.getCallerPrincipal()).andReturn(principal);
    replayAll();
    bean.submit("TestJob", "queryId:1243;format:XML");
    verifyAll();
}
Also used: NoResultsException (datawave.webservice.common.exception.NoResultsException), QueryException (datawave.webservice.query.exception.QueryException), MapReduceJobConfiguration (datawave.webservice.mr.configuration.MapReduceJobConfiguration), MapReduceConfiguration (datawave.webservice.mr.configuration.MapReduceConfiguration), Job (org.apache.hadoop.mapreduce.Job), Map (java.util.Map), DatawavePrincipal (datawave.security.authorization.DatawavePrincipal), PrepareForTest (org.powermock.core.classloader.annotations.PrepareForTest), Test (org.junit.Test)
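
The anonymous subclass in testNoResults demonstrates the extension point these tests exercise: initializeConfiguration(...) is where a job configuration turns runtime parameters into Hadoop Job settings, or fails with an exception as above. A sketch of a trivial concrete subclass honoring the same contract; the class name and the inputPath parameter are illustrative, not part of the DataWave codebase:

import java.util.Map;

import org.apache.hadoop.mapreduce.Job;

import datawave.security.authorization.DatawavePrincipal;
import datawave.webservice.mr.configuration.MapReduceJobConfiguration;

// Illustrative only: a concrete configuration implementing the override contract.
public class WordCountJobConfiguration extends MapReduceJobConfiguration {

    @Override
    public void initializeConfiguration(String jobId, Job job, Map<String, String> runtimeParameters,
                    DatawavePrincipal serverPrincipal) throws Exception {
        // Pull caller-supplied parameters (parsed by MapReduceBean.submit from the
        // "name:value;name:value" string) and fail fast when one is missing.
        String inputPath = runtimeParameters.get("inputPath"); // hypothetical parameter
        if (inputPath == null) {
            throw new IllegalArgumentException("Required runtime parameter 'inputPath' not provided");
        }
        job.getConfiguration().set("wordcount.input.path", inputPath);
        job.setJobName("WordCount_" + jobId);
    }
}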

Example 3 with MapReduceConfiguration

Use of datawave.webservice.mr.configuration.MapReduceConfiguration in project datawave by NationalSecurityAgency.

In the class MapReduceBean, method submit:

/**
 * Execute a MapReduce job with the given name and runtime parameters
 *
 * @param jobName
 *            Name of the map reduce job configuration
 * @param parameters
 *            A semicolon-separated list of name:value pairs. These are the required and optional parameters listed in the MapReduceConfiguration objects
 *            returned by the call to list()
 * @return {@code datawave.webservice.result.GenericResponse<String>} job id
 * @RequestHeader X-ProxiedEntitiesChain use when proxying request for user by specifying a chain of DNs of the identities to proxy
 * @RequestHeader X-ProxiedIssuersChain required when using X-ProxiedEntitiesChain, specify one issuer DN per subject DN listed in X-ProxiedEntitiesChain
 * @ResponseHeader X-OperationTimeInMS time spent on the server performing the operation, does not account for network or result serialization
 * @HTTP 200 success
 * @HTTP 204 if no data was found
 * @HTTP 400 if jobName is invalid
 * @HTTP 401 if user does not have correct roles
 * @HTTP 500 error starting the job
 */
@POST
@Produces({ "application/xml", "text/xml", "application/json", "text/yaml", "text/x-yaml", "application/x-yaml", "application/x-protobuf", "application/x-protostuff" })
@javax.ws.rs.Path("/submit")
@GZIP
public GenericResponse<String> submit(@FormParam("jobName") String jobName, @FormParam("parameters") String parameters) {
    GenericResponse<String> response = new GenericResponse<>();
    // Find out who/what called this method
    Principal p = ctx.getCallerPrincipal();
    String sid;
    Set<Collection<String>> cbAuths = new HashSet<>();
    DatawavePrincipal datawavePrincipal = null;
    if (p instanceof DatawavePrincipal) {
        datawavePrincipal = (DatawavePrincipal) p;
        sid = datawavePrincipal.getShortName();
        cbAuths.addAll(datawavePrincipal.getAuthorizations());
    } else {
        QueryException qe = new QueryException(DatawaveErrorCode.UNEXPECTED_PRINCIPAL_ERROR, MessageFormat.format("Class: {0}", p.getClass().getName()));
        response.addException(qe);
        throw new DatawaveWebApplicationException(qe, response);
    }
    // Get the MapReduceJobConfiguration from the configuration
    MapReduceJobConfiguration job;
    try {
        job = this.mapReduceConfiguration.getConfiguration(jobName);
    } catch (IllegalArgumentException e) {
        BadRequestQueryException qe = new BadRequestQueryException(DatawaveErrorCode.JOB_CONFIGURATION_ERROR, e);
        response.addException(qe);
        throw new BadRequestException(qe, response);
    }
    // Ensure that the user has the required roles and has passed the required auths
    if (null != job.getRequiredRoles() || null != job.getRequiredAuths()) {
        try {
            canRunJob(datawavePrincipal, new MultivaluedMapImpl<>(), job.getRequiredRoles(), job.getRequiredAuths());
        } catch (UnauthorizedQueryException qe) {
            // user does not have all of the required roles or did not pass the required auths
            response.addException(qe);
            throw new UnauthorizedException(qe, response);
        }
    }
    // Parse the parameters
    Map<String, String> runtimeParameters = new HashMap<>();
    if (null != parameters) {
        String[] pairs = parameters.split(PARAMETER_SEPARATOR);
        for (String pair : pairs) {
            String[] parts = pair.split(PARAMETER_NAME_VALUE_SEPARATOR);
            if (parts.length == 2) {
                runtimeParameters.put(parts[0], parts[1]);
            }
        }
    }
    // Check to see if the job configuration class implements specific interfaces.
    if (job instanceof NeedCallerDetails) {
        ((NeedCallerDetails) job).setUserSid(sid);
        ((NeedCallerDetails) job).setPrincipal(p);
    }
    if (job instanceof NeedAccumuloConnectionFactory) {
        ((NeedAccumuloConnectionFactory) job).setAccumuloConnectionFactory(this.connectionFactory);
    }
    if (job instanceof NeedAccumuloDetails) {
        ((NeedAccumuloDetails) job).setUsername(this.connectionPoolsConfiguration.getPools().get(this.connectionPoolsConfiguration.getDefaultPool()).getUsername());
        ((NeedAccumuloDetails) job).setPassword(this.connectionPoolsConfiguration.getPools().get(this.connectionPoolsConfiguration.getDefaultPool()).getPassword());
        ((NeedAccumuloDetails) job).setInstanceName(this.connectionPoolsConfiguration.getPools().get(this.connectionPoolsConfiguration.getDefaultPool()).getInstance());
        ((NeedAccumuloDetails) job).setZookeepers(this.connectionPoolsConfiguration.getPools().get(this.connectionPoolsConfiguration.getDefaultPool()).getZookeepers());
    }
    if (job instanceof NeedQueryLogicFactory) {
        ((NeedQueryLogicFactory) job).setQueryLogicFactory(this.queryLogicFactory);
    }
    if (job instanceof NeedQueryPersister) {
        ((NeedQueryPersister) job).setPersister(this.queryPersister);
    }
    if (job instanceof NeedQueryCache) {
        ((NeedQueryCache) job).setQueryCache(cache);
    }
    if (job instanceof NeedSecurityDomain) {
        ((NeedSecurityDomain) job).setSecurityDomain(this.jsseSecurityDomain);
    }
    // If this job is being restarted, then the jobId will be the same. The restart method
    // puts the id into the runtime parameters
    String id = runtimeParameters.get(JOB_ID);
    if (null == id)
        id = UUID.randomUUID().toString();
    org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
    StringBuilder name = new StringBuilder().append(jobName).append("_sid_").append(sid).append("_id_").append(id);
    Job j;
    try {
        j = createJob(conf, name);
        job.initializeConfiguration(id, j, runtimeParameters, serverPrincipal);
    } catch (WebApplicationException waEx) {
        throw waEx;
    } catch (Exception e) {
        QueryException qe = new QueryException(DatawaveErrorCode.LOGIC_CONFIGURATION_ERROR, e);
        log.error(qe.getMessage(), e);
        response.addException(qe.getBottomQueryException());
        throw new DatawaveWebApplicationException(qe, response);
    }
    // Enforce that certain InputFormat classes are being used here.
    if (this.mapReduceConfiguration.isRestrictInputFormats()) {
        // Make sure that the job input format is in the list
        Class<? extends InputFormat<?, ?>> ifClass;
        try {
            ifClass = j.getInputFormatClass();
        } catch (ClassNotFoundException e1) {
            QueryException qe = new QueryException(DatawaveErrorCode.INPUT_FORMAT_CLASS_ERROR, e1);
            log.error(qe);
            response.addException(qe);
            throw new DatawaveWebApplicationException(qe, response);
        }
        if (!this.mapReduceConfiguration.getValidInputFormats().contains(ifClass)) {
            IllegalArgumentException e = new IllegalArgumentException("Invalid input format class specified. Must use one of " + this.mapReduceConfiguration.getValidInputFormats());
            QueryException qe = new QueryException(DatawaveErrorCode.INVALID_FORMAT, e);
            log.error(qe);
            response.addException(qe.getBottomQueryException());
            throw new DatawaveWebApplicationException(qe, response);
        }
    }
    try {
        j.submit();
    } catch (Exception e) {
        QueryException qe = new QueryException(DatawaveErrorCode.MAPREDUCE_JOB_START_ERROR, e);
        log.error(qe.getMessage(), qe);
        response.addException(qe.getBottomQueryException());
        throw new DatawaveWebApplicationException(qe, response);
    }
    JobID mapReduceJobId = j.getJobID();
    log.info("JOB ID: " + mapReduceJobId);
    // Create an entry in the state table
    boolean restarted = (runtimeParameters.get(JOB_ID) != null);
    try {
        if (!restarted)
            mapReduceState.create(id, job.getHdfsUri(), job.getJobTracker(), job.getJobDir(), mapReduceJobId.toString(), job.getResultsDir(), parameters, jobName);
        else
            mapReduceState.addJob(id, mapReduceJobId.toString());
    } catch (Exception e) {
        QueryException qe = new QueryException(DatawaveErrorCode.MAPREDUCE_STATE_PERSISTENCE_ERROR, e);
        log.error(qe);
        response.addException(qe.getBottomQueryException());
        try {
            j.killJob();
        } catch (IOException ioe) {
            QueryException qe2 = new QueryException(DatawaveErrorCode.MAPREDUCE_JOB_KILL_ERROR, ioe);
            response.addException(qe2);
        }
        throw new DatawaveWebApplicationException(qe, response);
    }
    response.setResult(id);
    return response;
}
Also used: Configuration (org.apache.hadoop.conf.Configuration), ConnectionPoolsConfiguration (datawave.webservice.common.connection.config.ConnectionPoolsConfiguration), OozieJobConfiguration (datawave.webservice.mr.configuration.OozieJobConfiguration), MapReduceJobConfiguration (datawave.webservice.mr.configuration.MapReduceJobConfiguration), MapReduceConfiguration (datawave.webservice.mr.configuration.MapReduceConfiguration), DatawaveWebApplicationException (datawave.webservice.common.exception.DatawaveWebApplicationException), WebApplicationException (javax.ws.rs.WebApplicationException), HashMap (java.util.HashMap), NeedQueryPersister (datawave.webservice.mr.configuration.NeedQueryPersister), NeedCallerDetails (datawave.webservice.mr.configuration.NeedCallerDetails), NeedQueryLogicFactory (datawave.webservice.mr.configuration.NeedQueryLogicFactory), DatawavePrincipal (datawave.security.authorization.DatawavePrincipal), UnauthorizedException (datawave.webservice.common.exception.UnauthorizedException), NeedQueryCache (datawave.webservice.mr.configuration.NeedQueryCache), RunningJob (org.apache.hadoop.mapred.RunningJob), Job (org.apache.hadoop.mapreduce.Job), HashSet (java.util.HashSet), GenericResponse (datawave.webservice.result.GenericResponse), BadRequestQueryException (datawave.webservice.query.exception.BadRequestQueryException), NeedAccumuloConnectionFactory (datawave.webservice.mr.configuration.NeedAccumuloConnectionFactory), IOException (java.io.IOException), NotFoundQueryException (datawave.webservice.query.exception.NotFoundQueryException), QueryException (datawave.webservice.query.exception.QueryException), BadRequestException (datawave.webservice.common.exception.BadRequestException), NotFoundException (datawave.webservice.common.exception.NotFoundException), UnauthorizedQueryException (datawave.webservice.query.exception.UnauthorizedQueryException), Collection (java.util.Collection), NeedAccumuloDetails (datawave.webservice.mr.configuration.NeedAccumuloDetails), Principal (java.security.Principal), ServerPrincipal (datawave.security.system.ServerPrincipal), JobID (org.apache.hadoop.mapreduce.JobID), NeedSecurityDomain (datawave.webservice.mr.configuration.NeedSecurityDomain), POST (javax.ws.rs.POST), Produces (javax.ws.rs.Produces), GZIP (org.jboss.resteasy.annotations.GZIP)
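
The parameter string format documented in the javadoc ("queryId:1243;format:XML" in the tests above) is parsed by the small loop in submit(). A standalone sketch of that parsing, assuming PARAMETER_SEPARATOR is ";" and PARAMETER_NAME_VALUE_SEPARATOR is ":" (their actual values are constants not shown in this snippet):

import java.util.HashMap;
import java.util.Map;

public final class RuntimeParameterParsing {

    // Assumed values; in MapReduceBean these are constants defined elsewhere.
    private static final String PARAMETER_SEPARATOR = ";";
    private static final String PARAMETER_NAME_VALUE_SEPARATOR = ":";

    // Mirrors the parsing loop in submit(): entries that do not split into
    // exactly two parts are silently dropped, just as in the original code.
    public static Map<String, String> parse(String parameters) {
        Map<String, String> runtimeParameters = new HashMap<>();
        if (parameters != null) {
            for (String pair : parameters.split(PARAMETER_SEPARATOR)) {
                String[] parts = pair.split(PARAMETER_NAME_VALUE_SEPARATOR);
                if (parts.length == 2) {
                    runtimeParameters.put(parts[0], parts[1]);
                }
            }
        }
        return runtimeParameters;
    }

    public static void main(String[] args) {
        // Yields two entries, queryId=1243 and format=XML (map order unspecified).
        System.out.println(parse("queryId:1243;format:XML"));
    }
}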

Aggregations

MapReduceConfiguration (datawave.webservice.mr.configuration.MapReduceConfiguration): 3
MapReduceJobConfiguration (datawave.webservice.mr.configuration.MapReduceJobConfiguration): 3
Job (org.apache.hadoop.mapreduce.Job): 3
DatawavePrincipal (datawave.security.authorization.DatawavePrincipal): 2
QueryException (datawave.webservice.query.exception.QueryException): 2
Test (org.junit.Test): 2
PrepareForTest (org.powermock.core.classloader.annotations.PrepareForTest): 2
ServerPrincipal (datawave.security.system.ServerPrincipal): 1
ConnectionPoolsConfiguration (datawave.webservice.common.connection.config.ConnectionPoolsConfiguration): 1
BadRequestException (datawave.webservice.common.exception.BadRequestException): 1
DatawaveWebApplicationException (datawave.webservice.common.exception.DatawaveWebApplicationException): 1
NoResultsException (datawave.webservice.common.exception.NoResultsException): 1
NotFoundException (datawave.webservice.common.exception.NotFoundException): 1
UnauthorizedException (datawave.webservice.common.exception.UnauthorizedException): 1
NeedAccumuloConnectionFactory (datawave.webservice.mr.configuration.NeedAccumuloConnectionFactory): 1
NeedAccumuloDetails (datawave.webservice.mr.configuration.NeedAccumuloDetails): 1
NeedCallerDetails (datawave.webservice.mr.configuration.NeedCallerDetails): 1
NeedQueryCache (datawave.webservice.mr.configuration.NeedQueryCache): 1
NeedQueryLogicFactory (datawave.webservice.mr.configuration.NeedQueryLogicFactory): 1
NeedQueryPersister (datawave.webservice.mr.configuration.NeedQueryPersister): 1