
Example 6 with SparkSpecification

Use of co.cask.cdap.api.spark.SparkSpecification in project cdap by caskdata.

Class ApplicationSpecificationCodec, method deserialize:

@Override
public ApplicationSpecification deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context) throws JsonParseException {
    JsonObject jsonObj = json.getAsJsonObject();
    String name = jsonObj.get("name").getAsString();
    // "appVersion" is optional; fall back to the default version when absent
    String appVersion = ApplicationId.DEFAULT_VERSION;
    if (jsonObj.has("appVersion")) {
        appVersion = jsonObj.get("appVersion").getAsString();
    }
    String description = jsonObj.get("description").getAsString();
    // "configuration" (the serialized application config) is optional and may remain null
    String configuration = null;
    if (jsonObj.has("configuration")) {
        configuration = jsonObj.get("configuration").getAsString();
    }
    ArtifactId artifactId = context.deserialize(jsonObj.get("artifactId"), ArtifactId.class);
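    // Each program and resource category is deserialized into a map keyed by name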
    Map<String, StreamSpecification> streams = deserializeMap(jsonObj.get("streams"), context, StreamSpecification.class);
    Map<String, String> datasetModules = deserializeMap(jsonObj.get("datasetModules"), context, String.class);
    Map<String, DatasetCreationSpec> datasetInstances = deserializeMap(jsonObj.get("datasetInstances"), context, DatasetCreationSpec.class);
    Map<String, FlowSpecification> flows = deserializeMap(jsonObj.get("flows"), context, FlowSpecification.class);
    Map<String, MapReduceSpecification> mapReduces = deserializeMap(jsonObj.get("mapReduces"), context, MapReduceSpecification.class);
    Map<String, SparkSpecification> sparks = deserializeMap(jsonObj.get("sparks"), context, SparkSpecification.class);
    Map<String, WorkflowSpecification> workflows = deserializeMap(jsonObj.get("workflows"), context, WorkflowSpecification.class);
    Map<String, ServiceSpecification> services = deserializeMap(jsonObj.get("services"), context, ServiceSpecification.class);
    Map<String, ScheduleSpecification> schedules = deserializeMap(jsonObj.get("schedules"), context, ScheduleSpecification.class);
    Map<String, ScheduleCreationSpec> programSchedules = deserializeMap(jsonObj.get("programSchedules"), context, ScheduleCreationSpec.class);
    Map<String, WorkerSpecification> workers = deserializeMap(jsonObj.get("workers"), context, WorkerSpecification.class);
    Map<String, Plugin> plugins = deserializeMap(jsonObj.get("plugins"), context, Plugin.class);
    return new DefaultApplicationSpecification(name, appVersion, description, configuration, artifactId, streams, datasetModules, datasetInstances, flows, mapReduces, sparks, workflows, services, schedules, programSchedules, workers, plugins);
}
Also used: ServiceSpecification (co.cask.cdap.api.service.ServiceSpecification), ArtifactId (co.cask.cdap.api.artifact.ArtifactId), JsonObject (com.google.gson.JsonObject), SparkSpecification (co.cask.cdap.api.spark.SparkSpecification), FlowSpecification (co.cask.cdap.api.flow.FlowSpecification), WorkflowSpecification (co.cask.cdap.api.workflow.WorkflowSpecification), ScheduleSpecification (co.cask.cdap.api.schedule.ScheduleSpecification), StreamSpecification (co.cask.cdap.api.data.stream.StreamSpecification), WorkerSpecification (co.cask.cdap.api.worker.WorkerSpecification), MapReduceSpecification (co.cask.cdap.api.mapreduce.MapReduceSpecification), ScheduleCreationSpec (co.cask.cdap.internal.schedule.ScheduleCreationSpec), DatasetCreationSpec (co.cask.cdap.internal.dataset.DatasetCreationSpec), Plugin (co.cask.cdap.api.plugin.Plugin)
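
The deserializeMap helper used throughout this codec is not shown here. A minimal sketch of what it presumably does, assuming it lives in a shared codec base class; the signature and null handling are approximations, not the actual CDAP implementation:

protected final <V> Map<String, V> deserializeMap(JsonElement json, JsonDeserializationContext context, Class<V> valueClass) {
    // Treat a missing or JSON-null field as an empty map
    if (json == null || json.isJsonNull()) {
        return Collections.emptyMap();
    }
    Map<String, V> result = new HashMap<>();
    for (Map.Entry<String, JsonElement> entry : json.getAsJsonObject().entrySet()) {
        // Delegate each value to Gson so registered type adapters (e.g. SparkSpecificationCodec) apply
        result.put(entry.getKey(), context.<V>deserialize(entry.getValue(), valueClass));
    }
    return Collections.unmodifiableMap(result);
}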

Example 7 with SparkSpecification

Use of co.cask.cdap.api.spark.SparkSpecification in project cdap by caskdata.

Class DefaultAppConfigurer, method addSpark:

@Override
public void addSpark(Spark spark) {
    Preconditions.checkArgument(spark != null, "Spark cannot be null.");
    DefaultSparkConfigurer configurer = null;
    // It is a bit hacky here to look for the DefaultExtendedSparkConfigurer implementation through the
    // SparkRunnerClassloader directly (CDAP-11797)
    ClassLoader sparkRunnerClassLoader = ClassLoaders.findByName(spark.getClass().getClassLoader(), "co.cask.cdap.app.runtime.spark.classloader.SparkRunnerClassLoader");
    if (sparkRunnerClassLoader != null) {
        try {
            configurer = (DefaultSparkConfigurer) sparkRunnerClassLoader.loadClass("co.cask.cdap.app.deploy.spark.DefaultExtendedSparkConfigurer").getConstructor(Spark.class, Id.Namespace.class, Id.Artifact.class, ArtifactRepository.class, PluginInstantiator.class).newInstance(spark, deployNamespace, artifactId, artifactRepository, pluginInstantiator);
        } catch (Exception e) {
            // Ignore it and the configurer will be defaulted to DefaultSparkConfigurer
            LOG.trace("No DefaultExtendedSparkConfigurer found. Fallback to DefaultSparkConfigurer.", e);
        }
    }
    if (configurer == null) {
        configurer = new DefaultSparkConfigurer(spark, deployNamespace, artifactId, artifactRepository, pluginInstantiator);
    }
    spark.configure(configurer);
    addDatasetsAndPlugins(configurer);
    SparkSpecification spec = configurer.createSpecification();
    sparks.put(spec.getName(), spec);
}
Also used: SparkSpecification (co.cask.cdap.api.spark.SparkSpecification), DefaultSparkConfigurer (co.cask.cdap.internal.app.spark.DefaultSparkConfigurer), ArtifactRepository (co.cask.cdap.internal.app.runtime.artifact.ArtifactRepository), PluginInstantiator (co.cask.cdap.internal.app.runtime.plugin.PluginInstantiator), Spark (co.cask.cdap.api.spark.Spark)
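
For context, addSpark is reached from an application's configure() method, and the resulting SparkSpecification carries whatever the Spark program's configurer sets. A hypothetical minimal application and Spark program; the class names and the WordCountProgram main class are invented for illustration, and the AbstractSpark setter names are assumptions based on the fields serialized in Example 8:

public class WordCountApp extends AbstractApplication {
    @Override
    public void configure() {
        setName("WordCountApp");
        // Deployment invokes DefaultAppConfigurer.addSpark(...) shown above
        addSpark(new WordCountSpark());
    }
}

public class WordCountSpark extends AbstractSpark {
    @Override
    protected void configure() {
        setName("WordCountSpark");
        setDescription("Counts words with Spark");
        // WordCountProgram is a hypothetical class containing the actual Spark job
        setMainClass(WordCountProgram.class);
        setExecutorResources(new Resources(1024));
    }
}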

Example 8 with SparkSpecification

Use of co.cask.cdap.api.spark.SparkSpecification in project cdap by caskdata.

Class SparkSpecificationCodec, method deserialize:

@Override
public SparkSpecification deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context) throws JsonParseException {
    JsonObject jsonObj = json.getAsJsonObject();
    String className = jsonObj.get("className").getAsString();
    String name = jsonObj.get("name").getAsString();
    String description = jsonObj.get("description").getAsString();
    String mainClassName = jsonObj.get("mainClassName").getAsString();
    Set<String> datasets = deserializeSet(jsonObj.get("datasets"), context, String.class);
    Map<String, String> properties = deserializeMap(jsonObj.get("properties"), context, String.class);
    Resources clientResources = deserializeResources(jsonObj, "client", context);
    Resources driverResources = deserializeResources(jsonObj, "driver", context);
    Resources executorResources = deserializeResources(jsonObj, "executor", context);
    return new SparkSpecification(className, name, description, mainClassName, datasets, properties, clientResources, driverResources, executorResources);
}
Also used: SparkSpecification (co.cask.cdap.api.spark.SparkSpecification), JsonObject (com.google.gson.JsonObject), Resources (co.cask.cdap.api.Resources)
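
The deserializeResources helper is likewise not shown. A plausible sketch, assuming it reads an optional "<prefix>Resources" field (e.g. "clientResources") and returns null when the field is absent; the real helper may differ:

@Nullable
private Resources deserializeResources(JsonObject jsonObj, String prefix, JsonDeserializationContext context) {
    // e.g. prefix "client" resolves to the "clientResources" field
    JsonElement element = jsonObj.get(prefix + "Resources");
    return element == null ? null : context.<Resources>deserialize(element, Resources.class);
}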

Example 9 with SparkSpecification

Use of co.cask.cdap.api.spark.SparkSpecification in project cdap by caskdata.

Class DistributedSparkProgramRunner, method validateOptions:

@Override
protected void validateOptions(Program program, ProgramOptions options) {
    super.validateOptions(program, options);
    // Extract and verify parameters
    ApplicationSpecification appSpec = program.getApplicationSpecification();
    Preconditions.checkNotNull(appSpec, "Missing application specification for %s", program.getId());
    ProgramType processorType = program.getType();
    Preconditions.checkNotNull(processorType, "Missing processor type for %s", program.getId());
    Preconditions.checkArgument(processorType == ProgramType.SPARK, "Only SPARK process type is supported. Program type is %s for %s", processorType, program.getId());
    SparkSpecification spec = appSpec.getSpark().get(program.getName());
    Preconditions.checkNotNull(spec, "Missing SparkSpecification for %s", program.getId());
}
Also used: ApplicationSpecification (co.cask.cdap.api.app.ApplicationSpecification), SparkSpecification (co.cask.cdap.api.spark.SparkSpecification), ProgramType (co.cask.cdap.proto.ProgramType)
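
Example 9 looks the SparkSpecification up from a deserialized ApplicationSpecification. A minimal, hypothetical sketch of producing such a specification by registering the codecs from Examples 6 and 8 with Gson; CDAP's real adapter wiring registers more codecs (flows, workflows, services, and so on), and specJson stands in for a serialized application specification:

Gson gson = new GsonBuilder()
    .registerTypeAdapter(ApplicationSpecification.class, new ApplicationSpecificationCodec())
    .registerTypeAdapter(SparkSpecification.class, new SparkSpecificationCodec())
    .create();
ApplicationSpecification appSpec = gson.fromJson(specJson, ApplicationSpecification.class);
// Same lookup pattern as validateOptions above
SparkSpecification sparkSpec = appSpec.getSpark().get("WordCountSpark");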

Example 10 with SparkSpecification

Use of co.cask.cdap.api.spark.SparkSpecification in project cdap by caskdata.

Class DistributedSparkProgramRunner, method setupLaunchConfig:

@Override
protected void setupLaunchConfig(LaunchConfig launchConfig, Program program, ProgramOptions options, CConfiguration cConf, Configuration hConf, File tempDir) throws IOException {
    // Update the container hConf
    hConf.setBoolean(SparkRuntimeContextConfig.HCONF_ATTR_CLUSTER_MODE, true);
    hConf.set("hive.metastore.token.signature", HiveAuthFactory.HS2_CLIENT_TOKEN);
    if (SecurityUtil.isKerberosEnabled(cConf)) {
        // Need to divide the interval by 0.8 because Spark logic has a 0.8 discount on the interval
        // If we don't offset it, it will look for the new credentials too soon
        // Also add 5 seconds to the interval to give master time to push the changes to the Spark client container
        hConf.setLong(SparkRuntimeContextConfig.HCONF_ATTR_CREDENTIALS_UPDATE_INTERVAL_MS, (long) ((secureStoreRenewer.getUpdateInterval() + 5000) / 0.8));
    }
    // Setup the launch config
    ApplicationSpecification appSpec = program.getApplicationSpecification();
    SparkSpecification spec = appSpec.getSpark().get(program.getName());
    Map<String, String> clientArgs = RuntimeArguments.extractScope("task", "client", options.getUserArguments().asMap());
    Resources resources = SystemArguments.getResources(clientArgs, spec.getClientResources());
    // Add runnable. Only one instance for the spark client
    launchConfig.addRunnable(spec.getName(), new SparkTwillRunnable(spec.getName()), resources, 1);
    // Add extra resources, classpath, dependencies, env and setup ClassAcceptor
    Map<String, LocalizeResource> localizeResources = new HashMap<>();
    Map<String, String> extraEnv = new HashMap<>(SparkPackageUtils.getSparkClientEnv());
    SparkPackageUtils.prepareSparkResources(sparkCompat, locationFactory, tempDir, localizeResources, extraEnv);
    // Add the mapreduce resources and path as well for the InputFormat/OutputFormat classes
    MapReduceContainerHelper.localizeFramework(hConf, localizeResources);
    extraEnv.put(Constants.SPARK_COMPAT_ENV, sparkCompat.getCompat());
    launchConfig.addExtraResources(localizeResources).addExtraDependencies(SparkProgramRuntimeProvider.class).addExtraEnv(extraEnv).addExtraClasspath(MapReduceContainerHelper.addMapReduceClassPath(hConf, new ArrayList<String>())).setClassAcceptor(createBundlerClassAcceptor());
}
Also used: ApplicationSpecification (co.cask.cdap.api.app.ApplicationSpecification), SparkSpecification (co.cask.cdap.api.spark.SparkSpecification), HashMap (java.util.HashMap), LocalizeResource (co.cask.cdap.internal.app.runtime.distributed.LocalizeResource), SparkProgramRuntimeProvider (co.cask.cdap.app.runtime.spark.SparkProgramRuntimeProvider), Resources (co.cask.cdap.api.Resources)
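
The "task"/"client" scope extraction above lets users override settings specifically for the Spark client container. A small illustration of the assumed behavior of RuntimeArguments.extractScope; the exact prefix convention and the "system.resources.memory" key are assumptions, so treat this as an approximation:

// Hypothetical runtime arguments supplied when the program is started
Map<String, String> userArgs = new HashMap<>();
userArgs.put("system.resources.memory", "1024");
userArgs.put("task.client.system.resources.memory", "2048");
// Assumed behavior: keys prefixed with "task.client." are un-prefixed and take precedence
Map<String, String> clientArgs = RuntimeArguments.extractScope("task", "client", userArgs);
// clientArgs.get("system.resources.memory") would then be "2048", and
// SystemArguments.getResources(clientArgs, spec.getClientResources()) would size the client container from it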

Aggregations

SparkSpecification (co.cask.cdap.api.spark.SparkSpecification): 10 uses
ApplicationSpecification (co.cask.cdap.api.app.ApplicationSpecification): 4 uses
Resources (co.cask.cdap.api.Resources): 2 uses
FlowSpecification (co.cask.cdap.api.flow.FlowSpecification): 2 uses
MapReduceSpecification (co.cask.cdap.api.mapreduce.MapReduceSpecification): 2 uses
ServiceSpecification (co.cask.cdap.api.service.ServiceSpecification): 2 uses
Spark (co.cask.cdap.api.spark.Spark): 2 uses
PluginInstantiator (co.cask.cdap.internal.app.runtime.plugin.PluginInstantiator): 2 uses
ProgramType (co.cask.cdap.proto.ProgramType): 2 uses
ProgramId (co.cask.cdap.proto.id.ProgramId): 2 uses
JsonObject (com.google.gson.JsonObject): 2 uses
ArtifactId (co.cask.cdap.api.artifact.ArtifactId): 1 use
StreamSpecification (co.cask.cdap.api.data.stream.StreamSpecification): 1 use
FlowletConnection (co.cask.cdap.api.flow.FlowletConnection): 1 use
FlowletDefinition (co.cask.cdap.api.flow.FlowletDefinition): 1 use
MetricsCollectionService (co.cask.cdap.api.metrics.MetricsCollectionService): 1 use
Plugin (co.cask.cdap.api.plugin.Plugin): 1 use
ScheduleSpecification (co.cask.cdap.api.schedule.ScheduleSpecification): 1 use
HttpServiceHandlerSpecification (co.cask.cdap.api.service.http.HttpServiceHandlerSpecification): 1 use
WorkerSpecification (co.cask.cdap.api.worker.WorkerSpecification): 1 use