
Example 6 with SparkJobConfiguration

Use of io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration in project hopsworks by logicalclocks.

Class JobController, method getConfiguration.

@TransactionAttribute(TransactionAttributeType.NEVER)
public JobConfiguration getConfiguration(Project project, JobType jobType, boolean useDefaultConfig) {
    Optional<DefaultJobConfiguration> defaultConfig;
    if (jobType.equals(JobType.SPARK) || jobType.equals(JobType.PYSPARK)) {
        /**
         * The Spark and PySpark configuration is stored in the same DefaultJobConfiguration entry in the
         * database, namely under the PySpark type. The JobType is inferred from whether a .jar or a .py
         * program is set. However, when creating the DefaultJobConfiguration the JobType must be set as
         * part of the PK, so for now Spark and PySpark share the same configuration.
         */
        defaultConfig = project.getDefaultJobConfigurationCollection().stream()
            .filter(conf -> conf.getDefaultJobConfigurationPK().getType().equals(JobType.PYSPARK))
            .findFirst();
        defaultConfig.ifPresent(defaultJobConfiguration ->
            ((SparkJobConfiguration) defaultJobConfiguration.getJobConfig()).setMainClass(null));
    } else {
        defaultConfig = project.getDefaultJobConfigurationCollection().stream()
            .filter(conf -> conf.getDefaultJobConfigurationPK().getType().equals(jobType))
            .findFirst();
    }
    if (defaultConfig.isPresent()) {
        return defaultConfig.get().getJobConfig();
    } else if (useDefaultConfig) {
        switch(jobType) {
            case SPARK:
            case PYSPARK:
                return new SparkJobConfiguration();
            case FLINK:
                return new FlinkJobConfiguration();
            default:
                throw new IllegalArgumentException("Job type not supported: " + jobType);
        }
    } else {
        return null;
    }
}
Also used : DefaultJobConfiguration(io.hops.hopsworks.persistence.entity.project.jobs.DefaultJobConfiguration) SparkJobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration) FlinkJobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.flink.FlinkJobConfiguration) TransactionAttribute(javax.ejb.TransactionAttribute)
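
A minimal usage sketch, not from the Hopsworks source, assuming a JobController instance and a Project entity are already available (for example via injection). It shows how the method above could resolve the project-level Spark defaults; with useDefaultConfig set to true, SPARK and PYSPARK always yield a SparkJobConfiguration.

SparkJobConfiguration resolveSparkDefaults(JobController jobController, Project project) {
    // useDefaultConfig = true: fall back to a fresh SparkJobConfiguration when the project stores no default
    JobConfiguration conf = jobController.getConfiguration(project, JobType.PYSPARK, true);
    // safe for SPARK/PYSPARK: both branches of getConfiguration return a SparkJobConfiguration here,
    // with mainClass reset to null so the caller decides later between a .jar and a .py program
    return (SparkJobConfiguration) conf;
}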

Example 7 with SparkJobConfiguration

Use of io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration in project hopsworks by logicalclocks.

Class JobController, method inspectProgram.

@TransactionAttribute(TransactionAttributeType.NEVER)
public JobConfiguration inspectProgram(String path, Project project, Users user, JobType jobType) throws JobException {
    DistributedFileSystemOps udfso = null;
    try {
        String username = hdfsUsersBean.getHdfsUserName(project, user);
        udfso = dfs.getDfsOps(username);
        LOGGER.log(Level.FINE, "Inspecting executable job program by {0} at path: {1}", new Object[] { username, path });
        JobConfiguration jobConf = getConfiguration(project, jobType, true);
        switch(jobType) {
            case SPARK:
            case PYSPARK:
                if (Strings.isNullOrEmpty(path) || !(path.endsWith(".jar") || path.endsWith(".py") || path.endsWith(".ipynb"))) {
                    throw new IllegalArgumentException("Path does not point to a .jar, .py or .ipynb file.");
                }
                return sparkController.inspectProgram((SparkJobConfiguration) jobConf, path, udfso);
            case FLINK:
                return jobConf;
            default:
                throw new IllegalArgumentException("Job type not supported: " + jobType);
        }
    } finally {
        if (udfso != null) {
            dfs.closeDfsClient(udfso);
        }
    }
}
Also used : DistributedFileSystemOps(io.hops.hopsworks.common.hdfs.DistributedFileSystemOps) DefaultJobConfiguration(io.hops.hopsworks.persistence.entity.project.jobs.DefaultJobConfiguration) SparkJobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration) FlinkJobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.flink.FlinkJobConfiguration) JobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.JobConfiguration) TransactionAttribute(javax.ejb.TransactionAttribute)
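
A hedged caller sketch, not part of the source; the HDFS path is illustrative only. inspectProgram resolves the default configuration via getConfiguration above and, for Spark/PySpark, delegates to sparkController, accepting only .jar, .py and .ipynb paths.

JobConfiguration inspectPySparkProgram(JobController jobController, Project project, Users user) throws JobException {
    // illustrative path; anything without a .jar, .py or .ipynb extension throws IllegalArgumentException
    String appPath = "hdfs:///Projects/demo/Resources/job.py";
    return jobController.inspectProgram(appPath, project, user, JobType.PYSPARK);
}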

Example 8 with SparkJobConfiguration

Use of io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration in project hopsworks by logicalclocks.

Class JobController, method putJob.

public Jobs putJob(Users user, Project project, Jobs job, JobConfiguration config) throws JobException {
    try {
        if (config.getJobType() == JobType.SPARK || config.getJobType() == JobType.PYSPARK) {
            SparkConfigurationUtil sparkConfigurationUtil = new SparkConfigurationUtil();
            SparkJobConfiguration sparkJobConfiguration = (SparkJobConfiguration) config;
            sparkConfigurationUtil.validateExecutorMemory(sparkJobConfiguration.getExecutorMemory(), settings);
        }
        job = jobFacade.put(user, project, config, job);
    } catch (IllegalStateException ise) {
        if (ise.getCause() instanceof JAXBException) {
            throw new JobException(RESTCodes.JobErrorCode.JOB_CONFIGURATION_CONVERT_TO_JSON_ERROR, Level.FINE, "Unable to create json from JobConfiguration", ise.getMessage(), ise);
        } else {
            throw ise;
        }
    }
    if (config.getSchedule() != null) {
        scheduler.scheduleJobPeriodic(job);
    }
    activityFacade.persistActivity(ActivityFacade.CREATED_JOB + getJobNameForActivity(job.getName()), project, user, ActivityFlag.JOB);
    return job;
}
Also used : JobException(io.hops.hopsworks.exceptions.JobException) SparkJobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration) JAXBException(javax.xml.bind.JAXBException) SparkConfigurationUtil(io.hops.hopsworks.common.util.SparkConfigurationUtil)
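
A sketch of how putJob might be called to create a new Spark job, assuming the caller already holds the Users and Project entities; the job name and application path are illustrative. Passing null as the existing job creates a new entry, and the executor memory is validated before the configuration is persisted.

Jobs createSparkJob(JobController jobController, Users user, Project project) throws JobException {
    SparkJobConfiguration conf = new SparkJobConfiguration();
    conf.setAppName("sample-spark-job");                        // illustrative job name
    conf.setAppPath("hdfs:///Projects/demo/Resources/app.jar"); // illustrative program path
    // job = null creates a new job rather than updating an existing one
    return jobController.putJob(user, project, null, conf);
}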

Example 9 with SparkJobConfiguration

Use of io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration in project hopsworks by logicalclocks.

Class FsJobManagerController, method configureJob.

private Jobs configureJob(Users user, Project project, SparkJobConfiguration sparkJobConfiguration, String jobName, String defaultArgs) throws JobException {
    if (sparkJobConfiguration == null) {
        // set defaults for spark job size
        sparkJobConfiguration = new SparkJobConfiguration();
    }
    sparkJobConfiguration.setAppName(jobName);
    sparkJobConfiguration.setMainClass(Settings.SPARK_PY_MAINCLASS);
    sparkJobConfiguration.setAppPath(settings.getFSJobUtilPath());
    sparkJobConfiguration.setDefaultArgs(defaultArgs);
    return jobController.putJob(user, project, null, sparkJobConfiguration);
}
Also used : SparkJobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration)
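
Hypothetical call sites inside FsJobManagerController, not taken from the source; job names and arguments are illustrative. They show the two ways the private method above can be driven: with a caller-supplied SparkJobConfiguration, or with null so that a default-sized configuration is created before being pointed at the feature-store utility program.

// Hypothetical invocations; names and arguments are illustrative.
SparkJobConfiguration callerConf = new SparkJobConfiguration();  // caller-supplied sizing, if any
Jobs customJob = configureJob(user, project, callerConf, "demo_fg_1_backfill", "-op insert");
// Passing null falls back to a default-sized SparkJobConfiguration:
Jobs defaultJob = configureJob(user, project, null, "demo_fg_2_backfill", "-op insert");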

Example 10 with SparkJobConfiguration

Use of io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration in project hopsworks by logicalclocks.

Class ModelsController, method versionProgram.

public String versionProgram(Accessor accessor, String jobName, String kernelId, String modelName, int modelVersion) throws JobException, ServiceException {
    if (!Strings.isNullOrEmpty(jobName)) {
        // model in job
        Jobs experimentJob = jobController.getJob(accessor.experimentProject, jobName);
        switch(experimentJob.getJobType()) {
            case SPARK:
            case PYSPARK:
                {
                    SparkJobConfiguration sparkJobConf = (SparkJobConfiguration) experimentJob.getJobConfig();
                    String suffix = sparkJobConf.getAppPath().substring(sparkJobConf.getAppPath().lastIndexOf("."));
                    String relativePath = Settings.HOPS_MODELS_DATASET + "/" + modelName + "/" + modelVersion + "/program" + suffix;
                    Path path = new Path(Utils.getProjectPath(accessor.modelProject.getName()) + relativePath);
                    jobController.versionProgram(sparkJobConf.getAppPath(), accessor.udfso, path);
                    return relativePath;
                }
            case PYTHON:
                {
                    throw new IllegalArgumentException("python jobs unavailable in community");
                }
            default:
                throw new IllegalArgumentException("cannot version program for job type:" + experimentJob.getJobType());
        }
    } else {
        // model in jupyter
        String relativePath = Settings.HOPS_MODELS_DATASET + "/" + modelName + "/" + modelVersion + "/program.ipynb";
        Path path = new Path(Utils.getProjectPath(accessor.modelProject.getName()) + relativePath);
        jupyterController.versionProgram(accessor.hdfsUser, kernelId, path, accessor.udfso);
        return relativePath;
    }
}
Also used : Path(org.apache.hadoop.fs.Path) DatasetPath(io.hops.hopsworks.common.dataset.util.DatasetPath) Jobs(io.hops.hopsworks.persistence.entity.jobs.description.Jobs) SparkJobConfiguration(io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration)
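
A hedged usage sketch, not from the source; the job name, kernel id, model name and version are illustrative, and the calls are assumed to sit in a method declaring JobException and ServiceException. When jobName is set, the Spark/PySpark job's program is versioned under the Models dataset; when it is empty, the running Jupyter notebook identified by kernelId is versioned instead.

// Job-backed model: kernelId is unused in this branch, so null is fine here.
String jobProgram = modelsController.versionProgram(accessor, "mnist_training_job", null, "mnist_cnn", 3);
// Notebook-backed model: an empty jobName routes to the Jupyter branch, versioning the notebook of kernelId.
String notebookProgram = modelsController.versionProgram(accessor, null, "kernel-1234", "mnist_cnn", 3);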

Aggregations

SparkJobConfiguration (io.hops.hopsworks.persistence.entity.jobs.configuration.spark.SparkJobConfiguration) 18
JobException (io.hops.hopsworks.exceptions.JobException) 4
IOException (java.io.IOException) 4
ProjectException (io.hops.hopsworks.exceptions.ProjectException) 3
Jobs (io.hops.hopsworks.persistence.entity.jobs.description.Jobs) 3
DefaultJobConfiguration (io.hops.hopsworks.persistence.entity.project.jobs.DefaultJobConfiguration) 3
HashMap (java.util.HashMap) 3
TransactionAttribute (javax.ejb.TransactionAttribute) 3
Path (org.apache.hadoop.fs.Path) 3
DatasetPath (io.hops.hopsworks.common.dataset.util.DatasetPath) 2
SparkConfigurationUtil (io.hops.hopsworks.common.util.SparkConfigurationUtil) 2
ConfigProperty (io.hops.hopsworks.common.util.templates.ConfigProperty) 2
Inode (io.hops.hopsworks.persistence.entity.hdfs.inode.Inode) 2
FlinkJobConfiguration (io.hops.hopsworks.persistence.entity.jobs.configuration.flink.FlinkJobConfiguration) 2
Execution (io.hops.hopsworks.persistence.entity.jobs.history.Execution) 2
JAXBException (javax.xml.bind.JAXBException) 2
ServiceDiscoveryException (com.logicalclocks.servicediscoverclient.exceptions.ServiceDiscoveryException) 1
Service (com.logicalclocks.servicediscoverclient.service.Service) 1
TemplateException (freemarker.template.TemplateException) 1
DistributedFileSystemOps (io.hops.hopsworks.common.hdfs.DistributedFileSystemOps) 1