Example 1 with DistributedFileSystemOps

use of io.hops.hopsworks.common.hdfs.DistributedFileSystemOps in project hopsworks by logicalclocks.

the class ModelUtils method getModelsAccessor.

public ModelsController.Accessor getModelsAccessor(Users user, Project userProject, Project modelProject, Project experimentProject) throws DatasetException {
    DistributedFileSystemOps udfso = null;
    try {
        String hdfsUser = hdfsUsersController.getHdfsUserName(experimentProject, user);
        udfso = dfs.getDfsOps(hdfsUser);
        return new ModelsController.Accessor(user, userProject, modelProject, experimentProject, udfso, hdfsUser);
    } catch (Throwable t) {
        // release the client before translating the failure into a DatasetException
        if (udfso != null) {
            dfs.closeDfsClient(udfso);
        }
        throw new DatasetException(RESTCodes.DatasetErrorCode.DATASET_OPERATION_ERROR, Level.INFO);
    }
}
Also used : DistributedFileSystemOps(io.hops.hopsworks.common.hdfs.DistributedFileSystemOps) DatasetException(io.hops.hopsworks.exceptions.DatasetException)
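
Example 1 shows the lifecycle convention that recurs throughout this page: obtain a DistributedFileSystemOps handle for the acting HDFS user, use it, and always return it via closeDfsClient. A minimal sketch of that pattern in isolation, assuming the same injected dfs and hdfsUsersController beans as in the examples here; doWork is a hypothetical placeholder:

public void withDfsOps(Project project, Users user) throws Exception {
    // resolve the project-specific HDFS user and open a client impersonating it
    String hdfsUser = hdfsUsersController.getHdfsUserName(project, user);
    DistributedFileSystemOps udfso = dfs.getDfsOps(hdfsUser);
    try {
        doWork(udfso); // hypothetical placeholder for the actual HDFS operations
    } finally {
        dfs.closeDfsClient(udfso); // always release the client, even when doWork throws
    }
}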

Example 2 with DistributedFileSystemOps

use of io.hops.hopsworks.common.hdfs.DistributedFileSystemOps in project hopsworks by logicalclocks.

the class JupyterService method startNotebookServer.

@POST
@Path("/start")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
@AllowedProjectRoles({ AllowedProjectRoles.DATA_OWNER, AllowedProjectRoles.DATA_SCIENTIST })
@JWTRequired(acceptedTokens = { Audience.API }, allowedUserRoles = { "HOPS_ADMIN", "HOPS_USER" })
public Response startNotebookServer(JupyterSettings jupyterSettings, @Context HttpServletRequest req, @Context SecurityContext sc, @Context UriInfo uriInfo) throws ProjectException, HopsSecurityException, ServiceException, GenericException, JobException {
    Users hopsworksUser = jWTHelper.getUserPrincipal(sc);
    String hdfsUser = hdfsUsersController.getHdfsUserName(project, hopsworksUser);
    // set the user on the settings object if it was not sent from the front-end
    if (jupyterSettings.getUsers() == null) {
        jupyterSettings.setUsers(hopsworksUser);
    }
    if (project.getPaymentType().equals(PaymentType.PREPAID)) {
        YarnProjectsQuota projectQuota = yarnProjectsQuotaFacade.findByProjectName(project.getName());
        if (projectQuota == null || projectQuota.getQuotaRemaining() <= 0) {
            throw new ProjectException(RESTCodes.ProjectErrorCode.PROJECT_QUOTA_ERROR, Level.FINE);
        }
    }
    if (project.getPythonEnvironment() == null) {
        throw new ProjectException(RESTCodes.ProjectErrorCode.ANACONDA_NOT_ENABLED, Level.FINE);
    }
    if (jupyterSettings.getMode() == null) {
        // set default mode for jupyter if mode is null
        jupyterSettings.setMode(JupyterMode.JUPYTER_LAB);
    }
    // Jupyter Git works only for JupyterLab
    if (jupyterSettings.isGitBackend() && jupyterSettings.getMode().equals(JupyterMode.JUPYTER_CLASSIC)) {
        throw new ServiceException(RESTCodes.ServiceErrorCode.JUPYTER_START_ERROR, Level.FINE, "Git support available only in JupyterLab");
    }
    // Do not allow auto push on shutdown if api key is missing
    GitConfig gitConfig = jupyterSettings.getGitConfig();
    if (jupyterSettings.isGitBackend() && gitConfig.getShutdownAutoPush() && Strings.isNullOrEmpty(gitConfig.getApiKeyName())) {
        throw new ServiceException(RESTCodes.ServiceErrorCode.JUPYTER_START_ERROR, Level.FINE, "Auto push not supported if api key is not configured.");
    }
    // Verify that API token has got write access on the repo if ShutdownAutoPush is enabled
    if (jupyterSettings.isGitBackend() && gitConfig.getShutdownAutoPush() && !jupyterNbVCSController.hasWriteAccess(hopsworksUser, gitConfig.getApiKeyName(), gitConfig.getRemoteGitURL(), gitConfig.getGitBackend())) {
        throw new ServiceException(RESTCodes.ServiceErrorCode.JUPYTER_START_ERROR, Level.FINE, "API token " + gitConfig.getApiKeyName() + " does not have write access on " + gitConfig.getRemoteGitURL());
    }
    JupyterProject jp = jupyterFacade.findByUser(hdfsUser);
    if (jp == null) {
        HdfsUsers user = hdfsUsersFacade.findByName(hdfsUser);
        String configSecret = DigestUtils.sha256Hex(Integer.toString(ThreadLocalRandom.current().nextInt()));
        JupyterDTO dto = null;
        DistributedFileSystemOps dfso = dfsService.getDfsOps();
        String allowOriginHost = uriInfo.getBaseUri().getHost();
        int allowOriginPort = uriInfo.getBaseUri().getPort();
        String allowOriginPortStr = allowOriginPort != -1 ? ":" + allowOriginPort : "";
        String allowOrigin = settings.getJupyterOriginScheme() + "://" + allowOriginHost + allowOriginPortStr;
        try {
            jupyterSettingsFacade.update(jupyterSettings);
            // Inspect dependencies
            sparkController.inspectDependencies(project, hopsworksUser, (SparkJobConfiguration) jupyterSettings.getJobConfig());
            dto = jupyterManager.startJupyterServer(project, configSecret, hdfsUser, hopsworksUser, jupyterSettings, allowOrigin);
            jupyterJWTManager.materializeJWT(hopsworksUser, project, jupyterSettings, dto.getCid(), dto.getPort(), JUPYTER_JWT_AUD);
            HopsUtils.materializeCertificatesForUserCustomDir(project.getName(), user.getUsername(), settings.getHdfsTmpCertDir(), dfso, certificateMaterializer, settings, dto.getCertificatesDir());
            jupyterManager.waitForStartup(project, hopsworksUser);
        } catch (ServiceException | TimeoutException ex) {
            if (dto != null) {
                jupyterController.shutdownQuietly(project, hdfsUser, hopsworksUser, configSecret, dto.getCid(), dto.getPort());
            }
            throw new ServiceException(RESTCodes.ServiceErrorCode.JUPYTER_START_ERROR, Level.SEVERE, ex.getMessage(), null, ex);
        } catch (IOException ex) {
            if (dto != null) {
                jupyterController.shutdownQuietly(project, hdfsUser, hopsworksUser, configSecret, dto.getCid(), dto.getPort());
            }
            throw new HopsSecurityException(RESTCodes.SecurityErrorCode.CERT_MATERIALIZATION_ERROR, Level.SEVERE, ex.getMessage(), null, ex);
        } finally {
            if (dfso != null) {
                dfsService.closeDfsClient(dfso);
            }
        }
        String externalIp = Ip.getHost(req.getRequestURL().toString());
        try {
            Date expirationDate = new Date();
            Calendar cal = Calendar.getInstance();
            cal.setTime(expirationDate);
            cal.add(Calendar.HOUR_OF_DAY, jupyterSettings.getShutdownLevel());
            expirationDate = cal.getTime();
            jp = jupyterFacade.saveServer(externalIp, project, configSecret, dto.getPort(), user.getId(), dto.getToken(), dto.getCid(), expirationDate, jupyterSettings.isNoLimit());
            // set minutes left until notebook server is killed
            Duration durationLeft = Duration.between(new Date().toInstant(), jp.getExpires().toInstant());
            jp.setMinutesUntilExpiration(durationLeft.toMinutes());
        } catch (Exception e) {
            LOGGER.log(Level.SEVERE, "Failed to save Jupyter notebook settings", e);
            jupyterController.shutdownQuietly(project, hdfsUser, hopsworksUser, configSecret, dto.getCid(), dto.getPort());
        }
        if (jp == null) {
            throw new ServiceException(RESTCodes.ServiceErrorCode.JUPYTER_SAVE_SETTINGS_ERROR, Level.SEVERE);
        }
        if (jupyterSettings.isGitBackend()) {
            try {
                // Init is idempotent, calling it on an already initialized repo won't affect it
                jupyterNbVCSController.init(jp, jupyterSettings);
                if (jupyterSettings.getGitConfig().getStartupAutoPull()) {
                    jupyterNbVCSController.pull(jp, jupyterSettings);
                }
            } catch (ServiceException ex) {
                jupyterController.shutdownQuietly(project, hdfsUser, hopsworksUser, configSecret, dto.getCid(), dto.getPort());
                throw ex;
            }
        }
    } else {
        throw new ServiceException(RESTCodes.ServiceErrorCode.JUPYTER_SERVER_ALREADY_RUNNING, Level.FINE);
    }
    return noCacheResponse.getNoCacheResponseBuilder(Response.Status.OK).entity(jp).build();
}
Also used : DistributedFileSystemOps(io.hops.hopsworks.common.hdfs.DistributedFileSystemOps) Calendar(java.util.Calendar) JupyterProject(io.hops.hopsworks.persistence.entity.jupyter.JupyterProject) Duration(java.time.Duration) HdfsUsers(io.hops.hopsworks.persistence.entity.hdfs.user.HdfsUsers) Users(io.hops.hopsworks.persistence.entity.user.Users) IOException(java.io.IOException) HdfsUsers(io.hops.hopsworks.persistence.entity.hdfs.user.HdfsUsers) Date(java.util.Date) TimeoutException(java.util.concurrent.TimeoutException) ProjectException(io.hops.hopsworks.exceptions.ProjectException) JobException(io.hops.hopsworks.exceptions.JobException) GenericException(io.hops.hopsworks.exceptions.GenericException) HopsSecurityException(io.hops.hopsworks.exceptions.HopsSecurityException) ElasticException(io.hops.hopsworks.exceptions.ElasticException) IOException(java.io.IOException) ServiceException(io.hops.hopsworks.exceptions.ServiceException) HopsSecurityException(io.hops.hopsworks.exceptions.HopsSecurityException) ProjectException(io.hops.hopsworks.exceptions.ProjectException) ServiceException(io.hops.hopsworks.exceptions.ServiceException) GitConfig(io.hops.hopsworks.persistence.entity.jupyter.config.GitConfig) YarnProjectsQuota(io.hops.hopsworks.persistence.entity.jobs.quota.YarnProjectsQuota) JupyterDTO(io.hops.hopsworks.common.dao.jupyter.config.JupyterDTO) TimeoutException(java.util.concurrent.TimeoutException) Path(javax.ws.rs.Path) POST(javax.ws.rs.POST) Consumes(javax.ws.rs.Consumes) Produces(javax.ws.rs.Produces) JWTRequired(io.hops.hopsworks.jwt.annotation.JWTRequired) AllowedProjectRoles(io.hops.hopsworks.api.filter.AllowedProjectRoles)
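
The expiration timestamp in this example is computed with the legacy Calendar API: the current time plus jupyterSettings.getShutdownLevel() hours. A java.time equivalent, shown purely as an illustration (the production code above keeps Calendar):

import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Date;

public class ExpirationSketch {
    // illustrative java.time equivalent of the Calendar arithmetic above
    static Date computeExpiration(int shutdownLevelHours) {
        Instant expires = Instant.now().plus(shutdownLevelHours, ChronoUnit.HOURS);
        return Date.from(expires);
    }
}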

Example 3 with DistributedFileSystemOps

use of io.hops.hopsworks.common.hdfs.DistributedFileSystemOps in project hopsworks by logicalclocks.

the class XAttrsResource method get.

@ApiOperation(value = "Get extended attributes attached to a path.", response = XAttrDTO.class)
@GET
@Path("{path: .+}")
@Produces(MediaType.APPLICATION_JSON)
@AllowedProjectRoles({ AllowedProjectRoles.DATA_SCIENTIST, AllowedProjectRoles.DATA_OWNER })
@JWTRequired(acceptedTokens = { Audience.API }, allowedUserRoles = { "HOPS_ADMIN", "HOPS_USER" })
public Response get(@Context SecurityContext sc, @Context UriInfo uriInfo, @PathParam("path") String path, @QueryParam("pathType") @DefaultValue("DATASET") DatasetType pathType, @QueryParam("name") String xattrName) throws DatasetException, MetadataException {
    Users user = jWTHelper.getUserPrincipal(sc);
    Map<String, String> result = new HashMap<>();
    // resolve the dataset path first so the DFS client cannot leak if resolution fails
    String inodePath = datasetHelper.getDatasetPathIfFileExist(project, path, pathType).getFullPath().toString();
    DistributedFileSystemOps udfso = dfs.getDfsOps(hdfsUsersController.getHdfsUserName(project, user));
    try {
        if (xattrName != null) {
            String xattr = xattrsController.getXAttr(inodePath, xattrName, udfso);
            if (Strings.isNullOrEmpty(xattr)) {
                throw new MetadataException(RESTCodes.MetadataErrorCode.METADATA_MISSING_FIELD, Level.FINE);
            }
            result.put(xattrName, xattr);
        } else {
            result.putAll(xattrsController.getXAttrs(inodePath, udfso));
        }
    } finally {
        dfs.closeDfsClient(udfso);
    }
    ResourceRequest resourceRequest = new ResourceRequest(ResourceRequest.Name.XATTRS);
    XAttrDTO dto = xattrsBuilder.build(uriInfo, resourceRequest, project, inodePath, result);
    return Response.ok().entity(dto).build();
}
Also used : HashMap(java.util.HashMap) DistributedFileSystemOps(io.hops.hopsworks.common.hdfs.DistributedFileSystemOps) Users(io.hops.hopsworks.persistence.entity.user.Users) ResourceRequest(io.hops.hopsworks.common.api.ResourceRequest) MetadataException(io.hops.hopsworks.exceptions.MetadataException) Path(javax.ws.rs.Path) Produces(javax.ws.rs.Produces) GET(javax.ws.rs.GET) JWTRequired(io.hops.hopsworks.jwt.annotation.JWTRequired) ApiOperation(io.swagger.annotations.ApiOperation) AllowedProjectRoles(io.hops.hopsworks.api.filter.AllowedProjectRoles)
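
The getXAttr/getXAttrs calls in this example presumably delegate to the Hadoop extended-attribute API. A plain-Hadoop sketch of the same two reads, independent of the Hopsworks wrappers (the path and attribute name are made-up values):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class XAttrReadSketch {
    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        Path path = new Path("/Projects/demo/some_dataset"); // made-up path
        // read a single attribute by name (HDFS xattr names carry a namespace prefix)
        byte[] value = fs.getXAttr(path, "user.myattr");
        System.out.println(new String(value, StandardCharsets.UTF_8));
        // read all attributes attached to the path
        Map<String, byte[]> all = fs.getXAttrs(path);
        all.forEach((k, v) -> System.out.println(k + " = " + new String(v, StandardCharsets.UTF_8)));
    }
}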

Example 4 with DistributedFileSystemOps

use of io.hops.hopsworks.common.hdfs.DistributedFileSystemOps in project hopsworks by logicalclocks.

the class TrainingDatasetController method delete.

public String delete(Users user, Project project, Featurestore featurestore, Integer trainingDatasetId) throws FeaturestoreException {
    TrainingDataset trainingDataset = trainingDatasetFacade.findByIdAndFeaturestore(trainingDatasetId, featurestore).orElseThrow(() -> new FeaturestoreException(RESTCodes.FeaturestoreErrorCode.TRAINING_DATASET_NOT_FOUND, Level.FINE, "training dataset id:" + trainingDatasetId));
    featurestoreUtils.verifyUserRole(trainingDataset, featurestore, user, project);
    statisticsController.deleteStatistics(project, user, trainingDataset);
    String dsPath = getTrainingDatasetInodePath(trainingDataset);
    String username = hdfsUsersBean.getHdfsUserName(project, user);
    // we rely on the foreign keys to cascade from inode -> external/hopsfs td -> training dataset
    DistributedFileSystemOps udfso = dfs.getDfsOps(username);
    try {
        // TODO(Fabio): if Data owner *In project* do operation as superuser
        udfso.rm(dsPath, true);
    } catch (IOException e) {
        // removal failures are ignored here; metadata cleanup relies on the cascading deletes noted above
    } finally {
        if (udfso != null) {
            dfs.closeDfsClient(udfso);
        }
    }
    return trainingDataset.getName();
}
Also used : TrainingDataset(io.hops.hopsworks.persistence.entity.featurestore.trainingdataset.TrainingDataset) HopsfsTrainingDataset(io.hops.hopsworks.persistence.entity.featurestore.trainingdataset.hopsfs.HopsfsTrainingDataset) ExternalTrainingDataset(io.hops.hopsworks.persistence.entity.featurestore.trainingdataset.external.ExternalTrainingDataset) DistributedFileSystemOps(io.hops.hopsworks.common.hdfs.DistributedFileSystemOps) IOException(java.io.IOException) FeaturestoreException(io.hops.hopsworks.exceptions.FeaturestoreException)
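
The udfso.rm(dsPath, true) call performs the file-system removal; the boolean flag is presumably the recursive switch, matching plain Hadoop's FileSystem.delete. An illustrative plain-Hadoop sketch with a made-up path:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RecursiveDeleteSketch {
    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        // true = recursive: removes the directory and everything beneath it
        boolean deleted = fs.delete(new Path("/Projects/demo/Training_Datasets/td_1"), true);
        System.out.println("deleted: " + deleted);
    }
}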

Example 5 with DistributedFileSystemOps

use of io.hops.hopsworks.common.hdfs.DistributedFileSystemOps in project hopsworks by logicalclocks.

the class GitController method clone.

public GitOpExecution clone(CloneCommandConfiguration cloneConfigurationDTO, Project project, Users hopsworksUser) throws IllegalArgumentException, GitOpException, HopsSecurityException, DatasetException {
    commandConfigurationValidator.verifyCloneOptions(cloneConfigurationDTO);
    // create the repository directory first: go-git does not create it, so it must exist before the clone runs
    String fullRepoDirPath = cloneConfigurationDTO.getPath() + File.separator + commandConfigurationValidator.getRepositoryName(cloneConfigurationDTO.getUrl());
    DistributedFileSystemOps udfso = dfsService.getDfsOps(hdfsUsersController.getHdfsUserName(project, hopsworksUser));
    try {
        datasetController.createSubDirectory(project, new Path(fullRepoDirPath), udfso);
    } finally {
        // Close the udfso
        dfsService.closeDfsClient(udfso);
    }
    Inode inode = inodeController.getInodeAtPath(fullRepoDirPath);
    GitRepository repository = gitRepositoryFacade.create(inode, project, cloneConfigurationDTO.getProvider(), hopsworksUser);
    // Create the default remote
    gitRepositoryRemotesFacade.save(new GitRepositoryRemote(repository, Constants.REPOSITORY_DEFAULT_REMOTE_NAME, cloneConfigurationDTO.getUrl()));
    GitCommandConfiguration configuration = new GitCommandConfigurationBuilder().setCommandType(GitCommandType.CLONE).setUrl(cloneConfigurationDTO.getUrl()).setProvider(cloneConfigurationDTO.getProvider()).setPath(fullRepoDirPath).setBranchName(cloneConfigurationDTO.getBranch()).build();
    return executionController.createExecution(configuration, project, hopsworksUser, repository);
}
Also used : Path(org.apache.hadoop.fs.Path) GitRepository(io.hops.hopsworks.persistence.entity.git.GitRepository) Inode(io.hops.hopsworks.persistence.entity.hdfs.inode.Inode) GitRepositoryRemote(io.hops.hopsworks.persistence.entity.git.GitRepositoryRemote) GitCommandConfiguration(io.hops.hopsworks.persistence.entity.git.config.GitCommandConfiguration) DistributedFileSystemOps(io.hops.hopsworks.common.hdfs.DistributedFileSystemOps)
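
The repository directory path is assembled from the requested parent path and the repository name that the validator derives from the clone URL. A standalone sketch of that composition with made-up values:

import java.io.File;

public class RepoPathSketch {
    public static void main(String[] args) {
        // hypothetical stand-ins for cloneConfigurationDTO.getPath() and the
        // repository name derived from the clone URL
        String parentPath = "/Projects/demo/Jupyter";
        String repoName = "hopsworks";
        String fullRepoDirPath = parentPath + File.separator + repoName;
        // the directory must exist before the clone executes, since the go-git
        // based backend does not create it itself
        System.out.println(fullRepoDirPath);
    }
}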

Aggregations

DistributedFileSystemOps (io.hops.hopsworks.common.hdfs.DistributedFileSystemOps) - 93
IOException (java.io.IOException) - 51
Path (org.apache.hadoop.fs.Path) - 28
DatasetException (io.hops.hopsworks.exceptions.DatasetException) - 18
FeaturestoreException (io.hops.hopsworks.exceptions.FeaturestoreException) - 18
ServiceException (io.hops.hopsworks.exceptions.ServiceException) - 13
Project (io.hops.hopsworks.persistence.entity.project.Project) - 12
Inode (io.hops.hopsworks.persistence.entity.hdfs.inode.Inode) - 11
HopsSecurityException (io.hops.hopsworks.exceptions.HopsSecurityException) - 10
ArrayList (java.util.ArrayList) - 10
Produces (javax.ws.rs.Produces) - 10
Users (io.hops.hopsworks.persistence.entity.user.Users) - 9
Dataset (io.hops.hopsworks.persistence.entity.dataset.Dataset) - 8
AllowedProjectRoles (io.hops.hopsworks.api.filter.AllowedProjectRoles) - 7
JobException (io.hops.hopsworks.exceptions.JobException) - 7
ProjectException (io.hops.hopsworks.exceptions.ProjectException) - 7
Date (java.util.Date) - 7
HashMap (java.util.HashMap) - 7
Path (javax.ws.rs.Path) - 7
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream) - 7