Example 6 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class KettleAnalyzerUtil method normalizeFilePath.

/**
 * Utility method for normalizing file paths used in Metaverse Id generation. It will convert a valid path into a
 * consistent path regardless of URI notation or filesystem absolute path.
 *
 * @param filePath full path to normalize
 * @return the normalized path
 */
public static String normalizeFilePath(String filePath) throws MetaverseException {
    try {
        String path = filePath;
        FileObject fo = KettleVFS.getFileObject(filePath);
        try {
            path = fo.getURL().getPath();
        } catch (Throwable t) {
            // Something went wrong with VFS, just try the filePath
        }
        File f = new File(path);
        return f.getAbsolutePath();
    } catch (Exception e) {
        throw new MetaverseException(e);
    }
}
Also used : FileObject(org.apache.commons.vfs2.FileObject) File(java.io.File) KettleException(org.pentaho.di.core.exception.KettleException) MetaverseException(org.pentaho.metaverse.api.MetaverseException) KettleFileException(org.pentaho.di.core.exception.KettleFileException)
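
The fallback idea in normalizeFilePath (try to unwrap URI notation, and if that fails just use the raw string) can be sketched without Kettle on the classpath. The following is only an illustration under assumptions: the class PathNormalizerSketch and its normalize method are hypothetical names, and java.net.URI stands in for KettleVFS, so only file: URIs are unwrapped here.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch, not part of pentaho-metaverse.
final class PathNormalizerSketch {

    /** Returns an absolute filesystem path for either a file: URI or a plain path. */
    static String normalize(String filePath) {
        String path = filePath;
        try {
            URI uri = new URI(filePath);
            // Only unwrap file: URIs; anything else is treated as a plain filesystem path
            if ("file".equalsIgnoreCase(uri.getScheme()) && uri.getPath() != null) {
                path = uri.getPath();
            }
        } catch (URISyntaxException e) {
            // Not parseable as a URI (for example, it contains spaces), so fall back to the raw string
        }
        return new File(path).getAbsolutePath();
    }

    public static void main(String[] args) {
        // Both notations should resolve to the same absolute path
        System.out.println(normalize("file:///tmp/sample.ktr"));
        System.out.println(normalize("/tmp/sample.ktr"));
    }
}
```

As in the original method, a failure while interpreting the URI form is not fatal; the input string is simply used as the path.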

Example 7 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class LineageClient method getOriginSteps.

/**
 * Finds the step(s) in the given transformation that created the given field, with respect to the given target step.
 * This means if a field has been renamed or derived from another field from another step, then the lineage graph
 * is traversed back from the target step to determine which steps contributed to the field in the target step.
 * This differs from getCreatorSteps() as the lineage graph traversal will not stop with a "creates" relationship;
 * rather, this method will traverse other relationships (e.g. "uses", "derives") to find the actual origin fields
 * that comprise the final field in the target step.
 *
 * @param transMeta      a reference to a transformation's metadata
 * @param targetStepName the target step name associated with the given field names
 * @param fieldNames     a collection of field names associated with the target step, for which to find the step(s)
 *                       and field(s) that contributed to those fields
 * @return a map from target field name to step-field objects, where each step has created a field with
 * the returned name, and that field has contributed in some way to the specified target field.
 * @throws MetaverseException if an error occurred while finding the origin steps
 */
@Override
public Map<String, Set<StepField>> getOriginSteps(TransMeta transMeta, String targetStepName, Collection<String> fieldNames) throws MetaverseException {
    Map<String, Set<StepField>> originStepsMap = new HashMap<>();
    try {
        Future<Graph> lineageGraphTask = LineageGraphMap.getInstance().get(transMeta);
        if (lineageGraphTask != null) {
            Graph lineageGraph = lineageGraphTask.get();
            List<Vertex> targetFields = getTargetFields(lineageGraph, targetStepName, fieldNames);
            GremlinPipeline pipe = getOriginStepsPipe(targetFields);
            List<List<Vertex>> pathList = pipe.toList();
            if (pathList != null) {
                for (List<Vertex> path : pathList) {
                    // The first vertex of each path is the target field and the last vertex is an origin field;
                    // record the origin's step and field name under the target field's name
                    String targetField = path.get(0).getProperty(DictionaryConst.PROPERTY_NAME);
                    Set<StepField> pathSet = originStepsMap.get(targetField);
                    if (pathSet == null) {
                        pathSet = new HashSet<>();
                        originStepsMap.put(targetField, pathSet);
                    }
                    Vertex v = path.get(path.size() - 1);
                    Map<String, String> stepField = STEPFIELDOPS_PIPE_FUNC.compute(v);
                    String stepName = stepField.get("stepName");
                    String fieldName = stepField.get("fieldName");
                    pathSet.add(new StepField(stepName, fieldName));
                }
            }
        }
    } catch (Exception e) {
        throw new MetaverseException(e);
    }
    return originStepsMap;
}
Also used : Vertex(com.tinkerpop.blueprints.Vertex) GremlinPipeline(com.tinkerpop.gremlin.java.GremlinPipeline) HashSet(java.util.HashSet) Set(java.util.Set) HashMap(java.util.HashMap) MetaverseException(org.pentaho.metaverse.api.MetaverseException) Graph(com.tinkerpop.blueprints.Graph) StepField(org.pentaho.metaverse.api.StepField) ArrayList(java.util.ArrayList) List(java.util.List)
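
The grouping step in getOriginSteps (first element of each path is the target, last element is the origin) can be shown in isolation. This is a minimal sketch under assumptions: plain strings replace Vertex and StepField, the class and method names are hypothetical, and computeIfAbsent is used in place of the explicit null check in the original.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of pentaho-metaverse.
final class OriginGroupingSketch {

    /** Groups the last element of each path (the origin) under the first element (the target). */
    static Map<String, Set<String>> groupOrigins(List<List<String>> paths) {
        Map<String, Set<String>> originsByTarget = new HashMap<>();
        for (List<String> path : paths) {
            if (path.isEmpty()) {
                continue; // nothing to record for an empty path
            }
            String target = path.get(0);
            String origin = path.get(path.size() - 1);
            // Create the set on first use, then add the origin to it
            originsByTarget.computeIfAbsent(target, k -> new HashSet<>()).add(origin);
        }
        return originsByTarget;
    }

    public static void main(String[] args) {
        List<List<String>> paths = Arrays.asList(
            Arrays.asList("total", "subtotal", "price"),
            Arrays.asList("total", "tax"),
            Arrays.asList("name"));
        System.out.println(groupOrigins(paths));
    }
}
```

A path with a single element maps a field to itself, which corresponds to a field that was created in the target step.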

Example 8 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class LineageClient method getOperationPaths.

/**
 * Returns the paths between the origin field(s) and target field(s). A path in this context is an ordered list of
 * StepFieldOperations objects, each of which corresponds to a field at a certain step where operation(s) are
 * applied. The order of the list corresponds to the order of the steps from the origin step (see getOriginSteps())
 * to the target step. This method can be used to trace a target field back to its origin and discover what
 * operations were performed upon it during its lifetime. Inversely, the path could be used to re-apply the operations
 * to the origin field, resulting in the field's "value" at each point in the path.
 *
 * @param transMeta      a reference to a transformation's metadata
 * @param targetStepName the target step name associated with the given field names
 * @param fieldNames     a collection of field names associated with the target step, for which to find the step(s)
 *                       and field(s) and operation(s) that contributed to those fields
 * @return a map of target field name to an ordered list of StepFieldOperations objects, describing the path from the
 * origin step field to the target step field, including the operations performed.
 * @throws MetaverseException if an error occurred while finding the operation paths
 */
@Override
public Map<String, Set<List<StepFieldOperations>>> getOperationPaths(TransMeta transMeta, String targetStepName, final Collection<String> fieldNames) throws MetaverseException {
    Map<String, Set<List<StepFieldOperations>>> operationPathMap = new HashMap<>();
    try {
        Future<Graph> lineageGraphTask = LineageGraphMap.getInstance().get(transMeta);
        if (lineageGraphTask != null) {
            Graph lineageGraph = lineageGraphTask.get();
            if (lineageGraph != null) {
                // Get the creator field nodes for all the field names passed in
                List<Vertex> targetFields = getTargetFields(lineageGraph, targetStepName, fieldNames);
                // The "origin steps pipe" returns a pipeline that yields the paths between the origin field nodes
                // and the target field nodes.
                GremlinPipeline pipe = getOriginStepsPipe(targetFields);
                List<List<Vertex>> pathList = pipe.toList();
                if (pathList != null) {
                    for (List<Vertex> path : pathList) {
                        // Transform each path of vertices into a "path" of StepFieldOperations objects (basically save off
                        // properties of each vertex into a new list)
                        List<StepFieldOperations> stepFieldOps = new ArrayList<>();
                        String targetField = path.get(0).getProperty(DictionaryConst.PROPERTY_NAME);
                        Set<List<StepFieldOperations>> pathSet = operationPathMap.get(targetField);
                        if (pathSet == null) {
                            pathSet = new HashSet<>();
                            operationPathMap.put(targetField, pathSet);
                        }
                        for (Vertex v : path) {
                            Map<String, String> stepField = STEPFIELDOPS_PIPE_FUNC.compute(v);
                            String stepName = stepField.get("stepName");
                            String fieldName = stepField.get("fieldName");
                            Operations operations = MetaverseUtil.convertOperationsStringToMap((String) v.getProperty(DictionaryConst.PROPERTY_OPERATIONS));
                            stepFieldOps.add(0, new StepFieldOperations(stepName, fieldName, operations));
                        }
                        pathSet.add(stepFieldOps);
                    }
                }
            }
        }
    } catch (Exception e) {
        throw new MetaverseException(e);
    }
    return operationPathMap;
}
Also used : Vertex(com.tinkerpop.blueprints.Vertex) GremlinPipeline(com.tinkerpop.gremlin.java.GremlinPipeline) HashSet(java.util.HashSet) Set(java.util.Set) HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) MetaverseException(org.pentaho.metaverse.api.MetaverseException) Graph(com.tinkerpop.blueprints.Graph) StepFieldOperations(org.pentaho.metaverse.api.StepFieldOperations) List(java.util.List) Operations(org.pentaho.metaverse.api.model.Operations)
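
In getOperationPaths each path is read target-first, yet the Javadoc promises a list ordered from origin step to target step; the reversal comes from inserting every element at index 0. A minimal sketch of just that ordering trick, assuming strings in place of StepFieldOperations and with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of pentaho-metaverse.
final class PathReversalSketch {

    /** Converts a target-first path into an origin-first list by prepending each element. */
    static List<String> originFirst(List<String> targetFirstPath) {
        List<String> ordered = new ArrayList<>();
        for (String node : targetFirstPath) {
            // Inserting at index 0 pushes earlier elements toward the end
            ordered.add(0, node);
        }
        return ordered;
    }

    public static void main(String[] args) {
        // prints [price, subtotal, total]
        System.out.println(originFirst(Arrays.asList("total", "subtotal", "price")));
    }
}
```

Prepending to an ArrayList shifts its contents on every insert, which is fine for short lineage paths; for long lists, appending and then calling Collections.reverse would avoid the repeated shifting.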

Example 9 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class JobRuntimeExtensionPoint method callExtensionPoint.

/**
 * Callback when a job is about to be started
 *
 * @param logChannelInterface A reference to the log in this context (the Job object's log)
 * @param o                   The object being operated on (Job in this case)
 * @throws org.pentaho.di.core.exception.KettleException
 */
@Override
public void callExtensionPoint(LogChannelInterface logChannelInterface, Object o) throws KettleException {
    // Job Started listeners get called after the extension point is invoked, so just add a job listener
    if (o instanceof Job) {
        Job job = ((Job) o);
        // If runtime lineage collection is disabled, don't run any lineage processes/methods
        if (!isRuntimeEnabled()) {
            return;
        }
        // Create and populate an execution profile with what we know so far
        ExecutionProfile executionProfile = new ExecutionProfile();
        populateExecutionProfile(executionProfile, job);
        IMetaverseBuilder builder = JobLineageHolderMap.getInstance().getMetaverseBuilder(job);
        // Add the job finished listener
        job.addJobListener(this);
        // Analyze the current job
        if (documentAnalyzer != null) {
            documentAnalyzer.setMetaverseBuilder(builder);
            // Create a document for the Job
            final String clientName = executionProfile.getExecutionEngine().getName();
            final INamespace namespace = new Namespace(clientName);
            final IMetaverseNode designNode = builder.getMetaverseObjectFactory().createNodeObject(clientName, clientName, DictionaryConst.NODE_TYPE_LOCATOR);
            builder.addNode(designNode);
            final JobMeta jobMeta = job.getJobMeta();
            // The variables and parameters in the Job may not have been set on the meta, so we do it here
            // to ensure the job analyzer will have access to the parameter values.
            jobMeta.copyParametersFrom(job);
            jobMeta.activateParameters();
            job.copyVariablesFrom(jobMeta);
            if (job.getRep() != null) {
                jobMeta.setRepository(job.getRep());
            }
            String id = getFilename(jobMeta);
            if (!id.endsWith(jobMeta.getDefaultExtension())) {
                id += "." + jobMeta.getDefaultExtension();
            }
            IDocument metaverseDocument = builder.getMetaverseObjectFactory().createDocumentObject();
            metaverseDocument.setNamespace(namespace);
            metaverseDocument.setContent(jobMeta);
            metaverseDocument.setStringID(id);
            metaverseDocument.setName(jobMeta.getName());
            metaverseDocument.setExtension("kjb");
            metaverseDocument.setMimeType(URLConnection.getFileNameMap().getContentTypeFor("job.kjb"));
            metaverseDocument.setContext(new AnalysisContext(DictionaryConst.CONTEXT_RUNTIME));
            String normalizedPath;
            try {
                normalizedPath = KettleAnalyzerUtil.normalizeFilePath(id);
            } catch (MetaverseException e) {
                normalizedPath = id;
            }
            metaverseDocument.setProperty(DictionaryConst.PROPERTY_NAME, job.getName());
            metaverseDocument.setProperty(DictionaryConst.PROPERTY_PATH, normalizedPath);
            metaverseDocument.setProperty(DictionaryConst.PROPERTY_NAMESPACE, namespace.getNamespaceId());
            Runnable analyzerRunner = MetaverseUtil.getAnalyzerRunner(documentAnalyzer, metaverseDocument);
            MetaverseCompletionService.getInstance().submit(analyzerRunner, id);
        }
        // Save the lineage objects for later
        LineageHolder holder = JobLineageHolderMap.getInstance().getLineageHolder(job);
        holder.setExecutionProfile(executionProfile);
        holder.setMetaverseBuilder(builder);
    }
}
Also used : JobMeta(org.pentaho.di.job.JobMeta) IMetaverseNode(org.pentaho.metaverse.api.IMetaverseNode) AnalysisContext(org.pentaho.metaverse.api.AnalysisContext) ExecutionProfile(org.pentaho.metaverse.impl.model.ExecutionProfile) IExecutionProfile(org.pentaho.metaverse.api.model.IExecutionProfile) INamespace(org.pentaho.metaverse.api.INamespace) Namespace(org.pentaho.metaverse.api.Namespace) Job(org.pentaho.di.job.Job) IMetaverseBuilder(org.pentaho.metaverse.api.IMetaverseBuilder) IDocument(org.pentaho.metaverse.api.IDocument) MetaverseException(org.pentaho.metaverse.api.MetaverseException) LineageHolder(org.pentaho.metaverse.api.model.LineageHolder)

Example 10 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class TransOpenedExtensionPoint method callExtensionPoint.

/**
 * This method is called by the Kettle code
 *
 * @param log    the logging channel to log debugging information to
 * @param object The subject object that is passed to the plugin code
 * @throws org.pentaho.di.core.exception.KettleException If an error has occurred and the parent process should stop.
 */
@Override
public void callExtensionPoint(final LogChannelInterface log, Object object) throws KettleException {
    if (object instanceof TransMeta) {
        try {
            TransMeta transMeta = (TransMeta) object;
            TransExtensionPointUtil.addLineageGraph(transMeta);
        } catch (MetaverseException me) {
            if (log != null && log.isDebug()) {
                log.logDebug(Messages.getString("ERROR.Graph.CouldNotCreate", me.getMessage()));
            }
        }
    }
}
Also used : TransMeta(org.pentaho.di.trans.TransMeta) MetaverseException(org.pentaho.metaverse.api.MetaverseException)

Aggregations

MetaverseException (org.pentaho.metaverse.api.MetaverseException): 11
Graph (com.tinkerpop.blueprints.Graph): 4
IMetaverseBuilder (org.pentaho.metaverse.api.IMetaverseBuilder): 4
IMetaverseNode (org.pentaho.metaverse.api.IMetaverseNode): 4
TransMeta (org.pentaho.di.trans.TransMeta): 3
IDocument (org.pentaho.metaverse.api.IDocument): 3
INamespace (org.pentaho.metaverse.api.INamespace): 3
Namespace (org.pentaho.metaverse.api.Namespace): 3
Vertex (com.tinkerpop.blueprints.Vertex): 2
TinkerGraph (com.tinkerpop.blueprints.impls.tg.TinkerGraph): 2
GremlinPipeline (com.tinkerpop.gremlin.java.GremlinPipeline): 2
File (java.io.File): 2
ArrayList (java.util.ArrayList): 2
HashMap (java.util.HashMap): 2
HashSet (java.util.HashSet): 2
List (java.util.List): 2
Set (java.util.Set): 2
KettleException (org.pentaho.di.core.exception.KettleException): 2
AnalysisContext (org.pentaho.metaverse.api.AnalysisContext): 2
IExecutionProfile (org.pentaho.metaverse.api.model.IExecutionProfile): 2