
Example 6 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class KettleAnalyzerUtil method normalizeFilePath.

/**
 * Utility method for normalizing file paths used in Metaverse Id generation. It will convert a valid path into a
 * consistent path regardless of URI notation or filesystem absolute path.
 *
 * @param filePath full path to normalize
 * @return the normalized path
 */
public static String normalizeFilePath(String filePath) throws MetaverseException {
    try {
        String path = filePath;
        FileObject fo = KettleVFS.getFileObject(filePath);
        try {
            path = fo.getURL().getPath();
        } catch (Throwable t) {
            // Something went wrong with VFS, just try the filePath
        }
        File f = new File(path);
        return f.getAbsolutePath();
    } catch (Exception e) {
        throw new MetaverseException(e);
    }
}
Also used : FileObject(org.apache.commons.vfs2.FileObject) File KettleException(org.pentaho.di.core.exception.KettleException) MetaverseException(org.pentaho.metaverse.api.MetaverseException) KettleFileException(org.pentaho.di.core.exception.KettleFileException)
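
The normalization idea above (accept either URI notation or a plain filesystem path and return one consistent absolute path) can be sketched with the JDK alone. This is only an illustration, not the project's implementation: `PathNormalizerSketch` and `normalize` are hypothetical names, and `java.net.URI` stands in for Kettle VFS.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class PathNormalizerSketch {

    /** Hypothetical stand-in for normalizeFilePath that uses only JDK classes. */
    public static String normalize(String filePath) {
        String path = filePath;
        try {
            URI uri = new URI(filePath);
            // Only unwrap the input when it is an explicit file: URI
            if ("file".equalsIgnoreCase(uri.getScheme()) && uri.getPath() != null) {
                path = uri.getPath();
            }
        } catch (URISyntaxException e) {
            // Not parseable as a URI (spaces, backslashes, ...): fall back to the raw string
        }
        return new File(path).getAbsolutePath();
    }

    public static void main(String[] args) {
        String fromUri = normalize("file:///tmp/data/input.csv");
        String fromPath = normalize("/tmp/data/input.csv");
        // Both notations resolve to the same absolute path
        System.out.println(fromUri.equals(fromPath));
    }
}
```

As in the original method, a parsing failure is swallowed on purpose so that the raw input is still turned into an absolute path.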

Example 7 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class LineageClient method getOriginSteps.

/**
 * Finds the step(s) in the given transformation that created the given field, with respect to the given target step.
 * This means if a field has been renamed or derived from another field from another step, then the lineage graph
 * is traversed back from the target step to determine which steps contributed to the field in the target step.
 * This differs from getCreatorSteps() as the lineage graph traversal will not stop with a "creates" relationship;
 * rather, this method will traverse other relationships (e.g. "uses", "derives") to find the actual origin fields
 * that comprise the final field in the target step.
 *
 * @param transMeta      a reference to a transformation's metadata
 * @param targetStepName the target step name associated with the given field names
 * @param fieldNames     a collection of field names associated with the target step, for which to find the step(s)
 *                       and field(s) that contributed to those fields
 * @return a map from target field name to step-field objects, where each step has created a field with
 * the returned name, and that field has contributed in some way to the specified target field.
 * @throws MetaverseException if an error occurred while finding the origin steps
 */
public Map<String, Set<StepField>> getOriginSteps(TransMeta transMeta, String targetStepName, Collection<String> fieldNames) throws MetaverseException {
    Map<String, Set<StepField>> originStepsMap = new HashMap<>();
    try {
        Future<Graph> lineageGraphTask = LineageGraphMap.getInstance().get(transMeta);
        if (lineageGraphTask != null) {
            Graph lineageGraph = lineageGraphTask.get();
            List<Vertex> targetFields = getTargetFields(lineageGraph, targetStepName, fieldNames);
            GremlinPipeline pipe = getOriginStepsPipe(targetFields);
            List<List<Vertex>> pathList = pipe.toList();
            if (pathList != null) {
                for (List<Vertex> path : pathList) {
                    // The first vertex of each path is the target field; the last vertex is the origin field,
                    // whose step name and field name are saved off as a StepField
                    String targetField = path.get(0).getProperty(DictionaryConst.PROPERTY_NAME);
                    Set<StepField> pathSet = originStepsMap.get(targetField);
                    if (pathSet == null) {
                        pathSet = new HashSet<>();
                        originStepsMap.put(targetField, pathSet);
                    }
                    Vertex v = path.get(path.size() - 1);
                    Map<String, String> stepField = STEPFIELDOPS_PIPE_FUNC.compute(v);
                    String stepName = stepField.get("stepName");
                    String fieldName = stepField.get("fieldName");
                    pathSet.add(new StepField(stepName, fieldName));
                }
            }
        }
    } catch (Exception e) {
        throw new MetaverseException(e);
    }
    return originStepsMap;
}
Also used : Vertex(com.tinkerpop.blueprints.Vertex) GremlinPipeline HashSet(java.util.HashSet) Set(java.util.Set) HashMap(java.util.HashMap) MetaverseException(org.pentaho.metaverse.api.MetaverseException) Graph(com.tinkerpop.blueprints.Graph) StepField(org.pentaho.metaverse.api.StepField) ArrayList(java.util.ArrayList) List(java.util.List)
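
The backward traversal described in the Javadoc above (keep following "derives"-style links until a field with no upstream source is reached) can be illustrated without a graph database. The sketch below is an assumption-laden stand-in: it uses a plain map of hypothetical "step/field" strings instead of Blueprints vertices and a Gremlin pipeline, and `OriginTraceSketch` and `findOrigins` are invented names.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class OriginTraceSketch {

    /**
     * Walks backwards from the target field and returns every field that has no upstream source.
     * Keys and values are hypothetical "step/field" labels; each key maps to the fields it was derived from.
     */
    public static Set<String> findOrigins(Map<String, List<String>> derivedFrom, String target) {
        Set<String> origins = new TreeSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(target);
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (!visited.add(node)) {
                continue; // already expanded, also guards against cycles
            }
            List<String> parents = derivedFrom.getOrDefault(node, Collections.emptyList());
            if (parents.isEmpty()) {
                origins.add(node); // nothing upstream, so this is an origin field
            } else {
                for (String parent : parents) {
                    stack.push(parent);
                }
            }
        }
        return origins;
    }

    public static void main(String[] args) {
        Map<String, List<String>> derivedFrom = new HashMap<>();
        derivedFrom.put("Output/fullName", Arrays.asList("Calc/fullName"));
        derivedFrom.put("Calc/fullName", Arrays.asList("Input/first", "Input/last"));
        // prints [Input/first, Input/last]
        System.out.println(findOrigins(derivedFrom, "Output/fullName"));
    }
}
```

The traversal does not stop at the first upstream step ("Calc"); it continues until it reaches fields that nothing else feeds, which mirrors the difference from getCreatorSteps() noted in the Javadoc.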

Example 8 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class LineageClient method getOperationPaths.

/**
 * Returns the paths between the origin field(s) and target field(s). A path in this context is an ordered list of
 * StepFieldOperations objects, each of which corresponds to a field at a certain step where operation(s) are
 * applied. The order of the list corresponds to the order of the steps from the origin step (see getOriginSteps())
 * to the target step. This method can be used to trace a target field back to its origin and discover what
 * operations were performed upon it during its lifetime. Conversely, the path could be used to re-apply the
 * operations to the origin field, resulting in the field's "value" at each point in the path.
 *
 * @param transMeta      a reference to a transformation's metadata
 * @param targetStepName the target step name associated with the given field names
 * @param fieldNames     a collection of field names associated with the target step, for which to find the step(s)
 *                       and field(s) and operation(s) that contributed to those fields
 * @return a map of target field name to an ordered list of StepFieldOperations objects, describing the path from the
 * origin step field to the target step field, including the operations performed.
 * @throws MetaverseException if an error occurred while finding the operation paths
 */
public Map<String, Set<List<StepFieldOperations>>> getOperationPaths(TransMeta transMeta, String targetStepName, final Collection<String> fieldNames) throws MetaverseException {
    Map<String, Set<List<StepFieldOperations>>> operationPathMap = new HashMap<>();
    try {
        Future<Graph> lineageGraphTask = LineageGraphMap.getInstance().get(transMeta);
        if (lineageGraphTask != null) {
            Graph lineageGraph = lineageGraphTask.get();
            if (lineageGraph != null) {
                // Get the creator field nodes for all the field names passed in
                List<Vertex> getTargetFields = getTargetFields(lineageGraph, targetStepName, fieldNames);
                // The "origin steps pipe" returns a pipeline that will return paths between
                // the origin field nodes and the target field node.
                GremlinPipeline pipe = getOriginStepsPipe(getTargetFields);
                List<List<Vertex>> pathList = pipe.toList();
                if (pathList != null) {
                    for (List<Vertex> path : pathList) {
                        // Transform each path of vertices into a "path" of StepFieldOperations objects (basically save off
                        // properties of each vertex into a new list)
                        List<StepFieldOperations> stepFieldOps = new ArrayList<>();
                        String targetField = path.get(0).getProperty(DictionaryConst.PROPERTY_NAME);
                        Set<List<StepFieldOperations>> pathSet = operationPathMap.get(targetField);
                        if (pathSet == null) {
                            pathSet = new HashSet<>();
                            operationPathMap.put(targetField, pathSet);
                        }
                        for (Vertex v : path) {
                            Map<String, String> stepField = STEPFIELDOPS_PIPE_FUNC.compute(v);
                            String stepName = stepField.get("stepName");
                            String fieldName = stepField.get("fieldName");
                            Operations operations = MetaverseUtil.convertOperationsStringToMap((String) v.getProperty(DictionaryConst.PROPERTY_OPERATIONS));
                            // Insert at the front so the finished list runs from origin to target
                            stepFieldOps.add(0, new StepFieldOperations(stepName, fieldName, operations));
                        }
                        pathSet.add(stepFieldOps);
                    }
                }
            }
        }
    } catch (Exception e) {
        throw new MetaverseException(e);
    }
    return operationPathMap;
}
Also used : Vertex(com.tinkerpop.blueprints.Vertex) GremlinPipeline HashSet(java.util.HashSet) Set(java.util.Set) HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) MetaverseException(org.pentaho.metaverse.api.MetaverseException) Graph(com.tinkerpop.blueprints.Graph) StepFieldOperations(org.pentaho.metaverse.api.StepFieldOperations) List(java.util.List) Operations(org.pentaho.metaverse.api.model.Operations)
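
Two details of the method above are easy to miss: paths arrive target-first, and each one is reversed (by inserting at index 0) so the stored list runs from origin to target, grouped under the target field's name. A minimal sketch of just that bookkeeping follows, assuming plain strings in place of vertices and StepFieldOperations; `OperationPathSketch` and `groupByTarget` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OperationPathSketch {

    /**
     * Groups target-first paths by their target (first element) and stores each one origin-first.
     * Strings stand in for the vertices and StepFieldOperations objects of the original code.
     */
    public static Map<String, Set<List<String>>> groupByTarget(List<List<String>> targetFirstPaths) {
        Map<String, Set<List<String>>> result = new HashMap<>();
        for (List<String> path : targetFirstPaths) {
            if (path.isEmpty()) {
                continue;
            }
            List<String> originFirst = new ArrayList<>();
            for (String node : path) {
                originFirst.add(0, node); // prepend, so the last node visited ends up first
            }
            // Same get-or-create step as the null check in the original, written with computeIfAbsent
            result.computeIfAbsent(path.get(0), k -> new HashSet<>()).add(originFirst);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> paths = Arrays.asList(
            Arrays.asList("Output/total", "Calc/total", "Input/price"),
            Arrays.asList("Output/total", "Calc/total", "Input/qty"));
        System.out.println(groupByTarget(paths));
    }
}
```

Prepending to an ArrayList is linear per insert; for long paths, building the list in order and calling Collections.reverse once would do the same job more cheaply.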

Example 9 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class JobRuntimeExtensionPoint method callExtensionPoint.

/**
 * Callback when a job is about to be started
 *
 * @param logChannelInterface A reference to the log in this context (the Job object's log)
 * @param o                   The object being operated on (Job in this case)
 * @throws org.pentaho.di.core.exception.KettleException
 */
public void callExtensionPoint(LogChannelInterface logChannelInterface, Object o) throws KettleException {
    // Job Started listeners get called after the extension point is invoked, so just add a job listener
    if (o instanceof Job) {
        Job job = ((Job) o);
        // If runtime lineage collection is disabled, don't run any lineage processes/methods
        if (!isRuntimeEnabled()) {
            return;
        }
        // Create and populate an execution profile with what we know so far
        ExecutionProfile executionProfile = new ExecutionProfile();
        populateExecutionProfile(executionProfile, job);
        IMetaverseBuilder builder = JobLineageHolderMap.getInstance().getMetaverseBuilder(job);
        // Add the job finished listener
        // Analyze the current transformation
        if (documentAnalyzer != null) {
            // Create a document for the Trans
            final String clientName = executionProfile.getExecutionEngine().getName();
            final INamespace namespace = new Namespace(clientName);
            final IMetaverseNode designNode = builder.getMetaverseObjectFactory().createNodeObject(clientName, clientName, DictionaryConst.NODE_TYPE_LOCATOR);
            final JobMeta jobMeta = job.getJobMeta();
            // The variables and parameters in the Job may not have been set on the meta, so we do it here
            // to ensure the job analyzer will have access to the parameter values.
            if (job.getRep() != null) {
                // ...
            }
            String id = getFilename(jobMeta);
            if (!id.endsWith(jobMeta.getDefaultExtension())) {
                id += "." + jobMeta.getDefaultExtension();
            }
            IDocument metaverseDocument = builder.getMetaverseObjectFactory().createDocumentObject();
            metaverseDocument.setContext(new AnalysisContext(DictionaryConst.CONTEXT_RUNTIME));
            String normalizedPath;
            try {
                normalizedPath = KettleAnalyzerUtil.normalizeFilePath(id);
            } catch (MetaverseException e) {
                // Normalization failed, so fall back to the raw id
                normalizedPath = id;
            }
            metaverseDocument.setProperty(DictionaryConst.PROPERTY_NAME, job.getName());
            metaverseDocument.setProperty(DictionaryConst.PROPERTY_PATH, normalizedPath);
            metaverseDocument.setProperty(DictionaryConst.PROPERTY_NAMESPACE, namespace.getNamespaceId());
            Runnable analyzerRunner = MetaverseUtil.getAnalyzerRunner(documentAnalyzer, metaverseDocument);
            MetaverseCompletionService.getInstance().submit(analyzerRunner, id);
        }
        // Save the lineage objects for later
        LineageHolder holder = JobLineageHolderMap.getInstance().getLineageHolder(job);
    }
}
Also used : JobMeta(org.pentaho.di.job.JobMeta) IMetaverseNode(org.pentaho.metaverse.api.IMetaverseNode) AnalysisContext(org.pentaho.metaverse.api.AnalysisContext) ExecutionProfile(org.pentaho.metaverse.impl.model.ExecutionProfile) IExecutionProfile(org.pentaho.metaverse.api.model.IExecutionProfile) INamespace(org.pentaho.metaverse.api.INamespace) Namespace(org.pentaho.metaverse.api.Namespace) Job(org.pentaho.di.job.Job) IMetaverseBuilder(org.pentaho.metaverse.api.IMetaverseBuilder) IDocument(org.pentaho.metaverse.api.IDocument) MetaverseException(org.pentaho.metaverse.api.MetaverseException) LineageHolder(org.pentaho.metaverse.api.model.LineageHolder)
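
The document id handling in the method above combines two small patterns: append the default extension when the filename lacks it, and fall back to the raw id when normalization throws. A rough, self-contained sketch of both follows; `DocumentIdSketch`, `Normalizer`, `ensureExtension` and `normalizeOrFallback` are hypothetical names introduced only for illustration.

```java
public class DocumentIdSketch {

    /** Hypothetical functional interface standing in for a normalizer that may fail. */
    public interface Normalizer {
        String normalize(String path) throws Exception;
    }

    /** Appends "." + ext unless the filename already ends with ext (same check as the original). */
    public static String ensureExtension(String filename, String ext) {
        return filename.endsWith(ext) ? filename : filename + "." + ext;
    }

    /** Returns the normalized id, or the raw id when normalization throws. */
    public static String normalizeOrFallback(String id, Normalizer normalizer) {
        try {
            return normalizer.normalize(id);
        } catch (Exception e) {
            // Lineage collection should not fail just because the path could not be normalized
            return id;
        }
    }

    public static void main(String[] args) {
        String id = ensureExtension("nightly_load", "kjb");
        System.out.println(id); // prints nightly_load.kjb
        System.out.println(normalizeOrFallback(id, p -> {
            throw new Exception("normalization unavailable");
        })); // prints nightly_load.kjb
    }
}
```

Note that the endsWith check, like the original, compares against the bare extension without the dot, so a filename that happens to end in those letters is left unchanged.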

Example 10 with MetaverseException

use of org.pentaho.metaverse.api.MetaverseException in project pentaho-metaverse by pentaho.

the class TransOpenedExtensionPoint method callExtensionPoint.

/**
 * This method is called by the Kettle code
 *
 * @param log    the logging channel to log debugging information to
 * @param object The subject object that is passed to the plugin code
 * @throws org.pentaho.di.core.exception.KettleException If an error has occurred and the parent process should stop.
 */
public void callExtensionPoint(final LogChannelInterface log, Object object) throws KettleException {
    if (object instanceof TransMeta) {
        try {
            TransMeta transMeta = (TransMeta) object;
            // ...
        } catch (MetaverseException me) {
            if (log != null && log.isDebug()) {
                log.logDebug(Messages.getString("ERROR.Graph.CouldNotCreate", me.getMessage()));
            }
        }
    }
}
Also used : TransMeta(org.pentaho.di.trans.TransMeta) MetaverseException(org.pentaho.metaverse.api.MetaverseException)

