
Example 71 with Feature

use of org.apache.uima.cas.Feature in project webanno by webanno.

the class CustomTypesTest method testProfType.

@Test
public void testProfType() throws Exception {
    TypeSystemDescription tsd = TypeSystemDescriptionFactory.createTypeSystemDescription("desc.types.TestTypeSystemDescriptor");
    CAS cas = CasCreationUtils.createCas(tsd, null, null);
    cas.setDocumentText("I listen to lectures by Prof. Gurevych sometimes.");
    TypeSystem ts = cas.getTypeSystem();
    Type profType = ts.getType("de.tud.Prof");
    Feature profNameFeature = profType.getFeatureByBaseName("fullName");
    Feature profBossFeature = profType.getFeatureByBaseName("boss");
    AnnotationFS proemel = cas.createAnnotation(profType, 0, 0);
    proemel.setStringValue(profNameFeature, "Hans Juergen Proeml");
    cas.addFsToIndexes(proemel);
    AnnotationFS gurevych = cas.createAnnotation(profType, 24, 38);
    gurevych.setStringValue(profNameFeature, "Iryna Gurevych");
    gurevych.setFeatureValue(profBossFeature, proemel);
    cas.addFsToIndexes(gurevych);
    for (String feature : Arrays.asList("fullName", "boss")) {
        Feature someFeature = gurevych.getType().getFeatureByBaseName(feature);
        if (someFeature.getRange().isPrimitive()) {
            String value = gurevych.getFeatureValueAsString(someFeature);
            System.out.println(value);
        } else {
            FeatureStructure value = gurevych.getFeatureValue(someFeature);
            System.out.printf("%s (%s)%n", value.getFeatureValueAsString(profNameFeature), value.getType());
        }
    }
}
Also used : FeatureStructure(org.apache.uima.cas.FeatureStructure) AnnotationFS(org.apache.uima.cas.text.AnnotationFS) TypeSystem(org.apache.uima.cas.TypeSystem) Type(org.apache.uima.cas.Type) TypeSystemDescription(org.apache.uima.resource.metadata.TypeSystemDescription) CAS(org.apache.uima.cas.CAS) Feature(org.apache.uima.cas.Feature) Test(org.junit.Test)

Example 72 with Feature

use of org.apache.uima.cas.Feature in project webanno by webanno.

the class ConstraintsVerifier method verify.

@Override
public boolean verify(FeatureStructure featureStructure, ParsedConstraints parsedConstraints) {
    boolean isOk = false;
    Type type = featureStructure.getType();
    for (Feature feature : type.getFeatures()) {
        if (feature.getRange().isPrimitive()) {
            String scopeName = featureStructure.getFeatureValueAsString(feature);
            List<Rule> rules = parsedConstraints.getScopeByName(scopeName).getRules();
            // Check whether all the feature values are OK according to the rules.
        } else {
            // Some recursion would be in order here.
        }
    }
    return isOk;
}
Also used : Type(org.apache.uima.cas.Type) Rule(de.tudarmstadt.ukp.clarin.webanno.constraints.model.Rule) Feature(org.apache.uima.cas.Feature)

Example 73 with Feature

use of org.apache.uima.cas.Feature in project webanno by webanno.

the class ValuesGenerator method getValue.

private ArrayList<String> getValue(FeatureStructure aContext, String aPath) throws UIMAException {
    String head, tail;
    if (aPath.contains(".")) {
        // Separate the first part of the path to be processed.
        head = aPath.substring(0, aPath.indexOf("."));
        // The remaining part.
        tail = aPath.substring(aPath.indexOf(".") + 1);
    } else {
        head = aPath;
        tail = "";
    }
    List<String> values = new ArrayList<>();
    if (head.startsWith("@")) {
        String typename = imports.get(head.substring(1));
        Type type = aContext.getCAS().getTypeSystem().getType(typename);
        AnnotationFS ctxAnnFs = (AnnotationFS) aContext;
        // List<String> values = new ArrayList<>();
        for (AnnotationFS fs : selectAt(aContext.getCAS(), type, ctxAnnFs.getBegin(), ctxAnnFs.getEnd())) {
            values.addAll(getValue(fs, tail));
        }
        return (ArrayList<String>) values;
    } else if (head.endsWith("()")) {
        if (StringUtils.isNotEmpty(tail)) {
            throw new IllegalStateException("No additional steps possible after function");
        }
        if ("text()".equals(head)) {
            if (aContext instanceof AnnotationFS) {
                values.add(((AnnotationFS) aContext).getCoveredText());
                return (ArrayList<String>) values;
            } else {
                throw new IllegalStateException("Cannot use [text()] on non-annotations");
            }
        } else {
            throw new IllegalStateException("Unknown path function [" + aPath + "]");
        }
    } else if (StringUtils.isNotEmpty(tail)) {
        /*
         * Extract the feature and pass on the FeatureStructure based on it, shortening the
         * path by removing its first element (elements are separated by ".").
         */
        Feature feature = aContext.getType().getFeatureByBaseName(aPath.substring(0, aPath.indexOf(".")));
        if (feature == null) {
            throw new IllegalStateException("Feature [" + aPath + "] does not exist on type [" + aContext.getType().getName() + "]");
        }
        return getValue(aContext.getFeatureValue(feature), aPath.substring(aPath.indexOf(".") + 1));
    // throw new UnsupportedOperationException("Error in rule");
    } else {
        Feature feature = aContext.getType().getFeatureByBaseName(aPath);
        if (feature == null) {
            throw new IllegalStateException("Feature [" + aPath + "] does not exist on type [" + aContext.getType().getName() + "]");
        }
        values.add(aContext.getFeatureValueAsString(feature));
        return (ArrayList<String>) values;
    }
}
Also used : AnnotationFS(org.apache.uima.cas.text.AnnotationFS) Type(org.apache.uima.cas.Type) ArrayList(java.util.ArrayList) Feature(org.apache.uima.cas.Feature)
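
The head/tail handling in getValue above can be tried in isolation. The following is a minimal sketch, not the project's code: the class name PathSplitSketch and the helper splitPath are hypothetical, and it covers only the string splitting, not the UIMA feature lookup.

```java
public class PathSplitSketch {

    /**
     * Splits a dotted path into { head, tail }. The head is everything before
     * the first dot; the tail is everything after it. A path without a dot
     * yields an empty tail, mirroring the else branch in getValue.
     */
    static String[] splitPath(String path) {
        int dot = path.indexOf('.');
        if (dot < 0) {
            return new String[] { path, "" };
        }
        return new String[] { path.substring(0, dot), path.substring(dot + 1) };
    }

    public static void main(String[] args) {
        // Hypothetical paths, chosen only to exercise the three branches.
        for (String path : new String[] { "@Lemma.value", "governor.pos.PosValue", "text()" }) {
            String[] parts = splitPath(path);
            System.out.println("head=" + parts[0] + " tail=" + parts[1]);
        }
    }
}
```

Each recursive step consumes one head and passes the tail on, so a path with n dots is resolved in n + 1 steps.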

Example 74 with Feature

use of org.apache.uima.cas.Feature in project webanno by webanno.

the class WebannoTsv3Reader method addAnnotations.

/**
 * Importing span annotations including slot annotations.
 */
private void addAnnotations(JCas aJCas, Map<Type, Map<AnnotationUnit, List<AnnotationFS>>> aAnnosPerTypePerUnit) {
    for (Type type : annotationsPerPostion.keySet()) {
        Map<AnnotationUnit, Map<Integer, AnnotationFS>> multiTokUnits = new HashMap<>();
        int ref = 1;
        // Previous annotation, kept to detect whether an annotation spans multiple tokens
        AnnotationFS prevAnnoFs = null;
        for (AnnotationUnit unit : annotationsPerPostion.get(type).keySet()) {
            int end = unit.end;
            List<AnnotationFS> annos = aAnnosPerTypePerUnit.get(type).get(unit);
            int j = 0;
            Feature linkeF = null;
            Map<AnnotationFS, List<FeatureStructure>> linkFSesPerSlotAnno = new HashMap<>();
            if (allLayers.get(type).size() == 0) {
                ref = addAnnotationWithNoFeature(aJCas, type, unit, annos, multiTokUnits, end, ref);
                continue;
            }
            for (Feature feat : allLayers.get(type)) {
                String anno = annotationsPerPostion.get(type).get(unit).get(j);
                if (!anno.equals("_")) {
                    int i = 0;
                    // If it is a slot annotation: multiple slots per single annotation
                    // (Target1<--role1--Base--role2-->Target2).
                    int slot = 0;
                    boolean targetAdd = false;
                    String stackedAnnoRegex = "(?<!\\\\)" + Pattern.quote("|");
                    String[] stackedAnnos = anno.split(stackedAnnoRegex);
                    for (String mAnnos : stackedAnnos) {
                        String multipleSlotAnno = "(?<!\\\\)" + Pattern.quote(";");
                        for (String mAnno : mAnnos.split(multipleSlotAnno)) {
                            String depRef = "";
                            String multSpliter = "(?<!\\\\)" + Pattern.quote("[");
                            // is this slot target ambiguous?
                            boolean ambigTarget = false;
                            if (mAnno.split(multSpliter).length > 1) {
                                ambigTarget = true;
                                depRef = mAnno.substring(mAnno.indexOf("[") + 1, mAnno.length() - 1);
                                ref = depRef.contains("_") ? ref : Integer.parseInt(depRef);
                                mAnno = mAnno.substring(0, mAnno.indexOf("["));
                            }
                            if (mAnno.equals("*")) {
                                mAnno = null;
                            }
                            boolean isMultitoken = false;
                            if (!multiTokUnits.isEmpty() && prevAnnoFs != null && prevAnnoFs.getBegin() != unit.begin) {
                                contAnno: for (AnnotationUnit u : multiTokUnits.keySet()) {
                                    for (Integer r : multiTokUnits.get(u).keySet()) {
                                        if (ref == r) {
                                            isMultitoken = true;
                                            prevAnnoFs = multiTokUnits.get(u).get(r);
                                            break contAnno;
                                        }
                                    }
                                }
                            }
                            if (isMultitoken) {
                                Feature endF = type.getFeatureByBaseName(CAS.FEATURE_BASE_NAME_END);
                                prevAnnoFs.setIntValue(endF, end);
                                mAnno = getEscapeChars(mAnno);
                                prevAnnoFs.setFeatureValueFromString(feat, mAnno);
                                if (feat.getShortName().equals(REF_LINK)) {
                                    // Since REF_REL does not start with BIO, update it.
                                    annos.set(i, prevAnnoFs);
                                }
                                setAnnoRefPerUnit(unit, type, ref, prevAnnoFs);
                            } else {
                                if (roleLinks.containsKey(feat)) {
                                    linkeF = feat;
                                    FeatureStructure link = aJCas.getCas().createFS(slotLinkTypes.get(feat));
                                    Feature roleFeat = link.getType().getFeatureByBaseName("role");
                                    mAnno = getEscapeChars(mAnno);
                                    link.setStringValue(roleFeat, mAnno);
                                    linkFSesPerSlotAnno.putIfAbsent(annos.get(i), new ArrayList<>());
                                    linkFSesPerSlotAnno.get(annos.get(i)).add(link);
                                } else if (roleTargets.containsKey(feat)) {
                                    FeatureStructure link = linkFSesPerSlotAnno.get(annos.get(i)).get(slot);
                                    int customTypeNumber = 0;
                                    if (mAnno.split("-").length > 2) {
                                        customTypeNumber = Integer.valueOf(mAnno.substring(mAnno.lastIndexOf("-") + 1));
                                        mAnno = mAnno.substring(0, mAnno.lastIndexOf("-"));
                                    }
                                    AnnotationUnit targetUnit = token2Units.get(mAnno);
                                    Type tType = null;
                                    if (customTypeNumber == 0) {
                                        tType = roleTargets.get(feat);
                                    } else {
                                        tType = layerMaps.get(customTypeNumber);
                                    }
                                    AnnotationFS targetFs;
                                    if (ambigTarget) {
                                        targetFs = annosPerRef.get(tType).get(targetUnit).get(ref);
                                    } else {
                                        targetFs = annosPerRef.get(tType).get(targetUnit).entrySet().iterator().next().getValue();
                                    }
                                    link.setFeatureValue(feat, targetFs);
                                    addSlotAnnotations(linkFSesPerSlotAnno, linkeF);
                                    targetAdd = true;
                                    slot++;
                                } else if (feat.getShortName().equals(REF_REL)) {
                                    int chainNo = Integer.valueOf(mAnno.split("->")[1].split("-")[0]);
                                    int LinkNo = Integer.valueOf(mAnno.split("->")[1].split("-")[1]);
                                    chainAnnosPerTyep.putIfAbsent(type, new TreeMap<>());
                                    if (chainAnnosPerTyep.get(type).get(chainNo) != null && chainAnnosPerTyep.get(type).get(chainNo).get(LinkNo) != null) {
                                        continue;
                                    }
                                    String refRel = mAnno.split("->")[0];
                                    refRel = getEscapeChars(refRel);
                                    if (refRel.equals("*")) {
                                        refRel = null;
                                    }
                                    annos.get(i).setFeatureValueFromString(feat, refRel);
                                    chainAnnosPerTyep.putIfAbsent(type, new TreeMap<>());
                                    chainAnnosPerTyep.get(type).putIfAbsent(chainNo, new TreeMap<>());
                                    chainAnnosPerTyep.get(type).get(chainNo).put(LinkNo, annos.get(i));
                                } else if (feat.getShortName().equals(REF_LINK)) {
                                    mAnno = getEscapeChars(mAnno);
                                    annos.get(i).setFeatureValueFromString(feat, mAnno);
                                    aJCas.addFsToIndexes(annos.get(i));
                                } else if (depFeatures.get(type) != null && depFeatures.get(type).equals(feat)) {
                                    int g = depRef.isEmpty() ? 0 : Integer.valueOf(depRef.split("_")[0]);
                                    int d = depRef.isEmpty() ? 0 : Integer.valueOf(depRef.split("_")[1]);
                                    Type depType = depTypess.get(type);
                                    AnnotationUnit govUnit = token2Units.get(mAnno);
                                    int l = annotationsPerPostion.get(type).get(unit).size();
                                    String thisUnit = annotationsPerPostion.get(type).get(unit).get(l - 1);
                                    AnnotationUnit depUnit = token2Units.get(thisUnit);
                                    AnnotationFS govFs;
                                    AnnotationFS depFs;
                                    if (depType.getName().equals(POS.class.getName())) {
                                        depType = aJCas.getCas().getTypeSystem().getType(Token.class.getName());
                                        govFs = units2Tokens.get(govUnit);
                                        depFs = units2Tokens.get(unit);
                                    } else if (depType.getName().equals(Token.class.getName())) { // in WebAnno world :)(!
                                        govFs = units2Tokens.get(govUnit);
                                        depFs = units2Tokens.get(unit);
                                    } else if (g == 0 && d == 0) {
                                        govFs = annosPerRef.get(depType).get(govUnit).entrySet().iterator().next().getValue();
                                        depFs = annosPerRef.get(depType).get(depUnit).entrySet().iterator().next().getValue();
                                    } else if (g == 0) {
                                        govFs = annosPerRef.get(depType).get(govUnit).entrySet().iterator().next().getValue();
                                        depFs = annosPerRef.get(depType).get(depUnit).get(d);
                                    } else {
                                        govFs = annosPerRef.get(depType).get(govUnit).get(g);
                                        depFs = annosPerRef.get(depType).get(depUnit).entrySet().iterator().next().getValue();
                                    }
                                    annos.get(i).setFeatureValue(feat, depFs);
                                    annos.get(i).setFeatureValue(type.getFeatureByBaseName(GOVERNOR), govFs);
                                    if (depFs.getBegin() <= annos.get(i).getBegin()) {
                                        Feature beginF = type.getFeatureByBaseName(CAS.FEATURE_BASE_NAME_BEGIN);
                                        annos.get(i).setIntValue(beginF, depFs.getBegin());
                                    } else {
                                        Feature endF = type.getFeatureByBaseName(CAS.FEATURE_BASE_NAME_END);
                                        annos.get(i).setIntValue(endF, depFs.getEnd());
                                    }
                                    aJCas.addFsToIndexes(annos.get(i));
                                } else {
                                    mAnno = getEscapeChars(mAnno);
                                    multiTokUnits.putIfAbsent(unit, new HashMap<>());
                                    multiTokUnits.get(unit).put(ref, annos.get(i));
                                    prevAnnoFs = annos.get(i);
                                    annos.get(i).setFeatureValueFromString(feat, mAnno);
                                    aJCas.addFsToIndexes(annos.get(i));
                                    setAnnoRefPerUnit(unit, type, ref, annos.get(i));
                                }
                            }
                            if (stackedAnnos.length > 1) {
                                ref++;
                            }
                        }
                        if (type.getName().equals(POS.class.getName())) {
                            units2Tokens.get(unit).setPos((POS) annos.get(i));
                        }
                        if (type.getName().equals(Lemma.class.getName())) {
                            units2Tokens.get(unit).setLemma((Lemma) annos.get(i));
                        }
                        if (type.getName().equals(Stem.class.getName())) {
                            units2Tokens.get(unit).setStem((Stem) annos.get(i));
                        }
                        if (type.getName().equals(MorphologicalFeatures.class.getName())) {
                            units2Tokens.get(unit).setMorph((MorphologicalFeatures) annos.get(i));
                        }
                        i++;
                    }
                    if (targetAdd) {
                        linkFSesPerSlotAnno = new HashMap<>();
                    }
                } else {
                    prevAnnoFs = null;
                }
                j++;
            }
            if (prevAnnoFs != null) {
                ref++;
            }
        }
        annosPerRef.put(type, multiTokUnits);
    }
}
Also used : MorphologicalFeatures(de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.morph.MorphologicalFeatures) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) Feature(org.apache.uima.cas.Feature) Stem(de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Stem) FeatureStructure(org.apache.uima.cas.FeatureStructure) AnnotationFS(org.apache.uima.cas.text.AnnotationFS) Type(org.apache.uima.cas.Type) AnnotationUnit(de.tudarmstadt.ukp.clarin.webanno.tsv.util.AnnotationUnit) POS(de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS) Lemma(de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Lemma) ArrayList(java.util.ArrayList) List(java.util.List) Map(java.util.Map) TreeMap(java.util.TreeMap)
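
addAnnotations builds its split patterns from a negative lookbehind plus Pattern.quote, so that delimiters preceded by a backslash are not treated as separators. The following is a minimal standalone sketch of that technique; the class name UnescapedSplitSketch, the helper splitUnescaped, and the sample values are hypothetical and not taken from the project.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class UnescapedSplitSketch {

    /**
     * Splits value on every occurrence of delimiter that is not immediately
     * preceded by a backslash. Pattern.quote makes regex metacharacters such
     * as "|" or "[" match literally.
     */
    static List<String> splitUnescaped(String value, String delimiter) {
        String regex = "(?<!\\\\)" + Pattern.quote(delimiter);
        return Arrays.asList(value.split(regex));
    }

    public static void main(String[] args) {
        // The second pipe is escaped with a backslash, so it stays inside the second element.
        System.out.println(splitUnescaped("PER|LOC\\|ORG", "|"));
        // A plain delimiter splits as usual.
        System.out.println(splitUnescaped("role1;role2", ";"));
    }
}
```

Note that the escaping backslash stays in the resulting element; in the reader above, unescaping is a separate step (getEscapeChars).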

Example 75 with Feature

use of org.apache.uima.cas.Feature in project webanno by webanno.

the class WebannoTsv3Reader method setLayerAndFeature.

/**
 * Get the type and feature information from the TSV file header.
 *
 * @param aJcas
 *            the JCas whose type system is checked for the layers and features
 * @param header
 *            the header line
 * @throws IOException
 *             if the type or the feature does not exist in the CAS
 */
private void setLayerAndFeature(JCas aJcas, String header) throws IOException {
    try {
        StringTokenizer headerTk = new StringTokenizer(header, "#");
        while (headerTk.hasMoreTokens()) {
            String layerNames = headerTk.nextToken().trim();
            StringTokenizer layerTk = new StringTokenizer(layerNames, "|");
            Set<Feature> features = new LinkedHashSet<>();
            String layerName = layerTk.nextToken().trim();
            layerName = layerName.substring(layerName.indexOf("=") + 1);
            Iterator<Type> types = aJcas.getTypeSystem().getTypeIterator();
            boolean layerExists = false;
            while (types.hasNext()) {
                if (types.next().getName().equals(layerName)) {
                    layerExists = true;
                    break;
                }
            }
            if (!layerExists) {
                throw new IOException(fileName + " This is not a valid TSV File. The layer " + layerName + " is not created in the project.");
            }
            Type layer = CasUtil.getType(aJcas.getCas(), layerName);
            // The layer has no features: count a single placeholder column.
            if (!layerTk.hasMoreTokens()) {
                columns++;
                allLayers.put(layer, features);
                layerMaps.put(layerMaps.size() + 1, layer);
                return;
            }
            while (layerTk.hasMoreTokens()) {
                String ft = layerTk.nextToken().trim();
                columns++;
                Feature feature;
                if (ft.startsWith(BT)) {
                    feature = layer.getFeatureByBaseName(DEPENDENT);
                    depFeatures.put(layer, feature);
                    depTypess.put(layer, CasUtil.getType(aJcas.getCas(), ft.substring(3)));
                } else {
                    feature = layer.getFeatureByBaseName(ft);
                }
                if (ft.startsWith(ROLE)) {
                    ft = ft.substring(5);
                    String t = layerTk.nextToken();
                    columns++;
                    Type tType = CasUtil.getType(aJcas.getCas(), t);
                    String fName = ft.substring(0, ft.indexOf("_"));
                    Feature slotF = layer.getFeatureByBaseName(fName.substring(fName.indexOf(":") + 1));
                    if (slotF == null) {
                        throw new IOException(fileName + " This is not a valid TSV File. The feature " + ft + " is not created for the layer " + layerName);
                    }
                    features.add(slotF);
                    roleLinks.put(slotF, tType);
                    Type slotType = CasUtil.getType(aJcas.getCas(), ft.substring(ft.indexOf("_") + 1));
                    Feature tFeatore = slotType.getFeatureByBaseName("target");
                    if (tFeatore == null) {
                        throw new IOException(fileName + " This is not a valid TSV File. The feature " + ft + " is not created for the layer " + layerName);
                    }
                    roleTargets.put(tFeatore, tType);
                    features.add(tFeatore);
                    slotLinkTypes.put(slotF, slotType);
                    continue;
                }
                if (feature == null) {
                    throw new IOException(fileName + " This is not a valid TSV File. The feature " + ft + " is not created for the layer " + layerName);
                }
                features.add(feature);
            }
            allLayers.put(layer, features);
            layerMaps.put(layerMaps.size() + 1, layer);
        }
    } catch (Exception e) {
        throw new IOException(e.getMessage() + "\nTSV header:\n" + header, e);
    }
}
Also used : LinkedHashSet(java.util.LinkedHashSet) StringTokenizer(java.util.StringTokenizer) Type(org.apache.uima.cas.Type) IOException(java.io.IOException) Feature(org.apache.uima.cas.Feature) CollectionException(org.apache.uima.collection.CollectionException)
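
The two-level tokenizing in setLayerAndFeature ("#" between layers, "|" between a layer name and its features) can be sketched without any UIMA types. This is an illustration under assumptions: the class name HeaderParseSketch, the helper parseHeader, and the sample header with its layer and feature names are all hypothetical, and the special handling of the BT and ROLE prefixes is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

public class HeaderParseSketch {

    /** Maps each layer name in the header to its feature names, in header order. */
    static Map<String, List<String>> parseHeader(String header) {
        Map<String, List<String>> layers = new LinkedHashMap<>();
        StringTokenizer headerTk = new StringTokenizer(header, "#");
        while (headerTk.hasMoreTokens()) {
            String layerSpec = headerTk.nextToken().trim();
            if (layerSpec.isEmpty()) {
                continue; // skip blank segments between "#" markers
            }
            StringTokenizer layerTk = new StringTokenizer(layerSpec, "|");
            // The first token is "<prefix>=<layer name>"; keep the part after "=".
            String layerName = layerTk.nextToken().trim();
            layerName = layerName.substring(layerName.indexOf('=') + 1);
            List<String> features = new ArrayList<>();
            while (layerTk.hasMoreTokens()) {
                features.add(layerTk.nextToken().trim());
            }
            layers.put(layerName, features);
        }
        return layers;
    }

    public static void main(String[] args) {
        // Hypothetical header line with two layers.
        String header = "#T_SP=example.Pos|PosValue#T_SP=example.Named|value|identifier";
        System.out.println(parseHeader(header));
    }
}
```

A LinkedHashMap keeps the layers in header order, which matters in the reader above because column positions are derived from that order.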

Aggregations

Feature (org.apache.uima.cas.Feature): 84
Type (org.apache.uima.cas.Type): 62
AnnotationFeature (de.tudarmstadt.ukp.clarin.webanno.model.AnnotationFeature): 50
AnnotationFS (org.apache.uima.cas.text.AnnotationFS): 48
ArrayList (java.util.ArrayList): 23
FeatureStructure (org.apache.uima.cas.FeatureStructure): 18
CasUtil.getType (org.apache.uima.fit.util.CasUtil.getType): 18
JCas (org.apache.uima.jcas.JCas): 18
List (java.util.List): 15
Test (org.junit.Test): 14
Token (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token): 13
WebAnnoCasUtil.setFeature (de.tudarmstadt.ukp.clarin.webanno.api.annotation.util.WebAnnoCasUtil.setFeature): 12
POS (de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS): 12
CAS (org.apache.uima.cas.CAS): 10
HashSet (java.util.HashSet): 8
LinkedHashMap (java.util.LinkedHashMap): 8
Map (java.util.Map): 8
HashMap (java.util.HashMap): 7
TypeSystem (org.apache.uima.cas.TypeSystem): 7
AnnotationException (de.tudarmstadt.ukp.clarin.webanno.api.annotation.exception.AnnotationException): 6