Search in sources :

Example 1 with EntityAnnotator

use of org.molgenis.data.annotation.core.entity.EntityAnnotator in project molgenis by molgenis.

the class HPOAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createHpoOutputAttributes();
    AnnotatorInfo info = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.PHENOTYPE_ASSOCIATION, NAME, "The Human Phenotype Ontology (HPO) aims to provide a standardized vocabulary of phenotypic abnormalities encountered in human disease." + "Terms in the HPO describes a phenotypic abnormality, such as atrial septal defect.The HPO is currently being developed using the medical literature, Orphanet, DECIPHER, and OMIM. HPO currently contains approximately 11,000 terms and over 115,000 annotations to hereditary diseases." + "Please note that if SnpEff was used to annotate in order to add the gene symbols to the variants, than this annotator should be used on the result entity rather than the variant entity itself.", attributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(HPO_RESOURCE, info, geneNameQueryCreator, new HpoResultFilter(entityTypeFactory, this), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(HPO_LOCATION, HPOAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createHpoOutputAttributes();
        }
    };
    annotator.init(entityAnnotator);
}
Also used : Attribute(org.molgenis.data.meta.model.Attribute) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)

Example 2 with EntityAnnotator

use of org.molgenis.data.annotation.core.entity.EntityAnnotator in project molgenis by molgenis.

the class CaddAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createCaddAnnotatorAttributes();
    AnnotatorInfo caddInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.PATHOGENICITY_ESTIMATE, NAME, "CADD is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletions variants in the human genome.\n" + "While many variant annotation and scoring utils are around, most annotations tend to exploit a single information type (e.g. conservation) " + "and/or are restricted in scope (e.g. to missense changes). " + "Thus, a broadly applicable metric that objectively weights and integrates diverse information is needed. " + "Combined Annotation Dependent Depletion (CADD) is a framework that integrates multiple " + "annotations into one metric by contrasting variants that survived natural selection with simulated mutations.\n" + "C-scores strongly correlate with allelic diversity, pathogenicity of both coding and non-coding variants, and experimentally measured " + "regulatory effects, and also highly rank causal variants within " + "individual genome sequences. Finally, C-scores of complex trait-associated variants from genome-wide association studies (GWAS) are " + "significantly higher than matched controls and correlate with study sample size, likely reflecting the increased accuracy of larger GWAS.\n" + "CADD can quantitatively prioritize functional, deleterious, and disease causal variants across a wide range of functional categories, " + "effect sizes and genetic architectures and can be used prioritize " + "causal variation in both research and clinical settings. (source: http://cadd.gs.washington.edu/info)", attributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(CADD_TABIX_RESOURCE, caddInfo, new LocusQueryCreator(vcfAttributes), new MultiAllelicResultFilter(attributes, true, vcfAttributes), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(CaddAnnotatorSettings.Meta.CADD_LOCATION, caddAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createCaddAnnotatorAttributes();
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Attribute(org.molgenis.data.meta.model.Attribute) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) MultiAllelicResultFilter(org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)

Example 3 with EntityAnnotator

use of org.molgenis.data.annotation.core.entity.EntityAnnotator in project molgenis by molgenis.

the class FitConAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createFitconOutputAttributes();
    AnnotatorInfo fitconInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.EFFECT_PREDICTION, NAME, "Summary: Annotating genetic variants, especially non-coding variants, " + "for the purpose of identifying pathogenic variants remains a challenge. " + "Combined annotation-dependent depletion (CADD) is an al- gorithm designed " + "to annotate both coding and non-coding variants, and has been shown to " + "outper- form other annotation algorithms. CADD trains a linear kernel support" + " vector machine (SVM) to dif- ferentiate evolutionarily derived, likely benign," + " alleles from simulated, likely deleterious, variants. However, SVMs cannot " + "capture non-linear relationships among the features, which can limit per- formance. " + "To address this issue, we have developed FITCON. FITCON uses the same feature set and " + "training data as CADD to train a deep neural network (DNN). DNNs can capture non-linear" + " relation- ships among features and are better suited than SVMs for problems with a " + "large number of samples and features. We exploit Compute Unified Device Architecture-compatible" + " graphics processing units and deep learning techniques such as dropout and momentum training to" + " accelerate the DNN train- ing. FITCON achieves about a 19%relative reduction in the error rate and" + " about a 14%relative increase in the area under the curve (AUC) metric over CADD’s SVMmethodology." + " All data and source code are available at https://cbcl.ics.uci.edu/ public_data/FITCON/. Contact:", attributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(FITCON_TABIX_RESOURCE, fitconInfo, new LocusQueryCreator(vcfAttributes), new MultiAllelicResultFilter(attributes, vcfAttributes), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(FITCON_LOCATION, fitConAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createFitconOutputAttributes();
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Attribute(org.molgenis.data.meta.model.Attribute) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) MultiAllelicResultFilter(org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)

Example 4 with EntityAnnotator

use of org.molgenis.data.annotation.core.entity.EntityAnnotator in project molgenis by molgenis.

the class GavinAnnotator method init.

public void init() {
    LinkedList<Attribute> attributes = createGavinOutputAttributes();
    String description = "Please note that this annotator processes the results from a SnpEff annotation\nTherefor it should be used on the result entity rather than the variant entity itself.\nThe corresponding variant entity should also be annotated with CADD and EXaC";
    AnnotatorInfo gavinInfo = AnnotatorInfo.create(READY, PATHOGENICITY_ESTIMATE, NAME, description, attributes);
    EntityAnnotator entityAnnotator = new QueryAnnotatorImpl(RESOURCE, gavinInfo, geneNameQueryCreator, dataService, resources, (annotationSourceFileName) -> gavinAnnotatorSettings.set(VARIANT_FILE_LOCATION, annotationSourceFileName)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createGavinOutputAttributes();
        }

        @Override
        public List<Attribute> getRequiredAttributes() {
            List<Attribute> requiredAttributes = new ArrayList<>();
            EntityType entityType = entityTypeFactory.create(VARIANT);
            List<Attribute> refAttributesList = Arrays.asList(createCaddScaledAttr(attributeFactory), getExacAFAttr(attributeFactory), vcfAttributes.getAltAttribute());
            entityType.addAttributes(refAttributesList);
            Attribute refAttr = attributeFactory.create().setName(VARIANT).setDataType(XREF).setRefEntity(entityType).setDescription("This annotator needs a references to an entity containing: " + StreamSupport.stream(refAttributesList.spliterator(), false).map(Attribute::getName).collect(Collectors.joining(", ")));
            requiredAttributes.addAll(Arrays.asList(effectsMetaData.getGeneNameAttr(), effectsMetaData.getPutativeImpactAttr(), refAttr, effectsMetaData.getAltAttr()));
            return requiredAttributes;
        }

        @Override
        protected void processQueryResults(Entity entity, Iterable<Entity> annotationSourceEntities, boolean updateMode) {
            if (updateMode) {
                throw new MolgenisDataException("This annotator/filter does not support updating of values");
            }
            String alt = entity.getString(EffectsMetaData.ALT);
            if (alt == null) {
                entity.set(CLASSIFICATION, "");
                entity.set(CONFIDENCE, "");
                entity.set(REASON, "Missing ALT allele no judgment could be determined.");
                return;
            }
            if (alt.contains(",")) {
                throw new MolgenisDataException("The gavin annotator only accepts single allele input ('effect entities').");
            }
            int sourceEntitiesSize = Iterables.size(annotationSourceEntities);
            Entity variantEntity = entity.getEntity(VARIANT);
            Map<String, Double> caddMap = AnnotatorUtils.toAlleleMap(variantEntity.getString(ALT), variantEntity.getString(CADD_SCALED));
            Map<String, Double> exacMap = AnnotatorUtils.toAlleleMap(variantEntity.getString(ALT), variantEntity.getString(EXAC_AF));
            Impact impact = Impact.valueOf(entity.getString(PUTATIVE_IMPACT));
            Double exacMAF = exacMap.get(alt);
            Double caddScaled = caddMap.get(alt);
            String gene = entity.getString(GENE_NAME);
            if (exacMAF == null) {
                exacMAF = 0.0;
            }
            if (sourceEntitiesSize == 1) {
                Entity annotationSourceEntity = annotationSourceEntities.iterator().next();
                Judgment judgment = gavinAlgorithm.classifyVariant(impact, caddScaled, exacMAF, gene, GavinThresholds.fromEntity(annotationSourceEntity));
                entity.set(CLASSIFICATION, judgment.getClassification().toString());
                entity.set(CONFIDENCE, judgment.getConfidence().toString());
                entity.set(REASON, judgment.getReason());
            } else if (sourceEntitiesSize == 0) {
                // if we have no data for this gene, immediately fall back to the naive method
                Judgment judgment = gavinAlgorithm.genomewideClassifyVariant(impact, caddScaled, exacMAF, gene);
                entity.set(CLASSIFICATION, judgment.getClassification().toString());
                entity.set(CONFIDENCE, judgment.getConfidence().toString());
                entity.set(REASON, judgment.getReason());
            } else {
                String message = "invalid number [" + sourceEntitiesSize + "] of results for this gene in annotation resource";
                entity.set(REASON, message);
                throw new MolgenisDataException(message);
            }
        }
    };
    annotator.init(entityAnnotator);
}
Also used : Impact(org.molgenis.data.annotation.core.entity.impl.snpeff.Impact) Entity(org.molgenis.data.Entity) QueryAnnotatorImpl(org.molgenis.data.annotation.core.entity.impl.framework.QueryAnnotatorImpl) MolgenisDataException(org.molgenis.data.MolgenisDataException) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo)

Example 5 with EntityAnnotator

use of org.molgenis.data.annotation.core.entity.EntityAnnotator in project molgenis by molgenis.

the class ExacAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createExacOutputAttributes();
    List<Attribute> resourceMetaData = new ArrayList<>(asList(attributeFactory.create().setName(EXAC_AF_ResourceAttributeName).setDataType(STRING), attributeFactory.create().setName(EXAC_AC_HOM_ResourceAttributeName).setDataType(STRING), attributeFactory.create().setName(EXAC_AC_HET_ResourceAttributeName).setDataType(STRING)));
    AnnotatorInfo exacInfo = AnnotatorInfo.create(READY, POPULATION_REFERENCE, "exac", " The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate" + " and harmonize exome sequencing data from a wide variety of large-scale sequencing projects" + ", and to make summary data available for the wider scientific community.The data set provided" + " on this website spans 60,706 unrelated individuals sequenced as part of various " + "disease-specific and population genetic studies. ", attributes);
    // TODO: properly test multiAllelicFresultFilter
    LocusQueryCreator locusQueryCreator = new LocusQueryCreator(vcfAttributes);
    MultiAllelicResultFilter multiAllelicResultFilter = new MultiAllelicResultFilter(resourceMetaData, vcfAttributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(EXAC_TABIX_RESOURCE, exacInfo, locusQueryCreator, multiAllelicResultFilter, dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(EXAC_LOCATION, exacAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createExacOutputAttributes();
        }

        @Override
        protected Object getResourceAttributeValue(Attribute attr, Entity sourceEntity) {
            String attrName = EXAC_AF.equals(attr.getName()) ? EXAC_AF_ResourceAttributeName : EXAC_AC_HOM.equals(attr.getName()) ? EXAC_AC_HOM_ResourceAttributeName : EXAC_AC_HET.equals(attr.getName()) ? EXAC_AC_HET_ResourceAttributeName : attr.getName();
            return sourceEntity.get(attrName);
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Entity(org.molgenis.data.Entity) Attribute(org.molgenis.data.meta.model.Attribute) MultiAllelicResultFilter(org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) ArrayList(java.util.ArrayList) Lists.newArrayList(com.google.common.collect.Lists.newArrayList) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)

Aggregations

AnnotatorInfo (org.molgenis.data.annotation.core.entity.AnnotatorInfo)10 EntityAnnotator (org.molgenis.data.annotation.core.entity.EntityAnnotator)10 Attribute (org.molgenis.data.meta.model.Attribute)9 AttributeFactory (org.molgenis.data.meta.model.AttributeFactory)9 AbstractAnnotator (org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator)8 LocusQueryCreator (org.molgenis.data.annotation.core.query.LocusQueryCreator)7 SingleFileLocationCmdLineAnnotatorSettingsConfigurer (org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)7 Entity (org.molgenis.data.Entity)5 MultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter)5 MolgenisDataException (org.molgenis.data.MolgenisDataException)2 QueryAnnotatorImpl (org.molgenis.data.annotation.core.entity.impl.framework.QueryAnnotatorImpl)2 Lists.newArrayList (com.google.common.collect.Lists.newArrayList)1 ArrayList (java.util.ArrayList)1 Impact (org.molgenis.data.annotation.core.entity.impl.snpeff.Impact)1 ClinvarMultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.ClinvarMultiAllelicResultFilter)1