
Example 1 with LocusQueryCreator

use of org.molgenis.data.annotation.core.query.LocusQueryCreator in project molgenis by molgenis.

the class CaddAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createCaddAnnotatorAttributes();
    AnnotatorInfo caddInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.PATHOGENICITY_ESTIMATE, NAME, "CADD is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletion variants in the human genome.\n" + "While many variant annotation and scoring utils are around, most annotations tend to exploit a single information type (e.g. conservation) " + "and/or are restricted in scope (e.g. to missense changes). " + "Thus, a broadly applicable metric that objectively weights and integrates diverse information is needed. " + "Combined Annotation Dependent Depletion (CADD) is a framework that integrates multiple " + "annotations into one metric by contrasting variants that survived natural selection with simulated mutations.\n" + "C-scores strongly correlate with allelic diversity, pathogenicity of both coding and non-coding variants, and experimentally measured " + "regulatory effects, and also highly rank causal variants within " + "individual genome sequences. Finally, C-scores of complex trait-associated variants from genome-wide association studies (GWAS) are " + "significantly higher than matched controls and correlate with study sample size, likely reflecting the increased accuracy of larger GWAS.\n" + "CADD can quantitatively prioritize functional, deleterious, and disease causal variants across a wide range of functional categories, " + "effect sizes and genetic architectures and can be used to prioritize " + "causal variation in both research and clinical settings. (source: http://cadd.gs.washington.edu/info)", attributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(CADD_TABIX_RESOURCE, caddInfo, new LocusQueryCreator(vcfAttributes), new MultiAllelicResultFilter(attributes, true, vcfAttributes), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(CaddAnnotatorSettings.Meta.CADD_LOCATION, caddAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createCaddAnnotatorAttributes();
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Attribute(org.molgenis.data.meta.model.Attribute) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) MultiAllelicResultFilter(org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)
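
Every annotator on this page passes a LocusQueryCreator to the framework. Judging by the class name and its vcfAttributes argument, it presumably builds a lookup keyed on a variant's chromosome and position; that reading is an assumption, since the snippets do not show its implementation. A minimal, self-contained sketch of such a locus lookup follows. The Row class and findAtLocus helper are hypothetical and are not molgenis APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in molgenis.
class LocusLookupSketch {

    // Minimal stand-in for one line of an annotation resource.
    static final class Row {
        final String chrom;
        final int pos;
        final String value;

        Row(String chrom, int pos, String value) {
            this.chrom = chrom;
            this.pos = pos;
            this.value = value;
        }
    }

    // Returns every resource row located at exactly the given chromosome and position.
    static List<Row> findAtLocus(List<Row> resource, String chrom, int pos) {
        List<Row> matches = new ArrayList<>();
        for (Row row : resource) {
            if (row.chrom.equals(chrom) && row.pos == pos) {
                matches.add(row);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Row> resource = List.of(
                new Row("1", 100, "a"),
                new Row("1", 100, "b"),
                new Row("2", 100, "c"));
        System.out.println(findAtLocus(resource, "1", 100).size()); // prints 2
    }
}
```

Several rows can share one locus, for example at multi-allelic sites, which may be why the examples pair the query creator with a result filter.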

Example 2 with LocusQueryCreator

use of org.molgenis.data.annotation.core.query.LocusQueryCreator in project molgenis by molgenis.

the class FitConAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createFitconOutputAttributes();
    AnnotatorInfo fitconInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.EFFECT_PREDICTION, NAME, "Summary: Annotating genetic variants, especially non-coding variants, " + "for the purpose of identifying pathogenic variants remains a challenge. " + "Combined annotation-dependent depletion (CADD) is an algorithm designed " + "to annotate both coding and non-coding variants, and has been shown to " + "outperform other annotation algorithms. CADD trains a linear kernel support" + " vector machine (SVM) to differentiate evolutionarily derived, likely benign," + " alleles from simulated, likely deleterious, variants. However, SVMs cannot " + "capture non-linear relationships among the features, which can limit performance. " + "To address this issue, we have developed FITCON. FITCON uses the same feature set and " + "training data as CADD to train a deep neural network (DNN). DNNs can capture non-linear" + " relationships among features and are better suited than SVMs for problems with a " + "large number of samples and features. We exploit Compute Unified Device Architecture-compatible" + " graphics processing units and deep learning techniques such as dropout and momentum training to" + " accelerate the DNN training. FITCON achieves about a 19% relative reduction in the error rate and" + " about a 14% relative increase in the area under the curve (AUC) metric over CADD’s SVM methodology." + " All data and source code are available at https://cbcl.ics.uci.edu/public_data/FITCON/. Contact:", attributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(FITCON_TABIX_RESOURCE, fitconInfo, new LocusQueryCreator(vcfAttributes), new MultiAllelicResultFilter(attributes, vcfAttributes), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(FITCON_LOCATION, fitConAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createFitconOutputAttributes();
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Attribute(org.molgenis.data.meta.model.Attribute) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) MultiAllelicResultFilter(org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)

Example 3 with LocusQueryCreator

use of org.molgenis.data.annotation.core.query.LocusQueryCreator in project molgenis by molgenis.

the class ExacAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createExacOutputAttributes();
    List<Attribute> resourceMetaData = new ArrayList<>(asList(attributeFactory.create().setName(EXAC_AF_ResourceAttributeName).setDataType(STRING), attributeFactory.create().setName(EXAC_AC_HOM_ResourceAttributeName).setDataType(STRING), attributeFactory.create().setName(EXAC_AC_HET_ResourceAttributeName).setDataType(STRING)));
    AnnotatorInfo exacInfo = AnnotatorInfo.create(READY, POPULATION_REFERENCE, "exac", " The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate" + " and harmonize exome sequencing data from a wide variety of large-scale sequencing projects" + ", and to make summary data available for the wider scientific community. The data set provided" + " on this website spans 60,706 unrelated individuals sequenced as part of various " + "disease-specific and population genetic studies. ", attributes);
    // TODO: properly test multiAllelicResultFilter
    LocusQueryCreator locusQueryCreator = new LocusQueryCreator(vcfAttributes);
    MultiAllelicResultFilter multiAllelicResultFilter = new MultiAllelicResultFilter(resourceMetaData, vcfAttributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(EXAC_TABIX_RESOURCE, exacInfo, locusQueryCreator, multiAllelicResultFilter, dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(EXAC_LOCATION, exacAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createExacOutputAttributes();
        }

        @Override
        protected Object getResourceAttributeValue(Attribute attr, Entity sourceEntity) {
            String attrName = EXAC_AF.equals(attr.getName()) ? EXAC_AF_ResourceAttributeName : EXAC_AC_HOM.equals(attr.getName()) ? EXAC_AC_HOM_ResourceAttributeName : EXAC_AC_HET.equals(attr.getName()) ? EXAC_AC_HET_ResourceAttributeName : attr.getName();
            return sourceEntity.get(attrName);
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Entity(org.molgenis.data.Entity) Attribute(org.molgenis.data.meta.model.Attribute) MultiAllelicResultFilter(org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) ArrayList(java.util.ArrayList) Lists.newArrayList(com.google.common.collect.Lists.newArrayList) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)
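
The getResourceAttributeValue override above translates an output attribute name into the corresponding resource attribute name with a chain of ternaries, falling back to the attribute's own name. The same lookup could be expressed as a table. The sketch below is one possible alternative, not the project's code: the class name is hypothetical and the string values are placeholders, since the real constant values are not shown in the snippet.

```java
import java.util.Map;

// Hypothetical illustration only; not part of the molgenis code base.
class ResourceAttributeNames {

    // Output attribute name -> field name in the annotation resource.
    // The concrete strings are placeholders chosen for this example.
    private static final Map<String, String> OUTPUT_TO_RESOURCE = Map.of(
            "EXAC_AF", "AF",
            "EXAC_AC_HOM", "AC_Hom",
            "EXAC_AC_HET", "AC_Het");

    // Falls back to the output name itself when no mapping is registered,
    // mirroring the last branch of the ternary chain.
    static String resolve(String outputAttributeName) {
        return OUTPUT_TO_RESOURCE.getOrDefault(outputAttributeName, outputAttributeName);
    }

    public static void main(String[] args) {
        System.out.println(resolve("EXAC_AF")); // prints AF
        System.out.println(resolve("POS"));     // prints POS
    }
}
```

With a table, adding a further attribute is a one-line change, and the fallback behaviour stays in a single place.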

Example 4 with LocusQueryCreator

use of org.molgenis.data.annotation.core.query.LocusQueryCreator in project molgenis by molgenis.

the class ClinvarAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createClinvarOutputAttributes();
    AnnotatorInfo clinvarInfo = AnnotatorInfo.create(Status.READY, AnnotatorInfo.Type.PATHOGENICITY_ESTIMATE, NAME, " ClinVar is a freely accessible, public archive of reports of the relationships" + " among human variations and phenotypes, with supporting evidence. ClinVar thus facilitates" + " access to and communication about the relationships asserted between human variation and " + "observed health status, and the history of that interpretation. ClinVar collects reports " + "of variants found in patient samples, assertions made regarding their clinical significance, " + "information about the submitter, and other supporting data. The alleles described in submissions " + "are mapped to reference sequences, and reported according to the HGVS standard. ClinVar then " + "presents the data for interactive users as well as those wishing to use ClinVar in daily " + "workflows and other local applications. ClinVar works in collaboration with interested " + "organizations to meet the needs of the medical genetics community as efficiently and effectively " + "as possible. Information about using ClinVar is available at: http://www.ncbi.nlm.nih.gov/clinvar/docs/help/.", attributes);
    LocusQueryCreator locusQueryCreator = new LocusQueryCreator(vcfAttributes);
    ClinvarMultiAllelicResultFilter clinvarMultiAllelicResultFilter = new ClinvarMultiAllelicResultFilter(vcfAttributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(CLINVAR_TABIX_RESOURCE, clinvarInfo, locusQueryCreator, clinvarMultiAllelicResultFilter, dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(CLINVAR_LOCATION, clinvarAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createClinvarOutputAttributes();
        }

        @Override
        protected Object getResourceAttributeValue(Attribute attr, Entity sourceEntity) {
            String attrName;
            if (CLINVAR_CLNSIG.equals(attr.getName())) {
                attrName = CLINVAR_CLNSIG_ResourceAttributeName;
            } else if (CLINVAR_CLNALLE.equals(attr.getName())) {
                attrName = CLINVAR_CLINALL_ResourceAttributeName;
            } else {
                attrName = attr.getName();
            }
            return sourceEntity.get(attrName);
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Entity(org.molgenis.data.Entity) Attribute(org.molgenis.data.meta.model.Attribute) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) ClinvarMultiAllelicResultFilter(org.molgenis.data.annotation.core.filter.ClinvarMultiAllelicResultFilter) AbstractAnnotator(org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory) SingleFileLocationCmdLineAnnotatorSettingsConfigurer(org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)

Example 5 with LocusQueryCreator

use of org.molgenis.data.annotation.core.query.LocusQueryCreator in project molgenis by molgenis.

the class GoNLAnnotator method init.

@Override
public void init() {
    List<Attribute> attributes = createGoNlOutputAttributes();
    AnnotatorInfo thousandGenomeInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.POPULATION_REFERENCE, NAME, "What genetic variation is to be found in the Dutch indigenous population? " + "Detailed knowledge about this is not only interesting in itself, " + "it also helps to extract useful biomedical information from Dutch biobanks. " + "The Dutch biobank collaboration BBMRI-NL has initiated the extensive Rainbow Project “Genome of the Netherlands” (GoNL) " + "because it offers unique opportunities for science and for the development of new treatments and diagnostic techniques. " + "A close-up look at the DNA of 750 Dutch people (250 trios of two parents and an adult child) plus a " + "global genetic profile of large numbers of Dutch will disclose a wealth of new information, new insights, " + "and possible applications.", attributes);
    LocusQueryCreator locusQueryCreator = new LocusQueryCreator(vcfAttributes);
    EntityAnnotator entityAnnotator = new QueryAnnotatorImpl(GONL_MULTI_FILE_RESOURCE, thousandGenomeInfo, locusQueryCreator, dataService, resources, (annotationSourceFileName) -> {
        goNLAnnotatorSettings.set(ROOT_DIRECTORY, annotationSourceFileName);
        goNLAnnotatorSettings.set(FILEPATTERN, "gonl.chr%s.snps_indels.r5.vcf.gz");
        goNLAnnotatorSettings.set(OVERRIDE_CHROMOSOME_FILES, "X:gonl.chrX.release4.gtc.vcf.gz");
        goNLAnnotatorSettings.set(CHROMOSOMES, "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X");
    }) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createGoNlOutputAttributes();
        }

        @Override
        protected void processQueryResults(Entity entity, Iterable<Entity> annotationSourceEntities, boolean updateMode) {
            if (updateMode) {
                throw new MolgenisDataException("This annotator/filter does not support updating of values");
            }
            List<Entity> refMatches = determineRefMatches(entity, annotationSourceEntities);
            setGoNLFrequencies(entity, refMatches);
        }
    };
    annotator.init(entityAnnotator);
}
Also used : LocusQueryCreator(org.molgenis.data.annotation.core.query.LocusQueryCreator) Entity(org.molgenis.data.Entity) MolgenisDataException(org.molgenis.data.MolgenisDataException) Attribute(org.molgenis.data.meta.model.Attribute) EntityAnnotator(org.molgenis.data.annotation.core.entity.EntityAnnotator) QueryAnnotatorImpl(org.molgenis.data.annotation.core.entity.impl.framework.QueryAnnotatorImpl) AnnotatorInfo(org.molgenis.data.annotation.core.entity.AnnotatorInfo) AttributeFactory(org.molgenis.data.meta.model.AttributeFactory)
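
The settings lambda above stores a file pattern containing %s, a chromosome list, and an override entry for chromosome X. How molgenis turns these settings into file names is not shown in the snippet. The sketch below assumes that %s is filled in with the chromosome name and that overrides are comma-separated chromosome:file pairs; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; the override format is an assumption.
class ChromosomeFileNames {

    // Maps each chromosome to a file name: the override if one is given,
    // otherwise the pattern with %s replaced by the chromosome name.
    static Map<String, String> resolve(String chromosomes, String filePattern, String overrides) {
        Map<String, String> overrideByChromosome = new LinkedHashMap<>();
        if (overrides != null && !overrides.isEmpty()) {
            for (String entry : overrides.split(",")) {
                String[] parts = entry.split(":", 2);
                if (parts.length != 2) {
                    throw new IllegalArgumentException("Expected chromosome:file but got " + entry);
                }
                overrideByChromosome.put(parts[0].trim(), parts[1].trim());
            }
        }

        Map<String, String> fileByChromosome = new LinkedHashMap<>();
        for (String chromosome : chromosomes.split(",")) {
            String name = chromosome.trim();
            String override = overrideByChromosome.get(name);
            fileByChromosome.put(name, override != null ? override : String.format(filePattern, name));
        }
        return fileByChromosome;
    }

    public static void main(String[] args) {
        Map<String, String> files = resolve(
                "1,2,X",
                "gonl.chr%s.snps_indels.r5.vcf.gz",
                "X:gonl.chrX.release4.gtc.vcf.gz");
        // Print one line per chromosome in the order given.
        files.forEach((chromosome, file) -> System.out.println(chromosome + " -> " + file));
    }
}
```

Under these assumptions chromosome 1 resolves to gonl.chr1.snps_indels.r5.vcf.gz, while chromosome X resolves to the release 4 file named in the override.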

Aggregations

AnnotatorInfo (org.molgenis.data.annotation.core.entity.AnnotatorInfo) 7
EntityAnnotator (org.molgenis.data.annotation.core.entity.EntityAnnotator) 7
LocusQueryCreator (org.molgenis.data.annotation.core.query.LocusQueryCreator) 7
Attribute (org.molgenis.data.meta.model.Attribute) 7
AttributeFactory (org.molgenis.data.meta.model.AttributeFactory) 7
AbstractAnnotator (org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator) 6
MultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter) 5
SingleFileLocationCmdLineAnnotatorSettingsConfigurer (org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer) 5
Entity (org.molgenis.data.Entity) 4
Lists.newArrayList (com.google.common.collect.Lists.newArrayList) 1
ArrayList (java.util.ArrayList) 1
MolgenisDataException (org.molgenis.data.MolgenisDataException) 1
QueryAnnotatorImpl (org.molgenis.data.annotation.core.entity.impl.framework.QueryAnnotatorImpl) 1
ClinvarMultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.ClinvarMultiAllelicResultFilter) 1