Example 6 with LocusQueryCreator

Use of org.molgenis.data.annotation.core.query.LocusQueryCreator in project molgenis by molgenis.

The class DannAnnotator, method init.

@Override
public void init() {
    List<Attribute> attributes = createDannOutputAttributes();
    AnnotatorInfo dannInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.PATHOGENICITY_ESTIMATE, NAME,
            "Annotating genetic variants, especially non-coding variants, "
            + "for the purpose of identifying pathogenic variants remains a challenge."
            + " Combined annotation-dependent depletion (CADD) is an algorithm designed "
            + "to annotate both coding and non-coding variants, and has been shown to outperform "
            + "other annotation algorithms. CADD trains a linear kernel support vector machine (SVM) "
            + "to differentiate evolutionarily derived, likely benign, alleles from simulated, "
            + "likely deleterious, variants. However, SVMs cannot capture non-linear relationships"
            + " among the features, which can limit performance. To address this issue, we have"
            + " developed DANN. DANN uses the same feature set and training data as CADD to train"
            + " a deep neural network (DNN). DNNs can capture non-linear relationships among "
            + "features and are better suited than SVMs for problems with a large number of samples "
            + "and features. We exploit Compute Unified Device Architecture-compatible "
            + "graphics processing units and deep learning techniques such as dropout and momentum "
            + "training to accelerate the DNN training. DANN achieves about a 19% relative reduction "
            + "in the error rate and about a 14% relative increase in the area under the curve (AUC) metric "
            + "over CADD’s SVM methodology. "
            + "All data and source code are available at https://cbcl.ics.uci.edu/public_data/DANN/.",
            attributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(
            DANN_TABIX_RESOURCE,
            dannInfo,
            new LocusQueryCreator(vcfAttributes),
            new MultiAllelicResultFilter(attributes, vcfAttributes),
            dataService,
            resources,
            new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(DANN_LOCATION, dannAnnotatorSettings)) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createDannOutputAttributes();
        }
    };
    annotator.init(entityAnnotator);
}
Also used: LocusQueryCreator (org.molgenis.data.annotation.core.query.LocusQueryCreator), Attribute (org.molgenis.data.meta.model.Attribute), EntityAnnotator (org.molgenis.data.annotation.core.entity.EntityAnnotator), MultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter), AbstractAnnotator (org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator), AnnotatorInfo (org.molgenis.data.annotation.core.entity.AnnotatorInfo), AttributeFactory (org.molgenis.data.meta.model.AttributeFactory), SingleFileLocationCmdLineAnnotatorSettingsConfigurer (org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer)
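
The name LocusQueryCreator suggests that the annotator looks up resource rows by the locus of each variant, that is, by chromosome and position. The following is only a minimal sketch under that assumption. The Row class and the findAtLocus helper are hypothetical and are not part of the MOLGENIS API; the real class builds a query against the data service rather than scanning a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocusLookupSketch {

    // Hypothetical stand-in for one row of an annotation resource.
    static final class Row {
        final String chrom;
        final long pos;
        final String value;

        Row(String chrom, long pos, String value) {
            this.chrom = chrom;
            this.pos = pos;
            this.value = value;
        }
    }

    // Hypothetical helper: return every row located at the given chromosome and position.
    static List<Row> findAtLocus(List<Row> resource, String chrom, long pos) {
        List<Row> hits = new ArrayList<>();
        for (Row row : resource) {
            if (row.chrom.equals(chrom) && row.pos == pos) {
                hits.add(row);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Row> resource = Arrays.asList(
                new Row("1", 100L, "0.91"),
                new Row("1", 200L, "0.12"),
                new Row("2", 100L, "0.55"));
        for (Row hit : findAtLocus(resource, "1", 100L)) {
            System.out.println(hit.chrom + ":" + hit.pos + " -> " + hit.value);
        }
    }
}
```

Several rows can share one locus when a site has more than one alternative allele, which is presumably why the examples pair the query creator with a MultiAllelicResultFilter.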

Example 7 with LocusQueryCreator

Use of org.molgenis.data.annotation.core.query.LocusQueryCreator in project molgenis by molgenis.

The class ThousandGenomesAnnotator, method init.

@Override
public void init() {
    List<Attribute> attributes = createThousandGenomesOutputAttributes();
    AnnotatorInfo thousandGenomeInfo = AnnotatorInfo.create(Status.READY, AnnotatorInfo.Type.POPULATION_REFERENCE, NAME,
            "The 1000 Genomes Project is an international collaboration to produce an "
            + "extensive public catalog of human genetic variation, including SNPs and structural variants, "
            + "and their haplotype contexts. This resource will support genome-wide association studies and other "
            + "medical research studies. "
            // A trailing space was missing here, which rendered as "will besequenced".
            + "The genomes of about 2500 unidentified people from about 25 populations around the world will be "
            + "sequenced using next-generation sequencing technologies. "
            + "The results of the study will be freely and publicly accessible to researchers worldwide. "
            + "Further information about the project is available in the About tab. Information about downloading, "
            + "browsing or using the 1000 Genomes data is available at: http://www.1000genomes.org/ ",
            attributes);
    LocusQueryCreator locusQueryCreator = new LocusQueryCreator(vcfAttributes);
    MultiAllelicResultFilter multiAllelicResultFilter = new MultiAllelicResultFilter(singletonList(attributeFactory.create().setName(THOUSAND_GENOME_AF_RESOURCE_ATTRIBUTE_NAME).setDataType(DECIMAL)), vcfAttributes);
    EntityAnnotator entityAnnotator = new AbstractAnnotator(THOUSAND_GENOME_MULTI_FILE_RESOURCE, thousandGenomeInfo, locusQueryCreator, multiAllelicResultFilter, dataService, resources, (annotationSourceFileName) -> {
        thousendGenomesAnnotatorSettings.set(ROOT_DIRECTORY, annotationSourceFileName);
        thousendGenomesAnnotatorSettings.set(FILEPATTERN, "ALL.chr%s.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz");
        thousendGenomesAnnotatorSettings.set(CHROMOSOMES, "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22");
    }) {

        @Override
        public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
            return createThousandGenomesOutputAttributes();
        }

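        @Override
        protected Object getResourceAttributeValue(Attribute attr, Entity entityType) {
            // The allele-frequency output attribute has a different name in the resource file,
            // so map it to the resource attribute name before reading the value.
            String attrName = THOUSAND_GENOME_AF.equals(attr.getName()) ? THOUSAND_GENOME_AF_RESOURCE_ATTRIBUTE_NAME : attr.getName();
            return entityType.get(attrName);
        }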
    };
    annotator.init(entityAnnotator);
}
Also used: LocusQueryCreator (org.molgenis.data.annotation.core.query.LocusQueryCreator), Entity (org.molgenis.data.Entity), Attribute (org.molgenis.data.meta.model.Attribute), MultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter), EntityAnnotator (org.molgenis.data.annotation.core.entity.EntityAnnotator), AbstractAnnotator (org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator), AnnotatorInfo (org.molgenis.data.annotation.core.entity.AnnotatorInfo), AttributeFactory (org.molgenis.data.meta.model.AttributeFactory)
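
The settings configurer in this example stores a root directory, a file pattern containing a single %s placeholder, and a comma-separated chromosome list. The sketch below shows how such settings could expand into one file path per chromosome. The expand helper and the /data/1000g directory are hypothetical; how MOLGENIS itself resolves the pattern is not shown in the example, so this is an assumption about the intent and not the project's implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class ChromosomeFilePatternSketch {

    // Values copied from the settings configurer in the example.
    static final String FILE_PATTERN =
            "ALL.chr%s.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz";
    static final String CHROMOSOMES =
            "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22";

    // Hypothetical helper: substitute each chromosome into the pattern
    // and prefix the result with the root directory.
    static List<String> expand(String rootDirectory, String pattern, String chromosomes) {
        List<String> paths = new ArrayList<>();
        for (String chromosome : chromosomes.split(",")) {
            paths.add(rootDirectory + "/" + String.format(pattern, chromosome.trim()));
        }
        return paths;
    }

    public static void main(String[] args) {
        // "/data/1000g" is a made-up root directory used only for illustration.
        for (String path : expand("/data/1000g", FILE_PATTERN, CHROMOSOMES)) {
            System.out.println(path);
        }
    }
}
```

Keeping the pattern and the chromosome list as separate settings means a single root directory is enough to locate all 22 autosomal VCF files.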

Aggregations

AnnotatorInfo (org.molgenis.data.annotation.core.entity.AnnotatorInfo): 7
EntityAnnotator (org.molgenis.data.annotation.core.entity.EntityAnnotator): 7
LocusQueryCreator (org.molgenis.data.annotation.core.query.LocusQueryCreator): 7
Attribute (org.molgenis.data.meta.model.Attribute): 7
AttributeFactory (org.molgenis.data.meta.model.AttributeFactory): 7
AbstractAnnotator (org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator): 6
MultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.MultiAllelicResultFilter): 5
SingleFileLocationCmdLineAnnotatorSettingsConfigurer (org.molgenis.data.annotation.web.settings.SingleFileLocationCmdLineAnnotatorSettingsConfigurer): 5
Entity (org.molgenis.data.Entity): 4
Lists.newArrayList (com.google.common.collect.Lists.newArrayList): 1
ArrayList (java.util.ArrayList): 1
MolgenisDataException (org.molgenis.data.MolgenisDataException): 1
QueryAnnotatorImpl (org.molgenis.data.annotation.core.entity.impl.framework.QueryAnnotatorImpl): 1
ClinvarMultiAllelicResultFilter (org.molgenis.data.annotation.core.filter.ClinvarMultiAllelicResultFilter): 1