use of org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator in project molgenis by molgenis.
the class HPOAnnotator method init.
@Override
public void init() {
List<Attribute> attributes = createHpoOutputAttributes();
AnnotatorInfo info = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.PHENOTYPE_ASSOCIATION, NAME, "The Human Phenotype Ontology (HPO) aims to provide a standardized vocabulary of phenotypic abnormalities encountered in human disease. " + "Each term in the HPO describes a phenotypic abnormality, such as atrial septal defect. The HPO is currently being developed using the medical literature, Orphanet, DECIPHER, and OMIM. HPO currently contains approximately 11,000 terms and over 115,000 annotations to hereditary diseases. " + "Please note that if SnpEff was used to annotate in order to add the gene symbols to the variants, then this annotator should be used on the result entity rather than the variant entity itself.", attributes);
EntityAnnotator entityAnnotator = new AbstractAnnotator(HPO_RESOURCE, info, geneNameQueryCreator, new HpoResultFilter(entityTypeFactory, this), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(HPO_LOCATION, HPOAnnotatorSettings)) {
@Override
public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
return createHpoOutputAttributes();
}
};
annotator.init(entityAnnotator);
}
use of org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator in project molgenis by molgenis.
the class CaddAnnotator method init.
@Override
public void init() {
List<Attribute> attributes = createCaddAnnotatorAttributes();
AnnotatorInfo caddInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.PATHOGENICITY_ESTIMATE, NAME, "CADD is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletion variants in the human genome.\n" + "While many variant annotation and scoring utils are around, most annotations tend to exploit a single information type (e.g. conservation) " + "and/or are restricted in scope (e.g. to missense changes). " + "Thus, a broadly applicable metric that objectively weights and integrates diverse information is needed. " + "Combined Annotation Dependent Depletion (CADD) is a framework that integrates multiple " + "annotations into one metric by contrasting variants that survived natural selection with simulated mutations.\n" + "C-scores strongly correlate with allelic diversity, pathogenicity of both coding and non-coding variants, and experimentally measured " + "regulatory effects, and also highly rank causal variants within " + "individual genome sequences. Finally, C-scores of complex trait-associated variants from genome-wide association studies (GWAS) are " + "significantly higher than matched controls and correlate with study sample size, likely reflecting the increased accuracy of larger GWAS.\n" + "CADD can quantitatively prioritize functional, deleterious, and disease causal variants across a wide range of functional categories, " + "effect sizes and genetic architectures and can be used to prioritize " + "causal variation in both research and clinical settings. (source: http://cadd.gs.washington.edu/info)", attributes);
EntityAnnotator entityAnnotator = new AbstractAnnotator(CADD_TABIX_RESOURCE, caddInfo, new LocusQueryCreator(vcfAttributes), new MultiAllelicResultFilter(attributes, true, vcfAttributes), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(CaddAnnotatorSettings.Meta.CADD_LOCATION, caddAnnotatorSettings)) {
@Override
public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
return createCaddAnnotatorAttributes();
}
};
annotator.init(entityAnnotator);
}
use of org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator in project molgenis by molgenis.
the class FitConAnnotator method init.
@Override
public void init() {
List<Attribute> attributes = createFitconOutputAttributes();
AnnotatorInfo fitconInfo = AnnotatorInfo.create(AnnotatorInfo.Status.READY, AnnotatorInfo.Type.EFFECT_PREDICTION, NAME, "Summary: Annotating genetic variants, especially non-coding variants, " + "for the purpose of identifying pathogenic variants remains a challenge. " + "Combined annotation-dependent depletion (CADD) is an algorithm designed " + "to annotate both coding and non-coding variants, and has been shown to " + "outperform other annotation algorithms. CADD trains a linear kernel support" + " vector machine (SVM) to differentiate evolutionarily derived, likely benign," + " alleles from simulated, likely deleterious, variants. However, SVMs cannot " + "capture non-linear relationships among the features, which can limit performance. " + "To address this issue, we have developed FITCON. FITCON uses the same feature set and " + "training data as CADD to train a deep neural network (DNN). DNNs can capture non-linear" + " relationships among features and are better suited than SVMs for problems with a " + "large number of samples and features. We exploit Compute Unified Device Architecture-compatible" + " graphics processing units and deep learning techniques such as dropout and momentum training to" + " accelerate the DNN training. FITCON achieves about a 19% relative reduction in the error rate and" + " about a 14% relative increase in the area under the curve (AUC) metric over CADD’s SVM methodology." + " All data and source code are available at https://cbcl.ics.uci.edu/public_data/FITCON/. Contact:", attributes);
EntityAnnotator entityAnnotator = new AbstractAnnotator(FITCON_TABIX_RESOURCE, fitconInfo, new LocusQueryCreator(vcfAttributes), new MultiAllelicResultFilter(attributes, vcfAttributes), dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(FITCON_LOCATION, fitConAnnotatorSettings)) {
@Override
public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
return createFitconOutputAttributes();
}
};
annotator.init(entityAnnotator);
}
use of org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator in project molgenis by molgenis.
the class ExacAnnotator method init.
@Override
public void init() {
List<Attribute> attributes = createExacOutputAttributes();
List<Attribute> resourceMetaData = new ArrayList<>(asList(attributeFactory.create().setName(EXAC_AF_ResourceAttributeName).setDataType(STRING), attributeFactory.create().setName(EXAC_AC_HOM_ResourceAttributeName).setDataType(STRING), attributeFactory.create().setName(EXAC_AC_HET_ResourceAttributeName).setDataType(STRING)));
AnnotatorInfo exacInfo = AnnotatorInfo.create(READY, POPULATION_REFERENCE, "exac", " The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate" + " and harmonize exome sequencing data from a wide variety of large-scale sequencing projects" + ", and to make summary data available for the wider scientific community. The data set provided" + " on this website spans 60,706 unrelated individuals sequenced as part of various " + "disease-specific and population genetic studies. ", attributes);
// TODO: properly test MultiAllelicResultFilter
LocusQueryCreator locusQueryCreator = new LocusQueryCreator(vcfAttributes);
MultiAllelicResultFilter multiAllelicResultFilter = new MultiAllelicResultFilter(resourceMetaData, vcfAttributes);
EntityAnnotator entityAnnotator = new AbstractAnnotator(EXAC_TABIX_RESOURCE, exacInfo, locusQueryCreator, multiAllelicResultFilter, dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(EXAC_LOCATION, exacAnnotatorSettings)) {
@Override
public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
return createExacOutputAttributes();
}
@Override
protected Object getResourceAttributeValue(Attribute attr, Entity sourceEntity) {
String attrName = EXAC_AF.equals(attr.getName()) ? EXAC_AF_ResourceAttributeName : EXAC_AC_HOM.equals(attr.getName()) ? EXAC_AC_HOM_ResourceAttributeName : EXAC_AC_HET.equals(attr.getName()) ? EXAC_AC_HET_ResourceAttributeName : attr.getName();
return sourceEntity.get(attrName);
}
};
annotator.init(entityAnnotator);
}
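The nested conditional in getResourceAttributeValue above maps an output attribute name to the name used in the resource file, and falls back to the attribute's own name when no mapping exists. The same lookup can be sketched with a map. This is only an illustration: the class name, the method name and all string values below are hypothetical, since the real constant values in ExacAnnotator are not shown here.

```java
import java.util.Map;

public class ResourceAttributeNames {
    // Hypothetical values for illustration; the real constants are defined in ExacAnnotator.
    static final String EXAC_AF = "EXAC_AF";
    static final String EXAC_AC_HOM = "EXAC_AC_HOM";
    static final String EXAC_AC_HET = "EXAC_AC_HET";

    // Output attribute name -> name of the corresponding field in the resource (assumed values).
    private static final Map<String, String> OUTPUT_TO_RESOURCE = Map.of(
            EXAC_AF, "AF",
            EXAC_AC_HOM, "AC_Hom",
            EXAC_AC_HET, "AC_Het");

    /** Returns the resource-side name, or the name itself when no mapping is defined. */
    static String resourceNameFor(String outputAttributeName) {
        return OUTPUT_TO_RESOURCE.getOrDefault(outputAttributeName, outputAttributeName);
    }

    public static void main(String[] args) {
        System.out.println(resourceNameFor(EXAC_AF)); // mapped name
        System.out.println(resourceNameFor("CHROM")); // unmapped, returned unchanged
    }
}
```

A map keeps the fallback behaviour of the final ternary branch while making it easier to add further attributes.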
use of org.molgenis.data.annotation.core.entity.impl.framework.AbstractAnnotator in project molgenis by molgenis.
the class ClinvarAnnotator method init.
@Override
public void init() {
List<Attribute> attributes = createClinvarOutputAttributes();
AnnotatorInfo clinvarInfo = AnnotatorInfo.create(Status.READY, AnnotatorInfo.Type.PATHOGENICITY_ESTIMATE, NAME, " ClinVar is a freely accessible, public archive of reports of the relationships" + " among human variations and phenotypes, with supporting evidence. ClinVar thus facilitates" + " access to and communication about the relationships asserted between human variation and " + "observed health status, and the history of that interpretation. ClinVar collects reports " + "of variants found in patient samples, assertions made regarding their clinical significance, " + "information about the submitter, and other supporting data. The alleles described in submissions " + "are mapped to reference sequences, and reported according to the HGVS standard. ClinVar then " + "presents the data for interactive users as well as those wishing to use ClinVar in daily " + "workflows and other local applications. ClinVar works in collaboration with interested " + "organizations to meet the needs of the medical genetics community as efficiently and effectively " + "as possible. Information about using ClinVar is available at: http://www.ncbi.nlm.nih.gov/clinvar/docs/help/.", attributes);
LocusQueryCreator locusQueryCreator = new LocusQueryCreator(vcfAttributes);
ClinvarMultiAllelicResultFilter clinvarMultiAllelicResultFilter = new ClinvarMultiAllelicResultFilter(vcfAttributes);
EntityAnnotator entityAnnotator = new AbstractAnnotator(CLINVAR_TABIX_RESOURCE, clinvarInfo, locusQueryCreator, clinvarMultiAllelicResultFilter, dataService, resources, new SingleFileLocationCmdLineAnnotatorSettingsConfigurer(CLINVAR_LOCATION, clinvarAnnotatorSettings)) {
@Override
public List<Attribute> createAnnotatorAttributes(AttributeFactory attributeFactory) {
return createClinvarOutputAttributes();
}
@Override
protected Object getResourceAttributeValue(Attribute attr, Entity sourceEntity) {
String attrName;
if (CLINVAR_CLNSIG.equals(attr.getName())) {
attrName = CLINVAR_CLNSIG_ResourceAttributeName;
} else if (CLINVAR_CLNALLE.equals(attr.getName())) {
attrName = CLINVAR_CLINALL_ResourceAttributeName;
} else {
attrName = attr.getName();
}
return sourceEntity.get(attrName);
}
};
annotator.init(entityAnnotator);
}
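All five init() methods above share one shape: build the collaborators, then create an anonymous subclass of AbstractAnnotator that overrides createAnnotatorAttributes and, for ExAC and ClinVar, getResourceAttributeValue. The sketch below isolates that pattern with hypothetical stand-in types. BaseAnnotator, its methods and the attribute names are assumptions made for illustration and are not the molgenis API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AnonymousAnnotatorSketch {

    /** Hypothetical stand-in for AbstractAnnotator: a fixed workflow with overridable hooks. */
    abstract static class BaseAnnotator {
        private final String resourceName;

        BaseAnnotator(String resourceName) {
            this.resourceName = resourceName;
        }

        String getResourceName() {
            return resourceName;
        }

        /** Hook: the attributes this annotator adds to an entity. */
        abstract List<String> createAnnotatorAttributes();

        /** Hook with a default: read the value stored under the same name in the source. */
        protected Object getResourceAttributeValue(String attrName, Map<String, Object> source) {
            return source.get(attrName);
        }

        /** Fixed workflow: copy each output attribute from the resource record. */
        Map<String, Object> annotate(Map<String, Object> source) {
            Map<String, Object> result = new HashMap<>();
            for (String attr : createAnnotatorAttributes()) {
                result.put(attr, getResourceAttributeValue(attr, source));
            }
            return result;
        }
    }

    /** Mirrors the init() methods: configure via constructor, customise via anonymous subclass. */
    static BaseAnnotator createExampleAnnotator() {
        return new BaseAnnotator("example_resource") {
            @Override
            List<String> createAnnotatorAttributes() {
                return List.of("EXAMPLE_AF");
            }

            @Override
            protected Object getResourceAttributeValue(String attrName, Map<String, Object> source) {
                // Translate the output name to the (assumed) resource-side name.
                String resourceAttr = "EXAMPLE_AF".equals(attrName) ? "AF" : attrName;
                return source.get(resourceAttr);
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Object> record = Map.of("AF", 0.01);
        System.out.println(createExampleAnnotator().annotate(record));
    }
}
```

The anonymous subclass lets each annotator reuse the shared workflow while changing only the hooks it needs, which is why the HPO, CADD and FitCon variants override a single method.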