Example 1 with SearchResults

use of gov.nih.nci.ctd2.dashboard.util.SearchResults in project nci-ctd2-dashboard by CBIIT.

the class DashboardDaoImpl method ontologySearch.

/*
     * To produce observation 'search' results, i.e. the intersection of the
     * observations across terms, the implementation of ontology search has to be
     * considerably more complex. Ideally the observation part would be completely
     * separate, but embedding it here avoids repeating the actual hierarchical
     * search. The previous implementation was clearer for searching subjects, but
     * covering the observation part means compromising clarity here.
     * 
     * Other points to consider for observation 'search': (1) ontologySearch
     * covers the original non-ontology subject results (in principle); (2)
     * subjects other than TissueSample and ECO terms are neither affected nor
     * covered by ontology search, so the behavior is not consistent.
     */
@Override
public SearchResults ontologySearch(String queryString) {
    final String[] searchTerms = parseWords(queryString);
    Set<Integer> observationsIntersection = null;
    Set<SubjectResult> subject_result = null;
    final int termCount = searchTerms.length;
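    if (termCount == 0) {
        // nothing to search for; avoid indexing into an empty array
        subject_result = new HashSet<SubjectResult>();
    } else if (termCount == 1) {
        // prevent wasting time finding observations
        subject_result = new HashSet<SubjectResult>(ontologySearchOneTerm(searchTerms[0].replace("\"", ""), null));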
    } else {
        boolean first = true;
        Map<SubjectResult, Integer> subjectResultMap = new HashMap<SubjectResult, Integer>();
        for (String oneTerm : searchTerms) {
            oneTerm = oneTerm.replace("\"", "");
            log.debug("ontology search term:" + oneTerm);
            Set<Integer> observations = new HashSet<Integer>();
            List<SubjectResult> oneTermList = ontologySearchOneTerm(oneTerm, observations);
            for (SubjectResult s : oneTermList) {
                Integer matchNumber = subjectResultMap.get(s);
                if (matchNumber != null) {
                    s.matchNumber = matchNumber + 1;
                }
                subjectResultMap.put(s, s.getMatchNumber());
            }
            if (first) {
                observationsIntersection = observations;
                first = false;
            } else {
                observationsIntersection.retainAll(observations);
            }
        }
        subject_result = subjectResultMap.keySet();
    }
    SearchResults searchResults = new SearchResults();
    if (subject_result.size() > maxNumberOfSearchResults) {
        searchResults.oversized = subject_result.size();
        searchResults.subject_result = subject_result.stream().sorted(new SearchResultComparator()).limit(maxNumberOfSearchResults).collect(Collectors.toList());
        // log the size of the limited list, not of the original set
        log.debug("size after limiting: " + searchResults.subject_result.size());
    } else {
        searchResults.subject_result = new ArrayList<SubjectResult>(subject_result);
    }
    if (observationsIntersection != null) {
        searchResults.observation_result = observationsIntersection.stream().map(id -> this.getEntityById(Observation.class, id)).collect(Collectors.toList());
        log.debug("size of observation intersection: " + observationsIntersection.size());
    }
    return searchResults;
}
Also used : HashMap(java.util.HashMap) SearchResults(gov.nih.nci.ctd2.dashboard.util.SearchResults) BigInteger(java.math.BigInteger) SubjectResult(gov.nih.nci.ctd2.dashboard.util.SubjectResult) Observation(gov.nih.nci.ctd2.dashboard.model.Observation) HashSet(java.util.HashSet)
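
The multi-term branch of ontologySearch does two things per term: it counts how many terms matched each subject, and it narrows a running set of observation ids with retainAll. The following is only a minimal standalone sketch of that idea, not the project's code. TermHits, countMatches and intersect are hypothetical names, plain strings stand in for SubjectResult, and the counting is simplified to one increment per occurrence.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IntersectionSketch {

    /** Hypothetical stand-in for one term's hits: matched subjects plus observation ids. */
    public static final class TermHits {
        public final List<String> subjects;
        public final Set<Integer> observationIds;

        public TermHits(List<String> subjects, Set<Integer> observationIds) {
            this.subjects = subjects;
            this.observationIds = observationIds;
        }
    }

    /** Counts, for each subject, how many times it was matched across all terms. */
    public static Map<String, Integer> countMatches(List<TermHits> hitsPerTerm) {
        Map<String, Integer> matchCount = new HashMap<>();
        for (TermHits hits : hitsPerTerm) {
            for (String subject : hits.subjects) {
                matchCount.merge(subject, 1, Integer::sum);
            }
        }
        return matchCount;
    }

    /** Returns the observation ids common to every term; empty if there are no terms. */
    public static Set<Integer> intersect(List<TermHits> hitsPerTerm) {
        Set<Integer> intersection = null;
        for (TermHits hits : hitsPerTerm) {
            if (intersection == null) {
                // copy the first set so the caller's data is left untouched
                intersection = new HashSet<>(hits.observationIds);
            } else {
                intersection.retainAll(hits.observationIds);
            }
        }
        return intersection == null ? new HashSet<>() : intersection;
    }

    public static void main(String[] args) {
        List<TermHits> hits = List.of(
                new TermHits(List.of("lung", "TP53"), Set.of(1, 2, 3)),
                new TermHits(List.of("TP53"), Set.of(2, 3, 4)));
        System.out.println(countMatches(hits));
        System.out.println(intersect(hits));
    }
}
```

Copying the first set before calling retainAll is a design choice of this sketch; the method above instead reuses the set filled in by ontologySearchOneTerm.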

Example 2 with SearchResults

use of gov.nih.nci.ctd2.dashboard.util.SearchResults in project nci-ctd2-dashboard by CBIIT.

the class DashboardDaoImpl method search.

@Override
@Cacheable(value = "searchCache")
public SearchResults search(String queryString) {
    queryString = queryString.trim();
    final String[] searchTerms = parseWords(queryString);
    log.debug("search terms: " + String.join(",", searchTerms));
    Map<Subject, Integer> subjects = new HashMap<Subject, Integer>();
    Map<Submission, Integer> submissions = new HashMap<Submission, Integer>();
    for (String singleTerm : searchTerms) {
        searchSingleTerm(singleTerm, subjects, submissions);
    }
    SearchResults searchResults = new SearchResults();
    searchResults.submission_result = submissions.keySet().stream().map(submission -> {
        ObservationTemplate template = submission.getObservationTemplate();
        return new SearchResults.SubmissionResult(submission.getStableURL(), submission.getSubmissionDate(), template.getDescription(), template.getTier(), template.getSubmissionCenter().getDisplayName(), submission.getId(), findObservationsBySubmission(submission).size(), template.getIsSubmissionStory());
    }).collect(Collectors.toList());
    Map<String, Set<Observation>> observationMap = new HashMap<String, Set<Observation>>();
    List<SubjectResult> subject_result = new ArrayList<SubjectResult>();
    for (Subject subject : subjects.keySet()) {
        Set<Observation> observations = new HashSet<Observation>();
        Set<SubmissionCenter> submissionCenters = new HashSet<SubmissionCenter>();
        Set<String> roles = new HashSet<String>();
        for (ObservedSubject observedSubject : findObservedSubjectBySubject(subject)) {
            Observation observation = observedSubject.getObservation();
            observations.add(observation);
            ObservationTemplate observationTemplate = observation.getSubmission().getObservationTemplate();
            submissionCenters.add(observationTemplate.getSubmissionCenter());
            roles.add(observedSubject.getObservedSubjectRole().getSubjectRole().getDisplayName());
        }
        SubjectResult x = new SubjectResult(subject, observations.size(), submissionCenters.size(), subjects.get(subject), roles);
        Arrays.stream(searchTerms).filter(term -> matchSubject(term, subject)).forEach(term -> {
            Set<Observation> obset = observationMap.get(term);
            if (obset == null) {
                obset = new HashSet<Observation>();
            }
            obset.addAll(observations);
            observationMap.put(term, obset);
        });
        subject_result.add(x);
    }
    /* search ECO terms */
    List<ECOTerm> ecoterms = findECOTerms(queryString);
    for (ECOTerm ecoterm : ecoterms) {
        List<Integer> observationIds = observationIdsForEcoCode(ecoterm.getCode());
        int observationNumber = observationIds.size();
        if (observationNumber == 0)
            continue;
        // no matchNumber, no roles
        SubjectResult entity = new SubjectResult(ecoterm, observationNumber, centerCount(ecoterm.getCode()), null, null);
        subject_result.add(entity);
        Set<Observation> observations = new HashSet<Observation>();
        observationIds.forEach(obid -> observations.add(getEntityById(Observation.class, obid)));
        Arrays.stream(searchTerms).filter(term -> ecoterm.containsTerm(term)).forEach(term -> {
            Set<Observation> obset = observationMap.get(term);
            if (obset == null) {
                obset = new HashSet<Observation>();
            }
            obset.addAll(observations);
            observationMap.put(term, obset);
        });
    }
    /*
     * Limit the size. Ideally this would be done more efficiently while the list
     * is being built up. However, the limit has to be based on the 'match
     * number' ranking, which depends on all terms, so an efficient algorithm is
     * not obvious. We also have to do this after processing all results because
     * the ranking needs observation numbers as well (in fact more often). TODO
     */
    if (subject_result.size() > maxNumberOfSearchResults) {
        searchResults.oversized = subject_result.size();
        subject_result = subject_result.stream().sorted(new SearchResultComparator()).limit(maxNumberOfSearchResults).collect(Collectors.toList());
        log.debug("size after limiting: " + subject_result.size());
    }
    searchResults.subject_result = subject_result;
    if (searchTerms.length <= 1) {
        return searchResults;
    }
    // add intersection of observations
    Set<Observation> set0 = observationMap.get(searchTerms[0]);
    if (set0 == null) {
        log.debug("no observation for " + searchTerms[0]);
        return searchResults;
    }
    log.debug("set0 size=" + set0.size());
    for (int i = 1; i < searchTerms.length; i++) {
        Set<Observation> obset = observationMap.get(searchTerms[i]);
        if (obset == null) {
            log.debug("... no observation for " + searchTerms[i]);
            return searchResults;
        }
        log.debug("set " + i + " size=" + obset.size());
        set0.retainAll(obset);
    }
    // set0 is now the intersection
    if (set0.size() == 0) {
        log.debug("no intersection of observations");
    }
    if (set0.size() > maxNumberOfSearchResults) {
        searchResults.oversized_observations = set0.size();
        // no particular ranking is enforced when limiting
        set0 = set0.stream().limit(maxNumberOfSearchResults).collect(Collectors.toSet());
        log.debug("observation results count after limiting: " + set0.size());
    }
    searchResults.observation_result = new ArrayList<Observation>(set0);
    return searchResults;
}
Also used : Query(org.apache.lucene.search.Query) ObservedEvidenceRole(gov.nih.nci.ctd2.dashboard.model.ObservedEvidenceRole) Transcript(gov.nih.nci.ctd2.dashboard.model.Transcript) Arrays(java.util.Arrays) DashboardDao(gov.nih.nci.ctd2.dashboard.dao.DashboardDao) ObservedSubjectRole(gov.nih.nci.ctd2.dashboard.model.ObservedSubjectRole) Cacheable(org.springframework.cache.annotation.Cacheable) NoResultException(javax.persistence.NoResultException) XRefItem(gov.nih.nci.ctd2.dashboard.api.XRefItem) KeywordAnalyzer(org.apache.lucene.analysis.core.KeywordAnalyzer) SubmissionCenter(gov.nih.nci.ctd2.dashboard.model.SubmissionCenter) Matcher(java.util.regex.Matcher) MultiFieldQueryParser(org.apache.lucene.queryparser.classic.MultiFieldQueryParser) Map(java.util.Map) DashboardEntity(gov.nih.nci.ctd2.dashboard.model.DashboardEntity) CriteriaBuilder(javax.persistence.criteria.CriteriaBuilder) BigInteger(java.math.BigInteger) TissueSample(gov.nih.nci.ctd2.dashboard.model.TissueSample) Organism(gov.nih.nci.ctd2.dashboard.model.Organism) CriteriaQuery(javax.persistence.criteria.CriteriaQuery) ScrollableResults(org.hibernate.ScrollableResults) SearchResults(gov.nih.nci.ctd2.dashboard.util.SearchResults) SubjectResult(gov.nih.nci.ctd2.dashboard.util.SubjectResult) Collection(java.util.Collection) SessionFactory(org.hibernate.SessionFactory) Set(java.util.Set) FullTextQuery(org.hibernate.search.FullTextQuery) CellSample(gov.nih.nci.ctd2.dashboard.model.CellSample) Compound(gov.nih.nci.ctd2.dashboard.model.Compound) ECOTerm(gov.nih.nci.ctd2.dashboard.model.ECOTerm) SubjectWithOrganism(gov.nih.nci.ctd2.dashboard.model.SubjectWithOrganism) Collectors(java.util.stream.Collectors) TissueSampleImpl(gov.nih.nci.ctd2.dashboard.impl.TissueSampleImpl) ObservationItem(gov.nih.nci.ctd2.dashboard.api.ObservationItem) List(java.util.List) Xref(gov.nih.nci.ctd2.dashboard.model.Xref) EcoBrowse(gov.nih.nci.ctd2.dashboard.util.EcoBrowse) CompoundImpl(gov.nih.nci.ctd2.dashboard.impl.CompoundImpl) 
ScrollMode(org.hibernate.ScrollMode) ObservedSubject(gov.nih.nci.ctd2.dashboard.model.ObservedSubject) DashboardEntityImpl(gov.nih.nci.ctd2.dashboard.impl.DashboardEntityImpl) Gene(gov.nih.nci.ctd2.dashboard.model.Gene) Pattern(java.util.regex.Pattern) LogFactory(org.apache.commons.logging.LogFactory) ShRna(gov.nih.nci.ctd2.dashboard.model.ShRna) Observation(gov.nih.nci.ctd2.dashboard.model.Observation) ParseException(org.apache.lucene.queryparser.classic.ParseException) FullTextSession(org.hibernate.search.FullTextSession) SubmissionImpl(gov.nih.nci.ctd2.dashboard.impl.SubmissionImpl) Subject(gov.nih.nci.ctd2.dashboard.model.Subject) Submission(gov.nih.nci.ctd2.dashboard.model.Submission) Session(org.hibernate.Session) HashMap(java.util.HashMap) EvidenceItem(gov.nih.nci.ctd2.dashboard.api.EvidenceItem) Evidence(gov.nih.nci.ctd2.dashboard.model.Evidence) TypedQuery(javax.persistence.TypedQuery) ObservationTemplate(gov.nih.nci.ctd2.dashboard.model.ObservationTemplate) SubjectWithSummaries(gov.nih.nci.ctd2.dashboard.util.SubjectWithSummaries) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) Search(org.hibernate.search.Search) Summary(gov.nih.nci.ctd2.dashboard.util.Summary) SubjectImpl(gov.nih.nci.ctd2.dashboard.impl.SubjectImpl) AnimalModel(gov.nih.nci.ctd2.dashboard.model.AnimalModel) ObservedEvidence(gov.nih.nci.ctd2.dashboard.model.ObservedEvidence) WordCloudEntry(gov.nih.nci.ctd2.dashboard.util.WordCloudEntry) SubjectWithOrganismImpl(gov.nih.nci.ctd2.dashboard.impl.SubjectWithOrganismImpl) ObservationURIsAndTiers(gov.nih.nci.ctd2.dashboard.util.ObservationURIsAndTiers) FlushMode(org.hibernate.FlushMode) SubjectItem(gov.nih.nci.ctd2.dashboard.api.SubjectItem) Hierarchy(gov.nih.nci.ctd2.dashboard.util.Hierarchy) Annotation(gov.nih.nci.ctd2.dashboard.model.Annotation) ObservationTemplateImpl(gov.nih.nci.ctd2.dashboard.impl.ObservationTemplateImpl) Synonym(gov.nih.nci.ctd2.dashboard.model.Synonym) Protein(gov.nih.nci.ctd2.dashboard.model.Protein) 
Log(org.apache.commons.logging.Log) DashboardFactory(gov.nih.nci.ctd2.dashboard.model.DashboardFactory)
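
The size-limiting step in search sorts the full result list with a comparator and only then truncates it, because the ranking depends on values that are known only after all terms have been processed. Below is a rough sketch of that sort-then-limit pattern. The ordering used by the real SearchResultComparator is not shown on this page, so the ranking here (more matched terms first, then more observations) is purely an assumption, and Row and limit are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class LimitSketch {

    /** Hypothetical, simplified result row; the real SubjectResult has more fields. */
    public static final class Row {
        public final String name;
        public final int matchNumber;
        public final int observationCount;

        public Row(String name, int matchNumber, int observationCount) {
            this.name = name;
            this.matchNumber = matchNumber;
            this.observationCount = observationCount;
        }
    }

    /** Assumed ranking: more matched terms first, ties broken by more observations. */
    public static final Comparator<Row> RANKING = Comparator
            .comparingInt((Row r) -> r.matchNumber).reversed()
            .thenComparing(Comparator.comparingInt((Row r) -> r.observationCount).reversed());

    /** Returns the rows unchanged if they fit, otherwise the top max rows by RANKING. */
    public static List<Row> limit(List<Row> rows, int max) {
        if (rows.size() <= max) {
            return new ArrayList<>(rows);
        }
        return rows.stream().sorted(RANKING).limit(max).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", 1, 10), new Row("b", 2, 1), new Row("c", 2, 5));
        for (Row r : limit(rows, 2)) {
            System.out.println(r.name);
        }
    }
}
```

As in the method above, sorting happens only when the list is oversized, so a list that already fits keeps its original order.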

Example 3 with SearchResults

use of gov.nih.nci.ctd2.dashboard.util.SearchResults in project nci-ctd2-dashboard by CBIIT.

the class OntologySearchController method ontologySearch.

@RequestMapping(method = { RequestMethod.GET }, headers = "Accept=application/json")
public ResponseEntity<String> ontologySearch(@RequestParam("terms") String terms) {
    HttpHeaders headers = new HttpHeaders();
    headers.add("Content-Type", "application/json; charset=utf-8");
    SearchResults ontologyResult = dashboardDao.ontologySearch(terms.replaceAll("`", "'"));
    log.debug("number of subject results from ontology search " + ontologyResult.numberOfSubjects());
    JSONSerializer jsonSerializer = new JSONSerializer().transform(new ImplTransformer(), Class.class).transform(new DateTransformer(), Date.class);
    return new ResponseEntity<String>(jsonSerializer.deepSerialize(ontologyResult), headers, HttpStatus.OK);
}
Also used : HttpHeaders(org.springframework.http.HttpHeaders) ResponseEntity(org.springframework.http.ResponseEntity) ImplTransformer(gov.nih.nci.ctd2.dashboard.util.ImplTransformer) DateTransformer(gov.nih.nci.ctd2.dashboard.util.DateTransformer) SearchResults(gov.nih.nci.ctd2.dashboard.util.SearchResults) JSONSerializer(flexjson.JSONSerializer) RequestMapping(org.springframework.web.bind.annotation.RequestMapping)

Example 4 with SearchResults

use of gov.nih.nci.ctd2.dashboard.util.SearchResults in project nci-ctd2-dashboard by CBIIT.

the class SearchController method getSearchResultsInJson.

@Transactional
@RequestMapping(value = "{keyword}", method = { RequestMethod.GET, RequestMethod.POST }, headers = "Accept=application/json")
public ResponseEntity<String> getSearchResultsInJson(@PathVariable String keyword) {
    HttpHeaders headers = new HttpHeaders();
    headers.add("Content-Type", "application/json; charset=utf-8");
    // This is to prevent unnecessary server loads
    if (keyword.length() < 2)
        return new ResponseEntity<String>(headers, HttpStatus.BAD_REQUEST);
    keyword = keyword.replaceAll("`", "'");
    SearchResults results = dashboardDao.search(keyword);
    log.debug("number of subject results from search " + results.numberOfSubjects());
    JSONSerializer jsonSerializer = new JSONSerializer().transform(new ImplTransformer(), Class.class).transform(new DateTransformer(), Date.class);
    return new ResponseEntity<String>(jsonSerializer.deepSerialize(results), headers, HttpStatus.OK);
}
Also used : HttpHeaders(org.springframework.http.HttpHeaders) ResponseEntity(org.springframework.http.ResponseEntity) ImplTransformer(gov.nih.nci.ctd2.dashboard.util.ImplTransformer) DateTransformer(gov.nih.nci.ctd2.dashboard.util.DateTransformer) SearchResults(gov.nih.nci.ctd2.dashboard.util.SearchResults) JSONSerializer(flexjson.JSONSerializer) Transactional(org.springframework.transaction.annotation.Transactional) RequestMapping(org.springframework.web.bind.annotation.RequestMapping)

Example 5 with SearchResults

use of gov.nih.nci.ctd2.dashboard.util.SearchResults in project nci-ctd2-dashboard by CBIIT.

the class RssController method searchRSS.

@Transactional
@RequestMapping(value = "search/{keyword}", method = { RequestMethod.GET, RequestMethod.POST })
public ResponseEntity<String> searchRSS(@PathVariable String keyword) {
    HttpHeaders headers = new HttpHeaders();
    headers.add("Content-Type", "application/rss+xml");
    // This is to prevent unnecessary server loads
    if (keyword.length() < 2)
        return new ResponseEntity<String>(headers, HttpStatus.BAD_REQUEST);
    try {
        // name() is the canonical charset name expected by URLDecoder; displayName() is meant for display
        keyword = URLDecoder.decode(keyword, Charset.defaultCharset().name());
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    // Search and find the entity hits
    SearchResults entitiesWithCounts = dashboardDao.search(keyword);
    List<DashboardEntity> searchEntities = new ArrayList<DashboardEntity>();
    for (SubjectResult subjectResult : entitiesWithCounts.subject_result) {
        try {
            Class<? extends DashboardEntity> clazz = Class.forName("gov.nih.nci.ctd2.dashboard.model." + subjectResult.className).asSubclass(DashboardEntity.class);
            DashboardEntity entity = dashboardDao.getEntityById(clazz, subjectResult.id);
            searchEntities.add(entity);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            continue;
        }
    }
    String titlePostfix = keyword;
    String rssDescription = "Latest observations and submission related to '" + keyword + "'";
    String dashboardUrl = context.getScheme() + "://" + context.getServerName() + context.getContextPath() + "/";
    String rssLink = dashboardUrl + "#search/" + keyword;
    String feedStr = generateFeed(searchEntities, titlePostfix, rssDescription, rssLink);
    return new ResponseEntity<String>(feedStr, headers, HttpStatus.OK);
}
Also used : HttpHeaders(org.springframework.http.HttpHeaders) ResponseEntity(org.springframework.http.ResponseEntity) SubjectResult(gov.nih.nci.ctd2.dashboard.util.SubjectResult) DashboardEntity(gov.nih.nci.ctd2.dashboard.model.DashboardEntity) ArrayList(java.util.ArrayList) UnsupportedEncodingException(java.io.UnsupportedEncodingException) SearchResults(gov.nih.nci.ctd2.dashboard.util.SearchResults) Transactional(org.springframework.transaction.annotation.Transactional) RequestMapping(org.springframework.web.bind.annotation.RequestMapping)
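
searchRSS decodes the path variable before searching. One possible way to make that step independent of the platform's default charset is sketched below. It assumes the keyword was percent-encoded as UTF-8, which the original code does not state, and KeywordDecodeSketch and decodeKeyword are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class KeywordDecodeSketch {

    /** Decodes a percent-encoded keyword, assuming it was encoded as UTF-8. */
    public static String decodeKeyword(String raw) {
        try {
            return URLDecoder.decode(raw, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform, so this is not expected
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeKeyword("lung%20cancer"));
    }
}
```

Note that URLDecoder follows form-encoding rules and also turns "+" into a space, which may or may not be wanted for a path segment.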

Aggregations

SearchResults (gov.nih.nci.ctd2.dashboard.util.SearchResults): 5
SubjectResult (gov.nih.nci.ctd2.dashboard.util.SubjectResult): 3
HttpHeaders (org.springframework.http.HttpHeaders): 3
ResponseEntity (org.springframework.http.ResponseEntity): 3
RequestMapping (org.springframework.web.bind.annotation.RequestMapping): 3
JSONSerializer (flexjson.JSONSerializer): 2
DashboardEntity (gov.nih.nci.ctd2.dashboard.model.DashboardEntity): 2
Observation (gov.nih.nci.ctd2.dashboard.model.Observation): 2
DateTransformer (gov.nih.nci.ctd2.dashboard.util.DateTransformer): 2
ImplTransformer (gov.nih.nci.ctd2.dashboard.util.ImplTransformer): 2
BigInteger (java.math.BigInteger): 2
HashMap (java.util.HashMap): 2
HashSet (java.util.HashSet): 2
EvidenceItem (gov.nih.nci.ctd2.dashboard.api.EvidenceItem): 1
ObservationItem (gov.nih.nci.ctd2.dashboard.api.ObservationItem): 1
SubjectItem (gov.nih.nci.ctd2.dashboard.api.SubjectItem): 1
XRefItem (gov.nih.nci.ctd2.dashboard.api.XRefItem): 1
DashboardDao (gov.nih.nci.ctd2.dashboard.dao.DashboardDao): 1
CompoundImpl (gov.nih.nci.ctd2.dashboard.impl.CompoundImpl): 1
DashboardEntityImpl (gov.nih.nci.ctd2.dashboard.impl.DashboardEntityImpl): 1