Example 6 with AnnotationManager

use of edu.stanford.muse.AnnotationManager.AnnotationManager in project epadd by ePADD.

the class EmailRenderer method pagesForDocuments.

/*
	 * returns pages and html for a collection of docs, which can be put into a
	 * jog frame. indexer clusters are used to
	 *
	 * Changed the first arg type from Collection<? extends EmailDocument> to Collection<Document>, because we get
	 * Collection<Document> in the browse page or from docsforquery; it's a hassle to make them all return EmailDocument,
	 * especially when no other document type is used anywhere.
	 */
public static Pair<DataSet, String> pagesForDocuments(Collection<Document> docs, SearchResult result, String datasetTitle, MultiDoc.ClusteringType coptions) throws Exception {
    StringBuilder html = new StringBuilder();
    int pageNum = 0;
    List<String> pages = new ArrayList<>();
    // need clusters which map to sections in the browsing interface
    List<MultiDoc> clusters;
    // indexer may or may not have indexed all the docs in ds
    // if it has, use its clustering (could be yearly or monthly or category
    // wise
    // if (indexer != null && indexer.clustersIncludeAllDocs(ds))
    // if (indexer != null)
    // IMP: instead of searchResult.getDocsasSet() use the docs that is already ordered by
    // the sortBy order (in SearchResult.selectDocsAndBlobs method.
    clusters = result.getArchive().clustersForDocs(docs, coptions);
    /*
		 * else { // categorize by month if the docs have dates if
		 * (EmailUtils.allDocsAreDatedDocs(ds)) clusters =
		 * IndexUtils.partitionDocsByInterval(new ArrayList<DatedDocument>((Set)
		 * ds), true); else // must be category docs clusters =
		 * CategoryDocument.clustersDocsByCategoryName((Collection) ds); }
		 */
    List<Document> datasetDocs = new ArrayList<>();
    AnnotationManager annotationManager = result.getArchive().getAnnotationManager();
    // we build up a hierarchy of <section, document, page>
    for (MultiDoc md : clusters) {
        if (md.docs.size() == 0)
            continue;
        String description = md.description;
        // escape any double quote in the description
        description = description.replace("\"", "\\\"");
        html.append("<div class=\"section\" name=\"" + description + "\">\n");
        List<List<String>> clusterResult = new ArrayList<>();
        for (Document d : md.docs) {
            String pdfAttrib = "";
            /*
				 * if (d instanceof PDFDocument) pdfAttrib = "pdfLink=\"" +
				 * ((PDFDocument) d).relativeURLForPDF + "\"";
				 */
            html.append("<div class=\"document\" " + pdfAttrib + ">\n");
            datasetDocs.add(d);
            pages.add(null);
            clusterResult.add(null);
            // clusterResult.add(docPageList);
            // for (String s: docPageList)
            {
                String comment = Util.escapeHTML(annotationManager.getAnnotation(d.getUniqueId()));
                html.append("<div class=\"page\"");
                if (!Util.nullOrEmpty(comment))
                    html.append(" comment=\"" + comment + "\"");
                if (!Util.nullOrEmpty(comment) && (d instanceof EmailDocument)) {
                    String messageId = d.getUniqueId();
                    html.append(" messageID=\"" + messageId + "\"");
                }
                if (d.isLiked())
                    html.append(" liked=\"true\"");
                // also make sure that browse.jsp (the jsp calling this function) should have a map of LabelID to Label Name, Label type in javascript
                if (d instanceof EmailDocument) {
                    Set<String> labels = result.getArchive().getLabelIDs((EmailDocument) d);
                    if (!Util.nullOrEmpty(labels)) {
                        String val = String.join(",", labels);
                        html.append(" labels=\"" + val + "\"");
                    } else
                        html.append(" labels=\"\"");
                }
                // ////////////////////////////////////////DONE reading labels///////////////////////////////////////////////////////////////////////////
                // always close the page div; only email documents carry a signature
                html.append(" pageId='" + pageNum++ + "'");
                if (d instanceof EmailDocument)
                    html.append(" signature='" + Util.hash(((EmailDocument) d).getSignature()) + "'");
                html.append(" docId='" + d.getUniqueId() + "'></div>\n");
            }
            // document
            html.append("</div>");
        }
        // section
        html.append("</div>\n");
    }
    DataSet dataset = new DataSet(datasetDocs, result, datasetTitle);
    return new Pair<>(dataset, html.toString());
}
Also used : AnnotationManager(edu.stanford.muse.AnnotationManager.AnnotationManager) Pair(edu.stanford.muse.util.Pair)
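
The method above backslash-escapes double quotes before writing the description into a name="..." attribute. HTML attribute values do not recognize backslash escapes, so entity encoding is the usual alternative. The following is a minimal sketch of that idea, not code from ePADD: the class AttributeEscapeSketch and its methods escapeAttribute and sectionOpenTag are hypothetical names introduced only for illustration.

```java
// Hypothetical sketch (not part of ePADD): entity-encode a value before
// placing it inside a double-quoted HTML attribute.
final class AttributeEscapeSketch {

    // Replaces the characters that can break out of a quoted attribute value.
    static String escapeAttribute(String value) {
        if (value == null)
            return "";
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Builds the opening tag of a section div, mirroring the shape used above.
    static String sectionOpenTag(String description) {
        return "<div class=\"section\" name=\"" + escapeAttribute(description) + "\">";
    }

    public static void main(String[] args) {
        // prints: <div class="section" name="Mail from &quot;Bob&quot; &amp; co">
        System.out.println(sectionOpenTag("Mail from \"Bob\" & co"));
    }
}
```

Whether the calling page expects entity-encoded or backslash-escaped values is not visible from this excerpt, so treat the sketch as one option rather than a drop-in replacement.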

Example 7 with AnnotationManager

use of edu.stanford.muse.AnnotationManager.AnnotationManager in project epadd by ePADD.

the class EmailRenderer method pagesForDocuments.

/*
	 * returns pages and a json object for a collection of docs, which can be put into a
	 * jog frame.
	 *
	 * Changed the first arg type from Collection<? extends EmailDocument> to Collection<Document>, because we get
	 * Collection<Document> in the browse page or from docsforquery; it's a hassle to make them all return EmailDocument,
	 * especially when no other document type is used anywhere.
	 * The second result is a json array of objects, one for each message. each message's object has metadata for it such as
	 * id, labels and annotations.
	 */
public static Pair<DataSet, JSONArray> pagesForDocuments(Collection<Document> docs, SearchResult result, String datasetTitle, MultiDoc.ClusteringType coptions, Multimap<String, String> queryparams) {
    // need clusters which map to sections in the browsing interface
    List<MultiDoc> clusters;
    // indexer may or may not have indexed all the docs in ds
    // if it has, use its clustering (could be yearly or monthly or category
    // wise
    // if (indexer != null && indexer.clustersIncludeAllDocs(ds))
    // if (indexer != null)
    // IMP: instead of searchResult.getDocsasSet() use the docs that is already ordered by
    // the sortBy order (in SearchResult.selectDocsAndBlobs method.
    clusters = result.getArchive().clustersForDocs(docs, coptions);
    /*
		 * else { // categorize by month if the docs have dates if
		 * (EmailUtils.allDocsAreDatedDocs(ds)) clusters =
		 * IndexUtils.partitionDocsByInterval(new ArrayList<DatedDocument>((Set)
		 * ds), true); else // must be category docs clusters =
		 * CategoryDocument.clustersDocsByCategoryName((Collection) ds); }
		 */
    JSONArray resultObj = new JSONArray();
    int resultCount = 0;
    List<Document> datasetDocs = new ArrayList<>();
    AnnotationManager annotationManager = result.getArchive().getAnnotationManager();
    // we build up a hierarchy of <section, document, page>
    for (MultiDoc md : clusters) {
        if (md.docs.size() == 0)
            continue;
        List<List<String>> clusterResult = new ArrayList<>();
        for (Document d : md.docs) {
            String pdfAttrib = "";
            datasetDocs.add(d);
            clusterResult.add(null);
            // clusterResult.add(docPageList);
            // for (String s: docPageList)
            {
                JSONObject jsonObj = new JSONObject();
                String comment = Util.escapeHTML(annotationManager.getAnnotation(d.getUniqueId()));
                if (!Util.nullOrEmpty(comment))
                    jsonObj.put("annotation", comment);
                if (d instanceof EmailDocument) {
                    // label lookup needs an EmailDocument, so it sits inside the instanceof check
                    EmailDocument ed = (EmailDocument) d;
                    Set<String> labels = result.getArchive().getLabelIDs(ed);
                    if (!Util.nullOrEmpty(labels)) {
                        JSONArray labs = new JSONArray();
                        for (String l : labels) {
                            labs.put(l);
                        }
                        jsonObj.put("labels", labs);
                    }
                    jsonObj.put("id", ed.getUniqueId());
                    jsonObj.put("threadID", ed.threadID);
                    // docsWithThreadId is not an expensive method, as it caches the result for future queries
                    jsonObj.put("msgInThread", result.getArchive().docsWithThreadId(ed.threadID).size());
                    jsonObj.put("nAttachments", ed.attachments != null ? ed.attachments.size() : 0);
                }
                resultObj.put(resultCount++, jsonObj);
            }
        }
    }
    DataSet dataset = new DataSet(datasetDocs, result, datasetTitle, queryparams);
    return new Pair<>(dataset, resultObj);
}
Also used : AnnotationManager(edu.stanford.muse.AnnotationManager.AnnotationManager) JSONArray(org.json.JSONArray) JSONObject(org.json.JSONObject) Pair(edu.stanford.muse.util.Pair)

Example 8 with AnnotationManager

use of edu.stanford.muse.AnnotationManager.AnnotationManager in project epadd by ePADD.

the class SearchResult method filterForAnnotationPresence.

/* Filter docs based on the presence/absence of annotation*/
private static SearchResult filterForAnnotationPresence(SearchResult inputSet) {
    String isAnnotated = JSPHelper.getParam(inputSet.queryParams, "isannotated");
    if (Util.nullOrEmpty(isAnnotated))
        return inputSet;
    boolean isAnn = "true".equals(isAnnotated);
    Map<Document, Pair<BodyHLInfo, AttachmentHLInfo>> outputDocs = new HashMap<>();
    AnnotationManager annotationManager = inputSet.getArchive().getAnnotationManager();
    // keep a doc when its annotation status matches the requested one
    inputSet.matchedDocs.keySet().forEach(doc -> {
        EmailDocument ed = (EmailDocument) doc;
        boolean hasAnnotation = !Util.nullOrEmpty(annotationManager.getAnnotation(ed.getUniqueId()));
        if (hasAnnotation == isAnn)
            outputDocs.put(doc, inputSet.matchedDocs.get(doc));
    });
    return new SearchResult(outputDocs, inputSet.archive, inputSet.queryParams, inputSet.commonHLInfo, inputSet.regexToHighlight);
}
Also used : AnnotationManager(edu.stanford.muse.AnnotationManager.AnnotationManager) Pair(edu.stanford.muse.util.Pair)

Example 9 with AnnotationManager

use of edu.stanford.muse.AnnotationManager.AnnotationManager in project epadd by ePADD.

the class SearchResult method filterForAnyAnnotation.

// ////////////////////////////////Annotation based checks////////////////////////////////////
/* Why are two different APIs needed for annotation-based filtering?
    Because annotation-based checks carry two different semantics.
    One API serves the annotation facet, which filters documents based on the presence or absence of
    annotations. The other serves the annotation advanced search, which filters documents based only on the
    presence of annotations. The semantics of "off" in filterForAnyAnnotation are not the same as the semantics
    of "off" in filterForAnnotationPresence. Hence two different endpoints.
     */
/*Filter based on the presence of any annotation*/
private static SearchResult filterForAnyAnnotation(SearchResult inputSet) {
    String anyAnnotationCheck = JSPHelper.getParam(inputSet.queryParams, "anyAnnotationCheck");
    if (Util.nullOrEmpty(anyAnnotationCheck))
        return inputSet;
    boolean anyAnnotation = "on".equals(anyAnnotationCheck);
    if (!anyAnnotation)
        return inputSet;
    else {
        AnnotationManager annotationManager = inputSet.getArchive().getAnnotationManager();
        Map<Document, Pair<BodyHLInfo, AttachmentHLInfo>> outputDocs = new HashMap<>();
        inputSet.matchedDocs.keySet().stream().forEach(doc -> {
            EmailDocument ed = (EmailDocument) doc;
            String comment = annotationManager.getAnnotation(ed.getUniqueId());
            if (!Util.nullOrEmpty(comment))
                outputDocs.put(doc, inputSet.matchedDocs.get(doc));
        });
        return new SearchResult(outputDocs, inputSet.archive, inputSet.queryParams, inputSet.commonHLInfo, inputSet.regexToHighlight);
    }
}
Also used : AnnotationManager(edu.stanford.muse.AnnotationManager.AnnotationManager) Pair(edu.stanford.muse.util.Pair)
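
The comment in this example distinguishes two filter semantics: the facet filter (filterForAnnotationPresence) selects annotated or unannotated documents, while the advanced-search filter (filterForAnyAnnotation) either selects annotated documents or does nothing. The sketch below restates that difference over plain maps so it can run without ePADD. The class AnnotationFilterSketch, its method names, and the use of a Function to look up annotations are assumptions made for illustration; it also assumes that Util.nullOrEmpty treats a string as empty exactly when it is null or has length zero.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch (not part of ePADD) contrasting the two filter semantics.
final class AnnotationFilterSketch {

    // Facet semantics: "true" keeps annotated docs, any other non-empty value
    // keeps unannotated docs, and a missing or empty parameter filters nothing.
    static <D, V> Map<D, V> filterByPresence(Map<D, V> matched, Function<D, String> annotationOf, String isAnnotated) {
        if (isAnnotated == null || isAnnotated.isEmpty())
            return matched;
        boolean wantAnnotated = "true".equals(isAnnotated);
        Map<D, V> out = new LinkedHashMap<>();
        matched.forEach((doc, info) -> {
            if (hasText(annotationOf.apply(doc)) == wantAnnotated)
                out.put(doc, info);
        });
        return out;
    }

    // Advanced-search semantics: only "on" filters (keeping annotated docs);
    // "off", empty, or missing all return the input unchanged.
    static <D, V> Map<D, V> filterForAny(Map<D, V> matched, Function<D, String> annotationOf, String anyAnnotationCheck) {
        if (!"on".equals(anyAnnotationCheck))
            return matched;
        return filterByPresence(matched, annotationOf, "true");
    }

    private static boolean hasText(String s) {
        return s != null && !s.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> matched = new LinkedHashMap<>();
        matched.put("m1", "hit");
        matched.put("m2", "hit");
        Map<String, String> notes = new HashMap<>();
        notes.put("m1", "follow up");

        // facet set to "false": only the unannotated message remains
        System.out.println(filterByPresence(matched, notes::get, "false").keySet()); // [m2]
        // advanced search set to "off": nothing is filtered
        System.out.println(filterForAny(matched, notes::get, "off").keySet()); // [m1, m2]
    }
}
```

The same "off"-like input therefore yields the unannotated subset in one filter and the full input in the other, which is the reason the source keeps two separate methods.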
