
Example 6 with IndexDocument

use of com.zimbra.cs.index.IndexDocument in project zm-mailbox by Zimbra.

the class MimeHandler method getDocument.

/**
 * Returns a Lucene document to index this content.
 *
 * @return Lucene document
 * @throws MimeHandlerException if a MIME parser error occurred
 * @throws ObjectHandlerException if a Zimlet error occurred
 * @throws ServiceException if other error occurred
 */
public final Document getDocument() throws MimeHandlerException, ObjectHandlerException, ServiceException {
    IndexDocument doc = new IndexDocument(new Document());
    doc.addMimeType(new MimeTypeTokenStream(getContentType()));
    addFields(doc.toDocument());
    String content = getContent();
    doc.addContent(content);
    getObjects(content, doc);
    doc.addPartName(partName);
    if (dataSource != null) {
        String name = dataSource.getName();
        if (name != null) {
            try {
                name = MimeUtility.decodeText(name);
            } catch (UnsupportedEncodingException ignore) {
            }
            doc.addFilename(name);
        }
    }
    return doc.toDocument();
}
Also used : IndexDocument(com.zimbra.cs.index.IndexDocument) MimeTypeTokenStream(com.zimbra.cs.index.analysis.MimeTypeTokenStream) UnsupportedEncodingException(java.io.UnsupportedEncodingException) Document(org.apache.lucene.document.Document)

Example 7 with IndexDocument

use of com.zimbra.cs.index.IndexDocument in project zm-mailbox by Zimbra.

the class IndexItem method redo.

@Override
public void redo() throws Exception {
    Mailbox mbox = MailboxManager.getInstance().getMailboxById(getMailboxId());
    MailItem item;
    try {
        item = mbox.getItemById(null, mId, type);
    } catch (MailServiceException.NoSuchItemException e) {
        // The item no longer exists, which is not a problem here, so just ignore the NoSuchItemException.
        return;
    }
    try {
        List<IndexDocument> docList = item.generateIndexData();
        mbox.index.redoIndexItem(item, mId, docList);
    } catch (Exception e) {
        // TODO - update the item and set the item's "unindexed" flag
        ZimbraLog.index.info("Caught exception attempting to replay IndexItem for ID " + mId + " item will not be indexed", e);
    }
}
Also used : IndexDocument(com.zimbra.cs.index.IndexDocument) MailItem(com.zimbra.cs.mailbox.MailItem) Mailbox(com.zimbra.cs.mailbox.Mailbox) MailServiceException(com.zimbra.cs.mailbox.MailServiceException) IOException(java.io.IOException)
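
The redo method in Example 7 treats two failures differently: a missing item ends the replay silently, while an indexing failure is logged and swallowed so the replay can continue. A minimal, self-contained sketch of that control flow is shown here. The names ReplaySketch, Outcome, lookup, and indexer are hypothetical stand-ins for illustration and are not zm-mailbox classes; NoSuchElementException stands in for MailServiceException.NoSuchItemException.

```java
import java.util.NoSuchElementException;
import java.util.function.Consumer;
import java.util.function.IntFunction;

// Hypothetical sketch, not part of zm-mailbox.
final class ReplaySketch {
    enum Outcome { SKIPPED_MISSING, INDEXED, FAILED_LOGGED }

    private ReplaySketch() {}

    // Replays an index operation for the item with the given id.
    static Outcome replay(int id, IntFunction<String> lookup, Consumer<String> indexer) {
        String item;
        try {
            item = lookup.apply(id);
        } catch (NoSuchElementException e) {
            // The item is gone, so there is nothing to index; this is not an error.
            return Outcome.SKIPPED_MISSING;
        }
        try {
            indexer.accept(item);
            return Outcome.INDEXED;
        } catch (RuntimeException e) {
            // Log and swallow the failure so that the replay itself does not abort.
            System.err.println("Replay of item " + id + " failed; item will not be indexed: " + e);
            return Outcome.FAILED_LOGGED;
        }
    }
}
```

Returning an outcome value is a choice made for this sketch so the three paths are easy to tell apart; the original method returns void.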

Example 8 with IndexDocument

use of com.zimbra.cs.index.IndexDocument in project zm-mailbox by Zimbra.

the class ParsedMessage method handleParseError.

/**
 * Log the error and index minimum information.
 *
 * @param mpi MIME info
 * @param error error to handle
 */
private void handleParseError(MPartInfo mpi, Throwable error) {
    numParseErrors++;
    LOG.warn("Unable to parse part=%s filename=%s content-type=%s message-id=%s", mpi.getPartName(), mpi.getFilename(), mpi.getContentType(), getMessageID(), error);
    if (ConversionException.isTemporaryCauseOf(error)) {
        temporaryAnalysisFailure = true;
    }
    if (!Strings.isNullOrEmpty(mpi.getFilename())) {
        filenames.add(mpi.getFilename());
    }
    IndexDocument doc = new IndexDocument(new Document());
    doc.addMimeType(new MimeTypeTokenStream(mpi.getContentType()));
    doc.addPartName(mpi.getPartName());
    doc.addFilename(mpi.getFilename());
    try {
        doc.addSortSize(mpi.getMimePart().getSize());
    } catch (MessagingException ignore) {
    }
    luceneDocuments.add(setLuceneHeadersFromContainer(doc));
}
Also used : IndexDocument(com.zimbra.cs.index.IndexDocument) MessagingException(javax.mail.MessagingException) MimeTypeTokenStream(com.zimbra.cs.index.analysis.MimeTypeTokenStream) Document(org.apache.lucene.document.Document)

Example 9 with IndexDocument

use of com.zimbra.cs.index.IndexDocument in project zm-mailbox by Zimbra.

the class Document method generateIndexData.

@Override
public List<IndexDocument> generateIndexData() throws TemporaryIndexingException {
    try {
        MailboxBlob mblob = getBlob();
        if (mblob == null) {
            ZimbraLog.index.warn("Unable to fetch blob for Document id=%d,ver=%d,vol=%s", mId, mVersion, getLocator());
            throw new MailItem.TemporaryIndexingException();
        }
        ParsedDocument pd = null;
        pd = new ParsedDocument(mblob.getLocalBlob(), getName(), getContentType(), getChangeDate(), getCreator(), getDescription(), isDescriptionEnabled());
        if (pd.hasTemporaryAnalysisFailure()) {
            throw new MailItem.TemporaryIndexingException();
        }
        IndexDocument doc = pd.getDocument();
        if (doc != null) {
            List<IndexDocument> toRet = new ArrayList<IndexDocument>(1);
            toRet.add(doc);
            return toRet;
        } else {
            return new ArrayList<IndexDocument>(0);
        }
    } catch (IOException e) {
        ZimbraLog.index.warn("Error generating index data for Wiki Document " + getId() + ". Item will not be indexed", e);
        return new ArrayList<IndexDocument>(0);
    } catch (ServiceException e) {
        ZimbraLog.index.warn("Error generating index data for Wiki Document " + getId() + ". Item will not be indexed", e);
        return new ArrayList<IndexDocument>(0);
    }
}
Also used : IndexDocument(com.zimbra.cs.index.IndexDocument) MailboxBlob(com.zimbra.cs.store.MailboxBlob) ParsedDocument(com.zimbra.cs.mime.ParsedDocument) ServiceException(com.zimbra.common.service.ServiceException) ArrayList(java.util.ArrayList) IOException(java.io.IOException)
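
The generateIndexData method in Example 9 returns a one-element list when a document was produced and an empty list otherwise. A minimal sketch of that shape in isolation follows, using Collections.singletonList and Collections.emptyList from the JDK. The names IndexListSketch and toIndexList are hypothetical and not part of zm-mailbox. Unlike the ArrayList instances in the original method, both results here are immutable, so this variant only fits callers that do not modify the returned list.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of zm-mailbox.
final class IndexListSketch {
    private IndexListSketch() {}

    // Returns a one-element list for a non-null document, or an empty list for null.
    static <T> List<T> toIndexList(T doc) {
        return doc != null ? Collections.singletonList(doc) : Collections.<T>emptyList();
    }

    public static void main(String[] args) {
        System.out.println(toIndexList("doc").size());
        System.out.println(IndexListSketch.<String>toIndexList(null).size());
    }
}
```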

Example 10 with IndexDocument

use of com.zimbra.cs.index.IndexDocument in project zm-mailbox by Zimbra.

the class MessageTest method indexRawMimeMessage.

@Test
public void indexRawMimeMessage() throws Exception {
    Account account = Provisioning.getInstance().getAccountById(MockProvisioning.DEFAULT_ACCOUNT_ID);
    account.setPrefMailDefaultCharset("ISO-2022-JP");
    Mailbox mbox = MailboxManager.getInstance().getMailboxByAccount(account);
    DeliveryOptions dopt = new DeliveryOptions().setFolderId(Mailbox.ID_FOLDER_INBOX);
    byte[] raw = ByteStreams.toByteArray(getClass().getResourceAsStream("raw-jis-msg.txt"));
    ParsedMessage pm = new ParsedMessage(raw, false);
    Message message = mbox.addMessage(null, pm, dopt, null);
    Assert.assertEquals("日本語", pm.getFragment(null));
    List<IndexDocument> docs = message.generateIndexData();
    Assert.assertEquals(2, docs.size());
    String subject = docs.get(0).toDocument().get(LuceneFields.L_H_SUBJECT);
    String body = docs.get(0).toDocument().get(LuceneFields.L_CONTENT);
    Assert.assertEquals("日本語", subject);
    Assert.assertEquals("日本語", body.trim());
}
Also used : Account(com.zimbra.cs.account.Account) IndexDocument(com.zimbra.cs.index.IndexDocument) ParsedMessage(com.zimbra.cs.mime.ParsedMessage) Test(org.junit.Test)

Aggregations

IndexDocument (com.zimbra.cs.index.IndexDocument): 14
RFC822AddressTokenStream (com.zimbra.cs.index.analysis.RFC822AddressTokenStream): 5
IOException (java.io.IOException): 5
ServiceException (com.zimbra.common.service.ServiceException): 4
Document (org.apache.lucene.document.Document): 4
FieldTokenStream (com.zimbra.cs.index.analysis.FieldTokenStream): 3
MimeTypeTokenStream (com.zimbra.cs.index.analysis.MimeTypeTokenStream): 3
ObjectHandlerException (com.zimbra.cs.object.ObjectHandlerException): 3
ArrayList (java.util.ArrayList): 3
MailServiceException (com.zimbra.cs.mailbox.MailServiceException): 2
ParsedMessage (com.zimbra.cs.mime.ParsedMessage): 2
MessagingException (javax.mail.MessagingException): 2
MimeMessage (javax.mail.internet.MimeMessage): 2
Test (org.junit.Test): 2
ZVCalendar (com.zimbra.common.calendar.ZCalendar.ZVCalendar): 1
ContentType (com.zimbra.common.mime.ContentType): 1
ZMimeMessage (com.zimbra.common.zmime.ZMimeMessage): 1
ZMimeMultipart (com.zimbra.common.zmime.ZMimeMultipart): 1
Account (com.zimbra.cs.account.Account): 1
ConversionException (com.zimbra.cs.convert.ConversionException): 1