Use of com.pff.PSTFile in project Xponents by OpenSextant.
The class OutlookPSTCrawler, method collect.
@Override
public void collect() throws IOException, ConfigException {
    //
    // Logic: Traverse PST file.
    // It contains mail, contacts, tasks, notes, other stuff?
    //
    // Replicate folder structure discovered.
    // Mail and date-oriented items should be filed by date. For now, YYYY-MM-DD is fine.
    //
    // For mail messages, review DefaultMailCrawler:
    //   - for each message
    //       save message to disk; create parent folder to contain message contents
    //       run text conversion individually on attachments.
    //
    //   - structure:
    //       ./Mail/
    //           2014-04-09/messageABC.eml
    //           2014-04-09/messageABC/attachment1.doc
    log.info("Traversing PST Folders for FILE={}", pst);
    try {
        PSTFile pstStore = new PSTFile(pst);
        processFolder(pstStore.getRootFolder());
    } catch (PSTException err) {
        throw new ConfigException("Failure with PST traversal", err);
    }
}
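The comments in collect() describe a date-based layout for saved mail (./Mail/YYYY-MM-DD/message.eml, with attachments in a sibling folder named after the message). A minimal sketch of how such paths could be built follows. MailLayoutSketch, messageFile and attachmentFile are hypothetical names that do not come from Xponents, and rendering the date in UTC is an assumption made here so that the folder name does not depend on the host time zone.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Xponents: it only illustrates the
// date-based layout described in the comments of collect().
class MailLayoutSketch {

    // YYYY-MM-DD folder names. UTC is an assumption of this sketch.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    // <root>/Mail/<YYYY-MM-DD>/<messageId>.eml
    static Path messageFile(Path root, Instant sent, String messageId) {
        return root.resolve("Mail")
                   .resolve(DAY.format(sent))
                   .resolve(messageId + ".eml");
    }

    // <root>/Mail/<YYYY-MM-DD>/<messageId>/<attachmentName>
    static Path attachmentFile(Path root, Instant sent, String messageId, String attachmentName) {
        return root.resolve("Mail")
                   .resolve(DAY.format(sent))
                   .resolve(messageId)
                   .resolve(attachmentName);
    }

    public static void main(String[] args) {
        Path root = Paths.get("out");
        Instant sent = Instant.parse("2014-04-09T15:30:00Z");
        System.out.println(messageFile(root, sent, "messageABC"));
        System.out.println(attachmentFile(root, sent, "messageABC", "attachment1.doc"));
    }
}
```

Keeping the message file and its attachment folder under the same date directory mirrors the structure sketched in the comments; a real crawler would also need to sanitize message and attachment names before using them as path components.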
Use of com.pff.PSTFile in project tika by apache.
The class OutlookPSTParser, method parse.
public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException {
    // Use the delegate parser to parse the contained document
    EmbeddedDocumentExtractor embeddedExtractor = EmbeddedDocumentUtil.getEmbeddedDocumentExtractor(context);
    metadata.set(Metadata.CONTENT_TYPE, MS_OUTLOOK_PST_MIMETYPE.toString());
    XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
    xhtml.startDocument();
    TikaInputStream in = TikaInputStream.get(stream);
    PSTFile pstFile = null;
    try {
        pstFile = new PSTFile(in.getFile().getPath());
        metadata.set(Metadata.CONTENT_LENGTH, valueOf(pstFile.getFileHandle().length()));
        boolean isValid = pstFile.getFileHandle().getFD().valid();
        metadata.set("isValid", valueOf(isValid));
        if (isValid) {
            parseFolder(xhtml, pstFile.getRootFolder(), embeddedExtractor);
        }
    } catch (Exception e) {
        throw new TikaException(e.getMessage(), e);
    } finally {
        if (pstFile != null && pstFile.getFileHandle() != null) {
            try {
                pstFile.getFileHandle().close();
            } catch (IOException e) {
                // swallow closing exception
            }
        }
    }
    xhtml.endDocument();
}