Example 1 with PasswordBasedExtractor

Use of org.codelibs.fess.crawler.extractor.impl.PasswordBasedExtractor in the project fess-crawler by codelibs.

From the class ExtractorFactoryTest, method setUp.

@Override
protected void setUp() throws Exception {
    super.setUp();
    StandardCrawlerContainer container = new StandardCrawlerContainer()
            .singleton("tikaExtractor", TikaExtractor.class)
            .singleton("pdfExtractor", PdfExtractor.class)
            .singleton("lhaExtractor", LhaExtractor.class)
            .singleton("extractorFactory", ExtractorFactory.class);
    extractorFactory = container.getComponent("extractorFactory");
    TikaExtractor tikaExtractor = container.getComponent("tikaExtractor");
    LhaExtractor lhaExtractor = container.getComponent("lhaExtractor");
    PasswordBasedExtractor pdfExtractor = container.getComponent("pdfExtractor");
    extractorFactory.addExtractor("application/msword", tikaExtractor);
    extractorFactory.addExtractor("application/vnd.ms-excel", tikaExtractor);
    extractorFactory.addExtractor("application/vnd.ms-powerpoint", tikaExtractor);
    extractorFactory.addExtractor("application/vnd.visio", tikaExtractor);
    extractorFactory.addExtractor("application/pdf", pdfExtractor);
    extractorFactory.addExtractor("application/x-lha", lhaExtractor);
    extractorFactory.addExtractor("application/x-lharc", lhaExtractor);
}
Also used: PasswordBasedExtractor (org.codelibs.fess.crawler.extractor.impl.PasswordBasedExtractor), LhaExtractor (org.codelibs.fess.crawler.extractor.impl.LhaExtractor), StandardCrawlerContainer (org.codelibs.fess.crawler.container.StandardCrawlerContainer), TikaExtractor (org.codelibs.fess.crawler.extractor.impl.TikaExtractor)
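
The setUp method above registers one extractor instance under several MIME types. As a rough illustration of that registration pattern only, here is a minimal self-contained sketch. The class ExtractorRegistrySketch, the interface TextExtractor, and the getExtractor lookup are hypothetical stand-ins written for this example; they are not the fess-crawler classes, and the real ExtractorFactory may behave differently (for example in how it handles unregistered types).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the fess-crawler ExtractorFactory.
final class ExtractorRegistrySketch {

    /** Minimal extractor contract assumed for this sketch. */
    interface TextExtractor {
        String name();
    }

    // Maps a MIME type to the extractor responsible for it.
    private final Map<String, TextExtractor> byMimeType = new HashMap<>();

    /** Registers an extractor for a MIME type; a later call for the same type replaces the earlier one. */
    void addExtractor(String mimeType, TextExtractor extractor) {
        byMimeType.put(mimeType, extractor);
    }

    /** Returns the extractor registered for the MIME type, or null if none was registered. */
    TextExtractor getExtractor(String mimeType) {
        return byMimeType.get(mimeType);
    }

    public static void main(String[] args) {
        ExtractorRegistrySketch registry = new ExtractorRegistrySketch();
        TextExtractor office = () -> "office";
        TextExtractor pdf = () -> "pdf";

        // One instance can serve several MIME types, as tikaExtractor does above.
        registry.addExtractor("application/msword", office);
        registry.addExtractor("application/vnd.ms-excel", office);
        registry.addExtractor("application/pdf", pdf);

        System.out.println(registry.getExtractor("application/msword").name()); // office
        System.out.println(registry.getExtractor("application/pdf").name());    // pdf
        System.out.println(registry.getExtractor("text/plain"));                // null
    }
}
```

Sharing one instance across keys keeps per-extractor configuration in a single place, which is presumably why the test reuses tikaExtractor for the four Office MIME types.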
