Example 6 with CompositeDetector

use of org.apache.tika.detect.CompositeDetector in project tika by apache.

the class TikaConfigSerializer method addDetectors.

private static void addDetectors(Mode mode, Element rootElement, Document doc, TikaConfig config) throws Exception {
    Detector detector = config.getDetector();
    if (mode == Mode.MINIMAL && detector instanceof DefaultDetector) {
        // Everything uses defaults: emit no <detectors> element, only an XML comment with an example
        Node detComment = doc.createComment("for example: <detectors><detector class=\"org.apache.tika.mime.MimeTypes\"/></detectors>");
        rootElement.appendChild(detComment);
        return;
    }
    Element detectorsElement = doc.createElement("detectors");
    if ((mode == Mode.CURRENT && detector instanceof DefaultDetector) || !(detector instanceof CompositeDetector)) {
        Element detectorElement = doc.createElement("detector");
        detectorElement.setAttribute("class", detector.getClass().getCanonicalName());
        detectorsElement.appendChild(detectorElement);
    } else {
        List<Detector> children = ((CompositeDetector) detector).getDetectors();
        for (Detector d : children) {
            Element detectorElement = doc.createElement("detector");
            detectorElement.setAttribute("class", d.getClass().getCanonicalName());
            detectorsElement.appendChild(detectorElement);
        }
    }
    rootElement.appendChild(detectorsElement);
}
Also used : DefaultDetector(org.apache.tika.detect.DefaultDetector) CompositeDetector(org.apache.tika.detect.CompositeDetector) DefaultEncodingDetector(org.apache.tika.detect.DefaultEncodingDetector) Detector(org.apache.tika.detect.Detector) CompositeEncodingDetector(org.apache.tika.detect.CompositeEncodingDetector) EncodingDetector(org.apache.tika.detect.EncodingDetector) Node(org.w3c.dom.Node) Element(org.w3c.dom.Element)
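
To see what kind of XML the method above assembles, here is a minimal JDK-only sketch of the same DOM steps (createElement, setAttribute, appendChild). The class name DetectorsXmlSketch, the method buildDetectorsXml, and the "properties" root element are assumptions made for illustration; the sketch takes plain class-name strings rather than real Tika detectors and is not the TikaConfigSerializer implementation.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of Tika: mirrors the DOM calls used in addDetectors.
final class DetectorsXmlSketch {

    // Builds a root element containing <detectors> with one <detector class="..."> per name.
    public static String buildDetectorsXml(List<String> detectorClassNames) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("properties"); // assumed root element name
        doc.appendChild(root);

        Element detectorsElement = doc.createElement("detectors");
        for (String className : detectorClassNames) {
            Element detectorElement = doc.createElement("detector");
            detectorElement.setAttribute("class", className);
            detectorsElement.appendChild(detectorElement);
        }
        root.appendChild(detectorsElement);

        // Serialize the DOM tree to a string without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildDetectorsXml(List.of(
                "org.apache.tika.detect.DefaultDetector")));
    }
}
```

In the non-composite branch of addDetectors the list would hold a single class name; in the composite branch it would hold one entry per child detector.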

Example 7 with CompositeDetector

use of org.apache.tika.detect.CompositeDetector in project tika by apache.

the class CustomMimeInfo method customCompositeDetector.

public static String customCompositeDetector() throws Exception {
    String path = "file:///path/to/prescription-type.xml";
    MimeTypes typeDatabase = MimeTypesFactory.create(new URL(path));
    Tika tika = new Tika(new CompositeDetector(typeDatabase, new EncryptedPrescriptionDetector()));
    String type = tika.detect("/path/to/tmp/prescription.xpd");
    return type;
}
Also used : CompositeDetector(org.apache.tika.detect.CompositeDetector) MimeTypes(org.apache.tika.mime.MimeTypes) Tika(org.apache.tika.Tika) URL(java.net.URL)

Example 8 with CompositeDetector

use of org.apache.tika.detect.CompositeDetector in project tika by apache.

the class AdvancedTypeDetector method detectWithCustomDetector.

public static String detectWithCustomDetector(String name) throws Exception {
    String config = "/org/apache/tika/mime/tika-mimetypes.xml";
    Detector detector = MimeTypesFactory.create(config);
    Detector custom = new Detector() {

        private static final long serialVersionUID = -5420638839201540749L;

        public MediaType detect(InputStream input, Metadata metadata) {
            String type = metadata.get("my-custom-type-override");
            if (type != null) {
                return MediaType.parse(type);
            } else {
                return MediaType.OCTET_STREAM;
            }
        }
    };
    Tika tika = new Tika(new CompositeDetector(custom, detector));
    return tika.detect(name);
}
Also used : CompositeDetector(org.apache.tika.detect.CompositeDetector) Detector(org.apache.tika.detect.Detector) InputStream(java.io.InputStream) Metadata(org.apache.tika.metadata.Metadata) Tika(org.apache.tika.Tika)

Example 9 with CompositeDetector

use of org.apache.tika.detect.CompositeDetector in project tika by apache.

the class TikaDetectorConfigTest method assertDetectors.

private void assertDetectors(CompositeDetector detector, boolean shouldHavePOIFS, boolean shouldHaveZip) {
    boolean hasZip = false;
    boolean hasPOIFS = false;
    for (Detector d : detector.getDetectors()) {
        if (d instanceof ZipContainerDetector) {
            if (shouldHaveZip) {
                hasZip = true;
            } else {
                fail("Shouldn't have the ZipContainerDetector from config");
            }
        }
        if (d instanceof POIFSContainerDetector) {
            if (shouldHavePOIFS) {
                hasPOIFS = true;
            } else {
                fail("Shouldn't have the POIFSContainerDetector from config");
            }
        }
    }
    if (shouldHavePOIFS)
        assertTrue("Should have the POIFSContainerDetector", hasPOIFS);
    if (shouldHaveZip)
        assertTrue("Should have the ZipContainerDetector", hasZip);
}
Also used : POIFSContainerDetector(org.apache.tika.parser.microsoft.POIFSContainerDetector) CompositeDetector(org.apache.tika.detect.CompositeDetector) EmptyDetector(org.apache.tika.detect.EmptyDetector) Detector(org.apache.tika.detect.Detector) ZipContainerDetector(org.apache.tika.parser.pkg.ZipContainerDetector) DefaultDetector(org.apache.tika.detect.DefaultDetector)

Example 10 with CompositeDetector

use of org.apache.tika.detect.CompositeDetector in project tika by apache.

the class TikaDetectors method detectorAsMap.

private void detectorAsMap(Detector d, Map<String, Object> details) {
    details.put("name", d.getClass().getName());
    boolean isComposite = (d instanceof CompositeDetector);
    details.put("composite", isComposite);
    if (isComposite) {
        List<Map<String, Object>> c = new ArrayList<Map<String, Object>>();
        for (Detector cd : ((CompositeDetector) d).getDetectors()) {
            Map<String, Object> cdet = new HashMap<String, Object>();
            detectorAsMap(cd, cdet);
            c.add(cdet);
        }
        details.put("children", c);
    }
}
Also used : CompositeDetector(org.apache.tika.detect.CompositeDetector) Detector(org.apache.tika.detect.Detector) HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) Map(java.util.Map)
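
The recursion in detectorAsMap can be tried without Tika on the classpath. The following is a rough sketch under stated assumptions: Node, Leaf, and Composite are hypothetical stand-ins for Detector and CompositeDetector (they are not Tika types), and asMap returns the map instead of filling one passed in by the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types; none of these are part of Tika.
final class DetectorTreeSketch {

    interface Node {
        String name();
    }

    // A detector with no children.
    static final class Leaf implements Node {
        private final String name;
        Leaf(String name) { this.name = name; }
        public String name() { return name; }
    }

    // A detector that wraps other detectors, like CompositeDetector.getDetectors().
    static final class Composite implements Node {
        private final String name;
        private final List<Node> children;
        Composite(String name, List<Node> children) {
            this.name = name;
            this.children = children;
        }
        public String name() { return name; }
        List<Node> children() { return children; }
    }

    // Describes a node as a map; composites get a "children" list built recursively.
    public static Map<String, Object> asMap(Node node) {
        Map<String, Object> details = new LinkedHashMap<>();
        details.put("name", node.name());
        boolean isComposite = node instanceof Composite;
        details.put("composite", isComposite);
        if (isComposite) {
            List<Map<String, Object>> children = new ArrayList<>();
            for (Node child : ((Composite) node).children()) {
                children.add(asMap(child));
            }
            details.put("children", children);
        }
        return details;
    }

    public static void main(String[] args) {
        Node tree = new Composite("root", List.of(
                new Leaf("zip"),
                new Composite("inner", List.of(new Leaf("poifs")))));
        System.out.println(asMap(tree));
    }
}
```

Leaves carry only "name" and "composite", so a consumer can tell the two shapes apart by checking the "composite" flag before reading "children".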

Aggregations

CompositeDetector (org.apache.tika.detect.CompositeDetector): 11
Detector (org.apache.tika.detect.Detector): 7
DefaultDetector (org.apache.tika.detect.DefaultDetector): 4
Test (org.junit.Test): 3
Tika (org.apache.tika.Tika): 2
Metadata (org.apache.tika.metadata.Metadata): 2
FileOutputStream (java.io.FileOutputStream): 1
InputStream (java.io.InputStream): 1
OutputStreamWriter (java.io.OutputStreamWriter): 1
Writer (java.io.Writer): 1
URL (java.net.URL): 1
Charset (java.nio.charset.Charset): 1
ArrayList (java.util.ArrayList): 1
HashMap (java.util.HashMap): 1
Map (java.util.Map): 1
TikaConfig (org.apache.tika.config.TikaConfig): 1
TikaConfigSerializer (org.apache.tika.config.TikaConfigSerializer): 1
CompositeEncodingDetector (org.apache.tika.detect.CompositeEncodingDetector): 1
DefaultEncodingDetector (org.apache.tika.detect.DefaultEncodingDetector): 1
EmptyDetector (org.apache.tika.detect.EmptyDetector): 1