Example 6 with MediaTypeRegistry

Use of org.apache.tika.mime.MediaTypeRegistry in the Apache Tika project.

Class DefaultParser, method getParsers.

@Override
public Map<MediaType, Parser> getParsers(ParseContext context) {
    Map<MediaType, Parser> map = super.getParsers(context);
    if (loader != null) {
        // Add dynamic parser services (they always override static ones)
        MediaTypeRegistry registry = getMediaTypeRegistry();
        List<Parser> parsers = loader.loadDynamicServiceProviders(Parser.class);
        // Reverse so the best parser comes last and overwrites earlier entries
        Collections.reverse(parsers);
        for (Parser parser : parsers) {
            for (MediaType type : parser.getSupportedTypes(context)) {
                map.put(registry.normalize(type), parser);
            }
        }
    }
    return map;
}
Also used : MediaType(org.apache.tika.mime.MediaType) MediaTypeRegistry(org.apache.tika.mime.MediaTypeRegistry)
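
The override behaviour in the listing above comes from plain Map.put semantics: a later put for the same key replaces the earlier value. The following is a minimal sketch of that idea using only the JDK. The Provider class, the resolve and demo methods, and the parser names are hypothetical stand-ins and are not part of Tika; the sketch also assumes the loader returns providers best-first, which the reverse call in the listing suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OverrideOrderSketch {

    // Hypothetical stand-in for a parser: a name plus the type names it supports.
    static final class Provider {
        final String name;
        final List<String> types;

        Provider(String name, String... types) {
            this.name = name;
            this.types = Arrays.asList(types);
        }
    }

    // Starts from the static mappings, then lets dynamic providers overwrite them.
    // Assumes 'dynamic' is ordered best-first, so a reversed copy is iterated.
    static Map<String, String> resolve(Map<String, String> staticMap, List<Provider> dynamic) {
        Map<String, String> map = new HashMap<>(staticMap);
        List<Provider> ordered = new ArrayList<>(dynamic);
        Collections.reverse(ordered); // the best provider is now last
        for (Provider provider : ordered) {
            for (String type : provider.types) {
                map.put(type, provider.name); // later puts replace earlier ones
            }
        }
        return map;
    }

    // Illustrative data only: two static mappings and two dynamic providers.
    static Map<String, String> demo() {
        Map<String, String> staticMap = new HashMap<>();
        staticMap.put("text/plain", "StaticTextParser");
        staticMap.put("image/png", "StaticImageParser");
        List<Provider> dynamic = Arrays.asList(
                new Provider("PreferredParser", "text/plain"),
                new Provider("FallbackParser", "text/plain", "text/html"));
        return resolve(staticMap, dynamic);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the preferred provider ends up owning text/plain, the fallback provider keeps text/html, and the untouched static mapping for image/png survives.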

Example 7 with MediaTypeRegistry

Use of org.apache.tika.mime.MediaTypeRegistry in the Apache Tika project.

Class TikaCLI, method displaySupportedTypes.

/**
 * Prints all the known media types, aliases and matching parser classes.
 */
private void displaySupportedTypes() {
    AutoDetectParser parser = new AutoDetectParser();
    MediaTypeRegistry registry = parser.getMediaTypeRegistry();
    Map<MediaType, Parser> parsers = parser.getParsers();
    for (MediaType type : registry.getTypes()) {
        System.out.println(type);
        for (MediaType alias : registry.getAliases(type)) {
            System.out.println("  alias:     " + alias);
        }
        MediaType supertype = registry.getSupertype(type);
        if (supertype != null) {
            System.out.println("  supertype: " + supertype);
        }
        Parser p = parsers.get(type);
        if (p != null) {
            if (p instanceof CompositeParser) {
                p = ((CompositeParser) p).getParsers().get(type);
            }
            System.out.println("  parser:    " + p.getClass().getName());
        }
    }
}
Also used : CompositeParser(org.apache.tika.parser.CompositeParser) AutoDetectParser(org.apache.tika.parser.AutoDetectParser) MediaType(org.apache.tika.mime.MediaType) MediaTypeRegistry(org.apache.tika.mime.MediaTypeRegistry) Parser(org.apache.tika.parser.Parser) DigestingParser(org.apache.tika.parser.DigestingParser) NetworkParser(org.apache.tika.parser.NetworkParser) ForkParser(org.apache.tika.fork.ForkParser)
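
The listing above prints only the direct supertype of each type. Repeating the getSupertype lookup until it returns null yields the full ancestor chain. The sketch below shows that loop against a hypothetical in-memory table; the SupertypeChainSketch class, its methods, and the sample relationships are illustrative assumptions, not Tika's registry or its actual data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SupertypeChainSketch {

    // Hypothetical child -> parent table standing in for a real registry.
    private final Map<String, String> parents = new HashMap<>();

    void addSuperType(String type, String supertype) {
        parents.put(type, supertype);
    }

    // Returns the direct supertype, or null when none is registered.
    String getSupertype(String type) {
        return parents.get(type);
    }

    // Walks upward from 'type', collecting each ancestor in order.
    List<String> ancestors(String type) {
        List<String> chain = new ArrayList<>();
        String current = getSupertype(type);
        // The contains check stops the walk if the table accidentally holds a cycle.
        while (current != null && !chain.contains(current)) {
            chain.add(current);
            current = getSupertype(current);
        }
        return chain;
    }

    // Illustrative relationships only.
    static List<String> demo() {
        SupertypeChainSketch registry = new SupertypeChainSketch();
        registry.addSuperType("image/svg+xml", "application/xml");
        registry.addSuperType("application/xml", "text/plain");
        registry.addSuperType("text/plain", "application/octet-stream");
        return registry.ancestors("image/svg+xml");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```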

Example 8 with MediaTypeRegistry

Use of org.apache.tika.mime.MediaTypeRegistry in the Apache Tika project.

Class TikaMimeTypes, method getMediaTypes.

protected List<MediaTypeDetails> getMediaTypes() {
    MediaTypeRegistry registry = TikaResource.getConfig().getMediaTypeRegistry();
    Map<MediaType, Parser> parsers = ((CompositeParser) TikaResource.getConfig().getParser()).getParsers();
    List<MediaTypeDetails> types = new ArrayList<TikaMimeTypes.MediaTypeDetails>(registry.getTypes().size());
    for (MediaType type : registry.getTypes()) {
        MediaTypeDetails details = new MediaTypeDetails();
        details.type = type;
        details.aliases = registry.getAliases(type).toArray(new MediaType[0]);
        MediaType supertype = registry.getSupertype(type);
        if (supertype != null && !MediaType.OCTET_STREAM.equals(supertype)) {
            details.supertype = supertype;
        }
        Parser p = parsers.get(type);
        if (p != null) {
            if (p instanceof CompositeParser) {
                p = ((CompositeParser) p).getParsers().get(type);
            }
            details.parser = p.getClass().getName();
        }
        types.add(details);
    }
    return types;
}
Also used : CompositeParser(org.apache.tika.parser.CompositeParser) ArrayList(java.util.ArrayList) MediaType(org.apache.tika.mime.MediaType) MediaTypeRegistry(org.apache.tika.mime.MediaTypeRegistry) Parser(org.apache.tika.parser.Parser)
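
The listing above unwraps one level of CompositeParser and then calls getClass() on the result of the inner lookup. Whether that inner lookup can return null in practice depends on Tika internals not shown here. A defensive variant of the same unwrap step, sketched with hypothetical stand-in types (Handler, LeafHandler, CompositeHandler and handlerName are not Tika classes or methods), could look like this:

```java
import java.util.HashMap;
import java.util.Map;

public class ParserNameSketch {

    // Hypothetical stand-ins for a parser interface and its implementations.
    interface Handler {}

    static final class LeafHandler implements Handler {}

    static final class CompositeHandler implements Handler {
        final Map<String, Handler> delegates = new HashMap<>();
    }

    // Returns the class name of the handler for 'type', unwrapping one composite
    // level when the composite has a direct entry, or null if nothing is registered.
    static String handlerName(Map<String, Handler> handlers, String type) {
        Handler handler = handlers.get(type);
        if (handler instanceof CompositeHandler) {
            Handler inner = ((CompositeHandler) handler).delegates.get(type);
            if (inner != null) {
                handler = inner; // otherwise keep reporting the composite itself
            }
        }
        return handler == null ? null : handler.getClass().getName();
    }

    // Illustrative data: one composite with a delegate, one empty composite.
    static Map<String, Handler> sampleHandlers() {
        CompositeHandler wrapping = new CompositeHandler();
        wrapping.delegates.put("text/plain", new LeafHandler());
        Map<String, Handler> handlers = new HashMap<>();
        handlers.put("text/plain", wrapping);
        handlers.put("image/png", new CompositeHandler());
        return handlers;
    }

    public static void main(String[] args) {
        System.out.println(handlerName(sampleHandlers(), "text/plain"));
        System.out.println(handlerName(sampleHandlers(), "image/png"));
        System.out.println(handlerName(sampleHandlers(), "audio/mpeg"));
    }
}
```

Falling back to the composite's own class name is one design choice; returning null or a placeholder string would work equally well for a listing endpoint.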

Aggregations

MediaTypeRegistry (org.apache.tika.mime.MediaTypeRegistry) 8
MediaType (org.apache.tika.mime.MediaType) 7
ArrayList (java.util.ArrayList) 2
HashSet (java.util.HashSet) 2
TikaConfig (org.apache.tika.config.TikaConfig) 2
CompositeParser (org.apache.tika.parser.CompositeParser) 2
Parser (org.apache.tika.parser.Parser) 2
BufferedInputStream (java.io.BufferedInputStream) 1
BufferedReader (java.io.BufferedReader) 1
File (java.io.File) 1
FileInputStream (java.io.FileInputStream) 1
InputStreamReader (java.io.InputStreamReader) 1
TreeSet (java.util.TreeSet) 1
PasswordRequiredException (org.apache.commons.compress.PasswordRequiredException) 1
ArchiveEntry (org.apache.commons.compress.archivers.ArchiveEntry) 1
ArchiveException (org.apache.commons.compress.archivers.ArchiveException) 1
ArchiveInputStream (org.apache.commons.compress.archivers.ArchiveInputStream) 1
ArchiveStreamFactory (org.apache.commons.compress.archivers.ArchiveStreamFactory) 1
StreamingNotSupportedException (org.apache.commons.compress.archivers.StreamingNotSupportedException) 1
ArArchiveInputStream (org.apache.commons.compress.archivers.ar.ArArchiveInputStream) 1