Example 16 with OfficeParserConfig

Use of org.apache.tika.parser.microsoft.OfficeParserConfig in project tika by apache.

Class OOXMLParserTest, method testTurningOffTextBoxExtractionExcel.

// TIKA-2346
@Test
public void testTurningOffTextBoxExtractionExcel() throws Exception {
    ParseContext pc = new ParseContext();
    OfficeParserConfig officeParserConfig = new OfficeParserConfig();
    // Turn off extraction of shape-based content.
    officeParserConfig.setIncludeShapeBasedContent(false);
    // Register the config in the context under its class.
    pc.set(OfficeParserConfig.class, officeParserConfig);
    // getXML and assertNotContained are helpers inherited from TikaTest.
    String xml = getXML("testEXCEL_textbox.xlsx", pc).xml;
    assertNotContained("autoshape", xml);
}
Also used: ParseContext (org.apache.tika.parser.ParseContext), OfficeParserConfig (org.apache.tika.parser.microsoft.OfficeParserConfig), ExcelParserTest (org.apache.tika.parser.microsoft.ExcelParserTest), Test (org.junit.Test), TikaTest (org.apache.tika.TikaTest), WordParserTest (org.apache.tika.parser.microsoft.WordParserTest)
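
The test above hands its configuration to the parser through a context keyed by class. The following is a minimal JDK-only sketch of that pattern, not Tika's implementation: `ContextSketch`, `MiniContext`, `ShapeConfig` and `extract` are hypothetical stand-ins, and the default value of the flag is an assumption made for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ContextSketch {

    // Hypothetical stand-in for a class-keyed context; not Tika's ParseContext.
    static class MiniContext {
        private final Map<String, Object> entries = new HashMap<>();

        // Store a value under the name of its class.
        <T> void set(Class<T> key, T value) {
            entries.put(key.getName(), value);
        }

        // Return the stored value, or the supplied default when none was set.
        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key.getName());
            return value == null ? defaultValue : key.cast(value);
        }
    }

    // Hypothetical config with one flag, named after the setter used in the test.
    static class ShapeConfig {
        // Assumption for this sketch: shape content is included unless turned off.
        private boolean includeShapeBasedContent = true;

        void setIncludeShapeBasedContent(boolean include) {
            this.includeShapeBasedContent = include;
        }

        boolean isIncludeShapeBasedContent() {
            return includeShapeBasedContent;
        }
    }

    // Toy "parser": looks up its config in the context and emits text accordingly.
    static String extract(MiniContext context) {
        ShapeConfig config = context.get(ShapeConfig.class, new ShapeConfig());
        StringBuilder out = new StringBuilder("cell text");
        if (config.isIncludeShapeBasedContent()) {
            out.append(" autoshape");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        MiniContext context = new MiniContext();
        // No config registered: the toy parser falls back to the default.
        System.out.println(extract(context));

        ShapeConfig config = new ShapeConfig();
        config.setIncludeShapeBasedContent(false);
        context.set(ShapeConfig.class, config);
        // Config registered with the flag off: shape text is skipped.
        System.out.println(extract(context));
    }
}
```

Keying the context by class lets a caller pass optional settings without widening the parse method's signature; code that finds no entry simply uses its defaults.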

Aggregations

OfficeParserConfig (org.apache.tika.parser.microsoft.OfficeParserConfig): 16
ParseContext (org.apache.tika.parser.ParseContext): 15
TikaTest (org.apache.tika.TikaTest): 13
Test (org.junit.Test): 13
Metadata (org.apache.tika.metadata.Metadata): 9
AutoDetectParser (org.apache.tika.parser.AutoDetectParser): 6
ExcelParserTest (org.apache.tika.parser.microsoft.ExcelParserTest): 6
WordParserTest (org.apache.tika.parser.microsoft.WordParserTest): 6
TikaConfig (org.apache.tika.config.TikaConfig): 5
InputStream (java.io.InputStream): 2
EncryptedDocumentException (org.apache.tika.exception.EncryptedDocumentException): 2
TikaInputStream (org.apache.tika.io.TikaInputStream): 2
File (java.io.File): 1
Date (java.util.Date): 1
HashMap (java.util.HashMap): 1
Locale (java.util.Locale): 1
Map (java.util.Map): 1
CloseShieldInputStream (org.apache.commons.io.input.CloseShieldInputStream): 1
POIXMLDocument (org.apache.poi.POIXMLDocument): 1
POIXMLTextExtractor (org.apache.poi.POIXMLTextExtractor): 1