
Example 21 with XWPFDocument

Use of org.apache.poi.xwpf.usermodel.XWPFDocument in the Apache POI project.

From the class TestXWPFWordExtractor, method testFormFootnotes.

public void testFormFootnotes() throws IOException {
    XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("form_footnotes.docx");
    XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
    String text = extractor.getText();
    assertContains(text, "testdoc");
    assertContains(text, "test phrase");
    extractor.close();
}
Also used : XWPFDocument(org.apache.poi.xwpf.usermodel.XWPFDocument)

Example 22 with XWPFDocument

Use of org.apache.poi.xwpf.usermodel.XWPFDocument in the Apache POI project.

From the class TestXWPFWordExtractor, method testNoFieldCodes.

/**
 * The output should not contain field codes, e.g. those specified in the
 * w:instrText tag (spec sec. 17.16.23).
 *
 * @throws IOException
 */
public void testNoFieldCodes() throws IOException {
    XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("FieldCodes.docx");
    XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
    String text = extractor.getText();
    assertTrue(text.length() > 0);
    assertFalse(text.contains("AUTHOR"));
    assertFalse(text.contains("CREATEDATE"));
    extractor.close();
}
Also used : XWPFDocument(org.apache.poi.xwpf.usermodel.XWPFDocument)
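
To illustrate what the test above checks, here is a minimal sketch of the difference between a field's instruction text (w:instrText) and its visible result (w:t). It uses only the JDK's DOM parser on a hand-written WordprocessingML fragment. The class FieldCodeSketch, the method visibleText, and the sample fragment are hypothetical names made up for this illustration; this is not how XWPFWordExtractor is implemented.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical illustration class, not part of Apache POI.
public class FieldCodeSketch {
    // Namespace of the main WordprocessingML vocabulary (the "w:" prefix).
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Concatenates the content of every w:t element; w:instrText is never read. */
    static String visibleText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            Document dom = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList textNodes = dom.getElementsByTagNameNS(W_NS, "t");
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < textNodes.getLength(); i++) {
                out.append(textNodes.item(i).getTextContent());
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML fragment", e);
        }
    }

    public static void main(String[] args) {
        // A made-up paragraph containing one AUTHOR field and its result.
        String xml = "<w:p xmlns:w=\"" + W_NS + "\">"
                + "<w:r><w:fldChar w:fldCharType=\"begin\"/></w:r>"
                + "<w:r><w:instrText> AUTHOR </w:instrText></w:r>"
                + "<w:r><w:fldChar w:fldCharType=\"separate\"/></w:r>"
                + "<w:r><w:t>Jane Doe</w:t></w:r>"
                + "<w:r><w:fldChar w:fldCharType=\"end\"/></w:r>"
                + "</w:p>";
        System.out.println(visibleText(xml)); // prints "Jane Doe"
    }
}
```

The field code AUTHOR lives only in w:instrText, so a reader that collects w:t content alone never sees it, which matches the two assertFalse checks in the test.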

Example 23 with XWPFDocument

Use of org.apache.poi.xwpf.usermodel.XWPFDocument in the Apache POI project.

From the class TestXWPFWordExtractor, method testGetSimpleText.

/**
 * Get text out of the simple file.
 *
 * @throws IOException
 */
public void testGetSimpleText() throws IOException {
    XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("sample.docx");
    XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
    String text = extractor.getText();
    assertTrue(text.length() > 0);
    // Check contents
    assertStartsWith(text, "Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Nunc at risus vel erat tempus posuere. Aenean non ante. Suspendisse vehicula dolor sit amet odio.");
    assertEndsWith(text, "Phasellus ultricies mi nec leo. Sed tempus. In sit amet lorem at velit faucibus vestibulum.\n");
    // Check number of paragraphs by counting number of newlines
    int numberOfParagraphs = StringUtil.countMatches(text, '\n');
    assertEquals(3, numberOfParagraphs);
    extractor.close();
}
Also used : XWPFDocument(org.apache.poi.xwpf.usermodel.XWPFDocument)
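
The paragraph check in the test above relies on counting newline characters. A minimal, JDK-only sketch of that idea follows. The class ParagraphCount and its countMatches method are hypothetical stand-ins written for this illustration, not POI's StringUtil implementation, and the sketch assumes every paragraph ends with exactly one newline, as the assertEndsWith check suggests for the extracted text.

```java
// Hypothetical illustration class, not part of Apache POI.
public class ParagraphCount {
    /** Counts how often needle occurs in text; returns 0 for null input. */
    static int countMatches(String text, char needle) {
        if (text == null) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == needle) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Three paragraphs, each terminated by a newline (made-up sample text).
        String text = "First paragraph.\nSecond paragraph.\nThird paragraph.\n";
        System.out.println(countMatches(text, '\n')); // prints 3
    }
}
```

Under that assumption the newline count equals the paragraph count; text whose last paragraph lacks a trailing newline would be undercounted by one.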

Example 24 with XWPFDocument

Use of org.apache.poi.xwpf.usermodel.XWPFDocument in the Apache POI project.

From the class TestXWPFWordExtractor, method testTableFootnotes.

public void testTableFootnotes() throws IOException {
    XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("table_footnotes.docx");
    XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
    assertContains(extractor.getText(), "snoska");
    extractor.close();
}
Also used : XWPFDocument(org.apache.poi.xwpf.usermodel.XWPFDocument)

Example 25 with XWPFDocument

Use of org.apache.poi.xwpf.usermodel.XWPFDocument in the Apache POI project.

From the class TestXWPFWordExtractor, method testDrawings.

/**
 * Test for parsing document with drawings to prevent
 * NoClassDefFoundError for CTAnchor in XWPFRun.
 */
public void testDrawings() throws IOException {
    XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("drawing.docx");
    XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
    String text = extractor.getText();
    assertTrue(text.length() > 0);
    extractor.close();
}
Also used : XWPFDocument(org.apache.poi.xwpf.usermodel.XWPFDocument)

Aggregations

XWPFDocument (org.apache.poi.xwpf.usermodel.XWPFDocument) 51
Test (org.junit.Test) 15
FileOutputStream (java.io.FileOutputStream) 11
File (java.io.File) 9
XWPFParagraph (org.apache.poi.xwpf.usermodel.XWPFParagraph) 9
XWPFRun (org.apache.poi.xwpf.usermodel.XWPFRun) 9
InputStream (java.io.InputStream) 6
OutputStream (java.io.OutputStream) 6
FileInputStream (java.io.FileInputStream) 4
XWPFTable (org.apache.poi.xwpf.usermodel.XWPFTable) 4
OPCPackage (org.apache.poi.openxml4j.opc.OPCPackage) 3
NPOIFSFileSystem (org.apache.poi.poifs.filesystem.NPOIFSFileSystem) 3
XMLSlideShow (org.apache.poi.xslf.usermodel.XMLSlideShow) 3
XWPFWordExtractor (org.apache.poi.xwpf.extractor.XWPFWordExtractor) 3
XWPFFooter (org.apache.poi.xwpf.usermodel.XWPFFooter) 3
XWPFHeader (org.apache.poi.xwpf.usermodel.XWPFHeader) 3
ByteArrayInputStream (java.io.ByteArrayInputStream) 2
ZipFile (java.util.zip.ZipFile) 2
HSLFSlideShow (org.apache.poi.hslf.usermodel.HSLFSlideShow) 2
HSSFWorkbook (org.apache.poi.hssf.usermodel.HSSFWorkbook) 2