Example 1 with PlexOfField

Use of org.apache.poi.hwpf.model.PlexOfField in the Apache POI project.

Class FieldsImpl, method parseFieldStructureImpl.

private void parseFieldStructureImpl(List<PlexOfField> plexOfFields, int startOffsetInclusive, int endOffsetExclusive, List<FieldImpl> result) {
    int next = startOffsetInclusive;
    while (next < endOffsetExclusive) {
        PlexOfField startPlexOfField = plexOfFields.get(next);
        if (startPlexOfField.getFld().getBoundaryType() != FieldDescriptor.FIELD_BEGIN_MARK) {
            /* Start mark seems to be missing */
            next++;
            continue;
        }
        /*
         * We have a start node. Its end offset points to the next node,
         * which is either a separator or an end mark.
         */
        int nextNodePositionInList = binarySearch(plexOfFields, next + 1, endOffsetExclusive, startPlexOfField.getFcEnd());
        if (nextNodePositionInList < 0) {
            /*
             * Too bad, this start field mark doesn't have a corresponding
             * end field mark or separator field mark in the fields table.
             */
            next++;
            continue;
        }
        PlexOfField nextPlexOfField = plexOfFields.get(nextNodePositionInList);
        switch(nextPlexOfField.getFld().getBoundaryType()) {
            case FieldDescriptor.FIELD_SEPARATOR_MARK:
                {
                    PlexOfField separatorPlexOfField = nextPlexOfField;
                    int endNodePositionInList = binarySearch(plexOfFields, nextNodePositionInList, endOffsetExclusive, separatorPlexOfField.getFcEnd());
                    if (endNodePositionInList < 0) {
                        /*
                         * Too bad, this separator field mark doesn't have a
                         * corresponding end field mark in the fields table.
                         */
                        next++;
                        continue;
                    }
                    PlexOfField endPlexOfField = plexOfFields.get(endNodePositionInList);
                    if (endPlexOfField.getFld().getBoundaryType() != FieldDescriptor.FIELD_END_MARK) {
                        /* Not an ending mark */
                        next++;
                        continue;
                    }
                    FieldImpl field = new FieldImpl(startPlexOfField, separatorPlexOfField, endPlexOfField);
                    result.add(field);
                    // adding included fields
                    if (startPlexOfField.getFcStart() + 1 < separatorPlexOfField.getFcStart() - 1) {
                        parseFieldStructureImpl(plexOfFields, next + 1, nextNodePositionInList, result);
                    }
                    if (separatorPlexOfField.getFcStart() + 1 < endPlexOfField.getFcStart() - 1) {
                        parseFieldStructureImpl(plexOfFields, nextNodePositionInList + 1, endNodePositionInList, result);
                    }
                    next = endNodePositionInList + 1;
                    break;
                }
            case FieldDescriptor.FIELD_END_MARK:
                {
                    // we have no separator
                    FieldImpl field = new FieldImpl(startPlexOfField, null, nextPlexOfField);
                    result.add(field);
                    // adding included fields
                    if (startPlexOfField.getFcStart() + 1 < nextPlexOfField.getFcStart() - 1) {
                        parseFieldStructureImpl(plexOfFields, next + 1, nextNodePositionInList, result);
                    }
                    next = nextNodePositionInList + 1;
                    break;
                }
            case FieldDescriptor.FIELD_BEGIN_MARK:
            default:
                {
                    /* something is wrong, ignoring this mark along with start mark */
                    next++;
                    continue;
                }
        }
    }
}
Also used: PlexOfField (org.apache.poi.hwpf.model.PlexOfField)
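
The method above pairs each field begin mark with an optional separator mark and an end mark, skips marks that cannot be matched, and recurses into nested fields. As a rough illustration of that pairing idea only, here is a minimal stack-based sketch. It is not POI's implementation: it does not use PlexOfField, offsets, or binary search, and the class FieldMarkSketch and its Field type are hypothetical names introduced for this example. The constants 0x13, 0x14 and 0x15 are the Word field begin, separator and end characters.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch, not part of Apache POI.
final class FieldMarkSketch {
    static final int BEGIN = 0x13;      // field begin mark
    static final int SEPARATOR = 0x14;  // field separator mark
    static final int END = 0x15;        // field end mark

    /** Indexes into the mark array; separator is -1 when the field has none. */
    static final class Field {
        final int begin;
        final int separator;
        final int end;

        Field(int begin, int separator, int end) {
            this.begin = begin;
            this.separator = separator;
            this.end = end;
        }

        @Override
        public String toString() {
            return "Field[" + begin + "," + separator + "," + end + "]";
        }
    }

    static List<Field> parse(int[] marks) {
        List<Field> result = new ArrayList<>();
        // Each stack entry is {beginIndex, separatorIndex} for a field still open.
        Deque<int[]> open = new ArrayDeque<>();
        for (int i = 0; i < marks.length; i++) {
            switch (marks[i]) {
                case BEGIN:
                    open.push(new int[] {i, -1});
                    break;
                case SEPARATOR:
                    // Only the first separator of the innermost open field counts.
                    if (!open.isEmpty() && open.peek()[1] < 0) {
                        open.peek()[1] = i;
                    }
                    break;
                case END:
                    // An end mark with no open field is ignored.
                    if (!open.isEmpty()) {
                        int[] top = open.pop();
                        result.add(new Field(top[0], top[1], i));
                    }
                    break;
                default:
                    break; // unknown mark type: skip it
            }
        }
        // Begin marks that never saw an end mark are dropped.
        return result;
    }

    public static void main(String[] args) {
        int[] marks = {BEGIN, BEGIN, END, SEPARATOR, END, END};
        System.out.println(parse(marks));
    }
}
```

Unlike the recursive method above, which adds an enclosing field before the fields nested inside it, this sketch emits inner fields first, because a field is recorded only when its end mark is reached.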

Example 2 with PlexOfField

Use of org.apache.poi.hwpf.model.PlexOfField in the Apache POI project.

Class TestBugs, method test47286.

/**
 * [FAILING] Bug 47286 - Word documents saves in wrong format if source
 * contains form elements
 */
@SuppressWarnings("deprecation")
@Test
public void test47286() throws IOException {
    // Fetch the current text
    HWPFDocument doc1 = HWPFTestDataSamples.openSampleFile("Bug47286.doc");
    WordExtractor wordExtractor = new WordExtractor(doc1);
    final String text1;
    try {
        text1 = wordExtractor.getText().trim();
    } finally {
        wordExtractor.close();
        doc1.close();
    }
    // Re-load, then re-save and re-check
    doc1 = HWPFTestDataSamples.openSampleFile("Bug47286.doc");
    HWPFDocument doc2 = HWPFTestDataSamples.writeOutAndReadBack(doc1);
    WordExtractor wordExtractor2 = new WordExtractor(doc2);
    final String text2;
    try {
        text2 = wordExtractor2.getText().trim();
    } finally {
        wordExtractor2.close();
        doc1.close();
    }
    // the text in the saved document has some differences in line
    // separators but we tolerate that
    assertEqualsIgnoreNewline(text1.replaceAll("\n", ""), text2.replaceAll("\n", ""));
    assertEquals(doc1.getCharacterTable().getTextRuns().size(), doc2.getCharacterTable().getTextRuns().size());
    List<PlexOfField> expectedFields = doc1.getFieldsTables().getFieldsPLCF(FieldsDocumentPart.MAIN);
    List<PlexOfField> actualFields = doc2.getFieldsTables().getFieldsPLCF(FieldsDocumentPart.MAIN);
    assertEquals(expectedFields.size(), actualFields.size());
    assertTableStructures(doc1.getRange(), doc2.getRange());
}
Also used: HWPFDocument (org.apache.poi.hwpf.HWPFDocument), PlexOfField (org.apache.poi.hwpf.model.PlexOfField), WordExtractor (org.apache.poi.hwpf.extractor.WordExtractor), Test (org.junit.Test)
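
The test above calls assertEqualsIgnoreNewline, a helper whose body is not shown in this snippet. The sketch below shows one way such a comparison could work, assuming it normalizes line endings and trims both strings before comparing them. The class NewlineCompareSketch and its methods are hypothetical names for this example and may differ from the helper in TestBugs.

```java
// Hypothetical sketch, not the helper from TestBugs.
final class NewlineCompareSketch {
    /** Converts CRLF and lone CR to LF, then trims surrounding whitespace. */
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n').trim();
    }

    /** True when both strings are equal after line-ending normalization. */
    static boolean sameIgnoringNewlineStyle(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }

    public static void main(String[] args) {
        // Same text with different line-ending styles: prints true
        System.out.println(sameIgnoringNewlineStyle("a\r\nb\r", "a\nb\n"));
    }
}
```

In a JUnit test, the boolean result could be wrapped in assertTrue, or the two normalized strings could be passed to assertEquals so that a failure message shows both values.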

Aggregations

PlexOfField (org.apache.poi.hwpf.model.PlexOfField): 2, HWPFDocument (org.apache.poi.hwpf.HWPFDocument): 1, WordExtractor (org.apache.poi.hwpf.extractor.WordExtractor): 1, Test (org.junit.Test): 1