Example 11 with Range

use of org.apache.poi.hwpf.usermodel.Range in project poi by apache.

the class HWPFLister method dumpParagraphsDom.

public void dumpParagraphsDom(boolean withText) {
    Range range = _doc.getOverallRange();
    for (int p = 0; p < range.numParagraphs(); p++) {
        Paragraph paragraph = range.getParagraph(p);
        System.out.println(p + ":\t" + paragraph);
        if (withText)
            System.out.println(paragraph.text());
    }
}
Also used : Range(org.apache.poi.hwpf.usermodel.Range) Paragraph(org.apache.poi.hwpf.usermodel.Paragraph)

Example 12 with Range

use of org.apache.poi.hwpf.usermodel.Range in project poi by apache.

the class Word6Extractor method getParagraphText.

/**
 * Get the text from the word file, as an array with one String
 * per paragraph
 */
@Deprecated
public String[] getParagraphText() {
    String[] ret;
    // Extract using the model code
    try {
        Range r = doc.getRange();
        ret = WordExtractor.getParagraphText(r);
    } catch (Exception e) {
        // Something's up with turning the text pieces into paragraphs
        // Fall back to ripping out the text pieces
        ret = new String[doc.getTextTable().getTextPieces().size()];
        for (int i = 0; i < ret.length; i++) {
            ret[i] = doc.getTextTable().getTextPieces().get(i).getStringBuilder().toString();
            // Fix the line endings
            ret[i] = ret[i].replaceAll("\r", "\ufffe");
            ret[i] = ret[i].replaceAll("\ufffe", "\r\n");
        }
    }
    return ret;
}
Also used : Range(org.apache.poi.hwpf.usermodel.Range) IOException(java.io.IOException)
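
The fallback branch of getParagraphText above fixes line endings in two steps: each carriage return is first swapped for a placeholder character, and the placeholder is then expanded into a CR LF pair. The following is a minimal standalone sketch of that substitution using only java.lang.String. The class and method names (LineEndingSketch, fixLineEndings) are hypothetical and are not part of POI, and the sketch uses the literal String.replace instead of the regex-based replaceAll, since no pattern matching is needed.

```java
// Hypothetical helper, not part of Apache POI: it isolates the line-ending
// fix from the fallback branch so it can be run without a Word document.
class LineEndingSketch {

    // U+FFFE is a Unicode noncharacter, so it should not occur in real text.
    private static final String PLACEHOLDER = "\ufffe";

    static String fixLineEndings(String piece) {
        // Step 1: swap each carriage return for the placeholder.
        String marked = piece.replace("\r", PLACEHOLDER);
        // Step 2: expand each placeholder into a CR LF pair.
        return marked.replace(PLACEHOLDER, "\r\n");
    }

    public static void main(String[] args) {
        String fixed = fixLineEndings("First paragraph\rSecond paragraph\r");
        // Make the control characters visible when printing.
        System.out.println(fixed.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Going through a placeholder keeps the two substitutions independent of each other; the approach relies on the placeholder never appearing in the extracted text.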

Example 13 with Range

use of org.apache.poi.hwpf.usermodel.Range in project poi by apache.

the class WordExtractor method getParagraphText.

/**
 * Get the text from the word file, as an array with one String per
 * paragraph
 */
public String[] getParagraphText() {
    String[] ret;
    // Extract using the model code
    try {
        Range r = doc.getRange();
        ret = getParagraphText(r);
    } catch (Exception e) {
        // Something's up with turning the text pieces into paragraphs
        // Fall back to ripping out the text pieces
        ret = new String[1];
        ret[0] = getTextFromPieces();
    }
    return ret;
}
Also used : Range(org.apache.poi.hwpf.usermodel.Range) IOException(java.io.IOException)

Example 14 with Range

use of org.apache.poi.hwpf.usermodel.Range in project poi by apache.

the class HWPFLister method dumpChpx.

public void dumpChpx(boolean withProperties, boolean withSprms) {
    for (CHPX chpx : _doc.getCharacterTable().getTextRuns()) {
        System.out.println(chpx);
        if (withProperties) {
            System.out.println(chpx.getCharacterProperties(_doc.getStyleSheet(), (short) StyleSheet.NIL_STYLE));
        }
        if (withSprms) {
            SprmIterator sprmIt = new SprmIterator(chpx.getGrpprl(), 0);
            while (sprmIt.hasNext()) {
                SprmOperation sprm = sprmIt.next();
                System.out.println("\t" + sprm);
            }
        }
        String text = new Range(chpx.getStart(), chpx.getEnd(), _doc.getOverallRange()) {

            public String toString() {
                return "CHPX range (" + super.toString() + ")";
            }
        }.text();
        StringBuilder stringBuilder = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 30)
                stringBuilder.append("\\0x").append(Integer.toHexString(c));
            else
                stringBuilder.append(c);
        }
        System.out.println(stringBuilder);
    }
}
Also used : CHPX(org.apache.poi.hwpf.model.CHPX) SprmIterator(org.apache.poi.hwpf.sprm.SprmIterator) SprmOperation(org.apache.poi.hwpf.sprm.SprmOperation) Range(org.apache.poi.hwpf.usermodel.Range)
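
The last loop in dumpChpx makes low control characters visible by printing them as a hexadecimal escape instead of the raw character. The following standalone sketch shows the same idea outside POI, keeping the threshold of 30 used in the method above. The class and method names (ControlCharSketch, escapeControlChars) are hypothetical.

```java
// Hypothetical helper, not part of Apache POI: reproduces the escaping loop
// from dumpChpx on a plain String.
class ControlCharSketch {

    static String escapeControlChars(String text) {
        StringBuilder stringBuilder = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 30) {
                // Print the character code in hex, prefixed with \0x.
                stringBuilder.append("\\0x").append(Integer.toHexString(c));
            } else {
                stringBuilder.append(c);
            }
        }
        return stringBuilder.toString();
    }

    public static void main(String[] args) {
        // Tab is code 9 and carriage return is code 13 (hex d).
        System.out.println(escapeControlChars("Hello\tWorld\r"));
    }
}
```

With this threshold, the characters U+001E and U+001F pass through unescaped; a caller that wants every C0 control character escaped would compare against 32 instead.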

Example 15 with Range

use of org.apache.poi.hwpf.usermodel.Range in project poi by apache.

the class PicturesTable method getAllPictures.

/**
 * Not all documents have all the images concatenated in the data stream
 * although MS claims so. The best approach is to scan all character runs.
 *
 * @return a list of Picture objects found in current document
 */
public List<Picture> getAllPictures() {
    ArrayList<Picture> pictures = new ArrayList<Picture>();
    Range range = _document.getOverallRange();
    for (int i = 0; i < range.numCharacterRuns(); i++) {
        CharacterRun run = range.getCharacterRun(i);
        if (run == null) {
            continue;
        }
        Picture picture = extractPicture(run, false);
        if (picture != null) {
            pictures.add(picture);
        }
    }
    searchForPictures(_dgg.getEscherRecords(), pictures);
    return pictures;
}
Also used : Picture(org.apache.poi.hwpf.usermodel.Picture) CharacterRun(org.apache.poi.hwpf.usermodel.CharacterRun) ArrayList(java.util.ArrayList) Range(org.apache.poi.hwpf.usermodel.Range)

Aggregations

Range (org.apache.poi.hwpf.usermodel.Range) 24
HWPFDocument (org.apache.poi.hwpf.HWPFDocument) 9
Paragraph (org.apache.poi.hwpf.usermodel.Paragraph) 8
Bookmark (org.apache.poi.hwpf.usermodel.Bookmark) 4
CharacterRun (org.apache.poi.hwpf.usermodel.CharacterRun) 4
Picture (org.apache.poi.hwpf.usermodel.Picture) 3
FileInputStream (java.io.FileInputStream) 2
IOException (java.io.IOException) 2
ArrayList (java.util.ArrayList) 2
FileNotFoundException (java.io.FileNotFoundException) 1
FileOutputStream (java.io.FileOutputStream) 1
InputStream (java.io.InputStream) 1
LinkedList (java.util.LinkedList) 1
List (java.util.List) 1
Matcher (java.util.regex.Matcher) 1
SummaryInformation (org.apache.poi.hpsf.SummaryInformation) 1
OLEShape (org.apache.poi.hslf.model.OLEShape) 1
HSLFObjectData (org.apache.poi.hslf.usermodel.HSLFObjectData) 1
HSLFPictureData (org.apache.poi.hslf.usermodel.HSLFPictureData) 1
HSLFPictureShape (org.apache.poi.hslf.usermodel.HSLFPictureShape) 1