Example 1 with ReplacingInputStream

Use of org.apache.poi.util.ReplacingInputStream in the Apache POI project.

The read method of the class XSSFVMLDrawing.

@SuppressWarnings("resource")
protected void read(InputStream is) throws IOException, XmlException {
    Document doc;
    try {
        /*
         * This is a seriously sick fix for the fact that some .xlsx files contain raw bits
         * of HTML, without being escaped or properly turned into XML.
         * The result is that they contain things like <br>, which breaks the XML parsing.
         * This very sick InputStream wrapper attempts to spot these as they go past, and fix them.
         */
        doc = DocumentHelper.readDocument(new ReplacingInputStream(is, "<br>", "<br/>"));
    } catch (SAXException e) {
        throw new XmlException(e.getMessage(), e);
    }
    XmlObject root = XmlObject.Factory.parse(doc, DEFAULT_XML_OPTIONS);
    _qnames = new ArrayList<QName>();
    _items = new ArrayList<XmlObject>();
    for (XmlObject obj : root.selectPath("$this/xml/*")) {
        Node nd = obj.getDomNode();
        QName qname = new QName(nd.getNamespaceURI(), nd.getLocalName());
        if (qname.equals(QNAME_SHAPE_LAYOUT)) {
            _items.add(CTShapeLayout.Factory.parse(obj.xmlText(), DEFAULT_XML_OPTIONS));
        } else if (qname.equals(QNAME_SHAPE_TYPE)) {
            CTShapetype st = CTShapetype.Factory.parse(obj.xmlText(), DEFAULT_XML_OPTIONS);
            _items.add(st);
            _shapeTypeId = st.getId();
        } else if (qname.equals(QNAME_SHAPE)) {
            CTShape shape = CTShape.Factory.parse(obj.xmlText(), DEFAULT_XML_OPTIONS);
            String id = shape.getId();
            if (id != null) {
                Matcher m = ptrn_shapeId.matcher(id);
                if (m.find()) {
                    _shapeId = Math.max(_shapeId, Integer.parseInt(m.group(1)));
                }
            }
            _items.add(shape);
        } else {
            Document doc2;
            try {
                InputSource is2 = new InputSource(new StringReader(obj.xmlText()));
                doc2 = DocumentHelper.readDocument(is2);
            } catch (SAXException e) {
                throw new XmlException(e.getMessage(), e);
            }
            _items.add(XmlObject.Factory.parse(doc2, DEFAULT_XML_OPTIONS));
        }
        _qnames.add(qname);
    }
}
Also used: InputSource (org.xml.sax.InputSource), Matcher (java.util.regex.Matcher), QName (javax.xml.namespace.QName), Node (org.w3c.dom.Node), CTShape (com.microsoft.schemas.vml.CTShape), Document (org.w3c.dom.Document), ReplacingInputStream (org.apache.poi.util.ReplacingInputStream), SAXException (org.xml.sax.SAXException), CTShapetype (com.microsoft.schemas.vml.CTShapetype), XmlException (org.apache.xmlbeans.XmlException), StringReader (java.io.StringReader), XmlObject (org.apache.xmlbeans.XmlObject)
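
The example above wraps the raw stream so that every unescaped <br> becomes <br/> before the XML parser sees it. The sketch below illustrates that general idea using only the JDK. It is not POI's ReplacingInputStream: the class name PatternReplacingStream and the helper replaceIn are hypothetical, and the PushbackInputStream approach is an assumption chosen for brevity, not a description of how POI implements it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: replaces every occurrence of a byte pattern while streaming. */
final class PatternReplacingStream extends InputStream {
    private final PushbackInputStream in;
    private final byte[] pattern;
    private final byte[] replacement;
    private int pending = -1; // index of the next replacement byte to emit, -1 if none

    PatternReplacingStream(InputStream source, String pattern, String replacement) {
        this.pattern = pattern.getBytes(StandardCharsets.UTF_8);
        this.replacement = replacement.getBytes(StandardCharsets.UTF_8);
        if (this.pattern.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        // The pushback buffer only ever holds a failed partial match.
        this.in = new PushbackInputStream(source, this.pattern.length);
    }

    @Override
    public int read() throws IOException {
        while (true) {
            if (pending >= 0) {
                if (pending < replacement.length) {
                    return replacement[pending++] & 0xFF;
                }
                pending = -1; // replacement fully emitted
            }
            int first = in.read();
            if (first != (pattern[0] & 0xFF)) {
                return first; // ordinary byte, or -1 at end of stream
            }
            // Possible match: read ahead and compare against the rest of the pattern.
            byte[] seen = new byte[pattern.length];
            int count = 1;
            boolean matched = true;
            while (count < pattern.length) {
                int next = in.read();
                if (next == -1) {
                    matched = false;
                    break;
                }
                seen[count] = (byte) next;
                count++;
                if (next != (pattern[count - 1] & 0xFF)) {
                    matched = false;
                    break;
                }
            }
            if (!matched) {
                // Push back everything read after the first byte, then emit that byte as-is.
                in.unread(seen, 1, count - 1);
                return first;
            }
            pending = 0; // full match: start emitting the replacement on the next loop pass
        }
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Convenience helper for trying the stream on an in-memory string. */
    static String replaceIn(String text, String pattern, String replacement) {
        InputStream source = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        try (InputStream fixed = new PatternReplacingStream(source, pattern, replacement)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = fixed.read()) != -1) {
                out.write(b);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For instance, PatternReplacingStream.replaceIn("a<br>b", "<br>", "<br/>") yields "a<br/>b", while a tag such as <b> that only shares a prefix with the pattern passes through unchanged. Matching on bytes rather than characters keeps the wrapper independent of the document's declared encoding, at the cost of assuming the pattern is encoded the same way in the stream.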
