Example 11 with KCell

use of org.pentaho.di.core.spreadsheet.KCell in project pentaho-kettle by pentaho.

the class StaxWorkBookIT method testReadSameRow.

@Test
public void testReadSameRow() throws Exception {
    KWorkbook workbook = getWorkbook("src/it/resources/sample-file.xlsx", null);
    KSheet sheet1 = workbook.getSheet(0);
    KCell[] row = sheet1.getRow(3);
    assertEquals("Two", row[1].getValue());
    row = sheet1.getRow(3);
    assertEquals("Two", row[1].getValue());
}
Also used : KWorkbook(org.pentaho.di.core.spreadsheet.KWorkbook) KSheet(org.pentaho.di.core.spreadsheet.KSheet) KCell(org.pentaho.di.core.spreadsheet.KCell) Test(org.junit.Test)

Example 12 with KCell

use of org.pentaho.di.core.spreadsheet.KCell in project pentaho-kettle by pentaho.

the class ExcelInput method getRowFromWorkbooks.

public Object[] getRowFromWorkbooks() {
    // This procedure outputs a single Excel data row on the destination
    // rowsets...
    Object[] retval = null;
    try {
        // First, see if a file has been opened?
        if (data.workbook == null) {
            // Open the next file..
            data.file = data.files.getFile(data.filenr);
            data.filename = KettleVFS.getFilename(data.file);
            // Add additional fields?
            if (meta.getShortFileNameField() != null && meta.getShortFileNameField().length() > 0) {
                data.shortFilename = data.file.getName().getBaseName();
            }
            if (meta.getPathField() != null && meta.getPathField().length() > 0) {
                data.path = KettleVFS.getFilename(data.file.getParent());
            }
            if (meta.isHiddenField() != null && meta.isHiddenField().length() > 0) {
                data.hidden = data.file.isHidden();
            }
            if (meta.getExtensionField() != null && meta.getExtensionField().length() > 0) {
                data.extension = data.file.getName().getExtension();
            }
            if (meta.getLastModificationDateField() != null && meta.getLastModificationDateField().length() > 0) {
                data.lastModificationDateTime = new Date(data.file.getContent().getLastModifiedTime());
            }
            if (meta.getUriField() != null && meta.getUriField().length() > 0) {
                data.uriName = data.file.getName().getURI();
            }
            if (meta.getRootUriField() != null && meta.getRootUriField().length() > 0) {
                data.rootUriName = data.file.getName().getRootURI();
            }
            if (meta.getSizeField() != null && meta.getSizeField().length() > 0) {
                data.size = Long.valueOf(data.file.getContent().getSize());
            }
            if (meta.isAddResultFile()) {
                ResultFile resultFile = new ResultFile(ResultFile.FILE_TYPE_GENERAL, data.file, getTransMeta().getName(), toString());
                resultFile.setComment(BaseMessages.getString(PKG, "ExcelInput.Log.FileReadByStep"));
                addResultFile(resultFile);
            }
            if (log.isDetailed()) {
                logDetailed(BaseMessages.getString(PKG, "ExcelInput.Log.OpeningFile", "" + data.filenr + " : " + data.filename));
            }
            fpis = new FileInputStream(data.filename);
            data.workbook = WorkbookFactory.getWorkbook(meta.getSpreadSheetType(), fpis, meta.getEncoding());
            data.errorHandler.handleFile(data.file);
            // Start at the first sheet again...
            data.sheetnr = 0;
            if (meta.readAllSheets()) {
                data.sheetNames = data.workbook.getSheetNames();
                data.startColumn = new int[data.sheetNames.length];
                data.startRow = new int[data.sheetNames.length];
                for (int i = 0; i < data.sheetNames.length; i++) {
                    data.startColumn[i] = data.defaultStartColumn;
                    data.startRow[i] = data.defaultStartRow;
                }
            }
        }
        boolean nextsheet = false;
        // What sheet were we handling?
        if (log.isDebug()) {
            logDetailed(BaseMessages.getString(PKG, "ExcelInput.Log.GetSheet", "" + data.filenr + "." + data.sheetnr));
        }
        String sheetName = data.sheetNames[data.sheetnr];
        KSheet sheet = data.workbook.getSheet(sheetName);
        if (sheet != null) {
            // at what row do we continue reading?
            if (data.rownr < 0) {
                data.rownr = data.startRow[data.sheetnr];
                // Add an extra row if we have a header row to skip...
                if (meta.startsWithHeader()) {
                    data.rownr++;
                }
            }
            // Start at the specified column
            data.colnr = data.startColumn[data.sheetnr];
            // Build a new row and fill in the data from the sheet...
            try {
                KCell[] line = sheet.getRow(data.rownr);
                // Advance the cursor by one row right away. Excel starts counting at 0,
                // so the incremented value is also the 1-based line number.
                int lineNr = ++data.rownr;
                if (!data.filePlayList.isProcessingNeeded(data.file, lineNr, sheetName)) {
                    // placeholder, was already null
                    retval = null;
                } else {
                    if (log.isRowLevel()) {
                        logRowlevel(BaseMessages.getString(PKG, "ExcelInput.Log.GetLine", "" + lineNr, data.filenr + "." + data.sheetnr));
                    }
                    if (log.isRowLevel()) {
                        logRowlevel(BaseMessages.getString(PKG, "ExcelInput.Log.ReadLineWith", "" + line.length));
                    }
                    ExcelInputRow excelInputRow = new ExcelInputRow(sheet.getName(), lineNr, line);
                    Object[] r = fillRow(data.colnr, excelInputRow);
                    if (log.isRowLevel()) {
                        logRowlevel(BaseMessages.getString(PKG, "ExcelInput.Log.ConvertedLinToRow", "" + lineNr, data.outputRowMeta.getString(r)));
                    }
                    boolean isEmpty = isLineEmpty(line);
                    if (!isEmpty || !meta.ignoreEmptyRows()) {
                        // Put the row
                        retval = r;
                    } else {
                        if (data.rownr > sheet.getRows()) {
                            nextsheet = true;
                        }
                    }
                    if (isEmpty && meta.stopOnEmpty()) {
                        nextsheet = true;
                    }
                }
            } catch (ArrayIndexOutOfBoundsException e) {
                if (log.isRowLevel()) {
                    logRowlevel(BaseMessages.getString(PKG, "ExcelInput.Log.OutOfIndex"));
                }
                // We tried to read below the last line in the sheet.
                // Go to the next sheet...
                nextsheet = true;
            }
        } else {
            nextsheet = true;
        }
        if (nextsheet) {
            // Go to the next sheet
            data.sheetnr++;
            // Reset the start-row:
            data.rownr = -1;
            // no previous row yet, don't take it from the previous sheet!
            // (that would be plain wrong!)
            data.previousRow = null;
            // Perhaps it was the last sheet?
            if (data.sheetnr >= data.sheetNames.length) {
                jumpToNextFile();
            }
        }
    } catch (Exception e) {
        logError(BaseMessages.getString(PKG, "ExcelInput.Error.ProcessRowFromExcel", data.filename + "", e.toString()), e);
        setErrors(1);
        stopAll();
        return null;
    }
    return retval;
}
Also used : KSheet(org.pentaho.di.core.spreadsheet.KSheet) ResultFile(org.pentaho.di.core.ResultFile) KCell(org.pentaho.di.core.spreadsheet.KCell) Date(java.util.Date) FileInputStream(java.io.FileInputStream) KettleException(org.pentaho.di.core.exception.KettleException) KettleFileException(org.pentaho.di.core.exception.KettleFileException) IOException(java.io.IOException) FileObject(org.apache.commons.vfs2.FileObject)

Example 13 with KCell

use of org.pentaho.di.core.spreadsheet.KCell in project pentaho-kettle by pentaho.

the class StaxPoiSheet method parseRow.

private KCell[] parseRow() throws XMLStreamException {
    KCell[] cells = new StaxPoiCell[numCols];
    for (int i = 0; i < numCols; i++) {
        // go to the "c" cell tag
        while (sheetReader.hasNext()) {
            int event = sheetReader.next();
            if (event == XMLStreamConstants.START_ELEMENT && sheetReader.getLocalName().equals("c")) {
                break;
            }
            if (event == XMLStreamConstants.END_ELEMENT && sheetReader.getLocalName().equals("row")) {
                // premature end of row, returning what we have
                return cells;
            }
        }
        String cellLocation = sheetReader.getAttributeValue(null, "r");
        int columnIndex = StaxUtil.extractColumnNumber(cellLocation) - 1;
        String cellType = sheetReader.getAttributeValue(null, "t");
        String cellStyle = sheetReader.getAttributeValue(null, "s");
        boolean isFormula = false;
        String content = null;
        // get value tag
        while (sheetReader.hasNext()) {
            int event = sheetReader.next();
            if (event == XMLStreamConstants.START_ELEMENT && sheetReader.getLocalName().equals("v")) {
                // read content as string
                if (cellType != null && cellType.equals("s")) {
                    int idx = Integer.parseInt(sheetReader.getElementText());
                    content = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
                } else {
                    content = sheetReader.getElementText();
                }
            }
            if (event == XMLStreamConstants.START_ELEMENT && sheetReader.getLocalName().equals("is")) {
                while (sheetReader.hasNext()) {
                    event = sheetReader.next();
                    if (event == XMLStreamConstants.CHARACTERS) {
                        content = new XSSFRichTextString(sheetReader.getText()).toString();
                        break;
                    }
                }
            }
            if (event == XMLStreamConstants.START_ELEMENT && sheetReader.getLocalName().equals("f")) {
                isFormula = true;
            }
            if (event == XMLStreamConstants.END_ELEMENT && sheetReader.getLocalName().equals("c")) {
                break;
            }
        }
        if (content != null) {
            KCellType kcType = getCellType(cellType, cellStyle, isFormula);
            cells[columnIndex] = new StaxPoiCell(parseValue(kcType, content), kcType, currentRow);
        }
        // else let cell be null
    }
    return cells;
}
Also used : XSSFRichTextString(org.apache.poi.xssf.usermodel.XSSFRichTextString) KCellType(org.pentaho.di.core.spreadsheet.KCellType) KCell(org.pentaho.di.core.spreadsheet.KCell)
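
The parseRow method above places each value at the index derived from the cell's "r" attribute, so cells missing from the XML stay null. The following is a minimal, self-contained sketch of that idea using only the JDK StAX API. It is not the Kettle implementation: the class SparseRowSketch and the methods columnNumber and readRow are hypothetical names, plain strings stand in for KCell, shared strings, inline strings and formulas are ignored, and every "c" element is assumed to carry an "r" attribute.

```java
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative sketch only; class and method names are hypothetical, not Kettle's.
final class SparseRowSketch {

    /** Converts the letters of a cell reference such as "B3" or "AA10" to a 1-based column number. */
    static int columnNumber(String cellRef) {
        int column = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char ch = Character.toUpperCase(cellRef.charAt(i));
            if (ch < 'A' || ch > 'Z') {
                break; // the row digits start here
            }
            column = column * 26 + (ch - 'A' + 1);
        }
        if (column == 0) {
            throw new IllegalArgumentException("No column letters in: " + cellRef);
        }
        return column;
    }

    /** Reads one row element and stores each "v" text at the index given by the cell's "r" attribute. */
    static String[] readRow(String rowXml, int numCols) {
        String[] cells = new String[numCols];
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(rowXml));
            try {
                int columnIndex = -1;
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        String name = reader.getLocalName();
                        if ("c".equals(name)) {
                            // Remember where the next value belongs
                            columnIndex = columnNumber(reader.getAttributeValue(null, "r")) - 1;
                        } else if ("v".equals(name) && columnIndex >= 0 && columnIndex < numCols) {
                            cells[columnIndex] = reader.getElementText();
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && "row".equals(reader.getLocalName())) {
                        break; // end of this row
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed row XML", e);
        }
        return cells;
    }

    public static void main(String[] args) {
        String xml = "<row r=\"1\"><c r=\"A1\"><v>1</v></c><c r=\"C1\"><v>3</v></c></row>";
        System.out.println(Arrays.toString(readRow(xml, 3))); // prints [1, null, 3]
    }
}
```

Unlike parseRow, this sketch skips values whose column index falls outside the array instead of failing, which is a design choice of the example rather than something the source shows.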

Example 14 with KCell

use of org.pentaho.di.core.spreadsheet.KCell in project pentaho-kettle by pentaho.

the class StaxPoiSheetTest method testReadRowRA.

@Test
public void testReadRowRA() throws Exception {
    KSheet sheet1 = getSampleSheet();
    KCell[] row = sheet1.getRow(4);
    assertEquals("Three", row[1].getValue());
    row = sheet1.getRow(2);
    assertEquals("One", row[1].getValue());
}
Also used : KSheet(org.pentaho.di.core.spreadsheet.KSheet) KCell(org.pentaho.di.core.spreadsheet.KCell) Test(org.junit.Test)

Example 15 with KCell

use of org.pentaho.di.core.spreadsheet.KCell in project pentaho-kettle by pentaho.

the class StaxPoiSheetTest method testReadSameRow.

@Test
public void testReadSameRow() throws Exception {
    KSheet sheet1 = getSampleSheet();
    KCell[] row = sheet1.getRow(3);
    assertEquals("Two", row[1].getValue());
    row = sheet1.getRow(3);
    assertEquals("Two", row[1].getValue());
}
Also used : KSheet(org.pentaho.di.core.spreadsheet.KSheet) KCell(org.pentaho.di.core.spreadsheet.KCell) Test(org.junit.Test)

Aggregations

KCell (org.pentaho.di.core.spreadsheet.KCell) 23
KSheet (org.pentaho.di.core.spreadsheet.KSheet) 16
Test (org.junit.Test) 12
Date (java.util.Date) 9
KWorkbook (org.pentaho.di.core.spreadsheet.KWorkbook) 9
ValueMetaInterface (org.pentaho.di.core.row.ValueMetaInterface) 3
IOException (java.io.IOException) 2
FileObject (org.apache.commons.vfs2.FileObject) 2
Cell (org.apache.poi.ss.usermodel.Cell) 2
Row (org.apache.poi.ss.usermodel.Row) 2
XSSFReader (org.apache.poi.xssf.eventusermodel.XSSFReader) 2
XSSFRichTextString (org.apache.poi.xssf.usermodel.XSSFRichTextString) 2
KettleException (org.pentaho.di.core.exception.KettleException) 2
KCellType (org.pentaho.di.core.spreadsheet.KCellType) 2
FileInputStream (java.io.FileInputStream) 1
Method (java.lang.reflect.Method) 1
XMLStreamException (javax.xml.stream.XMLStreamException) 1
InvalidFormatException (org.apache.poi.openxml4j.exceptions.InvalidFormatException) 1
SharedStringsTable (org.apache.poi.xssf.model.SharedStringsTable) 1
StylesTable (org.apache.poi.xssf.model.StylesTable) 1