Example 1 with PinotSegmentRecordReader

Use of com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader in project pinot by linkedin.

From the class ColumnarToStarTreeConverter, method convertSegment.

/**
 * Helper method to perform the conversion.
 * @param columnarSegment Columnar segment directory to convert
 * @throws Exception if the segment cannot be read or the star-tree segment cannot be built
 */
private void convertSegment(File columnarSegment) throws Exception {
    PinotSegmentRecordReader pinotSegmentRecordReader = new PinotSegmentRecordReader(columnarSegment);
    SegmentGeneratorConfig config = new SegmentGeneratorConfig(pinotSegmentRecordReader.getSchema());
    config.setDataDir(_inputDirName);
    config.setInputFilePath(columnarSegment.getAbsolutePath());
    config.setFormat(FileFormat.PINOT);
    config.setEnableStarTreeIndex(true);
    config.setOutDir(_outputDirName);
    config.setStarTreeIndexSpecFile(_starTreeConfigFileName);
    config.setOverwrite(_overwrite);
    config.setSegmentName(columnarSegment.getName());
    SegmentIndexCreationDriver indexCreator = new SegmentIndexCreationDriverImpl();
    indexCreator.init(config);
    indexCreator.build();
}
Also used : SegmentIndexCreationDriver(com.linkedin.pinot.core.segment.creator.SegmentIndexCreationDriver) SegmentGeneratorConfig(com.linkedin.pinot.core.indexsegment.generator.SegmentGeneratorConfig) PinotSegmentRecordReader(com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader) SegmentIndexCreationDriverImpl(com.linkedin.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl)

Example 2 with PinotSegmentRecordReader

Use of com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader in project pinot by linkedin.

From the class PinotSegmentToCsvConverter, method convert.

@Override
public void convert() throws Exception {
    PinotSegmentRecordReader recordReader = new PinotSegmentRecordReader(new File(_segmentDir));
    try {
        recordReader.init();
        try (BufferedWriter recordWriter = new BufferedWriter(new FileWriter(_outputFile))) {
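            // Take the header from the first row's field names, then rewind so that row is still written as data.
            // The hasNext() guard skips the header for an empty segment instead of calling next() with no rows.
            if (_withHeader && recordReader.hasNext()) {
                GenericRow row = recordReader.next();
                recordWriter.write(StringUtils.join(row.getFieldNames(), _delimiter));
                recordWriter.newLine();
                recordReader.rewind();
            }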
            while (recordReader.hasNext()) {
                GenericRow row = recordReader.next();
                String[] fields = row.getFieldNames();
                List<String> record = new ArrayList<>(fields.length);
                for (String field : fields) {
                    Object value = row.getValue(field);
                    if (value instanceof Object[]) {
                        record.add(StringUtils.join((Object[]) value, _listDelimiter));
                    } else {
                        record.add(value.toString());
                    }
                }
                recordWriter.write(StringUtils.join(record, _delimiter));
                recordWriter.newLine();
            }
        }
    } finally {
        recordReader.close();
    }
}
Also used : GenericRow(com.linkedin.pinot.core.data.GenericRow) FileWriter(java.io.FileWriter) ArrayList(java.util.ArrayList) File(java.io.File) PinotSegmentRecordReader(com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader) BufferedWriter(java.io.BufferedWriter)
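
The row-to-line step in the converter above can be tried without a Pinot segment. The following is a minimal sketch, not Pinot code: the class CsvRowSketch and the method toCsvLine are hypothetical names, and a LinkedHashMap stands in for GenericRow (an assumption made only to keep the example self-contained). It joins multi-value fields with a list delimiter and single-value fields with a column delimiter, as the convert method does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class CsvRowSketch {

    /**
     * Formats one row as a delimited line.
     * The map is a hypothetical stand-in for GenericRow: field name -> value, in column order.
     */
    public static String toCsvLine(Map<String, Object> row, String delimiter, String listDelimiter) {
        List<String> record = new ArrayList<>(row.size());
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Object[]) {
                // Multi-value field: join the elements with the list delimiter.
                StringJoiner joiner = new StringJoiner(listDelimiter);
                for (Object element : (Object[]) value) {
                    joiner.add(String.valueOf(element));
                }
                record.add(joiner.toString());
            } else {
                // Single-value field: String.valueOf avoids a NullPointerException on null.
                record.add(String.valueOf(value));
            }
        }
        return String.join(delimiter, record);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 7);
        row.put("tags", new Object[] {"a", "b"});
        row.put("name", null);
        System.out.println(toCsvLine(row, ",", ";")); // prints 7,a;b,null
    }
}
```

Writing a null value as the text null is a choice made by this sketch. Neither the sketch nor the converter above escapes delimiters that occur inside field values.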

Example 3 with PinotSegmentRecordReader

Use of com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader in project pinot by linkedin.

From the class PinotSegmentToJsonConverter, method convert.

@Override
public void convert() throws Exception {
    PinotSegmentRecordReader recordReader = new PinotSegmentRecordReader(new File(_segmentDir));
    try {
        recordReader.init();
        try (BufferedWriter recordWriter = new BufferedWriter(new FileWriter(_outputFile))) {
            while (recordReader.hasNext()) {
                GenericRow row = recordReader.next();
                JSONObject record = new JSONObject();
                for (String field : row.getFieldNames()) {
                    Object value = row.getValue(field);
                    if (value instanceof Object[]) {
                        record.put(field, new JSONArray(value));
                    } else {
                        record.put(field, value);
                    }
                }
                recordWriter.write(record.toString());
                recordWriter.newLine();
            }
        }
    } finally {
        recordReader.close();
    }
}
Also used : GenericRow(com.linkedin.pinot.core.data.GenericRow) JSONObject(org.json.JSONObject) FileWriter(java.io.FileWriter) JSONArray(org.json.JSONArray) File(java.io.File) PinotSegmentRecordReader(com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader) BufferedWriter(java.io.BufferedWriter)

Example 4 with PinotSegmentRecordReader

Use of com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader in project pinot by linkedin.

From the class PinotSegmentToAvroConverter, method convert.

@Override
public void convert() throws Exception {
    PinotSegmentRecordReader recordReader = new PinotSegmentRecordReader(new File(_segmentDir));
    try {
        recordReader.init();
        Schema avroSchema = buildAvroSchemaFromPinotSchema(recordReader.getSchema());
        try (DataFileWriter<Record> recordWriter = new DataFileWriter<>(new GenericDatumWriter<Record>(avroSchema))) {
            recordWriter.create(avroSchema, new File(_outputFile));
            while (recordReader.hasNext()) {
                GenericRow row = recordReader.next();
                Record record = new Record(avroSchema);
                for (String field : row.getFieldNames()) {
                    Object value = row.getValue(field);
                    if (value instanceof Object[]) {
                        record.put(field, Arrays.asList((Object[]) value));
                    } else {
                        record.put(field, value);
                    }
                }
                recordWriter.append(record);
            }
        }
    } finally {
        recordReader.close();
    }
}
Also used : GenericRow(com.linkedin.pinot.core.data.GenericRow) Schema(org.apache.avro.Schema) DataFileWriter(org.apache.avro.file.DataFileWriter) Record(org.apache.avro.generic.GenericData.Record) File(java.io.File) PinotSegmentRecordReader(com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader)

Aggregations

PinotSegmentRecordReader (com.linkedin.pinot.core.data.readers.PinotSegmentRecordReader) 4
GenericRow (com.linkedin.pinot.core.data.GenericRow) 3
File (java.io.File) 3
BufferedWriter (java.io.BufferedWriter) 2
FileWriter (java.io.FileWriter) 2
SegmentGeneratorConfig (com.linkedin.pinot.core.indexsegment.generator.SegmentGeneratorConfig) 1
SegmentIndexCreationDriver (com.linkedin.pinot.core.segment.creator.SegmentIndexCreationDriver) 1
SegmentIndexCreationDriverImpl (com.linkedin.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl) 1
ArrayList (java.util.ArrayList) 1
Schema (org.apache.avro.Schema) 1
DataFileWriter (org.apache.avro.file.DataFileWriter) 1
Record (org.apache.avro.generic.GenericData.Record) 1
JSONArray (org.json.JSONArray) 1
JSONObject (org.json.JSONObject) 1