
Example 6 with CarbonTableInputFormat

Use of org.apache.carbondata.hadoop.api.CarbonTableInputFormat in project carbondata by apache.

The method testGetFilteredSplits from the class CarbonTableInputFormatTest:

@Test
public void testGetFilteredSplits() throws Exception {
    CarbonTableInputFormat carbonInputFormat = new CarbonTableInputFormat();
    JobConf jobConf = new JobConf(new Configuration());
    Job job = Job.getInstance(jobConf);
    job.getConfiguration().set("query.id", UUID.randomUUID().toString());
    String tblPath = StoreCreator.getAbsoluteTableIdentifier().getTablePath();
    FileInputFormat.addInputPath(job, new Path(tblPath));
    CarbonTableInputFormat.setDatabaseName(job.getConfiguration(), StoreCreator.getAbsoluteTableIdentifier().getDatabaseName());
    CarbonTableInputFormat.setTableName(job.getConfiguration(), StoreCreator.getAbsoluteTableIdentifier().getTableName());
    // Filter predicate equivalent to: country = 'china'
    Expression expression = new EqualToExpression(new ColumnExpression("country", DataTypes.STRING), new LiteralExpression("china", DataTypes.STRING));
    CarbonTableInputFormat.setFilterPredicates(job.getConfiguration(), expression);
    List splits = carbonInputFormat.getSplits(job);
    // The filtered query should still produce at least one split.
    Assert.assertNotNull(splits);
    Assert.assertFalse(splits.isEmpty());
}
Also used: Path (org.apache.hadoop.fs.Path), CarbonTablePath (org.apache.carbondata.core.util.path.CarbonTablePath), Configuration (org.apache.hadoop.conf.Configuration), Expression (org.apache.carbondata.core.scan.expression.Expression), ColumnExpression (org.apache.carbondata.core.scan.expression.ColumnExpression), LiteralExpression (org.apache.carbondata.core.scan.expression.LiteralExpression), EqualToExpression (org.apache.carbondata.core.scan.expression.conditional.EqualToExpression), CarbonTableInputFormat (org.apache.carbondata.hadoop.api.CarbonTableInputFormat), List (java.util.List), Job (org.apache.hadoop.mapreduce.Job), JobConf (org.apache.hadoop.mapred.JobConf), Test (org.junit.Test)

Example 7 with CarbonTableInputFormat

Use of org.apache.carbondata.hadoop.api.CarbonTableInputFormat in project carbondata by apache.

The method getInputSplits2 from the class CarbonTableReader:

public List<CarbonLocalInputSplit> getInputSplits2(CarbonTableCacheModel tableCacheModel, Expression filters) {
    List<CarbonLocalInputSplit> result = new ArrayList<>();
    if (config.getUnsafeMemoryInMb() != null) {
        CarbonProperties.getInstance().addProperty(CarbonCommonConstants.UNSAFE_WORKING_MEMORY_IN_MB, config.getUnsafeMemoryInMb());
    }
    CarbonTable carbonTable = tableCacheModel.carbonTable;
    TableInfo tableInfo = tableCacheModel.carbonTable.getTableInfo();
    // Use a distinct name for the Hadoop Configuration so it does not shadow the
    // CarbonTableConfig field (config) read above.
    Configuration hadoopConf = new Configuration();
    hadoopConf.set(CarbonTableInputFormat.INPUT_SEGMENT_NUMBERS, "");
    String carbonTablePath = carbonTable.getAbsoluteTableIdentifier().getTablePath();
    hadoopConf.set(CarbonTableInputFormat.INPUT_DIR, carbonTablePath);
    hadoopConf.set(CarbonTableInputFormat.DATABASE_NAME, carbonTable.getDatabaseName());
    hadoopConf.set(CarbonTableInputFormat.TABLE_NAME, carbonTable.getTableName());
    try {
        CarbonTableInputFormat.setTableInfo(hadoopConf, tableInfo);
        CarbonTableInputFormat carbonTableInputFormat = createInputFormat(hadoopConf, carbonTable.getAbsoluteTableIdentifier(), filters);
        JobConf jobConf = new JobConf(hadoopConf);
        Job job = Job.getInstance(jobConf);
        List<InputSplit> splits = carbonTableInputFormat.getSplits(job);
        Gson gson = new Gson();
        if (splits != null && !splits.isEmpty()) {
            // Convert each CarbonInputSplit into a serializable CarbonLocalInputSplit.
            for (InputSplit inputSplit : splits) {
                CarbonInputSplit carbonInputSplit = (CarbonInputSplit) inputSplit;
                result.add(new CarbonLocalInputSplit(carbonInputSplit.getSegmentId(),
                    carbonInputSplit.getPath().toString(), carbonInputSplit.getStart(),
                    carbonInputSplit.getLength(), Arrays.asList(carbonInputSplit.getLocations()),
                    carbonInputSplit.getNumberOfBlocklets(), carbonInputSplit.getVersion().number(),
                    carbonInputSplit.getDeleteDeltaFiles(), gson.toJson(carbonInputSplit.getDetailInfo())));
            }
        }
    } catch (IOException e) {
        throw new RuntimeException("Error creating splits from CarbonTableInputFormat", e);
    }
    return result;
}
Also used: Configuration (org.apache.hadoop.conf.Configuration), ArrayList (java.util.ArrayList), Gson (com.facebook.presto.hadoop.$internal.com.google.gson.Gson), CarbonInputSplit (org.apache.carbondata.hadoop.CarbonInputSplit), IOException (java.io.IOException), CarbonTable (org.apache.carbondata.core.metadata.schema.table.CarbonTable), CarbonTableInputFormat (org.apache.carbondata.hadoop.api.CarbonTableInputFormat), TableInfo (org.apache.carbondata.core.metadata.schema.table.TableInfo), Job (org.apache.hadoop.mapreduce.Job), JobConf (org.apache.hadoop.mapred.JobConf), InputSplit (org.apache.hadoop.mapreduce.InputSplit)

Example 8 with CarbonTableInputFormat

Use of org.apache.carbondata.hadoop.api.CarbonTableInputFormat in project carbondata by apache.

The method createInputFormat from the class CarbonTableReader:

private CarbonTableInputFormat<Object> createInputFormat(Configuration conf, AbsoluteTableIdentifier identifier, Expression filterExpression) throws IOException {
    // Use the parameterized type rather than the raw type to avoid unchecked warnings.
    CarbonTableInputFormat<Object> format = new CarbonTableInputFormat<>();
    CarbonTableInputFormat.setTablePath(conf, identifier.appendWithLocalPrefix(identifier.getTablePath()));
    CarbonTableInputFormat.setFilterPredicates(conf, filterExpression);
    return format;
}
Also used: CarbonTableInputFormat (org.apache.carbondata.hadoop.api.CarbonTableInputFormat)

Example 9 with CarbonTableInputFormat

Use of org.apache.carbondata.hadoop.api.CarbonTableInputFormat in project carbondata by apache.

The method createInputFormat from the class CarbondataRecordSetProvider:

private CarbonTableInputFormat<Object> createInputFormat(Configuration conf, CarbonTable carbonTable, Expression filterExpression, CarbonProjection projection) {
    AbsoluteTableIdentifier identifier = carbonTable.getAbsoluteTableIdentifier();
    // Use the parameterized type rather than the raw type to avoid unchecked warnings.
    CarbonTableInputFormat<Object> format = new CarbonTableInputFormat<>();
    CarbonTableInputFormat.setTablePath(conf, identifier.appendWithLocalPrefix(identifier.getTablePath()));
    CarbonTableInputFormat.setDatabaseName(conf, identifier.getCarbonTableIdentifier().getDatabaseName());
    CarbonTableInputFormat.setTableName(conf, identifier.getCarbonTableIdentifier().getTableName());
    CarbonTableInputFormat.setFilterPredicates(conf, filterExpression);
    CarbonTableInputFormat.setColumnProjection(conf, projection);
    return format;
}
Also used: AbsoluteTableIdentifier (org.apache.carbondata.core.metadata.AbsoluteTableIdentifier), CarbonTableInputFormat (org.apache.carbondata.hadoop.api.CarbonTableInputFormat)

Aggregations

CarbonTableInputFormat (org.apache.carbondata.hadoop.api.CarbonTableInputFormat): 9
Configuration (org.apache.hadoop.conf.Configuration): 4
Path (org.apache.hadoop.fs.Path): 4
JobConf (org.apache.hadoop.mapred.JobConf): 4
CarbonInputSplit (org.apache.carbondata.hadoop.CarbonInputSplit): 3
Job (org.apache.hadoop.mapreduce.Job): 3
IOException (java.io.IOException): 2
List (java.util.List): 2
CarbonTable (org.apache.carbondata.core.metadata.schema.table.CarbonTable): 2
CarbonTablePath (org.apache.carbondata.core.util.path.CarbonTablePath): 2
Test (org.junit.Test): 2
Gson (com.facebook.presto.hadoop.$internal.com.google.gson.Gson): 1
ColumnHandle (com.facebook.presto.spi.ColumnHandle): 1
ImmutableList (com.google.common.collect.ImmutableList): 1
File (java.io.File): 1
FileFilter (java.io.FileFilter): 1
ArrayList (java.util.ArrayList): 1
BitSet (java.util.BitSet): 1
AbsoluteTableIdentifier (org.apache.carbondata.core.metadata.AbsoluteTableIdentifier): 1
TableInfo (org.apache.carbondata.core.metadata.schema.table.TableInfo): 1