Example 11 with BOOLEAN

Use of io.trino.spi.type.BooleanType.BOOLEAN in project trino by trinodb.

Class TestHistogram, method testManyValuesInducingRehash. BOOLEAN is imported by the class but not referenced in this method.

private static void testManyValuesInducingRehash(TestingAggregationFunction aggregationFunction) {
    double distinctFraction = 0.1;
    int numGroups = 50000;
    int itemCount = 30;
    Random random = new Random();
    GroupedAggregator groupedAggregator = aggregationFunction.createAggregatorFactory(SINGLE, ImmutableList.of(0), OptionalInt.empty()).createGroupedAggregator();
    for (int j = 0; j < numGroups; j++) {
        Map<String, Long> expectedValues = new HashMap<>();
        List<String> valueList = new ArrayList<>();
        for (int i = 0; i < itemCount; i++) {
            String str = String.valueOf(i % 10);
            String item = IntStream.range(0, itemCount).mapToObj(x -> str).collect(Collectors.joining());
            // With probability distinctFraction, prefix the group index so the value is unique to this group.
            if (random.nextDouble() < distinctFraction) {
                item = j + "-" + item;
            }
            valueList.add(item);
            expectedValues.merge(item, 1L, Long::sum);
        }
        Block block = createStringsBlock(valueList);
        AggregationTestInputBuilder testInputBuilder = new AggregationTestInputBuilder(new Block[] { block }, aggregationFunction);
        AggregationTestInput test1 = testInputBuilder.build();
        test1.runPagesOnAggregatorWithAssertion(j, aggregationFunction.getFinalType(), groupedAggregator, new AggregationTestOutput(expectedValues));
    }
}
Also used : DateTimeZone(org.joda.time.DateTimeZone) TypeSignatureProvider.fromTypes(io.trino.sql.analyzer.TypeSignatureProvider.fromTypes) TestingFunctionResolution(io.trino.metadata.TestingFunctionResolution) AggregationTestInput(io.trino.operator.aggregation.groupby.AggregationTestInput) Test(org.testng.annotations.Test) Random(java.util.Random) AggregationTestInputBuilder(io.trino.operator.aggregation.groupby.AggregationTestInputBuilder) Block(io.trino.spi.block.Block) DateTimeEncoding.unpackZoneKey(io.trino.spi.type.DateTimeEncoding.unpackZoneKey) Map(java.util.Map) TIMESTAMP_WITH_TIME_ZONE(io.trino.spi.type.TimestampWithTimeZoneType.TIMESTAMP_WITH_TIME_ZONE) RowType(io.trino.spi.type.RowType) ImmutableMap(com.google.common.collect.ImmutableMap) BlockAssertions.createDoublesBlock(io.trino.block.BlockAssertions.createDoublesBlock) OperatorAssertion.toRow(io.trino.operator.OperatorAssertion.toRow) DateTimeEncoding.packDateTimeWithZone(io.trino.spi.type.DateTimeEncoding.packDateTimeWithZone) ArrayType(io.trino.spi.type.ArrayType) Collectors(java.util.stream.Collectors) SqlTimestampWithTimeZone(io.trino.spi.type.SqlTimestampWithTimeZone) List(java.util.List) BIGINT(io.trino.spi.type.BigintType.BIGINT) BlockAssertions.createLongsBlock(io.trino.block.BlockAssertions.createLongsBlock) DateTimeEncoding.unpackMillisUtc(io.trino.spi.type.DateTimeEncoding.unpackMillisUtc) AggregationTestOutput(io.trino.operator.aggregation.groupby.AggregationTestOutput) IntStream(java.util.stream.IntStream) BOOLEAN(io.trino.spi.type.BooleanType.BOOLEAN) SINGLE(io.trino.sql.planner.plan.AggregationNode.Step.SINGLE) HashMap(java.util.HashMap) OptionalInt(java.util.OptionalInt) StructuralTestUtil.mapBlockOf(io.trino.util.StructuralTestUtil.mapBlockOf) ArrayList(java.util.ArrayList) VARCHAR(io.trino.spi.type.VarcharType.VARCHAR) ImmutableList(com.google.common.collect.ImmutableList) TimeZoneKey(io.trino.spi.type.TimeZoneKey) Histogram(io.trino.operator.aggregation.histogram.Histogram) 
DateTimeZoneIndex.getDateTimeZone(io.trino.util.DateTimeZoneIndex.getDateTimeZone) BlockAssertions.createStringsBlock(io.trino.block.BlockAssertions.createStringsBlock) MapType(io.trino.spi.type.MapType) BlockAssertions.createBooleansBlock(io.trino.block.BlockAssertions.createBooleansBlock) DateTime(org.joda.time.DateTime) Ints(com.google.common.primitives.Ints) BlockAssertions.createStringArraysBlock(io.trino.block.BlockAssertions.createStringArraysBlock) QualifiedName(io.trino.sql.tree.QualifiedName) DOUBLE(io.trino.spi.type.DoubleType.DOUBLE) TimeZoneKey.getTimeZoneKey(io.trino.spi.type.TimeZoneKey.getTimeZoneKey) StructuralTestUtil.mapType(io.trino.util.StructuralTestUtil.mapType) AggregationTestUtils.assertAggregation(io.trino.operator.aggregation.AggregationTestUtils.assertAggregation) Assert.assertTrue(org.testng.Assert.assertTrue) BlockBuilder(io.trino.spi.block.BlockBuilder)
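
The expected-value bookkeeping in Example 11 can be tried without any Trino classes. The following is a minimal sketch, assuming a hypothetical helper class ExpectedHistogramSketch (not part of Trino): it generates one group's values the way the test does and counts them with Map.merge. The aggregation itself is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Trino.
final class ExpectedHistogramSketch {
    private ExpectedHistogramSketch() {}

    // Builds itemCount strings for one group. Each string is the digit (i % 10)
    // repeated itemCount times. With probability distinctFraction the group index
    // is prefixed, which makes the value unique to that group.
    static List<String> generateValues(int group, int itemCount, double distinctFraction, Random random) {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < itemCount; i++) {
            String item = String.join("", Collections.nCopies(itemCount, String.valueOf(i % 10)));
            if (random.nextDouble() < distinctFraction) {
                item = group + "-" + item;
            }
            values.add(item);
        }
        return values;
    }

    // Counts occurrences of each value, i.e. the result a histogram aggregation should produce.
    static Map<String, Long> histogram(List<String> values) {
        Map<String, Long> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        return counts;
    }
}
```

Map.merge inserts 1 for a new key and otherwise adds 1 to the stored count, so no explicit null check is needed.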

Example 12 with BOOLEAN

Use of io.trino.spi.type.BooleanType.BOOLEAN in project trino by trinodb.

Class RcFileTester, method preprocessWriteValueOld.

private static Object preprocessWriteValueOld(Format format, Type type, Object value) {
    if (value == null) {
        return null;
    }
    if (type.equals(BOOLEAN)) {
        return value;
    }
    if (type.equals(TINYINT)) {
        return ((Number) value).byteValue();
    }
    if (type.equals(SMALLINT)) {
        return ((Number) value).shortValue();
    }
    if (type.equals(INTEGER)) {
        return ((Number) value).intValue();
    }
    if (type.equals(BIGINT)) {
        return ((Number) value).longValue();
    }
    if (type.equals(REAL)) {
        return ((Number) value).floatValue();
    }
    if (type.equals(DOUBLE)) {
        return ((Number) value).doubleValue();
    }
    if (type instanceof VarcharType) {
        return value;
    }
    if (type.equals(VARBINARY)) {
        return ((SqlVarbinary) value).getBytes();
    }
    if (type.equals(DATE)) {
        return Date.ofEpochDay(((SqlDate) value).getDays());
    }
    if (type.equals(TIMESTAMP_MILLIS)) {
        long millis = ((SqlTimestamp) value).getMillis();
        if (format == Format.BINARY) {
            millis = HIVE_STORAGE_TIME_ZONE.convertLocalToUTC(millis, false);
        }
        return Timestamp.ofEpochMilli(millis);
    }
    if (type instanceof DecimalType) {
        return HiveDecimal.create(((SqlDecimal) value).toBigDecimal());
    }
    if (type instanceof ArrayType) {
        Type elementType = type.getTypeParameters().get(0);
        return ((List<?>) value).stream().map(element -> preprocessWriteValueOld(format, elementType, element)).collect(toList());
    }
    if (type instanceof MapType) {
        Type keyType = type.getTypeParameters().get(0);
        Type valueType = type.getTypeParameters().get(1);
        Map<Object, Object> newMap = new HashMap<>();
        for (Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
            newMap.put(preprocessWriteValueOld(format, keyType, entry.getKey()), preprocessWriteValueOld(format, valueType, entry.getValue()));
        }
        return newMap;
    }
    if (type instanceof RowType) {
        List<?> fieldValues = (List<?>) value;
        List<Type> fieldTypes = type.getTypeParameters();
        List<Object> newStruct = new ArrayList<>();
        for (int fieldId = 0; fieldId < fieldValues.size(); fieldId++) {
            newStruct.add(preprocessWriteValueOld(format, fieldTypes.get(fieldId), fieldValues.get(fieldId)));
        }
        return newStruct;
    }
    throw new IllegalArgumentException("unsupported type: " + type);
}
Also used : SnappyCodec(org.apache.hadoop.io.compress.SnappyCodec) PRESTO_RCFILE_WRITER_VERSION_METADATA_KEY(io.trino.rcfile.RcFileWriter.PRESTO_RCFILE_WRITER_VERSION_METADATA_KEY) DateTimeZone(org.joda.time.DateTimeZone) PrimitiveObjectInspectorFactory.javaByteObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaByteObjectInspector) Text(org.apache.hadoop.io.Text) PrimitiveObjectInspectorFactory.javaLongObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaLongObjectInspector) Writable(org.apache.hadoop.io.Writable) PrimitiveObjectInspectorFactory.javaTimestampObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaTimestampObjectInspector) Date(org.apache.hadoop.hive.common.type.Date) PrimitiveObjectInspectorFactory.javaDateObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaDateObjectInspector) Decimals.rescale(io.trino.spi.type.Decimals.rescale) FileSplit(org.apache.hadoop.mapred.FileSplit) RcFileDecoderUtils.findFirstSyncPosition(io.trino.rcfile.RcFileDecoderUtils.findFirstSyncPosition) RCFileInputFormat(org.apache.hadoop.hive.ql.io.RCFileInputFormat) Files.createTempDirectory(java.nio.file.Files.createTempDirectory) Slices(io.airlift.slice.Slices) Configuration(org.apache.hadoop.conf.Configuration) Map(java.util.Map) BigInteger(java.math.BigInteger) ObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) Assert.assertFalse(org.testng.Assert.assertFalse) LazyBinaryArray(org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryArray) IntWritable(org.apache.hadoop.io.IntWritable) SMALLINT(io.trino.spi.type.SmallintType.SMALLINT) SERIALIZATION_LIB(org.apache.hadoop.hive.serde.serdeConstants.SERIALIZATION_LIB) 
PrimitiveObjectInspectorFactory.javaByteArrayObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaByteArrayObjectInspector) BytesRefArrayWritable(org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable) META_TABLE_COLUMN_TYPES(org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_COLUMN_TYPES) PrimitiveObjectInspectorFactory.javaFloatObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaFloatObjectInspector) LazyMap(org.apache.hadoop.hive.serde2.lazy.LazyMap) PrimitiveObjectInspectorFactory.javaDoubleObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaDoubleObjectInspector) LazyArray(org.apache.hadoop.hive.serde2.lazy.LazyArray) Set(java.util.Set) READ_ALL_COLUMNS(org.apache.hadoop.hive.serde2.ColumnProjectionUtils.READ_ALL_COLUMNS) MICROSECONDS_PER_MILLISECOND(io.trino.type.DateTimes.MICROSECONDS_PER_MILLISECOND) UncheckedIOException(java.io.UncheckedIOException) BooleanWritable(org.apache.hadoop.io.BooleanWritable) RecordReader(org.apache.hadoop.mapred.RecordReader) TypeSignatureParameter(io.trino.spi.type.TypeSignatureParameter) DATE(io.trino.spi.type.DateType.DATE) REAL(io.trino.spi.type.RealType.REAL) PrimitiveObjectInspectorFactory.javaIntObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaIntObjectInspector) StructField(org.apache.hadoop.hive.serde2.objectinspector.StructField) Lz4Codec(org.apache.hadoop.io.compress.Lz4Codec) Iterables(com.google.common.collect.Iterables) Slice(io.airlift.slice.Slice) TIMESTAMP_MILLIS(io.trino.spi.type.TimestampType.TIMESTAMP_MILLIS) MEGABYTE(io.airlift.units.DataSize.Unit.MEGABYTE) StructObject(org.apache.hadoop.hive.serde2.StructObject) Page(io.trino.spi.Page) SqlDecimal(io.trino.spi.type.SqlDecimal) Functions.constant(com.google.common.base.Functions.constant) 
BOOLEAN(io.trino.spi.type.BooleanType.BOOLEAN) META_TABLE_COLUMNS(org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_COLUMNS) ArrayList(java.util.ArrayList) NONE(io.trino.rcfile.RcFileTester.Compression.NONE) VARCHAR(io.trino.spi.type.VarcharType.VARCHAR) Lists(com.google.common.collect.Lists) ALLOW_INSECURE(com.google.common.io.RecursiveDeleteOption.ALLOW_INSECURE) BZIP2(io.trino.rcfile.RcFileTester.Compression.BZIP2) PrimitiveObjectInspectorFactory.javaShortObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaShortObjectInspector) ThreadLocalRandom(java.util.concurrent.ThreadLocalRandom) VARBINARY(io.trino.spi.type.VarbinaryType.VARBINARY) HadoopNative(io.trino.hadoop.HadoopNative) LinkedHashSet(java.util.LinkedHashSet) Int128(io.trino.spi.type.Int128) Properties(java.util.Properties) MapType(io.trino.spi.type.MapType) AbstractIterator(com.google.common.collect.AbstractIterator) TESTING_TYPE_MANAGER(io.trino.type.InternalTypeManager.TESTING_TYPE_MANAGER) FileOutputStream(java.io.FileOutputStream) IOException(java.io.IOException) ObjectInspectorFactory.getStandardStructObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector) DecimalTypeInfo(org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo) File(java.io.File) NULL(org.apache.hadoop.mapred.Reporter.NULL) SettableStructObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector) DOUBLE(io.trino.spi.type.DoubleType.DOUBLE) SqlVarbinary(io.trino.spi.type.SqlVarbinary) SIZE_OF_LONG(io.airlift.slice.SizeOf.SIZE_OF_LONG) Deserializer(org.apache.hadoop.hive.serde2.Deserializer) TINYINT(io.trino.spi.type.TinyintType.TINYINT) BlockBuilder(io.trino.spi.block.BlockBuilder) SerDeException(org.apache.hadoop.hive.serde2.SerDeException) FloatWritable(org.apache.hadoop.io.FloatWritable) 
RecordWriter(org.apache.hadoop.hive.ql.exec.FileSinkOperator.RecordWriter) BinaryRcFileEncoding(io.trino.rcfile.binary.BinaryRcFileEncoding) DateTimeTestingUtils.sqlTimestampOf(io.trino.testing.DateTimeTestingUtils.sqlTimestampOf) Iterables.transform(com.google.common.collect.Iterables.transform) LazyBinaryColumnarSerDe(org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe) MoreFiles.deleteRecursively(com.google.common.io.MoreFiles.deleteRecursively) GzipCodec(org.apache.hadoop.io.compress.GzipCodec) LongWritable(org.apache.hadoop.io.LongWritable) SNAPPY(io.trino.rcfile.RcFileTester.Compression.SNAPPY) TextRcFileEncoding(io.trino.rcfile.text.TextRcFileEncoding) SqlTimestamp(io.trino.spi.type.SqlTimestamp) TimestampWritableV2(org.apache.hadoop.hive.serde2.io.TimestampWritableV2) Block(io.trino.spi.block.Block) PRESTO_RCFILE_WRITER_VERSION(io.trino.rcfile.RcFileWriter.PRESTO_RCFILE_WRITER_VERSION) InputFormat(org.apache.hadoop.mapred.InputFormat) Path(org.apache.hadoop.fs.Path) KILOBYTE(io.airlift.units.DataSize.Unit.KILOBYTE) INTEGER(io.trino.spi.type.IntegerType.INTEGER) ShortWritable(org.apache.hadoop.hive.serde2.io.ShortWritable) RowType(io.trino.spi.type.RowType) SIZE_OF_INT(io.airlift.slice.SizeOf.SIZE_OF_INT) ImmutableSet(com.google.common.collect.ImmutableSet) DateWritableV2(org.apache.hadoop.hive.serde2.io.DateWritableV2) ImmutableMap(com.google.common.collect.ImmutableMap) Collections.nCopies(java.util.Collections.nCopies) RCFileOutputFormat(org.apache.hadoop.hive.ql.io.RCFileOutputFormat) SESSION(io.trino.testing.TestingConnectorSession.SESSION) ArrayType(io.trino.spi.type.ArrayType) StructObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector) SqlDate(io.trino.spi.type.SqlDate) ColumnarSerDe(org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe) Objects(java.util.Objects) DataSize(io.airlift.units.DataSize) List(java.util.List) BIGINT(io.trino.spi.type.BigintType.BIGINT) Decimals(io.trino.spi.type.Decimals) 
Entry(java.util.Map.Entry) LZ4(io.trino.rcfile.RcFileTester.Compression.LZ4) Optional(java.util.Optional) READ_COLUMN_IDS_CONF_STR(org.apache.hadoop.hive.serde2.ColumnProjectionUtils.READ_COLUMN_IDS_CONF_STR) DecimalType(io.trino.spi.type.DecimalType) MAP(io.trino.spi.type.StandardTypes.MAP) LazyPrimitive(org.apache.hadoop.hive.serde2.lazy.LazyPrimitive) Assert.assertNull(org.testng.Assert.assertNull) LazyBinaryMap(org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap) PrimitiveObjectInspectorFactory.javaBooleanObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaBooleanObjectInspector) Type(io.trino.spi.type.Type) Assert.assertEquals(org.testng.Assert.assertEquals) HashMap(java.util.HashMap) DoubleWritable(org.apache.hadoop.io.DoubleWritable) PrimitiveObjectInspectorFactory.getPrimitiveJavaObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.getPrimitiveJavaObjectInspector) VarcharType(io.trino.spi.type.VarcharType) OutputStreamSliceOutput(io.airlift.slice.OutputStreamSliceOutput) COMPRESS_CODEC(org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.COMPRESS_CODEC) ImmutableList(com.google.common.collect.ImmutableList) ByteWritable(org.apache.hadoop.io.ByteWritable) BytesWritable(org.apache.hadoop.io.BytesWritable) Math.toIntExact(java.lang.Math.toIntExact) Iterator(java.util.Iterator) Timestamp(org.apache.hadoop.hive.common.type.Timestamp) Iterators.advance(com.google.common.collect.Iterators.advance) FileInputStream(java.io.FileInputStream) JobConf(org.apache.hadoop.mapred.JobConf) BZip2Codec(org.apache.hadoop.io.compress.BZip2Codec) Collectors.toList(java.util.stream.Collectors.toList) ObjectInspectorFactory(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory) Serializer(org.apache.hadoop.hive.serde2.Serializer) HiveDecimal(org.apache.hadoop.hive.common.type.HiveDecimal) Closeable(java.io.Closeable) 
Assert.assertTrue(org.testng.Assert.assertTrue) PrimitiveObjectInspectorFactory.javaStringObjectInspector(org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory.javaStringObjectInspector) HiveDecimalWritable(org.apache.hadoop.hive.serde2.io.HiveDecimalWritable) Collections(java.util.Collections) InputStream(java.io.InputStream) ZLIB(io.trino.rcfile.RcFileTester.Compression.ZLIB)

Example 13 with BOOLEAN

Use of io.trino.spi.type.BooleanType.BOOLEAN in project trino by trinodb.

Class TupleDomainOrcPredicate, method getDomain.

@VisibleForTesting
public static Domain getDomain(Type type, long rowCount, ColumnStatistics columnStatistics) {
    if (rowCount == 0) {
        return Domain.none(type);
    }
    if (columnStatistics == null) {
        return Domain.all(type);
    }
    if (columnStatistics.hasNumberOfValues() && columnStatistics.getNumberOfValues() == 0) {
        return Domain.onlyNull(type);
    }
    boolean hasNullValue = columnStatistics.getNumberOfValues() != rowCount;
    if (type instanceof TimeType && columnStatistics.getIntegerStatistics() != null) {
        // This is the representation of TIME used by Iceberg
        return createDomain(type, hasNullValue, columnStatistics.getIntegerStatistics(), value -> ((long) value) * Timestamps.PICOSECONDS_PER_MICROSECOND);
    }
    if (type.getJavaType() == boolean.class && columnStatistics.getBooleanStatistics() != null) {
        BooleanStatistics booleanStatistics = columnStatistics.getBooleanStatistics();
        boolean hasTrueValues = (booleanStatistics.getTrueValueCount() != 0);
        boolean hasFalseValues = (columnStatistics.getNumberOfValues() != booleanStatistics.getTrueValueCount());
        if (hasTrueValues && hasFalseValues) {
            return Domain.all(BOOLEAN);
        }
        if (hasTrueValues) {
            return Domain.create(ValueSet.of(BOOLEAN, true), hasNullValue);
        }
        if (hasFalseValues) {
            return Domain.create(ValueSet.of(BOOLEAN, false), hasNullValue);
        }
    } else if (isShortDecimal(type) && columnStatistics.getDecimalStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getDecimalStatistics(), value -> rescale(value, (DecimalType) type).unscaledValue().longValue());
    } else if (isLongDecimal(type) && columnStatistics.getDecimalStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getDecimalStatistics(), value -> Int128.valueOf(rescale(value, (DecimalType) type).unscaledValue()));
    } else if (type instanceof CharType && columnStatistics.getStringStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getStringStatistics(), value -> truncateToLengthAndTrimSpaces(value, type));
    } else if (type instanceof VarcharType && columnStatistics.getStringStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getStringStatistics());
    } else if (type instanceof DateType && columnStatistics.getDateStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getDateStatistics(), value -> (long) value);
    } else if ((type.equals(TIMESTAMP_MILLIS) || type.equals(TIMESTAMP_MICROS)) && columnStatistics.getTimestampStatistics() != null) {
        // The upper bound of the domain we create must be adjusted accordingly, to include the rounded timestamp.
        return createDomain(type, hasNullValue, columnStatistics.getTimestampStatistics(), min -> min * MICROSECONDS_PER_MILLISECOND, max -> (max + 1) * MICROSECONDS_PER_MILLISECOND);
    } else if (type.equals(TIMESTAMP_NANOS) && columnStatistics.getTimestampStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getTimestampStatistics(), min -> new LongTimestamp(min * MICROSECONDS_PER_MILLISECOND, 0), max -> new LongTimestamp((max + 1) * MICROSECONDS_PER_MILLISECOND, 0));
    } else if (type.equals(TIMESTAMP_TZ_MILLIS) && columnStatistics.getTimestampStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getTimestampStatistics(), value -> packDateTimeWithZone(value, UTC_KEY));
    } else if (type.equals(TIMESTAMP_TZ_MICROS) && (columnStatistics.getTimestampStatistics() != null)) {
        return createDomain(type, hasNullValue, columnStatistics.getTimestampStatistics(), min -> LongTimestampWithTimeZone.fromEpochMillisAndFraction(min, 0, UTC_KEY), max -> LongTimestampWithTimeZone.fromEpochMillisAndFraction(max, 999_000_000, UTC_KEY));
    } else if (type.equals(TIMESTAMP_TZ_NANOS) && columnStatistics.getTimestampStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getTimestampStatistics(), min -> LongTimestampWithTimeZone.fromEpochMillisAndFraction(min, 0, UTC_KEY), max -> LongTimestampWithTimeZone.fromEpochMillisAndFraction(max, 999_999_000, UTC_KEY));
    } else if (type.getJavaType() == long.class && columnStatistics.getIntegerStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getIntegerStatistics());
    } else if (type.getJavaType() == double.class && columnStatistics.getDoubleStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getDoubleStatistics());
    } else if (REAL.equals(type) && columnStatistics.getDoubleStatistics() != null) {
        return createDomain(type, hasNullValue, columnStatistics.getDoubleStatistics(), value -> (long) floatToRawIntBits(value.floatValue()));
    }
    return Domain.create(ValueSet.all(type), hasNullValue);
}
Also used : MICROSECONDS_PER_MILLISECOND(io.trino.spi.type.Timestamps.MICROSECONDS_PER_MILLISECOND) DateType(io.trino.spi.type.DateType) TIMESTAMP_TZ_NANOS(io.trino.spi.type.TimestampWithTimeZoneType.TIMESTAMP_TZ_NANOS) LongTimestampWithTimeZone(io.trino.spi.type.LongTimestampWithTimeZone) Decimals.rescale(io.trino.spi.type.Decimals.rescale) RangeStatistics(io.trino.orc.metadata.statistics.RangeStatistics) INTEGER(io.trino.spi.type.IntegerType.INTEGER) SMALLINT(io.trino.spi.type.SmallintType.SMALLINT) Range(io.trino.spi.predicate.Range) UTC_KEY(io.trino.spi.type.TimeZoneKey.UTC_KEY) Domain(io.trino.spi.predicate.Domain) Collection(java.util.Collection) DateTimeEncoding.packDateTimeWithZone(io.trino.spi.type.DateTimeEncoding.packDateTimeWithZone) ValueSet(io.trino.spi.predicate.ValueSet) TIMESTAMP_NANOS(io.trino.spi.type.TimestampType.TIMESTAMP_NANOS) List(java.util.List) BIGINT(io.trino.spi.type.BigintType.BIGINT) Optional(java.util.Optional) DateTimeEncoding.unpackMillisUtc(io.trino.spi.type.DateTimeEncoding.unpackMillisUtc) DecimalType(io.trino.spi.type.DecimalType) ColumnStatistics(io.trino.orc.metadata.statistics.ColumnStatistics) DATE(io.trino.spi.type.DateType.DATE) REAL(io.trino.spi.type.RealType.REAL) MoreObjects.toStringHelper(com.google.common.base.MoreObjects.toStringHelper) Timestamps(io.trino.spi.type.Timestamps) Slice(io.airlift.slice.Slice) TimeType(io.trino.spi.type.TimeType) Decimals.isLongDecimal(io.trino.spi.type.Decimals.isLongDecimal) TIMESTAMP_MILLIS(io.trino.spi.type.TimestampType.TIMESTAMP_MILLIS) Type(io.trino.spi.type.Type) BOOLEAN(io.trino.spi.type.BooleanType.BOOLEAN) Float.intBitsToFloat(java.lang.Float.intBitsToFloat) Function(java.util.function.Function) ArrayList(java.util.ArrayList) VarcharType(io.trino.spi.type.VarcharType) Float.floatToRawIntBits(java.lang.Float.floatToRawIntBits) TIMESTAMP_TZ_MILLIS(io.trino.spi.type.TimestampWithTimeZoneType.TIMESTAMP_TZ_MILLIS) ImmutableList(com.google.common.collect.ImmutableList) 
Chars.truncateToLengthAndTrimSpaces(io.trino.spi.type.Chars.truncateToLengthAndTrimSpaces) Objects.requireNonNull(java.util.Objects.requireNonNull) Math.floorDiv(java.lang.Math.floorDiv) Decimals.isShortDecimal(io.trino.spi.type.Decimals.isShortDecimal) Int128(io.trino.spi.type.Int128) TIMESTAMP_TZ_MICROS(io.trino.spi.type.TimestampWithTimeZoneType.TIMESTAMP_TZ_MICROS) LongTimestamp(io.trino.spi.type.LongTimestamp) BloomFilter(io.trino.orc.metadata.statistics.BloomFilter) ColumnMetadata(io.trino.orc.metadata.ColumnMetadata) DOUBLE(io.trino.spi.type.DoubleType.DOUBLE) TIMESTAMP_MICROS(io.trino.spi.type.TimestampType.TIMESTAMP_MICROS) VarbinaryType(io.trino.spi.type.VarbinaryType) CharType(io.trino.spi.type.CharType) BooleanStatistics(io.trino.orc.metadata.statistics.BooleanStatistics) VisibleForTesting(com.google.common.annotations.VisibleForTesting) TINYINT(io.trino.spi.type.TinyintType.TINYINT) OrcColumnId(io.trino.orc.metadata.OrcColumnId)
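
The boolean branch of getDomain in Example 13 derives the possible column values from three counts. The following is a dependency-free sketch of that reasoning, assuming a hypothetical class BooleanDomainSketch and an enum standing in for Trino's Domain and ValueSet. It does not use the Trino API and simplifies the hasNumberOfValues() check to a plain count.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the boolean branch of getDomain; not Trino code.
final class BooleanDomainSketch {
    enum Value { TRUE, FALSE, NULL }

    private BooleanDomainSketch() {}

    // rowCount: rows covered by the statistics
    // nonNullCount: number of non-null values reported
    // trueCount: number of true values reported
    static Set<Value> possibleValues(long rowCount, long nonNullCount, long trueCount) {
        if (rowCount == 0) {
            return EnumSet.noneOf(Value.class); // analogous to Domain.none(type)
        }
        if (nonNullCount == 0) {
            return EnumSet.of(Value.NULL); // analogous to Domain.onlyNull(type)
        }
        boolean hasNull = nonNullCount != rowCount;
        boolean hasTrue = trueCount != 0;
        boolean hasFalse = nonNullCount != trueCount;

        if (hasTrue && hasFalse) {
            // Analogous to Domain.all(BOOLEAN), which also admits NULL.
            return EnumSet.allOf(Value.class);
        }
        Set<Value> values = EnumSet.noneOf(Value.class);
        if (hasTrue) {
            values.add(Value.TRUE);
        }
        if (hasFalse) {
            values.add(Value.FALSE);
        }
        if (hasNull) {
            values.add(Value.NULL);
        }
        return values;
    }
}
```

For example, 10 rows with 8 non-null values and no true values can only contain FALSE and NULL, so a predicate requiring TRUE could skip that data.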

Example 14 with BOOLEAN

Use of io.trino.spi.type.BooleanType.BOOLEAN in project trino by trinodb.

Class DeltaLakeMetadata, method dropSchema. BOOLEAN is imported by the class but not referenced in this method.

@Override
public void dropSchema(ConnectorSession session, String schemaName) {
    Optional<Path> location = metastore.getDatabase(schemaName).orElseThrow(() -> new SchemaNotFoundException(schemaName)).getLocation().map(Path::new);
    // If we see files in the schema location, don't delete it.
    // If we see no files, delete it. If there is no location or it can't be checked, use the fallback.
    boolean deleteData = location.map(path -> {
        // don't catch errors here
        HdfsContext context = new HdfsContext(session);
        try (FileSystem fs = hdfsEnvironment.getFileSystem(context, path)) {
            return !fs.listLocatedStatus(path).hasNext();
        } catch (IOException | RuntimeException e) {
            LOG.warn(e, "Could not check schema directory '%s'", path);
            return deleteSchemaLocationsFallback;
        }
    }).orElse(deleteSchemaLocationsFallback);
    metastore.dropDatabase(schemaName, deleteData);
}
Also used : Path(org.apache.hadoop.fs.Path) TransactionLogUtil.getTransactionLogDir(io.trino.plugin.deltalake.transactionlog.TransactionLogUtil.getTransactionLogDir) FileSystem(org.apache.hadoop.fs.FileSystem) TableSnapshot(io.trino.plugin.deltalake.transactionlog.TableSnapshot) ColumnStatisticMetadata(io.trino.spi.statistics.ColumnStatisticMetadata) FileStatus(org.apache.hadoop.fs.FileStatus) DeltaLakeSchemaSupport.validateType(io.trino.plugin.deltalake.transactionlog.DeltaLakeSchemaSupport.validateType) TypeUtils.isFloatingPointNaN(io.trino.spi.type.TypeUtils.isFloatingPointNaN) RemoveFileEntry(io.trino.plugin.deltalake.transactionlog.RemoveFileEntry) ConnectorTableExecuteHandle(io.trino.spi.connector.ConnectorTableExecuteHandle) Collections.singletonList(java.util.Collections.singletonList) NOT_SUPPORTED(io.trino.spi.StandardErrorCode.NOT_SUPPORTED) TransactionLogWriterFactory(io.trino.plugin.deltalake.transactionlog.writer.TransactionLogWriterFactory) TableNotFoundException(io.trino.spi.connector.TableNotFoundException) TimestampWithTimeZoneType(io.trino.spi.type.TimestampWithTimeZoneType) ValueSet.ofRanges(io.trino.spi.predicate.ValueSet.ofRanges) Column(io.trino.plugin.hive.metastore.Column) ConnectorOutputTableHandle(io.trino.spi.connector.ConnectorOutputTableHandle) ConnectorTableHandle(io.trino.spi.connector.ConnectorTableHandle) Map(java.util.Map) PARTITIONED_BY_PROPERTY(io.trino.plugin.deltalake.DeltaLakeTableProperties.PARTITIONED_BY_PROPERTY) ProjectionApplicationResult(io.trino.spi.connector.ProjectionApplicationResult) PRESTO_QUERY_ID_NAME(io.trino.plugin.hive.HiveMetadata.PRESTO_QUERY_ID_NAME) ENGLISH(java.util.Locale.ENGLISH) SMALLINT(io.trino.spi.type.SmallintType.SMALLINT) HdfsEnvironment(io.trino.plugin.hive.HdfsEnvironment) Table(io.trino.plugin.hive.metastore.Table) Domain(io.trino.spi.predicate.Domain) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) Set(java.util.Set) 
TABLE_PROVIDER_PROPERTY(io.trino.plugin.deltalake.metastore.HiveMetastoreBackedDeltaLakeMetastore.TABLE_PROVIDER_PROPERTY) HiveWriteUtils.pathExists(io.trino.plugin.hive.util.HiveWriteUtils.pathExists) MANAGED_TABLE(org.apache.hadoop.hive.metastore.TableType.MANAGED_TABLE) SchemaTableName(io.trino.spi.connector.SchemaTableName) ImmutableMap.toImmutableMap(com.google.common.collect.ImmutableMap.toImmutableMap) Stream(java.util.stream.Stream) TrinoPrincipal(io.trino.spi.security.TrinoPrincipal) CatalogSchemaTableName(io.trino.spi.connector.CatalogSchemaTableName) SchemaTablePrefix(io.trino.spi.connector.SchemaTablePrefix) HyperLogLog(io.airlift.stats.cardinality.HyperLogLog) DateTimeEncoding.unpackMillisUtc(io.trino.spi.type.DateTimeEncoding.unpackMillisUtc) FILE_MODIFIED_TIME_COLUMN_NAME(io.trino.plugin.deltalake.DeltaLakeColumnHandle.FILE_MODIFIED_TIME_COLUMN_NAME) Predicate.not(java.util.function.Predicate.not) TableColumnsMetadata(io.trino.spi.connector.TableColumnsMetadata) RemoteIterator(org.apache.hadoop.fs.RemoteIterator) ANALYZE_COLUMNS_PROPERTY(io.trino.plugin.deltalake.DeltaLakeTableProperties.ANALYZE_COLUMNS_PROPERTY) REGULAR(io.trino.plugin.deltalake.DeltaLakeColumnType.REGULAR) TransactionLogParser.getMandatoryCurrentVersion(io.trino.plugin.deltalake.transactionlog.TransactionLogParser.getMandatoryCurrentVersion) DATE(io.trino.spi.type.DateType.DATE) REAL(io.trino.spi.type.RealType.REAL) Iterables(com.google.common.collect.Iterables) ConnectorTableLayout(io.trino.spi.connector.ConnectorTableLayout) ConnectorInsertTableHandle(io.trino.spi.connector.ConnectorInsertTableHandle) DeltaLakeColumnHandle.fileSizeColumnHandle(io.trino.plugin.deltalake.DeltaLakeColumnHandle.fileSizeColumnHandle) Slice(io.airlift.slice.Slice) ColumnMetadata(io.trino.spi.connector.ColumnMetadata) DeltaLakeTableProcedureId(io.trino.plugin.deltalake.procedure.DeltaLakeTableProcedureId) INVALID_ANALYZE_PROPERTY(io.trino.spi.StandardErrorCode.INVALID_ANALYZE_PROPERTY) 
BOOLEAN(io.trino.spi.type.BooleanType.BOOLEAN) ConnectorTableMetadata(io.trino.spi.connector.ConnectorTableMetadata) Variable(io.trino.spi.expression.Variable) DeltaLakeTableProperties.getLocation(io.trino.plugin.deltalake.DeltaLakeTableProperties.getLocation) Range.greaterThanOrEqual(io.trino.spi.predicate.Range.greaterThanOrEqual) TransactionConflictException(io.trino.plugin.deltalake.transactionlog.writer.TransactionConflictException) HiveType(io.trino.plugin.hive.HiveType) VARCHAR(io.trino.spi.type.VarcharType.VARCHAR) DeltaLakeStatisticsAccess(io.trino.plugin.deltalake.statistics.DeltaLakeStatisticsAccess) DeltaLakeSchemaSupport.extractPartitionColumns(io.trino.plugin.deltalake.transactionlog.DeltaLakeSchemaSupport.extractPartitionColumns) ColumnHandle(io.trino.spi.connector.ColumnHandle) ImmutableSet.toImmutableSet(com.google.common.collect.ImmutableSet.toImmutableSet) INVALID_TABLE_PROPERTY(io.trino.spi.StandardErrorCode.INVALID_TABLE_PROPERTY) DeltaLakeSchemaSupport.serializeStatsAsJson(io.trino.plugin.deltalake.transactionlog.DeltaLakeSchemaSupport.serializeStatsAsJson) Nullable(javax.annotation.Nullable) ConstraintApplicationResult(io.trino.spi.connector.ConstraintApplicationResult) MapType(io.trino.spi.type.MapType) PARTITION_KEY(io.trino.plugin.deltalake.DeltaLakeColumnType.PARTITION_KEY) IOException(java.io.IOException) ConnectorSession(io.trino.spi.connector.ConnectorSession) DELTA_LAKE_INVALID_SCHEMA(io.trino.plugin.deltalake.DeltaLakeErrorCode.DELTA_LAKE_INVALID_SCHEMA) CheckpointWriterManager(io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointWriterManager) ROW_ID_COLUMN_TYPE(io.trino.plugin.deltalake.DeltaLakeColumnHandle.ROW_ID_COLUMN_TYPE) DOUBLE(io.trino.spi.type.DoubleType.DOUBLE) HiveUtil.isHiveSystemSchema(io.trino.plugin.hive.util.HiveUtil.isHiveSystemSchema) ConnectorTableProperties(io.trino.spi.connector.ConnectorTableProperties) ConnectorExpression(io.trino.spi.expression.ConnectorExpression) 
MAX_VALUE(io.trino.spi.statistics.ColumnStatisticType.MAX_VALUE) DeltaLakeSessionProperties.isTableStatisticsEnabled(io.trino.plugin.deltalake.DeltaLakeSessionProperties.isTableStatisticsEnabled) LOCATION_PROPERTY(io.trino.plugin.deltalake.DeltaLakeTableProperties.LOCATION_PROPERTY) TableStatisticsMetadata(io.trino.spi.statistics.TableStatisticsMetadata) TINYINT(io.trino.spi.type.TinyintType.TINYINT) NotADeltaLakeTableException(io.trino.plugin.deltalake.metastore.NotADeltaLakeTableException) DeltaLakeStatistics(io.trino.plugin.deltalake.statistics.DeltaLakeStatistics) HiveUtil.isDeltaLakeTable(io.trino.plugin.hive.util.HiveUtil.isDeltaLakeTable) NodeManager(io.trino.spi.NodeManager) EXTERNAL_TABLE(org.apache.hadoop.hive.metastore.TableType.EXTERNAL_TABLE) Database(io.trino.plugin.hive.metastore.Database) DeltaLakeSchemaSupport.extractSchema(io.trino.plugin.deltalake.transactionlog.DeltaLakeSchemaSupport.extractSchema) SYNTHESIZED(io.trino.plugin.deltalake.DeltaLakeColumnType.SYNTHESIZED) TABLE_PROVIDER_VALUE(io.trino.plugin.deltalake.metastore.HiveMetastoreBackedDeltaLakeMetastore.TABLE_PROVIDER_VALUE) SchemaNotFoundException(io.trino.spi.connector.SchemaNotFoundException) AddFileEntry(io.trino.plugin.deltalake.transactionlog.AddFileEntry) DeltaLakeMetastore(io.trino.plugin.deltalake.metastore.DeltaLakeMetastore) Format(io.trino.plugin.deltalake.transactionlog.MetadataEntry.Format) Preconditions.checkArgument(com.google.common.base.Preconditions.checkArgument) Locale(java.util.Locale) CatalogSchemaName(io.trino.spi.connector.CatalogSchemaName) Path(org.apache.hadoop.fs.Path) HyperLogLogType(io.trino.spi.type.HyperLogLogType) INTEGER(io.trino.spi.type.IntegerType.INTEGER) StorageFormat(io.trino.plugin.hive.metastore.StorageFormat) RowType(io.trino.spi.type.RowType) Range.range(io.trino.spi.predicate.Range.range) ImmutableSet(com.google.common.collect.ImmutableSet) ImmutableMap(com.google.common.collect.ImmutableMap) 
HiveWriteUtils.isS3FileSystem(io.trino.plugin.hive.util.HiveWriteUtils.isS3FileSystem) TransactionLogWriter(io.trino.plugin.deltalake.transactionlog.writer.TransactionLogWriter) Collection(java.util.Collection) DeltaLakeTableExecuteHandle(io.trino.plugin.deltalake.procedure.DeltaLakeTableExecuteHandle) MetadataEntry(io.trino.plugin.deltalake.transactionlog.MetadataEntry) ComputedStatistics(io.trino.spi.statistics.ComputedStatistics) TrinoException(io.trino.spi.TrinoException) ArrayType(io.trino.spi.type.ArrayType) Instant(java.time.Instant) ConnectorOutputMetadata(io.trino.spi.connector.ConnectorOutputMetadata) Sets(com.google.common.collect.Sets) FileNotFoundException(java.io.FileNotFoundException) String.format(java.lang.String.format) Preconditions.checkState(com.google.common.base.Preconditions.checkState) ROW_ID_COLUMN_NAME(io.trino.plugin.deltalake.DeltaLakeColumnHandle.ROW_ID_COLUMN_NAME) INVALID_SCHEMA_PROPERTY(io.trino.spi.StandardErrorCode.INVALID_SCHEMA_PROPERTY) DataSize(io.airlift.units.DataSize) HdfsContext(io.trino.plugin.hive.HdfsEnvironment.HdfsContext) List(java.util.List) BIGINT(io.trino.spi.type.BigintType.BIGINT) MetastoreUtil.buildInitialPrivilegeSet(io.trino.plugin.hive.metastore.MetastoreUtil.buildInitialPrivilegeSet) Assignment(io.trino.spi.connector.Assignment) BeginTableExecuteResult(io.trino.spi.connector.BeginTableExecuteResult) Function.identity(java.util.function.Function.identity) Optional(java.util.Optional) ConnectorMetadata(io.trino.spi.connector.ConnectorMetadata) DecimalType(io.trino.spi.type.DecimalType) OPTIMIZE(io.trino.plugin.deltalake.procedure.DeltaLakeTableProcedureId.OPTIMIZE) JsonCodec(io.airlift.json.JsonCodec) Comparators(com.google.common.collect.Comparators) Constraint(io.trino.spi.connector.Constraint) Range.lessThanOrEqual(io.trino.spi.predicate.Range.lessThanOrEqual) DeltaLakeFileStatistics(io.trino.plugin.deltalake.transactionlog.statistics.DeltaLakeFileStatistics) Logger(io.airlift.log.Logger) 
DeltaLakeSchemaSupport.serializeSchemaAsJson(io.trino.plugin.deltalake.transactionlog.DeltaLakeSchemaSupport.serializeSchemaAsJson) DeltaLakeColumnStatistics(io.trino.plugin.deltalake.statistics.DeltaLakeColumnStatistics) Type(io.trino.spi.type.Type) HashMap(java.util.HashMap) DeltaLakeColumnHandle.pathColumnHandle(io.trino.plugin.deltalake.DeltaLakeColumnHandle.pathColumnHandle) DeltaLakeColumnHandle.fileModifiedTimeColumnHandle(io.trino.plugin.deltalake.DeltaLakeColumnHandle.fileModifiedTimeColumnHandle) AtomicReference(java.util.concurrent.atomic.AtomicReference) VarcharType(io.trino.spi.type.VarcharType) ImmutableList(com.google.common.collect.ImmutableList) Verify.verify(com.google.common.base.Verify.verify) Objects.requireNonNull(java.util.Objects.requireNonNull) TableStatistics(io.trino.spi.statistics.TableStatistics) DeltaLakeSessionProperties.isExtendedStatisticsEnabled(io.trino.plugin.deltalake.DeltaLakeSessionProperties.isExtendedStatisticsEnabled) VIRTUAL_VIEW(org.apache.hadoop.hive.metastore.TableType.VIRTUAL_VIEW) CHECKPOINT_INTERVAL_PROPERTY(io.trino.plugin.deltalake.DeltaLakeTableProperties.CHECKPOINT_INTERVAL_PROPERTY) StorageFormat.create(io.trino.plugin.hive.metastore.StorageFormat.create) MetadataEntry.buildDeltaMetadataConfiguration(io.trino.plugin.deltalake.transactionlog.MetadataEntry.buildDeltaMetadataConfiguration) TupleDomain.withColumnDomains(io.trino.spi.predicate.TupleDomain.withColumnDomains) DELTA_LAKE_BAD_WRITE(io.trino.plugin.deltalake.DeltaLakeErrorCode.DELTA_LAKE_BAD_WRITE) JsonProcessingException(com.fasterxml.jackson.core.JsonProcessingException) TupleDomain(io.trino.spi.predicate.TupleDomain) DeltaLakeTableProperties.getPartitionedBy(io.trino.plugin.deltalake.DeltaLakeTableProperties.getPartitionedBy) HiveWriteUtils.createDirectory(io.trino.plugin.hive.util.HiveWriteUtils.createDirectory) GENERIC_INTERNAL_ERROR(io.trino.spi.StandardErrorCode.GENERIC_INTERNAL_ERROR) 
SchemaTableName.schemaTableName(io.trino.spi.connector.SchemaTableName.schemaTableName) UUID.randomUUID(java.util.UUID.randomUUID) ProtocolEntry(io.trino.plugin.deltalake.transactionlog.ProtocolEntry) DeltaTableOptimizeHandle(io.trino.plugin.deltalake.procedure.DeltaTableOptimizeHandle) Collections.unmodifiableMap(java.util.Collections.unmodifiableMap) CommitInfoEntry(io.trino.plugin.deltalake.transactionlog.CommitInfoEntry) PrincipalPrivileges(io.trino.plugin.hive.metastore.PrincipalPrivileges) TypeManager(io.trino.spi.type.TypeManager) Collections(java.util.Collections) NUMBER_OF_DISTINCT_VALUES_SUMMARY(io.trino.spi.statistics.ColumnStatisticType.NUMBER_OF_DISTINCT_VALUES_SUMMARY)

Example 15 with BOOLEAN

use of io.trino.spi.type.BooleanType.BOOLEAN in project trino by trinodb.

the class PhoenixClient method beginCreateTable.

@Override
public JdbcOutputTableHandle beginCreateTable(ConnectorSession session, ConnectorTableMetadata tableMetadata) {
    SchemaTableName schemaTableName = tableMetadata.getTable();
    String schema = schemaTableName.getSchemaName();
    String table = schemaTableName.getTableName();
    if (!getSchemaNames(session).contains(schema)) {
        throw new SchemaNotFoundException(schema);
    }
    try (Connection connection = connectionFactory.openConnection(session)) {
        ConnectorIdentity identity = session.getIdentity();
        schema = getIdentifierMapping().toRemoteSchemaName(identity, connection, schema);
        table = getIdentifierMapping().toRemoteTableName(identity, connection, schema, table);
        schema = toPhoenixSchemaName(schema);
        LinkedList<ColumnMetadata> tableColumns = new LinkedList<>(tableMetadata.getColumns());
        Map<String, Object> tableProperties = tableMetadata.getProperties();
        Optional<Boolean> immutableRows = PhoenixTableProperties.getImmutableRows(tableProperties);
        String immutable = immutableRows.isPresent() && immutableRows.get() ? "IMMUTABLE" : "";
        ImmutableList.Builder<String> columnNames = ImmutableList.builder();
        ImmutableList.Builder<Type> columnTypes = ImmutableList.builder();
        ImmutableList.Builder<String> columnList = ImmutableList.builder();
        Set<ColumnMetadata> rowkeyColumns = tableColumns.stream().filter(col -> isPrimaryKey(col, tableProperties)).collect(toSet());
        ImmutableList.Builder<String> pkNames = ImmutableList.builder();
        Optional<String> rowkeyColumn = Optional.empty();
        if (rowkeyColumns.isEmpty()) {
            // Add a rowkey when not specified in DDL
            columnList.add(ROWKEY + " bigint not null");
            pkNames.add(ROWKEY);
            execute(session, format("CREATE SEQUENCE %s", getEscapedTableName(schema, table + "_sequence")));
            rowkeyColumn = Optional.of(ROWKEY);
        }
        for (ColumnMetadata column : tableColumns) {
            String columnName = getIdentifierMapping().toRemoteColumnName(connection, column.getName());
            columnNames.add(columnName);
            columnTypes.add(column.getType());
            String typeStatement = toWriteMapping(session, column.getType()).getDataType();
            if (rowkeyColumns.contains(column)) {
                typeStatement += " not null";
                pkNames.add(columnName);
            }
            columnList.add(format("%s %s", getEscapedArgument(columnName), typeStatement));
        }
        ImmutableList.Builder<String> tableOptions = ImmutableList.builder();
        PhoenixTableProperties.getSaltBuckets(tableProperties).ifPresent(value -> tableOptions.add(TableProperty.SALT_BUCKETS + "=" + value));
        PhoenixTableProperties.getSplitOn(tableProperties).ifPresent(value -> tableOptions.add("SPLIT ON (" + value.replace('"', '\'') + ")"));
        PhoenixTableProperties.getDisableWal(tableProperties).ifPresent(value -> tableOptions.add(TableProperty.DISABLE_WAL + "=" + value));
        PhoenixTableProperties.getDefaultColumnFamily(tableProperties).ifPresent(value -> tableOptions.add(TableProperty.DEFAULT_COLUMN_FAMILY + "=" + value));
        PhoenixTableProperties.getBloomfilter(tableProperties).ifPresent(value -> tableOptions.add(HColumnDescriptor.BLOOMFILTER + "='" + value + "'"));
        PhoenixTableProperties.getVersions(tableProperties).ifPresent(value -> tableOptions.add(HConstants.VERSIONS + "=" + value));
        PhoenixTableProperties.getMinVersions(tableProperties).ifPresent(value -> tableOptions.add(HColumnDescriptor.MIN_VERSIONS + "=" + value));
        PhoenixTableProperties.getCompression(tableProperties).ifPresent(value -> tableOptions.add(HColumnDescriptor.COMPRESSION + "='" + value + "'"));
        PhoenixTableProperties.getTimeToLive(tableProperties).ifPresent(value -> tableOptions.add(HColumnDescriptor.TTL + "=" + value));
        PhoenixTableProperties.getDataBlockEncoding(tableProperties).ifPresent(value -> tableOptions.add(HColumnDescriptor.DATA_BLOCK_ENCODING + "='" + value + "'"));
        String sql = format("CREATE %s TABLE %s (%s , CONSTRAINT PK PRIMARY KEY (%s)) %s", immutable, getEscapedTableName(schema, table), join(", ", columnList.build()), join(", ", pkNames.build()), join(", ", tableOptions.build()));
        execute(session, sql);
        return new PhoenixOutputTableHandle(schema, table, columnNames.build(), columnTypes.build(), Optional.empty(), rowkeyColumn);
    } catch (SQLException e) {
        if (e.getErrorCode() == SQLExceptionCode.TABLE_ALREADY_EXIST.getErrorCode()) {
            throw new TrinoException(ALREADY_EXISTS, "Phoenix table already exists", e);
        }
        throw new TrinoException(PHOENIX_METADATA_ERROR, "Error creating Phoenix table", e);
    }
}
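
The statement assembly in the method above can be sketched in isolation. The following is a minimal, self-contained illustration and not the connector's code: `CreateTableSketch`, its `Column` class, and `buildCreateTable` are hypothetical names introduced here, identifier escaping and the `CREATE SEQUENCE` call are omitted, and `ROWKEY` is a stand-in for the connector's constant. It mirrors only the logic visible above: a synthetic `ROWKEY` column is added when no column is marked as a primary key, primary-key columns get `not null`, and the pieces are joined with the same format string.

```java
import java.util.ArrayList;
import java.util.List;

public class CreateTableSketch {
    // Hypothetical stand-in for the connector's ROWKEY constant.
    public static final String ROWKEY = "ROWKEY";

    // Hypothetical simplified column: name, SQL type, and primary-key flag.
    public static final class Column {
        final String name;
        final String sqlType;
        final boolean primaryKey;

        public Column(String name, String sqlType, boolean primaryKey) {
            this.name = name;
            this.sqlType = sqlType;
            this.primaryKey = primaryKey;
        }
    }

    // Builds the CREATE TABLE text using the same format string as the excerpt above.
    // Names are used as given; no escaping is applied in this sketch.
    public static String buildCreateTable(String tableName, List<Column> columns,
                                          boolean immutableRows, List<String> tableOptions) {
        List<String> columnList = new ArrayList<>();
        List<String> pkNames = new ArrayList<>();

        // Add a synthetic row key when no column is declared as a primary key.
        boolean hasPrimaryKey = columns.stream().anyMatch(column -> column.primaryKey);
        if (!hasPrimaryKey) {
            columnList.add(ROWKEY + " bigint not null");
            pkNames.add(ROWKEY);
        }

        for (Column column : columns) {
            String typeStatement = column.sqlType;
            if (column.primaryKey) {
                // Primary-key columns must be declared not null.
                typeStatement += " not null";
                pkNames.add(column.name);
            }
            columnList.add(String.format("%s %s", column.name, typeStatement));
        }

        String immutable = immutableRows ? "IMMUTABLE" : "";
        return String.format("CREATE %s TABLE %s (%s , CONSTRAINT PK PRIMARY KEY (%s)) %s",
                immutable, tableName,
                String.join(", ", columnList),
                String.join(", ", pkNames),
                String.join(", ", tableOptions));
    }
}
```

Under these assumptions, calling `buildCreateTable("s.t", columns, true, List.of("SALT_BUCKETS=4"))` with a primary-key column `id bigint` and a regular column `name varchar` yields `CREATE IMMUTABLE TABLE s.t (id bigint not null, name varchar , CONSTRAINT PK PRIMARY KEY (id)) SALT_BUCKETS=4`.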
Also used : UNNECESSARY(java.math.RoundingMode.UNNECESSARY) StandardColumnMappings.varcharColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.varcharColumnMapping) TypeUtils.getArrayElementPhoenixTypeName(io.trino.plugin.phoenix5.TypeUtils.getArrayElementPhoenixTypeName) TypeUtils.jdbcObjectArrayToBlock(io.trino.plugin.phoenix5.TypeUtils.jdbcObjectArrayToBlock) HBaseFactoryProvider(org.apache.phoenix.query.HBaseFactoryProvider) StandardColumnMappings.bigintWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.bigintWriteFunction) PredicatePushdownController(io.trino.plugin.jdbc.PredicatePushdownController) NOT_SUPPORTED(io.trino.spi.StandardErrorCode.NOT_SUPPORTED) StandardColumnMappings.booleanColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.booleanColumnMapping) StandardColumnMappings.defaultVarcharColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.defaultVarcharColumnMapping) ResultSet(java.sql.ResultSet) Configuration(org.apache.hadoop.conf.Configuration) Map(java.util.Map) StandardColumnMappings.doubleWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.doubleWriteFunction) DecimalSessionSessionProperties.getDecimalDefaultScale(io.trino.plugin.jdbc.DecimalSessionSessionProperties.getDecimalDefaultScale) MapReduceParallelScanGrouper(org.apache.phoenix.iterate.MapReduceParallelScanGrouper) ENGLISH(java.util.Locale.ENGLISH) PhoenixArray(org.apache.phoenix.schema.types.PhoenixArray) SMALLINT(io.trino.spi.type.SmallintType.SMALLINT) LONGNVARCHAR(java.sql.Types.LONGNVARCHAR) ConcatResultIterator(org.apache.phoenix.iterate.ConcatResultIterator) ConnectorIdentity(io.trino.spi.security.ConnectorIdentity) TypeHandlingJdbcSessionProperties.getUnsupportedTypeHandling(io.trino.plugin.jdbc.TypeHandlingJdbcSessionProperties.getUnsupportedTypeHandling) PhoenixClientModule.getConnectionProperties(io.trino.plugin.phoenix5.PhoenixClientModule.getConnectionProperties) TIME_WITH_TIMEZONE(java.sql.Types.TIME_WITH_TIMEZONE) 
FOREVER(org.apache.hadoop.hbase.HConstants.FOREVER) LongWriteFunction(io.trino.plugin.jdbc.LongWriteFunction) Set(java.util.Set) LONGVARCHAR(java.sql.Types.LONGVARCHAR) PreparedStatement(java.sql.PreparedStatement) BloomType(org.apache.hadoop.hbase.regionserver.BloomType) SchemaTableName(io.trino.spi.connector.SchemaTableName) Collectors.joining(java.util.stream.Collectors.joining) LongReadFunction(io.trino.plugin.jdbc.LongReadFunction) ResultIterator(org.apache.phoenix.iterate.ResultIterator) StandardColumnMappings.smallintWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.smallintWriteFunction) StandardColumnMappings.longDecimalWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.longDecimalWriteFunction) ConnectionFactory(io.trino.plugin.jdbc.ConnectionFactory) CONVERT_TO_VARCHAR(io.trino.plugin.jdbc.UnsupportedTypeHandling.CONVERT_TO_VARCHAR) SKIP_REGION_BOUNDARY_CHECK(org.apache.phoenix.coprocessor.BaseScannerRegionObserver.SKIP_REGION_BOUNDARY_CHECK) PhoenixResultSet(org.apache.phoenix.jdbc.PhoenixResultSet) JdbcTableHandle(io.trino.plugin.jdbc.JdbcTableHandle) DATE(io.trino.spi.type.DateType.DATE) REAL(io.trino.spi.type.RealType.REAL) ARRAY(java.sql.Types.ARRAY) StandardColumnMappings.doubleColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.doubleColumnMapping) ColumnMetadata(io.trino.spi.connector.ColumnMetadata) SchemaUtil(org.apache.phoenix.util.SchemaUtil) QueryConstants(org.apache.phoenix.query.QueryConstants) StandardColumnMappings.booleanWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.booleanWriteFunction) JdbcSplit(io.trino.plugin.jdbc.JdbcSplit) SimpleDateFormat(java.text.SimpleDateFormat) ALLOW_OVERFLOW(io.trino.plugin.jdbc.DecimalConfig.DecimalMapping.ALLOW_OVERFLOW) BOOLEAN(io.trino.spi.type.BooleanType.BOOLEAN) ConnectorTableMetadata(io.trino.spi.connector.ConnectorTableMetadata) TableProperty(org.apache.phoenix.schema.TableProperty) DatabaseMetaData(java.sql.DatabaseMetaData) 
StandardColumnMappings.bigintColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.bigintColumnMapping) StandardColumnMappings.defaultCharColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.defaultCharColumnMapping) ArrayList(java.util.ArrayList) JDBCType(java.sql.JDBCType) SQLException(java.sql.SQLException) TIMESTAMP_TZ_MILLIS(io.trino.spi.type.TimestampWithTimeZoneType.TIMESTAMP_TZ_MILLIS) String.join(java.lang.String.join) FULL_PUSHDOWN(io.trino.plugin.jdbc.PredicatePushdownController.FULL_PUSHDOWN) StandardColumnMappings.charWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.charWriteFunction) PreparedQuery(io.trino.plugin.jdbc.PreparedQuery) PName(org.apache.phoenix.schema.PName) StandardColumnMappings.decimalColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.decimalColumnMapping) TableName(org.apache.hadoop.hbase.TableName) StandardColumnMappings.realWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.realWriteFunction) DataBlockEncoding(org.apache.hadoop.hbase.io.encoding.DataBlockEncoding) DecimalType.createDecimalType(io.trino.spi.type.DecimalType.createDecimalType) QueryBuilder(io.trino.plugin.jdbc.QueryBuilder) PHOENIX_METADATA_ERROR(io.trino.plugin.phoenix5.PhoenixErrorCode.PHOENIX_METADATA_ERROR) SchemaUtil.getEscapedArgument(org.apache.phoenix.util.SchemaUtil.getEscapedArgument) StandardColumnMappings.smallintColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.smallintColumnMapping) IOException(java.io.IOException) DEFAULT_SCHEMA(io.trino.plugin.phoenix5.PhoenixMetadata.DEFAULT_SCHEMA) ConnectorSession(io.trino.spi.connector.ConnectorSession) Scan(org.apache.hadoop.hbase.client.Scan) DOUBLE(io.trino.spi.type.DoubleType.DOUBLE) IdentifierMapping(io.trino.plugin.jdbc.mapping.IdentifierMapping) PhoenixConnection(org.apache.phoenix.jdbc.PhoenixConnection) VarbinaryType(io.trino.spi.type.VarbinaryType) Admin(org.apache.hadoop.hbase.client.Admin) CharType(io.trino.spi.type.CharType) 
DecimalSessionSessionProperties.getDecimalRoundingMode(io.trino.plugin.jdbc.DecimalSessionSessionProperties.getDecimalRoundingMode) MetadataUtil.toPhoenixSchemaName(io.trino.plugin.phoenix5.MetadataUtil.toPhoenixSchemaName) TINYINT(io.trino.spi.type.TinyintType.TINYINT) StandardColumnMappings.shortDecimalWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.shortDecimalWriteFunction) StandardColumnMappings.varbinaryWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.varbinaryWriteFunction) WriteMapping(io.trino.plugin.jdbc.WriteMapping) TypeUtils.toBoxedArray(io.trino.plugin.phoenix5.TypeUtils.toBoxedArray) Connection(java.sql.Connection) HColumnDescriptor(org.apache.hadoop.hbase.HColumnDescriptor) BiFunction(java.util.function.BiFunction) SchemaNotFoundException(io.trino.spi.connector.SchemaNotFoundException) ObjectWriteFunction(io.trino.plugin.jdbc.ObjectWriteFunction) PhoenixInputSplit(org.apache.phoenix.mapreduce.PhoenixInputSplit) Preconditions.checkArgument(com.google.common.base.Preconditions.checkArgument) TypeUtils.getJdbcObjectArray(io.trino.plugin.phoenix5.TypeUtils.getJdbcObjectArray) HTableDescriptor(org.apache.hadoop.hbase.HTableDescriptor) Block(io.trino.spi.block.Block) DEFAULT_SCALE(io.trino.spi.type.DecimalType.DEFAULT_SCALE) ColumnMapping(io.trino.plugin.jdbc.ColumnMapping) ALREADY_EXISTS(io.trino.spi.StandardErrorCode.ALREADY_EXISTS) INTEGER(io.trino.spi.type.IntegerType.INTEGER) Collectors.toSet(java.util.stream.Collectors.toSet) NVARCHAR(java.sql.Types.NVARCHAR) ImmutableSet(com.google.common.collect.ImmutableSet) DecimalSessionSessionProperties.getDecimalRounding(io.trino.plugin.jdbc.DecimalSessionSessionProperties.getDecimalRounding) ImmutableMap(com.google.common.collect.ImmutableMap) Collection(java.util.Collection) DelegatePreparedStatement(org.apache.phoenix.jdbc.DelegatePreparedStatement) Compression(org.apache.hadoop.hbase.io.compress.Compression) 
MetadataUtil.getEscapedTableName(io.trino.plugin.phoenix5.MetadataUtil.getEscapedTableName) TrinoException(io.trino.spi.TrinoException) ArrayType(io.trino.spi.type.ArrayType) StandardColumnMappings.realColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.realColumnMapping) JdbcOutputTableHandle(io.trino.plugin.jdbc.JdbcOutputTableHandle) WriteFunction(io.trino.plugin.jdbc.WriteFunction) String.format(java.lang.String.format) PhoenixRuntime.getTable(org.apache.phoenix.util.PhoenixRuntime.getTable) JdbcSortItem(io.trino.plugin.jdbc.JdbcSortItem) PColumn(org.apache.phoenix.schema.PColumn) List(java.util.List) JdbcTypeHandle(io.trino.plugin.jdbc.JdbcTypeHandle) BIGINT(io.trino.spi.type.BigintType.BIGINT) ScanMetricsHolder(org.apache.phoenix.monitoring.ScanMetricsHolder) LocalDate(java.time.LocalDate) Decimals(io.trino.spi.type.Decimals) Optional(java.util.Optional) Math.max(java.lang.Math.max) PhoenixPreparedStatement(org.apache.phoenix.jdbc.PhoenixPreparedStatement) MoreObjects.firstNonNull(com.google.common.base.MoreObjects.firstNonNull) VARCHAR(java.sql.Types.VARCHAR) ESCAPE_CHARACTER(org.apache.phoenix.util.SchemaUtil.ESCAPE_CHARACTER) SQLExceptionCode(org.apache.phoenix.exception.SQLExceptionCode) PDataType(org.apache.phoenix.schema.types.PDataType) StandardColumnMappings.tinyintColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.tinyintColumnMapping) DecimalType(io.trino.spi.type.DecimalType) PeekingResultIterator(org.apache.phoenix.iterate.PeekingResultIterator) StandardColumnMappings.integerColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.integerColumnMapping) Types(java.sql.Types) PhoenixColumnProperties.isPrimaryKey(io.trino.plugin.phoenix5.PhoenixColumnProperties.isPrimaryKey) StandardColumnMappings.varbinaryColumnMapping(io.trino.plugin.jdbc.StandardColumnMappings.varbinaryColumnMapping) PHOENIX_QUERY_ERROR(io.trino.plugin.phoenix5.PhoenixErrorCode.PHOENIX_QUERY_ERROR) StatementContext(org.apache.phoenix.compile.StatementContext) 
SequenceResultIterator(org.apache.phoenix.iterate.SequenceResultIterator) StandardColumnMappings.varcharWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.varcharWriteFunction) StandardColumnMappings.tinyintWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.tinyintWriteFunction) Type(io.trino.spi.type.Type) VarcharType.createUnboundedVarcharType(io.trino.spi.type.VarcharType.createUnboundedVarcharType) TIMESTAMP(java.sql.Types.TIMESTAMP) Inject(javax.inject.Inject) VarcharType(io.trino.spi.type.VarcharType) TIME_WITH_TIME_ZONE(io.trino.spi.type.TimeWithTimeZoneType.TIME_WITH_TIME_ZONE) HConstants(org.apache.hadoop.hbase.HConstants) TIMESTAMP_WITH_TIMEZONE(java.sql.Types.TIMESTAMP_WITH_TIMEZONE) ImmutableList(com.google.common.collect.ImmutableList) QueryPlan(org.apache.phoenix.compile.QueryPlan) Verify.verify(com.google.common.base.Verify.verify) TIME(io.trino.spi.type.TimeType.TIME) BaseJdbcClient(io.trino.plugin.jdbc.BaseJdbcClient) LinkedList(java.util.LinkedList) Bytes(org.apache.hadoop.hbase.util.Bytes) PTable(org.apache.phoenix.schema.PTable) StandardColumnMappings.timeWriteFunctionUsingSqlTime(io.trino.plugin.jdbc.StandardColumnMappings.timeWriteFunctionUsingSqlTime) JdbcColumnHandle(io.trino.plugin.jdbc.JdbcColumnHandle) StandardColumnMappings.integerWriteFunction(io.trino.plugin.jdbc.StandardColumnMappings.integerWriteFunction) TableResultIterator(org.apache.phoenix.iterate.TableResultIterator) ConnectionQueryServices(org.apache.phoenix.query.ConnectionQueryServices) DEFAULT_PRECISION(io.trino.spi.type.DecimalType.DEFAULT_PRECISION) DISABLE_PUSHDOWN(io.trino.plugin.jdbc.PredicatePushdownController.DISABLE_PUSHDOWN) ObjectReadFunction(io.trino.plugin.jdbc.ObjectReadFunction) DateTimeFormatter(java.time.format.DateTimeFormatter) StringJoiner(java.util.StringJoiner) LookAheadResultIterator(org.apache.phoenix.iterate.LookAheadResultIterator) ColumnMetadata(io.trino.spi.connector.ColumnMetadata) SQLException(java.sql.SQLException) 

Aggregations

BOOLEAN (io.trino.spi.type.BooleanType.BOOLEAN) 25
BIGINT (io.trino.spi.type.BigintType.BIGINT) 24
List (java.util.List) 22
Optional (java.util.Optional) 22
ImmutableList (com.google.common.collect.ImmutableList) 20
ImmutableMap (com.google.common.collect.ImmutableMap) 18
DOUBLE (io.trino.spi.type.DoubleType.DOUBLE) 18
Type (io.trino.spi.type.Type) 18
Map (java.util.Map) 18
INTEGER (io.trino.spi.type.IntegerType.INTEGER) 17
VarcharType (io.trino.spi.type.VarcharType) 17
ImmutableList.toImmutableList (com.google.common.collect.ImmutableList.toImmutableList) 16
DecimalType (io.trino.spi.type.DecimalType) 16
REAL (io.trino.spi.type.RealType.REAL) 16
SMALLINT (io.trino.spi.type.SmallintType.SMALLINT) 16
TINYINT (io.trino.spi.type.TinyintType.TINYINT) 16
Set (java.util.Set) 16
ArrayType (io.trino.spi.type.ArrayType) 15
DATE (io.trino.spi.type.DateType.DATE) 15
String.format (java.lang.String.format) 15