Example 6 with SliceUtf8.tryGetCodePointAt

Use of io.airlift.slice.SliceUtf8.tryGetCodePointAt in project presto by prestodb.

From class OrcMetadataReader, method findStringStatisticTruncationPositionForOriginalOrcWriter.

@VisibleForTesting
static int findStringStatisticTruncationPositionForOriginalOrcWriter(Slice utf8) {
    int length = utf8.length();
    int position = 0;
    while (position < length) {
        int codePoint = tryGetCodePointAt(utf8, position);
        // stop at invalid sequences
        if (codePoint < 0) {
            break;
        }
        // the string stats are truncated at the first replacement character.
        if (codePoint == REPLACEMENT_CHARACTER_CODE_POINT) {
            break;
        }
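        // stop at the first supplementary code point (a surrogate pair in UTF-16); the string is
        // truncated at the first occurrence of the surrogate character and a 0xFF byte is appended to it.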
        if (codePoint >= MIN_SUPPLEMENTARY_CODE_POINT) {
            break;
        }
        position += lengthOfCodePoint(codePoint);
    }
    return position;
}
Also used: SliceUtf8.lengthOfCodePoint (io.airlift.slice.SliceUtf8.lengthOfCodePoint), VisibleForTesting (com.google.common.annotations.VisibleForTesting)
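
The truncation rule in this method can be tried out without the airlift dependency. The following is a minimal sketch, not the Presto or airlift implementation: the class name TruncationSketch and the helpers tryDecode and utf8Length are hypothetical stand-ins for SliceUtf8.tryGetCodePointAt and SliceUtf8.lengthOfCodePoint, written here against a plain byte[] and assumed to behave similarly (a negative return value for invalid UTF-8).

```java
import java.nio.charset.StandardCharsets;

final class TruncationSketch {
    static final int REPLACEMENT_CHARACTER_CODE_POINT = 0xFFFD;

    // Hypothetical stand-in for SliceUtf8.tryGetCodePointAt: returns the code point
    // that starts at position, or -1 if the bytes there are not valid UTF-8.
    static int tryDecode(byte[] utf8, int position) {
        int b0 = utf8[position] & 0xFF;
        if (b0 < 0x80) {
            return b0;
        }
        int length;
        int codePoint;
        if ((b0 & 0xE0) == 0xC0) {
            length = 2;
            codePoint = b0 & 0x1F;
        } else if ((b0 & 0xF0) == 0xE0) {
            length = 3;
            codePoint = b0 & 0x0F;
        } else if ((b0 & 0xF8) == 0xF0) {
            length = 4;
            codePoint = b0 & 0x07;
        } else {
            return -1; // stray continuation byte or illegal lead byte
        }
        if (position + length > utf8.length) {
            return -1; // sequence cut off by the end of the input
        }
        for (int i = 1; i < length; i++) {
            int b = utf8[position + i] & 0xFF;
            if ((b & 0xC0) != 0x80) {
                return -1; // not a continuation byte
            }
            codePoint = (codePoint << 6) | (b & 0x3F);
        }
        int min = length == 2 ? 0x80 : length == 3 ? 0x800 : 0x10000;
        boolean surrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
        if (codePoint < min || codePoint > 0x10FFFF || surrogate) {
            return -1; // overlong encoding, out of range, or encoded surrogate
        }
        return codePoint;
    }

    // Hypothetical stand-in for SliceUtf8.lengthOfCodePoint.
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) {
            return 1;
        }
        if (codePoint < 0x800) {
            return 2;
        }
        return codePoint < 0x10000 ? 3 : 4;
    }

    // Same three stop conditions as the method above, applied to a byte[].
    static int findTruncationPosition(byte[] utf8) {
        int position = 0;
        while (position < utf8.length) {
            int codePoint = tryDecode(utf8, position);
            if (codePoint < 0
                    || codePoint == REPLACEMENT_CHARACTER_CODE_POINT
                    || codePoint >= Character.MIN_SUPPLEMENTARY_CODE_POINT) {
                break;
            }
            position += utf8Length(codePoint);
        }
        return position;
    }

    public static void main(String[] args) {
        // plain ASCII: nothing to truncate, prints 3
        System.out.println(findTruncationPosition("abc".getBytes(StandardCharsets.UTF_8)));
        // replacement character after "ab": prints 2
        System.out.println(findTruncationPosition("ab\uFFFDcd".getBytes(StandardCharsets.UTF_8)));
        // U+1D403 (a surrogate pair in UTF-16) after the two-byte "é": prints 2
        System.out.println(findTruncationPosition("\u00E9\uD835\uDC03z".getBytes(StandardCharsets.UTF_8)));
        // invalid byte 0xFF after "a": prints 1
        System.out.println(findTruncationPosition(new byte[] {'a', (byte) 0xFF, 'b'}));
    }
}
```

The returned value is a byte offset, not a character count, which is why the third case yields 2 even though only one character precedes the supplementary code point.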

Aggregations

SliceUtf8.lengthOfCodePoint (io.airlift.slice.SliceUtf8.lengthOfCodePoint): 6
Constraint (com.facebook.presto.type.Constraint): 4
SliceUtf8.offsetOfCodePoint (io.airlift.slice.SliceUtf8.offsetOfCodePoint): 4
PrestoException (com.facebook.presto.spi.PrestoException): 2
Description (com.facebook.presto.spi.function.Description): 2
LiteralParameters (com.facebook.presto.spi.function.LiteralParameters): 2
ScalarFunction (com.facebook.presto.spi.function.ScalarFunction): 2
SqlType (com.facebook.presto.spi.function.SqlType): 2
Slice (io.airlift.slice.Slice): 2
VisibleForTesting (com.google.common.annotations.VisibleForTesting): 1
InvalidUtf8Exception (io.airlift.slice.InvalidUtf8Exception): 1
Slices.utf8Slice (io.airlift.slice.Slices.utf8Slice): 1
OptionalInt (java.util.OptionalInt): 1