Search in sources :

Example 31 with Description

use of io.prestosql.spi.function.Description in project hetu-core by openlookeng.

the class StringFunctions method replace.

@Description("greedily replaces occurrences of a pattern with a string")
@ScalarFunction
@LiteralParameters({ "x", "y", "z", "u" })
@Constraint(variable = "u", expression = "min(2147483647, x + z * (x + 1))")
@SqlType("varchar(u)")
public static Slice replace(@SqlType("varchar(x)") Slice str, @SqlType("varchar(y)") Slice search, @SqlType("varchar(z)") Slice replace) {
    // Empty search?
    if (search.length() == 0) {
        // With empty `search` we insert `replace` in front of every character and and the end
        Slice buffer = Slices.allocate((countCodePoints(str) + 1) * replace.length() + str.length());
        // Always start with replace
        buffer.setBytes(0, replace);
        int indexBuffer = replace.length();
        // After every code point insert `replace`
        int index = 0;
        while (index < str.length()) {
            int codePointLength = lengthOfCodePointSafe(str, index);
            // Append current code point
            buffer.setBytes(indexBuffer, str, index, codePointLength);
            indexBuffer += codePointLength;
            // Append `replace`
            buffer.setBytes(indexBuffer, replace);
            indexBuffer += replace.length();
            // Advance pointer to current code point
            index += codePointLength;
        }
        return buffer;
    }
    // Allocate a reasonable buffer
    Slice buffer = Slices.allocate(str.length());
    int index = 0;
    int indexBuffer = 0;
    while (index < str.length()) {
        int matchIndex = str.indexOf(search, index);
        // Found a match?
        if (matchIndex < 0) {
            // No match found so copy the rest of string
            int bytesToCopy = str.length() - index;
            buffer = Slices.ensureSize(buffer, indexBuffer + bytesToCopy);
            buffer.setBytes(indexBuffer, str, index, bytesToCopy);
            indexBuffer += bytesToCopy;
            break;
        }
        int bytesToCopy = matchIndex - index;
        buffer = Slices.ensureSize(buffer, indexBuffer + bytesToCopy + replace.length());
        // Non empty match?
        if (bytesToCopy > 0) {
            buffer.setBytes(indexBuffer, str, index, bytesToCopy);
            indexBuffer += bytesToCopy;
        }
        // Non empty replace?
        if (replace.length() > 0) {
            buffer.setBytes(indexBuffer, replace);
            indexBuffer += replace.length();
        }
        // Continue searching after match
        index = matchIndex + search.length();
    }
    return buffer.slice(0, indexBuffer);
}
Also used : Slice(io.airlift.slice.Slice) Slices.utf8Slice(io.airlift.slice.Slices.utf8Slice) Constraint(io.prestosql.type.Constraint) SliceUtf8.lengthOfCodePoint(io.airlift.slice.SliceUtf8.lengthOfCodePoint) SliceUtf8.offsetOfCodePoint(io.airlift.slice.SliceUtf8.offsetOfCodePoint) ScalarFunction(io.prestosql.spi.function.ScalarFunction) Description(io.prestosql.spi.function.Description) LiteralParameters(io.prestosql.spi.function.LiteralParameters) Constraint(io.prestosql.type.Constraint) SqlType(io.prestosql.spi.function.SqlType)

Example 32 with Description

use of io.prestosql.spi.function.Description in project hetu-core by openlookeng.

the class StringFunctions method splitPart.

@SqlNullable
@Description("splits a string by a delimiter and returns the specified field (counting from one)")
@ScalarFunction
@LiteralParameters({ "x", "y" })
@SqlType("varchar(x)")
public static Slice splitPart(@SqlType("varchar(x)") Slice string, @SqlType("varchar(y)") Slice delimiter, @SqlType(StandardTypes.BIGINT) long index) {
    checkCondition(index > 0, INVALID_FUNCTION_ARGUMENT, "Index must be greater than zero");
    // Empty delimiter? Then every character will be a split
    if (delimiter.length() == 0) {
        int startCodePoint = toIntExact(index);
        int indexStart = offsetOfCodePoint(string, startCodePoint - 1);
        if (indexStart < 0) {
            // index too big
            return null;
        }
        int length = lengthOfCodePoint(string, indexStart);
        if (indexStart + length > string.length()) {
            throw new PrestoException(INVALID_FUNCTION_ARGUMENT, "Invalid UTF-8 encoding");
        }
        return string.slice(indexStart, length);
    }
    int matchCount = 0;
    int previousIndex = 0;
    while (previousIndex < string.length()) {
        int matchIndex = string.indexOf(delimiter, previousIndex);
        // No match
        if (matchIndex < 0) {
            break;
        }
        // Reached the requested part?
        if (++matchCount == index) {
            return string.slice(previousIndex, matchIndex - previousIndex);
        }
        // Continue searching after the delimiter
        previousIndex = matchIndex + delimiter.length();
    }
    if (matchCount == index - 1) {
        // returns last section of the split
        return string.slice(previousIndex, string.length() - previousIndex);
    }
    // index is too big, null is returned
    return null;
}
Also used : PrestoException(io.prestosql.spi.PrestoException) Constraint(io.prestosql.type.Constraint) SliceUtf8.lengthOfCodePoint(io.airlift.slice.SliceUtf8.lengthOfCodePoint) SliceUtf8.offsetOfCodePoint(io.airlift.slice.SliceUtf8.offsetOfCodePoint) SqlNullable(io.prestosql.spi.function.SqlNullable) ScalarFunction(io.prestosql.spi.function.ScalarFunction) Description(io.prestosql.spi.function.Description) LiteralParameters(io.prestosql.spi.function.LiteralParameters) SqlType(io.prestosql.spi.function.SqlType)

Example 33 with Description

use of io.prestosql.spi.function.Description in project hetu-core by openlookeng.

the class StringFunctions method levenshteinDistance.

@Description("computes Levenshtein distance between two strings")
@ScalarFunction
@LiteralParameters({ "x", "y" })
@SqlType(StandardTypes.BIGINT)
public static long levenshteinDistance(@SqlType("varchar(x)") Slice left, @SqlType("varchar(y)") Slice right) {
    int[] leftCodePoints = castToCodePoints(left);
    int[] rightCodePoints = castToCodePoints(right);
    if (leftCodePoints.length < rightCodePoints.length) {
        int[] tempCodePoints = leftCodePoints;
        leftCodePoints = rightCodePoints;
        rightCodePoints = tempCodePoints;
    }
    if (rightCodePoints.length == 0) {
        return leftCodePoints.length;
    }
    checkCondition((leftCodePoints.length * (rightCodePoints.length - 1)) <= 1_000_000, INVALID_FUNCTION_ARGUMENT, "The combined inputs for Levenshtein distance are too large");
    int[] distances = new int[rightCodePoints.length];
    for (int i = 0; i < rightCodePoints.length; i++) {
        distances[i] = i + 1;
    }
    for (int i = 0; i < leftCodePoints.length; i++) {
        int leftUpDistance = distances[0];
        if (leftCodePoints[i] == rightCodePoints[0]) {
            distances[0] = i;
        } else {
            distances[0] = Math.min(i, distances[0]) + 1;
        }
        for (int j = 1; j < rightCodePoints.length; j++) {
            int leftUpDistanceNext = distances[j];
            if (leftCodePoints[i] == rightCodePoints[j]) {
                distances[j] = leftUpDistance;
            } else {
                distances[j] = Math.min(distances[j - 1], Math.min(leftUpDistance, distances[j])) + 1;
            }
            leftUpDistance = leftUpDistanceNext;
        }
    }
    return distances[rightCodePoints.length - 1];
}
Also used : Constraint(io.prestosql.type.Constraint) SliceUtf8.lengthOfCodePoint(io.airlift.slice.SliceUtf8.lengthOfCodePoint) SliceUtf8.offsetOfCodePoint(io.airlift.slice.SliceUtf8.offsetOfCodePoint) ScalarFunction(io.prestosql.spi.function.ScalarFunction) Description(io.prestosql.spi.function.Description) LiteralParameters(io.prestosql.spi.function.LiteralParameters) SqlType(io.prestosql.spi.function.SqlType)

Example 34 with Description

use of io.prestosql.spi.function.Description in project hetu-core by openlookeng.

the class VarbinaryFunctions method toIEEE754Binary32.

@Description("encode value as a big endian varbinary according to IEEE 754 single-precision floating-point format")
@ScalarFunction("to_ieee754_32")
@SqlType(StandardTypes.VARBINARY)
public static Slice toIEEE754Binary32(@SqlType(StandardTypes.REAL) long value) {
    Slice slice = Slices.allocate(Float.BYTES);
    slice.setInt(0, Integer.reverseBytes((int) value));
    return slice;
}
Also used : Slice(io.airlift.slice.Slice) ScalarFunction(io.prestosql.spi.function.ScalarFunction) Description(io.prestosql.spi.function.Description) SqlType(io.prestosql.spi.function.SqlType)

Example 35 with Description

use of io.prestosql.spi.function.Description in project hetu-core by openlookeng.

the class VarbinaryFunctions method toBigEndian32.

@Description("encode value as a 32-bit 2's complement big endian varbinary")
@ScalarFunction("to_big_endian_32")
@SqlType(StandardTypes.VARBINARY)
public static Slice toBigEndian32(@SqlType(StandardTypes.INTEGER) long value) {
    Slice slice = Slices.allocate(Integer.BYTES);
    slice.setInt(0, Integer.reverseBytes((int) value));
    return slice;
}
Also used : Slice(io.airlift.slice.Slice) ScalarFunction(io.prestosql.spi.function.ScalarFunction) Description(io.prestosql.spi.function.Description) SqlType(io.prestosql.spi.function.SqlType)

Aggregations

Description (io.prestosql.spi.function.Description)76 ScalarFunction (io.prestosql.spi.function.ScalarFunction)76 SqlType (io.prestosql.spi.function.SqlType)76 OGCGeometry (com.esri.core.geometry.ogc.OGCGeometry)40 SqlNullable (io.prestosql.spi.function.SqlNullable)34 Slice (io.airlift.slice.Slice)18 LiteralParameters (io.prestosql.spi.function.LiteralParameters)15 Point (com.esri.core.geometry.Point)14 OGCPoint (com.esri.core.geometry.ogc.OGCPoint)14 Constraint (io.prestosql.type.Constraint)13 MultiPoint (com.esri.core.geometry.MultiPoint)12 BlockBuilder (io.prestosql.spi.block.BlockBuilder)11 PrestoException (io.prestosql.spi.PrestoException)10 SliceUtf8.lengthOfCodePoint (io.airlift.slice.SliceUtf8.lengthOfCodePoint)7 SliceUtf8.offsetOfCodePoint (io.airlift.slice.SliceUtf8.offsetOfCodePoint)7 GeometryType (io.prestosql.geospatial.GeometryType)6 Envelope (com.esri.core.geometry.Envelope)5 MultiPath (com.esri.core.geometry.MultiPath)5 Matcher (io.airlift.joni.Matcher)5 Slices.utf8Slice (io.airlift.slice.Slices.utf8Slice)5