
Example 1 with RegexStringMatcher

use of com.ibm.cohort.cql.util.RegexStringMatcher in project quality-measure-and-cohort-service by Alvearie.

the class AnyColumnFunctions method AnyColumnRegex.

public static Object AnyColumnRegex(Object object, String regex) {
    DataRow dataRow = (DataRow) object;
    StringMatcher matcher = new RegexStringMatcher(regex);
    // Collect the values of every field whose name matches the regex.
    return dataRow.getFieldNames().stream()
            .filter(matcher)
            .map(dataRow::getValue)
            .collect(Collectors.toList());
}
Also used : RegexStringMatcher(com.ibm.cohort.cql.util.RegexStringMatcher) StringMatcher(com.ibm.cohort.cql.util.StringMatcher) PrefixStringMatcher(com.ibm.cohort.cql.util.PrefixStringMatcher) DataRow(com.ibm.cohort.datarow.model.DataRow)
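
The snippet above passes the matcher straight to Stream.filter, which suggests StringMatcher can be used as a Predicate<String>. The following is a minimal, self-contained sketch of that pattern, not the project's code: SketchRegexMatcher and AnyColumnRegexSketch are hypothetical names, a plain Map stands in for DataRow, and whole-name matching (rather than substring search) is an assumption about how RegexStringMatcher behaves.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class AnyColumnRegexSketch {

    // Hypothetical stand-in for RegexStringMatcher; the real class may differ.
    static final class SketchRegexMatcher implements Predicate<String> {
        private final Pattern pattern;

        SketchRegexMatcher(String regex) {
            this.pattern = Pattern.compile(regex);
        }

        @Override
        public boolean test(String fieldName) {
            // Assumption: the whole field name must match the regex.
            return pattern.matcher(fieldName).matches();
        }
    }

    // A Map (field name -> value) stands in for DataRow in this sketch.
    static List<Object> anyColumnRegex(Map<String, Object> row, String regex) {
        Predicate<String> matcher = new SketchRegexMatcher(regex);
        return row.keySet().stream()
                .filter(matcher)   // keep only the matching field names
                .map(row::get)     // look up the value of each kept field
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("code_col", "A1");
        row.put("code_col2", "B2");
        row.put("code_col_system", "http://example.org");
        row.put("integer_col", 7);
        System.out.println(anyColumnRegex(row, "code_col[0-9]*")); // prints [A1, B2]
    }
}
```

Under the whole-name assumption, "code_col_system" is not selected by "code_col[0-9]*"; a substring-based matcher would select it.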

Example 2 with RegexStringMatcher

use of com.ibm.cohort.cql.util.RegexStringMatcher in project quality-measure-and-cohort-service by Alvearie.

the class ColumnFilterFunctionTest method testColumnFiltering.

@Test
public void testColumnFiltering() {
    String path = new File("src/test/resources/alltypes/testdata/test-A.parquet").toURI().toString();
    Dataset<Row> baseline = spark.read().format("parquet").load(path);
    assertEquals(12, baseline.schema().fields().length);
    String colName = "boolean_col";
    String regexColName = "code_col[0-9]*";
    ColumnFilterFunction datasetTransformer = new ColumnFilterFunction(new HashSet<>(Arrays.asList(new EqualsStringMatcher(colName), new RegexStringMatcher(regexColName))));
    Dataset<Row> filtered = datasetTransformer.apply(baseline);
    Set<String> expectedNames = new HashSet<>(Arrays.asList(colName, "code_col", "code_col_system", "string_col", "code_col2"));
    Set<String> actualNames = new HashSet<>(Arrays.asList(filtered.schema().fieldNames()));
    assertEquals(expectedNames, actualNames);
}
Also used : RegexStringMatcher(com.ibm.cohort.cql.util.RegexStringMatcher) EqualsStringMatcher(com.ibm.cohort.cql.util.EqualsStringMatcher) Row(org.apache.spark.sql.Row) File(java.io.File) HashSet(java.util.HashSet) Test(org.junit.Test) BaseSparkTest(com.ibm.cohort.cql.spark.BaseSparkTest)

Example 3 with RegexStringMatcher

use of com.ibm.cohort.cql.util.RegexStringMatcher in project quality-measure-and-cohort-service by Alvearie.

the class DataTypeRequirementsProcessorTest method testAnyColumnRequirements.

@Test
public void testAnyColumnRequirements() throws Exception {
    String basePath = "src/test/resources";
    Map<String, Set<StringMatcher>> reqsByDataType = runPatternTest(basePath + "/any-column/cql", basePath + "/alltypes/modelinfo/alltypes-modelinfo-1.0.0.xml", null, x -> x.getFileName().toString().equals("MeasureAnyColumn-1.0.0.cql"));
    Map<String, Set<StringMatcher>> expectations = new HashMap<>();
    expectations.put("A", new HashSet<>(Arrays.asList(new PrefixStringMatcher("code_col"))));
    expectations.put("B", new HashSet<>(Arrays.asList(new RegexStringMatcher("integer_col"))));
    expectations.put("C", new HashSet<>(Arrays.asList(new RegexStringMatcher(".*_decimal"))));
    assertEquals(expectations, reqsByDataType);
}
Also used : HashSet(java.util.HashSet) Set(java.util.Set) HashMap(java.util.HashMap) RegexStringMatcher(com.ibm.cohort.cql.util.RegexStringMatcher) PrefixStringMatcher(com.ibm.cohort.cql.util.PrefixStringMatcher) Test(org.junit.Test)
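
The assertEquals call in this test compares two maps of matcher sets, so it can only pass if matchers built from the same string compare as equal. java.util.regex.Pattern does not override equals, so a regex matcher would need to define equality itself. The sketch below shows one way to do that, by comparing the regex source string; SketchRegexMatcher is a hypothetical class, and whether RegexStringMatcher implements equality this way is an assumption.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

final class MatcherEqualitySketch {

    // Hypothetical stand-in for RegexStringMatcher; not the project's class.
    static final class SketchRegexMatcher {
        private final String regex;
        private final Pattern pattern;

        SketchRegexMatcher(String regex) {
            this.regex = regex;
            this.pattern = Pattern.compile(regex);
        }

        boolean test(String name) {
            return pattern.matcher(name).matches();
        }

        // Equality is based on the regex source, because Pattern itself
        // only supports identity comparison.
        @Override
        public boolean equals(Object other) {
            return other instanceof SketchRegexMatcher
                    && regex.equals(((SketchRegexMatcher) other).regex);
        }

        @Override
        public int hashCode() {
            return regex.hashCode();
        }
    }

    static Set<SketchRegexMatcher> setOf(String regex) {
        return new HashSet<>(Collections.singletonList(new SketchRegexMatcher(regex)));
    }

    public static void main(String[] args) {
        // Two independently built sets compare equal: prints true
        System.out.println(setOf(".*_decimal").equals(setOf(".*_decimal")));
        // Two compiled patterns with the same source do not: prints false
        System.out.println(Pattern.compile(".*_decimal").equals(Pattern.compile(".*_decimal")));
    }
}
```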

Example 4 with RegexStringMatcher

use of com.ibm.cohort.cql.util.RegexStringMatcher in project quality-measure-and-cohort-service by Alvearie.

the class AnyColumnVisitor method visitFunctionRef.

@Override
public Object visitFunctionRef(FunctionRef elm, AnyColumnContext context) {
    if (AnyColumnFunctions.FUNCTION_NAMES.contains(elm.getName())) {
        if (elm.getOperand().size() == 2) {
            QName dataType = ((As) elm.getOperand().get(0)).getOperand().getResultTypeName();
            // TODO - validate that the first operand is a model object. We really should be doing that at the
            // method declaration level instead of Choice<Any>, but that will require the model
            // to have a base class that everything extends from.
            String columnMatchLogic = null;
            if (elm.getOperand().get(1) instanceof Literal) {
                columnMatchLogic = ((Literal) elm.getOperand().get(1)).getValue();
            } else {
                throw new IllegalArgumentException(String.format("Second argument to %s function at %s must be a literal", elm.getName(), elm.getLocator()));
            }
            StringMatcher matcher = null;
            if (elm.getName().equals(AnyColumnFunctions.FUNC_ANY_COLUMN)) {
                matcher = new PrefixStringMatcher(columnMatchLogic);
            } else if (elm.getName().equals(AnyColumnFunctions.FUNC_ANY_COLUMN_REGEX)) {
                matcher = new RegexStringMatcher(columnMatchLogic);
            } else {
                throw new IllegalArgumentException(String.format("Found declared, but unsupported AnyColumn function %s at %s", elm.getName(), elm.getLocator()));
            }
            context.reportAnyColumn(dataType, matcher);
        } else {
            throw new IllegalArgumentException(String.format("%s function at %s should have exactly two arguments", elm.getName(), elm.getLocator()));
        }
    }
    return super.visitFunctionRef(elm, context);
}
Also used : RegexStringMatcher(com.ibm.cohort.cql.util.RegexStringMatcher) QName(javax.xml.namespace.QName) Literal(org.hl7.elm.r1.Literal) StringMatcher(com.ibm.cohort.cql.util.StringMatcher) PrefixStringMatcher(com.ibm.cohort.cql.util.PrefixStringMatcher)
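
The visitor above picks a prefix matcher or a regex matcher depending on the function name. The sketch below isolates that selection step so the difference between the two matching modes is easy to see. It is illustrative only: the constant values "AnyColumn" and "AnyColumnRegex" are assumed (the real values of AnyColumnFunctions.FUNC_ANY_COLUMN and FUNC_ANY_COLUMN_REGEX are not shown here), and plain Predicate<String> lambdas stand in for PrefixStringMatcher and RegexStringMatcher.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class MatcherSelectionSketch {

    // Assumed values; the real constants live in AnyColumnFunctions.
    static final String FUNC_ANY_COLUMN = "AnyColumn";
    static final String FUNC_ANY_COLUMN_REGEX = "AnyColumnRegex";

    static Predicate<String> matcherFor(String functionName, String columnMatchLogic) {
        switch (functionName) {
            case FUNC_ANY_COLUMN: {
                // Prefix mode: any column name that starts with the literal.
                return name -> name.startsWith(columnMatchLogic);
            }
            case FUNC_ANY_COLUMN_REGEX: {
                // Regex mode: the whole column name must match (an assumption).
                Pattern pattern = Pattern.compile(columnMatchLogic);
                return name -> pattern.matcher(name).matches();
            }
            default:
                throw new IllegalArgumentException(
                        String.format("Unsupported AnyColumn function %s", functionName));
        }
    }

    public static void main(String[] args) {
        Predicate<String> prefix = matcherFor(FUNC_ANY_COLUMN, "code_col");
        Predicate<String> regex = matcherFor(FUNC_ANY_COLUMN_REGEX, "code_col[0-9]*");
        System.out.println(prefix.test("code_col_system")); // prints true
        System.out.println(regex.test("code_col_system"));  // prints false
        System.out.println(regex.test("code_col2"));        // prints true
    }
}
```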

Aggregations

RegexStringMatcher (com.ibm.cohort.cql.util.RegexStringMatcher) 4
PrefixStringMatcher (com.ibm.cohort.cql.util.PrefixStringMatcher) 3
StringMatcher (com.ibm.cohort.cql.util.StringMatcher) 2
HashSet (java.util.HashSet) 2
Test (org.junit.Test) 2
BaseSparkTest (com.ibm.cohort.cql.spark.BaseSparkTest) 1
EqualsStringMatcher (com.ibm.cohort.cql.util.EqualsStringMatcher) 1
DataRow (com.ibm.cohort.datarow.model.DataRow) 1
File (java.io.File) 1
HashMap (java.util.HashMap) 1
Set (java.util.Set) 1
QName (javax.xml.namespace.QName) 1
Row (org.apache.spark.sql.Row) 1
Literal (org.hl7.elm.r1.Literal) 1