Example 11 with HCatDataType

Use of com.thinkbiganalytics.spark.validation.HCatDataType in project kylo by Teradata.

From the class CleanseAndValidateRowTest, method convertBooleanType.

@Test
public void convertBooleanType() {
    String booleanFieldName = "flag";
    HCatDataType fieldDataType = HCatDataType.createFromDataType(booleanFieldName, "boolean");
    assertNotNull(fieldDataType);
    assertFalse(fieldDataType.isUnchecked());
    // JUnit's assertEquals takes the expected value first, then the actual value.
    assertEquals("java.lang.Boolean", fieldDataType.getConvertibleType().getName());
}
Also used : HCatDataType(com.thinkbiganalytics.spark.validation.HCatDataType) Test(org.junit.Test)

Example 12 with HCatDataType

Use of com.thinkbiganalytics.spark.validation.HCatDataType in project kylo by Teradata.

From the class CleanseAndValidateRowTest, method standardizeAndValidate.

@Test
public void standardizeAndValidate() {
    String fieldName = "field1";
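    // Standardizers and validators are interleaved; the assertions at the end of this
    // test expect the value to go "aafooaa" -> "aabaraa" -> "aatestaa".
    List<BaseFieldPolicy> policies = new ArrayList<>();
    policies.add(new SimpleRegexReplacer("(?i)foo", "bar"));
    policies.add(new LookupValidator("aabaraa"));
    policies.add(new SimpleRegexReplacer("(?i)bar", "test"));
    policies.add(new LookupValidator("aatestaa"));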
    FieldPolicy fieldPolicy = FieldPolicyBuilder.newBuilder().addPolicies(policies).tableName("emp").fieldName(fieldName).feedFieldName(fieldName).build();
    HCatDataType fieldDataType = HCatDataType.createFromDataType(fieldName, "string");
    StandardizationAndValidationResult result = validator.standardizeAndValidateField(fieldPolicy, "aafooaa", fieldDataType, new HashMap<Class, Class>());
    // Expected value first, actual value second.
    assertEquals("aatestaa", result.getFieldValue());
    assertEquals(StandardDataValidator.VALID_RESULT, result.getFinalValidationResult());
}
Also used : FieldPolicy(com.thinkbiganalytics.policy.FieldPolicy) BaseFieldPolicy(com.thinkbiganalytics.policy.BaseFieldPolicy) HCatDataType(com.thinkbiganalytics.spark.validation.HCatDataType) ArrayList(java.util.ArrayList) LookupValidator(com.thinkbiganalytics.policy.validation.LookupValidator) SimpleRegexReplacer(com.thinkbiganalytics.policy.standardization.SimpleRegexReplacer) StandardizationAndValidationResult(com.thinkbiganalytics.spark.datavalidator.StandardizationAndValidationResult) Test(org.junit.Test)
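
The chain of regex replacements in Example 12 can be reproduced with plain java.util.regex, which may help show why the test expects "aatestaa". This is a minimal sketch and does not use any Kylo classes. RegexChainSketch, Replacer, demoSteps and trace are hypothetical names invented for illustration. It also assumes that each replacer behaves like a replace-all with the given pattern, which is consistent with the test's assertions but is not confirmed by the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the standardization steps in Example 12;
// not part of Kylo.
class RegexChainSketch {

    // One replace-all step: a compiled pattern plus its replacement text.
    static final class Replacer {
        private final Pattern pattern;
        private final String replacement;

        Replacer(String regex, String replacement) {
            this.pattern = Pattern.compile(regex);
            this.replacement = replacement;
        }

        String apply(String value) {
            return pattern.matcher(value).replaceAll(replacement);
        }
    }

    // The same two patterns used in the test; "(?i)" makes them case-insensitive.
    static List<Replacer> demoSteps() {
        List<Replacer> steps = new ArrayList<>();
        steps.add(new Replacer("(?i)foo", "bar"));
        steps.add(new Replacer("(?i)bar", "test"));
        return steps;
    }

    // Applies the steps in order and records the value after each one.
    static List<String> trace(String value, List<Replacer> steps) {
        List<String> seen = new ArrayList<>();
        String current = value;
        for (Replacer step : steps) {
            current = step.apply(current);
            seen.add(current);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Prints the intermediate and final values for the test input.
        for (String v : trace("aafooaa", demoSteps())) {
            System.out.println(v);
        }
    }
}
```

The two recorded values line up with the two LookupValidator arguments in the test ("aabaraa" and "aatestaa"), which suggests that each validator there sees the output of the replacer added just before it.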

Aggregations

HCatDataType (com.thinkbiganalytics.spark.validation.HCatDataType): 12
FieldPolicy (com.thinkbiganalytics.policy.FieldPolicy): 9
ArrayList (java.util.ArrayList): 9
Test (org.junit.Test): 9
BaseFieldPolicy (com.thinkbiganalytics.policy.BaseFieldPolicy): 8
StandardizationAndValidationResult (com.thinkbiganalytics.spark.datavalidator.StandardizationAndValidationResult): 8
SimpleRegexReplacer (com.thinkbiganalytics.policy.standardization.SimpleRegexReplacer): 4
LookupValidator (com.thinkbiganalytics.policy.validation.LookupValidator): 3
HashMap (java.util.HashMap): 3
Nonnull (javax.annotation.Nonnull): 2
StructField (org.apache.spark.sql.types.StructField): 2
StandardizationPolicy (com.thinkbiganalytics.policy.standardization.StandardizationPolicy): 1
CharacterValidator (com.thinkbiganalytics.policy.validation.CharacterValidator): 1
ValidationResult (com.thinkbiganalytics.policy.validation.ValidationResult): 1
CleansedRowResult (com.thinkbiganalytics.spark.datavalidator.CleansedRowResult): 1
StructType (org.apache.spark.sql.types.StructType): 1