
Example 1 with FieldPoliciesJsonTransformer

Use of com.thinkbiganalytics.policy.FieldPoliciesJsonTransformer in project kylo by Teradata.

From class FieldPolicyLoader, method loadFieldPolicy.

/**
 * Reads the field policy JSON file at the given path and builds the field policies from it.
 *
 * @param path path to field policy JSON file
 * @return map of field policies built from the JSON
 */
public Map<String, FieldPolicy> loadFieldPolicy(String path) {
    log.info("Loading Field Policy JSON file at {} ", path);
    String policyJson = "[]";
    /*
     * If Spark is running in yarn-cluster mode, the policy JSON file is passed via the --files param so that
     * it is added to the driver classpath in the Application Master. The "path" won't be valid in that case,
     * as it points to the local file system. To handle both modes, we check the given path first (valid in
     * "yarn-client" mode) and fall back to the file name in the current location (for "yarn-cluster" mode).
     *
     * Alternatively, sparkContext.getConf().get("spark.submit.deployMode") could be used
     * to decide which of the two locations to read.
     */
    File policyFile = new File(path);
    if (policyFile.exists() && policyFile.isFile()) {
        log.info("Loading field policies at {} ", path);
    } else {
        log.info("Couldn't find field policy file at {}; will check the current working directory.", path);
        String fileName = policyFile.getName();
        path = "./" + fileName;
    }
    try (BufferedReader br = new BufferedReader(new FileReader(path))) {
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();
        if (line == null) {
            log.error("Field policies file at {} is empty ", path);
        }
        while (line != null) {
            sb.append(line);
            line = br.readLine();
        }
        policyJson = sb.toString();
    } catch (Exception e) {
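        // Pass the path for the first placeholder; the original passed e.getMessage() where the path was expected.
        log.error("Error reading field policy file at path {}. Please verify it contains valid JSON: {}", path, e.getMessage(), e);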
    }
    FieldPoliciesJsonTransformer fieldPoliciesJsonTransformer = new FieldPoliciesJsonTransformer(policyJson);
    fieldPoliciesJsonTransformer.augmentPartitionColumnValidation();
    Map<String, FieldPolicy> map = fieldPoliciesJsonTransformer.buildPolicies();
    log.info("Finished building field policies for file: {} with entity that has {} fields ", path, map.size());
    return map;
}
Also used : FieldPolicy(com.thinkbiganalytics.policy.FieldPolicy) BufferedReader(java.io.BufferedReader) FileReader(java.io.FileReader) File(java.io.File) FieldPoliciesJsonTransformer(com.thinkbiganalytics.policy.FieldPoliciesJsonTransformer)
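
The comment in loadFieldPolicy mentions that spark.submit.deployMode could be used to pick the file location instead of probing the file system. A minimal sketch of that idea follows, assuming the deploy mode is passed in as a plain string ("client" or "cluster"). PolicyPathResolver and resolve are hypothetical names and are not part of Kylo.

```java
import java.io.File;

// Hypothetical helper (not part of Kylo): decides where to look for the policy file.
final class PolicyPathResolver {

    private PolicyPathResolver() {
    }

    /**
     * @param path       path to the policy file as submitted with the job
     * @param deployMode value of spark.submit.deployMode ("client" or "cluster"); may be null
     * @return the path to read the policy file from
     */
    static String resolve(String path, String deployMode) {
        if ("cluster".equalsIgnoreCase(deployMode)) {
            // In cluster mode the file is expected next to the driver (shipped via --files),
            // so only the file name is kept, mirroring the "./" + fileName fallback above.
            return "./" + new File(path).getName();
        }
        // Client mode, or unknown mode: use the path exactly as given.
        return path;
    }
}
```

With a real SparkContext, the second argument would come from sparkContext.getConf().get("spark.submit.deployMode"), as the original comment suggests. The sketch takes a string so it can be tried without Spark on the classpath.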

Example 2 with FieldPoliciesJsonTransformer

Use of com.thinkbiganalytics.policy.FieldPoliciesJsonTransformer in project kylo by Teradata.

From class ValidationStage, method getPolicyMap.

/**
 * Gets the policy map for the specified schema.
 */
private Map<String, com.thinkbiganalytics.policy.FieldPolicy> getPolicyMap(@Nonnull final StructType schema) {
    final Map<String, com.thinkbiganalytics.policy.FieldPolicy> policyMap = new FieldPoliciesJsonTransformer(Arrays.asList(policies)).buildPolicies();
    for (final StructField field : schema.fields()) {
        final String name = field.name().toLowerCase().trim();
        if (!policyMap.containsKey(name)) {
            final com.thinkbiganalytics.policy.FieldPolicy policy = FieldPolicyBuilder.newBuilder().fieldName(name).feedFieldName(name).build();
            policyMap.put(name, policy);
        }
    }
    return policyMap;
}
Also used : StructField(org.apache.spark.sql.types.StructField) FieldPolicy(com.thinkbiganalytics.policy.rest.model.FieldPolicy) FieldPoliciesJsonTransformer(com.thinkbiganalytics.policy.FieldPoliciesJsonTransformer)
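
The default-filling pattern in getPolicyMap can be tried without Spark or Kylo on the classpath. The sketch below is a simplified stand-in: plain strings replace FieldPolicy, and a list of column names replaces StructType. PolicyDefaults, withDefaults, and the "default" value are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in (not part of Kylo) for the default-filling loop in getPolicyMap.
final class PolicyDefaults {

    private PolicyDefaults() {
    }

    /**
     * Returns a copy of the configured policies with a "default" entry added
     * for every column that has no configured policy.
     */
    static Map<String, String> withDefaults(Map<String, String> configured, List<String> columns) {
        Map<String, String> result = new HashMap<>(configured);
        for (String column : columns) {
            // Same normalization order as the original: lower-case, then trim.
            // Locale.ROOT is an addition here to keep the result locale-independent.
            String name = column.toLowerCase(Locale.ROOT).trim();
            // Configured policies win; only missing names receive the default.
            result.putIfAbsent(name, "default");
        }
        return result;
    }
}
```

Because the keys are normalized before the lookup, a column named " ID " matches a policy configured under "id" and keeps it, rather than gaining a second default entry.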

Aggregations

FieldPoliciesJsonTransformer (com.thinkbiganalytics.policy.FieldPoliciesJsonTransformer): 2
FieldPolicy (com.thinkbiganalytics.policy.FieldPolicy): 1
FieldPolicy (com.thinkbiganalytics.policy.rest.model.FieldPolicy): 1
BufferedReader (java.io.BufferedReader): 1
File (java.io.File): 1
FileReader (java.io.FileReader): 1
StructField (org.apache.spark.sql.types.StructField): 1