Use of com.ibm.cohort.cql.spark.data.SparkSchemaCreator in project quality-measure-and-cohort-service by Alvearie.
From the SparkCqlEvaluator class, method calculateSparkSchema:
/**
 * Auto-detect an output schema for one or more contexts using program metadata files
 * and the CQL definitions that will be used by the engine.
 *
 * @param contextNames List of context names to calculate schemas for.
 * @param contextDefinitions Context definitions used during schema calculation. Used to
 *                           detect the key column for each context.
 * @param encoder Encoder used to calculate the output column names to use for
 *                each output schema.
 * @param cqlTranslator Pre-configured CQL translator instance.
 * @return Map of context name to the output Spark schema for that context. The map will
 *         only contain entries for context names that are included in the contextNames
 *         list passed to this method.
 * @throws Exception if deserialization errors occur when reading any of the input files
 *                   or if inferring an output schema fails for any reason.
 */
protected Map<String, StructType> calculateSparkSchema(List<String> contextNames,
        ContextDefinitions contextDefinitions, SparkOutputColumnEncoder encoder,
        CqlToElmTranslator cqlTranslator) throws Exception {
    // Lazily initialize the library provider and cache it for reuse on this thread.
    CqlLibraryProvider libProvider = SparkCqlEvaluator.libraryProvider.get();
    if (libProvider == null) {
        libProvider = createLibraryProvider();
        SparkCqlEvaluator.libraryProvider.set(libProvider);
    }

    CqlEvaluationRequests cqlEvaluationRequests = getFilteredJobSpecificationWithIds();

    SparkSchemaCreator sparkSchemaCreator = new SparkSchemaCreator(libProvider,
            cqlEvaluationRequests, contextDefinitions, encoder, cqlTranslator);
    return sparkSchemaCreator.calculateSchemasForContexts(contextNames);
}
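The method begins with a lazy-initialization check against `SparkCqlEvaluator.libraryProvider`, which appears to act as a per-thread cache: the library provider is built only on the first call and reused afterwards. Below is a minimal, self-contained sketch of that caching pattern using a plain `ThreadLocal`; the class and field names are illustrative only and do not come from the project.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalLazyInit {
    // Counts how many times the "expensive" provider is actually built.
    static final AtomicInteger builds = new AtomicInteger();

    // Per-thread holder, analogous in spirit to SparkCqlEvaluator.libraryProvider.
    static final ThreadLocal<String> provider = new ThreadLocal<>();

    static String getOrCreateProvider() {
        String p = provider.get();
        if (p == null) {
            // First call on this thread: build the provider and cache it.
            p = "provider-" + builds.incrementAndGet();
            provider.set(p);
        }
        return p;
    }

    public static void main(String[] args) {
        String first = getOrCreateProvider();
        String second = getOrCreateProvider();
        System.out.println(first.equals(second)); // true  - same cached instance
        System.out.println(builds.get());         // 1     - built only once per thread
    }
}
```

Because the cache is keyed by thread, each Spark executor thread pays the construction cost at most once while avoiding cross-thread synchronization.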