
Example 1 with SparkSchemaCreator

Use of com.ibm.cohort.cql.spark.data.SparkSchemaCreator in project quality-measure-and-cohort-service by Alvearie.

From the class SparkCqlEvaluator, method calculateSparkSchema.

/**
 * Auto-detect an output schema for 1 or more contexts using program metadata files
 * and the CQL definitions that will be used by the engine.
 *
 * @param contextNames          List of context names to calculate schemas for.
 * @param contextDefinitions    Context definitions used during schema calculation. Used to
 *                              detect the key column for each context.
 * @param encoder               Encoder used to calculate the output column names to use for
 *                              each output schema.
 * @param cqlTranslator         Pre-configured CQL Translator instance
 * @return Map of context name to the output Spark schema for that context. The map will only
 *         contain entries for each context name that is included in the contextNames list
 *         used as input to this function.
 * @throws Exception if deserialization errors occur when reading in any of the input files
 *         or if inferring an output schema fails for any reason.
 */
protected Map<String, StructType> calculateSparkSchema(List<String> contextNames, ContextDefinitions contextDefinitions, SparkOutputColumnEncoder encoder, CqlToElmTranslator cqlTranslator) throws Exception {
    CqlLibraryProvider libProvider = SparkCqlEvaluator.libraryProvider.get();
    if (libProvider == null) {
        libProvider = createLibraryProvider();
        SparkCqlEvaluator.libraryProvider.set(libProvider);
    }
    CqlEvaluationRequests cqlEvaluationRequests = getFilteredJobSpecificationWithIds();
    SparkSchemaCreator sparkSchemaCreator = new SparkSchemaCreator(libProvider, cqlEvaluationRequests, contextDefinitions, encoder, cqlTranslator);
    return sparkSchemaCreator.calculateSchemasForContexts(contextNames);
}
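
The first lines of calculateSparkSchema follow a get-or-create pattern: look up a cached library provider, build one only if none exists yet, then store it for later calls. The sketch below illustrates that pattern in isolation. It assumes the cache behaves like a java.lang.ThreadLocal (the listing only shows that it exposes get() and set()); LibraryProvider, CACHE, CREATIONS, and getOrCreate are hypothetical stand-ins for illustration, not names from the project.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyProviderSketch {
    // Hypothetical stand-in for CqlLibraryProvider.
    interface LibraryProvider {
        String describe();
    }

    // Assumption: the cache is per-thread, like a ThreadLocal.
    static final ThreadLocal<LibraryProvider> CACHE = new ThreadLocal<>();

    // Counts how many providers were actually constructed.
    static final AtomicInteger CREATIONS = new AtomicInteger();

    // Hypothetical factory; stands in for an expensive construction step.
    static LibraryProvider createLibraryProvider() {
        final int id = CREATIONS.incrementAndGet();
        return () -> "provider#" + id;
    }

    // Get-or-create: reuse the cached provider for this thread if present.
    static LibraryProvider getOrCreate() {
        LibraryProvider provider = CACHE.get();
        if (provider == null) {
            provider = createLibraryProvider();
            CACHE.set(provider);
        }
        return provider;
    }

    public static void main(String[] args) {
        LibraryProvider first = getOrCreate();
        LibraryProvider second = getOrCreate();
        // Both calls on the same thread return the same instance.
        System.out.println(first == second);
        System.out.println(first.describe());
    }
}
```

With a per-thread cache, each worker thread builds its own provider at most once, so no locking is needed around the null check. If the cache were instead shared across threads, the check-then-set sequence would need synchronization.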
Also used:
SparkSchemaCreator (com.ibm.cohort.cql.spark.data.SparkSchemaCreator)
CqlLibraryProvider (com.ibm.cohort.cql.library.CqlLibraryProvider)
PriorityCqlLibraryProvider (com.ibm.cohort.cql.library.PriorityCqlLibraryProvider)
HadoopBasedCqlLibraryProvider (com.ibm.cohort.cql.library.HadoopBasedCqlLibraryProvider)
TranslatingCqlLibraryProvider (com.ibm.cohort.cql.translation.TranslatingCqlLibraryProvider)
ClasspathCqlLibraryProvider (com.ibm.cohort.cql.library.ClasspathCqlLibraryProvider)
CqlEvaluationRequests (com.ibm.cohort.cql.evaluation.CqlEvaluationRequests)

Aggregations

CqlEvaluationRequests (com.ibm.cohort.cql.evaluation.CqlEvaluationRequests): 1
ClasspathCqlLibraryProvider (com.ibm.cohort.cql.library.ClasspathCqlLibraryProvider): 1
CqlLibraryProvider (com.ibm.cohort.cql.library.CqlLibraryProvider): 1
HadoopBasedCqlLibraryProvider (com.ibm.cohort.cql.library.HadoopBasedCqlLibraryProvider): 1
PriorityCqlLibraryProvider (com.ibm.cohort.cql.library.PriorityCqlLibraryProvider): 1
SparkSchemaCreator (com.ibm.cohort.cql.spark.data.SparkSchemaCreator): 1
TranslatingCqlLibraryProvider (com.ibm.cohort.cql.translation.TranslatingCqlLibraryProvider): 1