
Example 1 with TableSetup

Use of com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup in project kylo by Teradata.

From the class HiveColumnsUpgradeAction, method upgradeTo:

@Override
public void upgradeTo(final KyloVersion startingVersion) {
    log.info("Upgrading hive columns from version: {}", startingVersion);
    feedService.getFeeds().stream().filter(feed -> Optional.ofNullable(feed.getTable()).map(TableSetup::getTableSchema).map(TableSchema::getFields).isPresent()).forEach(feed -> {
        final TableSchema schema = feed.getTable().getTableSchema();
        final DerivedDatasource datasource = datasourceProvider.findDerivedDatasource("HiveDatasource", feed.getSystemCategoryName() + "." + feed.getSystemFeedName());
        if (datasource != null) {
            log.info("Upgrading schema: {}/{}", schema.getDatabaseName(), schema.getSchemaName());
            datasource.setGenericProperties(Collections.singletonMap("columns", (Serializable) schema.getFields()));
        }
    });
}
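The null-safe navigation through feed.getTable().getTableSchema().getFields() works by chaining Optional.map calls, which short-circuit to an empty Optional as soon as any link in the chain is null. A minimal standalone sketch of the same pattern (Feed, Table, and Schema here are hypothetical stand-ins, not the Kylo classes):

```java
import java.util.List;
import java.util.Optional;

public class OptionalChainSketch {

    // Hypothetical stand-ins for the Kylo TableSetup/TableSchema types.
    record Schema(List<String> fields) {}
    record Table(Schema schema) {}
    record Feed(Table table) {}

    // True only when feed.table.schema.fields is non-null at every step;
    // any null along the way yields an empty Optional instead of an NPE.
    static boolean hasFields(Feed feed) {
        return Optional.ofNullable(feed.table())
                .map(Table::schema)
                .map(Schema::fields)
                .isPresent();
    }

    public static void main(String[] args) {
        Feed complete = new Feed(new Table(new Schema(List.of("id", "name"))));
        Feed noSchema = new Feed(new Table(null));
        System.out.println(hasFields(complete)); // true
        System.out.println(hasFields(noSchema)); // false
    }
}
```

The same chain with plain null checks would need three nested ifs; the Optional form keeps the predicate usable inline in a stream filter, as the upgrade action does.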
Also used : DerivedDatasource(com.thinkbiganalytics.metadata.api.datasource.DerivedDatasource) Logger(org.slf4j.Logger) FeedManagerFeedService(com.thinkbiganalytics.feedmgr.service.feed.FeedManagerFeedService) LoggerFactory(org.slf4j.LoggerFactory) Profile(org.springframework.context.annotation.Profile) Serializable(java.io.Serializable) Inject(javax.inject.Inject) Component(org.springframework.stereotype.Component) TableSchema(com.thinkbiganalytics.discovery.schema.TableSchema) Optional(java.util.Optional) TableSetup(com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup) KyloVersion(com.thinkbiganalytics.KyloVersion) DatasourceProvider(com.thinkbiganalytics.metadata.api.datasource.DatasourceProvider) Collections(java.util.Collections) KyloUpgrader(com.thinkbiganalytics.server.upgrade.KyloUpgrader) UpgradeState(com.thinkbiganalytics.server.upgrade.UpgradeState)

Example 2 with TableSetup

Use of com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup in project kylo by Teradata.

From the class FeedHiveTableService, method updateColumnDescriptions:

/**
 * Updates the column descriptions in the Hive metastore for the specified feed.
 *
 * @param feed the feed to update
 * @throws DataAccessException if there is any problem updating the Hive metastore
 */
public void updateColumnDescriptions(@Nonnull final FeedMetadata feed) {
    final List<Field> feedFields = Optional.ofNullable(feed.getTable()).map(TableSetup::getTableSchema).map(TableSchema::getFields).orElse(null);
    if (feedFields != null && !feedFields.isEmpty()) {
        final TableSchema hiveSchema = hiveService.getTableSchema(feed.getSystemCategoryName(), feed.getSystemFeedName());
        if (hiveSchema != null) {
            final Map<String, Field> hiveFieldMap = hiveSchema.getFields().stream().collect(Collectors.toMap(field -> field.getName().toLowerCase(), Function.identity()));
            feedFields.stream().filter(feedField -> {
                final Field hiveField = hiveFieldMap.get(feedField.getName().toLowerCase());
                return hiveField != null && (StringUtils.isNotEmpty(feedField.getDescription()) || StringUtils.isNotEmpty(hiveField.getDescription())) && !Objects.equals(feedField.getDescription(), hiveField.getDescription());
            }).forEach(feedField -> changeColumn(feed, feedField.getName(), feedField));
        }
    }
}
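The core of the method above is building a case-insensitive lookup map with Collectors.toMap and then filtering for entries whose values actually differ. A simplified sketch of that diff (Column is a hypothetical stand-in for the Kylo Field class, and the non-empty-description guard is omitted for brevity):

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ColumnDiffSketch {

    // Hypothetical column type; not the Kylo Field class.
    record Column(String name, String description) {}

    // Index one side by lower-cased name, then keep only the columns whose
    // description differs from the counterpart with the same name.
    static List<Column> changedColumns(List<Column> feedCols, List<Column> hiveCols) {
        Map<String, Column> hiveByName = hiveCols.stream()
                .collect(Collectors.toMap(c -> c.name().toLowerCase(), Function.identity()));
        return feedCols.stream()
                .filter(c -> {
                    Column hive = hiveByName.get(c.name().toLowerCase());
                    return hive != null && !Objects.equals(c.description(), hive.description());
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Column> feed = List.of(new Column("ID", "primary key"), new Column("name", null));
        List<Column> hive = List.of(new Column("id", null), new Column("name", null));
        System.out.println(changedColumns(feed, hive)); // only "ID" differs
    }
}
```

Lower-casing the key on both sides is what makes the comparison tolerant of Hive's case-insensitive column names; Objects.equals handles the null-description cases without extra branching.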
Also used : DataAccessException(org.springframework.dao.DataAccessException) FeedMetadata(com.thinkbiganalytics.feedmgr.rest.model.FeedMetadata) StringUtils(org.apache.commons.lang3.StringUtils) Function(java.util.function.Function) Collectors(java.util.stream.Collectors) HiveUtils(com.thinkbiganalytics.hive.util.HiveUtils) Objects(java.util.Objects) List(java.util.List) Field(com.thinkbiganalytics.discovery.schema.Field) HiveService(com.thinkbiganalytics.hive.service.HiveService) Map(java.util.Map) TableSchema(com.thinkbiganalytics.discovery.schema.TableSchema) Optional(java.util.Optional) TableSetup(com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup) Nonnull(javax.annotation.Nonnull)

Example 3 with TableSetup

Use of com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup in project kylo by Teradata.

From the class PropertyExpressionResolverTest, method testFeedMetadataProperties:

@Test
public void testFeedMetadataProperties() {
    FeedMetadata metadata = new FeedMetadata();
    metadata.setSystemFeedName("feedSystemName");
    metadata.setCategory(new FeedCategory());
    metadata.setTable(new TableSetup());
    metadata.getTable().setSourceTableSchema(new DefaultTableSchema());
    metadata.getTable().setTableSchema(new DefaultTableSchema());
    metadata.getTable().getSourceTableSchema().setName("sourceTableName");
    metadata.getTable().getTableSchema().setName("tableSchemaName");
    final NifiProperty prop1 = createProperty("${metadata.table.sourceTableSchema.name}");
    Assert.assertTrue(resolver.resolveExpression(metadata, prop1));
    Assert.assertEquals("sourceTableName", prop1.getValue());
}
Also used : FeedCategory(com.thinkbiganalytics.feedmgr.rest.model.FeedCategory) FeedMetadata(com.thinkbiganalytics.feedmgr.rest.model.FeedMetadata) TableSetup(com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup) NifiProperty(com.thinkbiganalytics.nifi.rest.model.NifiProperty) DefaultTableSchema(com.thinkbiganalytics.discovery.model.DefaultTableSchema) Test(org.junit.Test)

Example 4 with TableSetup

Use of com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup in project kylo by Teradata.

From the class DerivedDatasourceFactory, method ensureDatasource:

public Datasource.ID ensureDatasource(TemplateProcessorDatasourceDefinition definition, FeedMetadata feedMetadata, List<NifiProperty> allProperties) {
    return metadataAccess.commit(() -> {
        List<NifiProperty> propertiesToEvaluate = new ArrayList<>();
        // fetch the datasource definition
        DatasourceDefinition datasourceDefinition = datasourceDefinitionProvider.findByProcessorType(definition.getProcessorType());
        if (datasourceDefinition != null) {
            // find out if there are any saved properties on the feed that match the datasource definition
            List<NifiProperty> feedProperties = feedMetadata.getProperties().stream().filter(property -> matchesDefinition(definition, property) && datasourceDefinition.getDatasourcePropertyKeys().contains(property.getKey())).collect(Collectors.toList());
            // resolve any ${metadata.} properties
            List<NifiProperty> resolvedFeedProperties = propertyExpressionResolver.resolvePropertyExpressions(feedProperties, feedMetadata);
            List<NifiProperty> resolvedAllProperties = propertyExpressionResolver.resolvePropertyExpressions(allProperties, feedMetadata);
            // collect the properties to evaluate into a single list
            propertiesToEvaluate.addAll(feedProperties);
            propertiesToEvaluate.addAll(allProperties);
            propertyExpressionResolver.resolveStaticProperties(propertiesToEvaluate);
            String identityString = datasourceDefinition.getIdentityString();
            String desc = datasourceDefinition.getDescription();
            String title = datasourceDefinition.getTitle();
            PropertyExpressionResolver.ResolvedVariables identityStringPropertyResolution = propertyExpressionResolver.resolveVariables(identityString, propertiesToEvaluate);
            identityString = identityStringPropertyResolution.getResolvedString();
            PropertyExpressionResolver.ResolvedVariables titlePropertyResolution = propertyExpressionResolver.resolveVariables(title, propertiesToEvaluate);
            title = titlePropertyResolution.getResolvedString();
            if (desc != null) {
                PropertyExpressionResolver.ResolvedVariables descriptionPropertyResolution = propertyExpressionResolver.resolveVariables(desc, propertiesToEvaluate);
                desc = descriptionPropertyResolution.getResolvedString();
            }
            // if the identityString still contains unresolved variables, make the title readable and replace the identity string with the feed id
            if (propertyExpressionResolver.containsVariablesPatterns(identityString)) {
                title = propertyExpressionResolver.replaceAll(title, " {runtime variable} ");
                identityString = propertyExpressionResolver.replaceAll(identityString, feedMetadata.getId());
            }
            // if it is the source, ensure the feed matches this datasource
            if (isCreateDatasource(datasourceDefinition, feedMetadata)) {
                Map<String, String> controllerServiceProperties = parseControllerServiceProperties(datasourceDefinition, feedProperties);
                Map<String, Object> properties = new HashMap<String, Object>(identityStringPropertyResolution.getResolvedVariables());
                properties.putAll(controllerServiceProperties);
                DerivedDatasource derivedDatasource = datasourceProvider.ensureDerivedDatasource(datasourceDefinition.getDatasourceType(), identityString, title, desc, properties);
                if (derivedDatasource != null) {
                    if ("HiveDatasource".equals(derivedDatasource.getDatasourceType()) && Optional.ofNullable(feedMetadata.getTable()).map(TableSetup::getTableSchema).map(TableSchema::getFields).isPresent()) {
                        derivedDatasource.setGenericProperties(Collections.singletonMap("columns", (Serializable) feedMetadata.getTable().getTableSchema().getFields()));
                    }
                    return derivedDatasource.getId();
                }
            }
            return null;
        } else {
            return null;
        }
    }, MetadataAccess.SERVICE);
}
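The resolveVariables calls above substitute ${...} placeholders in the identity string, title, and description. The Kylo PropertyExpressionResolver itself is not shown here, but the core substitution step can be sketched with java.util.regex; unknown variables are left in place, which matches the containsVariablesPatterns check performed afterwards (the class and method names below are illustrative, not the Kylo API):

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableResolveSketch {

    // Matches ${name} and captures the variable name.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replace each ${name} with its value; unknown variables are kept
    // verbatim so a later pass can detect unresolved placeholders.
    static String resolve(String template, Map<String, String> values) {
        Matcher m = VAR.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String replacement = values.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("${category}.${feed}",
                Map.of("category", "sales", "feed", "users"))); // sales.users
        // unresolved variable survives, so a pattern check can still find it
        System.out.println(resolve("${category}.${unknown}", Map.of("category", "sales")));
    }
}
```

Leaving unresolved placeholders intact is what lets the factory fall back to the feed id when runtime-only variables appear in the identity string.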
Also used : FeedDataTransformation(com.thinkbiganalytics.feedmgr.rest.model.FeedDataTransformation) DerivedDatasource(com.thinkbiganalytics.metadata.api.datasource.DerivedDatasource) LoggerFactory(org.slf4j.LoggerFactory) HashMap(java.util.HashMap) FeedMetadata(com.thinkbiganalytics.feedmgr.rest.model.FeedMetadata) StringUtils(org.apache.commons.lang3.StringUtils) DatasourceDefinitionProvider(com.thinkbiganalytics.metadata.api.datasource.DatasourceDefinitionProvider) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) Inject(javax.inject.Inject) DatasourceDefinition(com.thinkbiganalytics.metadata.api.datasource.DatasourceDefinition) TemplateProcessorDatasourceDefinition(com.thinkbiganalytics.feedmgr.rest.model.TemplateProcessorDatasourceDefinition) Map(java.util.Map) PropertyExpressionResolver(com.thinkbiganalytics.feedmgr.nifi.PropertyExpressionResolver) TableSetup(com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup) MetadataAccess(com.thinkbiganalytics.metadata.api.MetadataAccess) RegisteredTemplate(com.thinkbiganalytics.feedmgr.rest.model.RegisteredTemplate) Nonnull(javax.annotation.Nonnull) FeedManagerTemplateService(com.thinkbiganalytics.feedmgr.service.template.FeedManagerTemplateService) Datasource(com.thinkbiganalytics.metadata.api.datasource.Datasource) Logger(org.slf4j.Logger) ControllerServiceDTO(org.apache.nifi.web.api.dto.ControllerServiceDTO) NifiProperty(com.thinkbiganalytics.nifi.rest.model.NifiProperty) NifiControllerServiceProperties(com.thinkbiganalytics.feedmgr.nifi.NifiControllerServiceProperties) Collection(java.util.Collection) Set(java.util.Set) Collectors(java.util.stream.Collectors) Serializable(java.io.Serializable) RegisteredTemplateCache(com.thinkbiganalytics.feedmgr.service.template.RegisteredTemplateCache) List(java.util.List) Stream(java.util.stream.Stream) TableSchema(com.thinkbiganalytics.discovery.schema.TableSchema) Optional(java.util.Optional) DatasourceProvider(com.thinkbiganalytics.metadata.api.datasource.DatasourceProvider) Collections(java.util.Collections)

Example 5 with TableSetup

Use of com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup in project kylo by Teradata.

From the class FeedIT, method getCreateFeedRequest:

protected FeedMetadata getCreateFeedRequest(FeedCategory category, ImportTemplate template, String name) throws Exception {
    FeedMetadata feed = new FeedMetadata();
    feed.setFeedName(name);
    feed.setSystemFeedName(name.toLowerCase());
    feed.setCategory(category);
    feed.setTemplateId(template.getTemplateId());
    feed.setTemplateName(template.getTemplateName());
    feed.setDescription("Created by functional test");
    feed.setInputProcessorType("org.apache.nifi.processors.standard.GetFile");
    List<NifiProperty> properties = new ArrayList<>();
    NifiProperty fileFilter = new NifiProperty("305363d8-015a-1000-0000-000000000000", "1f67e296-2ff8-4b5d-0000-000000000000", "File Filter", USERDATA1_CSV);
    fileFilter.setProcessGroupName("NiFi Flow");
    fileFilter.setProcessorName("Filesystem");
    fileFilter.setProcessorType("org.apache.nifi.processors.standard.GetFile");
    fileFilter.setTemplateValue("mydata\\d{1,3}.csv");
    fileFilter.setInputProperty(true);
    fileFilter.setUserEditable(true);
    properties.add(fileFilter);
    NifiProperty inputDir = new NifiProperty("305363d8-015a-1000-0000-000000000000", "1f67e296-2ff8-4b5d-0000-000000000000", "Input Directory", VAR_DROPZONE);
    inputDir.setProcessGroupName("NiFi Flow");
    inputDir.setProcessorName("Filesystem");
    inputDir.setProcessorType("org.apache.nifi.processors.standard.GetFile");
    inputDir.setInputProperty(true);
    inputDir.setUserEditable(true);
    properties.add(inputDir);
    NifiProperty loadStrategy = new NifiProperty("305363d8-015a-1000-0000-000000000000", "6aeabec7-ec36-4ed5-0000-000000000000", "Load Strategy", "FULL_LOAD");
    loadStrategy.setProcessGroupName("NiFi Flow");
    loadStrategy.setProcessorName("GetTableData");
    loadStrategy.setProcessorType("com.thinkbiganalytics.nifi.v2.ingest.GetTableData");
    properties.add(loadStrategy);
    feed.setProperties(properties);
    FeedSchedule schedule = new FeedSchedule();
    schedule.setConcurrentTasks(1);
    schedule.setSchedulingPeriod("15 sec");
    schedule.setSchedulingStrategy("TIMER_DRIVEN");
    feed.setSchedule(schedule);
    TableSetup table = new TableSetup();
    DefaultTableSchema schema = new DefaultTableSchema();
    schema.setName("test1");
    List<Field> fields = new ArrayList<>();
    fields.add(newTimestampField("registration_dttm"));
    fields.add(newBigIntField("id"));
    fields.add(newStringField("first_name"));
    fields.add(newStringField("second_name"));
    fields.add(newStringField("email"));
    fields.add(newStringField("gender"));
    fields.add(newStringField("ip_address"));
    fields.add(newBinaryField("cc"));
    fields.add(newStringField("country"));
    fields.add(newStringField("birthdate"));
    fields.add(newStringField("salary"));
    schema.setFields(fields);
    table.setTableSchema(schema);
    table.setSourceTableSchema(schema);
    table.setFeedTableSchema(schema);
    table.setTargetMergeStrategy("DEDUPE_AND_MERGE");
    table.setFeedFormat("ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'\n WITH SERDEPROPERTIES ( 'separatorChar' = ',' ,'escapeChar' = '\\\\' ,'quoteChar' = '\\'') STORED AS TEXTFILE");
    table.setTargetFormat("STORED AS ORC");
    List<FieldPolicy> policies = new ArrayList<>();
    policies.add(newPolicyBuilder("registration_dttm").toPolicy());
    policies.add(newPolicyBuilder("id").toPolicy());
    policies.add(newPolicyBuilder("first_name").withStandardisation(toUpperCase).withProfile().withIndex().toPolicy());
    policies.add(newPolicyBuilder("second_name").withProfile().withIndex().toPolicy());
    policies.add(newPolicyBuilder("email").withValidation(email).toPolicy());
    policies.add(newPolicyBuilder("gender").withValidation(lookup, notNull).toPolicy());
    policies.add(newPolicyBuilder("ip_address").withValidation(ipAddress).toPolicy());
    policies.add(newPolicyBuilder("cc").withStandardisation(base64EncodeBinary).withProfile().toPolicy());
    policies.add(newPolicyBuilder("country").withStandardisation(base64EncodeBinary, base64DecodeBinary, base64EncodeString, base64DecodeString).withValidation(notNull, length).withProfile().toPolicy());
    policies.add(newPolicyBuilder("birthdate").toPolicy());
    policies.add(newPolicyBuilder("salary").toPolicy());
    table.setFieldPolicies(policies);
    List<PartitionField> partitions = new ArrayList<>();
    partitions.add(byYear("registration_dttm"));
    table.setPartitions(partitions);
    TableOptions options = new TableOptions();
    options.setCompressionFormat("SNAPPY");
    options.setAuditLogging(true);
    table.setOptions(options);
    table.setTableType("SNAPSHOT");
    feed.setTable(table);
    feed.setOptions(new FeedProcessingOptions());
    feed.getOptions().setSkipHeader(true);
    feed.setDataOwner("Marketing");
    List<Tag> tags = new ArrayList<>();
    tags.add(new DefaultTag("users"));
    tags.add(new DefaultTag("registrations"));
    feed.setTags(tags);
    User owner = new User();
    owner.setSystemName("dladmin");
    owner.setDisplayName("Data Lake Admin");
    Set<String> groups = new HashSet<>();
    groups.add("admin");
    groups.add("user");
    owner.setGroups(groups);
    feed.setOwner(owner);
    return feed;
}
Also used : FeedProcessingOptions(com.thinkbiganalytics.feedmgr.rest.model.schema.FeedProcessingOptions) User(com.thinkbiganalytics.security.rest.model.User) FieldPolicy(com.thinkbiganalytics.policy.rest.model.FieldPolicy) FeedMetadata(com.thinkbiganalytics.feedmgr.rest.model.FeedMetadata) ArrayList(java.util.ArrayList) PartitionField(com.thinkbiganalytics.feedmgr.rest.model.schema.PartitionField) Field(com.thinkbiganalytics.discovery.schema.Field) TableOptions(com.thinkbiganalytics.feedmgr.rest.model.schema.TableOptions) FeedSchedule(com.thinkbiganalytics.feedmgr.rest.model.FeedSchedule) NifiProperty(com.thinkbiganalytics.nifi.rest.model.NifiProperty) TableSetup(com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup) DefaultTableSchema(com.thinkbiganalytics.discovery.model.DefaultTableSchema) Tag(com.thinkbiganalytics.discovery.schema.Tag) DefaultTag(com.thinkbiganalytics.discovery.model.DefaultTag) HashSet(java.util.HashSet)

Aggregations

TableSetup (com.thinkbiganalytics.feedmgr.rest.model.schema.TableSetup)5 FeedMetadata (com.thinkbiganalytics.feedmgr.rest.model.FeedMetadata)4 TableSchema (com.thinkbiganalytics.discovery.schema.TableSchema)3 NifiProperty (com.thinkbiganalytics.nifi.rest.model.NifiProperty)3 Optional (java.util.Optional)3 DefaultTableSchema (com.thinkbiganalytics.discovery.model.DefaultTableSchema)2 Field (com.thinkbiganalytics.discovery.schema.Field)2 DatasourceProvider (com.thinkbiganalytics.metadata.api.datasource.DatasourceProvider)2 DerivedDatasource (com.thinkbiganalytics.metadata.api.datasource.DerivedDatasource)2 Serializable (java.io.Serializable)2 ArrayList (java.util.ArrayList)2 Collections (java.util.Collections)2 HashSet (java.util.HashSet)2 List (java.util.List)2 Map (java.util.Map)2 Collectors (java.util.stream.Collectors)2 Nonnull (javax.annotation.Nonnull)2 KyloVersion (com.thinkbiganalytics.KyloVersion)1 DefaultTag (com.thinkbiganalytics.discovery.model.DefaultTag)1 Tag (com.thinkbiganalytics.discovery.schema.Tag)1