
Example 1 with Type

use of org.apache.hadoop.hive.ql.hooks.Entity.Type in project incubator-atlas by apache.

the class HiveHook method createOrUpdateEntities.

private LinkedHashMap<Type, Referenceable> createOrUpdateEntities(HiveMetaStoreBridge dgiBridge, HiveEventContext event, Entity entity, boolean skipTempTables, Table existTable) throws AtlasHookException {
    try {
        Database db = null;
        Table table = null;
        Partition partition = null;
        LinkedHashMap<Type, Referenceable> result = new LinkedHashMap<>();
        List<Referenceable> entities = new ArrayList<>();
        switch(entity.getType()) {
            case DATABASE:
                db = entity.getDatabase();
                break;
            case TABLE:
                table = entity.getTable();
                db = dgiBridge.hiveClient.getDatabase(table.getDbName());
                break;
            case PARTITION:
                partition = entity.getPartition();
                table = partition.getTable();
                db = dgiBridge.hiveClient.getDatabase(table.getDbName());
                break;
            default:
                LOG.info("{}: entity-type not handled by Atlas hook. Ignored", entity.getType());
        }
        if (db != null) {
            db = dgiBridge.hiveClient.getDatabase(db.getName());
        }
        if (db != null) {
            Referenceable dbEntity = dgiBridge.createDBInstance(db);
            entities.add(dbEntity);
            result.put(Type.DATABASE, dbEntity);
            Referenceable tableEntity = null;
            if (table != null) {
                if (existTable != null) {
                    table = existTable;
                } else {
                    table = dgiBridge.hiveClient.getTable(table.getDbName(), table.getTableName());
                }
                // Create the table entity because its HDFS path is needed for temp-table lineage.
                if (skipTempTables && table.isTemporary() && !TableType.EXTERNAL_TABLE.equals(table.getTableType())) {
                    LOG.debug("Skipping temporary table registration {} since it is not an external table {} ", table.getTableName(), table.getTableType().name());
                } else {
                    tableEntity = dgiBridge.createTableInstance(dbEntity, table);
                    entities.add(tableEntity);
                    result.put(Type.TABLE, tableEntity);
                }
            }
            event.addMessage(new HookNotification.EntityUpdateRequest(event.getUser(), entities));
        }
        return result;
    } catch (Exception e) {
        throw new AtlasHookException("HiveHook.createOrUpdateEntities() failed.", e);
    }
}
Also used : Partition(org.apache.hadoop.hive.ql.metadata.Partition) Table(org.apache.hadoop.hive.ql.metadata.Table) ArrayList(java.util.ArrayList) AtlasHookException(org.apache.atlas.hook.AtlasHookException) HiveException(org.apache.hadoop.hive.ql.metadata.HiveException) MalformedURLException(java.net.MalformedURLException) LinkedHashMap(java.util.LinkedHashMap) Type(org.apache.hadoop.hive.ql.hooks.Entity.Type) TableType(org.apache.hadoop.hive.metastore.TableType) HookNotification(org.apache.atlas.notification.hook.HookNotification) Referenceable(org.apache.atlas.typesystem.Referenceable) Database(org.apache.hadoop.hive.metastore.api.Database)
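
The switch in createOrUpdateEntities resolves the parent objects for each entity type and returns them in an insertion-ordered map keyed by type. A minimal standalone sketch of that pattern follows. It assumes a hypothetical EntityKind enum and plain strings in place of Hive's Entity.Type and Atlas's Referenceable, and it calls neither library:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EntityResolutionSketch {
    // Hypothetical stand-in for org.apache.hadoop.hive.ql.hooks.Entity.Type.
    enum EntityKind { DATABASE, TABLE, PARTITION, DFS_DIR }

    // Returns parent-first entries: the database, then the table when one applies.
    static Map<EntityKind, String> resolve(EntityKind kind, String dbName, String tableName) {
        Map<EntityKind, String> result = new LinkedHashMap<>();
        switch (kind) {
            case DATABASE:
                result.put(EntityKind.DATABASE, dbName);
                break;
            case TABLE:
            case PARTITION:
                // A partition resolves to its table, and a table to its database.
                result.put(EntityKind.DATABASE, dbName);
                result.put(EntityKind.TABLE, dbName + "." + tableName);
                break;
            default:
                // Unhandled kinds yield an empty map.
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {DATABASE=sales, TABLE=sales.orders}
        System.out.println(resolve(EntityKind.PARTITION, "sales", "orders"));
    }
}
```

A LinkedHashMap keeps the database entry ahead of the table entry, so a caller iterating over the values sees parents before children.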

Example 2 with Type

use of org.apache.hadoop.hive.ql.hooks.Entity.Type in project incubator-atlas by apache.

the class HiveHook method renameTable.

private void renameTable(HiveMetaStoreBridge dgiBridge, HiveEventContext event) throws AtlasHookException {
    try {
        //Awkward: there is no easy way of getting the new name
        assert event.getInputs() != null && event.getInputs().size() == 1;
        assert event.getOutputs() != null && event.getOutputs().size() > 0;
        //Create the entity if it does not exist, otherwise update it
        ReadEntity oldEntity = event.getInputs().iterator().next();
        Table oldTable = oldEntity.getTable();
        for (WriteEntity writeEntity : event.getOutputs()) {
            if (writeEntity.getType() == Entity.Type.TABLE) {
                Table newTable = writeEntity.getTable();
                //Hive sends both the old and the new table names in the outputs, so the check below skips the old one
                if (!newTable.getDbName().equals(oldTable.getDbName()) || !newTable.getTableName().equals(oldTable.getTableName())) {
                    final String oldQualifiedName = HiveMetaStoreBridge.getTableQualifiedName(dgiBridge.getClusterName(), oldTable);
                    final String newQualifiedName = HiveMetaStoreBridge.getTableQualifiedName(dgiBridge.getClusterName(), newTable);
                    //Create/update old table entity - create entity with oldQFName and old tableName if it doesn't exist. If it exists, it will be updated
                    //We always use the new entity while creating the table since some flags and attributes of the table are not set in inputEntity, and Hive.getTable(oldTableName) also fails since the table doesn't exist in Hive anymore
                    final LinkedHashMap<Type, Referenceable> tables = createOrUpdateEntities(dgiBridge, event, writeEntity, true);
                    Referenceable tableEntity = tables.get(Type.TABLE);
                    //Reset regular column QF Name to old Name and create a new partial notification request to replace old column QFName to newName to retain any existing traits
                    replaceColumnQFName(event, (List<Referenceable>) tableEntity.get(HiveMetaStoreBridge.COLUMNS), oldQualifiedName, newQualifiedName);
                    //Reset partition key column QF Name to old Name and create a new partial notification request to replace old column QFName to newName to retain any existing traits
                    replaceColumnQFName(event, (List<Referenceable>) tableEntity.get(HiveMetaStoreBridge.PART_COLS), oldQualifiedName, newQualifiedName);
                    //Reset SD QF Name to old Name and create a new partial notification request to replace old SD QFName to newName to retain any existing traits
                    replaceSDQFName(event, tableEntity, oldQualifiedName, newQualifiedName);
                    //Reset Table QF Name to old Name and create a new partial notification request to replace old Table QFName to newName
                    replaceTableQFName(event, oldTable, newTable, tableEntity, oldQualifiedName, newQualifiedName);
                }
            }
        }
    } catch (Exception e) {
        throw new AtlasHookException("HiveHook.renameTable() failed.", e);
    }
}
Also used : ReadEntity(org.apache.hadoop.hive.ql.hooks.ReadEntity) Type(org.apache.hadoop.hive.ql.hooks.Entity.Type) TableType(org.apache.hadoop.hive.metastore.TableType) Table(org.apache.hadoop.hive.ql.metadata.Table) Referenceable(org.apache.atlas.typesystem.Referenceable) WriteEntity(org.apache.hadoop.hive.ql.hooks.WriteEntity) AtlasHookException(org.apache.atlas.hook.AtlasHookException) HiveException(org.apache.hadoop.hive.ql.metadata.HiveException) MalformedURLException(java.net.MalformedURLException)
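
The check inside renameTable exists because the outputs of a rename event carry both the old and the new table, and only the new one should be processed. A minimal standalone sketch of that filter follows. It assumes a hypothetical TableId class holding just a database and a table name, not Hive's Table API:

```java
import java.util.ArrayList;
import java.util.List;

class RenameFilterSketch {
    // Hypothetical minimal table identity; a stand-in for Hive's Table, not its API.
    static final class TableId {
        final String db;
        final String name;

        TableId(String db, String name) {
            this.db = db;
            this.name = name;
        }

        @Override
        public String toString() {
            return db + "." + name;
        }
    }

    // Keeps only outputs whose database or table name differs from the old table.
    static List<TableId> renamedTargets(TableId oldTable, List<TableId> outputs) {
        List<TableId> targets = new ArrayList<>();
        for (TableId candidate : outputs) {
            boolean sameTable = candidate.db.equals(oldTable.db)
                    && candidate.name.equals(oldTable.name);
            if (!sameTable) {
                targets.add(candidate);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        TableId oldTable = new TableId("sales", "orders");
        List<TableId> outputs = new ArrayList<>();
        outputs.add(oldTable);
        outputs.add(new TableId("sales", "orders_v2"));
        // prints [sales.orders_v2]
        System.out.println(renamedTargets(oldTable, outputs));
    }
}
```

Comparing both the database and the table name means a move to another database under the same table name still counts as a rename.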

Example 3 with Type

use of org.apache.hadoop.hive.ql.hooks.Entity.Type in project incubator-atlas by apache.

the class HiveHook method processHiveEntity.

private <T extends Entity> void processHiveEntity(HiveMetaStoreBridge dgiBridge, HiveEventContext event, T entity, Set<String> dataSetsProcessed, SortedMap<T, Referenceable> dataSets, Set<Referenceable> entities) throws AtlasHookException {
    try {
        if (entity.getType() == Type.TABLE || entity.getType() == Type.PARTITION) {
            final String tblQFName = HiveMetaStoreBridge.getTableQualifiedName(dgiBridge.getClusterName(), entity.getTable());
            if (!dataSetsProcessed.contains(tblQFName)) {
                LinkedHashMap<Type, Referenceable> result = createOrUpdateEntities(dgiBridge, event, entity, false);
                dataSets.put(entity, result.get(Type.TABLE));
                dataSetsProcessed.add(tblQFName);
                entities.addAll(result.values());
            }
        } else if (entity.getType() == Type.DFS_DIR) {
            URI location = entity.getLocation();
            if (location != null) {
                final String pathUri = lower(new Path(location).toString());
                LOG.debug("Registering DFS Path {} ", pathUri);
                if (!dataSetsProcessed.contains(pathUri)) {
                    Referenceable hdfsPath = dgiBridge.fillHDFSDataSet(pathUri);
                    dataSets.put(entity, hdfsPath);
                    dataSetsProcessed.add(pathUri);
                    entities.add(hdfsPath);
                }
            }
        }
    } catch (Exception e) {
        throw new AtlasHookException("HiveHook.processHiveEntity() failed.", e);
    }
}
Also used : Path(org.apache.hadoop.fs.Path) Type(org.apache.hadoop.hive.ql.hooks.Entity.Type) TableType(org.apache.hadoop.hive.metastore.TableType) Referenceable(org.apache.atlas.typesystem.Referenceable) URI(java.net.URI) AtlasHookException(org.apache.atlas.hook.AtlasHookException) HiveException(org.apache.hadoop.hive.ql.metadata.HiveException) MalformedURLException(java.net.MalformedURLException)

Example 4 with Type

use of org.apache.hadoop.hive.ql.hooks.Entity.Type in project incubator-atlas by apache.

the class HiveHook method collect.

private void collect(HiveEventContext event) throws Exception {
    assert event.getHookType() == HookContext.HookType.POST_EXEC_HOOK : "Non-POST_EXEC_HOOK not supported!";
    LOG.info("Entered Atlas hook for hook type {}, operation {} , user {} as {}", event.getHookType(), event.getOperation(), event.getUgi().getRealUser(), event.getUgi().getShortUserName());
    HiveMetaStoreBridge dgiBridge = new HiveMetaStoreBridge(atlasProperties, hiveConf);
    switch(event.getOperation()) {
        case CREATEDATABASE:
            handleEventOutputs(dgiBridge, event, Type.DATABASE);
            break;
        case CREATETABLE:
            LinkedHashMap<Type, Referenceable> tablesCreated = handleEventOutputs(dgiBridge, event, Type.TABLE);
            if (tablesCreated != null && tablesCreated.size() > 0) {
                handleExternalTables(dgiBridge, event, tablesCreated);
            }
            break;
        case CREATETABLE_AS_SELECT:
        case CREATEVIEW:
        case ALTERVIEW_AS:
        case LOAD:
        case EXPORT:
        case IMPORT:
        case QUERY:
        case TRUNCATETABLE:
            registerProcess(dgiBridge, event);
            break;
        case ALTERTABLE_RENAME:
        case ALTERVIEW_RENAME:
            renameTable(dgiBridge, event);
            break;
        case ALTERTABLE_FILEFORMAT:
        case ALTERTABLE_CLUSTER_SORT:
        case ALTERTABLE_BUCKETNUM:
        case ALTERTABLE_PROPERTIES:
        case ALTERVIEW_PROPERTIES:
        case ALTERTABLE_SERDEPROPERTIES:
        case ALTERTABLE_SERIALIZER:
        case ALTERTABLE_ADDCOLS:
        case ALTERTABLE_REPLACECOLS:
        case ALTERTABLE_PARTCOLTYPE:
            handleEventOutputs(dgiBridge, event, Type.TABLE);
            break;
        case ALTERTABLE_RENAMECOL:
            renameColumn(dgiBridge, event);
            break;
        case ALTERTABLE_LOCATION:
            LinkedHashMap<Type, Referenceable> tablesUpdated = handleEventOutputs(dgiBridge, event, Type.TABLE);
            if (tablesUpdated != null && tablesUpdated.size() > 0) {
                //Track altered lineage in case of external tables
                handleExternalTables(dgiBridge, event, tablesUpdated);
            }
            break;
        case ALTERDATABASE:
        case ALTERDATABASE_OWNER:
            handleEventOutputs(dgiBridge, event, Type.DATABASE);
            break;
        case DROPTABLE:
        case DROPVIEW:
            deleteTable(dgiBridge, event);
            break;
        case DROPDATABASE:
            deleteDatabase(dgiBridge, event);
            break;
        default:
    }
}
Also used : HiveMetaStoreBridge(org.apache.atlas.hive.bridge.HiveMetaStoreBridge) Type(org.apache.hadoop.hive.ql.hooks.Entity.Type) TableType(org.apache.hadoop.hive.metastore.TableType) Referenceable(org.apache.atlas.typesystem.Referenceable)

Aggregations

Referenceable (org.apache.atlas.typesystem.Referenceable) 4
TableType (org.apache.hadoop.hive.metastore.TableType) 4
Type (org.apache.hadoop.hive.ql.hooks.Entity.Type) 4
MalformedURLException (java.net.MalformedURLException) 3
AtlasHookException (org.apache.atlas.hook.AtlasHookException) 3
HiveException (org.apache.hadoop.hive.ql.metadata.HiveException) 3
Table (org.apache.hadoop.hive.ql.metadata.Table) 2
URI (java.net.URI) 1
ArrayList (java.util.ArrayList) 1
LinkedHashMap (java.util.LinkedHashMap) 1
HiveMetaStoreBridge (org.apache.atlas.hive.bridge.HiveMetaStoreBridge) 1
HookNotification (org.apache.atlas.notification.hook.HookNotification) 1
Path (org.apache.hadoop.fs.Path) 1
Database (org.apache.hadoop.hive.metastore.api.Database) 1
ReadEntity (org.apache.hadoop.hive.ql.hooks.ReadEntity) 1
WriteEntity (org.apache.hadoop.hive.ql.hooks.WriteEntity) 1
Partition (org.apache.hadoop.hive.ql.metadata.Partition) 1