Example 1 with FileStatus

use of org.apache.hadoop.fs.FileStatus in project hive by apache.

the class ReplChangeManager method recycle.

/**
   * Move a path into cmroot. If the path is a directory (of a partition, or of a
   *   table if nonpartitioned), recursively move the files inside the directory to
   *   cmroot. Note that the table must be a managed table.
   * @param path a single file or directory
   * @param ifPurge whether the file should skip the Trash when deleted
   * @return the number of files moved into cmroot
   * @throws MetaException
   */
public int recycle(Path path, boolean ifPurge) throws MetaException {
    if (!enabled) {
        return 0;
    }
    try {
        int count = 0;
        if (fs.isDirectory(path)) {
            FileStatus[] files = fs.listStatus(path, hiddenFileFilter);
            for (FileStatus file : files) {
                count += recycle(file.getPath(), ifPurge);
            }
        } else {
            Path cmPath = getCMPath(path, hiveConf, getChksumString(path, fs));
            if (LOG.isDebugEnabled()) {
                LOG.debug("Moving " + path.toString() + " to " + cmPath.toString());
            }
            // set the timestamp before moving to cmroot, so we can avoid a
            // race condition where CM removes the file before the timestamp
            // is set
            long now = System.currentTimeMillis();
            fs.setTimes(path, now, now);
            boolean succ = fs.rename(path, cmPath);
            // We might want to setXAttr for the new location in the future
            if (!succ) {
                if (LOG.isDebugEnabled()) {
                    LOG.debug("A file with the same content as " + path.toString() + " already exists, ignoring");
                }
                // Need to extend the tenancy if we saw a newer file with the same content
                fs.setTimes(cmPath, now, now);
            } else {
                // set the file owner to hive (or the id metastore run as)
                fs.setOwner(cmPath, msUser, msGroup);
                // locations if orig-loc becomes important
                try {
                    fs.setXAttr(cmPath, ORIG_LOC_TAG, path.toString().getBytes());
                } catch (UnsupportedOperationException e) {
                    LOG.warn("Error setting xattr for " + path.toString());
                }
                count++;
            }
            // if not purging, tag the file so its claim to remain in Trash is granted
            if (!ifPurge) {
                try {
                    fs.setXAttr(cmPath, REMAIN_IN_TRASH_TAG, new byte[] { 0 });
                } catch (UnsupportedOperationException e) {
                    LOG.warn("Error setting xattr for " + cmPath.toString());
                }
            }
        }
        return count;
    } catch (IOException e) {
        throw new MetaException(StringUtils.stringifyException(e));
    }
}
Also used : Path(org.apache.hadoop.fs.Path) FileStatus(org.apache.hadoop.fs.FileStatus) IOException(java.io.IOException) MetaException(org.apache.hadoop.hive.metastore.api.MetaException)
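The recursion in recycle() (recurse into directories, act on each leaf file) is a common FileStatus idiom. It can be sketched with plain java.nio.file and no Hadoop dependency; here countFiles stands in for the recursive call and a dot-prefix check stands in for hiddenFileFilter (both the class and method names below are illustrative, not Hive APIs):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class RecycleSketch {
    // Count regular, non-hidden files under a path, recursing into
    // directories the same way recycle() walks its listStatus() results.
    public static int countFiles(Path path) throws IOException {
        if (Files.isDirectory(path)) {
            int count = 0;
            try (DirectoryStream<Path> children = Files.newDirectoryStream(path)) {
                for (Path child : children) {
                    // stand-in for hiddenFileFilter: skip dot-prefixed names
                    if (!child.getFileName().toString().startsWith(".")) {
                        count += countFiles(child);
                    }
                }
            }
            return count;
        }
        return 1; // a single file contributes one to the count
    }
}
```

As in recycle(), the per-file work (here, counting) lives in the non-directory branch, and the directory branch only aggregates the recursive results.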

Example 2 with FileStatus

use of org.apache.hadoop.fs.FileStatus in project hadoop by apache.

the class InMemorySCMStore method getInitialCachedResources.

@VisibleForTesting
Map<String, String> getInitialCachedResources(FileSystem fs, Configuration conf) throws IOException {
    // get the root directory for the shared cache
    String location = conf.get(YarnConfiguration.SHARED_CACHE_ROOT, YarnConfiguration.DEFAULT_SHARED_CACHE_ROOT);
    Path root = new Path(location);
    try {
        fs.getFileStatus(root);
    } catch (FileNotFoundException e) {
        String message = "The shared cache root directory " + location + " was not found";
        LOG.error(message);
        throw (IOException) new FileNotFoundException(message).initCause(e);
    }
    int nestedLevel = SharedCacheUtil.getCacheDepth(conf);
    // now traverse individual directories and process them
    // the directory structure is specified by the nested level parameter
    // (e.g. 9/c/d/<checksum>/file)
    String pattern = SharedCacheUtil.getCacheEntryGlobPattern(nestedLevel + 1);
    LOG.info("Querying for all individual cached resource files");
    FileStatus[] entries = fs.globStatus(new Path(root, pattern));
    int numEntries = entries == null ? 0 : entries.length;
    LOG.info("Found " + numEntries + " files: processing for one resource per " + "key");
    Map<String, String> initialCachedEntries = new HashMap<String, String>();
    if (entries != null) {
        for (FileStatus entry : entries) {
            Path file = entry.getPath();
            String fileName = file.getName();
            if (entry.isFile()) {
                // get the parent to get the checksum
                Path parent = file.getParent();
                if (parent != null) {
                    // the name of the immediate parent directory is the checksum
                    String key = parent.getName();
                    // keep the first file seen for each key
                    if (initialCachedEntries.containsKey(key)) {
                        LOG.warn("Key " + key + " is already mapped to file " + initialCachedEntries.get(key) + "; file " + fileName + " will not be added");
                    } else {
                        initialCachedEntries.put(key, fileName);
                    }
                }
            }
        }
    }
    LOG.info("A total of " + initialCachedEntries.size() + " files are now mapped");
    return initialCachedEntries;
}
Also used : Path(org.apache.hadoop.fs.Path) FileStatus(org.apache.hadoop.fs.FileStatus) HashMap(java.util.HashMap) ConcurrentHashMap(java.util.concurrent.ConcurrentHashMap) FileNotFoundException(java.io.FileNotFoundException) VisibleForTesting(com.google.common.annotations.VisibleForTesting)
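The key-extraction step in getInitialCachedResources (the checksum is the name of the file's immediate parent directory, and the first file seen per checksum wins) can be sketched with plain Java paths; the class and method names below are illustrative, not part of the YARN API:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public class CacheKeySketch {
    // Map each entry's checksum (its immediate parent directory name) to
    // the first file seen under it, as getInitialCachedResources does.
    public static Map<String, String> keyByParentDir(Iterable<String> paths) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String p : paths) {
            Path file = Paths.get(p);
            Path parent = file.getParent();
            if (parent != null) {
                // putIfAbsent keeps the first file per key, equivalent to
                // the containsKey check in the original
                result.putIfAbsent(parent.getFileName().toString(),
                                   file.getFileName().toString());
            }
        }
        return result;
    }
}
```

putIfAbsent collapses the containsKey/put pair in the original into one call; the warning for duplicate keys is omitted here for brevity.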

Example 3 with FileStatus

use of org.apache.hadoop.fs.FileStatus in project hadoop by apache.

the class TestCleanerTask method testProcessEvictableResource.

@Test
public void testProcessEvictableResource() throws Exception {
    FileSystem fs = mock(FileSystem.class);
    CleanerMetrics metrics = mock(CleanerMetrics.class);
    SCMStore store = mock(SCMStore.class);
    CleanerTask task = createSpiedTask(fs, store, metrics, new ReentrantLock());
    // mock an evictable resource
    when(store.isResourceEvictable(isA(String.class), isA(FileStatus.class))).thenReturn(true);
    FileStatus status = mock(FileStatus.class);
    when(status.getPath()).thenReturn(new Path(ROOT + "/a/b/c/abc"));
    when(store.removeResource(isA(String.class))).thenReturn(true);
    // rename succeeds
    when(fs.rename(isA(Path.class), isA(Path.class))).thenReturn(true);
    // delete returns true
    when(fs.delete(isA(Path.class), anyBoolean())).thenReturn(true);
    // process the resource
    task.processSingleResource(status);
    // the directory should be renamed
    verify(fs).rename(eq(status.getPath()), isA(Path.class));
    // metrics should record a deleted file
    verify(metrics).reportAFileDelete();
    verify(metrics, never()).reportAFileProcess();
}
Also used : ReentrantLock(java.util.concurrent.locks.ReentrantLock) Path(org.apache.hadoop.fs.Path) FileStatus(org.apache.hadoop.fs.FileStatus) SCMStore(org.apache.hadoop.yarn.server.sharedcachemanager.store.SCMStore) FileSystem(org.apache.hadoop.fs.FileSystem) CleanerMetrics(org.apache.hadoop.yarn.server.sharedcachemanager.metrics.CleanerMetrics) Test(org.junit.Test)

Example 4 with FileStatus

use of org.apache.hadoop.fs.FileStatus in project hadoop by apache.

the class TestCleanerTask method testProcessFreshResource.

@Test
public void testProcessFreshResource() throws Exception {
    FileSystem fs = mock(FileSystem.class);
    CleanerMetrics metrics = mock(CleanerMetrics.class);
    SCMStore store = mock(SCMStore.class);
    CleanerTask task = createSpiedTask(fs, store, metrics, new ReentrantLock());
    // mock a resource that is not evictable
    when(store.isResourceEvictable(isA(String.class), isA(FileStatus.class))).thenReturn(false);
    FileStatus status = mock(FileStatus.class);
    when(status.getPath()).thenReturn(new Path(ROOT + "/a/b/c/abc"));
    // process the resource
    task.processSingleResource(status);
    // the directory should not be renamed
    verify(fs, never()).rename(eq(status.getPath()), isA(Path.class));
    // metrics should record a processed file (but not delete)
    verify(metrics).reportAFileProcess();
    verify(metrics, never()).reportAFileDelete();
}
Also used : ReentrantLock(java.util.concurrent.locks.ReentrantLock) Path(org.apache.hadoop.fs.Path) FileStatus(org.apache.hadoop.fs.FileStatus) SCMStore(org.apache.hadoop.yarn.server.sharedcachemanager.store.SCMStore) FileSystem(org.apache.hadoop.fs.FileSystem) CleanerMetrics(org.apache.hadoop.yarn.server.sharedcachemanager.metrics.CleanerMetrics) Test(org.junit.Test)

Example 5 with FileStatus

use of org.apache.hadoop.fs.FileStatus in project hbase by apache.

the class DumpReplicationQueues method getTotalWALSize.

/**
   * Return the total size in bytes of a list of WALs.
   */
private long getTotalWALSize(FileSystem fs, List<String> wals, String server) throws IOException {
    long size = 0; // use long: summing getLen() into an int overflows past ~2 GiB
    FileStatus fileStatus;
    for (String wal : wals) {
        try {
            fileStatus = (new WALLink(getConf(), server, wal)).getFileStatus(fs);
        } catch (IOException e) {
            if (e instanceof FileNotFoundException) {
                numWalsNotFound++;
                LOG.warn("WAL " + wal + " couldn't be found, skipping", e);
            } else {
                LOG.warn("Can't get file status of WAL " + wal + ", skipping", e);
            }
            continue;
        }
        size += fileStatus.getLen();
    }
    totalSizeOfWALs += size;
    return size;
}
Also used : WALLink(org.apache.hadoop.hbase.io.WALLink) FileStatus(org.apache.hadoop.fs.FileStatus) FileNotFoundException(java.io.FileNotFoundException) IOException(java.io.IOException)
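One subtlety in getTotalWALSize is the accumulator type: FileStatus.getLen() returns a long, and summing it into an int silently truncates once the WALs exceed ~2 GiB. A minimal standalone sketch of the safe accumulation (the class name is illustrative):

```java
public class WalSizeSketch {
    // Sum file lengths into a long accumulator; an int accumulator would
    // silently overflow once the total exceeds Integer.MAX_VALUE bytes.
    public static long totalSize(long[] lengths) {
        long size = 0;
        for (long len : lengths) {
            size += len;
        }
        return size;
    }
}
```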

Aggregations

FileStatus (org.apache.hadoop.fs.FileStatus): 2175
Path (org.apache.hadoop.fs.Path): 1734
FileSystem (org.apache.hadoop.fs.FileSystem): 822
Test (org.junit.Test): 634
IOException (java.io.IOException): 564
ArrayList (java.util.ArrayList): 358
Configuration (org.apache.hadoop.conf.Configuration): 306
FileNotFoundException (java.io.FileNotFoundException): 214
LocatedFileStatus (org.apache.hadoop.fs.LocatedFileStatus): 199
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 118
FsPermission (org.apache.hadoop.fs.permission.FsPermission): 118
FSDataOutputStream (org.apache.hadoop.fs.FSDataOutputStream): 110
HashMap (java.util.HashMap): 108
PathFilter (org.apache.hadoop.fs.PathFilter): 101
List (java.util.List): 88
URI (java.net.URI): 82
HashSet (java.util.HashSet): 77
File (java.io.File): 73
Map (java.util.Map): 61
DistributedFileSystem (org.apache.hadoop.hdfs.DistributedFileSystem): 59