
Example 1 with FileFilter

Use of com.baidu.hugegraph.loader.source.file.FileFilter in project hugegraph-computer by hugegraph.

From the class LoaderFileInputSplitFetcher, method scanHdfsPaths:

private List<String> scanHdfsPaths(HDFSSource hdfsSource) {
    List<String> paths = new ArrayList<>();
    try {
        Configuration configuration = this.loadConfiguration(hdfsSource);
        this.enableKerberos(hdfsSource, configuration);
        try (FileSystem hdfs = FileSystem.get(configuration)) {
            Path path = new Path(hdfsSource.path());
            FileFilter filter = hdfsSource.filter();
            if (hdfs.getFileStatus(path).isFile()) {
                if (!filter.reserved(path.getName())) {
                    throw new ComputerException("Please check path name and extensions, ensure " + "that at least one path is available for reading");
                }
                paths.add(path.toString());
            } else {
                assert hdfs.getFileStatus(path).isDirectory();
                FileStatus[] statuses = hdfs.listStatus(path);
                Path[] subPaths = FileUtil.stat2Paths(statuses);
                for (Path subPath : subPaths) {
                    if (filter.reserved(subPath.getName())) {
                        paths.add(subPath.toString());
                    }
                }
            }
        }
    } catch (Throwable throwable) {
        throw new ComputerException("Failed to scanPaths", throwable);
    }
    return paths;
}
Also used: Path (org.apache.hadoop.fs.Path), FileStatus (org.apache.hadoop.fs.FileStatus), Configuration (org.apache.hadoop.conf.Configuration), FileSystem (org.apache.hadoop.fs.FileSystem), ArrayList (java.util.ArrayList), FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter), ComputerException (com.baidu.hugegraph.computer.core.common.exception.ComputerException)
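In the example above, filter.reserved(name) decides whether a file name should be kept for reading. A minimal standalone sketch of that kind of extension-based check (the real FileFilter lives in hugegraph-loader; the class and wildcard convention below are assumptions for illustration, not the project's API):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an extension-based filter: a name is kept
// ("reserved") when its extension matches one of the configured
// extensions, or when the configuration is the wildcard "*".
public class ExtensionFilter {

    private final List<String> extensions;

    public ExtensionFilter(String... extensions) {
        this.extensions = Arrays.asList(extensions);
    }

    public boolean reserved(String name) {
        if (this.extensions.size() == 1 &&
            this.extensions.get(0).equals("*")) {
            return true;
        }
        for (String extension : this.extensions) {
            if (name.endsWith("." + extension)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ExtensionFilter filter = new ExtensionFilter("csv", "json");
        System.out.println(filter.reserved("vertices.csv")); // true
        System.out.println(filter.reserved("README.md"));    // false
    }
}
```

Here reserved() plays the role of the filter.reserved(path.getName()) calls in the scan methods: names matching no configured extension are skipped, and a single-file source that fails the check aborts the scan.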

Example 2 with FileFilter

Use of com.baidu.hugegraph.loader.source.file.FileFilter in project hugegraph-computer by hugegraph.

From the class LoaderFileInputSplitFetcher, method scanLocalPaths:

private List<String> scanLocalPaths(FileSource source) {
    List<String> paths = new ArrayList<>();
    File file = FileUtils.getFile(source.path());
    FileFilter filter = source.filter();
    if (file.isFile()) {
        if (!filter.reserved(file.getName())) {
            throw new LoadException("Please check file name and extensions, ensure " + "that at least one file is available for reading");
        }
        paths.add(file.getAbsolutePath());
    } else {
        assert file.isDirectory();
        File[] subFiles = file.listFiles();
        if (subFiles == null) {
            throw new LoadException("Error while listing the files of " + "path '%s'", file);
        }
        for (File subFile : subFiles) {
            if (filter.reserved(subFile.getName())) {
                paths.add(subFile.getAbsolutePath());
            }
        }
    }
    return paths;
}
Also used: ArrayList (java.util.ArrayList), FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter), File (java.io.File), LoadException (com.baidu.hugegraph.loader.exception.LoadException)

Example 3 with FileFilter

Use of com.baidu.hugegraph.loader.source.file.FileFilter in project incubator-hugegraph-toolchain by apache.

From the class HDFSFileReader, method scanReadables:

@Override
protected List<Readable> scanReadables() throws IOException {
    Path path = new Path(this.source().path());
    FileFilter filter = this.source().filter();
    List<Readable> paths = new ArrayList<>();
    if (this.hdfs.isFile(path)) {
        if (!filter.reserved(path.getName())) {
            throw new LoadException("Please check path name and extensions, ensure " + "that at least one path is available for reading");
        }
        paths.add(new HDFSFile(this.hdfs, path));
    } else {
        assert this.hdfs.isDirectory(path);
        FileStatus[] statuses = this.hdfs.listStatus(path);
        Path[] subPaths = FileUtil.stat2Paths(statuses);
        for (Path subPath : subPaths) {
            if (filter.reserved(subPath.getName())) {
                paths.add(new HDFSFile(this.hdfs, subPath));
            }
        }
    }
    return paths;
}
Also used: Path (org.apache.hadoop.fs.Path), FileStatus (org.apache.hadoop.fs.FileStatus), ArrayList (java.util.ArrayList), Readable (com.baidu.hugegraph.loader.reader.Readable), FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter), LoadException (com.baidu.hugegraph.loader.exception.LoadException)

Example 4 with FileFilter

Use of com.baidu.hugegraph.loader.source.file.FileFilter in project incubator-hugegraph-toolchain by apache.

From the class LocalFileReader, method scanReadables:

@Override
protected List<Readable> scanReadables() {
    File file = FileUtils.getFile(this.source().path());
    checkExistAndReadable(file);
    FileFilter filter = this.source().filter();
    List<Readable> files = new ArrayList<>();
    if (file.isFile()) {
        if (!filter.reserved(file.getName())) {
            throw new LoadException("Please check file name and extensions, ensure " + "that at least one file is available for reading");
        }
        files.add(new LocalFile(file));
    } else {
        assert file.isDirectory();
        File[] subFiles = file.listFiles();
        if (subFiles == null) {
            throw new LoadException("Error while listing the files of " + "path '%s'", file);
        }
        for (File subFile : subFiles) {
            if (filter.reserved(subFile.getName())) {
                files.add(new LocalFile(subFile));
            }
        }
    }
    return files;
}
Also used: ArrayList (java.util.ArrayList), Readable (com.baidu.hugegraph.loader.reader.Readable), FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter), File (java.io.File), LoadException (com.baidu.hugegraph.loader.exception.LoadException)

Aggregations

- FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter): 4 uses
- ArrayList (java.util.ArrayList): 4 uses
- LoadException (com.baidu.hugegraph.loader.exception.LoadException): 3 uses
- Readable (com.baidu.hugegraph.loader.reader.Readable): 2 uses
- File (java.io.File): 2 uses
- FileStatus (org.apache.hadoop.fs.FileStatus): 2 uses
- Path (org.apache.hadoop.fs.Path): 2 uses
- ComputerException (com.baidu.hugegraph.computer.core.common.exception.ComputerException): 1 use
- Configuration (org.apache.hadoop.conf.Configuration): 1 use
- FileSystem (org.apache.hadoop.fs.FileSystem): 1 use
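All four examples follow the same scan pattern: when the configured path is a single file, it must pass the filter or scanning fails; when it is a directory, every direct child that passes the filter is kept. A self-contained sketch of that pattern on the local filesystem, with a plain Predicate<String> standing in for FileFilter.reserved (the class name and messages here are illustrative, not the project's API):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch of the file-vs-directory scan shared by all
// four examples. A single file must pass the filter; a directory
// contributes each direct child that passes.
public class ScanSketch {

    public static List<String> scan(File root, Predicate<String> reserved) {
        List<String> paths = new ArrayList<>();
        if (root.isFile()) {
            if (!reserved.test(root.getName())) {
                throw new IllegalArgumentException(
                      "No file available for reading under " + root);
            }
            paths.add(root.getAbsolutePath());
        } else if (root.isDirectory()) {
            File[] subFiles = root.listFiles();
            if (subFiles == null) {
                // listFiles() returns null on I/O error, mirroring the
                // explicit null check in scanLocalPaths/scanReadables
                throw new IllegalStateException(
                      "Error while listing the files of path " + root);
            }
            for (File subFile : subFiles) {
                if (reserved.test(subFile.getName())) {
                    paths.add(subFile.getAbsolutePath());
                }
            }
        }
        return paths;
    }
}
```

The HDFS variants differ only in the traversal API (FileSystem.getFileStatus, listStatus, and FileUtil.stat2Paths instead of java.io.File) and in the exception type thrown; the filter logic is identical.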