
Example 1 with Readable

use of com.baidu.hugegraph.loader.reader.Readable in project incubator-hugegraph-toolchain by apache.

In the class FileReader, method init:

@Override
public void init(LoadContext context, InputStruct struct) throws InitException {
    this.progress(context, struct);
    List<Readable> readableList;
    try {
        readableList = this.scanReadables();
        // Sort readable files by name
        readableList.sort(Comparator.comparing(Readable::name));
    } catch (IOException e) {
        throw new InitException("Failed to scan readable files for '%s'", e, this.source);
    }
    this.readables = readableList.iterator();
    this.fetcher = this.createLineFetcher();
    this.fetcher.readHeaderIfNeeded(readableList);
}
Also used: InitException (com.baidu.hugegraph.loader.exception.InitException), Readable (com.baidu.hugegraph.loader.reader.Readable), IOException (java.io.IOException)
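
The sort step in init can be tried in isolation. The following is a minimal sketch, not the loader's own code: NamedSource is a hypothetical stand-in for Readable, assumed only to expose a name() method, as the method reference Readable::name in the snippet above implies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortByNameSketch {

    // Hypothetical stand-in for Readable; only name() is assumed
    interface NamedSource {
        String name();
    }

    // Returns a new list sorted by name, leaving the input list untouched
    static List<NamedSource> sortedByName(List<NamedSource> sources) {
        List<NamedSource> copy = new ArrayList<>(sources);
        copy.sort(Comparator.comparing(NamedSource::name));
        return copy;
    }

    public static void main(String[] args) {
        List<NamedSource> sources = new ArrayList<>();
        sources.add(() -> "part-2.csv");
        sources.add(() -> "part-10.csv");
        sources.add(() -> "part-1.csv");
        for (NamedSource source : sortedByName(sources)) {
            System.out.println(source.name());
        }
    }
}
```

Comparing names as strings gives lexicographic order, so this prints part-1.csv, part-10.csv, part-2.csv. If numeric order matters for the input files, zero-padded names are one way to get it.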

Example 2 with Readable

use of com.baidu.hugegraph.loader.reader.Readable in project incubator-hugegraph-toolchain by apache.

In the class OrcFileLineFetcher, method readHeader:

@Override
public String[] readHeader(List<Readable> readables) {
    Readable readable = readables.get(0);
    this.openReader(readable);
    StructObjectInspector inspector;
    try {
        inspector = (StructObjectInspector) this.reader.getObjectInspector();
        return this.parseHeader(inspector);
    } finally {
        try {
            this.closeReader();
        } catch (IOException e) {
            // Pass the exception as the last argument so its stack trace is logged
            LOG.warn("Failed to close reader of '{}'", readable, e);
        }
    }
}
Also used: Readable (com.baidu.hugegraph.loader.reader.Readable), IOException (java.io.IOException), StructObjectInspector (org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector)

Example 3 with Readable

use of com.baidu.hugegraph.loader.reader.Readable in project incubator-hugegraph-toolchain by apache.

In the class HDFSFileReader, method scanReadables:

@Override
protected List<Readable> scanReadables() throws IOException {
    Path path = new Path(this.source().path());
    FileFilter filter = this.source().filter();
    List<Readable> paths = new ArrayList<>();
    if (this.hdfs.isFile(path)) {
        if (!filter.reserved(path.getName())) {
            throw new LoadException("Please check path name and extensions, ensure that at least one path is available for reading");
        }
        paths.add(new HDFSFile(this.hdfs, path));
    } else {
        assert this.hdfs.isDirectory(path);
        FileStatus[] statuses = this.hdfs.listStatus(path);
        Path[] subPaths = FileUtil.stat2Paths(statuses);
        for (Path subPath : subPaths) {
            if (filter.reserved(subPath.getName())) {
                paths.add(new HDFSFile(this.hdfs, subPath));
            }
        }
    }
    return paths;
}
Also used: Path (org.apache.hadoop.fs.Path), FileStatus (org.apache.hadoop.fs.FileStatus), ArrayList (java.util.ArrayList), Readable (com.baidu.hugegraph.loader.reader.Readable), FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter), LoadException (com.baidu.hugegraph.loader.exception.LoadException)

Example 4 with Readable

use of com.baidu.hugegraph.loader.reader.Readable in project incubator-hugegraph-toolchain by apache.

In the class LocalFileReader, method scanReadables:

@Override
protected List<Readable> scanReadables() {
    File file = FileUtils.getFile(this.source().path());
    checkExistAndReadable(file);
    FileFilter filter = this.source().filter();
    List<Readable> files = new ArrayList<>();
    if (file.isFile()) {
        if (!filter.reserved(file.getName())) {
            throw new LoadException("Please check file name and extensions, ensure that at least one file is available for reading");
        }
        files.add(new LocalFile(file));
    } else {
        assert file.isDirectory();
        File[] subFiles = file.listFiles();
        if (subFiles == null) {
            throw new LoadException("Error while listing the files of path '%s'", file);
        }
        for (File subFile : subFiles) {
            if (filter.reserved(subFile.getName())) {
                files.add(new LocalFile(subFile));
            }
        }
    }
    return files;
}
Also used: ArrayList (java.util.ArrayList), Readable (com.baidu.hugegraph.loader.reader.Readable), FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter), File (java.io.File), LoadException (com.baidu.hugegraph.loader.exception.LoadException)
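
The file-or-directory branching shared by the two scanReadables methods can be reproduced with the JDK alone. This is a rough sketch under stated assumptions: a Predicate<String> on the file name stands in for FileFilter.reserved, and IllegalStateException stands in for LoadException. Neither substitution reflects the loader's real types.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class LocalScanSketch {

    // Collects the path itself if it is a regular file accepted by the filter,
    // otherwise the accepted direct children of the directory (no recursion)
    static List<File> scan(File root, Predicate<String> nameFilter) {
        List<File> files = new ArrayList<>();
        if (root.isFile()) {
            if (!nameFilter.test(root.getName())) {
                throw new IllegalStateException("File rejected by filter: " + root);
            }
            files.add(root);
            return files;
        }
        // listFiles() returns null if root is not a directory or an I/O error occurs
        File[] children = root.listFiles();
        if (children == null) {
            throw new IllegalStateException("Cannot list files of: " + root);
        }
        for (File child : children) {
            if (nameFilter.test(child.getName())) {
                files.add(child);
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("scan-sketch").toFile();
        new File(dir, "a.csv").createNewFile();
        new File(dir, "b.txt").createNewFile();
        List<File> found = scan(dir, name -> name.endsWith(".csv"));
        System.out.println(found.size() + " file(s) accepted");
    }
}
```

As in the original methods, a single file that fails the filter is treated as an error, while a directory with no matching children simply yields an empty list.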

Example 5 with Readable

use of com.baidu.hugegraph.loader.reader.Readable in project incubator-hugegraph-toolchain by apache.

In the class ParquetFileLineFetcher, method readHeader:

@Override
public String[] readHeader(List<Readable> readables) {
    Readable readable = readables.get(0);
    this.openReader(readable);
    try {
        return this.parseHeader(this.schema);
    } finally {
        try {
            this.closeReader();
        } catch (IOException e) {
            // Pass the exception as the last argument so its stack trace is logged
            LOG.warn("Failed to close reader of '{}'", readable, e);
        }
    }
}
Also used: Readable (com.baidu.hugegraph.loader.reader.Readable), IOException (java.io.IOException)
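
Both readHeader methods open a reader, parse the header, and close the reader in a finally block so that a failed close is only reported. The sketch below shows the same shape with JDK classes only. It swaps in a different header source: the ORC and Parquet fetchers derive the header from the file schema, whereas this example assumes a comma-separated first line of text, and it prints to System.err in place of the logger.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class HeaderReadSketch {

    // Reads the first line as a comma-separated header; a failed close is
    // reported but does not replace the returned header or a read failure
    static String[] readHeader(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        try {
            String line = reader.readLine();
            if (line == null) {
                throw new IOException("Empty input, no header line");
            }
            return line.split(",");
        } finally {
            try {
                reader.close();
            } catch (IOException e) {
                System.err.println("Failed to close reader: " + e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String[] header = readHeader(new StringReader("id,name,age\n1,marko,29\n"));
        System.out.println(String.join("|", header)); // prints id|name|age
    }
}
```

A try-with-resources statement would be shorter, but it propagates a close failure to the caller when the body succeeds, which is why the explicit finally block is kept here.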

Aggregations

Readable (com.baidu.hugegraph.loader.reader.Readable): 5
IOException (java.io.IOException): 3
LoadException (com.baidu.hugegraph.loader.exception.LoadException): 2
FileFilter (com.baidu.hugegraph.loader.source.file.FileFilter): 2
ArrayList (java.util.ArrayList): 2
InitException (com.baidu.hugegraph.loader.exception.InitException): 1
File (java.io.File): 1
FileStatus (org.apache.hadoop.fs.FileStatus): 1
Path (org.apache.hadoop.fs.Path): 1
StructObjectInspector (org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector): 1