Examples with PathFilter - org.apache.hadoop.fs.PathFilter

Example 86 with PathFilter

use of org.apache.hadoop.fs.PathFilter in project drill by axbaretto.

the class FileSystemUtilTest method testListAllRecursiveWithFilter.

@Test
public void testListAllRecursiveWithFilter() throws IOException {
    List<FileStatus> statuses = FileSystemUtil.listAll(fs, new Path(base, "a"), true, new PathFilter() {

        @Override
        public boolean accept(Path path) {
            return path.getName().endsWith("a") || path.getName().endsWith(".txt");
        }
    });
    assertEquals("Directory and file count should match", 7, statuses.size());
}

Also used : Path(org.apache.hadoop.fs.Path) PathFilter(org.apache.hadoop.fs.PathFilter) FileStatus(org.apache.hadoop.fs.FileStatus) Test(org.junit.Test)

Example 87 with PathFilter

use of org.apache.hadoop.fs.PathFilter in project drill by axbaretto.

the class FileSystemUtilTest method mergeFiltersFalse.

@Test
public void mergeFiltersFalse() {
    Path file = new Path("abc.txt");
    PathFilter firstFilter = new PathFilter() {

        @Override
        public boolean accept(Path path) {
            return path.getName().startsWith("a");
        }
    };
    PathFilter secondFilter = new PathFilter() {

        @Override
        public boolean accept(Path path) {
            return path.getName().endsWith(".csv");
        }
    };
    assertFalse("Path should have been excluded from the path list", FileSystemUtil.mergeFilters(firstFilter, secondFilter).accept(file));
    assertFalse("Path should have been excluded from the path list", FileSystemUtil.mergeFilters(firstFilter, new PathFilter[] { secondFilter }).accept(file));
}

Also used : Path(org.apache.hadoop.fs.Path) PathFilter(org.apache.hadoop.fs.PathFilter) Test(org.junit.Test)

Example 88 with PathFilter

use of org.apache.hadoop.fs.PathFilter in project tez by apache.

the class TestTezJobs method testSortMergeJoinExampleDisableSplitGrouping.

@Test(timeout = 60000)
public void testSortMergeJoinExampleDisableSplitGrouping() throws Exception {
    SortMergeJoinExample sortMergeJoinExample = new SortMergeJoinExample();
    sortMergeJoinExample.setConf(conf);
    Path stagingDirPath = new Path(TEST_ROOT_DIR + "/tmp/tez-staging-dir");
    Path inPath1 = new Path(TEST_ROOT_DIR + "/tmp/sortMerge/inPath1");
    Path inPath2 = new Path(TEST_ROOT_DIR + "/tmp/sortMerge/inPath2");
    Path outPath = new Path(TEST_ROOT_DIR + "/tmp/sortMerge/outPath");
    localFs.delete(outPath, true);
    localFs.mkdirs(inPath1);
    localFs.mkdirs(inPath2);
    localFs.mkdirs(stagingDirPath);
    Set<String> expectedResult = new HashSet<String>();
    FSDataOutputStream out1 = localFs.create(new Path(inPath1, "file"));
    FSDataOutputStream out2 = localFs.create(new Path(inPath2, "file"));
    BufferedWriter writer1 = new BufferedWriter(new OutputStreamWriter(out1));
    BufferedWriter writer2 = new BufferedWriter(new OutputStreamWriter(out2));
    for (int i = 0; i < 20; i++) {
        String term = "term" + i;
        writer1.write(term);
        writer1.newLine();
        if (i % 2 == 0) {
            writer2.write(term);
            writer2.newLine();
            expectedResult.add(term);
        }
    }
    writer1.close();
    writer2.close();
    out1.close();
    out2.close();
    String[] args = new String[] { "-D" + TezConfiguration.TEZ_AM_STAGING_DIR + "=" + stagingDirPath.toString(), "-counter", "-local", "-disableSplitGrouping", inPath1.toString(), inPath2.toString(), "1", outPath.toString() };
    assertEquals(0, sortMergeJoinExample.run(args));
    FileStatus[] statuses = localFs.listStatus(outPath, new PathFilter() {

        public boolean accept(Path p) {
            String name = p.getName();
            return !name.startsWith("_") && !name.startsWith(".");
        }
    });
    assertEquals(1, statuses.length);
    FSDataInputStream inStream = localFs.open(statuses[0].getPath());
    BufferedReader reader = new BufferedReader(new InputStreamReader(inStream));
    String line;
    while ((line = reader.readLine()) != null) {
        assertTrue(expectedResult.remove(line));
    }
    reader.close();
    inStream.close();
    assertEquals(0, expectedResult.size());
}

Also used : Path(org.apache.hadoop.fs.Path) PathFilter(org.apache.hadoop.fs.PathFilter) FileStatus(org.apache.hadoop.fs.FileStatus) InputStreamReader(java.io.InputStreamReader) SortMergeJoinExample(org.apache.tez.examples.SortMergeJoinExample) BufferedWriter(java.io.BufferedWriter) BufferedReader(java.io.BufferedReader) FSDataInputStream(org.apache.hadoop.fs.FSDataInputStream) OutputStreamWriter(java.io.OutputStreamWriter) FSDataOutputStream(org.apache.hadoop.fs.FSDataOutputStream) HashSet(java.util.HashSet) Test(org.junit.Test)

Example 89 with PathFilter

use of org.apache.hadoop.fs.PathFilter in project tez by apache.

the class TestTezJobs method verifySortMergeJoinInput.

private void verifySortMergeJoinInput(Path outPath, Set<String> expectedResult) throws IOException {
    FileStatus[] statuses = remoteFs.listStatus(outPath, new PathFilter() {

        public boolean accept(Path p) {
            String name = p.getName();
            return !name.startsWith("_") && !name.startsWith(".");
        }
    });
    assertEquals(1, statuses.length);
    FSDataInputStream inStream = remoteFs.open(statuses[0].getPath());
    BufferedReader reader = new BufferedReader(new InputStreamReader(inStream));
    String line;
    while ((line = reader.readLine()) != null) {
        assertTrue(expectedResult.remove(line));
    }
    reader.close();
    inStream.close();
    assertEquals(0, expectedResult.size());
}

Example 90 with PathFilter

use of org.apache.hadoop.fs.PathFilter in project tez by apache.

the class TestTezJobs method testHashJoinExampleDisableSplitGrouping.

@Test(timeout = 60000)
public void testHashJoinExampleDisableSplitGrouping() throws Exception {
    HashJoinExample hashJoinExample = new HashJoinExample();
    hashJoinExample.setConf(conf);
    Path stagingDirPath = new Path(TEST_ROOT_DIR + "/tmp/tez-staging-dir");
    Path inPath1 = new Path(TEST_ROOT_DIR + "/tmp/hashJoin/inPath1");
    Path inPath2 = new Path(TEST_ROOT_DIR + "/tmp/hashJoin/inPath2");
    Path outPath = new Path(TEST_ROOT_DIR + "/tmp/hashJoin/outPath");
    localFs.delete(outPath, true);
    localFs.mkdirs(inPath1);
    localFs.mkdirs(inPath2);
    localFs.mkdirs(stagingDirPath);
    Set<String> expectedResult = new HashSet<String>();
    FSDataOutputStream out1 = localFs.create(new Path(inPath1, "file"));
    FSDataOutputStream out2 = localFs.create(new Path(inPath2, "file"));
    BufferedWriter writer1 = new BufferedWriter(new OutputStreamWriter(out1));
    BufferedWriter writer2 = new BufferedWriter(new OutputStreamWriter(out2));
    for (int i = 0; i < 20; i++) {
        String term = "term" + i;
        writer1.write(term);
        writer1.newLine();
        if (i % 2 == 0) {
            writer2.write(term);
            writer2.newLine();
            expectedResult.add(term);
        }
    }
    writer1.close();
    writer2.close();
    out1.close();
    out2.close();
    String[] args = new String[] { "-D" + TezConfiguration.TEZ_AM_STAGING_DIR + "=" + stagingDirPath.toString(), "-counter", "-local", "-disableSplitGrouping", inPath1.toString(), inPath2.toString(), "1", outPath.toString() };
    assertEquals(0, hashJoinExample.run(args));
    FileStatus[] statuses = localFs.listStatus(outPath, new PathFilter() {

        public boolean accept(Path p) {
            String name = p.getName();
            return !name.startsWith("_") && !name.startsWith(".");
        }
    });
    assertEquals(1, statuses.length);
    FSDataInputStream inStream = localFs.open(statuses[0].getPath());
    BufferedReader reader = new BufferedReader(new InputStreamReader(inStream));
    String line;
    while ((line = reader.readLine()) != null) {
        assertTrue(expectedResult.remove(line));
    }
    reader.close();
    inStream.close();
    assertEquals(0, expectedResult.size());
}

Also used : Path(org.apache.hadoop.fs.Path) PathFilter(org.apache.hadoop.fs.PathFilter) FileStatus(org.apache.hadoop.fs.FileStatus) InputStreamReader(java.io.InputStreamReader) BufferedWriter(java.io.BufferedWriter) HashJoinExample(org.apache.tez.examples.HashJoinExample) BufferedReader(java.io.BufferedReader) FSDataInputStream(org.apache.hadoop.fs.FSDataInputStream) OutputStreamWriter(java.io.OutputStreamWriter) FSDataOutputStream(org.apache.hadoop.fs.FSDataOutputStream) HashSet(java.util.HashSet) Test(org.junit.Test)

Aggregations

PathFilter (org.apache.hadoop.fs.PathFilter)123 Path (org.apache.hadoop.fs.Path)114 FileStatus (org.apache.hadoop.fs.FileStatus)96 Test (org.junit.Test)47 IOException (java.io.IOException)42 FileSystem (org.apache.hadoop.fs.FileSystem)39 ArrayList (java.util.ArrayList)22 List (java.util.List)19 Configuration (org.apache.hadoop.conf.Configuration)18 Collections (java.util.Collections)11 BufferedReader (java.io.BufferedReader)9 InputStreamReader (java.io.InputStreamReader)9 FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream)9 Assert.assertEquals (org.junit.Assert.assertEquals)9 Assert.assertTrue (org.junit.Assert.assertTrue)9 URI (java.net.URI)8 Test (org.testng.annotations.Test)8 FSDataOutputStream (org.apache.hadoop.fs.FSDataOutputStream)7 IGNORED (com.facebook.presto.hive.NestedDirectoryPolicy.IGNORED)6 RECURSE (com.facebook.presto.hive.NestedDirectoryPolicy.RECURSE)6