Example 1 with GazetteerMatcher

Use of org.opensextant.extractors.geo.GazetteerMatcher in project Xponents by OpenSextant.

In the class TestGazMatcher, method main.

/**
 * Do a basic test. Requirements include setting opensextant.solr to the Solr
 * core home (Xponents/solr, by default).
 *
 * USAGE:
 *
 *     TestGazMatcher file
 *
 * Prints:
 *     all matched, filtered place mentions
 *     distinct places
 *     distinct countries
 */
public static void main(String[] args) throws Exception {
    GazetteerMatcher sm = new GazetteerMatcher(true);
    URL filterFile = TestGazMatcher.class.getResource("/test-filter.txt");
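    if (filterFile == null) {
        System.err.println("This test requires a 'test-filter.txt' file with non-place names in it." + "\nThese filters should match up with your test documents");
        // Without the filter resource there is nothing to build a MatchFilter from,
        // so release the matcher and stop instead of passing null along.
        sm.shutdown();
        return;
    }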
    MatchFilter filt = new MatchFilter(filterFile);
    sm.setMatchFilter(filt);
    try {
        String docContent = "We drove to Sin City. The we drove to -$IN ĆITŸ .";
        System.out.println(docContent);
        List<PlaceCandidate> matches = sm.tagText(docContent, "main-test");
        for (PlaceCandidate pc : matches) {
            printGeoTags(pc);
        }
        docContent = "Is there some city in 刘家埝 written in Chinese?";
        matches = sm.tagCJKText(docContent, "main-test");
        for (PlaceCandidate pc : matches) {
            printGeoTags(pc);
        }
        docContent = "Where is seoul?";
        matches = sm.tagText(docContent, "main-test");
        for (PlaceCandidate pc : matches) {
            printGeoTags(pc);
        }
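        // args[0] is the file named in the USAGE note; if it is missing, the
        // resulting exception is caught and printed by the catch block below.
        String buf = FileUtility.readFile(args[0]);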
        matches = sm.tagText(buf, "main-test", true);
        summarizeFindings(copyFrom(matches));
    } catch (Exception err) {
        err.printStackTrace();
    } finally {
        sm.shutdown();
    }
}
Also used: GazetteerMatcher (org.opensextant.extractors.geo.GazetteerMatcher), MatchFilter (org.opensextant.extraction.MatchFilter), URL (java.net.URL), PlaceCandidate (org.opensextant.extractors.geo.PlaceCandidate)
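
The comment on the example names two prerequisites: opensextant.solr must point at the Solr core home, and a file must be given on the command line. A minimal sketch of checking both up front is shown here. It assumes that opensextant.solr is read as a JVM system property (the comment does not say how it is set), and the class name PreflightCheck and the method problems are hypothetical helpers, not part of Xponents.

```java
import java.util.ArrayList;
import java.util.List;

public class PreflightCheck {

    /**
     * Hypothetical helper: returns human-readable problems.
     * An empty list means both prerequisites appear to be met.
     */
    public static List<String> problems(String solrHome, String[] args) {
        List<String> found = new ArrayList<>();
        // Assumption: the Solr core home is supplied as -Dopensextant.solr=<path>.
        if (solrHome == null || solrHome.trim().isEmpty()) {
            found.add("opensextant.solr is not set (e.g. -Dopensextant.solr=Xponents/solr)");
        }
        // The usage line "TestGazMatcher file" implies exactly one file argument.
        if (args == null || args.length < 1) {
            found.add("missing argument: path of the text file to tag");
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> found = problems(System.getProperty("opensextant.solr"), args);
        if (found.isEmpty()) {
            System.out.println("Preconditions look fine.");
        } else {
            // Report every problem at once rather than failing on the first.
            for (String p : found) {
                System.err.println(p);
            }
        }
    }
}
```

Running such a check before constructing the matcher would turn a stack trace into a short message, at the cost of a few extra lines in the test driver.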

Aggregations

URL (java.net.URL): 1
MatchFilter (org.opensextant.extraction.MatchFilter): 1
GazetteerMatcher (org.opensextant.extractors.geo.GazetteerMatcher): 1
PlaceCandidate (org.opensextant.extractors.geo.PlaceCandidate): 1
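
The example's comment says the test prints the matched mentions, the distinct places, and the distinct countries, but the helper that does so (summarizeFindings) is not shown on this page. The sketch below illustrates one way such a summary could be computed. The Mention class is a hypothetical stand-in, not the PlaceCandidate API, and the sample data is hand-made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class FindingsSummary {

    /** Hypothetical stand-in for a tagged mention; NOT the Xponents PlaceCandidate API. */
    public static final class Mention {
        public final String text;         // text as it appeared in the document
        public final String placeName;    // gazetteer name chosen for the mention
        public final String countryCode;  // country code of that place
        public final boolean filteredOut; // true if a filter rejected the mention

        public Mention(String text, String placeName, String countryCode, boolean filteredOut) {
            this.text = text;
            this.placeName = placeName;
            this.countryCode = countryCode;
            this.filteredOut = filteredOut;
        }
    }

    /** Mentions that survived filtering, in document order. */
    public static List<Mention> kept(List<Mention> mentions) {
        List<Mention> result = new ArrayList<>();
        for (Mention m : mentions) {
            if (!m.filteredOut) {
                result.add(m);
            }
        }
        return result;
    }

    /** Distinct place names among the kept mentions, sorted alphabetically. */
    public static Set<String> distinctPlaces(List<Mention> mentions) {
        Set<String> names = new TreeSet<>();
        for (Mention m : kept(mentions)) {
            names.add(m.placeName);
        }
        return names;
    }

    /** Distinct country codes among the kept mentions, sorted alphabetically. */
    public static Set<String> distinctCountries(List<Mention> mentions) {
        Set<String> codes = new TreeSet<>();
        for (Mention m : kept(mentions)) {
            codes.add(m.countryCode);
        }
        return codes;
    }

    /** Hand-made sample data, for illustration only. */
    public static List<Mention> sample() {
        return Arrays.asList(
                new Mention("seoul", "Seoul", "KR", false),
                new Mention("Seoul", "Seoul", "KR", false),
                new Mention("Paris", "Paris", "FR", false),
                new Mention("Reading", "Reading", "GB", true));
    }

    public static void main(String[] args) {
        List<Mention> mentions = sample();
        System.out.println("Kept mentions: " + kept(mentions).size());
        System.out.println("Distinct places: " + distinctPlaces(mentions));
        System.out.println("Distinct countries: " + distinctCountries(mentions));
    }
}
```

Sorted sets are used here only so the printed summary is stable between runs; the real helper may order or group its output differently.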