Example 1 with ExtractionResult

use of org.opensextant.extraction.ExtractionResult in project Xponents by OpenSextant.

the class TweetGeocoder method geocodeTweetUser.

/**
 * If user loc.xy:
 *    write out(xy)
 * else if user loc:
 *    geocode(user loc)
 *    write out()
 *
 * geocode(status); write out()
 */
public void geocodeTweetUser(Tweet tw) {
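    // Skip only when the profile has neither a coordinate nor a location string;
    // with "||" the else-if branch below could never run.
    if (tw.author_xy_val == null && tw.author_location == null) {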
        return;
    }
    ExtractionResult res = new ExtractionResult(tw.id);
    res.addAttribute("timestamp", tw.pub_date);
    res.addAttribute("author", tw.author);
    res.addAttribute("tweet", tw.getText());
    /*
     * If the user profile location or geo coordinate is a coordinate, parse it and add it to the matched locations.
     */
    if (tw.author_xy_val != null) {
        res.matches = userlocX.extract(new TextInput(tw.id, tw.author_xy_val));
    } else if (tw.author_location != null) {
        res.matches = userlocX.extract(new TextInput(tw.id, tw.author_location));
    }
    /*
     * If the user profile location is a place name, attempt to match and disambiguate it.
     */
    if (res.matches.isEmpty() && tw.author_location != null) {
        try {
            res.matches = geocoder.extract(new TextInput(tw.id, tw.author_location));
        } catch (Exception userErr) {
            log.error("Geocoding error with Users?", userErr);
        }
    }
    if (res.matches.isEmpty()) {
        return;
    }
    userOutput.writeGeocodingResult(res);
}
Also used : ExtractionResult(org.opensextant.extraction.ExtractionResult) TextInput(org.opensextant.data.TextInput) ProcessingException(org.opensextant.processing.ProcessingException) ParseException(java.text.ParseException) ConfigException(org.opensextant.ConfigException) IOException(java.io.IOException)
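
The fallback order in geocodeTweetUser (coordinate parsing first, place-name matching only when no coordinate is found) can be sketched in isolation. This is a minimal sketch, not Xponents code: the class FallbackGeocodeSketch, the methods extractCoordinates, extractPlaces and locateUser, the decimal "lat,lon" pattern, and the one-entry place lookup are all hypothetical stand-ins for userlocX and geocoder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FallbackGeocodeSketch {
    // Hypothetical stand-in for a coordinate extractor: matches decimal "lat,lon" pairs only.
    static final Pattern LATLON =
            Pattern.compile("(-?\\d{1,2}\\.\\d+)\\s*,\\s*(-?\\d{1,3}\\.\\d+)");

    static List<String> extractCoordinates(String text) {
        List<String> found = new ArrayList<>();
        if (text == null) {
            return found;
        }
        Matcher m = LATLON.matcher(text);
        while (m.find()) {
            found.add("coord:" + m.group(1) + "," + m.group(2));
        }
        return found;
    }

    // Hypothetical stand-in for a place-name tagger: a one-entry lookup.
    static List<String> extractPlaces(String text) {
        List<String> found = new ArrayList<>();
        if (text != null && text.toLowerCase(Locale.ROOT).contains("boston")) {
            found.add("place:Boston");
        }
        return found;
    }

    // Same order as the method above: coordinate first, place name as fallback.
    static List<String> locateUser(String xyVal, String location) {
        if (xyVal == null && location == null) {
            return new ArrayList<>();
        }
        List<String> matches = extractCoordinates(xyVal != null ? xyVal : location);
        if (matches.isEmpty() && location != null) {
            matches = extractPlaces(location);
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(locateUser("42.36,-71.06", null)); // coordinate path
        System.out.println(locateUser(null, "Boston, MA"));   // place-name fallback
        System.out.println(locateUser(null, null));           // nothing to geocode
    }
}
```

An empty list plays the role of the empty res.matches check: the caller writes nothing out when no location was found.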

Example 2 with ExtractionResult

use of org.opensextant.extraction.ExtractionResult in project Xponents by OpenSextant.

the class TweetGeocoder method geocodeTweet.

/**
 * If a tweet has non-empty status text, find all places in the content.
 */
public void geocodeTweet(Tweet tw) {
    ++recordCount;
    if (tw.getText() != null && !tw.getText().isEmpty()) {
        try {
            ExtractionResult res = new ExtractionResult(tw.id);
            // The place name tagger may not work if the content has mostly lower-case proper names. TODO: allow mixed case.
            res.matches = geocoder.extract(new TextInput(tw.id, tw.getText()));
            res.addAttribute("timestamp", tw.pub_date);
            res.addAttribute("tweet", tw.getText());
            res.addAttribute("author", tw.author);
            enrichResults(res.matches);
            tweetOutput.writeGeocodingResult(res);
        } catch (Exception err) {
            log.error("Geocoding error?", err);
        }
    }
    if (recordCount % batch == 0 && recordCount > 0) {
        log.info("ROW #" + recordCount);
        geocoder.reportMemory();
    }
}
Also used : ExtractionResult(org.opensextant.extraction.ExtractionResult) TextInput(org.opensextant.data.TextInput) ProcessingException(org.opensextant.processing.ProcessingException) ParseException(java.text.ParseException) ConfigException(org.opensextant.ConfigException) IOException(java.io.IOException)

Example 3 with ExtractionResult

use of org.opensextant.extraction.ExtractionResult in project Xponents by OpenSextant.

the class XtractorGroup method processAndFormat.

/**
 * Processes input content against all extractors and all formatters. This
 * does not throw exceptions, as some processing may fail while other
 * processing succeeds. TODO: Processing/formatting details would have to be
 * retrieved by calling some other method that statefully tracks such things.
 *
 * @param input the text input to process
 * @return status: -1 failure; 0 nothing found; 1 found matches and
 *         formatted them; 2 found content but nothing formatted.
 */
public int processAndFormat(TextInput input) {
    reset();
    ExtractionResult compilation = new ExtractionResult(input.id);
    if (input instanceof DocInput) {
        compilation.recordFile = ((DocInput) input).getFilepath();
        compilation.recordTextFile = ((DocInput) input).getTextpath();
    }
    compilation.matches = process(input);
    compilation.input = input;
    if (compilation.matches.isEmpty()) {
        // nothing found
        return 0;
    }
    return format(compilation);
}
Also used : DocInput(org.opensextant.data.DocInput) ExtractionResult(org.opensextant.extraction.ExtractionResult)
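
Because processAndFormat reports its outcome through an int rather than an exception, a caller has to interpret the value. The following is a minimal sketch of such a caller-side mapping, assuming only the four status values listed in the Javadoc above; the class StatusSketch and the method describeStatus are hypothetical and not part of Xponents.

```java
public class StatusSketch {
    // Maps the documented status codes to readable descriptions.
    // Any other value is reported as unknown rather than guessed at.
    static String describeStatus(int status) {
        switch (status) {
            case -1:
                return "failure";
            case 0:
                return "nothing found";
            case 1:
                return "matches found and formatted";
            case 2:
                return "content found but nothing formatted";
            default:
                return "unknown status " + status;
        }
    }

    public static void main(String[] args) {
        // In real use the status would come from a call such as processAndFormat(input).
        for (int status : new int[] {-1, 0, 1, 2}) {
            System.out.println(status + " -> " + describeStatus(status));
        }
    }
}
```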

Aggregations

ExtractionResult (org.opensextant.extraction.ExtractionResult): 3
IOException (java.io.IOException): 2
ParseException (java.text.ParseException): 2
ConfigException (org.opensextant.ConfigException): 2
TextInput (org.opensextant.data.TextInput): 2
ProcessingException (org.opensextant.processing.ProcessingException): 2
DocInput (org.opensextant.data.DocInput): 1