Examples with ArticleIngestion - org.ambraproject.rhino.model.ArticleIngestion

Example 11 with ArticleIngestion

use of org.ambraproject.rhino.model.ArticleIngestion in project rhino by PLOS.

the class ArticleCrudServiceImpl method buildOverview.

@Override
public ArticleOverview buildOverview(Article article) {
    return hibernateTemplate.execute(session -> {
        Query ingestionQuery = session.createQuery("FROM ArticleIngestion WHERE article = :article");
        ingestionQuery.setParameter("article", article);
        List<ArticleIngestion> ingestions = ingestionQuery.list();
        Query revisionQuery = session.createQuery("FROM ArticleRevision WHERE ingestion IN (FROM ArticleIngestion WHERE article = :article)");
        revisionQuery.setParameter("article", article);
        List<ArticleRevision> revisions = revisionQuery.list();
        ArticleIdentifier id = ArticleIdentifier.create(article.getDoi());
        return ArticleOverview.build(id, ingestions, revisions);
    });
}

Also used : ArticleIngestion(org.ambraproject.rhino.model.ArticleIngestion) ArticleRevision(org.ambraproject.rhino.model.ArticleRevision) ArticleIdentifier(org.ambraproject.rhino.identity.ArticleIdentifier) Query(org.hibernate.Query)

Example 12 with ArticleIngestion

use of org.ambraproject.rhino.model.ArticleIngestion in project rhino by PLOS.

the class IngestionServiceTest method createStubArticleItem.

private static ArticleItem createStubArticleItem() {
    ArticleItem articleItem = new ArticleItem();
    ArticleIngestion articleIngestion = new ArticleIngestion();
    Article article = new Article();
    article.setDoi("test");
    articleItem.setIngestion(articleIngestion);
    articleIngestion.setArticle(article);
    return articleItem;
}

Also used : ArticleItem(org.ambraproject.rhino.model.ArticleItem) ArticleIngestion(org.ambraproject.rhino.model.ArticleIngestion) Article(org.ambraproject.rhino.model.Article)

Example 13 with ArticleIngestion

use of org.ambraproject.rhino.model.ArticleIngestion in project rhino by PLOS.

the class TaxonomyClassificationServiceImpl method getRawTerms.

/**
 * {@inheritDoc}
 */
@Override
public List<String> getRawTerms(Document articleXml, Article article, boolean isTextRequired) {
    RuntimeConfiguration.TaxonomyConfiguration configuration = getTaxonomyConfiguration();
    String toCategorize = getCategorizationContent(articleXml);
    ArticleIngestion latest = articleCrudService.readLatestRevision(article).getIngestion();
    String header = String.format(MESSAGE_HEADER,
            new SimpleDateFormat("yyyy-MM-dd").format(latest.getPublicationDate()),
            latest.getJournal().getTitle(), latest.getArticleType(), article.getDoi());
    String aiMessage = String.format(MESSAGE_BEGIN, configuration.getThesaurus())
            + StringEscapeUtils.escapeXml10(String.format(MESSAGE_DOC_ELEMENT, header, toCategorize))
            + MESSAGE_END;
    HttpPost post = new HttpPost(configuration.getServer().toString());
    post.setEntity(new StringEntity(aiMessage, APPLICATION_XML_UTF_8));
    DocumentBuilder documentBuilder = newDocumentBuilder();
    Document response;
    try (CloseableHttpResponse httpResponse = httpClient.execute(post);
        InputStream stream = httpResponse.getEntity().getContent()) {
        response = documentBuilder.parse(stream);
    } catch (IOException e) {
        throw new TaxonomyRemoteServiceNotAvailableException(e);
    } catch (SAXException e) {
        throw new TaxonomyRemoteServiceInvalidBehaviorException("Invalid XML returned from " + configuration.getServer(), e);
    }
    // Parse the result.
    NodeList vectorElements = response.getElementsByTagName("VectorElement");
    List<String> results = new ArrayList<>(vectorElements.getLength());
    // Add the text that is sent to taxonomy server if isTextRequired is true
    if (isTextRequired) {
        toCategorize = StringEscapeUtils.unescapeXml(toCategorize);
        results.add(toCategorize);
    }
    // The first and last elements of the vector response are just MAITERMS, so skip them.
    for (int i = 1; i < vectorElements.getLength() - 1; i++) {
        results.add(vectorElements.item(i).getTextContent());
    }
    if ((isTextRequired && results.size() == 1) || results.isEmpty()) {
        log.error("Taxonomy server returned 0 terms. " + article.getDoi());
    }
    return results;
}

Also used : ArticleIngestion(org.ambraproject.rhino.model.ArticleIngestion) HttpPost(org.apache.http.client.methods.HttpPost) InputStream(java.io.InputStream) NodeList(org.w3c.dom.NodeList) ArrayList(java.util.ArrayList) IOException(java.io.IOException) Document(org.w3c.dom.Document) RuntimeConfiguration(org.ambraproject.rhino.config.RuntimeConfiguration) SAXException(org.xml.sax.SAXException) StringEntity(org.apache.http.entity.StringEntity) TaxonomyRemoteServiceInvalidBehaviorException(org.ambraproject.rhino.service.taxonomy.TaxonomyRemoteServiceInvalidBehaviorException) DocumentBuilder(javax.xml.parsers.DocumentBuilder) AmbraService.newDocumentBuilder(org.ambraproject.rhino.service.impl.AmbraService.newDocumentBuilder) CloseableHttpResponse(org.apache.http.client.methods.CloseableHttpResponse) SimpleDateFormat(java.text.SimpleDateFormat) TaxonomyRemoteServiceNotAvailableException(org.ambraproject.rhino.service.taxonomy.TaxonomyRemoteServiceNotAvailableException)
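
The loop in getRawTerms starts at index 1 and stops before the last index, so the first and last VectorElement nodes are dropped. The following is a minimal sketch of that indexing using only the JDK's DOM parser. The class name VectorElementSketch, the method innerTerms, and the sample XML (including its Response root element) are assumptions made for illustration and are not the actual taxonomy server format.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

final class VectorElementSketch {

    /** Returns the text of every VectorElement except the first and the last. */
    static List<String> innerTerms(String xml) {
        Document response;
        try {
            response = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
        NodeList vectorElements = response.getElementsByTagName("VectorElement");
        List<String> results = new ArrayList<>();
        // Same bounds as the loop in getRawTerms: skip index 0 and the final index.
        for (int i = 1; i < vectorElements.getLength() - 1; i++) {
            results.add(vectorElements.item(i).getTextContent());
        }
        return results;
    }

    public static void main(String[] args) {
        // Hypothetical response body; the real server output is not shown in the example.
        String xml = "<Response>"
                + "<VectorElement>MAITERMS</VectorElement>"
                + "<VectorElement>term one</VectorElement>"
                + "<VectorElement>term two</VectorElement>"
                + "<VectorElement>MAITERMS</VectorElement>"
                + "</Response>";
        System.out.println(innerTerms(xml)); // prints [term one, term two]
    }
}
```

With two or fewer VectorElement nodes the loop body never runs and the list stays empty, which is the case the "returned 0 terms" log message in getRawTerms reports.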

Example 14 with ArticleIngestion

use of org.ambraproject.rhino.model.ArticleIngestion in project rhino by PLOS.

the class IssueOutputView method getIssueImageFigureDoi.

private static String getIssueImageFigureDoi(ArticleCrudService articleCrudService, Article imageArticle) {
    ArticleRevision latestArticleRevision = articleCrudService.getLatestRevision(imageArticle)
            .orElseThrow(() -> new RuntimeException("Image article has no published revisions. " + imageArticle.getDoi()));
    ArticleIngestion ingestion = latestArticleRevision.getIngestion();
    Collection<ArticleItem> allArticleItems = articleCrudService.getAllArticleItems(ingestion);
    List<ArticleItem> figureImageItems = allArticleItems.stream()
            .filter(item -> FIGURE_IMAGE_TYPES.contains(item.getItemType()))
            .collect(Collectors.toList());
    if (figureImageItems.size() != 1) {
        throw new RuntimeException("Image article does not contain exactly one image file. " + imageArticle.getDoi());
    }
    return figureImageItems.get(0).getDoi();
}

Also used : ArticleRevision(org.ambraproject.rhino.model.ArticleRevision) ArticleIngestion(org.ambraproject.rhino.model.ArticleIngestion) ArticleItem(org.ambraproject.rhino.model.ArticleItem) JsonObject(com.google.gson.JsonObject) ImmutableSet(com.google.common.collect.ImmutableSet) Article(org.ambraproject.rhino.model.Article) Journal(org.ambraproject.rhino.model.Journal) Collection(java.util.Collection) Autowired(org.springframework.beans.factory.annotation.Autowired) Collectors(java.util.stream.Collectors) JsonOutputView(org.ambraproject.rhino.view.JsonOutputView) JsonElement(com.google.gson.JsonElement) Objects(java.util.Objects) ArticleCrudService(org.ambraproject.rhino.service.ArticleCrudService) List(java.util.List) ImmutableList(com.google.common.collect.ImmutableList) Issue(org.ambraproject.rhino.model.Issue) Volume(org.ambraproject.rhino.model.Volume) JsonSerializationContext(com.google.gson.JsonSerializationContext) Optional(java.util.Optional) IssueCrudService(org.ambraproject.rhino.service.IssueCrudService) ArticleRevisionView(org.ambraproject.rhino.view.article.ArticleRevisionView)
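
getIssueImageFigureDoi filters the items by type and then insists on exactly one match. A rough, self-contained sketch of that "filter, then require a single result" pattern follows. The Item class is a hypothetical stand-in for ArticleItem, the contents of FIGURE_IMAGE_TYPES are assumed (the real set is not shown in the example), and the DOIs are placeholders.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class SingleFigureSketch {

    /** Hypothetical stand-in for ArticleItem, reduced to the two fields the check reads. */
    static final class Item {
        final String doi;
        final String itemType;

        Item(String doi, String itemType) {
            this.doi = doi;
            this.itemType = itemType;
        }
    }

    // Assumed values; the real FIGURE_IMAGE_TYPES set is not part of the example.
    static final Set<String> FIGURE_IMAGE_TYPES = new HashSet<>(Arrays.asList("figure", "table"));

    /** Returns the DOI of the only figure-type item, or throws if there is not exactly one. */
    static String singleFigureDoi(List<Item> items) {
        List<Item> figureItems = items.stream()
                .filter(item -> FIGURE_IMAGE_TYPES.contains(item.itemType))
                .collect(Collectors.toList());
        if (figureItems.size() != 1) {
            throw new IllegalStateException(
                    "Expected exactly one figure item, found " + figureItems.size());
        }
        return figureItems.get(0).doi;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("10.0000/example.g001", "figure"),
                new Item("10.0000/example.s001", "supplementary"));
        System.out.println(singleFigureDoi(items)); // prints 10.0000/example.g001
    }
}
```

Collecting to a list before checking the size keeps the count available for the error message, so zero matches and several matches can be told apart when the exception is raised.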

Example 15 with ArticleIngestion

use of org.ambraproject.rhino.model.ArticleIngestion in project rhino by PLOS.

the class TaxonomyClassificationServiceImpl method populateCategories.

/**
 * {@inheritDoc}
 */
@Override
public void populateCategories(ArticleRevision revision) {
    ArticleIngestion ingestion = revision.getIngestion();
    Article article = ingestion.getArticle();
    Document xml = articleCrudService.getManuscriptXml(ingestion);
    List<WeightedTerm> terms;
    String doi = article.getDoi();
    // TODO: fix or remove this when we find a home for article types.
    boolean isAmendment = false;
    if (!isAmendment) {
        terms = classifyArticle(article, xml);
        if (terms != null && !terms.isEmpty()) {
            List<WeightedTerm> leafNodes = getDistinctLeafNodes(CATEGORY_COUNT, terms);
            persistCategories(leafNodes, article);
        } else {
            log.error("Taxonomy server returned 0 terms. Cannot populate Categories. " + doi);
        }
    }
}

Also used : ArticleIngestion(org.ambraproject.rhino.model.ArticleIngestion) WeightedTerm(org.ambraproject.rhino.service.taxonomy.WeightedTerm) Article(org.ambraproject.rhino.model.Article) Document(org.w3c.dom.Document)

Aggregations

ArticleIngestion (org.ambraproject.rhino.model.ArticleIngestion): 16
Article (org.ambraproject.rhino.model.Article): 7
ArticleRevision (org.ambraproject.rhino.model.ArticleRevision): 7
Query (org.hibernate.Query): 6
ArticleItem (org.ambraproject.rhino.model.ArticleItem): 4
ArrayList (java.util.ArrayList): 3
Collection (java.util.Collection): 3
List (java.util.List): 3
Collectors (java.util.stream.Collectors): 3
ArticleIdentifier (org.ambraproject.rhino.identity.ArticleIdentifier): 3
Journal (org.ambraproject.rhino.model.Journal): 3
Document (org.w3c.dom.Document): 3
InputStream (java.io.InputStream): 2
Optional (java.util.Optional): 2
ManifestXml (org.ambraproject.rhino.content.xml.ManifestXml): 2
ArticleFile (org.ambraproject.rhino.model.ArticleFile): 2
RestClientException (org.ambraproject.rhino.rest.RestClientException): 2
ArticleOverview (org.ambraproject.rhino.view.article.ArticleOverview): 2
Autowired (org.springframework.beans.factory.annotation.Autowired): 2
ImmutableList (com.google.common.collect.ImmutableList): 1