Example 6 with Tag

use of com.graphaware.nlp.domain.Tag in project neo4j-nlp by graphaware.

the class ConceptNet5Importer method processConcept.

private List<Tag> processConcept(Tag source, ConceptNet5Enricher.RelDirection relDirection, boolean filterLang, List<String> outLang, int depth, List<String> admittedRelations, List<String> admittedPOS, int limit, double minWeight, ConceptNet5Concept concept) {
    List<Tag> res = new ArrayList<>();
    String conceptValue;
    String conceptLanguage;
    if (relDirection == ConceptNet5Enricher.RelDirection.OUT) {
        conceptValue = concept.getEnd();
        conceptLanguage = concept.getEndLanguage();
    } else {
        conceptValue = concept.getStart();
        conceptLanguage = concept.getStartLanguage();
    }
    // Keep the concept only if its relation is admitted, its weight exceeds minWeight, and the language check passes.
    if (checkAdmittedRelations(concept, admittedRelations) && concept.getWeight() > minWeight && checkLanguages(filterLang, source.getLanguage(), conceptLanguage, outLang)) {
        // Skip edges whose start and end are the same term.
        if (!concept.getStart().equalsIgnoreCase(concept.getEnd())) {
            conceptValue = removeApices(conceptValue);
            conceptValue = removeParenthesis(conceptValue);
            Tag annotateTag = tryToAnnotate(conceptValue, conceptLanguage);
            List<String> posList = annotateTag.getPos();
            // Accept the tag when no POS filter applies or when at least one of its POS values is admitted.
            if (admittedPOS == null || admittedPOS.isEmpty() || posList == null || posList.isEmpty() || posList.stream().anyMatch(admittedPOS::contains)) {
                if (depth > 1) {
                    importHierarchy(annotateTag, relDirection, filterLang, outLang, depth - 1, admittedRelations, admittedPOS, limit, minWeight);
                }
                source.addParent(concept.getRel(), annotateTag, concept.getWeight(), ConceptNet5Enricher.ENRICHER_NAME);
                res.add(annotateTag);
            }
        }
    }
    return res;
}
Also used : TimeUnit(java.util.concurrent.TimeUnit) List(java.util.List) Log(org.neo4j.logging.Log) TextUtils.removeApices(com.graphaware.nlp.util.TextUtils.removeApices) TextUtils.removeParenthesis(com.graphaware.nlp.util.TextUtils.removeParenthesis) LoggerFactory(com.graphaware.common.log.LoggerFactory) AbstractImporter(com.graphaware.nlp.enrich.AbstractImporter) CacheBuilder(com.google.common.cache.CacheBuilder) Cache(com.google.common.cache.Cache) Tag(com.graphaware.nlp.domain.Tag) ArrayList(java.util.ArrayList) CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList)
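
The part-of-speech condition in processConcept can be read in isolation. The following is a minimal sketch, not code from neo4j-nlp: the class name PosFilterSketch and the method isAdmitted are hypothetical, introduced only for illustration, and the sketch assumes POS values are plain strings such as "NN" or "VB".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of neo4j-nlp: isolates the POS condition used in processConcept.
final class PosFilterSketch {

    // Returns true when no filter is configured, when the tag carries no POS
    // information, or when at least one of the tag's POS values is admitted.
    static boolean isAdmitted(List<String> admittedPOS, List<String> posList) {
        if (admittedPOS == null || admittedPOS.isEmpty()) {
            return true; // no filter configured
        }
        if (posList == null || posList.isEmpty()) {
            return true; // the tag has no POS values to check
        }
        return posList.stream().anyMatch(admittedPOS::contains);
    }

    public static void main(String[] args) {
        List<String> admitted = Arrays.asList("NN", "NNS");
        System.out.println(isAdmitted(admitted, Arrays.asList("VB", "NN")));       // true
        System.out.println(isAdmitted(admitted, Collections.singletonList("VB"))); // false
        System.out.println(isAdmitted(admitted, Collections.emptyList()));         // true
    }
}
```

Note that a tag without any POS information is let through rather than rejected, which matches the condition in the example above.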

Example 7 with Tag

use of com.graphaware.nlp.domain.Tag in project neo4j-nlp by graphaware.

the class MicrosoftConteptImporter method importHierarchy.

public List<Tag> importHierarchy(Tag tag, int limit, String ENRICHER_NAME) {
    final List<Tag> concepts = new ArrayList<>();
    String param = tag.getLemma();
    try {
        param = URLEncoder.encode(tag.getLemma(), "UTF-8");
    } catch (Exception e) {
        // The lemma could not be URL-encoded, so return no concepts instead of sending the request.
        return concepts;
    }
    String url = "https://concept.research.microsoft.com/api/Concept/ScoreByProb?instance=" + param + "&topK=" + limit;
    WebResource resource = Client.create(cfg).resource(url);
    ClientResponse response = resource.accept(MediaType.APPLICATION_JSON).type(MediaType.APPLICATION_JSON).get(ClientResponse.class);
    Map<String, Double> map = response.getEntity(Map.class);
    // Each key is a concept name and each value its score; link every concept to the tag as a parent.
    map.forEach((concept, score) -> {
        Tag annotatedTag = tryToAnnotate(cleanImportedConcept(concept), "en");
        tag.addParent("IS_RELATED_TO", annotatedTag, score.floatValue(), ENRICHER_NAME);
        concepts.add(annotatedTag);
    });
    return concepts;
}
Also used : ClientResponse(com.sun.jersey.api.client.ClientResponse) ArrayList(java.util.ArrayList) WebResource(com.sun.jersey.api.client.WebResource) Tag(com.graphaware.nlp.domain.Tag)
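
The URL construction in importHierarchy can be tried without any network access. The following is a minimal sketch under stated assumptions: the class ConceptUrlSketch and the method buildUrl are hypothetical names, not part of neo4j-nlp, and the endpoint string is simply copied from the example above. Unlike the original, which returns an empty list when encoding fails, this sketch rethrows the failure as an unchecked exception.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of neo4j-nlp: builds the query URL the way importHierarchy does.
final class ConceptUrlSketch {

    // Endpoint copied from the example above; this sketch never contacts it.
    static final String BASE = "https://concept.research.microsoft.com/api/Concept/ScoreByProb";

    static String buildUrl(String lemma, int limit) {
        try {
            // URL-encode the lemma so spaces and reserved characters are safe in a query string.
            String param = URLEncoder.encode(lemma, "UTF-8");
            return BASE + "?instance=" + param + "&topK=" + limit;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the URL with the space encoded as '+'.
        System.out.println(buildUrl("machine learning", 20));
    }
}
```

Encoding matters here because a lemma such as "R&D" would otherwise inject an extra query parameter into the request.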

Example 8 with Tag

use of com.graphaware.nlp.domain.Tag in project neo4j-nlp by graphaware.

the class MicrosoftConceptEnricher method importConcept.

@Override
public Node importConcept(ConceptRequest request) {
    List<Tag> conceptTags = new ArrayList<>();
    List<Tag> tags = new ArrayList<>();
    Pair<Iterator<Node>, Node> pair = getTagsIteratorFromRequest(request);
    Iterator<Node> tagsIterator = pair.first();
    Node tagToBeAnnotated = pair.second();
    while (tagsIterator.hasNext()) {
        Tag tag = (Tag) getPersister(Tag.class).fromNode(tagsIterator.next());
        tags.add(tag);
    }
    tags.forEach(tag -> {
        getImporter().importHierarchy(tag, 20, ENRICHER_NAME).forEach(conceptTag -> {
            conceptTag.getParents().forEach(parent -> {
                conceptTag.addParent(parent);
            });
            conceptTags.add(conceptTag);
        });
        conceptTags.add(tag);
    });
    conceptTags.forEach((newTag) -> {
        if (newTag != null) {
            getPersister(Tag.class).getOrCreate(newTag, newTag.getId(), String.valueOf(System.currentTimeMillis()));
        }
    });
    return tagToBeAnnotated;
}
Also used : Node(org.neo4j.graphdb.Node) ArrayList(java.util.ArrayList) Iterator(java.util.Iterator) Tag(com.graphaware.nlp.domain.Tag)

Example 9 with Tag

use of com.graphaware.nlp.domain.Tag in project neo4j-nlp by graphaware.

the class TestAnnotatedTesterUnitTest method testSentencesCountIsCorrect.

@Test
public void testSentencesCountIsCorrect() {
    AnnotatedText annotatedText = new AnnotatedText();
    TestAnnotatedText test = new TestAnnotatedText(annotatedText);
    Sentence sentence = new Sentence("hello");
    sentence.addTag(new Tag("hello", "en"));
    annotatedText.addSentence(sentence);
    test.assertSentencesCount(1);
}
Also used : TestAnnotatedText(com.graphaware.nlp.util.TestAnnotatedText) AnnotatedText(com.graphaware.nlp.domain.AnnotatedText) TagUtils.newTag(com.graphaware.nlp.util.TagUtils.newTag) Tag(com.graphaware.nlp.domain.Tag) Sentence(com.graphaware.nlp.domain.Sentence) Test(org.junit.Test)

Example 10 with Tag

use of com.graphaware.nlp.domain.Tag in project neo4j-nlp by graphaware.

the class TestAnnotatedTesterUnitTest method testTagIsNotFound.

@Test(expected = AssertionError.class)
public void testTagIsNotFound() {
    AnnotatedText annotatedText = new AnnotatedText();
    TestAnnotatedText test = new TestAnnotatedText(annotatedText);
    Sentence sentence = new Sentence("hello it is me");
    sentence.addTag(new Tag("hello", "en"));
    annotatedText.addSentence(sentence);
    test.assertTagWithLemma("hella");
}
Also used : TestAnnotatedText(com.graphaware.nlp.util.TestAnnotatedText) AnnotatedText(com.graphaware.nlp.domain.AnnotatedText) TagUtils.newTag(com.graphaware.nlp.util.TagUtils.newTag) Tag(com.graphaware.nlp.domain.Tag) Sentence(com.graphaware.nlp.domain.Sentence) Test(org.junit.Test)

Aggregations

Tag (com.graphaware.nlp.domain.Tag): 20
Sentence (com.graphaware.nlp.domain.Sentence): 9
AnnotatedText (com.graphaware.nlp.domain.AnnotatedText): 8
TagUtils.newTag (com.graphaware.nlp.util.TagUtils.newTag): 5
TestAnnotatedText (com.graphaware.nlp.util.TestAnnotatedText): 5
Test (org.junit.Test): 5
ArrayList (java.util.ArrayList): 4
AtomicInteger (java.util.concurrent.atomic.AtomicInteger): 3
Node (org.neo4j.graphdb.Node): 3
Cache (com.google.common.cache.Cache): 2
CacheBuilder (com.google.common.cache.CacheBuilder): 2
LoggerFactory (com.graphaware.common.log.LoggerFactory): 2
TextProcessor (com.graphaware.nlp.processor.TextProcessor): 2
TextUtils.removeApices (com.graphaware.nlp.util.TextUtils.removeApices): 2
TextUtils.removeParenthesis (com.graphaware.nlp.util.TextUtils.removeParenthesis): 2
ClientResponse (com.sun.jersey.api.client.ClientResponse): 2
WebResource (com.sun.jersey.api.client.WebResource): 2
List (java.util.List): 2
CopyOnWriteArrayList (java.util.concurrent.CopyOnWriteArrayList): 2
TimeUnit (java.util.concurrent.TimeUnit): 2