Example 11 with Dataset

use of org.eol.globi.service.Dataset in project eol-globi-data by jhpoelen.

the class StudyImporterForMetaTableIT method importAll.

@Test
public void importAll() throws IOException, StudyImporterException {
    final List<Map<String, String>> links = new ArrayList<Map<String, String>>();
    final InteractionListener interactionListener = properties -> links.add(properties);
    final StudyImporterForMetaTable.TableParserFactory tableFactory = (config, dataset) -> {
        String firstFewLines = "intertype,obstype,effunit,effort,obsunit,obsquant,germnotes,\"REPLACE(Interaction.notes, ',', ';')\",AnimalNumber,AnimalClass,AnimalOrder,AnimalFamily,AnimalGenus,AnimalSpecies,AnimalSubSpecies,AnimalType,AnimalCommonName,PlantNumber,PlantFamily,PlantGenus,PlantSpecies,PlantSubSpecies,country,region,ProvinceDistrictCity,ProtectedArea,HabitatWhite,HabitatAuthor,author,title,year,journal,volume,number,pages,USER,DEF_timestamp,,,\n"
                + "seed disperser,direct observation,months,4,dung density,,,Article focused on elephant density per habitat type based on seed/plant types identified in dung at the various research locations. All identified plant types are being assumed to be dispersed by the elephants,1441,Mammalia,Proboscidea,Elephantidae,Loxodonta,africana,,NULL,African Bush Elephant,4035,Poaceae,Cynodon,dactylon,NULL,Mozambique,NULL,NULL,yes,forest transitions and mosaics,mangroves dune grass plains forest woodland riverine,\"De Boer, W.F. and Ntumi, C.P. and Correia, A.U. and Mafuca, J.M.\",Diet and distribution of elephant in the Maputo Elephant Reserve; Mozambique,2000,African Journal of Ecology,38,3,188-201,Mary,0000-00-00 00:00:00,,,\n"
                + "seed disperser,direct observation,months,4,dung density,,,Article focused on elephant density per habitat type based on seed/plant types identified in dung at the various research locations. All identified plant types are being assumed to be dispersed by the elephants,1441,Mammalia,Proboscidea,Elephantidae,Loxodonta,africana,,NULL,African Bush Elephant,3639,Poaceae,Aristida,canescens,NULL,Mozambique,NULL,NULL,yes,forest transitions and mosaics,mangroves dune grass plains forest woodland riverine,\"De Boer, W.F. and Ntumi, C.P. and Correia, A.U. and Mafuca, J.M.\",Diet and distribution of elephant in the Maputo Elephant Reserve; Mozambique,2000,African Journal of Ecology,38,3,188-201,Mary,0000-00-00 00:00:00,,,\n"
                + "seed disperser,direct observation,months,4,dung density,,,Article focused on elephant density per habitat type based on seed/plant types identified in dung at the various research locations. All identified plant types are being assumed to be dispersed by the elephants,1441,Mammalia,Proboscidea,Elephantidae,Loxodonta,africana,,NULL,African Bush Elephant,3574,Poaceae,Andropogon,eucomus,NULL,Mozambique,NULL,NULL,yes,forest transitions and mosaics,mangroves dune grass plains forest woodland riverine,\"De Boer, W.F. and Ntumi, C.P. and Correia, A.U. and Mafuca, J.M.\",Diet and distribution of elephant in the Maputo Elephant Reserve; Mozambique,2000,African Journal of Ecology,38,3,188-201,Mary,0000-00-00 00:00:00,,,\n"
                + "seed disperser,direct observation,months,4,dung density,,,Article focused on elephant density per habitat type based on seed/plant types identified in dung at the various research locations. All identified plant types are being assumed to be dispersed by the elephants,1441,Mammalia,Proboscidea,Elephantidae,Loxodonta,africana,,NULL,African Bush Elephant,5125,Phyllanthaceae,Phyllanthus,reticulatus,NULL,Mozambique,NULL,NULL,yes,forest transitions and mosaics,mangroves dune grass plains forest woodland riverine,\"De Boer, W.F. and Ntumi, C.P. and Correia, A.U. and Mafuca, J.M.\",Diet and distribution of elephant in the Maputo Elephant Reserve; Mozambique,2000,African Journal of Ecology,38,3,188-201,Mary,0000-00-00 00:00:00,,,\n"
                + "seed disperser,direct observation,months,4,dung density,,,Article focused on elephant density per habitat type based on seed/plant types identified in dung at the various research locations. All identified plant types are being assumed to be dispersed by the elephants,1441,Mammalia,Proboscidea,Elephantidae,Loxodonta,africana,,NULL,African Bush Elephant,399,Myrtaceae,Syzygium,cordatum,,Mozambique,NULL,NULL,yes,forest transitions and mosaics,mangroves dune grass plains forest woodland riverine,\"De Boer, W.F. and Ntumi, C.P. and Correia, A.U. and Mafuca, J.M.\",Diet and distribution of elephant in the Maputo Elephant Reserve; Mozambique,2000,African Journal of Ecology,38,3,188-201,Mary,0000-00-00 00:00:00,,,\n"
                + "seed disperser,direct observation,months,4,dung density,,,Article focused on elephant density per habitat type based on seed/plant types identified in dung at the various research locations. All identified plant types are being assumed to be dispersed by the elephants,1441,Mammalia,Proboscidea,Elephantidae,Loxodonta,africana,,NULL,African Bush Elephant,374,Moraceae,Ficus,sycomorus,,Mozambique,NULL,NULL,yes,forest transitions and mosaics,mangroves dune grass plains forest woodland riverine,\"De Boer, W.F. and Ntumi, C.P. and Correia, A.U. and Mafuca, J.M.\",Diet and distribution of elephant in the Maputo Elephant Reserve; Mozambique,2000,African Journal of Ecology,38,3,188-201,Mary,0000-00-00 00:00:00,,,\n"
                + "seed disperser,direct observation,months,4,dung density,,,Article focused on elephant density per habitat type based on seed/plant types identified in dung at the various research locations. All identified plant types are being assumed to be dispersed by the elephants,1441,Mammalia,Proboscidea,Elephantidae,Loxodonta,africana,,NULL,African Bush Elephant,4398,Moraceae,Ficus,sp,NULL,Mozambique,NULL,NULL,yes,forest transitions and mosaics,mangroves dune grass plains forest woodland riverine,\"De Boer, W.F. and Ntumi, C.P. and Correia, A.U. and Mafuca, J.M.\",Diet and distribution of elephant in the Maputo Elephant Reserve; Mozambique,2000,African Journal of Ecology,38,3,188-201,Mary,0000-00-00 00:00:00,,,\n"
                + "seed disperser,direct observation,years,4,NULL,NULL,NULL,NULL,3051,Animal,Animal,Animal,Animal,animal,NULL,general animal,NULL,4176,Caesalpinioideae,Distemonanthus,benthamianus,NULL,Cameroon,NULL,NULL,yes,NULL,semideciduous tropical rain forest,\"Hardesty, B.D. and Parker, V.T.\",Community seed rain patterns and a comparison to adult community structure in a West African tropical forest,2003,Plant Ecology,164,1,49-64,Mary,8/15/12 9:35,,,\n"
                + "ingestion,direct observation,years,2,NULL,NULL,NULL,during both summer and winter season,1462,Mammalia,Artiodactyla,Bovidae,Madoqua,kirkii,,NULL,Kirk's Dikdik,6897,Moraceae,Ficus,petersii,NULL,Namibia,South West Africa,NULL,yes,NULL,riverine thicket,\"Tinley, K.\",Dikdik; Madoqua kirkii; in south-west Africa: notes on distribution; ecology; and behaviour,1969,Madoqua,1,NULL,Jul-33,Anna,2/24/14 18:40,,,\n";
        return new LabeledCSVParser(new CSVParser(IOUtils.toInputStream(firstFewLines)));
    };
    final String baseUrl = "https://raw.githubusercontent.com/globalbioticinteractions/AfricaTreeDatabase/master";
    final String resource = baseUrl + "/globi.json";
    importAll(interactionListener, tableFactory, baseUrl, resource);
    assertThat(links.size(), is(9));
}
Also used : URL(java.net.URL) Assert.assertNotNull(org.junit.Assert.assertNotNull) DatasetImpl(org.eol.globi.service.DatasetImpl) Test(org.junit.Test) IOException(java.io.IOException) JsonNode(org.codehaus.jackson.JsonNode) StringContains.containsString(org.junit.internal.matchers.StringContains.containsString) CSVParser(com.Ostermiller.util.CSVParser) ArrayList(java.util.ArrayList) Assert.assertThat(org.junit.Assert.assertThat) IOUtils(org.apache.commons.io.IOUtils) List(java.util.List) ResourceUtil(org.eol.globi.util.ResourceUtil) Assert(junit.framework.Assert) Map(java.util.Map) LabeledCSVParser(com.Ostermiller.util.LabeledCSVParser) Dataset(org.eol.globi.service.Dataset) Is.is(org.hamcrest.core.Is.is) URI(java.net.URI) StringStartsWith.startsWith(org.hamcrest.core.StringStartsWith.startsWith) ObjectMapper(org.codehaus.jackson.map.ObjectMapper) CoreMatchers.nullValue(org.hamcrest.CoreMatchers.nullValue) InputStream(java.io.InputStream)
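
The test above feeds a fixed, in-memory table to the importer and collects every emitted row through a lambda listener, then asserts on the number of collected rows. A minimal JDK-only sketch of that pattern follows. The class `InMemoryTableSketch` and its methods `parse` and `collect` are hypothetical names introduced for illustration and are not part of eol-globi-data; the parser here is deliberately naive (it splits on commas and does not handle quoted fields), so it is not a substitute for `LabeledCSVParser`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class InMemoryTableSketch {

    // Hypothetical stand-in for a table importer: reads a header line, then
    // hands each data row to the listener as a header -> value map.
    // Assumption: fields contain no quoted commas (naive split only).
    public static void parse(String table, Consumer<Map<String, String>> listener) {
        String[] lines = table.split("\n");
        String[] headers = lines[0].split(",", -1);
        for (int i = 1; i < lines.length; i++) {
            String[] values = lines[i].split(",", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int j = 0; j < headers.length && j < values.length; j++) {
                row.put(headers[j], values[j]);
            }
            listener.accept(row);
        }
    }

    // Collects all rows into a list, mirroring the "links" list in the test.
    public static List<Map<String, String>> collect(String table) {
        List<Map<String, String>> links = new ArrayList<>();
        parse(table, links::add);
        return links;
    }

    public static void main(String[] args) {
        String table = "intertype,AnimalGenus,PlantGenus\n"
                + "seed disperser,Loxodonta,Cynodon\n"
                + "ingestion,Madoqua,Ficus\n";
        List<Map<String, String>> links = collect(table);
        System.out.println(links.size());
        System.out.println(links.get(0).get("PlantGenus"));
    }
}
```

Keeping the table in a string literal makes the row count asserted at the end of the test easy to check by hand against the literal.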

Example 12 with Dataset

use of org.eol.globi.service.Dataset in project eol-globi-data by jhpoelen.

the class StudyImporterForMetaTableIT method importREEMWithStaticCSV.

@Test
public void importREEMWithStaticCSV() throws IOException, StudyImporterException {
    final List<Map<String, String>> links = new ArrayList<Map<String, String>>();
    final InteractionListener interactionListener = properties -> links.add(properties);
    final StudyImporterForMetaTable.TableParserFactory tableFactory = (config, dataset) -> {
        String firstFewLines = "Hauljoin,\" Pred_nodc\",\" Pred_specn\",\" Prey_nodc\",\" Pred_len\",\" Year\",\" Month\",\" day\",\" region\",\" Pred_name\",\" Prey_Name\",\" Vessel\",\" Cruise\",\" Haul\",\" Rlat\",\" Rlong\",\" Gear_depth\",\" Bottom_depth\",\" Start_hour\",\" Surface_temp\",\" Gear_temp\",\" INPFC_Area\",\" Stationid\",\" Start_date\",\" Prey_sz1\",\" Prey_sex\"\n" + "11012118.0,8791030401.0,5.0,9999999998.0,53.0,1994.0,7.0,11.0,AI,\"Pacific cod Gadus macrocephalus\",\"Rocks \",95.0,199401.0,148.0,51.43,178.81999999999999,222.0,228.0,11.0,0.63,0.41999999999999998,542.0,118-11,\"1994-07-11 00:00:00\",3.0,\n" + "11012118.0,8791030401.0,8.0,9999999998.0,53.0,1994.0,7.0,11.0,AI,\"Pacific cod Gadus macrocephalus\",\"Rocks \",95.0,199401.0,148.0,51.43,178.81999999999999,222.0,228.0,11.0,0.63,0.41999999999999998,542.0,118-11,\"1994-07-11 00:00:00\",3.0,\n" + "11012118.0,8791030401.0,9.0,9999999998.0,58.0,1994.0,7.0,11.0,AI,\"Pacific cod Gadus macrocephalus\",\"Rocks \",95.0,199401.0,148.0,51.43,178.81999999999999,222.0,228.0,11.0,0.63,0.41999999999999998,542.0,118-11,\"1994-07-11 00:00:00\",13.0,\n" + "11012118.0,8791030401.0,9.0,9999999998.0,58.0,1994.0,7.0,11.0,AI,\"Pacific cod Gadus macrocephalus\",\"Rocks \",95.0,199401.0,148.0,51.43,178.81999999999999,222.0,228.0,11.0,0.63,0.41999999999999998,542.0,118-11,\"1994-07-11 00:00:00\",3.0,\n";
        return new LabeledCSVParser(new CSVParser(IOUtils.toInputStream(firstFewLines)));
    };
    final String baseUrl = "https://raw.githubusercontent.com/globalbioticinteractions/noaa-reem/master";
    final String resource = baseUrl + "/globi.json";
    importAll(interactionListener, tableFactory, baseUrl, resource);
    assertThat(links.size(), is(12));
    final Map<String, String> firstLine = links.get(0);
    assertThat(firstLine.get(StudyImporterForTSV.INTERACTION_TYPE_ID), is("http://purl.obolibrary.org/obo/RO_0002470"));
    assertThat(firstLine.get(StudyImporterForTSV.INTERACTION_TYPE_NAME), is("eats"));
    assertThat(firstLine.get(StudyImporterForTSV.TARGET_TAXON_ID), is(nullValue()));
    assertThat(firstLine.get(StudyImporterForTSV.TARGET_TAXON_NAME), is("Rocks"));
    assertThat(firstLine.get(StudyImporterForTSV.SOURCE_TAXON_ID), is("NODC:8791030401"));
    assertThat(firstLine.get(StudyImporterForTSV.SOURCE_TAXON_NAME), is("Pacific cod Gadus macrocephalus"));
    assertThat(firstLine.get(StudyImporterForMetaTable.EVENT_DATE), startsWith("1994-07-11"));
    assertThat(firstLine.get(StudyImporterForMetaTable.LATITUDE), is("51.43"));
    assertThat(firstLine.get(StudyImporterForMetaTable.LONGITUDE), is("178.81999999999999"));
}
Also used : URL(java.net.URL) Assert.assertNotNull(org.junit.Assert.assertNotNull) DatasetImpl(org.eol.globi.service.DatasetImpl) Test(org.junit.Test) IOException(java.io.IOException) JsonNode(org.codehaus.jackson.JsonNode) StringContains.containsString(org.junit.internal.matchers.StringContains.containsString) CSVParser(com.Ostermiller.util.CSVParser) ArrayList(java.util.ArrayList) Assert.assertThat(org.junit.Assert.assertThat) IOUtils(org.apache.commons.io.IOUtils) List(java.util.List) ResourceUtil(org.eol.globi.util.ResourceUtil) Assert(junit.framework.Assert) Map(java.util.Map) LabeledCSVParser(com.Ostermiller.util.LabeledCSVParser) Dataset(org.eol.globi.service.Dataset) Is.is(org.hamcrest.core.Is.is) URI(java.net.URI) StringStartsWith.startsWith(org.hamcrest.core.StringStartsWith.startsWith) ObjectMapper(org.codehaus.jackson.map.ObjectMapper) CoreMatchers.nullValue(org.hamcrest.CoreMatchers.nullValue) InputStream(java.io.InputStream)

Example 13 with Dataset

use of org.eol.globi.service.Dataset in project eol-globi-data by jhpoelen.

the class NodeFactoryNeo4jTest method addDatasetToStudy.

@Test
public void addDatasetToStudy() throws NodeFactoryException, IOException {
    StudyImpl study1 = new StudyImpl("my title", "some source", "some doi", "some citation");
    DatasetImpl dataset = new DatasetImpl("some/namespace", URI.create("some:uri"));
    ObjectNode objectNode = new ObjectMapper().createObjectNode();
    objectNode.put(DatasetConstant.SHOULD_RESOLVE_REFERENCES, false);
    dataset.setConfig(objectNode);
    study1.setOriginatingDataset(dataset);
    StudyNode study = getNodeFactory().getOrCreateStudy(study1);
    Dataset origDataset = study.getOriginatingDataset();
    assertThat(origDataset, is(notNullValue()));
    assertThat(origDataset.getArchiveURI().toString(), is("some:uri"));
    assertThat(origDataset.getOrDefault(DatasetConstant.SHOULD_RESOLVE_REFERENCES, "true"), is("false"));
    String expectedConfig = new ObjectMapper().writeValueAsString(objectNode);
    assertThat(new ObjectMapper().writeValueAsString(origDataset.getConfig()), is(expectedConfig));
    Node datasetNode = NodeUtil.getDataSetForStudy(study);
    assertThat(datasetNode.getProperty(DatasetConstant.NAMESPACE), is("some/namespace"));
    assertThat(datasetNode.getProperty("archiveURI"), is("some:uri"));
    assertThat(datasetNode.getProperty(DatasetConstant.SHOULD_RESOLVE_REFERENCES), is("false"));
    StudyImpl otherStudy = new StudyImpl("my other title", "some source", "some doi", "some citation");
    otherStudy.setOriginatingDataset(dataset);
    StudyNode studySameDataset = getNodeFactory().getOrCreateStudy(otherStudy);
    Node datasetNodeOther = NodeUtil.getDataSetForStudy(studySameDataset);
    assertThat(datasetNode.getId(), is(datasetNodeOther.getId()));
}
Also used : ObjectNode(org.codehaus.jackson.node.ObjectNode) Dataset(org.eol.globi.service.Dataset) Node(org.neo4j.graphdb.Node) DatasetImpl(org.eol.globi.service.DatasetImpl) ObjectMapper(org.codehaus.jackson.map.ObjectMapper) Test(org.junit.Test)

Example 14 with Dataset

use of org.eol.globi.service.Dataset in project eol-globi-data by jhpoelen.

the class NodeFactoryNeo4jTest method assertDataset.

private void assertDataset(String citationKey) {
    DatasetImpl dataset = new DatasetImpl("some/namespace", URI.create("some:uri"));
    ObjectNode objectNode = new ObjectMapper().createObjectNode();
    objectNode.put(DatasetConstant.SHOULD_RESOLVE_REFERENCES, false);
    objectNode.put(citationKey, "some citation");
    dataset.setConfig(objectNode);
    Dataset origDataset = getNodeFactory().getOrCreateDataset(dataset);
    assertThat(origDataset, is(notNullValue()));
    assertThat(origDataset.getArchiveURI().toString(), is("some:uri"));
    assertThat(origDataset.getOrDefault(DatasetConstant.SHOULD_RESOLVE_REFERENCES, "true"), is("false"));
    assertThat(origDataset.getOrDefault(DatasetConstant.CITATION, "no citation"), is("some citation"));
    assertThat(origDataset.getOrDefault(DatasetConstant.LAST_SEEN_AT, "1"), is(not("1")));
    Dataset datasetAnother = getNodeFactory().getOrCreateDataset(dataset);
    assertThat(((DatasetNode) datasetAnother).getNodeID(), is(((DatasetNode) origDataset).getNodeID()));
}
Also used : ObjectNode(org.codehaus.jackson.node.ObjectNode) Dataset(org.eol.globi.service.Dataset) DatasetImpl(org.eol.globi.service.DatasetImpl) ObjectMapper(org.codehaus.jackson.map.ObjectMapper)
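
The last assertion in the helper above checks that calling `getOrCreateDataset` twice with the same dataset yields the same node ID, i.e. that the operation is idempotent. A minimal JDK-only sketch of a get-or-create registry with that property is shown below. The class `GetOrCreateSketch` is hypothetical, and keying on the namespace string is an assumption made for illustration; the source does not show how the node factory actually looks up existing datasets.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class GetOrCreateSketch {

    // Maps a dataset namespace to the ID assigned when it was first seen.
    // Assumption: the namespace alone identifies a dataset in this sketch.
    private final Map<String, Long> idsByNamespace = new HashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // Returns the existing ID for the namespace, or assigns a new one.
    public long getOrCreate(String namespace) {
        return idsByNamespace.computeIfAbsent(namespace, key -> nextId.getAndIncrement());
    }

    public static void main(String[] args) {
        GetOrCreateSketch registry = new GetOrCreateSketch();
        long first = registry.getOrCreate("some/namespace");
        long second = registry.getOrCreate("some/namespace");
        long other = registry.getOrCreate("other/namespace");
        System.out.println(first == second);
        System.out.println(first == other);
    }
}
```

`Map.computeIfAbsent` runs the mapping function only when the key is missing, so a repeated call with the same namespace returns the previously assigned ID instead of creating a second entry.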

Example 15 with Dataset

use of org.eol.globi.service.Dataset in project eol-globi-data by jhpoelen.

the class StudyImporterForSeltmannTest method importSome.

@Test
public void importSome() throws StudyImporterException, IOException {
    StudyImporterForSeltmann importer = new StudyImporterForSeltmann(null, nodeFactory);
    Dataset dataset = new DatasetLocal();
    JsonNode config = new ObjectMapper().readTree("{\"citation\": \"some citation\", \"resources\": {\"archive\": \"seltmann/testArchive.zip\"}}");
    dataset.setConfig(config);
    importer.setDataset(dataset);
    importStudy(importer);
    List<Study> allStudies = NodeUtil.findAllStudies(getGraphDb());
    for (Study allStudy : allStudies) {
        assertThat(allStudy.getSource(), startsWith("Digital Bee Collections Network, 2014 (and updates). Version: 2015-03-18. National Science Foundation grant DBI 0956388"));
        assertThat(allStudy.getCitation(), is("Digital Bee Collections Network, 2014 (and updates). Version: 2015-03-18. National Science Foundation grant DBI 0956388"));
        Iterable<Relationship> specimens = NodeUtil.getSpecimens(allStudy);
        for (Relationship specimen : specimens) {
            SpecimenNode spec = new SpecimenNode(specimen.getEndNode());
            final String recordId = (String) spec.getUnderlyingNode().getProperty("idigbio:recordID");
            assertThat(recordId, is(notNullValue()));
            assertThat(spec.getExternalId(), is(recordId));
            Term basisOfRecord = spec.getBasisOfRecord();
            assertThat(basisOfRecord.getId(), either(is("TEST:PreservedSpecimen")).or(is("TEST:LabelObservation")));
            assertThat(basisOfRecord.getName(), either(is("PreservedSpecimen")).or(is("LabelObservation")));
        }
    }
    assertThat(taxonIndex.findTaxonByName("Megandrena mentzeliae"), is(notNullValue()));
    assertThat(taxonIndex.findTaxonByName("Mentzelia tricuspis"), is(notNullValue()));
}
Also used : Study(org.eol.globi.domain.Study) Dataset(org.eol.globi.service.Dataset) Relationship(org.neo4j.graphdb.Relationship) JsonNode(org.codehaus.jackson.JsonNode) Term(org.eol.globi.domain.Term) SpecimenNode(org.eol.globi.domain.SpecimenNode) DatasetLocal(org.eol.globi.service.DatasetLocal) ObjectMapper(org.codehaus.jackson.map.ObjectMapper) Test(org.junit.Test)

Aggregations

Dataset (org.eol.globi.service.Dataset) 31
Test (org.junit.Test) 16
JsonNode (org.codehaus.jackson.JsonNode) 11
ObjectMapper (org.codehaus.jackson.map.ObjectMapper) 9
DatasetImpl (org.eol.globi.service.DatasetImpl) 9
IOException (java.io.IOException) 6
InputStream (java.io.InputStream) 5
URI (java.net.URI) 5
URL (java.net.URL) 5
ArrayList (java.util.ArrayList) 4
List (java.util.List) 4
DatasetFinder (org.eol.globi.service.DatasetFinder) 4
Is.is (org.hamcrest.core.Is.is) 4
Assert.assertNotNull (org.junit.Assert.assertNotNull) 4
Assert.assertThat (org.junit.Assert.assertThat) 4
CSVParser (com.Ostermiller.util.CSVParser) 3
LabeledCSVParser (com.Ostermiller.util.LabeledCSVParser) 3
Map (java.util.Map) 3
Assert (junit.framework.Assert) 3
IOUtils (org.apache.commons.io.IOUtils) 3