
Example 1 with Bioconductor

Use of org.edamontology.edammap.core.input.csv.Bioconductor in the project edammap by edamontology.

From the class QueryLoader, method get.

public static List<Query> get(String queryPath, QueryType type, Map<EdamUri, Concept> concepts, int timeout, String userAgent) throws IOException, ParseException {
    if (type == QueryType.server) {
        throw new IllegalArgumentException("Query of type \"" + QueryType.server.name() + "\" is not loadable from path, but has to be provided");
    }
    List<? extends InputType> inputs;
    if (type == QueryType.biotools) {
        inputs = Json.load(queryPath, type, timeout, userAgent);
    } else if (type == QueryType.biotools14) {
        inputs = Xml.load(queryPath, type, timeout, userAgent);
    } else {
        inputs = Csv.load(queryPath, type, timeout, userAgent);
    }
    Set<Query> queries = new LinkedHashSet<>();
    String filename = new File(queryPath).getName();
    for (InputType input : inputs) {
        switch(type) {
            case generic:
                queries.add(getGeneric((Generic) input, concepts, filename));
                break;
            case SEQwiki:
                queries.add(getSEQwiki((SEQwiki) input, concepts, filename));
                break;
            case msutils:
                queries.add(getMsutils((Msutils) input, concepts, filename));
                break;
            case Bioconductor:
                queries.add(getBioconductor((Bioconductor) input, concepts));
                break;
            case biotools14:
                queries.add(getBiotools14((Biotools14) input, concepts, filename));
                break;
            case biotools:
                queries.add(getBiotools((Tool) input, concepts, 0, 0, filename));
                break;
            case server:
                break;
        }
    }
    return new ArrayList<>(queries);
}
Also used: LinkedHashSet (java.util.LinkedHashSet), Bioconductor (org.edamontology.edammap.core.input.csv.Bioconductor), Generic (org.edamontology.edammap.core.input.csv.Generic), ArrayList (java.util.ArrayList), Msutils (org.edamontology.edammap.core.input.csv.Msutils), InputType (org.edamontology.edammap.core.input.InputType), Biotools14 (org.edamontology.edammap.core.input.xml.Biotools14), File (java.io.File), SEQwiki (org.edamontology.edammap.core.input.csv.SEQwiki), Tool (org.edamontology.edammap.core.input.json.Tool)
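
The get method above collects its results in a LinkedHashSet and then copies them into an ArrayList. A minimal, self-contained sketch of that pattern follows. The class DedupSketch, the method dedupKeepOrder, and the sample strings are hypothetical and not part of edammap. In QueryLoader.get the effect would depend on how Query defines equals and hashCode, which this excerpt does not show.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; not part of edammap.
final class DedupSketch {

    // Collect into a LinkedHashSet first, then copy into a list:
    // equal elements are stored once and first-seen order is kept.
    static <T> List<T> dedupKeepOrder(Iterable<T> items) {
        Set<T> seen = new LinkedHashSet<>();
        for (T item : items) {
            seen.add(item); // no effect if an equal element is already present
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> raw = List.of("blast", "hmmer", "blast", "clustal", "hmmer");
        System.out.println(dedupKeepOrder(raw)); // prints [blast, hmmer, clustal]
    }
}
```

A plain HashSet would also remove duplicates, but it gives no guarantee about iteration order, so the returned list could no longer follow the order of the input file.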

Example 2 with Bioconductor

Use of org.edamontology.edammap.core.input.csv.Bioconductor in the project edammap by edamontology.

From the class Csv, method load.

public static List<InputType> load(String queryPath, QueryType type, int timeout, String userAgent) throws IOException, ParseException {
    List<InputType> inputs = new ArrayList<>();
    BeanListProcessor<? extends InputType> rowProcessor;
    switch(type) {
        case SEQwiki:
            rowProcessor = new BeanListProcessor<SEQwiki>(SEQwiki.class);
            break;
        case msutils:
            rowProcessor = new BeanListProcessor<Msutils>(Msutils.class);
            break;
        case Bioconductor:
            rowProcessor = new BeanListProcessor<Bioconductor>(Bioconductor.class);
            break;
        default:
            rowProcessor = new BeanListProcessor<Generic>(Generic.class);
            break;
    }
    rowProcessor.setStrictHeaderValidationEnabled(false);
    CsvParserSettings settings = new CsvParserSettings();
    settings.setProcessor(rowProcessor);
    settings.setHeaderExtractionEnabled(true);
    settings.setAutoConfigurationEnabled(true);
    // reading on the calling thread is (slightly) more efficient when the input is small
    settings.setReadInputOnSeparateThread(false);
    settings.setSkipEmptyLines(true);
    settings.trimValues(true);
    settings.setMaxCharsPerColumn(100000);
    settings.getFormat().setDelimiter(',');
    settings.getFormat().setQuote('"');
    settings.getFormat().setQuoteEscape('"');
    settings.getFormat().setCharToEscapeQuoteEscaping('"');
    settings.getFormat().setLineSeparator("\n");
    settings.getFormat().setComment('#');
    try (InputStreamReader reader = new InputStreamReader(Input.newInputStream(queryPath, true, timeout, userAgent), StandardCharsets.UTF_8)) {
        CsvParser parser = new CsvParser(settings);
        parser.parse(reader);
    }
    int i = 0;
    for (InputType inputType : rowProcessor.getBeans()) {
        inputType.check(++i);
        inputs.add(inputType);
    }
    logger.debug("Loaded {} CSV entries from {} of type {}", inputs.size(), queryPath, type);
    return inputs;
}
Also used: Bioconductor (org.edamontology.edammap.core.input.csv.Bioconductor), InputStreamReader (java.io.InputStreamReader), Generic (org.edamontology.edammap.core.input.csv.Generic), ArrayList (java.util.ArrayList), Msutils (org.edamontology.edammap.core.input.csv.Msutils), CsvParserSettings (com.univocity.parsers.csv.CsvParserSettings), CsvParser (com.univocity.parsers.csv.CsvParser), SEQwiki (org.edamontology.edammap.core.input.csv.SEQwiki)
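
After parsing, the load method above validates every bean with inputType.check(++i), so the first data row is reported as row 1. The sketch below illustrates that 1-based validation loop using only the standard library. The Row interface, the NamedRow class, and the IllegalArgumentException are assumptions made for illustration. They do not reproduce edammap's InputType or the exception its check method actually throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of edammap.
final class RowCheckSketch {

    // Hypothetical stand-in for a parsed CSV row.
    interface Row {
        void check(int rowNumber);
    }

    // A row whose only required field is a non-empty name (an assumption).
    static final class NamedRow implements Row {
        private final String name;

        NamedRow(String name) {
            this.name = name;
        }

        @Override
        public void check(int rowNumber) {
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("\"name\" missing in row " + rowNumber);
            }
        }
    }

    // Validate rows with a 1-based counter; stops at the first invalid row.
    static List<Row> checkAll(List<? extends Row> rows) {
        List<Row> checked = new ArrayList<>();
        int i = 0;
        for (Row row : rows) {
            row.check(++i); // pre-increment: the first row is checked as row 1
            checked.add(row);
        }
        return checked;
    }

    public static void main(String[] args) {
        List<NamedRow> rows = List.of(new NamedRow("limma"), new NamedRow(""));
        try {
            checkAll(rows);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "name" missing in row 2
        }
    }
}
```

Passing the row number into the check keeps the error message tied to a position in the input file, which is easier to act on than a zero-based list index.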

Aggregations

ArrayList (java.util.ArrayList): 2
Bioconductor (org.edamontology.edammap.core.input.csv.Bioconductor): 2
Generic (org.edamontology.edammap.core.input.csv.Generic): 2
Msutils (org.edamontology.edammap.core.input.csv.Msutils): 2
SEQwiki (org.edamontology.edammap.core.input.csv.SEQwiki): 2
CsvParser (com.univocity.parsers.csv.CsvParser): 1
CsvParserSettings (com.univocity.parsers.csv.CsvParserSettings): 1
File (java.io.File): 1
InputStreamReader (java.io.InputStreamReader): 1
LinkedHashSet (java.util.LinkedHashSet): 1
InputType (org.edamontology.edammap.core.input.InputType): 1
Tool (org.edamontology.edammap.core.input.json.Tool): 1
Biotools14 (org.edamontology.edammap.core.input.xml.Biotools14): 1