Example 1 with DocMaker

Use of org.apache.lucene.benchmark.byTask.feeds.DocMaker in the lucene-solr project by Apache.

From the class AddDocTask, method setup.

@Override
public void setup() throws Exception {
    super.setup();
    DocMaker docMaker = getRunData().getDocMaker();
    if (docSize > 0) {
        doc = docMaker.makeDocument(docSize);
    } else {
        doc = docMaker.makeDocument();
    }
}
Also used : DocMaker(org.apache.lucene.benchmark.byTask.feeds.DocMaker)

Example 2 with DocMaker

Use of org.apache.lucene.benchmark.byTask.feeds.DocMaker in the lucene-solr project by Apache.

From the class UpdateDocTask, method setup.

@Override
public void setup() throws Exception {
    super.setup();
    DocMaker docMaker = getRunData().getDocMaker();
    if (docSize > 0) {
        doc = docMaker.makeDocument(docSize);
    } else {
        doc = docMaker.makeDocument();
    }
}
Also used : DocMaker(org.apache.lucene.benchmark.byTask.feeds.DocMaker)

Example 3 with DocMaker

Use of org.apache.lucene.benchmark.byTask.feeds.DocMaker in the lucene-solr project by Apache.

From the class ExtractWikipedia, method main.

public static void main(String[] args) throws Exception {
    Path wikipedia = null;
    Path outputDir = Paths.get("enwiki");
    boolean keepImageOnlyDocs = true;
    for (int i = 0; i < args.length; i++) {
        String arg = args[i];
        if (arg.equals("--input") || arg.equals("-i")) {
            wikipedia = Paths.get(args[i + 1]);
            i++;
        } else if (arg.equals("--output") || arg.equals("-o")) {
            outputDir = Paths.get(args[i + 1]);
            i++;
        } else if (arg.equals("--discardImageOnlyDocs") || arg.equals("-d")) {
            keepImageOnlyDocs = false;
        }
    }
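    // --input is required: without it, wikipedia is still null here and the
    // toAbsolutePath() call below would throw a NullPointerException.
    if (wikipedia == null) {
        printUsage();
        return;
    }
    Properties properties = new Properties();
    properties.setProperty("docs.file", wikipedia.toAbsolutePath().toString());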
    properties.setProperty("content.source.forever", "false");
    properties.setProperty("keep.image.only.docs", String.valueOf(keepImageOnlyDocs));
    Config config = new Config(properties);
    ContentSource source = new EnwikiContentSource();
    source.setConfig(config);
    DocMaker docMaker = new DocMaker();
    docMaker.setConfig(config, source);
    docMaker.resetInputs();
    if (Files.exists(wikipedia)) {
        System.out.println("Extracting Wikipedia to: " + outputDir + " using EnwikiContentSource");
        Files.createDirectories(outputDir);
        ExtractWikipedia extractor = new ExtractWikipedia(docMaker, outputDir);
        extractor.extract();
    } else {
        printUsage();
    }
}
Also used : Path(java.nio.file.Path) ContentSource(org.apache.lucene.benchmark.byTask.feeds.ContentSource) EnwikiContentSource(org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource) DocMaker(org.apache.lucene.benchmark.byTask.feeds.DocMaker) Config(org.apache.lucene.benchmark.byTask.utils.Config) Properties(java.util.Properties)
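
The flag loop in main reads args[i + 1] without checking that a value actually follows the flag, so a trailing --input or --output would fail with an ArrayIndexOutOfBoundsException. The sketch below isolates that parsing step using only the JDK and adds a bounds check. The names ArgsSketch, Options, parse and requireValue are hypothetical and are not part of Lucene; this is one possible way to structure the parsing, not how the project does it.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of lucene-solr: parses the same flags as
// ExtractWikipedia.main but rejects a flag that has no value after it.
class ArgsSketch {

    /** Parsed options; the fields mirror the local variables in main. */
    static final class Options {
        final Path input;                // null when --input was not given
        final Path outputDir;            // defaults to "enwiki"
        final boolean keepImageOnlyDocs; // defaults to true

        Options(Path input, Path outputDir, boolean keepImageOnlyDocs) {
            this.input = input;
            this.outputDir = outputDir;
            this.keepImageOnlyDocs = keepImageOnlyDocs;
        }
    }

    static Options parse(String[] args) {
        Path input = null;
        Path outputDir = Paths.get("enwiki");
        boolean keepImageOnlyDocs = true;
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.equals("--input") || arg.equals("-i")) {
                input = Paths.get(requireValue(args, i));
                i++; // skip the value that was just consumed
            } else if (arg.equals("--output") || arg.equals("-o")) {
                outputDir = Paths.get(requireValue(args, i));
                i++;
            } else if (arg.equals("--discardImageOnlyDocs") || arg.equals("-d")) {
                keepImageOnlyDocs = false;
            }
        }
        return new Options(input, outputDir, keepImageOnlyDocs);
    }

    /** Returns the argument after position i, or fails if there is none. */
    private static String requireValue(String[] args, int i) {
        if (i + 1 >= args.length) {
            throw new IllegalArgumentException("Missing value for " + args[i]);
        }
        return args[i + 1];
    }

    public static void main(String[] args) {
        Options options = parse(args);
        if (options.input == null) {
            System.err.println("An input file is required (--input or -i).");
            return;
        }
        System.out.println("input=" + options.input
            + " output=" + options.outputDir
            + " keepImageOnlyDocs=" + options.keepImageOnlyDocs);
    }
}
```

Returning an immutable Options object keeps the parsing testable on its own, separate from the DocMaker and ContentSource setup that follows it in main.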

Example 4 with DocMaker

Use of org.apache.lucene.benchmark.byTask.feeds.DocMaker in the lucene-solr project by Apache.

From the class ReadTokensTask, method setup.

@Override
public void setup() throws Exception {
    super.setup();
    DocMaker docMaker = getRunData().getDocMaker();
    doc = docMaker.makeDocument();
}
Also used : DocMaker(org.apache.lucene.benchmark.byTask.feeds.DocMaker)

Aggregations

DocMaker (org.apache.lucene.benchmark.byTask.feeds.DocMaker): 4
Path (java.nio.file.Path): 1
Properties (java.util.Properties): 1
ContentSource (org.apache.lucene.benchmark.byTask.feeds.ContentSource): 1
EnwikiContentSource (org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource): 1
Config (org.apache.lucene.benchmark.byTask.utils.Config): 1