Example 11 with SimpleFSDirectory

use of org.apache.lucene.store.SimpleFSDirectory in project jspwiki by apache.

the class LuceneSearchProvider method pageRemoved.

/**
 *  {@inheritDoc}
 */
public void pageRemoved(WikiPage page) {
    IndexWriter writer = null;
    try {
        Directory luceneDir = new SimpleFSDirectory(new File(m_luceneDirectory), null);
        writer = getIndexWriter(luceneDir);
        Query query = new TermQuery(new Term(LUCENE_ID, page.getName()));
        writer.deleteDocuments(query);
    } catch (Exception e) {
        log.error("Unable to remove page '" + page.getName() + "' from Lucene index", e);
    } finally {
        close(writer);
    }
}
Also used : TermQuery(org.apache.lucene.search.TermQuery) Query(org.apache.lucene.search.Query) IndexWriter(org.apache.lucene.index.IndexWriter) Term(org.apache.lucene.index.Term) SimpleFSDirectory(org.apache.lucene.store.SimpleFSDirectory) File(java.io.File) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) NoRequiredPropertyException(org.apache.wiki.api.exceptions.NoRequiredPropertyException) InternalWikiException(org.apache.wiki.InternalWikiException) ParseException(org.apache.lucene.queryparser.classic.ParseException) LockObtainFailedException(org.apache.lucene.store.LockObtainFailedException) InvalidTokenOffsetsException(org.apache.lucene.search.highlight.InvalidTokenOffsetsException) IOException(java.io.IOException) ProviderException(org.apache.wiki.api.exceptions.ProviderException) Directory(org.apache.lucene.store.Directory)

Example 12 with SimpleFSDirectory

use of org.apache.lucene.store.SimpleFSDirectory in project jspwiki by apache.

the class LuceneSearchProvider method doFullLuceneReindex.

/**
 *  Performs a full Lucene reindex, if necessary.
 *
 *  @throws IOException If there's a problem during indexing
 */
protected void doFullLuceneReindex() throws IOException {
    File dir = new File(m_luceneDirectory);
    String[] filelist = dir.list();
    if (filelist == null) {
        throw new IOException("Invalid Lucene directory: cannot produce listing: " + dir.getAbsolutePath());
    }
    try {
        if (filelist.length == 0) {
            // No files? Reindex!
            Date start = new Date();
            IndexWriter writer = null;
            log.info("Starting Lucene reindexing, this can take a couple of minutes...");
            Directory luceneDir = new SimpleFSDirectory(dir, null);
            try {
                writer = getIndexWriter(luceneDir);
                Collection allPages = m_engine.getPageManager().getAllPages();
                for (Iterator iterator = allPages.iterator(); iterator.hasNext(); ) {
                    WikiPage page = (WikiPage) iterator.next();
                    try {
                        String text = m_engine.getPageManager().getPageText(page.getName(), WikiProvider.LATEST_VERSION);
                        luceneIndexPage(page, text, writer);
                    } catch (IOException e) {
                        log.warn("Unable to index page " + page.getName() + ", continuing to next ", e);
                    }
                }
                Collection allAttachments = m_engine.getAttachmentManager().getAllAttachments();
                for (Iterator iterator = allAttachments.iterator(); iterator.hasNext(); ) {
                    Attachment att = (Attachment) iterator.next();
                    try {
                        String text = getAttachmentContent(att.getName(), WikiProvider.LATEST_VERSION);
                        luceneIndexPage(att, text, writer);
                    } catch (IOException e) {
                        log.warn("Unable to index attachment " + att.getName() + ", continuing to next", e);
                    }
                }
            } finally {
                close(writer);
            }
            Date end = new Date();
            log.info("Full Lucene index finished in " + (end.getTime() - start.getTime()) + " milliseconds.");
        } else {
            log.info("Files found in Lucene directory, not reindexing.");
        }
    } catch (NoClassDefFoundError e) {
        log.info("Lucene libraries do not exist - not using Lucene.");
    } catch (IOException e) {
        log.error("Problem while creating Lucene index - not using Lucene.", e);
    } catch (ProviderException e) {
        log.error("Problem reading pages while creating Lucene index (JSPWiki won't start.)", e);
        throw new IllegalArgumentException("unable to create Lucene index");
    } catch (Exception e) {
        log.error("Unable to start lucene", e);
    }
}
Also used : ProviderException(org.apache.wiki.api.exceptions.ProviderException) WikiPage(org.apache.wiki.WikiPage) Attachment(org.apache.wiki.attachment.Attachment) IOException(java.io.IOException) SimpleFSDirectory(org.apache.lucene.store.SimpleFSDirectory) Date(java.util.Date) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) NoRequiredPropertyException(org.apache.wiki.api.exceptions.NoRequiredPropertyException) InternalWikiException(org.apache.wiki.InternalWikiException) ParseException(org.apache.lucene.queryparser.classic.ParseException) LockObtainFailedException(org.apache.lucene.store.LockObtainFailedException) InvalidTokenOffsetsException(org.apache.lucene.search.highlight.InvalidTokenOffsetsException) IndexWriter(org.apache.lucene.index.IndexWriter) Iterator(java.util.Iterator) Collection(java.util.Collection) File(java.io.File) Directory(org.apache.lucene.store.Directory)
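
The reindex decision in doFullLuceneReindex depends on one check: is the index directory readable, and is it empty? The following is a minimal sketch of that check alone, written with java.nio.file rather than File.list(). The class and method names (EmptyIndexCheck, isEmptyDirectory) are hypothetical and are not part of JSPWiki or Lucene.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class EmptyIndexCheck {

    /**
     * Returns true when dir exists, is a directory, and has no entries.
     * Throws IOException when dir is not a directory, mirroring the
     * "cannot produce listing" failure in the example above.
     */
    public static boolean isEmptyDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            throw new IOException("Invalid index directory: " + dir.toAbsolutePath());
        }
        // The stream is closed automatically; only the first entry is inspected.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index-check");
        System.out.println(isEmptyDirectory(dir)); // true: nothing written yet
        Files.createFile(dir.resolve("segments_1")); // stand-in for an index file
        System.out.println(isEmptyDirectory(dir)); // false: a reindex would be skipped
    }
}
```

Inspecting only the first entry avoids building a full listing when the directory already holds a large index.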

Example 13 with SimpleFSDirectory

use of org.apache.lucene.store.SimpleFSDirectory in project spoon by INRIA.

the class HunspellService method loadDictionary.

/**
 * Loads the hunspell dictionary for the given locale.
 *
 * @param locale       The locale of the hunspell dictionary to be loaded.
 * @param nodeSettings The node level settings
 * @param env          The node environment (from which the conf path will be resolved)
 * @return The loaded Hunspell dictionary
 * @throws Exception when loading fails (due to IO errors or malformed dictionary files)
 */
private Dictionary loadDictionary(String locale, Settings nodeSettings, Environment env) throws Exception {
    if (logger.isDebugEnabled()) {
        logger.debug("Loading hunspell dictionary [{}]...", locale);
    }
    Path dicDir = hunspellDir.resolve(locale);
    if (FileSystemUtils.isAccessibleDirectory(dicDir, logger) == false) {
        throw new ElasticsearchException(String.format(Locale.ROOT, "Could not find hunspell dictionary [%s]", locale));
    }
    // merging node settings with hunspell dictionary specific settings
    Settings dictSettings = HUNSPELL_DICTIONARY_OPTIONS.get(nodeSettings);
    nodeSettings = loadDictionarySettings(dicDir, dictSettings.getByPrefix(locale + "."));
    boolean ignoreCase = nodeSettings.getAsBoolean("ignore_case", defaultIgnoreCase);
    Path[] affixFiles = FileSystemUtils.files(dicDir, "*.aff");
    if (affixFiles.length == 0) {
        throw new ElasticsearchException(String.format(Locale.ROOT, "Missing affix file for hunspell dictionary [%s]", locale));
    }
    if (affixFiles.length != 1) {
        throw new ElasticsearchException(String.format(Locale.ROOT, "Too many affix files exist for hunspell dictionary [%s]", locale));
    }
    InputStream affixStream = null;
    Path[] dicFiles = FileSystemUtils.files(dicDir, "*.dic");
    List<InputStream> dicStreams = new ArrayList<>(dicFiles.length);
    try {
        for (int i = 0; i < dicFiles.length; i++) {
            dicStreams.add(Files.newInputStream(dicFiles[i]));
        }
        affixStream = Files.newInputStream(affixFiles[0]);
        try (Directory tmp = new SimpleFSDirectory(env.tmpFile())) {
            return new Dictionary(tmp, "hunspell", affixStream, dicStreams, ignoreCase);
        }
    } catch (Exception e) {
        logger.error("Could not load hunspell dictionary [{}]", e, locale);
        throw e;
    } finally {
        IOUtils.close(affixStream);
        IOUtils.close(dicStreams);
    }
}
Also used : Path(java.nio.file.Path) Dictionary(org.apache.lucene.analysis.hunspell.Dictionary) InputStream(java.io.InputStream) ArrayList(java.util.ArrayList) ElasticsearchException(org.elasticsearch.ElasticsearchException) SimpleFSDirectory(org.apache.lucene.store.SimpleFSDirectory) IOException(java.io.IOException) Settings(org.elasticsearch.common.settings.Settings) Directory(org.apache.lucene.store.Directory)
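
loadDictionary requires exactly one *.aff file in the locale directory and fails otherwise. The same rule can be sketched with the JDK alone, assuming a glob lookup via Files.newDirectoryStream in place of the project's FileSystemUtils.files helper. AffixFileLookup and findSingleAffixFile are hypothetical names, and IllegalStateException stands in for ElasticsearchException.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class AffixFileLookup {

    /** Returns the single *.aff file in dicDir, or throws if there are zero or several. */
    public static Path findSingleAffixFile(Path dicDir) throws IOException {
        List<Path> matches = new ArrayList<>();
        // The glob is matched against each entry's file name.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dicDir, "*.aff")) {
            for (Path p : stream) {
                matches.add(p);
            }
        }
        if (matches.isEmpty()) {
            throw new IllegalStateException("Missing affix file in " + dicDir);
        }
        if (matches.size() != 1) {
            throw new IllegalStateException("Too many affix files in " + dicDir);
        }
        return matches.get(0);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hunspell");
        Files.createFile(dir.resolve("en_US.aff"));
        System.out.println(findSingleAffixFile(dir).getFileName());
    }
}
```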

Example 14 with SimpleFSDirectory

use of org.apache.lucene.store.SimpleFSDirectory in project eol-globi-data by jhpoelen.

the class TaxonCacheService method initTaxonIdMap.

private void initTaxonIdMap() throws PropertyEnricherException {
    try {
        LOG.info("taxon lookup service instantiating...");
        File luceneDir = new File(getCacheDir().getAbsolutePath(), "lucene");
        boolean preexisting = luceneDir.exists();
        createCacheDir(luceneDir, isTemporary());
        TaxonLookupServiceImpl taxonLookupService = new TaxonLookupServiceImpl(new SimpleFSDirectory(luceneDir));
        taxonLookupService.setMaxHits(getMaxTaxonLinks());
        taxonLookupService.start();
        if (!isTemporary() && preexisting) {
            LOG.info("pre-existing taxon lookup index found, no need to re-index...");
        } else {
            LOG.info("no pre-existing taxon lookup index found, re-indexing...");
            int count = 0;
            LOG.info("taxon map loading [" + taxonMapResource + "] ...");
            StopWatch watch = new StopWatch();
            watch.start();
            BufferedReader reader = createBufferedReader(taxonMapResource);
            final LabeledCSVParser labeledCSVParser = CSVTSVUtil.createLabeledTSVParser(reader);
            while (labeledCSVParser.getLine() != null) {
                Taxon provided = TaxonMapParser.parseProvidedTaxon(labeledCSVParser);
                Taxon resolved = TaxonMapParser.parseResolvedTaxon(labeledCSVParser);
                addIfNeeded(taxonLookupService, provided.getExternalId(), resolved.getExternalId());
                addIfNeeded(taxonLookupService, provided.getName(), resolved.getExternalId());
                addIfNeeded(taxonLookupService, resolved.getName(), resolved.getExternalId());
                count++;
            }
            watch.stop();
            logCacheLoadStats(watch.getTime(), count);
            LOG.info("taxon map loading [" + taxonMapResource + "] done.");
        }
        taxonLookupService.finish();
        this.taxonLookupService = taxonLookupService;
        LOG.info("taxon lookup service instantiating done.");
    } catch (IOException e) {
        throw new PropertyEnricherException("problem initiating taxon cache index", e);
    }
}
Also used : PropertyEnricherException(org.eol.globi.service.PropertyEnricherException) Taxon(org.eol.globi.domain.Taxon) BufferedReader(java.io.BufferedReader) LabeledCSVParser(com.Ostermiller.util.LabeledCSVParser) IOException(java.io.IOException) File(java.io.File) SimpleFSDirectory(org.apache.lucene.store.SimpleFSDirectory) StopWatch(org.apache.commons.lang3.time.StopWatch)

Example 15 with SimpleFSDirectory

use of org.apache.lucene.store.SimpleFSDirectory in project eol-globi-data by jhpoelen.

the class TaxonLookupServiceImpl method start.

@Override
public void start() {
    try {
        if (indexDir == null) {
            indexPath = new File(System.getProperty("java.io.tmpdir") + "/taxon" + System.currentTimeMillis());
            LOG.info("index directory at [" + indexPath + "] created.");
            // FileUtils.forceDeleteOnExit(indexPath);
            indexDir = new SimpleFSDirectory(indexPath);
        }
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, null);
        indexWriter = new IndexWriter(indexDir, config);
    } catch (IOException e) {
        throw new RuntimeException("failed to create indexWriter, cannot continue", e);
    }
}
Also used : IndexWriter(org.apache.lucene.index.IndexWriter) IOException(java.io.IOException) File(java.io.File) SimpleFSDirectory(org.apache.lucene.store.SimpleFSDirectory) IndexWriterConfig(org.apache.lucene.index.IndexWriterConfig)
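
The start method above derives its fallback index path from java.io.tmpdir plus the current time in milliseconds, so two instances started within the same millisecond would get the same path. A hedged alternative, not taken from the project, is to let the JDK pick a unique name and create the directory in one step. TempIndexDir and createTempIndexDir are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempIndexDir {

    /**
     * Creates a new, uniquely named directory under the default
     * temporary-file location and returns its path.
     */
    public static Path createTempIndexDir(String prefix) throws IOException {
        return Files.createTempDirectory(prefix);
    }

    public static void main(String[] args) throws IOException {
        Path indexPath = createTempIndexDir("taxon");
        // The directory exists on disk at this point, so the log line is accurate.
        System.out.println("index directory at [" + indexPath + "] created.");
    }
}
```

In Lucene versions whose SimpleFSDirectory constructor takes a java.io.File, as in this example, the result can be passed as indexPath.toFile().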

Aggregations

SimpleFSDirectory (org.apache.lucene.store.SimpleFSDirectory): 37
Directory (org.apache.lucene.store.Directory): 23
Path (java.nio.file.Path): 15
IOException (java.io.IOException): 13
File (java.io.File): 9
IndexWriter (org.apache.lucene.index.IndexWriter): 9
FSDirectory (org.apache.lucene.store.FSDirectory): 7
Settings (org.elasticsearch.common.settings.Settings): 7
LockObtainFailedException (org.apache.lucene.store.LockObtainFailedException): 6
CorruptIndexException (org.apache.lucene.index.CorruptIndexException): 5
IndexSearcher (org.apache.lucene.search.IndexSearcher): 5
FilterDirectory (org.apache.lucene.store.FilterDirectory): 5
IndexInput (org.apache.lucene.store.IndexInput): 5
InputStream (java.io.InputStream): 4
ParameterizedMessage (org.apache.logging.log4j.message.ParameterizedMessage): 4
Dictionary (org.apache.lucene.analysis.hunspell.Dictionary): 4
IndexReader (org.apache.lucene.index.IndexReader): 4
IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig): 4
MMapDirectory (org.apache.lucene.store.MMapDirectory): 4
NIOFSDirectory (org.apache.lucene.store.NIOFSDirectory): 4