Examples with Indexer - org.wso2.carbon.registry.indexing.indexer.Indexer

Example 6 with Indexer

use of org.wso2.carbon.registry.indexing.indexer.Indexer in project carbon-apimgt by wso2.

the class RegistryPersistenceUtil method notifyAPIStateChangeToAssociatedDocuments.

/**
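 * Notify document artifacts when an API state change has occurred. This change is required to re-trigger the document
 * indexer so that the document indexes are updated with the new status of the associated API.
 *
 * @param apiArtifact the API artifact whose state changed
 * @param registry the registry used to read and update the associated document resources
 * @throws RegistryException if reading or updating a registry resource fails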
 * @throws APIManagementException
 */
public static void notifyAPIStateChangeToAssociatedDocuments(GenericArtifact apiArtifact, Registry registry) throws RegistryException, APIManagementException {
    Association[] docAssociations = registry.getAssociations(apiArtifact.getPath(), APIConstants.DOCUMENTATION_ASSOCIATION);
    for (Association association : docAssociations) {
        String documentResourcePath = association.getDestinationPath();
        Resource docResource = registry.get(documentResourcePath);
        String oldStateChangeIndicatorStatus = docResource.getProperty(APIConstants.API_STATE_CHANGE_INDICATOR);
        String newStateChangeIndicatorStatus = "false";
        if (oldStateChangeIndicatorStatus != null) {
            newStateChangeIndicatorStatus = String.valueOf(!Boolean.parseBoolean(oldStateChangeIndicatorStatus));
        }
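        // Store the toggled value computed above, so the property changes on every state change
        docResource.setProperty(APIConstants.API_STATE_CHANGE_INDICATOR, newStateChangeIndicatorStatus);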
        registry.put(documentResourcePath, docResource);
    }
}

Also used : Association(org.wso2.carbon.registry.core.Association) Resource(org.wso2.carbon.registry.core.Resource)
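
The indicator computation in the method above can be isolated as a small pure function. This is a minimal sketch, not code from carbon-apimgt: the class name StateChangeIndicatorSketch and the method nextIndicator are hypothetical, and it assumes the property is stored as the strings "true" and "false".

```java
// Hypothetical helper (not part of carbon-apimgt) that mirrors the indicator computation above.
final class StateChangeIndicatorSketch {

    // Returns "false" when no indicator has been stored yet,
    // otherwise the negation of the stored boolean value.
    static String nextIndicator(String oldIndicator) {
        if (oldIndicator == null) {
            return "false";
        }
        return String.valueOf(!Boolean.parseBoolean(oldIndicator));
    }

    public static void main(String[] args) {
        System.out.println(nextIndicator(null));    // prints "false"
        System.out.println(nextIndicator("false")); // prints "true"
        System.out.println(nextIndicator("true"));  // prints "false"
    }
}
```

Flipping the value means each call produces a property value different from the stored one, which fits the Javadoc's stated goal of re-triggering the document indexer.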

Example 7 with Indexer

use of org.wso2.carbon.registry.indexing.indexer.Indexer in project carbon-apimgt by wso2.

the class DocumentIndexer method getIndexedDocument.

public IndexDocument getIndexedDocument(AsyncIndexer.File2Index fileData) throws SolrException, RegistryException {
    IndexDocument indexDocument = super.getIndexedDocument(fileData);
    IndexDocument newIndexDocument = indexDocument;
    Registry registry = GovernanceUtils.getGovernanceSystemRegistry(IndexingManager.getInstance().getRegistry(fileData.tenantId));
    String documentResourcePath = fileData.path.substring(RegistryConstants.GOVERNANCE_REGISTRY_BASE_PATH.length());
    if (documentResourcePath.contains("/apimgt/applicationdata/apis/")) {
        return null;
    }
    if (log.isDebugEnabled()) {
        log.debug("Executing document indexer for resource at " + documentResourcePath);
    }
    Resource documentResource = null;
    Map<String, List<String>> fields = indexDocument.getFields();
    if (registry.resourceExists(documentResourcePath)) {
        documentResource = registry.get(documentResourcePath);
    }
    if (documentResource != null) {
        try {
            fetchRequiredDetailsFromAssociatedAPI(registry, documentResource, fields);
            StringBuilder stringBuilder = new StringBuilder();
            stringBuilder.append(fetchDocumentContent(registry, documentResource));
            if (fields.get(APIConstants.DOC_NAME) != null) {
                stringBuilder.append(APIConstants.DOC_NAME + "=" + StringUtils.join(fields.get(APIConstants.DOC_NAME), ","));
            }
            if (fields.get(APIConstants.DOC_SUMMARY) != null) {
                stringBuilder.append(APIConstants.DOC_SUMMARY + "=" + StringUtils.join(fields.get(APIConstants.DOC_SUMMARY), ","));
            }
            newIndexDocument = new IndexDocument(fileData.path, "", stringBuilder.toString(), indexDocument.getTenantId());
            fields.put(APIConstants.DOCUMENT_INDEXER_INDICATOR, Arrays.asList("true"));
            newIndexDocument.setFields(fields);
        } catch (APIManagementException e) {
            // error occurred while fetching details from the API, but document indexing continues
            log.error("Error while updating indexed document.", e);
        } catch (IOException e) {
            // error occurred while fetching document content, but document indexing continues
            log.error("Error while getting document content.", e);
        }
    }
    return newIndexDocument;
}

Also used : IndexDocument(org.wso2.carbon.registry.indexing.solr.IndexDocument) APIManagementException(org.wso2.carbon.apimgt.api.APIManagementException) Resource(org.wso2.carbon.registry.core.Resource) List(java.util.List) Registry(org.wso2.carbon.registry.core.Registry) IOException(java.io.IOException)
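
How the method above assembles the indexed content string can be sketched without any registry or Solr dependency. This is an illustrative sketch only: the class DocumentContentSketch and the key values "docName" and "docSummary" are assumptions standing in for APIConstants.DOC_NAME and APIConstants.DOC_SUMMARY, whose real values are not shown in the example. String.join is used in place of StringUtils.join.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the content-building step of DocumentIndexer.getIndexedDocument.
final class DocumentContentSketch {

    // Assumed key values; the real APIConstants values are not shown in the example above.
    static final String DOC_NAME = "docName";
    static final String DOC_SUMMARY = "docSummary";

    // Appends "key=v1,v2" for each known field that is present, directly after the document text.
    static String buildContent(String documentText, Map<String, List<String>> fields) {
        StringBuilder sb = new StringBuilder(documentText);
        appendField(sb, fields, DOC_NAME);
        appendField(sb, fields, DOC_SUMMARY);
        return sb.toString();
    }

    private static void appendField(StringBuilder sb, Map<String, List<String>> fields, String key) {
        List<String> values = fields.get(key);
        if (values != null) {
            sb.append(key).append('=').append(String.join(",", values));
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> fields = new HashMap<>();
        fields.put(DOC_NAME, Arrays.asList("guide", "intro"));
        // prints "Getting started. docName=guide,intro"
        System.out.println(buildContent("Getting started. ", fields));
    }
}
```

As in the example above, no separator is inserted between the document text and the appended fields, so the caller's text determines whether they run together.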

Example 8 with Indexer

use of org.wso2.carbon.registry.indexing.indexer.Indexer in project carbon-apimgt by wso2.

the class MSWordIndexer method getIndexedDocument.

public IndexDocument getIndexedDocument(File2Index fileData) throws SolrException {
    try {
        String wordText = null;
        try {
            // Extract MSWord 2003 document files
            POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(fileData.data));
            WordExtractor msWord2003Extractor = new WordExtractor(fs);
            wordText = msWord2003Extractor.getText();
        } catch (OfficeXmlFileException e) {
            // if 2003 extraction failed, try with MSWord 2007 document files extractor
            XWPFDocument doc = new XWPFDocument(new ByteArrayInputStream(fileData.data));
            XWPFWordExtractor msWord2007Extractor = new XWPFWordExtractor(doc);
            wordText = msWord2007Extractor.getText();
        } catch (Exception e) {
            // The reason for not throwing an exception is that since this is an indexer that runs in the background
            // throwing an exception might lead to adverse behaviors in the client side and might lead to
            // other files not being indexed
            String msg = "Failed to extract the document while indexing";
            log.error(msg, e);
        }
        IndexDocument indexDoc = new IndexDocument(fileData.path, wordText, null);
        Map<String, List<String>> fields = new HashMap<String, List<String>>();
        fields.put("path", Collections.singletonList(fileData.path));
        if (fileData.mediaType != null) {
            fields.put(IndexingConstants.FIELD_MEDIA_TYPE, Collections.singletonList(fileData.mediaType));
        } else {
            fields.put(IndexingConstants.FIELD_MEDIA_TYPE, Collections.singletonList("application/pdf"));
        }
        indexDoc.setFields(fields);
        return indexDoc;
    } catch (IOException e) {
        String msg = "Failed to write to the index";
        log.error(msg, e);
        throw new SolrException(ErrorCode.SERVER_ERROR, msg, e);
    }
}

Also used : IndexDocument(org.wso2.carbon.registry.indexing.solr.IndexDocument) HashMap(java.util.HashMap) XWPFWordExtractor(org.apache.poi.xwpf.extractor.XWPFWordExtractor) IOException(java.io.IOException) OfficeXmlFileException(org.apache.poi.poifs.filesystem.OfficeXmlFileException) SolrException(org.apache.solr.common.SolrException) WordExtractor(org.apache.poi.hwpf.extractor.WordExtractor) ByteArrayInputStream(java.io.ByteArrayInputStream) POIFSFileSystem(org.apache.poi.poifs.filesystem.POIFSFileSystem) XWPFDocument(org.apache.poi.xwpf.usermodel.XWPFDocument) List(java.util.List)
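
The media type fallback at the end of the method above can be shown in isolation. The following is a minimal sketch under stated assumptions: MediaTypeFieldsSketch and buildFields are hypothetical names, and the key "mediaType" is assumed to stand in for IndexingConstants.FIELD_MEDIA_TYPE.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the field-building step of an indexer's getIndexedDocument.
final class MediaTypeFieldsSketch {

    // Assumed key; stands in for IndexingConstants.FIELD_MEDIA_TYPE.
    static final String FIELD_MEDIA_TYPE = "mediaType";

    // Builds the index fields, falling back to defaultMediaType when mediaType is null.
    static Map<String, List<String>> buildFields(String path, String mediaType, String defaultMediaType) {
        Map<String, List<String>> fields = new HashMap<>();
        fields.put("path", Collections.singletonList(path));
        String effectiveMediaType = (mediaType != null) ? mediaType : defaultMediaType;
        fields.put(FIELD_MEDIA_TYPE, Collections.singletonList(effectiveMediaType));
        return fields;
    }

    public static void main(String[] args) {
        // Same default value as the example above.
        Map<String, List<String>> fields = buildFields("/docs/a.doc", null, "application/pdf");
        System.out.println(fields.get(FIELD_MEDIA_TYPE).get(0)); // prints "application/pdf"
    }
}
```

Passing the default as a parameter keeps the fallback explicit at the call site, so each indexer can supply the media type that matches the files it handles.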

Example 9 with Indexer

use of org.wso2.carbon.registry.indexing.indexer.Indexer in project carbon-apimgt by wso2.

the class MSPowerpointIndexerTest method testShouldReturnIndexedDocumentWhenParameterCorrect.

@Test
public void testShouldReturnIndexedDocumentWhenParameterCorrect() throws Exception {
    POIFSFileSystem ppExtractor = Mockito.mock(POIFSFileSystem.class);
    PowerPointExtractor powerPointExtractor = Mockito.mock(PowerPointExtractor.class);
    XSLFPowerPointExtractor xslfExtractor = Mockito.mock(XSLFPowerPointExtractor.class);
    XMLSlideShow xmlSlideShow = Mockito.mock(XMLSlideShow.class);
    PowerMockito.whenNew(POIFSFileSystem.class).withParameterTypes(InputStream.class).withArguments(Mockito.any(InputStream.class)).thenThrow(OfficeXmlFileException.class).thenReturn(ppExtractor).thenThrow(APIManagementException.class);
    PowerMockito.whenNew(PowerPointExtractor.class).withParameterTypes(POIFSFileSystem.class).withArguments(ppExtractor).thenReturn(powerPointExtractor);
    PowerMockito.whenNew(XMLSlideShow.class).withParameterTypes(InputStream.class).withArguments(Mockito.any()).thenReturn(xmlSlideShow);
    PowerMockito.whenNew(XSLFPowerPointExtractor.class).withArguments(xmlSlideShow).thenReturn(xslfExtractor);
    Mockito.when(powerPointExtractor.getText()).thenReturn("");
    Mockito.when(xslfExtractor.getText()).thenReturn("");
    MSPowerpointIndexer indexer = new MSPowerpointIndexer();
    IndexDocument ppDoc = indexer.getIndexedDocument(file2Index);
    // should return the default media type when media type is not defined in file2Index
    if (!"application/vnd.ms-powerpoint".equals(ppDoc.getFields().get(IndexingConstants.FIELD_MEDIA_TYPE).get(0))) {
        Assert.fail();
    }
    // should return the media type we have set in the file2Index
    file2Index.mediaType = "text/html";
    ppDoc = indexer.getIndexedDocument(file2Index);
    if (!"text/html".equals(ppDoc.getFields().get(IndexingConstants.FIELD_MEDIA_TYPE).get(0))) {
        Assert.fail();
    }
    // should return the media type we have set in the file2Index even if exception occurred while reading the file
    ppDoc = indexer.getIndexedDocument(file2Index);
    if (!"text/html".equals(ppDoc.getFields().get(IndexingConstants.FIELD_MEDIA_TYPE).get(0))) {
        Assert.fail();
    }
}

Also used : IndexDocument(org.wso2.carbon.registry.indexing.solr.IndexDocument) XSLFPowerPointExtractor(org.apache.poi.xslf.extractor.XSLFPowerPointExtractor) POIFSFileSystem(org.apache.poi.poifs.filesystem.POIFSFileSystem) PowerPointExtractor(org.apache.poi.hslf.extractor.PowerPointExtractor) XMLSlideShow(org.apache.poi.xslf.usermodel.XMLSlideShow) Test(org.junit.Test) PrepareForTest(org.powermock.core.classloader.annotations.PrepareForTest)

Example 10 with Indexer

use of org.wso2.carbon.registry.indexing.indexer.Indexer in project carbon-apimgt by wso2.

the class WSDLIndexerTest method testShouldReturnIndexedDocumentWhenParameterCorrect.

@Test
public void testShouldReturnIndexedDocumentWhenParameterCorrect() throws RegistryException {
    String mediaType = "application/wsdl";
    final String MEDIA_TYPE = "mediaType";
    AsyncIndexer.File2Index file2Index = new AsyncIndexer.File2Index("".getBytes(), null, "", -1234, "");
    WSDLIndexer indexer = new WSDLIndexer();
    // should not add a media type field when media type is not defined in file2Index
    IndexDocument xml = indexer.getIndexedDocument(file2Index);
    if (xml.getFields().get(MEDIA_TYPE) != null) {
        Assert.fail();
    }
    // should return the media type we have set in the file2Index
    file2Index.mediaType = mediaType;
    xml = indexer.getIndexedDocument(file2Index);
    if (!mediaType.equals(xml.getFields().get(MEDIA_TYPE).get(0))) {
        Assert.fail();
    }
}

Also used : IndexDocument(org.wso2.carbon.registry.indexing.solr.IndexDocument) AsyncIndexer(org.wso2.carbon.registry.indexing.AsyncIndexer) Test(org.junit.Test)

Aggregations

IndexDocument (org.wso2.carbon.registry.indexing.solr.IndexDocument) 7
Test (org.junit.Test) 5
Resource (org.wso2.carbon.registry.core.Resource) 4
AsyncIndexer (org.wso2.carbon.registry.indexing.AsyncIndexer) 4
HashMap (java.util.HashMap) 3
POIFSFileSystem (org.apache.poi.poifs.filesystem.POIFSFileSystem) 3
IOException (java.io.IOException) 2
List (java.util.List) 2
WordExtractor (org.apache.poi.hwpf.extractor.WordExtractor) 2
XWPFWordExtractor (org.apache.poi.xwpf.extractor.XWPFWordExtractor) 2
XWPFDocument (org.apache.poi.xwpf.usermodel.XWPFDocument) 2
PrepareForTest (org.powermock.core.classloader.annotations.PrepareForTest) 2
APIProductResource (org.wso2.carbon.apimgt.api.model.APIProductResource) 2
Association (org.wso2.carbon.registry.core.Association) 2
UserRegistry (org.wso2.carbon.registry.core.session.UserRegistry) 2
RegistryConfigLoader (org.wso2.carbon.registry.indexing.RegistryConfigLoader) 2
Indexer (org.wso2.carbon.registry.indexing.indexer.Indexer) 2
ByteArrayInputStream (java.io.ByteArrayInputStream) 1
ArrayList (java.util.ArrayList) 1
LinkedHashMap (java.util.LinkedHashMap) 1