Example 21 with PerFieldAnalyzerWrapper

Use of org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper in project ansj_seg by NLPchina.

From the class IndexTest, method indexTest.

@Test
public void indexTest() throws CorruptIndexException, LockObtainFailedException, IOException, ParseException {
    PerFieldAnalyzerWrapper analyzer = new PerFieldAnalyzerWrapper(new AnsjAnalyzer(TYPE.index_ansj));
    Directory directory = null;
    IndexWriter iwriter = null;
    IndexWriterConfig ic = new IndexWriterConfig(analyzer);
    // Create the in-memory index
    directory = new RAMDirectory();
    iwriter = new IndexWriter(directory, ic);
    addContent(iwriter, "助推企业转型升级提供强有力的技术支持和服保障。中心的建成将使青岛的服务器承载能力突破10万台,达到世界一流水平。");
    addContent(iwriter, "涉及民生的部分商品和服务成本监审政策");
    addContent(iwriter, "我穿着和服");
    iwriter.commit();
    iwriter.close();
    System.out.println("Index build complete");
    Analyzer queryAnalyzer = new AnsjAnalyzer(AnsjAnalyzer.TYPE.dic_ansj);
    System.out.println("index ok to search!");
    // Phrase query for "和服" ("kimono"); all three indexed sentences contain this character sequence
    search(queryAnalyzer, directory, "\"和服\"");
}
Also used: AnsjAnalyzer(org.ansj.lucene5.AnsjAnalyzer) Analyzer(org.apache.lucene.analysis.Analyzer) PerFieldAnalyzerWrapper(org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper) RAMDirectory(org.apache.lucene.store.RAMDirectory) Directory(org.apache.lucene.store.Directory) Test(org.junit.Test)

Example 22 with PerFieldAnalyzerWrapper

Use of org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper in project ansj_seg by NLPchina.

From the class IndexAndTest, method test.

@Test
public void test() throws Exception {
    DicLibrary.put(DicLibrary.DEFAULT, "../../library/default.dic");
    PerFieldAnalyzerWrapper analyzer = new PerFieldAnalyzerWrapper(new AnsjAnalyzer(TYPE.index_ansj));
    Directory directory = null;
    IndexWriter iwriter = null;
    IndexWriterConfig ic = new IndexWriterConfig(analyzer);
    String text = "[工程名称]赣州市南康区第四中学学生公寓、食堂及附属工程\n" + "[关键信息]赣州市南康区第四中学.; 房屋建筑工 程施工总承包叁级以上(含叁级)资质;无;本工程授权委托人(注册建造师)须提供劳动合同和投标公司为其缴交的社保证明(社保证明时间为2015 年 11 月至 2016 年 1 月)原件,须提供加盖当地社保局业务章的社保手册或花名册(含姓名、社保查询号或身份证号、缴费基数和缴费凭证)或基本养老保险个人帐户对账单;如果建造师是法人代表的,则提供:身份证、法人代表资格证、建造师注册证书及其相应的 B 类安全生产考核合格证;               须提交公司或投标项目所 在地的检察机关出具的投标公司和投标公 司拟派项目负责人的《关于行贿犯罪档案 查询通知书》 。;小胡开银诚、中灿两家,赣州市南康区第四中学学生公寓、食堂及附属工程,本项目投资 857.9 万元,开标时间:2016 年 03 月 01 日 10:00,投标保证金的金额:15 万元,保证金到账截止时间为 2016 年 2月 26 日 17:00 时。介绍信2000元/家,报名费600元/家,保证金老板自己打。开标老板:黄思婷 134-0707-4912    委托人:胡童科。;2000.0;\n" + "[其他信息]赣州分公司;赣州分公司;投标申请单-20160226-1;3607821602050117-1.JXZF;;";
    System.out.println(IndexAnalysis.parse(text));
    // Create the in-memory index
    directory = new RAMDirectory();
    iwriter = new IndexWriter(directory, ic);
    addContent(iwriter, text);
    iwriter.commit();
    iwriter.close();
    System.out.println("Index build complete");
    Analyzer queryAnalyzer = new AnsjAnalyzer(AnsjAnalyzer.TYPE.index_ansj);
    System.out.println("index ok to search!");
    for (Term t : IndexAnalysis.parse(text)) {
        System.out.println(t.getName());
        search(queryAnalyzer, directory, "\"" + t.getName() + "\"");
    }
}
Also used: AnsjAnalyzer(org.ansj.lucene5.AnsjAnalyzer) Term(org.ansj.domain.Term) Analyzer(org.apache.lucene.analysis.Analyzer) PerFieldAnalyzerWrapper(org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper) RAMDirectory(org.apache.lucene.store.RAMDirectory) Directory(org.apache.lucene.store.Directory) Test(org.junit.Test)
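
Both examples above pass only a default analyzer to PerFieldAnalyzerWrapper, so every field is analyzed with AnsjAnalyzer. The wrapper also has a constructor that takes a Map<String, Analyzer> of field-specific analyzers, falling back to the default for any field not in the map. The following is a minimal, stdlib-only sketch of that lookup-with-fallback idea. It is a toy model and not Lucene code: the class PerFieldDispatchSketch and the two stand-in analyzers are hypothetical names introduced purely for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model of per-field analyzer dispatch. This is NOT Lucene code;
// all names here are hypothetical and exist only for illustration.
class PerFieldDispatchSketch {
    // Stand-in for an analyzer: maps a field's text to a list of tokens.
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> fieldAnalyzers;

    PerFieldDispatchSketch(Function<String, List<String>> defaultAnalyzer,
                           Map<String, Function<String, List<String>>> fieldAnalyzers) {
        this.defaultAnalyzer = defaultAnalyzer;
        this.fieldAnalyzers = new HashMap<>(fieldAnalyzers);
    }

    // Use the analyzer registered for this field name, else the default one.
    List<String> analyze(String fieldName, String text) {
        return fieldAnalyzers.getOrDefault(fieldName, defaultAnalyzer).apply(text);
    }

    // Hypothetical default analyzer: lowercase, then split on whitespace.
    static List<String> whitespaceLowercase(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Hypothetical keyword-style analyzer: the whole text is one token.
    static List<String> keyword(String text) {
        return Arrays.asList(text);
    }

    public static void main(String[] args) {
        Map<String, Function<String, List<String>>> perField = new HashMap<>();
        perField.put("id", PerFieldDispatchSketch::keyword);

        PerFieldDispatchSketch wrapper = new PerFieldDispatchSketch(
                PerFieldDispatchSketch::whitespaceLowercase, perField);

        // "id" has its own analyzer; "text" falls back to the default.
        System.out.println(wrapper.analyze("id", "Doc 42"));
        System.out.println(wrapper.analyze("text", "Hello Lucene World"));
    }
}
```

In the real API the same shape would presumably look like passing the map as the second constructor argument, with AnsjAnalyzer as the default and, for example, a keyword-style analyzer for identifier fields. The examples above do not do this, so treat that as a possible extension rather than what the ansj_seg tests demonstrate.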

Aggregations

Analyzer (org.apache.lucene.analysis.Analyzer): 22
PerFieldAnalyzerWrapper (org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper): 22
HashMap (java.util.HashMap): 12
RAMDirectory (org.apache.lucene.store.RAMDirectory): 11
IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig): 10
StandardAnalyzer (org.apache.lucene.analysis.standard.StandardAnalyzer): 8
Document (org.apache.lucene.document.Document): 8
TextField (org.apache.lucene.document.TextField): 8
IndexWriter (org.apache.lucene.index.IndexWriter): 8
Field (org.apache.lucene.document.Field): 7
Directory (org.apache.lucene.store.Directory): 6
Test (org.junit.Test): 6
LowerCaseFilter (org.apache.lucene.analysis.LowerCaseFilter): 4
Tokenizer (org.apache.lucene.analysis.Tokenizer): 4
WhitespaceAnalyzer (org.apache.lucene.analysis.core.WhitespaceAnalyzer): 4
StandardTokenizer (org.apache.lucene.analysis.standard.StandardTokenizer): 4
DirectoryReader (org.apache.lucene.index.DirectoryReader): 4
IOException (java.io.IOException): 3
Map (java.util.Map): 3
SKOSAnalyzer (at.ac.univie.mminf.luceneSKOS.analysis.SKOSAnalyzer): 2