
Learning Lucene: Creating an Index

程序员文章站 (Programmer Article Station), 2022-07-09 13:45:51

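1. Implementation steps

Step 1: Create a Maven project.

Step 2: Create an IndexWriter object.

  • Specify where the index is stored, using a Directory object.
  • Specify an analyzer, which analyzes (tokenizes) the document content.

Step 3: Create a Document object.

Step 4: Create Field objects and add them to the Document.

Step 5: Use the IndexWriter to write the Document to the index. This is where the index is actually built: the index entries and the Document's stored fields are written to the index directory.

Step 6: Close the IndexWriter.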
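
To make the analysis and indexing steps less abstract, here is a small conceptual sketch of an inverted index written with only the JDK. It is not Lucene's implementation or API: the class ToyInvertedIndex and its methods are hypothetical names chosen for illustration, and the "analyzer" simply splits on non-alphanumeric characters and lowercases each token.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Conceptual sketch only: NOT Lucene's implementation or API.
class ToyInvertedIndex {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    // "Stored field": the document name, kept so it can be returned from a search
    private final List<String> storedNames = new ArrayList<>();

    // "Analysis": split on non-letter/non-digit characters and lowercase each token.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // "addDocument": store the name, analyze the content, record the postings.
    // Returns the id assigned to the new document.
    int addDocument(String name, String content) {
        int docId = storedNames.size();
        storedNames.add(name);
        for (String term : analyze(content)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns the names of the documents that contain the given term.
    List<String> search(String term) {
        List<String> names = new ArrayList<>();
        Set<Integer> docIds =
                postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
        for (int docId : docIds) {
            names.add(storedNames.get(docId));
        }
        return names;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("a.txt", "Lucene is a search library");
        index.addDocument("b.txt", "Creating an index with Lucene");
        System.out.println(index.search("lucene")); // prints [a.txt, b.txt]
        System.out.println(index.search("index"));  // prints [b.txt]
    }
}
```

The content itself is not kept here, only the terms extracted from it, while the name is stored and returned with search hits. That is roughly the distinction between an indexed-only field and a stored field in the steps above.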

2. The pom file

I am using the latest Lucene version at the time of writing (7.5.0). The Maven dependencies are as follows:

<dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-core -->
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-core</artifactId>
      <version>7.5.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-analyzers-common -->
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-analyzers-common</artifactId>
      <version>7.5.0</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-queryparser -->
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-queryparser</artifactId>
      <version>7.5.0</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/commons-io/commons-io -->
    <dependency>
      <groupId>commons-io</groupId>
      <artifactId>commons-io</artifactId>
      <version>2.6</version>
    </dependency>

  </dependencies>
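
The three Lucene artifacts should stay on the same version, so one option is to declare the version once as a Maven property. A minimal sketch, where the property name lucene.version is my own choice and not something from the original pom:

```xml
<properties>
  <lucene.version>7.5.0</lucene.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>${lucene.version}</version>
  </dependency>
  <!-- lucene-analyzers-common and lucene-queryparser would reference
       ${lucene.version} in the same way -->
</dependencies>
```

With this layout, upgrading Lucene means changing a single line.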

3. Source code

package com.wuzheng.lucene;

import org.apache.commons.io.FileUtils;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class Indexer {

    // Directory where the index is stored
    private static final String INDEX_DIR = "D:/index";

    // Directory containing the source files to index
    private static final String FILE_DIR = "D:/file";

    /**
     * Creates the index:
     * 1. Create an IndexWriter (it needs a Directory and an analyzer).
     * 2. Convert each source file into a Document and add it to the index.
     * 3. Close the IndexWriter.
     */
    public void createIndex() throws IOException {
        // Create the analyzer; the default StandardAnalyzer is used here
        StandardAnalyzer analyzer = new StandardAnalyzer();
        // Create the IndexWriterConfig
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);

        // try-with-resources closes the writer and the directory even if indexing fails
        try (Directory directory = FSDirectory.open(Paths.get(INDEX_DIR));
             IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig)) {
            // Source documents could be files, database rows, web pages, and so on
            for (File file : getSourceFiles(FILE_DIR)) {
                indexWriter.addDocument(fileToDocument(file));
            }
        }
    }

    private Document fileToDocument (File file) throws IOException {
        Document document = new Document();
        StringField stringField = new StringField("fileName", file.getName(), Field.Store.YES);
        TextField textField = new TextField("fileContent", FileUtils.readFileToString(file,"UTF-8"), Field.Store.NO);
        document.add(stringField);
        document.add(textField);
        return document;
    }

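    private List<File> getSourceFiles(String fileDir) {
        // Keep regular files only; subdirectories cannot be read as text
        File[] files = new File(fileDir).listFiles(File::isFile);
        // listFiles returns null when fileDir is not a readable directory
        return Arrays.asList(files != null ? files : new File[0]);
    }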

    public static void main(String[] args) {
        Indexer indexer = new Indexer();
        try {
            indexer.createIndex();
            System.out.println("create index success");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }


}
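
File.listFiles() returns null when the path is not a directory, and an unfiltered listing also includes subdirectories, which cannot be read as text. As an alternative way to collect the source files, here is a sketch based on java.nio.file. The class SourceFiles and its methods are hypothetical helpers, not part of the Indexer class above, and the demo works in a temporary directory instead of D:/file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of the original Indexer class.
class SourceFiles {

    // Returns the regular files directly inside dir, sorted by path.
    // Returns an empty list when dir does not exist or is not a directory.
    static List<Path> listRegularFiles(Path dir) {
        if (!Files.isDirectory(dir)) {
            return Collections.emptyList();
        }
        // Files.list returns a stream that must be closed, hence try-with-resources
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .sorted()
                          .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo: builds a temporary directory with two files and one subdirectory,
    // then returns the file names that listRegularFiles finds.
    static List<String> demoFileNames() {
        try {
            Path dir = Files.createTempDirectory("source-files");
            Files.write(dir.resolve("b.txt"), "second".getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("a.txt"), "first".getBytes(StandardCharsets.UTF_8));
            Files.createDirectory(dir.resolve("nested"));

            List<String> names = new ArrayList<>();
            for (Path path : listRegularFiles(dir)) {
                names.add(path.getFileName().toString());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoFileNames()); // prints [a.txt, b.txt]
    }
}
```

The subdirectory named nested is skipped, and a missing directory yields an empty list instead of an exception.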

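4. Test results

When the run succeeds, the console prints "create index success" and Lucene's index files appear in D:/index.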

Reposted from: https://my.oschina.net/codeTec/blog/2877921