Lucene wiki translation: How to improve indexing speed (part 3)

Programmer Article Station 2022-04-14 12:26:09


Original: http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
Navigation: Lucene-java Wiki -> 1 Overview -> 1.1 Informational -> 1.1.1 BasicsOfPerformance -> 1.1.1.4 ImproveIndexingSpeed
Note: "red" marks text I did not know or was unsure how to translate. "Blue" marks my own comments.
Status: complete

8. Add fields to the Document in the same order

The original says:

Always add fields in the same order to your Document, when using stored fields or term vectors

Lucene's merging has an optimization whereby stored fields and term vectors can be bulk-byte-copied, but the optimization only applies if the field name -> number mapping is the same across segments. Future Lucene versions may attempt to assign the same mapping automatically (see LUCENE-1737), but until then the only way to get the same mapping is to always add the same fields in the same order to each document you index.

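Add fields to the Document in the same order. This is what everyone normally does anyway. When Lucene merges segments, it has an optimization that lets stored fields and term vectors be bulk-byte-copied, but the optimization only applies if the field name -> number mapping is identical across all segments. Future Lucene versions may assign the same mapping automatically (see LUCENE-1737), but for now the only way to get the same mapping is to always add the same fields in the same order to every Document you index.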
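To make the name -> number mapping concrete, here is a minimal sketch in plain Java. It is not Lucene's internal code: `FieldNumberingSketch` and `numberFields` are hypothetical names, and the sketch assumes field numbers are handed out in first-seen order, which is what the passage above implies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: this is NOT Lucene's internal code.
final class FieldNumberingSketch {

    // Assign each field name the next free number, in first-seen order
    // (an assumption made for this sketch).
    static Map<String, Integer> numberFields(List<List<String>> documents) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (List<String> fieldNames : documents) {
            for (String name : fieldNames) {
                // mapping.size() is read before the insert, so numbers start at 0.
                mapping.putIfAbsent(name, mapping.size());
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Two "segments" whose documents add the same fields in a different order.
        Map<String, Integer> segmentA = numberFields(List.of(List.of("id", "title", "body")));
        Map<String, Integer> segmentB = numberFields(List.of(List.of("title", "id", "body")));
        System.out.println("segment A: " + segmentA);
        System.out.println("segment B: " + segmentB);
        // The bulk-copy optimization needs identical mappings.
        System.out.println("mappings identical: " + segmentA.equals(segmentB));
    }
}
```

With the two orders above, "id" and "title" end up with swapped numbers, so the mappings differ. Adding the fields in one fixed order everywhere makes them match.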

9. Reuse a single Token instance in your Analyzer

The original says:

Re-use a single Token instance in your analyzer

Analyzers often create a new Token for each term in sequence that needs to be indexed from a Field. You can save substantial GC cost by re-using a single Token instance instead.

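Reuse a single Token instance in your Analyzer. For a Field that needs to be indexed, an Analyzer typically creates a new Token object for each term in it. You can cut garbage collection cost by reusing one Token instead.

Someone else's translation:

Use a single Token instance in your Analyzer. Sharing a single Token instance within the analyzer also relieves GC pressure.

Sadly, I have never actually used Token up to now. What is going on???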
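Since I have not used Token myself, here is a rough sketch of the reuse pattern in plain Java rather than real Lucene code. `SimpleToken` and `WhitespaceSplitter` are hypothetical names made up for illustration. The caller allocates one token and the splitter refills it for every term.

```java
// Hypothetical sketch: SimpleToken and WhitespaceSplitter are illustrative
// names, not Lucene classes.
final class SimpleToken {
    String text;
    int startOffset;
    int endOffset;

    // Overwrite the previous term's data and return the same instance.
    SimpleToken reinit(String text, int startOffset, int endOffset) {
        this.text = text;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        return this;
    }
}

final class WhitespaceSplitter {
    private final String input;
    private int pos = 0;

    WhitespaceSplitter(String input) {
        this.input = input;
    }

    // Fills the caller-supplied token with the next term and returns it,
    // or returns null when the input is exhausted.
    SimpleToken next(SimpleToken reusable) {
        int n = input.length();
        while (pos < n && Character.isWhitespace(input.charAt(pos))) {
            pos++; // skip separators
        }
        if (pos >= n) {
            return null;
        }
        int start = pos;
        while (pos < n && !Character.isWhitespace(input.charAt(pos))) {
            pos++; // consume one term
        }
        return reusable.reinit(input.substring(start, pos), start, pos);
    }

    public static void main(String[] args) {
        WhitespaceSplitter splitter = new WhitespaceSplitter("fast indexing with lucene");
        SimpleToken token = new SimpleToken(); // allocated once, reused for every term
        while (splitter.next(token) != null) {
            System.out.println(token.text + " [" + token.startOffset + "," + token.endOffset + ")");
        }
    }
}
```

Only one `SimpleToken` is ever created here, however many terms the input has. The `substring` call still allocates a String per term, which is the cost the next tip is about.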

10. Use the char[] API in Token instead of the String API to represent token text

The original says:

Use the char[] API in Token instead of the String API to represent token Text

As of Lucene 2.3, a Token can represent its text as a slice into a char array, which saves the GC cost of new'ing and then reclaiming（回收） String instances. By re-using a single Token instance and using the char[] API you can avoid new'ing any objects for each term. See Token for details.

represent ... as: "to depict ... as" (把…描绘成). Ugh, I had forgotten even this.

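Someone else's translation:

As of Lucene 2.3, a Token can represent its text as a slice of a char array. This avoids the cost of constructing String instances and of the GC reclaiming them. By combining a single reused Token instance with the char[] API, you can avoid creating any new objects per term. For more details, see Token.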
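The idea of keeping the text in a char array can be sketched in plain Java as follows. `CharToken` and its methods are hypothetical names for illustration, not Lucene's actual Token API. The point is only that the term text is copied into a buffer that is allocated once and grown only when a term does not fit.

```java
// Hypothetical sketch: CharToken is an illustrative class, not Lucene's Token.
final class CharToken {
    private char[] buffer = new char[16];
    private int length = 0;

    // Copy a slice of the source array into the internal buffer,
    // allocating a bigger buffer only when the slice does not fit.
    void setText(char[] source, int offset, int len) {
        if (len > buffer.length) {
            buffer = new char[Math.max(len, buffer.length * 2)];
        }
        System.arraycopy(source, offset, buffer, 0, len);
        length = len;
    }

    // Only the first length() chars of the returned array are valid.
    char[] buffer() {
        return buffer;
    }

    int length() {
        return length;
    }

    // Allocates a String, so use it for debugging only, not on the hot path.
    @Override
    public String toString() {
        return new String(buffer, 0, length);
    }

    public static void main(String[] args) {
        char[] input = "fast indexing with lucene".toCharArray();
        CharToken token = new CharToken(); // one token, one buffer
        int start = 0;
        for (int i = 0; i <= input.length; i++) {
            if (i == input.length || input[i] == ' ') {
                if (i > start) {
                    token.setText(input, start, i - start);
                    System.out.println(token); // toString used here only for display
                }
                start = i + 1;
            }
        }
    }
}
```

A consumer that reads `buffer()` and `length()` directly never needs a String, so terms shorter than the buffer cause no allocation at all.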

Related tags: lucene, indexing speed, improvement, java