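Solr 6.3.0 configuration problem: whose UTF8 encoding is longer than the max length 32766
Problem description: after documents were ingested, searching by criteria in the Solr web UI returned none of the expected field content. The Solr server log showed the following errors: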
2018-05-09 08:51:27.736 ERROR (http-nio-8032-exec-32) [c:comos s:shard3 r:core_node12 x:core3] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document id fytxtex5788 to the index; possible analysis error: Document contains at least one immense term in field="html" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[60, 109, 101, 116, 97, 32, 104, 116, 116, 112, 45, 101, 113, 117, 105, 118, 61, 34, 67, 111]...', original message: bytes can be at most 32766 in length; got 40928. Perhaps the document has an indexed string field (solr.StrField) which is too large
2018-05-09 08:51:28.629 ERROR (http-nio-8032-exec-32) [c:comos s:shard3 r:core_node12 x:core3] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document id fytxtex5788 to the index; possible analysis error: Document contains at least one immense term in field="html" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[60, 109, 101, 116, 97, 32, 104, 116, 116, 112, 45, 101, 113, 117, 105, 118, 61, 34, 67, 111]...', original message: bytes can be at most 32766 in length; got 40928. Perhaps the document has an indexed string field (solr.StrField) which is too large
Key error: ERROR o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document id to the index; possible analysis error: Document contains at least one immense term in field="html" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[……]...', original message: bytes can be at most 32766 in length; got 40928. Perhaps the document has an indexed string field (solr.StrField) which is too large.
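Cause:
As other reports online describe, the error shows that the field named "html" failed at indexing time because its value exceeded a built-in limit. The log points at an indexed string field (solr.StrField): such a field is indexed as one single term, and a single term may be at most 32766 bytes of UTF-8. The html value of this document was 40928 bytes, so the write was rejected.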
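To make the log message easier to read, the short Python sketch below decodes the byte prefix that Solr printed and checks a value against the 32766-byte limit. The helper names decode_prefix and exceeds_term_limit are made up for this illustration; only the byte list and the limit come from the log.

```python
# Limit on the UTF-8 length of one indexed term, as quoted in the log.
MAX_TERM_BYTES = 32766

# Prefix of the offending term, copied from the log message.
prefix = [60, 109, 101, 116, 97, 32, 104, 116, 116, 112, 45,
          101, 113, 117, 105, 118, 61, 34, 67, 111]


def decode_prefix(byte_values):
    """Turn the byte list printed by Solr back into readable text."""
    return bytes(byte_values).decode("utf-8", errors="replace")


def exceeds_term_limit(value, limit=MAX_TERM_BYTES):
    """Return True if the value is too large to be indexed as one term.

    The limit applies to UTF-8 bytes, not to characters, so non-ASCII
    text reaches it with far fewer characters.
    """
    return len(value.encode("utf-8")) > limit


print(decode_prefix(prefix))            # <meta http-equiv="Co
print(exceeds_term_limit("a" * 40928))  # True
```

The decoded prefix shows that the field holds raw HTML markup, which is exactly the kind of long value that should be tokenized rather than indexed as a single string.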
Solution:
Since Solr 6, the managed-schema file replaces the old schema.xml by default, and the resulting schema can be inspected in the web admin UI.
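The admin UI shows that the html field is matched by a Dynamic Field whose name is * and whose Type is string. So the fix is to locate the managed-schema file and, in <dynamicField name="*" type="string" indexed="true" stored="false" multiValued="true" />, change the type to type="text_general".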
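The edit can be done by hand in a text editor. As a rough sketch of the same change in script form, the Python below rewrites the type attribute on a trimmed, hypothetical schema string; SAMPLE_SCHEMA and retype_dynamic_field are illustrative names, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed managed-schema; a real file has many more entries.
SAMPLE_SCHEMA = """<schema name="example" version="1.6">
  <dynamicField name="*" type="string" indexed="true" stored="false" multiValued="true" />
</schema>"""


def retype_dynamic_field(schema_xml, field_name, new_type):
    """Return schema_xml with the type of the matching dynamicField replaced."""
    root = ET.fromstring(schema_xml)
    changed = 0
    for node in root.iter("dynamicField"):
        if node.get("name") == field_name:
            node.set("type", new_type)
            changed += 1
    if changed == 0:
        raise ValueError(f"no dynamicField named {field_name!r}")
    return ET.tostring(root, encoding="unicode")


updated = retype_dynamic_field(SAMPLE_SCHEMA, "*", "text_general")
print(updated)
```

ElementTree discards XML comments when parsing with its default settings, so for a real managed-schema it is safer to keep a backup and compare the result before uploading it.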
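Restart ZooKeeper, re-run zkcli.sh to upload the modified configuration, and restart Solr. Once the admin UI showed the type as text_general, documents were indexed without errors.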
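The original does not give the exact zkcli.sh invocation. A typical upload uses the upconfig command, and the sketch below only assembles that command line so it can be reviewed before running. The ZooKeeper address, config directory, and config name are placeholders, and the script path assumes the standard Solr 6 layout.

```python
import shlex


def build_upconfig_command(zk_host, conf_dir, conf_name,
                           script="server/scripts/cloud-scripts/zkcli.sh"):
    """Build the argument list that uploads a config directory to ZooKeeper."""
    return [script, "-zkhost", zk_host, "-cmd", "upconfig",
            "-confdir", conf_dir, "-confname", conf_name]


# All three values are placeholders, not taken from the original setup.
cmd = build_upconfig_command("localhost:2181", "/path/to/conf", "my_config")
print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command instead of executing it keeps the sketch free of side effects; the list can be handed to subprocess.run once the placeholders are replaced with real values.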