Clear Special Characters
程序员文章站
2022-07-15 09:19:45
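import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;

// Segment the text into words with IK Analyzer (second argument true = smart mode).
public static String[] analyzer(String string) {
    List<String> list = new ArrayList<String>();
    try {
        StringReader reader = new StringReader(string);
        IKSegmenter ik = new IKSegmenter(reader, true);
        Lexeme lexeme = null;
        while ((lexeme = ik.next()) != null) {
            list.add(lexeme.getLexemeText());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return list.toArray(new String[list.size()]);
}

// Split the text into tags on commas and whitespace, after stripping special characters.
public static String[] generate(String string) {
    List<String> list = new ArrayList<String>();
    string = clear_special_character(string);
    String[] tags = string.split("[,\\s]");
    for (String tag : tags) {
        tag = tag.trim();
        if (tag.length() > 0) {
            list.add(tag);
        }
    }
    return list.toArray(new String[list.size()]);
}

// Replace every Unicode punctuation (\pP) or symbol (\pS) character with a space,
// then collapse each run of whitespace into a single space.
public static String clear_special_character(String string) {
    string = string.replaceAll("\\pP|\\pS", " ");
    string = string.replaceAll("\\s+", " ");
    return string;
}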
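To try the cleanup step without IK Analyzer on the classpath, the following is a minimal JDK-only sketch. The class name `SpecialCharDemo` and the camelCase method names are illustrative and not part of the original snippet; the two regular expressions are the same ones used above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the regular expressions come from the snippet above.
class SpecialCharDemo {

    // Replace Unicode punctuation (\pP) and symbols (\pS) with spaces,
    // then collapse runs of whitespace into a single space.
    public static String clearSpecialCharacter(String s) {
        return s.replaceAll("\\pP|\\pS", " ").replaceAll("\\s+", " ");
    }

    // Split the cleaned text on whitespace and drop empty pieces.
    public static String[] generate(String s) {
        List<String> tags = new ArrayList<>();
        for (String tag : clearSpecialCharacter(s).split("\\s")) {
            if (!tag.isEmpty()) {
                tags.add(tag);
            }
        }
        return tags.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints "java|regex|unicode|tags"
        System.out.println(String.join("|", generate("java, regex; (unicode+tags)!")));
    }
}
```

Note that the cleaned string can still carry a single leading or trailing space (for example when the input ends in punctuation), which is why the tag loop filters out empty pieces.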