Lucene tokenization error: "TokenStream contract violation: close() call missing"
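When tokenizing with IKAnalyzer in Lucene, the following error is raised: "TokenStream contract violation: close() call missing". The fix is to call close() on the TokenStream every time you finish consuming it.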
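If the error is instead java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, then reset() must be called before tokenStream.incrementToken(). The cause is a change in how TokenStream is used starting with Lucene 4.6.0: reset() must be called before the first call to incrementToken(). See the API documentation at http://lucene.apache.org/core/4_6_0/core/index.html for details.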
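To make the two error messages easier to tell apart, the contract can be modelled without Lucene at all. The sketch below is a toy, assumed model: the class ToyTokenStream, its setInput() and term() methods, and the whitespace splitting are all hypothetical and are not Lucene's implementation. It only mimics the rule that a stream must be reset before it is consumed and closed before it is given new input.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the reset()/incrementToken()/close() contract. Not Lucene code. */
final class ToyTokenStream {
    private boolean open = false;          // true between reset() and close()
    private String[] words = new String[0];
    private int pos;
    private String term;

    /** Hypothetical stand-in for handing the stream new input. */
    void setInput(String text) {
        if (open) {
            // The previous use was never closed.
            throw new IllegalStateException(
                "TokenStream contract violation: close() call missing");
        }
        String trimmed = text.trim();
        words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    void reset() {
        pos = 0;
        open = true;
    }

    boolean incrementToken() {
        if (!open) {
            // Consuming without reset(), or after close().
            throw new IllegalStateException(
                "TokenStream contract violation: reset()/close() call missing");
        }
        if (pos >= words.length) {
            return false;
        }
        term = words[pos++];
        return true;
    }

    String term() {
        return term;
    }

    void close() {
        open = false;
    }

    /** The correct workflow: set input, reset, consume, always close. */
    static List<String> tokenize(ToyTokenStream ts, String text) {
        List<String> terms = new ArrayList<>();
        ts.setInput(text);
        try {
            ts.reset();
            while (ts.incrementToken()) {
                terms.add(ts.term());
            }
        } finally {
            ts.close(); // without this, the next setInput() call fails
        }
        return terms;
    }
}
```

In this model, calling tokenize() repeatedly on the same instance works because every call ends with close(). Skipping reset() triggers the "reset()/close() call missing" message, and skipping close() triggers the "close() call missing" message on the next use.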
The following is a correct example. It calls reset() before the incrementToken() loop and closes the stream when it is done:
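import java.io.IOException;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;

// "analyzer" is a field of the enclosing class, e.g. an IKAnalyzer instance.
public Set<String> slicing(String text) {
    Set<String> result = new HashSet<>();
    // try-with-resources guarantees close() is called, even if an exception is thrown.
    try (StringReader reader = new StringReader(text);
         TokenStream tokenStream = analyzer.tokenStream("", reader)) {
        // addAttribute() returns the existing attribute or registers it if absent.
        CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
        OffsetAttribute offsetAttribute = tokenStream.addAttribute(OffsetAttribute.class);
        tokenStream.reset(); // must be called before the first incrementToken()
        while (tokenStream.incrementToken()) {
            int startOffset = offsetAttribute.startOffset();
            int endOffset = offsetAttribute.endOffset();
            // Keep only terms that span more than one character.
            if ((endOffset - startOffset) > 1) {
                result.add(charTermAttribute.toString());
            }
        }
        tokenStream.end(); // signals that the end of the stream was reached
    } catch (IOException e) {
        e.printStackTrace();
    }
    return result;
}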
http://www.lizi.pw/archives/56
org.wltea.analyzer.lucene.IKAnalyzer
Exception in thread "main" java.lang.IllegalStateException: 词典尚未初始化,请先调用initial方法
    at org.wltea.analyzer.dic.Dictionary.getSingleton(Dictionary.java:137)
    at org.wltea.analyzer.core.CJKSegmenter.analyze(CJKSegmenter.java:80)
    at org.wltea.analyzer.core.IKSegmenter.next(IKSegmenter.java:116)
    at org.wltea.analyzer.lucene.IKTokenizer.incrementToken(IKTokenizer.java:88)
(The message reads: "The dictionary has not been initialized; call the initial method first.")