case:
Dictionary: 二, 二氧化碳, 高, 高温, 高温催化
Sentence: 二氧化碳高温催化剂报告
Expected extraction result: 二氧化碳, 高温催化
Current extraction result: [[0:1]=二, [0:4]=二氧化碳, [4:6]=高温, [4:8]=高温催化]
    @Test
    public void test() {
        // Collect test data set
        TreeMap<String, String> map = new TreeMap<String, String>();
        String[] keyArray = new String[]{
            "二", "二氧化碳", "高温", "高温", "高温催化"
        };
        for (String key : keyArray) {
            map.put(key, key);
        }
        // Build an AhoCorasickDoubleArrayTrie
        AhoCorasickDoubleArrayTrie<String> acdat = new AhoCorasickDoubleArrayTrie<>();
        acdat.build(map);
        // Test it
        final String text = "二氧化碳高温催化剂报告";
        List<AhoCorasickDoubleArrayTrie.Hit<String>> wordList = acdat.parseText(text);
        System.out.println(wordList);
    }
Output:
[[0:1]=二, [0:4]=二氧化碳, [4:6]=高温, [4:8]=高温催化]
Use the AhoCorasickDoubleArrayTrieSegment provided by HanLP.
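For anyone who wants to see what the expected result amounts to, here is a minimal sketch that post-filters the hits from the output above and keeps only the longest match at each start position, skipping anything the kept match already covers. It is plain Java with no library dependency. `Span` and `keepLongest` are hypothetical names made up for this illustration, not part of AhoCorasickDoubleArrayTrie or HanLP, and the hit offsets are copied by hand from the reported output.

```java
import java.util.ArrayList;
import java.util.List;

class LongestMatchFilter {
    /** Hypothetical stand-in for one hit: half-open range [begin, end) plus the matched word. */
    static final class Span {
        final int begin;
        final int end;
        final String word;

        Span(int begin, int end, String word) {
            this.begin = begin;
            this.end = end;
            this.word = word;
        }
    }

    /** Keeps the longest hit at each start position, scanning left to right without overlaps. */
    static List<Span> keepLongest(List<Span> hits, int textLength) {
        // best[i] holds the longest hit that starts at offset i, or null if none does
        Span[] best = new Span[textLength];
        for (Span h : hits) {
            if (best[h.begin] == null || h.end > best[h.begin].end) {
                best[h.begin] = h;
            }
        }
        List<Span> result = new ArrayList<>();
        int i = 0;
        while (i < textLength) {
            if (best[i] != null) {
                result.add(best[i]);
                i = best[i].end; // jump past the kept match
            } else {
                i++;             // no dictionary word starts here
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "二氧化碳高温催化剂报告";
        // Hits copied from the output reported in this issue
        List<Span> hits = new ArrayList<>();
        hits.add(new Span(0, 1, "二"));
        hits.add(new Span(0, 4, "二氧化碳"));
        hits.add(new Span(4, 6, "高温"));
        hits.add(new Span(4, 8, "高温催化"));

        for (Span s : keepLongest(hits, text.length())) {
            System.out.println(s.word);
        }
    }
}
```

With the four hits from this issue, the sketch prints 二氧化碳 and 高温催化. It is a greedy left-to-right filter, so it is only meant to illustrate the idea; whether it matches the segmenter's behaviour in every case is not something this sketch verifies.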