Word Frequency Counter Update
The code has two branches: (1) enter the path of a text file, or (2) type the article in directly.
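import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Scanner;
import java.util.StringTokenizer;

public static void main(String[] args) {
    // Maps each word to its count; sorted later for output
    HashMap<String, Integer> map = new HashMap<String, Integer>();
    // Intended to filter all punctuation out of the string. Note that replaceAll
    // treats this as a regex sequence, not a set of characters, so it almost never
    // matches; tokens such as "suppose.It" in the results keep their punctuation.
    String regex = " ?.!:,\"\"'';\n";
    BufferedReader br;
    try {
        // FileReader (constructed in the file branch) creates a Reader that reads a file's contents
        Scanner scan = new Scanner(System.in);
        System.out.println("Please choose your input format");
        System.out.println("1. Full file path");
        System.out.println("2. Article text");
        int flag = scan.nextInt();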
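Depending on the choice, the program enters a different branch.
Feature 1: small-file input, with the command typed at the console.
Enter the path of the text file at the console to run the word frequency count.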
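The excerpts end each branch with `break;`, which suggests a `switch` on `flag`, although the post does not show it. The following is a minimal sketch under that assumption. `MenuSketch` and `readChoiceAndLine` are hypothetical names, and a `Scanner` over a `String` stands in for `System.in` so the sketch runs unattended. It also shows one detail of `Scanner`: `nextInt()` leaves the line ending unread, so a `nextLine()` call is needed before reading a full line.

```java
import java.util.Scanner;

class MenuSketch {
    // Hypothetical helper: reads the menu choice, then one full line of input,
    // and reports which branch would handle that line.
    static String[] readChoiceAndLine(Scanner scan) {
        int flag = scan.nextInt();
        scan.nextLine();               // consume the line ending left by nextInt()
        String line = scan.nextLine(); // whole line, spaces included
        switch (flag) {
            case 1:
                return new String[] {"file", line}; // line is a file path
            case 2:
                return new String[] {"text", line}; // line is the article itself
            default:
                return new String[] {"unknown", line};
        }
    }

    public static void main(String[] args) {
        Scanner scan = new Scanner("2\nHowever mean your life is\n");
        String[] result = readChoiceAndLine(scan);
        System.out.println(result[0] + ": " + result[1]);
    }
}
```

By contrast, `Scanner.next()` returns a single whitespace-delimited token, which is fine for a path without spaces but not for a whole article.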
        System.out.println("Please enter the full file path");
        String fileUrl = scan.next();
        br = new BufferedReader(new FileReader(fileUrl)); // full path of the file
        String sentence;
        int wordCount = 0;
        try {
            // readLine returns null once the end of the file is reached
            while ((sentence = br.readLine()) != null) {
                sentence = sentence.replaceAll(regex, "");
                StringTokenizer token = new StringTokenizer(sentence);
                while (token.hasMoreTokens()) { // loop over the tokens
                    wordCount++;
                    String word = token.nextToken();
                    // HashMap keys are unique, which is what makes per-word counting work
                    if (map.containsKey(word)) {
                        int count = map.get(word);
                        map.put(word, count + 1); // word already present: increment its count
                    } else {
                        map.put(word, 1); // new word: insert it with a count of 1
                    }
                }
            }
            br.close(); // release the file handle
            System.out.println("Total words: " + wordCount);
            sort(map);
        } catch (IOException e) {
            e.printStackTrace();
        }
        break;
Run result:
Please choose your input format
1. Full file path
2. Article text
1
Please enter the full file path
c://english.txt
Total words: 181
as:7
the:7
not:6
it:6
to:5
are:4
a:4
your:4
in:4
they:3
live:3
and:3
of:2
do:2
may:2
by:2
be:2
clothes:2
that:2
often:2
have:2
from:2
above:2
is:2
you:2
door:1
its:1
suppose.It:1
palace.The:1
contentedly:1
snow:1
friends,Turn:1
yourself:1
means.which:1
or:1
windows:1
life,poor:1
bad:1
quiet:1
like:1
without:1
thoughts.:1
simply:1
abode;the:1
change.Sell:1
will:1
some:1
fault-finder:1
herb,like:1
before:1
most:1
I:1
old,return:1
trouble:1
life:1
change;we:1
supported:1
is.You:1
spring.:1
me:1
mind:1
town;but:1
there,and:1
paradise.Love:1
hardnames.It:1
is,meet:1
should:1
seem:1
independent:1
new:1
alms-house:1
poor-house.The:1
pleasant,thrilling,glorious:1
;do:1
garden:1
happens:1
keep:1
but:1
However:1
reflected:1
being:1
brightly:1
enough:1
Cultivate:1
any.May:1
looks:1
more:1
sage.Do:1
town's:1
when:1
faults:1
richest.The:1
disreputable.:1
think:1
get:1
so:1
much:1
lives:1
perhaps:1
early:1
things,whether:1
call:1
dishonest:1
sun:1
shun:1
melts:1
setting:1
them.Things:1
poverty:1
poorest:1
mean:1
receive:1
find:1
hourss,even:1
thoughts,as:1
rich:1
poor:1
man's:1
cheering:1
great:1
see:1
supporting:1
themselves:1
misgiving.Most:1
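Tokens such as suppose.It and friends,Turn in the listing show that punctuation survives the filtering step. One possible fix, sketched here as an assumption rather than as the author's method, is to put the punctuation inside a regex character class and replace each match with a space, so that adjacent words are separated instead of fused. `PunctuationSketch` and `countWords` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

class PunctuationSketch {
    // Hypothetical helper: counts the words in one line of text.
    static Map<String, Integer> countWords(String line) {
        Map<String, Integer> counts = new HashMap<>();
        // A character class [...] matches any ONE of the listed characters.
        // Replacing with a space (not "") keeps "suppose.It" from becoming "supposeIt".
        String cleaned = line.replaceAll("[?.!:,;\"']", " ");
        StringTokenizer tokens = new StringTokenizer(cleaned);
        while (tokens.hasMoreTokens()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the stored count.
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("It is not so bad as you suppose.It looks poorest"));
    }
}
```

This sketch stays case-sensitive, as the original does, so "It" and "it" are still counted separately; lowercasing each token before counting would merge them.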
Feature 2: accept the file name of an English text on the command line.
>wf english.txt
total 181 words
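The post does not include the code behind the `wf` command. A minimal sketch of what such an entry point might look like follows, assuming the file name arrives as the first command-line argument and that words are whitespace-separated tokens, as in the file branch. `WfFileSketch` and `countWords` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.StringTokenizer;

class WfFileSketch {
    // Counts whitespace-separated tokens in a file.
    static long countWords(Path file) throws IOException {
        long total = 0;
        for (String line : Files.readAllLines(file)) { // decodes as UTF-8
            total += new StringTokenizer(line).countTokens();
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: wf <file>");
            return;
        }
        System.out.println("total " + countWords(Paths.get(args[0])) + " words");
    }
}
```

`Files.readAllLines` loads the whole file into memory, which suits the small files described here; a `BufferedReader` loop would be the safer choice for very large texts.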
Feature 3: accept the name of a directory containing English text files on the command line and count them in a batch.
>dir folder
gone_with_the_wand
runbinson
janelove
>wf folder
gone_with_the_wand
total 1234567 words
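The batch mode is likewise described without code. The sketch below is one way it could work, assuming that only regular files directly inside the directory are counted and that subdirectories are skipped. `WfDirSketch` and `countAll` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class WfDirSketch {
    // Returns file name -> token count for every regular file directly inside dir.
    static Map<String, Long> countAll(Path dir) throws IOException {
        Map<String, Long> totals = new TreeMap<>(); // sorted by file name for stable output
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and other non-files
                }
                long total = 0;
                for (String line : Files.readAllLines(file)) {
                    total += new StringTokenizer(line).countTokens();
                }
                totals.put(file.getFileName().toString(), total);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: wf <folder>");
            return;
        }
        for (Map.Entry<String, Long> e : countAll(Paths.get(args[0])).entrySet()) {
            System.out.println(e.getKey());
            System.out.println("total " + e.getValue() + " words");
        }
    }
}
```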
Feature 4: read a single English text from the console.
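        System.out.println("Please enter the article text");
        scan.nextLine(); // discard the line ending left behind by nextInt()
        // The sentence or paragraph to analyze; next() would return only the first word
        String sentence2 = scan.nextLine();
        System.out.println(sentence2);
        int wordCount2 = 0; // total number of words
        // Maps each word to its count; sorted later for output
        HashMap<String, Integer> map2 = new HashMap<String, Integer>();
        // Split on comma, space, ? . ! : " ' and newline
        String delims = ", ?.!:\"\"''\n";
        // StringTokenizer breaks the string into tokens at the delimiters
        StringTokenizer token = new StringTokenizer(sentence2, delims);
        while (token.hasMoreTokens()) { // loop over the tokens
            wordCount2++;
            String word = token.nextToken();
            // HashMap keys are unique, which is what makes per-word counting work
            if (map2.containsKey(word)) {
                int count = map2.get(word);
                map2.put(word, count + 1); // word already present: increment its count
            } else {
                map2.put(word, 1); // new word: insert it with a count of 1
            }
        }
        System.out.println("Total words: " + wordCount2);
        sort(map2); // sort and print
        break;
    }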
Run result:
Please choose your input format
1. Full file path
2. Article text
2
Please enter the article text
However mean your life is,meet it and live it ;do not shun it and call it hardnames.It is not so bad as you suppose.It looks poorest when you are richest.The fault-finder will find faults in paradise.Love your life,poor as it is.You may perhaps have some pleasant,thrilling,glorious hourss,even in a poor-house.The setting sun is reflected from the windows of the alms-house as brightly as from the rich man's abode;the snow melts before its door as early in the spring. I do not see but a quiet mind may live as contentedly there,and have as cheering thoughts,as in a palace.The town's poor seem to me often to live the most independent lives of any.May be they are simply great enough to receive without misgiving.Most think that they are above being supported by the town;but it often happens that they are not above supporting themselves by dishonest means.which should be more disreputable.Cultivate poverty like a garden herb,like sage.Do not trouble yourself much to get new things,whether clothes or friends,Turn the old,return to them.Things do not change;we change.Sell your clothes and keep your thoughts.
Total words: 181
as:7
the:7
not:6
it:6
to:5
are:4
a:4
your:4
in:4
they:3
live:3
and:3
of:2
do:2
may:2
by:2
be:2
clothes:2
that:2
often:2
have:2
from:2
above:2
is:2
you:2
door:1
its:1
suppose.It:1
palace.The:1
contentedly:1
snow:1
friends,Turn:1
yourself:1
means.which:1
or:1
windows:1
life,poor:1
bad:1
quiet:1
like:1
without:1
thoughts.:1
simply:1
abode;the:1
change.Sell:1
will:1
some:1
fault-finder:1
herb,like:1
before:1
most:1
I:1
old,return:1
trouble:1
life:1
change;we:1
supported:1
is.You:1
spring.:1
me:1
mind:1
town;but:1
there,and:1
paradise.Love:1
hardnames.It:1
is,meet:1
should:1
seem:1
independent:1
new:1
alms-house:1
poor-house.The:1
pleasant,thrilling,glorious:1
;do:1
garden:1
happens:1
keep:1
but:1
However:1
reflected:1
being:1
brightly:1
enough:1
Cultivate:1
any.May:1
looks:1
more:1
sage.Do:1
town's:1
when:1
faults:1
richest.The:1
disreputable.:1
think:1
get:1
so:1
much:1
lives:1
perhaps:1
early:1
things,whether:1
call:1
dishonest:1
sun:1
shun:1
melts:1
setting:1
them.Things:1
poverty:1
poorest:1
mean:1
receive:1
find:1
hourss,even:1
thoughts,as:1
rich:1
poor:1
man's:1
cheering:1
great:1
see:1
supporting:1
themselves:1
misgiving.Most:1
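Both branches call `sort(...)`, but the post never shows that method. Since the listings run from the highest count to the lowest, one plausible shape for it is a sort of the map's entries by descending value. The sketch below is an assumption, not the author's code. `SortSketch` and `sortByCount` are hypothetical names, and the order of words that share a count depends on `HashMap` iteration, so it will not necessarily match the listings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortSketch {
    // Hypothetical stand-in for sort(map): returns the entries ordered by count, highest first.
    static List<Map.Entry<String, Integer>> sortByCount(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(map.entrySet());
        // List.sort is stable, so words with equal counts keep the map's iteration order.
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("as", 7);
        map.put("door", 1);
        map.put("not", 6);
        // Print in the same word:count format as the listings.
        for (Map.Entry<String, Integer> e : sortByCount(map)) {
            System.out.println(e.getKey() + ":" + e.getValue());
        }
    }
}
```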