Python MapReduce Word Count (Reposted)
Generate the test data (word.txt):

    #!/usr/bin/env python
    import random

    # 'abc..z'
    alphaStr = "".join(map(chr, range(97, 123)))
    maxIter = 100000

    with open("word.txt", "w") as fp:
        for i in range(maxIter):
            word = ""
            # Random word length between 1 and 5 (renamed so the built-in len is not shadowed)
            length = random.randint(1, 5)
            for j in range(length):
                word += alphaStr[random.randint(0, 25)]
            fp.write(word + '\n')

Run the pipeline locally:

    cat word.txt | ./wordcount_mapper.py | ./wordcount_reducer.py

Word count reducer, Python:

    #!/usr/bin/env python
    # filename: wordcount_reducer.py
    from operator import itemgetter
    import sys

    wordcount = {}
    for line in sys.stdin:
        try:
            # Each input line is expected to be "word<TAB>count"
            word, count = line.strip().split('\t', 1)
            count = int(count)
        except ValueError:
            # Skip lines that are malformed or have a non-integer count
            continue
        wordcount[word] = wordcount.get(word, 0) + count

    # Sort by word and print one "word<TAB>total" line per word
    sorted_wordcount = sorted(wordcount.items(), key=itemgetter(0))
    for word, count in sorted_wordcount:
        print("%s\t%s" % (word, count))
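The pipeline above calls wordcount_mapper.py, but that script is not listed in this post. The following is only a minimal sketch of what such a mapper might look like, assuming it emits one "word<TAB>1" record per whitespace-separated token, which is the input format the reducer parses. The function names map_words and main are hypothetical and are not taken from the original.

```python
#!/usr/bin/env python
# filename: wordcount_mapper.py (hypothetical sketch; not part of the original post)
import sys


def map_words(lines):
    """Yield one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        # split() with no arguments drops the trailing newline and skips blank lines
        for word in line.split():
            yield "%s\t1" % word


def main():
    # Read raw text from stdin and write mapper records to stdout
    for record in map_words(sys.stdin):
        print(record)


if __name__ == "__main__":
    main()
```

For the shell pipeline to run, both scripts would need to be executable (for example, chmod +x wordcount_mapper.py wordcount_reducer.py). Keeping the mapping logic in a separate generator function makes it possible to check it without going through stdin.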