Generating random words and checking whether they are real (correctly spelled) words :)

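    Linux ships with plenty of fun little tools! This cat has found another one, the spell program: if a word is spelled correctly it prints nothing, otherwise it prints the word it suspects is misspelled. The words can be put in a file, or piped into spell. To keep things simple this cat started with the pipe, which may be a bit slower. The file-based approach comes later, to see how much it improves things.

    First, the method that generates random words: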

def rand_words(n=10000,min_len=2,max_len=12)
	#pool: the alphabet repeated max_len times
	chars = (("a".."z").to_a * max_len).freeze
	words = []
	n.times do
		#inclusive range, so len stays within min_len..max_len
		len = rand(min_len..max_len)
		#pick len pool entries at random (sample does not modify chars)
		words << chars.sample(len).join
	end
	words
end
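
A minimal sketch of what the repeated alphabet buys. It assumes the letters are drawn from the pool without replacement via Array#sample; sample_word and its repeats parameter are made-up names for this illustration only. With a single alphabet every letter in a word is distinct; with the alphabet repeated, the same letter can show up several times.

```ruby
# Hypothetical helper, for illustration only.
# repeats is how many copies of the alphabet go into the pool.
def sample_word(len, repeats)
	pool = ("a".."z").to_a * repeats
	# sample(len) picks len different pool positions, so a letter can
	# occur at most `repeats` times in the result
	pool.sample(len).join
end

# one copy of the alphabet: all 8 letters are different
puts sample_word(8, 1)
# eight copies: repeated letters become possible
puts sample_word(8, 8)
```

Note that sample never returns more elements than the pool holds, so asking for 30 letters from a single alphabet yields only 26.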

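Since the same letter can appear more than once in a word, the more realistic approach is to repeat the alphabet max_len times, so that in the extreme case all max_len letters of a word can be the same. Next comes the method that decides whether a word is spelled correctly: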

#return the word if spell accepts it, otherwise nil
def spell_word(word)
	#spell echoes the word back only when it looks misspelled
	out = `echo #{word}|spell`.chomp
	out == word ? nil : word
end

Finally, call it with command-line arguments:

rand_words(ARGV[0].to_i,ARGV[1].to_i,ARGV[2].to_i).each do |w|
	print w+" " if spell_word(w)
end

Test run:

wisy@wisy-ThinkPad-X61:~/src/ruby_src$ time ./a.rb 2000 3 12
usage : a.rb [-all|-all2] words_count min_len max_len
eta war raw chum lab err woe coil pee swum yap sap mud who sin 
real	0m22.770s
user	0m3.350s
sys	0m7.216s

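The speed is not great: more than 22 seconds for 2000 words! Every word starts a new shell and a new spell process. Next, see how fast it is when the words are handed to spell in a file, using a new spell_words method: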

#check all words at once through a temp file (needs require 'tempfile')
def spell_words(words)
	puts "using spell_words..."
	f = Tempfile.new("#{$$}_spell_blablabla")
	f.write words.join(" ")
	f.close

	#spell prints the words it rejects, one per line
	no_spell_words = `spell #{f.path}`.split("\n")
	words - no_spell_words
end
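
The last line of spell_words is where the filtering happens, so here is a small sketch of just that step. The word list and the spell_output string are made-up sample data, not real output from spell.

```ruby
# Hypothetical sample data, not real spell output.
words = %w[eta xqz war eta bbk]
spell_output = "xqz\nbbk\n"

# spell reports one rejected word per line
rejected = spell_output.split("\n")
# Array#- keeps the receiver's order and drops every rejected word
accepted = words - rejected
p accepted   # => ["eta", "war", "eta"]
```

Array#- keeps duplicates of accepted words and removes every occurrence of a rejected one, so the result has the same order as the generated list.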

Change how the arguments are handled:

if ARGV[0] =~ /^-all$/
	puts spell_words(rand_words(ARGV[1].to_i,ARGV[2].to_i,ARGV[3].to_i)).join(" ")
else
	rand_words(ARGV[0].to_i,ARGV[1].to_i,ARGV[2].to_i).each do |w|
		print w+" " if spell_word(w)
	end
end

Let's see whether it got any faster:

wisy@wisy-ThinkPad-X61:~/src/ruby_src$ time ./a.rb -all 2000 3 12
usage : a.rb [-all|-all2] words_count min_len max_len
using spell_words...
work bah air hop pus etc bob

real	0m3.945s
user	0m0.163s
sys	0m0.050s

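It did get faster: under 4 seconds! Finally, would it be quicker still if spell wrote its results straight to a file that is then read back? One more method, spell_words2: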

#like spell_words, but spell's output also goes through a temp file
def spell_words2(words)
	puts "using spell_words2..."
	f_words = Tempfile.new("#{$$}_spell_words")
	f_ret = Tempfile.new("#{$$}_spell_ret")
	f_ret.close

	f_words.write words.join(" ")
	f_words.close

	#the shell redirects spell's output into f_ret
	system("spell #{f_words.path} > #{f_ret.path}")
	no_spell_words = File.read(f_ret.path).split("\n")
	words - no_spell_words
end
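
One more variant, which is not part of the script and was not timed here: a sketch that feeds all words to a single spell process over a pipe, with no temp files at all. It uses Open3.capture2 from the standard library. The name spell_words_pipe and the cmd parameter are made up for this sketch, and the default assumes a spell binary on PATH that reads words from stdin when given no file (as in the echo pipeline earlier).

```ruby
require 'open3'

# Hypothetical variant: one checker process, words sent over stdin.
# cmd is the checker command as an argv array; ["spell"] assumes spell
# is installed and prints the rejected words, one per line.
def spell_words_pipe(words, cmd = ["spell"])
	return [] if words.empty?
	out, _status = Open3.capture2(*cmd, stdin_data: words.join("\n") + "\n")
	words - out.split("\n")
end

# usage, with rand_words from the script:
#   puts spell_words_pipe(rand_words(2000, 3, 12)).join(" ")
```

Passing the command as an array means no shell is involved, and the cmd parameter makes it easy to try the function with a stand-in program when spell is not installed.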

The final source code after all changes:

#!/usr/bin/ruby
#code by hopy 2014.12.08
#randomly create some words and check whether each one is a valid word!

require 'tempfile'

def rand_words(n=10000,min_len=2,max_len=12)
	#pool: the alphabet repeated max_len times
	chars = (("a".."z").to_a * max_len).freeze
	words = []
	n.times do
		#inclusive range, so len stays within min_len..max_len
		len = rand(min_len..max_len)
		#pick len pool entries at random (sample does not modify chars)
		words << chars.sample(len).join
	end
	words
end

#return the word if spell accepts it, otherwise nil
def spell_word(word)
	#spell echoes the word back only when it looks misspelled
	out = `echo #{word}|spell`.chomp
	out == word ? nil : word
end

#check all words at once through a temp file
def spell_words(words)
	puts "using spell_words..."
	f = Tempfile.new("#{$$}_spell_blablabla")
	f.write words.join(" ")
	f.close

	#spell prints the words it rejects, one per line
	no_spell_words = `spell #{f.path}`.split("\n")
	words - no_spell_words
end

#like spell_words, but spell's output also goes through a temp file
def spell_words2(words)
	puts "using spell_words2..."
	f_words = Tempfile.new("#{$$}_spell_words")
	f_ret = Tempfile.new("#{$$}_spell_ret")
	f_ret.close

	f_words.write words.join(" ")
	f_words.close

	#the shell redirects spell's output into f_ret
	system("spell #{f_words.path} > #{f_ret.path}")
	no_spell_words = File.read(f_ret.path).split("\n")
	words - no_spell_words
end

#File.basename also works when the script is not started as ./a.rb
puts "usage : #{File.basename($0)} [-all|-all2] words_count min_len max_len"

if ARGV[0] =~ /^-all$/
	puts spell_words(rand_words(ARGV[1].to_i,ARGV[2].to_i,ARGV[3].to_i)).join(" ")
elsif ARGV[0] =~ /^-all2$/
	puts spell_words2(rand_words(ARGV[1].to_i,ARGV[2].to_i,ARGV[3].to_i)).join(" ")
else
	rand_words(ARGV[0].to_i,ARGV[1].to_i,ARGV[2].to_i).each do |w|
		print w+" " if spell_word(w)
	end
end

Test run:

wisy@wisy-ThinkPad-X61:~/src/ruby_src$ time ./a.rb -all2 2000 3 12
usage : a.rb [-all|-all2] words_count min_len max_len
using spell_words2...
pus tty sis aft cut

real	0m4.443s
user	0m0.163s
sys	0m0.050s

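So spell_words2 seems to be slightly slower than spell_words (about 4.4 seconds against 3.9 in these single runs)!? The two are basically equally efficient. Three implementations of one feature: does this cat have too much time on its hands? Not entirely. Each extra attempt is one more possibility explored!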
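
Each of the runs above generated its own random word list and was timed only once with time, which also counts Ruby start-up and word generation. A sketch of how the checking step alone could be timed on one shared list, using Benchmark.realtime from the standard library, is below. time_checkers and the stand-in checkers are made up for this sketch; in the script the callables would be something like method(:spell_words) and method(:spell_words2).

```ruby
require 'benchmark'

# Hypothetical helper: checkers maps a label to a callable that takes
# the word list; returns a hash of label => elapsed seconds.
def time_checkers(words, checkers)
	checkers.map do |label, fn|
		seconds = Benchmark.realtime { fn.call(words) }
		[label, seconds]
	end.to_h
end

# Stand-in checkers so the sketch runs without spell installed.
demo = {
	"keep-all" => ->(ws) { ws },
	"short-only" => ->(ws) { ws.select { |w| w.length < 4 } }
}

time_checkers(%w[eta war chum coil], demo).each do |label, secs|
	puts format("%-12s %.6fs", label, secs)
end
```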
