Converting between ARFF and TXT files in Python

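On GitHub, others have already published Python libraries for converting between these two formats, for example liac-arff: https://github.com/renatopp/liac-arff. But when a program is already fairly complex, pulling in that many external files feels cumbersome. In addition, these ARFF libraries raise an error when the number of declared attributes and the number of values in a data row do not match. So, with the support of a senior labmate, I wrote two simple conversion functions with reference to Stack Overflow. (It took more than 5 hours... I need to be more efficient next time.)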
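To make the mismatch concrete, here is a minimal sketch that counts the declared attributes and the comma-separated fields per data row. The sample text and the helper name attribute_and_field_counts are hypothetical and only serve as an illustration; the trailing {...} field mirrors the one written by txt2arff() later in this post.

```python
# Hypothetical ARFF snippet: two declared attributes, but every data row
# carries a third, trailing {...} field.
ARFF_TEXT = """@relation demo
@attribute x numeric
@attribute class-att {True,False}
@data
3,True,{5}
4,False,{1}
"""


def attribute_and_field_counts(text):
    """Return (number of @attribute lines, list of field counts per data row)."""
    n_attrs = 0
    field_counts = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("%"):
            continue
        if line.lower().startswith("@attribute"):
            n_attrs += 1
        elif not line.startswith("@"):
            # Anything that is not a directive is treated as a data row.
            field_counts.append(len(line.split(",")))
    return n_attrs, field_counts


n_attrs, field_counts = attribute_and_field_counts(ARFF_TEXT)
print(n_attrs, field_counts)
```

Every row here has one more field than there are attributes, which is the situation the hand-written functions in this post are meant to handle.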

arff2txt():

Convert an ARFF file to txt format:

def arff2txt(filename):
    arr = []
    lines = []
    with open(filename) as arff_file:
        for line in arff_file:
            # Skip header directives (@...) and comments (%...)
            if not line.startswith("@") and not line.startswith("%"):
                arr.append(line.strip("\n").split(","))
    # Discard the first collected row
    del arr[0]
    for child in arr:
        # Drop the 11th field, then map the class label to 1/0
        del child[10]
        child[9] = 1 if child[9] == "True" else 0
        lines.append("\t".join(map(str, child)))
    result = "\n".join(lines)
    print(result)
    with open("./generatedtxt.txt", "w") as txtfile:
        txtfile.write(result)
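The same idea can be sketched as a pure function that works on a string instead of fixed file paths, which makes it easier to try out. The name arff_to_tsv and the parameters label_index and drop_index are hypothetical, and the sample rows are made up. Unlike arff2txt() above, this variant skips blank lines rather than discarding the first collected row.

```python
def arff_to_tsv(arff_text, label_index=9, drop_index=10):
    """Convert ARFF data rows to tab-separated lines with a 1/0 class label."""
    rows = []
    for line in arff_text.splitlines():
        line = line.strip()
        # Skip blank lines, header directives (@...) and comments (%...).
        if not line or line.startswith("@") or line.startswith("%"):
            continue
        fields = line.split(",")
        # Remove the extra trailing field if the row has one.
        if len(fields) > drop_index:
            del fields[drop_index]
        # Map the class label "True"/"False" to "1"/"0".
        fields[label_index] = "1" if fields[label_index] == "True" else "0"
        rows.append("\t".join(fields))
    return "\n".join(rows)


# Made-up sample input with ten attribute values plus a trailing {...} field.
sample = (
    "@relation ExceptionRelation\n"
    "@attribute ID string\n"
    "@data\n"
    "% a comment line\n"
    "m1,0,1,0,12,3,0,1,0,True,{5}\n"
    "m2,1,0,1,40,7,1,0,1,False,{1}\n"
)
print(arff_to_tsv(sample))
```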

 

txt2arff():

Convert a txt file to ARFF format:

def txt2arff(filename, value):
    with open('./generatedarff.arff', 'w') as fp:
        # Write the fixed ARFF header
        fp.write('''@relation ExceptionRelation
@attribute ID string
@attribute Thrown numeric
@attribute SetLogicFlag numeric
@attribute Return numeric
@attribute LOC numeric
@attribute NumMethod numeric
@attribute EmptyBlock numeric
@attribute RecoverFlag numeric
@attribute OtherOperation numeric
@attribute class-att {True,False}
@data
''')
        with open(filename) as f:
            contents = f.readlines()
        for content in contents:
            lines = content.split('\t')
            lines = [line.strip() for line in lines]
            # Fields read from the file are strings, so compare against "1"
            if lines[9] == "1":
                lines[9] = "True"
                lines.append('{' + str(value) + '}')
            else:
                lines[9] = "False"
                lines.append('{1}')
            array = ','.join(lines)
            fp.write("%s\n" % array)
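The per-row logic of txt2arff() can be isolated in a small helper for a quick check. This is only a sketch: the name tsv_row_to_arff and the label_index parameter are hypothetical, and the sample rows are made up. Values read from a text file are strings, so the label is compared against the string "1" rather than the integer 1.

```python
def tsv_row_to_arff(row, value, label_index=9):
    """Turn one tab-separated row into an ARFF data line with a trailing {...} field."""
    fields = [field.strip() for field in row.split("\t")]
    if fields[label_index] == "1":
        # Positive rows get the caller-supplied value in braces.
        fields[label_index] = "True"
        fields.append("{" + str(value) + "}")
    else:
        # All other rows get {1}.
        fields[label_index] = "False"
        fields.append("{1}")
    return ",".join(fields)


print(tsv_row_to_arff("m1\t0\t1\t0\t12\t3\t0\t1\t0\t1\n", 5))
print(tsv_row_to_arff("m2\t1\t0\t1\t40\t7\t1\t0\t1\t0\n", 5))
```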