Python in Practice - Data Processing (2): Converting Data to a Specific Format

2024-07-23 21:09:04, 221 views

The current situation is:

1. One of my folders contains many data files with names like these:

part-m-0000

part-m-0001

part-m-0002

part-m-0003

...

2. The data in each file has this format:

"460030730101160","3","0","0","0","2013/8/31 0:21:42"
"460036745672363","3","0","0","0","2013/8/31 0:21:31"
"460030250931114","3","1307","1","0","2013/8/31 0:21:40"
"460030250942643","3","0","0","0","2013/8/31 0:21:40"
"460036650411006","3","1021","1","0","2013/8/31 0:21:39"
"000000000009674","8","0","0","0","2013/8/31 0:12:28"
"000000000005661","8","0","0","0","2013/8/31 0:12:29"
"460030731390121","3","0","0","0","2013/8/31 21:54:00"
"460030256111396","3","0","0","0","2013/8/31 21:54:00"
"460030207447762","3","0","0","0","2013/8/31 21:53:58"
"460030250939916","3","0","0","0","2013/8/31 21:53:58"
"460030957972011","3","1613","0","0","2013/8/31 21:53:51"
"460030237206739","3","0","0","0","2013/8/31 21:53:59"
...

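Now the quotes around the numbers need to be removed, and the hour needs to be extracted from the time in the last column. Here is how I handled it with Python:

1. First, go through all files in the current folder whose names start with 'part';

2. For each file, read every line and split it on ",";

3. Then, for each part, take what is between the quotes; for the last item, the time, take only the hour. Here we need to check whether the hour has 1 or 2 digits;

4. Write one output line for every line read.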
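Steps 2 and 3 can be tried on a single sample row before touching any files. The following is a minimal sketch rather than the code used in this post: the function name convert_line is made up for illustration, and it assumes that no field contains a comma and that the last field looks like "date hour:minute:second". Splitting the time on ":" avoids having to check whether the hour has one or two digits.

```python
def convert_line(line):
    """Strip the quotes from every field and replace the timestamp with its hour.

    Hypothetical helper for illustration; assumes no field contains a comma.
    """
    fields = [part.strip().strip('"') for part in line.strip().split(",")]
    timestamp = fields[-1]                  # e.g. 2013/8/31 0:21:42
    pieces = timestamp.split(" ")
    if len(pieces) == 2 and ":" in pieces[1]:
        hour = pieces[1].split(":")[0]      # "0" or "21", whatever the width
    else:
        hour = "-1"                         # fallback for rows without a usable time
    return ",".join(fields[:-1] + [hour])


print(convert_line('"460030250931114","3","1307","1","0","2013/8/31 0:21:40"'))
# prints: 460030250931114,3,1307,1,0,0
```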

The actual code is below:

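# coding: utf-8
import os

src_dir = "./"
dst_dir = "./data_handled/"
if not os.path.isdir(dst_dir):
    os.makedirs(dst_dir)  # the output folder must exist before writing

# Only the current folder is scanned, because the paths below are built relative to it
for name in os.listdir(src_dir):
    filepath = os.path.join(src_dir, name)  # this is the current file path
    if not (name.startswith("part") and os.path.isfile(filepath)):
        continue
    print(filepath)
    newfilepath = os.path.join(dst_dir, name[7:])  # file to write into, e.g. part-m-0000 -> 0000
    with open(filepath) as infile, open(newfilepath, "w") as newfile:
        for line in infile:
            line_ = line.rstrip("\r\n").split(",")
            # delete the surrounding " " from every field except the last
            fields = [item[1:len(item) - 1] for item in line_[:-1]]
            last = line_[-1]
            # The last field looks like "2013/8/31 0:21:42" (quotes included).
            # The fixed offset 11 assumes a 9-character date such as 2013/8/31.
            if len(last) > 12:
                if last[12] == ":":
                    k = last[11:12]  # one-digit hour
                else:
                    k = last[11:13]  # two-digit hour
            else:
                k = "-1"  # no usable timestamp on this line
            fields.append(k)
            newfile.write(",".join(fields) + "\n")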
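
As an alternative to slicing at fixed positions, the standard library's csv and datetime modules can do the parsing. This is a sketch under assumptions, not the method used above: the names hour_of, convert_file and convert_folder are invented here, the timestamp format is taken to be %Y/%m/%d %H:%M:%S based on the sample rows, and the output folder name data_handled is kept from the script. Parsing the timestamp this way does not depend on how many digits the month, day, or hour have.

```python
import csv
from datetime import datetime
from pathlib import Path

TIME_FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed from the sample rows


def hour_of(timestamp):
    """Return the hour of a timestamp as a string, or "-1" if it cannot be parsed."""
    try:
        return str(datetime.strptime(timestamp.strip(), TIME_FORMAT).hour)
    except ValueError:
        return "-1"


def convert_file(src, dst):
    """Copy src to dst without quotes, replacing the last column with its hour."""
    with open(src, newline="") as infile, open(dst, "w", newline="") as outfile:
        for row in csv.reader(infile):       # csv.reader removes the surrounding quotes
            if not row:
                continue                     # skip blank lines
            outfile.write(",".join(row[:-1] + [hour_of(row[-1])]) + "\n")


def convert_folder(src_dir=".", dst_name="data_handled"):
    """Convert every file starting with 'part' in src_dir into src_dir/dst_name."""
    out_dir = Path(src_dir) / dst_name
    out_dir.mkdir(exist_ok=True)
    for src in sorted(Path(src_dir).glob("part*")):
        if src.is_file():
            convert_file(src, out_dir / src.name[7:])  # part-m-0000 -> 0000
```

Calling convert_folder() with no arguments processes the current folder, which matches what the script above does. Writing the fields with a plain join assumes, as the sample data suggests, that no field contains a comma.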
