A Python problem: reading a non-text-format file in order to modify it
A question I never published (I solved it while writing it up), from 2017.3.20.
I was about to post it on stackoverflow, but then I solved it.
1. Platform: Windows 8
I'm just trying to modify a file in Python 2.7 that is not plain text
(like .txt or .csv) but in docx format, in order to add one or more bytes to
it.
However, when I tried to read it and write it back with these commands:
with open('GGG.docx', 'rb+') as f:
    contents = f.read()  # type(contents): str
with open('GGG2.docx', 'rb+') as f:
    f.writelines(contents)
the result appeared to be a str written to GGG2.docx as text, instead of the original bytes.
Format of GGG.docx:
504b 0304 1400 0600 0800 0000 2100 c1f0
070c 7801 0000 5a06 0000 1300 0802 5b43
6f6e 7465 6e74 5f54 7970 6573 5d2e 786d
6c20 a204 0228 a000 0200 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
...
Format of GGG2.docx:
504b 0304 1400 0600 0800 0000 2100 c1f0
070c 7801 0000 5a06 0000 1300 0802 5b43
6f6e 7465 6e74 5f54 7970 6573 5d2e 786d
6c20 a204 0228 a000 0200 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
They look the same in the editor (Sublime), but GGG2.docx in fact contains
the plain text '504B 0304 ...'.
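One way to tell the two cases apart without relying on an editor is to look at the first raw bytes of each file. The sketch below is only an illustration: the helper names first_bytes_hex and looks_like_zip are made up for this post, and it relies on the fact that the dump above starts with 50 4b 03 04, which is the ZIP signature "PK\x03\x04".

```python
import binascii

# The bytes 50 4b 03 04 at the start of the dump, i.e. "PK\x03\x04".
ZIP_MAGIC = b'PK\x03\x04'


def first_bytes_hex(path, count=4):
    """Return the first `count` raw bytes of the file as a hex string."""
    with open(path, 'rb') as f:
        head = f.read(count)
    return binascii.hexlify(head).decode('ascii')


def looks_like_zip(path):
    """Return True if the file starts with the ZIP signature bytes."""
    with open(path, 'rb') as f:
        return f.read(len(ZIP_MAGIC)) == ZIP_MAGIC
```

If a file really held the text '504b 0304 ...', its first raw bytes would be the ASCII codes of the characters '5', '0', '4', 'b', so looks_like_zip would return False for it.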
2. Way 2:
A similar question is http://stackoverflow.com/questions/21677646/python-writing-a-two-byte-string-as-a-single-byte-hex-character-to-a-binary-fi,
but it doesn't work.
3. Way 3:
In http://stackoverflow.com/questions/12092527/python-write-bytes-to-file,
a method was proposed like this:
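import io
# 'rb+' opens an existing file for binary reading and writing,
# so GGG2.docx must already exist before this runs.
with open('GGG.docx', 'rb+') as f:
    contents = f.read()  # type(contents): str
with io.open('GGG2.docx', 'rb+') as f:
    f.writelines(contents)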
But in the link, the user suggests that it (open instead of io.open) is useful on Windows.
Now it works.
Thanks to everyone.
Yep, I just want to complain to the Python designers who made it work this way.
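For the original goal of adding one or more bytes, a more conventional pattern is probably to read with 'rb' and write with 'wb' using write(). This is a minimal sketch, not the method from the linked answer: the function name copy_with_extra_bytes and its extra parameter are hypothetical, and it does not check whether the resulting docx is still accepted by a word processor.

```python
def copy_with_extra_bytes(src, dst, extra=b'\x00'):
    """Copy src to dst byte for byte, then append `extra` at the end.

    Returns the number of bytes written to dst.
    """
    # 'rb' reads raw bytes with no newline translation.
    with open(src, 'rb') as f:
        contents = f.read()
    # 'wb' creates dst (or truncates it) and writes raw bytes.
    with open(dst, 'wb') as f:
        f.write(contents)
        f.write(extra)
    return len(contents) + len(extra)
```

A call such as copy_with_extra_bytes('GGG.docx', 'GGG2.docx', b'\x00') does not need GGG2.docx to exist beforehand, and because write() takes the whole byte string at once it behaves the same on Python 2.7 and Python 3.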