python decode unicode encode


2024-11-10 08:00:39, 204 views

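Python represents strings internally as unicode (the code in this article uses Python 2 syntax). Converting between encodings therefore usually goes through unicode as the intermediate form: first decode the string from its current encoding into unicode, then encode the unicode into the target encoding.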
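The decode-then-encode round trip can be sketched as follows. The helper name `transcode` is hypothetical and not from the original article. The snippet is written for Python 3, where the unicode type is `str` and encoded data is `bytes`; in Python 2 the same method calls apply to `str` and `unicode`.

```python
# -*- coding: utf-8 -*-

def transcode(data, src, dst):
    """Convert encoded bytes from `src` to `dst` via unicode (hypothetical helper)."""
    text = data.decode(src)      # other encoding -> unicode
    return text.encode(dst)      # unicode -> target encoding


gbk_bytes = b'\xc4\xe3\xba\xc3'                      # u'你好' encoded as GBK
utf8_bytes = transcode(gbk_bytes, 'gbk', 'utf-8')    # same text, now UTF-8
print(repr(utf8_bytes))
```

Going through unicode means each codec only needs to know how to map to and from code points, so no direct GBK-to-UTF-8 table is needed.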

By default, a string literal in source code has the same encoding as the source file itself. Two cases are worth distinguishing:

1. s = u'你好'

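This string is explicitly unicode, Python's internal representation, regardless of the encoding of the source file. (The interpreter's default encoding is a separate setting. To check it: import sys; print('hello', sys.getdefaultencoding()), which reports ascii. To change it: import sys; reload(sys); sys.setdefaultencoding('utf-8').) To convert such a string, simply call its encode method with the target encoding.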

2. # -*- coding: utf-8 -*-

s = '你好'

Here s is a UTF-8 encoded byte string, taking its encoding from the source file. ASCII cannot represent Chinese characters.
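A minimal sketch of the second case, assuming Python 3 (where the text literal is always unicode, so the UTF-8 bytes are produced with an explicit encode). It shows that the two characters occupy six bytes in UTF-8 and that ASCII has no way to encode them. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
text = u'你好'

utf8_bytes = text.encode('utf-8')    # each of these characters takes 3 bytes
print(len(text), len(utf8_bytes))

try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False                 # ASCII has no code for Chinese characters
print(ascii_ok)
```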

isinstance(s, unicode)  # checks whether s is unicode: returns True if it is, False otherwise

unicode(str, 'gb2312') and str.decode('gb2312') are equivalent: both convert a gb2312-encoded str to unicode.
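The equivalence of the constructor form and the decode method can be checked with a short sketch. To keep it runnable on Python 3 as well, it looks up the text type with `type(u'')` (an assumption made for portability: this is `unicode` on Python 2 and `str` on Python 3) instead of naming `unicode` directly.

```python
# -*- coding: utf-8 -*-
text_type = type(u'')                 # unicode on Python 2, str on Python 3

raw = b'\xc4\xe3\xba\xc3'             # u'你好' encoded as gb2312
via_constructor = text_type(raw, 'gb2312')
via_decode = raw.decode('gb2312')
print(via_constructor == via_decode)
```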

str.__class__ shows whether a string is a str or a unicode object. It does not tell you which encoding a byte string uses.
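A small sketch of what a type check can and cannot reveal, assuming Python 3 (where the byte-string type is named `bytes`). The same four bytes decode without error under two different codecs, so the type alone says nothing about the encoding.

```python
# -*- coding: utf-8 -*-
raw = b'\xc4\xe3\xba\xc3'

print(type(raw).__name__)            # reports the type, not the encoding

as_gbk = raw.decode('gbk')           # two Chinese characters
as_latin1 = raw.decode('latin-1')    # four accented Latin characters, no error
print(as_gbk == as_latin1)
```

The encoding has to be known from context, such as the coding declaration of the source file or the protocol the bytes came from.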

After all that theory, here is a cure-all to finish :)

#!/usr/bin/env python
# coding=utf-8
s = "中文"

if isinstance(s, unicode):
    # s = u"中文": already unicode, so encode directly
    print s.encode('gb2312')
else:
    # s = "中文": UTF-8 bytes, so decode to unicode first
    print s.decode('utf-8').encode('gb2312')
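The same branch can be wrapped in a reusable function. This is only a sketch: the name `to_gb2312` and the `assumed_encoding` parameter are hypothetical, and `type(u'')` stands in for `unicode` so the code also runs on Python 3.

```python
# -*- coding: utf-8 -*-
text_type = type(u'')    # unicode on Python 2, str on Python 3


def to_gb2312(s, assumed_encoding='utf-8'):
    """Return s as gb2312 bytes, whether s is unicode text or encoded bytes."""
    if isinstance(s, text_type):
        return s.encode('gb2312')                 # already unicode
    # byte string: decode with the encoding we assume it was written in
    return s.decode(assumed_encoding).encode('gb2312')


print(repr(to_gb2312(u'中文')))
print(repr(to_gb2312(u'中文'.encode('utf-8'))))
```

The function still has to be told (or assume) the encoding of a byte string, which is why the article's version hard-codes utf-8 to match the source file.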

Speech module code:

# -*- coding: utf-8 -*-
import sys

print('hello', sys.getdefaultencoding())


def xfs_frame_info(words):
    # words is expected to be a UTF-8 encoded str:
    # decode UTF-8 to Python's internal unicode
    wordu = words.decode('utf-8')
    # encode Python unicode to GBK
    data = wordu.encode('gbk')

    length = len(data) + 2
    frame_info = bytearray(5)
    frame_info[0] = 0xfd
    frame_info[1] = (length >> 8)
    frame_info[2] = (length & 0x00ff)
    frame_info[3] = 0x01
    frame_info[4] = 0x01

    buf = frame_info + data
    print("buf:", buf)
    return buf


# Case 1: the literal has the u'' prefix
if __name__ == "__main__":
    print("hello world")
    words1 = u'你好'
    print("origin unicode", words1)

    words = words1.encode('utf-8')
    print("utf-8 encoded", words)
    a = xfs_frame_info(words)
    print('a', a)

# Case 2: the literal has no u'' prefix, so it is a UTF-8 str
if __name__ == "__main__":
    print("hello world")
    words1 = '你好'
    print("original utf-8 encoded:", words1)
    wordu = words1.decode('utf-8')
    print("unicode from utf-8 decode:", wordu)

    word_utf8 = wordu.encode('utf-8')
    print("utf-8 encoded", word_utf8)
    a = xfs_frame_info(word_utf8)
    print('a', a)

When '你好' is written without the u'' prefix, one extra step is needed: decode it to unicode first.
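For comparison, here is a compact sketch of the same frame layout for Python 3, taking unicode text directly so no UTF-8 round trip is needed. The function name `build_frame` and the use of `struct` are choices made for this sketch, not part of the original module. The header bytes (0xfd, a two-byte big-endian length, then 0x01 0x01) are copied from the code above; the article does not say what the two 0x01 bytes mean.

```python
# -*- coding: utf-8 -*-
import struct


def build_frame(text):
    """Build header + GBK payload with the same layout as xfs_frame_info (sketch)."""
    data = text.encode('gbk')            # unicode -> GBK bytes
    length = len(data) + 2               # payload plus the two bytes after the length
    # '>BHBB': 1 byte, 2-byte big-endian length, 1 byte, 1 byte = 5 header bytes
    header = struct.pack('>BHBB', 0xFD, length, 0x01, 0x01)
    return header + data


frame = build_frame(u'你好')
print(frame.hex())
```

Packing the length with `>H` produces the same two bytes as the manual shift and mask in the original.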
