[Translation] Python 2.7.6 Standard Library: String Services

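Translated from: https://docs.python.org/2/library/index.html. I translate in my spare time, whenever I have the time, mood, ideas, and motivation. Some passages are paraphrased or replaced with wording that is easier to understand. My skills are quite limited, so this is for my own reference only.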

I will fix the formatting when I have time.

 

7. String Services

 

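  • 7.1. string — Common string operations
  • 7.2. re — Regular expression operations
  • 7.3. struct — Interpret strings as packed binary data
  • 7.4. difflib — Helpers for computing deltas
  • 7.5. StringIO — Read and write strings as files
  • 7.6. cStringIO — Faster version of StringIO
  • 7.7. textwrap — Text wrapping and filling
  • 7.8. codecs — Codec registry and base classes
  • 7.9. unicodedata — Unicode Database
  • 7.10. stringprep — Internet String Preparation
  • 7.11. fpformat — Floating point conversions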

7.1. string — Common string operations

 

Source code: Lib/string.py

 

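The string module contains a number of useful constants and classes, as well as some deprecated legacy functions that are also available as methods on strings. In addition, Python's built-in string classes support the general sequence type methods (the sequence types are str, unicode, list, tuple, bytearray, buffer, and xrange) as well as the string-specific methods. To output formatted strings, use template strings or the % operator. The re module also provides string functions based on regular expressions.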

7.1.1. String constants

string.ascii_letters

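The concatenation of the ascii_lowercase and ascii_uppercase constants described below. This value is not locale-dependent.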

string.ascii_lowercase

The lowercase letters 'abcdefghijklmnopqrstuvwxyz'. This value is not locale-dependent and will not change.

string.ascii_uppercase

The uppercase letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. This value is not locale-dependent and will not change.

string.digits

The string '0123456789'.

string.hexdigits

The string '0123456789abcdefABCDEF'.

string.letters

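The concatenation of the strings lowercase and uppercase described below. The specific value is locale-dependent and is updated when locale.setlocale() is called.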

string.lowercase

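A string containing all the characters that are considered lowercase letters. On most systems this is the string 'abcdefghijklmnopqrstuvwxyz'. The specific value is locale-dependent and is updated when locale.setlocale() is called.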

string.octdigits

The string '01234567'.

string.punctuation

String of ASCII characters which are considered punctuation characters in the C locale.

string.printable

String of characters which are considered printable. This is a combination of digits, letters, punctuation, and whitespace.

string.uppercase

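A string containing all the characters that are considered uppercase letters. On most systems this is the string 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. The specific value is locale-dependent and is updated when locale.setlocale() is called.

string.whitespace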

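A string containing all characters that are considered whitespace. On most systems this includes the characters space, tab, linefeed, return, formfeed, and vertical tab.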
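As a quick illustration of the locale-independent constants above, here is a minimal sketch that uses them for simple character classification. The helper name is_hex_string is made up for this example and is not part of the string module.

```python
import string

# ascii_letters is simply the two ASCII case constants joined together.
letters_match = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase
print(letters_match)  # True


def is_hex_string(text):
    """Return True if text is non-empty and made only of hexadecimal digits.

    Hypothetical helper for illustration; not part of the string module.
    """
    return bool(text) and all(ch in string.hexdigits for ch in text)


print(is_hex_string("DeadBeef"))  # True
print(is_hex_string("0x1f"))      # False, because 'x' is not a hex digit
print(len(string.octdigits))      # 8
```

The locale-dependent constants (letters, lowercase, uppercase) are left out on purpose, since their values can change after locale.setlocale() is called.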

7.1.2. String formatting

Note: new in version 2.6.

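The built-in str and unicode classes provide the ability to do complex variable substitutions and value formatting via the str.format() method described in PEP 3101. The Formatter class in the string module allows you to create and customize your own string formatting behaviors using the same implementation as the built-in format() method.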
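A short sketch of the str.format() behavior described above, covering positional and keyword substitution plus a format specification. The values are arbitrary sample data chosen for this example.

```python
# Positional and keyword fields can be mixed in one format string.
greeting = "{0}, {name}!".format("Hello", name="world")
print(greeting)  # Hello, world!

# A format spec after the colon controls how the value is rendered.
price = "{0:.2f}".format(3.14159)
print(price)  # 3.14

# A field can also index into its argument.
point = "{p[0]}:{p[1]}".format(p=(3, 4))
print(point)  # 3:4
```

Explicit indexes such as {0} are used here because Python 2.6 does not accept empty {} fields.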

class string.Formatter

The Formatter class has the following public methods:

format(format_string, *args, **kwargs)

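format() is the primary API method. It takes a format string and an arbitrary set of positional and keyword arguments. format() is just a wrapper that calls vformat().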

vformat(format_string, args, kwargs)

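This function does the actual work of formatting. It is exposed as a separate function for cases where you want to pass in a predefined dictionary of arguments, rather than unpacking and repacking the dictionary as individual arguments using the *args and **kwargs syntax. vformat() breaks the format string up into character data and replacement fields. It calls the various methods described below.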
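The relationship between the two public methods can be sketched as follows. The format string and argument values are arbitrary sample data; the point is only that vformat() receives the positional arguments as one sequence and the keyword arguments as one dictionary.

```python
import string

formatter = string.Formatter()

# format() takes the arguments directly...
via_format = formatter.format("{0} has {count} items", "cart", count=3)

# ...while vformat() takes an already-built tuple and dictionary.
args = ("cart",)
kwargs = {"count": 3}
via_vformat = formatter.vformat("{0} has {count} items", args, kwargs)

print(via_format)   # cart has 3 items
print(via_vformat)  # cart has 3 items
```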

In addition, the Formatter class defines a number of methods that are intended to be replaced by subclasses:

parse(format_string)

Loop over the format_string and return an iterable of tuples (literal_text, field_name, format_spec, conversion). This is used by vformat() to break the string into either literal text, or replacement fields.

The values in the tuple conceptually represent a span of literal text followed by a single replacement field. If there is no literal text (which can happen if two replacement fields occur consecutively), then literal_text will be a zero-length string. If there is no replacement field, then the values of field_name, format_spec and conversion will be None.
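To make the tuple layout concrete, here is a small sketch that runs parse() on a sample format string. The string itself is invented for this example; it contains two consecutive replacement fields and some trailing literal text, so both special cases mentioned above show up.

```python
import string

formatter = string.Formatter()

# Two adjacent fields followed by trailing literal text.
pieces = list(formatter.parse("x={0!r:>5}{name} end"))

for literal_text, field_name, format_spec, conversion in pieces:
    print((literal_text, field_name, format_spec, conversion))

# Printed tuples, in order:
#   ('x=', '0', '>5', 'r')
#   ('', 'name', '', None)       <- zero-length literal_text between the fields
#   (' end', None, None, None)   <- trailing text with no replacement field
```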

get_field(field_name, args, kwargs)

Given field_name as returned by parse() (see above), convert it to an object to be formatted. Returns a tuple (obj, used_key). The default version takes strings of the form defined in PEP 3101, such as “0[name]” or “label.title”. args and kwargs are as passed in to vformat(). The return value used_key has the same meaning as the key parameter to get_value().
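A minimal sketch of calling get_field() directly with the two field-name forms mentioned above. The Label class and the sample data are hypothetical and exist only so that the lookups have something to resolve.

```python
import string


class Label(object):
    """Hypothetical object with a title attribute, used only for this sketch."""
    title = "Engineer"


formatter = string.Formatter()
args = ({"name": "Ada"},)
kwargs = {"label": Label()}

# "0[name]": positional argument 0, then an item lookup with the key 'name'.
obj, used_key = formatter.get_field("0[name]", args, kwargs)
print((obj, used_key))  # ('Ada', 0)

# "label.title": keyword argument 'label', then an attribute lookup.
obj2, used_key2 = formatter.get_field("label.title", args, kwargs)
print((obj2, used_key2))  # ('Engineer', 'label')
```

In both calls used_key is only the first component of the field name, which is the same key that get_value() receives.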

get_value(key, args, kwargs)

Retrieve a given field value. The key argument will be either an integer or a string. If it is an integer, it represents the index of the positional argument in args; if it is a string, then it represents a named argument in kwargs.

The args parameter is set to the list of positional arguments to vformat(), and the kwargs parameter is set to the dictionary of keyword arguments.

For compound field names, these functions are only called for the first component of the field name; subsequent components are handled through normal attribute and indexing operations.

So for example, the field expression ‘0.name’ would cause get_value() to be called with a key argument of 0. The name attribute will be looked up after get_value() returns by calling the built-in getattr() function.

If the index or keyword refers to an item that does not exist, then an IndexError or KeyError should be raised.
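One possible use of this hook, sketched under the assumption that you want missing keyword arguments to render as a placeholder instead of raising KeyError. The class names DefaultingFormatter and User and the placeholder text are invented for this example.

```python
import string


class User(object):
    """Tiny stand-in object so the '0.name' field has an attribute to read."""
    name = "Ada"


class DefaultingFormatter(string.Formatter):
    """Hypothetical subclass: unknown keyword fields render as a placeholder."""

    missing = "<missing>"

    def get_value(self, key, args, kwargs):
        try:
            # The default looks up integers in args and strings in kwargs.
            return string.Formatter.get_value(self, key, args, kwargs)
        except KeyError:
            # Only missing keyword names are replaced; a bad positional
            # index still raises IndexError.
            return self.missing


fmt = DefaultingFormatter()
rendered = fmt.format("{0.name} ({nickname})", User())
print(rendered)  # Ada (<missing>)
```

For the field 0.name, get_value() is called with the key 0 only; the name attribute is resolved afterwards, as described above.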

check_unused_args(used_args, args, kwargs)

Implement checking for unused arguments if desired. The arguments to this function are the set of all argument keys that were actually referred to in the format string (integers for positional arguments, and strings for named arguments), and a reference to the args and kwargs that were passed to vformat(). The set of unused args can be calculated from these parameters. check_unused_args() is assumed to raise an exception if the check fails.
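The default implementation does nothing, so a check has to be added in a subclass. The following is one way it might look; the class name StrictFormatter and the choice of ValueError are assumptions made for this sketch.

```python
import string


class StrictFormatter(string.Formatter):
    """Hypothetical subclass that rejects arguments the format string never uses."""

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds the integer indexes and string names that were referenced.
        unused_positions = [i for i in range(len(args)) if i not in used_args]
        unused_names = sorted(k for k in kwargs if k not in used_args)
        if unused_positions or unused_names:
            raise ValueError(
                "unused arguments: {0!r}".format(unused_positions + unused_names)
            )


strict = StrictFormatter()
print(strict.format("{0} {name}", "a", name="b"))  # a b

error_message = None
try:
    strict.format("{0}", "a", "extra", flag=True)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # unused arguments: [1, 'flag']
```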

format_field(value, format_spec)

format_field() simply calls the global format() built-in. The method is provided so that subclasses can override it.
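As an illustration of overriding this method, the sketch below adds a made-up format code, U, that uppercases the value. The code U and the class name UpperFormatter are not standard; they exist only in this example.

```python
import string


class UpperFormatter(string.Formatter):
    """Hypothetical subclass adding a made-up 'U' spec that uppercases text."""

    def format_field(self, value, format_spec):
        if format_spec == "U":
            # 'U' is not a standard format code; it exists only in this sketch.
            return str(value).upper()
        # Everything else goes through the built-in format(), as the default does.
        return format(value, format_spec)


upper = UpperFormatter()
shout = upper.format("{0:U} costs {1:.1f}", "tea", 2.5)
print(shout)  # TEA costs 2.5
```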

convert_field(value, conversion)

Converts the value (returned by get_field()) given a conversion type (as in the tuple returned by the parse() method). The default version understands the 's' (str) and 'r' (repr) conversion types. The 'a' (ascii) conversion is only available in Python 3.
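A subclass can teach convert_field() additional conversion characters. The sketch below adds a made-up !l conversion that lowercases the value; the character l and the class name LowerConvFormatter are assumptions for this example, and all other conversions are delegated to the default implementation.

```python
import string


class LowerConvFormatter(string.Formatter):
    """Hypothetical subclass adding a made-up '!l' conversion (lowercased str)."""

    def convert_field(self, value, conversion):
        if conversion == "l":
            # Not a standard conversion; handled only by this sketch.
            return str(value).lower()
        # Fall back to the default handling of None, 's' and 'r'.
        return string.Formatter.convert_field(self, value, conversion)


conv = LowerConvFormatter()
converted = conv.format("{0!l} and {0!r}", "MiXed")
print(converted)  # mixed and 'MiXed'
```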