About the row_prefix parameter in happybase


2024-08-06 08:21:02, 217 views

Background: this came up while using happybase to access HBase.

    def scan(self, row_start=None, row_stop=None, row_prefix=None,
             columns=None, filter=None, timestamp=None,
             include_timestamp=False, batch_size=1000, scan_batching=None,
             limit=None, sorted_columns=False):

The scan function takes a row_prefix parameter, which I could not find on the corresponding function of the Java client. So what does it actually do?

Looking at the source code, we find:

        if row_prefix is not None:
            if row_start is not None or row_stop is not None:
                raise TypeError(
                    "'row_prefix' cannot be combined with 'row_start' "
                    "or 'row_stop'")

            row_start = row_prefix
            row_stop = str_increment(row_prefix)

The code of str_increment itself:

def str_increment(s):
    """Increment and truncate a byte string (for sorting purposes)

    This functions returns the shortest string that sorts after the given
    string when compared using regular string comparison semantics.

    This function increments the last byte that is smaller than ``0xFF``, and
    drops everything after it. If the string only contains ``0xFF`` bytes,
    `None` is returned.
    """
    for i in xrange(len(s) - 1, -1, -1):
        if s[i] != '\xff':
            return s[:i] + chr(ord(s[i]) + 1)

    return None

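With the code in front of us it is clear: row_prefix is simply converted into a row_start and a row_stop. row_start is the prefix itself, and row_stop is the shortest string that sorts after every key beginning with that prefix.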
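
The str_increment quoted above is Python 2 code (xrange, str used as bytes). As a rough illustration of the same logic on Python 3, here is a minimal sketch operating on bytes. The name bytes_increment and the sample inputs are my own choices for this example, not taken from the source quoted here.

```python
def bytes_increment(prefix: bytes):
    """Return the shortest byte string that sorts after every key starting with prefix.

    Returns None when prefix is empty or contains only 0xFF bytes,
    i.e. when there is no finite upper bound.
    """
    for i in range(len(prefix) - 1, -1, -1):
        if prefix[i] != 0xFF:
            # Bump the last byte that is below 0xFF and drop everything after it.
            return prefix[:i] + bytes([prefix[i] + 1])
    return None


print(bytes_increment(b"abc"))       # b'abd'
print(bytes_increment(b"ab\xff"))    # b'ac'
print(bytes_increment(b"\xff\xff"))  # None
```

When the result is None, row_stop stays None in the scan code shown earlier, so the scan has no upper bound on the row key.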

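Consider the following scenario.

A Weibo table

whose row keys have the form userID_weiboID

Suppose we want to fetch all posts of one user. When scanning, there is no need to spell out the scan range 'userID_0' ~ 'userID_a' by hand.

Instead we can simply pass row_prefix = 'userID'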
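
To make the scenario concrete without needing a running HBase, the sketch below simulates the half-open range [row_start, row_stop) over a sorted list of keys. Everything in it is illustrative: prefix_to_range and scan_keys are hypothetical helpers, and the row keys are made up.

```python
from bisect import bisect_left


def prefix_to_range(prefix: bytes):
    """Hypothetical helper: turn a row prefix into a (row_start, row_stop) pair."""
    for i in range(len(prefix) - 1, -1, -1):
        if prefix[i] != 0xFF:
            return prefix, prefix[:i] + bytes([prefix[i] + 1])
    return prefix, None  # no upper bound


def scan_keys(sorted_keys, row_start, row_stop):
    """Hypothetical stand-in for a scan: keys with row_start <= key < row_stop."""
    lo = bisect_left(sorted_keys, row_start)
    hi = len(sorted_keys) if row_stop is None else bisect_left(sorted_keys, row_stop)
    return sorted_keys[lo:hi]


# Made-up row keys of the form userID_weiboID, in byte order.
keys = sorted([b"1001_0001", b"1001_0002", b"10010_0001", b"1002_0001"])

start, stop = prefix_to_range(b"1001_")
print(start, stop)                   # b'1001_' b'1001`'
print(scan_keys(keys, start, stop))  # [b'1001_0001', b'1001_0002']

# Without the trailing underscore the prefix also matches user 10010.
print(scan_keys(keys, *prefix_to_range(b"1001")))
```

If user IDs are not fixed width, including the separator in the prefix (here the underscore) keeps the scan from picking up other users whose ID merely starts with the same digits. With a real happybase table the equivalent call would be table.scan(row_prefix=b'1001_').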

PS: I will provide a Java implementation of str_increment later.
