
[elasticsearch] Usage in Python

Useful links:

Most useful: http://es.xiaoleilu.com/054_Query_DSL/70_Important_clauses.html

A good blog post: http://www.cnblogs.com/letong/p/4749234.html

Other 1: http://www.jianshu.com/p/14aa8b09c789

Other 2: http://xiaorui.cc/

 

1. Query everything in an index

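#coding=utf8
from elasticsearch import Elasticsearch

# x.x.x.x is a placeholder for the Elasticsearch host
es = Elasticsearch([{"host": "x.x.x.x", "port": 9200}])
index = "test"
query = {"query": {"match_all": {}}}
resp = es.search(index=index, body=query)
resp_docs = resp["hits"]["hits"]  # only the first page (10 hits by default)
total = resp["hits"]["total"]

print(total)  # total number of matching documents
print(resp_docs[0]["_source"]["@timestamp"])  # print one field of the first hit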
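
One caveat about reading `total`: on Elasticsearch 7 and later, `hits.total` comes back as an object with `value` and `relation` keys rather than a plain integer. A minimal sketch of a helper that accepts both shapes follows; the name `hits_total` and the two sample responses are made up for illustration and are not cluster output.

```python
def hits_total(resp):
    """Return the hit count of a search response as an int."""
    total = resp["hits"]["total"]
    # Elasticsearch 7+ wraps the count: {"value": N, "relation": "eq" or "gte"}
    if isinstance(total, dict):
        return total["value"]
    # Older versions return the count directly
    return total


# Hand-written sample responses (assumptions, not real output)
old_style = {"hits": {"total": 3, "hits": []}}
new_style = {"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}

print(hits_total(old_style))
print(hits_total(new_style))
```

When `relation` is `"gte"`, the value is only a lower bound, so it may not be safe to use as an exact count.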

 

2. Fetch all results in batches with scroll, plus a compound condition

Filter condition: field A is not empty and field B is not empty, and the timestamp falls between 10 days ago and 2 days ago.
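
The `filtered` query was removed in Elasticsearch 5.0. On newer clusters the same condition can probably be written as a single `bool` query with a `filter` clause. The sketch below only builds the request body; the function name `build_query` and its parameters are hypothetical, and matching on an empty string assumes A and B are indexed as exact (not analyzed) values.

```python
def build_query(field_a="A", field_b="B", ts_field="@timestamp"):
    """Build a bool/filter query body for the condition described above."""
    return {
        "query": {
            "bool": {
                # Both exclusions go in one list under a single key
                "must_not": [
                    {"term": {field_a: ""}},
                    {"term": {field_b: ""}},
                ],
                # Range clauses in "filter" do not affect scoring
                "filter": [
                    {"range": {ts_field: {"gte": "now-10d", "lt": "now-2d"}}}
                ],
            }
        }
    }


query = build_query()
print(query["query"]["bool"]["filter"][0]["range"]["@timestamp"])
```

The script that follows keeps the older `filtered` form, which applies to Elasticsearch 1.x and 2.x.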

#coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{"host": "x.x.x.x", "port": 9200}])
index = "test"
# "filtered" is Elasticsearch 1.x/2.x syntax; it was removed in 5.0
query = {
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    # One list: a dict literal keeps only the last of two "must_not" keys
                    "must_not": [
                        {"term": {"A": ""}},
                        {"term": {"B": ""}},
                    ]
                }
            },
            "filter": {
                "range": {"@timestamp": {"gte": "now-10d", "lt": "now-2d"}}
            }
        }
    }
}
resp = es.search(index=index, body=query, scroll="1m", size=100)
resp_docs = resp["hits"]["hits"]
total = resp["hits"]["total"]
count = len(resp_docs)
datas = list(resp_docs)  # accumulates the hits from every page
while len(resp_docs) > 0:
    # Always use the most recent scroll id to request the next page
    scroll_id = resp["_scroll_id"]
    resp = es.scroll(scroll_id=scroll_id, scroll="1m")
    resp_docs = resp["hits"]["hits"]
    datas.extend(resp_docs)
    count += len(resp_docs)
    if count >= total:
        break

print(len(datas))
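
The scroll loop above can be wrapped in a generator so that callers iterate over hits without tracking scroll ids or counts. This is a rough sketch, not the author's method: `iter_scroll` and `FakeClient` are hypothetical names, and the fake client exists only so the example runs without a cluster. With a real `Elasticsearch` instance the same three calls (`search`, `scroll`, `clear_scroll`) are assumed to be used in the keyword style shown.

```python
def iter_scroll(client, index, query, scroll="1m", size=100):
    """Yield every hit matching `query`, fetching one page at a time."""
    resp = client.search(index=index, body=query, scroll=scroll, size=size)
    scroll_id = resp["_scroll_id"]
    try:
        # An empty page means the scroll is exhausted
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            resp = client.scroll(scroll_id=scroll_id, scroll=scroll)
            scroll_id = resp["_scroll_id"]
    finally:
        # Release the server-side scroll context when iteration ends
        client.clear_scroll(scroll_id=scroll_id)


class FakeClient(object):
    """Stand-in for Elasticsearch that serves canned pages (illustration only)."""

    def __init__(self, docs):
        self.docs = docs
        self.pos = 0
        self.size = 0
        self.cleared = False

    def _page(self):
        page = self.docs[self.pos:self.pos + self.size]
        self.pos += len(page)
        return {"_scroll_id": "fake-id",
                "hits": {"total": len(self.docs), "hits": page}}

    def search(self, index, body, scroll, size):
        self.size = size
        return self._page()

    def scroll(self, scroll_id, scroll):
        return self._page()

    def clear_scroll(self, scroll_id):
        self.cleared = True


fake = FakeClient([{"_id": str(i)} for i in range(250)])
docs = list(iter_scroll(fake, "test", {"query": {"match_all": {}}}))
print(len(docs))
```

Stopping on an empty page rather than on `count >= total` avoids depending on the shape of `hits.total`. The client library also ships `elasticsearch.helpers.scan`, which wraps the same pattern.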

 

3. Aggregations

Find out how many distinct values the @timestamp field has

#coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{"host": "x.x.x.x", "port": 9200}])
index = "test"
# terms aggregation: one bucket per distinct @timestamp value (top 10 by default)
query = {"aggs": {"all_times": {"terms": {"field": "@timestamp"}}}}
resp = es.search(index=index, body=query)
total = resp["hits"]["total"]
print(total)
print(resp["aggregations"])
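
A `terms` aggregation returns only the top buckets (10 by default), so counting its buckets can undercount. If the goal is just the number of distinct values, a `cardinality` aggregation may be a better fit, with the caveat that its result is approximate. A small sketch follows; the function names, the aggregation name `distinct_values`, and the sample response are all assumptions for illustration.

```python
def distinct_count_query(field):
    """Build a request body that asks only for the distinct-value count."""
    return {
        "size": 0,  # skip the hits themselves, only the aggregation is needed
        "aggs": {"distinct_values": {"cardinality": {"field": field}}},
    }


def read_distinct_count(resp):
    """Pull the (approximate) distinct count out of a search response."""
    return resp["aggregations"]["distinct_values"]["value"]


query = distinct_count_query("@timestamp")

# Hand-written sample response (an assumption, not real cluster output)
sample_resp = {
    "hits": {"total": 42, "hits": []},
    "aggregations": {"distinct_values": {"value": 7}},
}
print(read_distinct_count(sample_resp))
```

Against a live cluster the body would be sent the same way as the other queries here, for example `es.search(index=index, body=query)`.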

 
