A Simple Guide to Using Elasticsearch, a Distributed Search Engine

Official website: https://www.elastic.co/products/elasticsearch/

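I. Features

1. Supports Chinese word segmentation (through analysis plugins)

2. Full-text search engine that can index data from multiple kinds of data sources

3. Distributed

4. Open-source search engine built on Lucene

5. RESTful API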

II. Resources

  • smartcn, the officially maintained Chinese analyzer: https://github.com/elasticsearch/elasticsearch-analysis-smartcn

  • mmseg: https://github.com/medcl/elasticsearch-analysis-mmseg

  • ik: https://github.com/medcl/elasticsearch-analysis-ik

  • pinyin, a pinyin analyzer that can suggest Chinese text from pinyin input: https://github.com/medcl/elasticsearch-analysis-pinyin

  • stconvert, conversion between Simplified and Traditional Chinese: https://github.com/medcl/elasticsearch-analysis-stconvert

  • elasticsearch-servicewrapper: https://github.com/elasticsearch/elasticsearch-servicewrapper

  • Elastic HQ, a monitoring tool for Elasticsearch: http://www.elastichq.org

  • elasticsearch-rtf: https://github.com/medcl/elasticsearch-rtf

III. Installation

  • Server: Linux (CentOS 6.x)

  • Java environment: JDK 1.8.0

  • elasticsearch: 2.3.1

  • elasticsearch-jdbc (data source plugin): 2.3.1

  • IK Analysis (Chinese word segmentation plugin): 1.9.1

1. Install Java

yum install java-1.8.0

2. Install Elasticsearch

# Create the repo file (elasticsearch.repo)
cat >> /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=https://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
EOF

# Import the GPG key
rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch

yum install elasticsearch

3. Create the data and log directories

mkdir -p /data/elasticsearch/data
mkdir -p /data/elasticsearch/logs
chown -R elasticsearch /data/elasticsearch/data
chown -R elasticsearch /data/elasticsearch/logs

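4. Write the configuration file (/etc/elasticsearch/elasticsearch.yml)

# Cluster name (must be identical on every node of the same cluster)
cluster.name: my-application
# Node name (must differ for each node)
node.name: node-1
# Data storage path
path.data: /data/elasticsearch/data
# Log path
path.logs: /data/elasticsearch/logs
# Address the service binds to
network.host: 0.0.0.0
# HTTP port
http.port: 9200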
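Once the configuration is written, it can help to confirm that the node actually answers on the configured port. The following is a minimal sketch, not part of the original setup: the `ES_HOST` and `ES_PORT` variables and the `es_url` and `check_health` helpers are illustrative names, and the defaults assume the `http.port: 9200` setting shown above.

```shell
#!/usr/bin/env bash
# Illustrative helper names; the defaults assume http.port: 9200 from the config above.
ES_HOST="${ES_HOST:-localhost}"
ES_PORT="${ES_PORT:-9200}"

# Build a URL for a path on the node, tolerating a leading slash in the path.
es_url() {
  printf 'http://%s:%s/%s\n' "$ES_HOST" "$ES_PORT" "${1#/}"
}

# Query the cluster health endpoint. This needs a running node,
# so the function is only defined here and not called.
check_health() {
  curl -s "$(es_url '_cluster/health?pretty')"
}

# Usage (run manually after `service elasticsearch start`):
#   check_health
```

A `status` of `green` or `yellow` in the response indicates that the node is up; on a single-node setup `yellow` is expected once an index with replicas exists.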

IV. Installing IK

1. Install Maven

wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
yum install apache-maven

2. Download the IK source code

git clone https://github.com/medcl/elasticsearch-analysis-ik

3. Build the plugin jar

mvn clean
mvn compile
mvn package
unzip target/releases/elasticsearch-analysis-ik-*.zip
cp -r target/releases/ /usr/share/elasticsearch/plugins/ik

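4. Configure the dictionaries (IK ships with the Sogou dictionary)

Configuration file: /usr/share/elasticsearch/plugins/ik/config/ik/IKAnalyzer.cfg.xml

<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>

Copy the jar into the plugins/analysis-ik directory of Elasticsearch, then copy the extracted ik directory (configuration, dictionaries, and so on) into the Elasticsearch config directory. Next, edit elasticsearch.yml and append the following line:

index.analysis.analyzer.ik.type : "ik"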

Restart the service:

service elasticsearch restart

Then load your data and create the index.
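Creating an index whose text field uses the IK analyzer could look roughly like the sketch below. The index name `articles`, the type `doc`, and the field `content` are hypothetical, as are the helper function names; the analyzer name `ik` is the one registered in elasticsearch.yml above, and the mapping uses the `string` field type of Elasticsearch 2.x.

```shell
#!/usr/bin/env bash
# Assumed names: index "articles", type "doc", and field "content" are illustrative only.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a 2.x-style mapping body that analyzes one string field with IK.
# The analyzer defaults to "ik", the name configured in elasticsearch.yml.
ik_mapping() {
  local field="$1" analyzer="${2:-ik}"
  printf '{"properties":{"%s":{"type":"string","analyzer":"%s"}}}\n' "$field" "$analyzer"
}

# Create the index, then attach the mapping to a type (requires a running node).
create_ik_index() {
  local index="$1" type="$2" field="$3"
  curl -s -XPUT "$ES_URL/$index" &&
    curl -s -XPUT "$ES_URL/$index/_mapping/$type" -d "$(ik_mapping "$field")"
}

# Ask the node how IK splits a sample string (requires a running node).
# The text must not contain double quotes, since it is placed into JSON as-is.
analyze_sample() {
  local index="$1" text="$2"
  curl -s -XPOST "$ES_URL/$index/_analyze?pretty" \
    -d "$(printf '{"analyzer":"ik","text":"%s"}' "$text")"
}

# Usage:
#   create_ik_index articles doc content
#   analyze_sample articles '中华人民共和国'
```

If the analyze call returns multi-character tokens rather than one token per character, the plugin and its dictionaries are most likely loaded correctly.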

 

V. elasticsearch-jdbc

1. Using the feeder approach

wget http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/2.3.1.0/elasticsearch-jdbc-2.3.1.0-dist.zip
unzip elasticsearch-jdbc-2.3.1.0-dist.zip

Edit the data import script import.sh:

# Adjust this path to wherever the archive was unpacked
export JDBC_IMPORTER_HOME=/elasticsearch-jdbc-2.3.1.0
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://127.0.0.1:3306/dbtest",
        "user" : "root",
        "password" : "123456",
        "sql" : "select * from test_tb",
        "index" : "customer",
        "type" : "external"
    }
}' | java \
    -cp "${lib}/*" \
    -Dlog4j.configurationFile=${bin}/log4j2.xml \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter

Test:

curl 'localhost:9200/customer/external/_search?pretty&q=*'
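Beyond the match-all check above, a field-level query is a more telling test of the import. The sketch below sends a `match` query to the `customer` index and `external` type used by the import script. The field name `name` and the helper function names are hypothetical; substitute a real column of `test_tb`.

```shell
#!/usr/bin/env bash
# Assumed: the field "name" is hypothetical; use an actual column from test_tb.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a match-query body for one field and one search term.
# The term must not contain double quotes, since it is placed into JSON as-is.
match_query() {
  printf '{"query":{"match":{"%s":"%s"}}}\n' "$1" "$2"
}

# Run the query against the imported index and type (requires a running node).
search_customers() {
  curl -s -XPOST "$ES_URL/customer/external/_search?pretty" \
    -d "$(match_query "$1" "$2")"
}

# Usage:
#   search_customers name 'Alice'
```

Keeping the query body in its own function makes it easy to inspect the JSON before sending it to the node.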

2. Using the river approach (Elasticsearch 1.x only; rivers were removed in Elasticsearch 2.0)

# Install Elasticsearch
curl -OL https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.2.zip
cd $ES_HOME
unzip path/to/elasticsearch-1.4.2.zip

# Install the JDBC plugin
./bin/plugin --install jdbc --url http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-river-jdbc/1.4.0.6/elasticsearch-river-jdbc-1.4.0.6-plugin.zip

# Download the MySQL driver (extract the archive before copying the jar)
curl -o mysql-connector-java-5.1.33.zip -L 'http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.33.zip/from/http://cdn.mysql.com/'
cp mysql-connector-java-5.1.33-bin.jar $ES_HOME/plugins/jdbc/
chmod 644 $ES_HOME/plugins/jdbc/*

# Start Elasticsearch
./bin/elasticsearch

# Stop the river
curl -XDELETE 'localhost:9200/_river/my_jdbc_river/'

JDBC plugin parameters

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "",
        "password" : "",
        "sql" : "select * from orders",
        "index" : "myindex",
        "type" : "mytype",
        ...
    }
}'

Multiple river sources are also possible if an array is passed to the jdbc field:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
    <river parameters>
    "type" : "jdbc",
    "jdbc" : [ {
        <river definition 1>
    }, {
        <river definition 2>
    } ]
}'

A complete example:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "driver" : "com.mysql.jdbc.Driver",
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "root",
        "password" : "123456",
        "sql" : "select * from test.student;",
        "interval" : "30",
        "index" : "test",
        "type" : "student"
    }
}'

Check whether Elasticsearch has synchronized the data:

curl -XGET 'localhost:9200/test/student/_search?pretty&q=*'

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html

 

 

