Bulk Data Loading

PSQL

Values in the .csv file are written without quotes;
write each value directly, whatever its data type.

If the table already exists, import the data directly:

  bin/psql.py -t EXAMPLE localhost data.csv

To create the table and then load the data:

  ./psql.py localhost:2222 XXX.sql XXX.csv
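As a rough illustration of the two input files, the sketch below writes a small SQL file and a matching CSV file. The file names, table name, and columns are made up for this example and do not come from the original setup. When -t is not given, psql.py is expected to derive the table name from the CSV file name, so example.csv would load into EXAMPLE.

```shell
# Hypothetical sample files; table and column names are illustrative only.
cat > example.sql <<'EOF'
CREATE TABLE IF NOT EXISTS EXAMPLE (
    ID BIGINT NOT NULL PRIMARY KEY,
    NAME VARCHAR,
    SCORE DECIMAL
);
EOF

# Unquoted values, one row per line, in the same order as the table columns.
cat > example.csv <<'EOF'
1,alice,90.5
2,bob,72
EOF

# Then, from the Phoenix bin directory (needs a running cluster):
# ./psql.py localhost:2222 example.sql example.csv
```

The SQL file is listed before the CSV file so that the table exists by the time the rows are loaded.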


MapReduce


Add to /etc/profile:

  export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/CI-zcl/hbase-0.98.6.1-hadoop2/lib/hbase-protocol-0.98.6.1-hadoop2.jar


Error

  17/02/14 16:50:08 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
  java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
  at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
  17/02/14 16:50:08 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=master:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
  17/02/14 16:50:08 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
  17/02/14 16:50:09 INFO zookeeper.ClientCnxn: Opening socket connection to server master/10.2.32.22:2181. Will not attempt to authenticate using SASL (unknown error)
  17/02/14 16:50:09 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect

Cause and fix

Read the logs; this is important.
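Changing the port back to 2181 fixed it; that is the port the client tries to reach in the log above (quorum=master:2181).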
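One way to check which ZooKeeper port is configured is to read it from hbase-site.xml. The sketch below assumes the port was set through the hbase.zookeeper.property.clientPort property and that the value tag sits on the line right after the name tag; if ZooKeeper is managed separately, the clientPort entry in zoo.cfg applies instead. The helper name zk_client_port and the example path are illustrative, not part of the original setup.

```shell
# Print the ZooKeeper client port configured in an hbase-site.xml file.
# Assumes <value> sits on the line right after the matching <name>.
zk_client_port() {
    grep -A1 '<name>hbase.zookeeper.property.clientPort</name>' "$1" \
        | sed -n 's:.*<value>\([0-9]*\)</value>.*:\1:p'
}

# Example (path is an assumption based on the HBase directory used above):
# zk_client_port /CI-zcl/hbase-0.98.6.1-hadoop2/conf/hbase-site.xml
```

If the printed port differs from the one passed to the client (2181 in the log), connections are refused as shown in the error above.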


Run from the Phoenix home directory:

  hadoop jar /CI-zcl/phoenix-4.1.0-bin/phoenix-4.1.0-client-hadoop2.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -t TEST -i /data/tb1.csv -z master:2181






