Reading a Single File from HDFS with the Java API
A single file on HDFS:
-bash-3.2$ hadoop fs -ls /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category
Found 1 items
-rw-r--r--   2 deploy supergroup        520 2014-08-14 17:03 /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category/repeatRecCategory.txt

File contents:
-bash-3.2$ hadoop fs -cat /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category/repeatRecCategory.txt | more
8104 960985 5472 971917 5320 971895 971902 971922 958261 972047 972050
Reading a single HDFS file through the Java API's FileSystem class
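import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Returns the categories that may be recommended repeatedly, joined with commas.
 *
 * @param filePath HDFS path of the file to read
 * @return the first tab-delimited field of every line, each followed by a comma
 */
public String getRepeatRecCategoryStr(String filePath) {
    final String DELIMITER = "\t";
    final String INNER_DELIMITER = ",";

    StringBuilder categoryFilterStrs = new StringBuilder();
    try {
        // FileSystem.get returns a cached, shared instance by default, so it is not closed here.
        FileSystem fs = FileSystem.get(new Configuration());
        // try-with-resources closes the reader and the underlying HDFS stream.
        // UTF-8 is an assumption; the original code relied on the platform default charset.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(fs.open(new Path(filePath)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                // A limit of 2 keeps index 0 valid even for a line that contains only tabs.
                String[] strs = line.split(DELIMITER, 2);
                categoryFilterStrs.append(strs[0]).append(INNER_DELIMITER);
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return categoryFilterStrs.toString();
}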
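The parsing step in the method above depends only on a character stream, not on HDFS itself, so it can be tried out without a cluster. The following is a minimal sketch under that assumption: the class name FirstColumnJoiner and the method joinFirstColumn are hypothetical names introduced for illustration and do not appear in the original post. It keeps the same tab and comma delimiters and the same trailing comma in the result.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of the original post: it isolates the
// "first tab-delimited field per line, joined with commas" logic so that
// it can run against any Reader (an HDFS stream, a local file, a String).
final class FirstColumnJoiner {
    private static final String DELIMITER = "\t";
    private static final String INNER_DELIMITER = ",";

    private FirstColumnJoiner() {
    }

    /**
     * Reads every line from source and appends its first tab-delimited
     * field followed by a comma. The reader is closed before returning.
     */
    static String joinFirstColumn(Reader source) throws IOException {
        StringBuilder joined = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                // A limit of 2 guarantees at least one element in the result,
                // even when the line consists only of tab characters.
                String first = line.split(DELIMITER, 2)[0];
                joined.append(first).append(INNER_DELIMITER);
            }
        }
        return joined.toString();
    }
}
```

With Hadoop on the classpath, the same helper could be fed from HDFS by passing new InputStreamReader(fs.open(new Path(filePath)), StandardCharsets.UTF_8) as the source; for a quick local check, a java.io.StringReader over a few sample lines is enough.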