Studying Hadoop Together (Part 2): Operating HDFS with Java Code

芝加哥09

Blog categories: hadoop, Java

Tags: hadoop, big data, massive data

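I won't explain what HDFS is here; a quick search on Google or Baidu will tell you everything you need to know.

In this post I use Java code to create, delete, and query files and directories on HDFS.

The project is managed with Maven, so the pom file is as follows: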

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.cloud.hdfs</groupId>
  <artifactId>java-hdfs</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>java-hdfs</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
  </dependencies>
</project>

HDFSClient.java

package com.cloud.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HDFSClient {
    private FileSystem fileSystem;

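    /**
     * Instantiates the fileSystem when the object is created.
     * @param conf Hadoop configuration used to obtain the file system
     * @throws IOException if the file system cannot be obtained
     */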
    public HDFSClient(Configuration conf) throws IOException {
        fileSystem = FileSystem.get(conf);
    }

    public void close() throws IOException {
        fileSystem.close();
    }

    /**
     * Equivalent command:
     * hadoop fs -ls /chris
     * @param folder HDFS directory to list
     * @throws IOException if the listing fails
     */
    public void ls(String folder) throws IOException {
        Path path = new Path(folder);
        FileStatus[] fileStatus = fileSystem.listStatus(path);
        System.out.println("====================================================");
        for (FileStatus fs : fileStatus) {
            System.out.println("name: " + fs.getPath() +" folder: " + fs.isDir() + " size: " + fs.getLen() + " permission: " + fs.getPermission());
        }
        System.out.println("====================================================");
    }

    /**
     * Equivalent command:
     * hadoop fs -mkdir /chris/client
     * @param folder HDFS directory to create
     * @throws IOException if the directory cannot be created
     */
    public void mkdir(String folder) throws IOException {
        Path path = new Path(folder);
        if (!fileSystem.exists(path)) {
            fileSystem.mkdirs(path);
            System.out.println("Created " + folder);
        }
    }

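    /**
     * Equivalent command:
     * hadoop fs -rmr /chris/client
     * @param folder HDFS path to delete recursively
     * @throws IOException if the delete fails
     */
    public void rmr(String folder) throws IOException {
        Path path = new Path(folder);
        // delete(path, true) removes the path immediately and recursively;
        // deleteOnExit would only remove it when the FileSystem is closed.
        if (fileSystem.delete(path, true)) {
            System.out.println("Deleted " + folder);
        }
    }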

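    /**
     * Equivalent command:
     * hadoop fs -copyFromLocal /home/chris/test /chris/
     * Note: Ubuntu runs in a virtual machine hosted on Win7, and this program
     *       is run on Win7, so the local path here is a Win7 path.
     * @param local local source path
     * @param remote HDFS destination path
     * @throws IOException if the copy fails
     */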
    public void copyFile(String local, String remote) throws IOException {
        fileSystem.copyFromLocalFile(new Path(local), new Path(remote));
        System.out.println("Copy from " + local +" to " + remote);
    }

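    /**
     * Equivalent command:
     * hadoop fs -cat /chris/test
     * @param file HDFS file to print
     * @throws IOException if the file cannot be read
     */
    public void cat(String file) throws IOException {
        Path path = new Path(file);
        FSDataInputStream in = fileSystem.open(path);
        try {
            // Copy with a 4096-byte buffer; false means copyBytes does not close the streams.
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            // Close the input stream even if the copy fails.
            IOUtils.closeStream(in);
        }
    }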

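    /**
     * Equivalent command:
     * hadoop fs -copyToLocal /tmp/core-site.xml /home/chris
     * Note: the local path here is also a Win7 path, for the same reason as above.
     * @param remote HDFS source path
     * @param local local destination path
     * @throws IOException if the copy fails
     */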
    public void download(String remote, String local) throws IOException {
        fileSystem.copyToLocalFile(new Path(remote), new Path(local));
        System.out.println("Download from " + remote + " to " + local);
    }

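    /**
     * No single command does this.
     * There is a command that creates a file: hadoop fs -touchz /chris/hehe
     * but the file it creates is empty.
     * @param file HDFS path of the file to create
     * @param content text to write into the file
     * @throws IOException if the write fails
     */
    public void createFile(String file, String content) throws IOException {
        // Encode explicitly as UTF-8 so the bytes do not depend on the platform default charset.
        byte[] buff = content.getBytes("UTF-8");
        FSDataOutputStream out = fileSystem.create(new Path(file));
        try {
            out.write(buff, 0, buff.length);
        } finally {
            // Close the output stream even if the write fails.
            out.close();
        }
    }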

    public static void main(String[] args) throws Exception {
        // When this Configuration is created, it automatically loads the Hadoop configuration files under resources
        Configuration config = new Configuration();
        HDFSClient client = new HDFSClient(config);
//        client.mkdir("/chris/client");
//        client.ls("/chris");
//        client.rmr("/chris/client");
//        client.ls("/chris");
//        client.cat("/chris/test");
//        client.copyFile("src/main/resources/core-site.xml", "/tmp/");
//        client.download("/chris/test", "src/main/resources/");
//        client.createFile("/chris/client.txt", "ddddddddd");
        client.close();
    }
}
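
The cat method hands the actual data transfer to IOUtils.copyBytes with a 4096-byte buffer. As a rough illustration of what a buffered stream copy of that kind does, here is a JDK-only sketch. BufferedCopyDemo and its copy method are hypothetical names made up for this example, not Hadoop's implementation, and in-memory streams stand in for the HDFS input stream and System.out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferedCopyDemo {

    /**
     * Copies everything from in to out using a fixed-size buffer.
     * Hypothetical helper for illustration; it never closes either stream,
     * so the caller stays responsible for closing them.
     */
    public static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 once the end of the stream is reached
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // 10000 bytes of sample data, so the loop needs more than one pass
        byte[] data = new byte[10000];
        InputStream in = new ByteArrayInputStream(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            long copied = copy(in, out, 4096);
            System.out.println("copied " + copied + " bytes");
        } finally {
            // The caller closes the input, as cat does with its HDFS stream
            in.close();
        }
    }
}
```

Because the helper leaves both streams open, the caller can keep writing to the destination afterwards, which is why a copy that targets System.out should not close it.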

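
createFile turns the content string into bytes before writing, so what ends up in HDFS depends on the charset used for that conversion. This matters in the setup described above, where the client runs on Win7 and the cluster on Ubuntu. The following JDK-only sketch (no Hadoop calls; EncodingDemo is a hypothetical class name, and StandardCharsets assumes Java 7 or later) encodes the same two-character Chinese string in two charsets to show how the byte counts differ.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file encoding does not matter
        String content = "\u4f60\u597d";

        byte[] utf8 = content.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = content.getBytes(StandardCharsets.UTF_16BE);

        System.out.println("UTF-8 bytes: " + utf8.length);
        System.out.println("UTF-16BE bytes: " + utf16.length);

        // Decoding with the same charset that was used for encoding restores the text
        String roundTrip = new String(utf8, StandardCharsets.UTF_8);
        System.out.println("round trip ok: " + content.equals(roundTrip));
    }
}
```

Choosing one charset for writing and using the same one when reading the file back keeps the content readable regardless of which machine runs the client.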
2014-03-26 23:01