A client for the Kosmos filesystem (KFS)

Introduction

This page describes how to use the Kosmos Filesystem (KFS) as a backing store with Hadoop. It assumes that you have downloaded the KFS software and installed the necessary binaries as outlined in the KFS documentation.

Steps

  • In the Hadoop conf directory, edit core-site.xml and add the following:
    <property>
      <name>fs.kfs.impl</name>
      <value>org.apache.hadoop.fs.kfs.KosmosFileSystem</value>
      <description>The FileSystem for kfs: uris.</description>
    </property>
                
  • In the same core-site.xml, also add the following, with appropriate values for <server> and <port> (a programmatic equivalent is sketched at the end of this page):
    <property>
      <name>fs.default.name</name>
      <value>kfs://<server:port></value> 
    </property>
    
    <property>
      <name>fs.kfs.metaServerHost</name>
      <value><server></value>
      <description>The location of the KFS meta server.</description>
    </property>
    
    <property>
      <name>fs.kfs.metaServerPort</name>
      <value><port></value>
      <description>The port on which the KFS meta server listens.</description>
    </property>
    
    
  • Copy KFS's kfs-0.1.jar to Hadoop's lib directory. This step enables Hadoop to load the KFS-specific modules. Note that kfs-0.1.jar was built when you compiled the KFS source code. This jar file contains code that calls KFS's client library via JNI; the native code is in KFS's libkfsClient.so library.
  • When the Hadoop map/reduce trackers start up, those processes (on local as well as remote nodes) will need to load KFS's libkfsClient.so library. To simplify this, it is advisable to store libkfsClient.so in an NFS-accessible directory (similar to where the Hadoop binaries/scripts are stored). Then modify Hadoop's conf/hadoop-env.sh, adding the following line with a suitable value for <path> (a library-loading check is sketched at the end of this page):
    export LD_LIBRARY_PATH=<path>
    
  • Start only the map/reduce trackers.
    For example, execute Hadoop's bin/start-mapred.sh

Once the map/reduce trackers start up, all file I/O is done to KFS.
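
For reference, the same settings can be applied programmatically instead of through core-site.xml. The following is a minimal sketch, not part of the KFS client itself: the host kfs.example.com, the port 20000, and the class name KfsConfigSketch are hypothetical placeholders, while the property names and the KosmosFileSystem class are exactly those listed above.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class KfsConfigSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // The same keys as in core-site.xml above; host and port are placeholders.
        conf.set("fs.kfs.impl", "org.apache.hadoop.fs.kfs.KosmosFileSystem");
        conf.set("fs.default.name", "kfs://kfs.example.com:20000");
        conf.set("fs.kfs.metaServerHost", "kfs.example.com");
        conf.set("fs.kfs.metaServerPort", "20000");

        // FileSystem.get() resolves the kfs: scheme through fs.kfs.impl.
        FileSystem fs = FileSystem.get(URI.create("kfs://kfs.example.com:20000"), conf);
        System.out.println("Resolved filesystem: " + fs.getClass().getName());
      }
    }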
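
Because the jar calls into libkfsClient.so over JNI, a quick way to verify that LD_LIBRARY_PATH is set correctly on a node is to try loading the library by hand. This sketch assumes the JNI library name kfsClient, derived from libkfsClient.so by the standard lib<name>.so convention; the name the KFS client code actually loads may differ.

    public class KfsNativeLibCheck {
      public static void main(String[] args) {
        // On Linux, java.library.path is seeded from LD_LIBRARY_PATH.
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        try {
          // "kfsClient" maps to libkfsClient.so under the standard JNI naming scheme.
          System.loadLibrary("kfsClient");
          System.out.println("libkfsClient.so loaded successfully");
        } catch (UnsatisfiedLinkError e) {
          System.err.println("Failed to load libkfsClient.so: " + e.getMessage());
        }
      }
    }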
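
As an end-to-end check that file I/O really goes to KFS, the sketch below writes a small file through the Hadoop FileSystem API and reads it back. It is illustrative only: the kfs:// URI and the test path /tmp/kfs-smoke-test.txt are hypothetical, and it assumes the configuration steps above have been completed.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class KfsSmokeTest {
      public static void main(String[] args) throws Exception {
        // Picks up core-site.xml from the Hadoop conf directory on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("kfs://kfs.example.com:20000"), conf);

        Path p = new Path("/tmp/kfs-smoke-test.txt"); // hypothetical test path
        FSDataOutputStream out = fs.create(p, true);  // overwrite if the file exists
        out.writeUTF("hello, kfs");
        out.close();

        FSDataInputStream in = fs.open(p);
        System.out.println(in.readUTF()); // should print: hello, kfs
        in.close();

        fs.delete(p, false); // non-recursive delete to clean up
      }
    }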



