In this post we’ll see a Java program to read a file in HDFS. You can read a file in HDFS in two ways:
- Create an FSDataInputStream object and read the file's data through it.
- Use the IOUtils class provided by the Hadoop framework.
Reading HDFS File Using FSDataInputStream
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HDFSFileRead {
 public static void main(String[] args) {
  Configuration conf = new Configuration();
  FSDataInputStream in = null;
  OutputStream out = null;
  try {
   FileSystem fs = FileSystem.get(conf);
   // Input file path
   Path inFile = new Path(args[0]);
     
   // Check if file exists at the given location
   if (!fs.exists(inFile)) {
    System.out.println("Input file not found");
    throw new IOException("Input file not found");
   }
   // Open the file for reading
   in = fs.open(inFile);
   // Display the file content on the terminal
   out = System.out;
   byte[] buffer = new byte[256];
  
   int bytesRead;
   // read() returns -1 at end of stream
   while ((bytesRead = in.read(buffer)) != -1) {
    out.write(buffer, 0, bytesRead);
   }
  } catch (IOException e) {
   e.printStackTrace();
  } finally {
   // Flush the terminal output and close only the HDFS input stream.
   // System.out is not closed because it belongs to the JVM.
   try {
    if (out != null) {
     out.flush();
    }
    if (in != null) {
     in.close();
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
  }
 }
}
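To execute this program, add the location of the compiled class to Hadoop’s classpath.
export HADOOP_CLASSPATH=<PATH TO .class FILE>
Then run the program, passing the HDFS path of the file to read as the argument:
hadoop org.netjs.HDFSFileRead /user/process/display.txt
The command uses the fully qualified class name, so it assumes the class is declared in the package org.netjs (the listing above omits the package statement). In that case HADOOP_CLASSPATH should point to the directory that contains the org/netjs folder of compiled classes.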
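The read loop from the example above can also be tried without a cluster. The following is only a minimal sketch using plain java.io classes: ReadLoopSketch and its copy method are hypothetical names, and a ByteArrayInputStream stands in for the stream returned by fs.open(). It shows why the loop writes only bytesRead bytes (a read may fill just part of the buffer) and why -1 marks the end of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration; not part of Hadoop.
class ReadLoopSketch {

  // Copies everything from in to out and returns the size of each chunk read.
  static List<Integer> copy(InputStream in, OutputStream out, int bufferSize)
      throws IOException {
    byte[] buffer = new byte[bufferSize];
    List<Integer> chunkSizes = new ArrayList<>();
    int bytesRead;
    // read() returns -1 once the end of the stream is reached
    while ((bytesRead = in.read(buffer)) != -1) {
      // Write only the bytes filled by this read, not the whole buffer
      out.write(buffer, 0, bytesRead);
      chunkSizes.add(bytesRead);
    }
    out.flush();
    return chunkSizes;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // Assumption: the in-memory stream is a stand-in for fs.open(inFile).
    // try-with-resources closes the input stream automatically.
    try (InputStream in = new ByteArrayInputStream(data)) {
      System.out.println(copy(in, sink, 4)); // prints [4, 4, 2]
    }
    System.out.println(sink.toString("US-ASCII")); // prints 0123456789
  }
}
```

With a 4-byte buffer and 10 bytes of input, the last read fills only 2 bytes, which is why writing the full buffer on every pass would output stale data.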
Reading HDFS File Using IOUtils Class
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
public class HDFSFileRead {
 public static void main(String[] args) {
  Configuration conf = new Configuration();
  FSDataInputStream in = null;
  try {
   FileSystem fs = FileSystem.get(conf);
   // Input file path
   Path inFile = new Path(args[0]);
     
   // Check if file exists at the given location
   if (!fs.exists(inFile)) {
    System.out.println("Input file not found");
    throw new IOException("Input file not found");
   }
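   in = fs.open(inFile);
   
   // Copy the file content to the terminal using a 512-byte buffer;
   // false means the streams are not closed when the copy finishes
   IOUtils.copyBytes(in, System.out, 512, false);
  } catch (IOException e) {
   e.printStackTrace();
  } finally {
   // Close the HDFS input stream; System.out stays open
   IOUtils.closeStream(in);
  }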
 }
}
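The last argument of IOUtils.copyBytes decides whether both streams are closed once the copy ends. Passing false, as in the program above, keeps System.out usable and leaves closing the input stream to the finally block. The sketch below only approximates that behaviour with plain java.io classes so it can run without Hadoop: CopyBytesSketch, TrackedInput, and this copyBytes method are hypothetical stand-ins, not Hadoop's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in that mimics the effect of the close flag;
// it is not the Hadoop IOUtils source.
class CopyBytesSketch {

  // Input stream that records whether close() was called.
  static class TrackedInput extends ByteArrayInputStream {
    boolean closed = false;

    TrackedInput(byte[] data) {
      super(data);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  // Copies in to out with a buffer of buffSize bytes; closes both
  // streams afterwards only when close is true.
  static void copyBytes(InputStream in, OutputStream out, int buffSize, boolean close)
      throws IOException {
    try {
      byte[] buffer = new byte[buffSize];
      int n;
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
      }
      out.flush();
    } finally {
      if (close) {
        in.close();
        out.close();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    TrackedInput in = new TrackedInput("hello hdfs".getBytes(StandardCharsets.US_ASCII));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    copyBytes(in, out, 512, false);
    // With close=false the caller still owns the stream
    System.out.println(out.toString("US-ASCII") + " closed=" + in.closed);
    in.close(); // plays the role of IOUtils.closeStream(in) in the finally block
    System.out.println("closed=" + in.closed);
  }
}
```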
That's all for this topic, Java Program to Read File in HDFS. If you have any doubts or suggestions, please drop a comment. Thanks!