In this post we’ll see a Java program to read a file in HDFS. You can read a file in HDFS in two ways:
- Create an FSDataInputStream object and use it to read data from the file.
- Use the IOUtils class provided by the Hadoop framework.
Reading HDFS file Using FSDataInputStream
// Package matches the run command (org.netjs.HDFSFileRead)
package org.netjs;

import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HDFSFileRead {
  public static void main(String[] args) {
    if (args.length < 1) {
      System.err.println("Usage: HDFSFileRead <HDFS file path>");
      return;
    }
    Configuration conf = new Configuration();
    FSDataInputStream in = null;
    // Display file content on the terminal
    OutputStream out = System.out;
    try {
      FileSystem fs = FileSystem.get(conf);
      // Input file path
      Path inFile = new Path(args[0]);
      // Check if file exists at the given location
      if (!fs.exists(inFile)) {
        System.out.println("Input file not found");
        throw new IOException("Input file not found");
      }
      // Open the file and read it in chunks
      in = fs.open(inFile);
      byte[] buffer = new byte[256];
      int bytesRead;
      // read() returns -1 at end of file
      while ((bytesRead = in.read(buffer)) != -1) {
        out.write(buffer, 0, bytesRead);
      }
      // Flush rather than close: closing System.out would silence later output
      out.flush();
    } catch (IOException e) {
      e.printStackTrace();
    } finally {
      // Close the HDFS input stream
      try {
        if (in != null) {
          in.close();
        }
      } catch (IOException e) {
        e.printStackTrace();
      }
    }
  }
}
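To execute this program, add the location of the compiled class to Hadoop’s classpath. Since the class is run as org.netjs.HDFSFileRead, this should be the directory that contains the org/netjs package folders.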
export HADOOP_CLASSPATH=<PATH TO .class FILE>
To run the program:
hadoop org.netjs.HDFSFileRead /user/process/display.txt
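The read loop in the program above can be tried without a cluster. FSDataInputStream is a java.io.InputStream, so the same buffer-and-copy pattern applies to any input stream. The following is a minimal sketch, assuming an in-memory ByteArrayInputStream as a stand-in for the HDFS stream. The class name ReadLoopSketch and the helper copy are hypothetical and are not part of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class ReadLoopSketch {

    // Hypothetical helper: copies everything from in to out through a
    // fixed-size buffer and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int bytesRead;
        // read() fills at most buffer.length bytes and returns -1 at end of stream
        while ((bytesRead = in.read(buffer)) != -1) {
            // Write only the bytes actually read, not the whole buffer
            out.write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        // Flush instead of closing, so the caller decides when out is closed
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory data standing in for the content of an HDFS file (assumption)
        byte[] data = "line one\nline two\n".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(data)) {
            long copied = copy(in, System.out, 4);
            System.out.println("bytes copied: " + copied);
        }
    }
}
```

The small buffer size of 4 is chosen only to force several loop iterations. In the HDFS program, the stream returned by fs.open(inFile) would take the place of the in-memory stream.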
Reading HDFS file Using IOUtils class
// Package matches the run command (org.netjs.HDFSFileRead)
package org.netjs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HDFSFileRead {
  public static void main(String[] args) {
    if (args.length < 1) {
      System.err.println("Usage: HDFSFileRead <HDFS file path>");
      return;
    }
    Configuration conf = new Configuration();
    FSDataInputStream in = null;
    try {
      FileSystem fs = FileSystem.get(conf);
      // Input file path
      Path inFile = new Path(args[0]);
      // Check if file exists at the given location
      if (!fs.exists(inFile)) {
        System.out.println("Input file not found");
        throw new IOException("Input file not found");
      }
      in = fs.open(inFile);
      // Copy the file to the terminal with a 512-byte buffer;
      // false means copyBytes does not close the streams itself
      IOUtils.copyBytes(in, System.out, 512, false);
    } catch (IOException e) {
      e.printStackTrace();
    } finally {
      // Close the HDFS input stream; System.out is left open
      IOUtils.closeStream(in);
    }
  }
}
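The stream handling in the IOUtils version relies on two things: passing false as the last argument to copyBytes leaves both streams open after the copy, and the closeStream call in the finally block then closes only the input stream. The following is a rough sketch of that pattern in plain Java, so it can be run without Hadoop. The class CloseHandlingSketch and its helpers copy, closeQuietly and TrackingInput are hypothetical stand-ins written for illustration, not the Hadoop source.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CloseHandlingSketch {

    // Hypothetical stand-in for a null-safe close that ignores IOException
    static void closeQuietly(Closeable stream) {
        if (stream == null) {
            return;
        }
        try {
            stream.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close fails
        }
    }

    // Hypothetical stand-in for a copy with a close flag:
    // both streams are closed at the end only when close is true
    static void copy(InputStream in, OutputStream out, int bufferSize, boolean close)
            throws IOException {
        try {
            byte[] buffer = new byte[bufferSize];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            out.flush();
        } finally {
            if (close) {
                closeQuietly(out);
                closeQuietly(in);
            }
        }
    }

    // Input stream that records whether close() was called
    static class TrackingInput extends ByteArrayInputStream {
        boolean closed = false;

        TrackingInput(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingInput in = null;
        try {
            in = new TrackingInput("sample content\n".getBytes());
            // close = false: System.out must stay usable after the copy
            copy(in, System.out, 512, false);
            System.out.println("closed after copy: " + in.closed);
        } finally {
            // Input is closed here, whether or not the copy succeeded
            closeQuietly(in);
        }
        System.out.println("closed after finally: " + in.closed);
    }
}
```

Passing true instead would also close the output stream, which is rarely what you want when the output is System.out.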
That's all for this topic, Java Program to Read File in HDFS. If you have any doubts or suggestions, please drop a comment. Thanks!