Setting up LZO compression in Hadoop/YARN

Install the LZO development library

  • On Red Hat based systems: sudo yum install lzo-devel
  • On Debian based systems: sudo apt-get install liblzo2-dev

Clone hadoop-lzo from GitHub and build it with Maven

$ git clone https://github.com/twitter/hadoop-lzo.git
$ cd hadoop-lzo
$ mvn clean package
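
Before running the build, it can help to confirm that the tools it relies on are installed. The following is a minimal sketch; require_cmds is a hypothetical helper, not part of hadoop-lzo, and the list of tools to pass in is an assumption about your build environment.

```shell
# Hypothetical helper: print every command that is not on PATH
# and return non-zero if at least one is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return $missing
}
```

For example, require_cmds git mvn java gcc checks the tools a typical build would likely need (the exact list is an assumption; adjust it to your setup).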

Place the hadoop-lzo-*.jar somewhere on your cluster classpath (the cp paths below are relative to the directory that contains the hadoop-lzo checkout)

$ cp hadoop-lzo/target/hadoop-lzo-0.4.20-SNAPSHOT.jar /data/lib/

Place the native hadoop-lzo binaries in the Hadoop native directory

$ cp hadoop-lzo/target/native/Linux-amd64-64/lib/* $HADOOP_HOME/lib/native/hadoop-lzo/
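
After both copy steps, a quick check that the files landed where expected can save a failed job later. This is a rough sketch: check_lzo_files is a hypothetical helper, and it assumes the native build produces files named libgplcompression.* (verify this against your own target/native directory).

```shell
# Hypothetical helper: confirm the jar and native libraries are in place.
# $1 = directory holding the jar, $2 = directory holding the native libs.
check_lzo_files() {
    jar_dir=$1
    native_dir=$2
    status=0

    # The jar name carries a version string, so match it with a glob.
    if ls "$jar_dir"/hadoop-lzo-*.jar >/dev/null 2>&1; then
        echo "jar: ok"
    else
        echo "jar: missing in $jar_dir"
        status=1
    fi

    # Assumption: the native libraries are named libgplcompression.*
    if ls "$native_dir"/libgplcompression.* >/dev/null 2>&1; then
        echo "native: ok"
    else
        echo "native: missing in $native_dir"
        status=1
    fi
    return $status
}
```

With the paths used above, the call would be check_lzo_files /data/lib "$HADOOP_HOME/lib/native/hadoop-lzo".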

Add the following to your core-site.xml:

<property>
    <name>io.compression.codecs</name>
    <value>
        org.apache.hadoop.io.compress.GzipCodec,
        org.apache.hadoop.io.compress.DefaultCodec,
        org.apache.hadoop.io.compress.BZip2Codec,
        com.hadoop.compression.lzo.LzoCodec,
        com.hadoop.compression.lzo.LzopCodec
    </value>
</property>
<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
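
A simple way to confirm the edit took effect is to search the file for the two LZO codec class names. The sketch below uses a hypothetical helper, has_lzo_codecs, which does a plain text match rather than parsing the XML, so it will not notice a class name that sits inside a comment.

```shell
# Hypothetical helper: report whether a config file mentions both LZO codecs.
# $1 = path to core-site.xml
has_lzo_codecs() {
    conf_file=$1
    for codec in com.hadoop.compression.lzo.LzoCodec \
                 com.hadoop.compression.lzo.LzopCodec; do
        # -F matches a fixed string, -q suppresses output.
        if ! grep -Fq "$codec" "$conf_file"; then
            echo "missing: $codec"
            return 1
        fi
    done
    echo "lzo codecs present"
}
```

Call it with the path to your own core-site.xml, for example has_lzo_codecs "$HADOOP_CONF_DIR/core-site.xml" if that variable is set in your environment (an assumption).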

Add the following to your mapred-site.xml:

<property>
  <name>mapred.child.env</name>
  <value>JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/hadoop-lzo/</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
