Setting up LZO compression in Hadoop/YARN

Install liblzo

  • On Redhat based systems sudo yum install liblzo-devel
  • On Debian based systems sudo apt-get install liblzo2-dev

Clone the hadoop-lzo from github

$ git clone https://github.com/twitter/hadoop-lzo.git
$ cd hadoop-lzo
$ mvn clean package

Place the hadoop-lzo-*.jar somewhere on your cluste classpath

$ cp hadoop-lzo/target/hadoop-lzo-0.4.20-SNAPSHOT.jar /data/lib/

Place the native hadoop-lzo binaries in hadoop native directory

$ cp hadoop-lzo/target/native/Linux-amd64-64/lib/* $HADOOP_HOME/lib/native/hadoop-lzo/

Add the following to your core-site.xml:

<property>
<name>io.compression.codecs</name>
<value>
    org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec,
    com.hadoop.compression.lzo.LzoCodec,
    com.hadoop.compression.lzo.LzopCodec
</value>
</property>
<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

Add the following to your mapred-site.xml:

<property>
  <name>mapred.child.env</name>
  <value>JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/hadoop-lzo/</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

References:

10 comments

Fortunately, Apache Hadoop is a tailor-made solution that delivers on both counts, by turning big data insights into actionable business enhancements for long-term success. To know more, visit Hadoop Training Bangalore

hadoop online training always recommends this site for valuable information related not only about the hadoop but also about various other technologies which are cloud related. Thanks for maintaining this wonderful blog.

I think this is the best blog I have been through all this day.


Hadoop Training in Chennai

Excellent and very cool idea and the subject at the top of magnificence and I am happy to this post..Interesting post! Thanks for writing it.What's wrong with this kind of post exactly? It follows your previous guideline for post length as well as clarity..
Android Training in Chennai

Hi, Great.. Tutorial is just awesome..It is really helpful for a newbie like me.. I am a regular follower of your blog. Really very informative post you shared here. Kindly keep blogging. If anyone wants to become a Java developer learn from Java Training in Chennai. or learn thru Java Online Training from India . Nowadays Java has tons of job opportunities on various vertical industry.

Nice and good article. It is very useful for me to learn and understand easily. Thanks for sharing your valuable information and time. Please keep updating Hadoop Admin Online Training Bangalore