Setting up LZO compression in Hadoop/YARN

Install liblzo

  • On Red Hat based systems: sudo yum install lzo-devel
  • On Debian based systems: sudo apt-get install liblzo2-dev
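
Before building, it can help to confirm that the LZO development headers are actually visible. The following is a minimal sketch, not part of hadoop-lzo: has_lzo_headers is a hypothetical helper name, and /usr/include is assumed as the default header location (both packages above normally install lzo/lzo1x.h there).

```shell
# Return success if the LZO development headers are visible.
# has_lzo_headers is a hypothetical helper; /usr/include is an assumed default.
has_lzo_headers() {
  include_dir="${1:-/usr/include}"
  [ -f "$include_dir/lzo/lzo1x.h" ]
}

if has_lzo_headers; then
  echo "LZO headers found"
else
  echo "LZO headers missing: install the development package first"
fi
```

If the headers live elsewhere on your system, pass that directory as the first argument.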

Clone hadoop-lzo from GitHub and build it

$ git clone https://github.com/twitter/hadoop-lzo.git
$ cd hadoop-lzo
$ mvn clean package
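
Once the build finishes, a quick check that it produced both the jar and the native library can save time later. This is a rough sketch: check_build is a hypothetical helper, and the libgplcompression file name is an assumption about what the native build emits, so adjust the pattern if your output differs.

```shell
# check_build is a hypothetical helper: it reports whether the Maven build
# produced the jar and the native library that the later steps copy.
check_build() {
  checkout="${1:-.}"
  # ls fails when the glob matches nothing, which is what we test for.
  ls "$checkout"/target/hadoop-lzo-*.jar >/dev/null 2>&1 || {
    echo "jar not found under $checkout/target"
    return 1
  }
  ls "$checkout"/target/native/*/lib/libgplcompression* >/dev/null 2>&1 || {
    echo "native library not found under $checkout/target/native"
    return 1
  }
  echo "build artifacts present"
}
```

Call it with the path to the checkout, for example check_build . from inside the hadoop-lzo directory.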

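Place the hadoop-lzo-*.jar somewhere on your cluster classpath (run this from the directory that contains the hadoop-lzo checkout, and adjust the version to match your build)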

$ cp hadoop-lzo/target/hadoop-lzo-0.4.20-SNAPSHOT.jar /data/lib/

Place the native hadoop-lzo binaries in the Hadoop native library directory

$ mkdir -p $HADOOP_HOME/lib/native/hadoop-lzo/
$ cp hadoop-lzo/target/native/Linux-amd64-64/lib/* $HADOOP_HOME/lib/native/hadoop-lzo/

Add the following to your core-site.xml:

<property>
    <name>io.compression.codecs</name>
    <value>
        org.apache.hadoop.io.compress.GzipCodec,
        org.apache.hadoop.io.compress.DefaultCodec,
        org.apache.hadoop.io.compress.BZip2Codec,
        com.hadoop.compression.lzo.LzoCodec,
        com.hadoop.compression.lzo.LzopCodec
    </value>
</property>
<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
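
To confirm the edit landed in the file Hadoop actually reads, a simple text check is often enough. The sketch below is an illustration only: lzo_codecs_listed is a hypothetical helper, and it matches class names as plain text rather than parsing the XML, so it cannot tell whether they sit inside the right property.

```shell
# lzo_codecs_listed is a hypothetical helper: it checks that both LZO codec
# class names appear somewhere in the given core-site.xml.
# This is a plain-text match, not an XML parse.
lzo_codecs_listed() {
  conf_file="$1"
  grep -q 'com\.hadoop\.compression\.lzo\.LzoCodec' "$conf_file" &&
    grep -q 'com\.hadoop\.compression\.lzo\.LzopCodec' "$conf_file"
}
```

Usage would look like lzo_codecs_listed "$HADOOP_CONF_DIR/core-site.xml", assuming HADOOP_CONF_DIR points at your configuration directory. On a working installation, hdfs getconf -confKey io.compression.codecs prints the value Hadoop resolves for that key.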

Add the following to your mapred-site.xml (the codec setting takes effect only when map output compression is enabled):

<property>
  <name>mapred.child.env</name>
  <value>JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/hadoop-lzo/</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
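
With everything configured, an end-to-end check could look roughly like this. It is a sketch under assumptions: lzop and the hadoop CLI are on PATH, sample.txt and the /tmp HDFS path are placeholders, and run and lzo_smoke_test are hypothetical helper names. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Sketch of an end-to-end check. Assumes lzop and the hadoop CLI are on PATH;
# the file name and the /tmp HDFS path are placeholders.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

lzo_smoke_test() {
  sample="${1:-sample.txt}"
  # lzop writes "$sample.lzo" next to the input and keeps the original.
  run lzop -f "$sample" &&
    run hadoop fs -put -f "$sample.lzo" /tmp/ &&
    run hadoop fs -text "/tmp/$(basename "$sample").lzo"
}
```

If the codecs are registered and the native library loads, the last command should print the original file contents; a codec or native-library error at that point usually means one of the earlier steps did not take effect.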
