1. Installing the lzo library

1. The lzo library must be installed on every node of the cluster. First check whether the system is 32-bit or 64-bit:
uname -a
2. Download the lzo library from http://www.oberhumer.com/opensource/lzo/download/
or fetch it directly with:
wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.10.tar.gz
3. Extract the lzo archive:
tar -zxvf lzo-2.10.tar.gz
4. Enter the extracted lzo directory, then configure, build, and install:
mkdir /usr/local/lzo2.10
cd lzo-2.10
./configure --enable-shared --prefix=/usr/local/lzo2.10
make
make install
5. The library files are now installed under /usr/local/lzo2.10. Copy them into /usr/local/lib and /usr/lib64 (or create symlinks there instead):
cp -r /usr/local/lzo2.10/lib/* /usr/local/lib
cp -r /usr/local/lzo2.10/lib/* /usr/lib64/
6. Copy the library files from /usr/local/lzo2.10/lib into $HADOOP_HOME/lib/native/Linux-amd64-64/:
cp -r /usr/local/lzo2.10/lib/* $HADOOP_HOME/lib/native/Linux-amd64-64/

2. Installing hadoop-lzo
1. Download hadoop-lzo:
wget https://github.com/twitter/hadoop-lzo/archive/master.zip
2. Build hadoop-lzo from source. If Maven is not installed yet, set up a Maven environment first. Unzip master.zip (it extracts to hadoop-lzo-master), enter the directory, and adjust the Hadoop version in pom.xml before compiling:
unzip master.zip
cd hadoop-lzo-master
vim pom.xml

In the <properties> section of pom.xml, set:
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<hadoop.current.version>3.1.3</hadoop.current.version>
<hadoop.old.version>1.0.4</hadoop.old.version>
3. In the hadoop-lzo-master directory, point the build at the lzo headers and libraries, then run the Maven build:
export CFLAGS=-m64
export CXXFLAGS=-m64
vim /etc/profile        # add the two exports below
export C_INCLUDE_PATH=/usr/local/lzo2.10/include
export LIBRARY_PATH=/usr/local/lzo2.10/lib
source /etc/profile
mvn clean package -Dmaven.test.skip=true
4. When packaging finishes, go to /usr/local/hadoop-lzo-master/target/native/Linux-amd64-64 and copy lib/libgplcompression* into Hadoop's native directory, then copy hadoop-lzo-0.4.21-SNAPSHOT.jar into the Hadoop common directory on every node:
cd /usr/local/hadoop-lzo-master/target/native/Linux-amd64-64
cp lib/libgplcompression* $HADOOP_HOME/lib/native/
cp /usr/local/hadoop-lzo-master/target/hadoop-lzo-0.4.21-SNAPSHOT.jar $HADOOP_HOME/share/hadoop/common/

3. Configuring lzo in Hadoop
1. In $HADOOP_HOME/etc/hadoop/hadoop-env.sh, add:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/hadoop/hadoop-3.1.3/lib/native:/usr/local/lib
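Before touching the XML configs, it can help to verify that the artifacts from the build and copy steps above actually landed where Hadoop looks for them. A minimal sketch; the paths follow this guide's layout (adjust them if yours differ):

```shell
#!/usr/bin/env bash
# Report whether each lzo-related artifact is present under a given
# Hadoop home (defaults to $HADOOP_HOME).
check_lzo_artifacts() {
    local hh=${1:-$HADOOP_HOME}
    local f
    for f in \
        "$hh"/lib/native/Linux-amd64-64/liblzo2.so \
        "$hh"/lib/native/libgplcompression.so \
        "$hh"/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar; do
        if [ -e "$f" ]; then
            echo "ok:      $f"
        else
            echo "missing: $f"
        fi
    done
}
check_lzo_artifacts
```

Any "missing" line means the corresponding copy step in section 1 or 2 was skipped on this node.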
2. In $HADOOP_HOME/etc/hadoop/core-site.xml, add the following:
<property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,
        org.apache.hadoop.io.compress.DefaultCodec,
        org.apache.hadoop.io.compress.BZip2Codec,
        org.apache.hadoop.io.compress.SnappyCodec,
        com.hadoop.compression.lzo.LzoCodec,
        com.hadoop.compression.lzo.LzopCodec
    </value>
</property>
<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
3. In $HADOOP_HOME/etc/hadoop/mapred-site.xml, add the following:
<property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
</property>
<property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/usr/local/hadoop/hadoop-3.1.3/lib/native</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>LD_LIBRARY_PATH=/usr/local/hadoop/hadoop-3.1.3/lib/native</value>
</property>
Sync all of the modified configuration files to every machine in the cluster and restart the Hadoop cluster; lzo can then be used in Hadoop.
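The "sync to every machine" step can be scripted. A minimal sketch, assuming passwordless scp and the hypothetical hostnames node1..node3 (replace with your own); with DRY_RUN=1 it only prints the commands so you can review them first:

```shell
#!/usr/bin/env bash
# Push the three modified config files to each node in NODES.
# NODES is a placeholder list of hostnames -- substitute your own.
NODES="node1 node2 node3"
CONF_DIR=${HADOOP_HOME:-/usr/local/hadoop/hadoop-3.1.3}/etc/hadoop

sync_conf() {
    local host f
    for host in $NODES; do
        for f in hadoop-env.sh core-site.xml mapred-site.xml; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "scp $CONF_DIR/$f $host:$CONF_DIR/"
            else
                scp "$CONF_DIR/$f" "$host:$CONF_DIR/"
            fi
        done
    done
}

DRY_RUN=1 sync_conf   # preview only; drop DRY_RUN=1 to actually copy
```

After the copies succeed, restart the cluster as described above.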
4. Test:
hadoop jar /usr/local/hadoop/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output

4. Configuring lzo in HBase
1. Copy hadoop-lzo-xxx.jar into $HBASE_HOME/lib:
cp /usr/local/hadoop-lzo-master/target/hadoop-lzo-0.4.21-SNAPSHOT.jar $HBASE_HOME/lib
2. Create a native folder under $HBASE_HOME/lib, then create a symlink $HBASE_HOME/lib/native/Linux-amd64-64 -> /usr/local/hadoop/hadoop-3.1.3/lib/native:
mkdir -p /usr/local/hbase/hbase-2.1.7/lib/native
ln -s /usr/local/hadoop/hadoop-3.1.3/lib/native /usr/local/hbase/hbase-2.1.7/lib/native/Linux-amd64-64
3. Add the following to $HBASE_HOME/conf/hbase-env.sh:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/hadoop/hadoop-3.1.3/lib/native/:/usr/local/lib
export HBASE_LIBRARY_PATH=$HBASE_LIBRARY_PATH:/usr/local/hbase/hbase-2.1.7/lib/native/Linux-amd64-64/:/usr/local/lib/
export CLASSPATH=$CLASSPATH:$HBASE_LIBRARY_PATH
4. Add the following to $HBASE_HOME/conf/hbase-site.xml:
<property>
    <name>hbase.regionserver.codecs</name>
    <value>lzo</value>
</property>
5. Start HBase and run the compression test:
hbase org.apache.hadoop.hbase.util.CompressionTest file:///home/hadoop/output lzo
Then create a table from the hbase shell:
create 'test', { NAME => 'cf', COMPRESSION => 'lzo' }
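The table-creation test above can be bundled into a non-interactive smoke test: create a table with lzo compression, write one cell, read it back, and inspect the schema. A sketch, assuming HBase 2.x (which supports `hbase shell -n`) and a hypothetical table name lzo_demo; the script only prepares and prints the statements, so review them before piping them into the shell:

```shell
#!/usr/bin/env bash
# Prepare the hbase shell statements for an lzo smoke test.
# "lzo_demo" is a hypothetical table name -- pick any unused name.
hbase_cmds=$(cat <<'EOF'
create 'lzo_demo', { NAME => 'cf', COMPRESSION => 'lzo' }
put 'lzo_demo', 'r1', 'cf:q1', 'v1'
scan 'lzo_demo'
describe 'lzo_demo'
EOF
)
echo "$hbase_cmds"
# To execute on a node where HBase is installed:
#   echo "$hbase_cmds" | hbase shell -n
```

If compression is working, `describe` shows COMPRESSION => 'LZO' on the column family and the `scan` returns the cell that was just written.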



