
Big Data Learning: HDFS Basics

1. Introduction to HDFS

    Basic introduction

    HDFS stands for Hadoop Distributed File System. It is Hadoop's distributed file system: it allows files to be shared across multiple hosts over a network, so that many users on many machines can share files and storage space. HDFS is well suited to storing large files, but it is a poor fit for large numbers of small files.
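The small-file limitation comes mainly from the NameNode, which keeps the metadata for every file and block in memory. A rough back-of-the-envelope sketch of why many small files are expensive (the ~150 bytes of heap per file/block object is a widely quoted approximation, not an exact Hadoop constant, and `estimate_namenode_heap` is a hypothetical helper written for illustration):

```python
# Rough estimate of NameNode heap consumed by file/block metadata.
# Assumption: ~150 bytes of heap per file object and per block object
# (a commonly cited approximation, not an exact Hadoop constant).
BYTES_PER_OBJECT = 150

def estimate_namenode_heap(num_files: int, blocks_per_file: int) -> int:
    """Return approximate NameNode heap usage in bytes."""
    objects = num_files * (1 + blocks_per_file)  # one file object plus its blocks
    return objects * BYTES_PER_OBJECT

# One 1 GB file (eight 128 MB blocks) vs. the same data as 8192 files of 128 KB:
one_big = estimate_namenode_heap(1, 8)        # 9 metadata objects
many_small = estimate_namenode_heap(8192, 1)  # 16384 metadata objects
print(one_big, many_small)  # the small files cost ~1800x more NameNode heap
```

The data volume is identical in both cases; only the metadata count differs, which is why HDFS prefers a few large files over many small ones.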

    Design philosophy

2. Basic HDFS Operations

    The HDFS shell

    Command format: bin/hdfs dfs -xxx scheme://authority/path

    We use the hdfs command from Hadoop's bin directory, followed by dfs to indicate an operation on the distributed file system; this part of the format is fixed (if Hadoop's bin directory is on your PATH, you can simply run hdfs). xxx is a placeholder: whatever operation you want to perform on HDFS, you specify the corresponding command there. HDFS's scheme is hdfs; authority is the IP address and port of the node where the NameNode runs (a hostname works just as well as the IP); path is the path of the file or directory to operate on. In fact, the scheme://authority part is exactly the value of the fs.defaultFS property in the core-site.xml configuration file, which is the address of HDFS.

    Basic commands
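The scheme://authority/path structure described above can be illustrated by splitting a fully qualified HDFS URI with Python's standard urllib.parse (the host bigdata01, port 9000, and the file path are example values, not taken from this article's cluster config):

```python
from urllib.parse import urlparse

# Split a fully qualified HDFS path into the pieces described above.
# "bigdata01:9000" is an example fs.defaultFS value.
uri = urlparse("hdfs://bigdata01:9000/user/root/README.txt")

print(uri.scheme)  # "hdfs" -> the HDFS schema
print(uri.netloc)  # "bigdata01:9000" -> authority: NameNode host and port
print(uri.path)    # "/user/root/README.txt" -> path inside HDFS
```

When the authority is omitted on the command line (e.g. `hdfs dfs -ls /`), the client falls back to the fs.defaultFS value from core-site.xml, which is why the short forms in the examples below work.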

    hdfs dfs: show the help text

    [root@bigdata01 ~]# hdfs dfs
    Usage: hadoop fs [generic options]
            [-appendToFile <localsrc> ... <dst>]
            [-cat [-ignoreCrc] <src> ...]
            [-checksum <src> ...]
            [-chgrp [-R] GROUP PATH...]
            [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
            [-chown [-R] [OWNER][:[GROUP]] PATH...]
            [-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
            [-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
            [-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
            [-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
            [-createSnapshot <snapshotDir> [<snapshotName>]]
            [-deleteSnapshot <snapshotDir> <snapshotName>]
            [-df [-h] [<path> ...]]
            [-du [-s] [-h] [-v] [-x] <path> ...]
            [-expunge]
            [-find <path> ... <expression> ...]
            [-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
            [-getfacl [-R] <path>]
            [-getfattr [-R] {-n name | -d} [-e en] <path>]
            [-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
            [-head <file>]
            [-help [cmd ...]]
            [-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
            [-mkdir [-p] <path> ...]
            [-moveFromLocal <localsrc> ... <dst>]
            [-moveToLocal <src> <localdst>]
            [-mv <src> ... <dst>]
            [-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
            [-renameSnapshot <snapshotDir> <oldName> <newName>]
            [-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
            [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
            [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
            [-setfattr {-n name [-v value] | -x name} <path>]
            [-setrep [-R] [-w] <rep> <path> ...]
            [-stat [format] <path> ...]
            [-tail [-f] <file>]
            [-test -[defsz] <path>]
            [-text [-ignoreCrc] <src> ...]
            [-touch [-a] [-m] [-t TIMESTAMP ] [-c] <path> ...]
            [-touchz <path> ...]
            [-truncate [-w] <length> <path> ...]
            [-usage [cmd ...]]
    
    Generic options supported are:
    -conf <configuration file>        specify an application configuration file
    -D <property=value>               define a value for a given property
    -fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
    -jt <local|resourcemanager:port>  specify a ResourceManager
    -files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
    -libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
    -archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines
    
    The general command line syntax is:
    command [genericOptions] [commandOptions]
    
    

    hdfs dfs -ls: list the contents of the given path

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
    Found 1 items
    -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
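Each line of the listing packs several fields: permissions, replication factor ("-" for directories), owner, group, size in bytes, modification date and time, and the path. A small sketch of pulling one line apart (`parse_ls_line` is a hypothetical helper written for illustration, not part of Hadoop; it assumes the path contains no spaces):

```python
# Split one line of `hdfs dfs -ls` output into named fields.
# parse_ls_line is a hypothetical helper, not part of Hadoop.
def parse_ls_line(line: str) -> dict:
    # At most 8 whitespace-separated fields; assumes the path has no spaces.
    perms, repl, owner, group, size, date, time, path = line.split(None, 7)
    return {
        "permissions": perms,
        "replication": None if repl == "-" else int(repl),  # '-' for directories
        "owner": owner,
        "group": group,
        "size": int(size),
        "modified": f"{date} {time}",
        "path": path,
    }

info = parse_ls_line("-rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt")
print(info["replication"], info["size"], info["path"])  # 2 1361 /README.txt
```

Note the second field: /README.txt is stored with a replication factor of 2, while directories show "-" there because directory metadata lives only on the NameNode and is not replicated as blocks.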
    

    hdfs dfs -ls -R: recursively list all directories and files

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls -R /
    -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc/xyz
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
    
    

    hdfs dfs -put: upload a file to HDFS

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -put README.txt /  
    

    hdfs dfs -get: download a file from HDFS (note that the first attempt below fails because a local file with the same name already exists, so a different target name is given)

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -get /README.txt .
    get: `README.txt': File exists
    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -get /README.txt README.txt.bak
    [root@bigdata01 hadoop-3.2.0]# ll
    total 188
    drwxr-xr-x. 2 1001 1002    203 Jan  8  2019 bin
    drwxr-xr-x. 3 1001 1002     20 Jan  8  2019 etc
    drwxr-xr-x. 2 1001 1002    106 Jan  8  2019 include
    drwxr-xr-x. 3 1001 1002     20 Jan  8  2019 lib
    drwxr-xr-x. 4 1001 1002   4096 Jan  8  2019 libexec
    -rw-rw-r--. 1 1001 1002 150569 Oct 19  2018 LICENSE.txt
    -rw-rw-r--. 1 1001 1002  22125 Oct 19  2018 NOTICE.txt
    -rw-rw-r--. 1 1001 1002   1361 Oct 19  2018 README.txt
    -rw-r--r--. 1 root root   1361 Feb 25 18:25 README.txt.bak
    drwxr-xr-x. 3 1001 1002   4096 Feb 25 15:53 sbin
    drwxr-xr-x. 4 1001 1002     31 Jan  8  2019 share
    

    hdfs dfs -cat: print the contents of a file

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -cat /README.txt
    For the latest information about Hadoop, please visit our website at:
    
       http://hadoop.apache.org/
    
    and our wiki, at:
    
       http://wiki.apache.org/hadoop/
    
    This distribution includes cryptographic software.  The country in 
    which you currently reside may have restrictions on the import, 
    possession, use, and/or re-export to another country, of 
    encryption software.  BEFORE using any encryption software, please 
    check your country's laws, regulations and policies concerning the
    import, possession, or use, and re-export of encryption software, to 
    see if this is permitted.  See <http://www.wassenaar.org/> for more
    information.
    
    The U.S. Government Department of Commerce, Bureau of Industry and
    Security (BIS), has classified this software as Export Commodity 
    Control Number (ECCN) 5D002.C.1, which includes information security
    software using or performing cryptographic functions with asymmetric
    algorithms.  The form and manner of this Apache Software Foundation
    distribution makes it eligible for export under the License Exception
    ENC Technology Software Unrestricted (TSU) exception (see the BIS 
    Export Administration Regulations, Section 740.13) for both object 
    code and source code.
    
    The following provides more details on the included cryptographic
    software:
      Hadoop Core uses the SSL libraries from the Jetty project written 
    by mortbay.org.
    

    hdfs dfs -mkdir: create a directory

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -mkdir /test
    
    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
    Found 2 items
    -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
    

    hdfs dfs -mkdir -p: recursively create nested directories

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -mkdir -p /abc/xyz
    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
    Found 3 items
    -rw-r--r--   2 root supergroup       1361 2022-02-25 18:24 /README.txt
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
    

    hdfs dfs -rm: delete a file

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -rm /README.txt
    Deleted /README.txt
    
    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
    Found 2 items
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:29 /abc
    drwxr-xr-x   - root supergroup          0 2022-02-25 18:28 /test
    

    hdfs dfs -rm -r: recursively delete a directory

    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -rm -r /test
    Deleted /test
    
    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -rm -r /abc
    Deleted /abc
    [root@bigdata01 hadoop-3.2.0]# hdfs dfs -ls /
    [root@bigdata01 hadoop-3.2.0]# 
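All of the commands above can also be driven from a script. A minimal sketch, assuming the hdfs binary is on the PATH (`hdfs_argv` and `run_hdfs` are hypothetical helpers made up for illustration; only the argument lists are constructed and printed here, nothing is executed against a real cluster):

```python
import subprocess

def hdfs_argv(subcommand: str, *args: str) -> list:
    """Build the argv list for an `hdfs dfs` shell command."""
    return ["hdfs", "dfs", f"-{subcommand}", *args]

def run_hdfs(subcommand: str, *args: str) -> str:
    """Run the command and return its stdout (requires a working cluster)."""
    result = subprocess.run(hdfs_argv(subcommand, *args),
                            capture_output=True, text=True, check=True)
    return result.stdout

# Example argv values mirroring the commands used in this article (not executed):
print(hdfs_argv("mkdir", "-p", "/abc/xyz"))  # ['hdfs', 'dfs', '-mkdir', '-p', '/abc/xyz']
print(hdfs_argv("put", "README.txt", "/"))   # ['hdfs', 'dfs', '-put', 'README.txt', '/']
```

Building the argv as a list (rather than a single shell string) avoids quoting problems when HDFS paths contain special characters.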
    
Reposted from www.mshxw.com
Original URL: https://www.mshxw.com/it/744969.html