The Lustre file system utilizes a client software package to mount and interact with the file system. The Lustre client contains the lfs helper utility used to manage aspects of the file system. The lfs helper utility should always be used to interact with the file system instead of native Linux utilities. Using the lfs helper utility you can query the file system object storage targets (OSTs) and metadata targets (MDTs). All file data in Lustre is store on the OST storage volumes. All file metadata including file names, timestamps, permissions, and more is stored on the MDT.
This section will examine Amazon FSx for Lustre data repository integration with Amazon S3. The workshop environment created an FSx for Lustre file system with a data repository integrated with the "nasanex" bucket in the US West (Oregon) region. NASA NEX is a part of the Registry of Open Data on AWS project and is a collection of Earth science datasets maintained by NASA, including climate change projections and satellite images of the Earth’s surface.
Important
|
Read through all steps below before continuing. |
-
Open the Amazon EC2 console.
TipContext-click (right-click) the link above and open the link in a new tab or window to make it easy to navigate between this github workshop and Amazon EC2 console. NoteMake sure you are in the AWS Region of your workshop environment. If you need to change the AWS Region of the Amazon EC2 console, in the top right corner of the browser window click the region name next to Support and click the appropriate AWS Region from the drop-down menu. -
Click Instances (running).
-
Click the check box next to the instance with the name Linux Instance 0.
-
Click the Connect button.
-
Connect using AWS Systems Manager - select the Session Manager tab and click the Connect button to open a session.
Copy, paste, then execute the shell commands below in the Session Manager terminal session of Linux Instance 0 to answer the following questions:
-
Is the FSx for Lustre file system mounted?
bash mount -t lustre
-
How long does it take to list the entire file system?
NoteThe lfs client helper utility is used to work with the Lustre file system. time lfs find /fsx > /dev/null
-
What file types did you see? How many files?
lfs find /fsx --type f | wc -l
-
How many directories?
lfs find /fsx --type d | wc -l
-
How many small files (< 512 KiB)?
lfs find /fsx --type f --size -512k | wc -l
-
How many large files (> 100 MiB)?
lfs find /fsx --type f --size +100M | wc -l
-
How many .nc, .hdf, .tif, .gz files?
lfs find /fsx --type f | rev | cut -d '.' -f 1 | rev | sort -n | uniq -c | egrep '(nc|hdf|tif|gz)'
-
How much metadata (MDT) has been loaded into the file system?
lfs df -h
NoteThe metadata target (MDT) holds the Lustre file systems directory structure, permissions time stamps, system namespace, and other file system details. How much data (all the OSTs) has been loaded into the file system?
NoteThe object storage targets (OSTs) are the storage volumes where data is stored within the Lustre file system. How much data storage capacity is available?
The results of your queries should match the following:
Query | Results |
---|---|
Is the FSx for Lustre file system mounted? |
10.0.1.193@tcp:/fsx on /fsx type lustre (rw,lazystatfs) (you will have a different IP address) |
How long does it take to list the entire file system? |
~real 1m10.618s |
What file types did you see? |
.hdf .nc .gz .tif .json .md5 .txt .pdf |
How many files? |
373572 |
How many directories? |
42242 |
How many small files (< 512 KiB)? |
23692 |
How many large files (> 100 MiB)? |
169617 |
How many .nc, .hdf, .tif, .gz files? |
.nc = 87002; .hdf = 207552; .tif = 11095; .gz = 42009 |
How much storage is used by the metadata target (MDT)? |
1.9G |
How much storage is used by all the object storage targets (OSTs)? |
27.M |
How much data storage capacity is available? |
6.6T |