JuiceFS is a high-performance POSIX file system released under Apache License 2.0, designed specifically for cloud-native environments. Data stored via JuiceFS is persisted in object storage (e.g. Amazon S3), and the corresponding metadata can be persisted in various database engines such as Redis, MySQL, and SQLite, depending on the scenario and requirements.
With JuiceFS, massive cloud storage can be directly connected to big data, machine learning, artificial intelligence, and various application platforms in production environments. Without modifying code, the massive cloud storage can be used as efficiently as local storage.
📺 Video: What is JuiceFS?
📖 Document: Quick start guide
- Fully POSIX-compatible: Use as a local file system, seamlessly docking with existing applications without breaking business workflow.
- Fully Hadoop-compatible: JuiceFS' Hadoop Java SDK is compatible with Hadoop 2.x and Hadoop 3.x as well as a variety of components in the Hadoop ecosystems.
- S3-compatible: JuiceFS' S3 Gateway provides an S3-compatible interface.
- Cloud Native: A Kubernetes CSI driver is provided for easily using JuiceFS in Kubernetes.
- Shareable: JuiceFS is a shared file storage that can be read and written by thousands of clients.
- Strong Consistency: The confirmed modification will be immediately visible on all the servers mounted with the same file system.
- Outstanding Performance: Latency can be as low as a few milliseconds, and throughput can scale almost without limit (depending on the size of the object storage). See the test results.
- Data Encryption: Supports data encryption in transit and at rest (please refer to the guide for more information).
- Global File Locks: JuiceFS supports both BSD locks (flock) and POSIX record locks (fcntl).
- Data Compression: JuiceFS supports LZ4 or Zstandard to compress all your data.
Architecture | Getting Started | Advanced Topics | POSIX Compatibility | Performance Benchmark | Supported Object Storage | Who is using | Roadmap | Reporting Issues | Contributing | Community | Usage Tracking | License | Credits | FAQ
JuiceFS consists of three parts:
- JuiceFS Client: Coordinates the object storage and the metadata engine, and implements file system interfaces such as POSIX, Hadoop, Kubernetes, and S3 gateway.
- Data Storage: Stores the data itself; local disk and object storage are supported.
- Metadata Engine: Stores the metadata that corresponds to the stored data; multiple engines such as Redis, MySQL, and SQLite are supported.
The file system metadata is stored in Redis, a fast, open-source, in-memory key-value store that is particularly suitable for metadata; meanwhile, all data is stored in object storage through the JuiceFS client. Learn more
Each file stored in JuiceFS is split into one or more "Chunks", each with a default upper size limit of 64 MiB. Each Chunk is composed of one or more "Slices", whose length varies depending on how the file is written. Each Slice is further split into fixed-size "Blocks", 4 MiB each by default. Finally, these Blocks are stored in object storage, while JuiceFS stores the file together with its Chunks, Slices, Blocks and other metadata in the metadata engine. Learn more
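The splitting described above can be illustrated with a little arithmetic. The following is only a minimal sketch, not JuiceFS code: it assumes the file was written sequentially in one pass (so each Chunk holds a single Slice), and it assumes the trailing Block of a Chunk may be shorter than 4 MiB. The function name `layout` is hypothetical.

```python
MiB = 1024 * 1024
CHUNK_SIZE = 64 * MiB  # default upper limit of a Chunk, as stated above
BLOCK_SIZE = 4 * MiB   # default Block size, as stated above


def layout(file_size):
    """Return a list of (chunk_index, chunk_length, block_lengths) tuples.

    Assumption: one sequential write, so every Chunk contains exactly one
    Slice that spans the whole Chunk. Real files may have several Slices.
    """
    chunks = []
    offset = 0
    index = 0
    while offset < file_size:
        chunk_len = min(CHUNK_SIZE, file_size - offset)
        blocks = []
        remaining = chunk_len
        while remaining > 0:
            # Assumption: the last Block of a Slice may be shorter than 4 MiB.
            block_len = min(BLOCK_SIZE, remaining)
            blocks.append(block_len)
            remaining -= block_len
        chunks.append((index, chunk_len, blocks))
        offset += chunk_len
        index += 1
    return chunks


if __name__ == "__main__":
    for index, chunk_len, blocks in layout(150 * MiB):
        print(f"chunk {index}: {chunk_len // MiB} MiB in {len(blocks)} block(s)")
```

Under these assumptions, a 150 MiB file maps to three Chunks (64, 64 and 22 MiB), and each Block would become one object in the bucket.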
When using JuiceFS, files are eventually split into Chunks, Slices and Blocks and stored in object storage. Therefore, you may find that the source files stored in JuiceFS cannot be found in the file browser of the object storage platform. Instead, the bucket contains only a chunks directory and a bunch of numerically named directories and files. But don't panic! This is just the secret of the high-performance operation of JuiceFS!
To create a JuiceFS file system, you need the following three things:
- A Redis database for metadata storage
- Object storage for storing data blocks
- The JuiceFS Client
Please refer to the Quick Start Guide in the community doc (or the doc in this repo) to start using JuiceFS immediately!
The command reference lists all options of each subcommand.
Using JuiceFS on Kubernetes is easy; give it a try.
If you want to use JuiceFS in Hadoop, check the Hadoop Java SDK.
- Redis Best Practices
- How to Setup Object Storage
- Cache Management
- Fault Diagnosis and Analysis
- FUSE Mount Options
- Using JuiceFS on Windows
- S3 Gateway
Please refer to JuiceFS User Manual for more information.
JuiceFS passed all of the 8813 tests in the latest pjdfstest.
All tests successful.
Test Summary Report
-------------------
/root/soft/pjdfstest/tests/chown/00.t (Wstat: 0 Tests: 1323 Failed: 0)
TODO passed: 693, 697, 708-709, 714-715, 729, 733
Files=235, Tests=8813, 233 wallclock secs ( 2.77 usr 0.38 sys + 2.57 cusr 3.93 csys = 9.65 CPU)
Result: PASS
Besides the things covered by pjdfstest, JuiceFS provides:
- Close-to-open consistency. Once a file is closed, any subsequent open and read are guaranteed to see the data written before the close. Within the same mount point, a read immediately sees all data written before it.
- Rename and all other metadata operations are atomic, guaranteed by Redis transactions.
- Open files remain accessible after unlink from the same mount point.
- Mmap is supported (tested with FSx).
- Fallocate with punch hole support.
- Extended attributes (xattr).
- BSD locks (flock).
- POSIX record locks (fcntl).
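The two lock types listed above are the standard Unix ones, so ordinary client code should work unchanged on a JuiceFS mount. Below is a minimal sketch using Python's standard `fcntl` module (Unix only). The helper names `try_flock`, `lock_range` and `unlock_range` are hypothetical, and the demo runs against a temporary directory rather than a real mount point such as /jfs.

```python
import fcntl
import os
import tempfile


def try_flock(path):
    """Try to take an exclusive BSD lock (flock) without blocking.

    Returns the open file object on success, or None if another open
    file description already holds the lock.
    """
    f = open(path, "a+")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        f.close()
        return None
    return f


def lock_range(f, start, length):
    """Take an exclusive POSIX record lock on bytes [start, start + length)."""
    fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)


def unlock_range(f, start, length):
    """Release a POSIX record lock on the same byte range."""
    fcntl.lockf(f, fcntl.LOCK_UN, length, start, os.SEEK_SET)


if __name__ == "__main__":
    # Assumption: a temp dir stands in for a JuiceFS mount in this demo.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.lock")
        holder = try_flock(path)
        print("first flock acquired:", holder is not None)
        print("second flock acquired:", try_flock(path) is not None)
        lock_range(holder, 0, 100)    # lock the first 100 bytes
        unlock_range(holder, 0, 100)
        holder.close()                # closing the file releases the flock
```

Note that POSIX record locks are owned by the process, so a conflict on a byte range only shows up between different processes, whereas flock conflicts appear between separate opens even within one process.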
JuiceFS provides a subcommand to run a few basic benchmarks to understand how it works in your environment.
A sequential read/write benchmark was performed on JuiceFS, EFS and S3FS using fio.
The result shows that JuiceFS can provide 10X more throughput than the other two; read more details.
A simple metadata benchmark was performed on JuiceFS, EFS and S3FS using mdtest.
The result shows that JuiceFS can provide significantly more metadata IOPS than the other two; read more details.
There is a virtual file called .accesslog in the root of JuiceFS that shows all operations and the time they take, for example:
$ cat /jfs/.accesslog
2021.01.15 08:26:11.003330 [uid:0,gid:0,pid:4403] write (17669,8666,4993160): OK <0.000010>
2021.01.15 08:26:11.003473 [uid:0,gid:0,pid:4403] write (17675,198,997439): OK <0.000014>
2021.01.15 08:26:11.003616 [uid:0,gid:0,pid:4403] write (17666,390,951582): OK <0.000006>
The last number on each line is the time (in seconds) that the operation took. You can use this directly to debug and analyze performance issues, or try ./juicefs profile /jfs to monitor real-time statistics. Please run ./juicefs profile -h or refer to here to learn more about this subcommand.
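For quick offline analysis, the access log can also be aggregated with a short script. This is a minimal sketch: the regular expression is inferred only from the three sample lines above, so other operations may use a format it does not match (such lines are skipped). The names `LINE_RE`, `SAMPLE` and `summarize` are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the sample lines above; an assumption, not a spec.
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) \[uid:(?P<uid>\d+),gid:(?P<gid>\d+),pid:(?P<pid>\d+)\] "
    r"(?P<op>\w+) \((?P<args>[^)]*)\): (?P<result>.*) <(?P<secs>\d+\.\d+)>$"
)

SAMPLE = [
    "2021.01.15 08:26:11.003330 [uid:0,gid:0,pid:4403] write (17669,8666,4993160): OK <0.000010>",
    "2021.01.15 08:26:11.003473 [uid:0,gid:0,pid:4403] write (17675,198,997439): OK <0.000014>",
    "2021.01.15 08:26:11.003616 [uid:0,gid:0,pid:4403] write (17666,390,951582): OK <0.000006>",
]


def summarize(lines):
    """Return {operation: (count, total_seconds)} for all parseable lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not fit the assumed format
        entry = totals[m["op"]]
        entry[0] += 1
        entry[1] += float(m["secs"])
    return {op: (count, total) for op, (count, total) in totals.items()}


if __name__ == "__main__":
    for op, (count, total) in summarize(SAMPLE).items():
        print(f"{op}: {count} calls, {total:.6f} s total, {total / count:.6f} s avg")
```

To run it against a live mount, the lines could be read from the .accesslog file instead of `SAMPLE`.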
- Amazon S3
- Google Cloud Storage
- Azure Blob Storage
- Alibaba Cloud Object Storage Service (OSS)
- Tencent Cloud Object Storage (COS)
- QingStor Object Storage
- Ceph RGW
- MinIO
- Local disk
- Redis
JuiceFS supports almost all object storage services. Learn more.
JuiceFS is considered beta quality; the storage format is not yet stable. If you want to use it in a production environment, please evaluate it carefully first. If you are interested in it, please test it as soon as possible and give us feedback.
After using JuiceFS, you are welcome to tell us and share your experience with everyone. We have also collected a summary list in ADOPTERS.md, which also includes other open source projects used with JuiceFS.
- Stabilize storage format
- Support FoundationDB as meta engine
- User and group quotas
- Directory quotas
- Snapshot
- Write once read many (WORM)
We use GitHub Issues to track community-reported issues. You can also contact the community to get answers.
Thank you for your contribution! Please refer to the CONTRIBUTING.md for more information.
You are welcome to join the Discussions and the Slack channel to connect with JuiceFS team members and other users.
By default, JuiceFS collects anonymous usage data. It collects only core metrics (e.g. version number); no user data or other sensitive data is collected. You can review the related code here.
These data help us understand how the community is using this project. You can easily disable reporting with the command line option --no-usage-report:
$ ./juicefs mount --no-usage-report
JuiceFS is open-sourced under Apache License 2.0, see LICENSE.
The design of JuiceFS was inspired by Google File System, HDFS and MooseFS, thanks to their great work.
JuiceFS already supports many object storage services; please check the list first. If your object storage is compatible with S3, you can treat it as S3. Otherwise, try reporting an issue.
The simple answer is no. JuiceFS uses transactions to guarantee the atomicity of metadata operations, and transactions are not well supported in Redis cluster mode. Sentinel or another HA solution for Redis is needed.
See "Redis Best Practices" for more information.
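As background on why transactions and cluster mode fit together poorly: Redis Cluster assigns each key to one of 16384 hash slots (CRC16 of the key, modulo 16384), and a transaction can only touch keys that live in the same slot. The sketch below computes slots in plain Python. The key names are hypothetical and are not JuiceFS's actual metadata schema.

```python
import binascii

SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Compute the Redis Cluster hash slot for a key.

    If the key contains a non-empty {hash tag}, only the tag is hashed.
    binascii.crc_hqx with an initial value of 0 is the CRC16 (XMODEM)
    variant that Redis Cluster uses.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % SLOTS


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    for name in (b"inode:1", b"inode:2", b"{fs1}inode:1", b"{fs1}inode:2"):
        print(name.decode(), "->", key_slot(name))
```

Keys that share a hash tag always map to the same slot, while unrelated keys generally spread across slots, which is what makes arbitrary multi-key transactions impractical in cluster mode.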
See "Comparison with Others" for more information.
For more FAQs, please see the full list.