SeaweedFS
[badges: Slack · Twitter Follow · Build Status · GoDoc · Docs (Wiki) · Docker Pulls · Maven Central · Artifact Hub]
[image: seaweedfs.png]
Introduction
SeaweedFS is a simple and highly scalable distributed file system. There are two objectives:
to store billions of files!
to serve the files fast!
SeaweedFS started as an Object Store to handle small files efficiently. Instead of managing all file metadata in a central master, the central master only manages volumes on volume servers, and these volume servers manage files and their metadata. This relieves concurrency pressure from the central master and spreads file metadata into volume servers, allowing faster file access (O(1), usually just one disk read operation).
There are only 40 bytes of disk storage overhead for each file's metadata. It is so simple with O(1) disk reads that you are welcome to challenge the performance with your actual use cases.
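This write path can be exercised directly over HTTP: first ask the master for a file id and a volume server location, then upload the bytes to that volume server, and later read them back with a single request. Below is a minimal Go sketch, assuming a master running locally on the default port 9333 with default settings; error handling is trimmed for brevity.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"mime/multipart"
	"net/http"
)

type assignResult struct {
	Fid       string `json:"fid"`
	URL       string `json:"url"`
	PublicURL string `json:"publicUrl"`
}

func main() {
	// Step 1: ask the master for a file id and a volume server location.
	resp, err := http.Get("http://localhost:9333/dir/assign")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var a assignResult
	if err := json.NewDecoder(resp.Body).Decode(&a); err != nil {
		panic(err)
	}

	// Step 2: upload the file content directly to the assigned volume server.
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, _ := w.CreateFormFile("file", "hello.txt")
	part.Write([]byte("hello, seaweedfs"))
	w.Close()

	uploadURL := fmt.Sprintf("http://%s/%s", a.URL, a.Fid)
	up, err := http.Post(uploadURL, w.FormDataContentType(), &body)
	if err != nil {
		panic(err)
	}
	defer up.Body.Close()

	// The file can now be read back with one request (usually one disk
	// read on the volume server): GET http://<volume server>/<fid>
	fmt.Println("stored as", uploadURL)
}
```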
SeaweedFS started by implementing Facebook's Haystack design paper. SeaweedFS also implements erasure coding with ideas from f4: Facebook's Warm BLOB Storage System, and has a lot of similarities with Facebook's Tectonic Filesystem.
On top of the object store, an optional Filer supports directories and POSIX attributes. The Filer is a separate linearly-scalable stateless server with customizable metadata stores, e.g., MySQL, Postgres, Redis, Cassandra, HBase, MongoDB, Elasticsearch, LevelDB, RocksDB, SQLite, MemSQL, TiDB, etcd, CockroachDB, YDB, etc.
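For a flavor of the Filer's HTTP interface, here is a small Go sketch that writes and then reads a file through an assumed local filer on its default port 8888; the path and file name are placeholders, and paths behave like ordinary directories and files.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
)

func main() {
	filer := "http://localhost:8888" // assumed local filer on the default port

	// Write: POST a multipart form to the desired path; the filer creates
	// the intermediate directories.
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, _ := w.CreateFormFile("file", "hello.txt")
	part.Write([]byte("hello from the filer"))
	w.Close()

	if _, err := http.Post(filer+"/demo/hello.txt", w.FormDataContentType(), &body); err != nil {
		panic(err)
	}

	// Read: a plain GET on the same path returns the file content.
	resp, err := http.Get(filer + "/demo/hello.txt")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	content, _ := io.ReadAll(resp.Body)
	fmt.Println(string(content))
}
```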
For any distributed key-value store, large values can be offloaded to SeaweedFS. With its fast access speed and linearly scalable capacity, SeaweedFS can work as a distributed Key-Large-Value store.
SeaweedFS can transparently integrate with the cloud. With hot data on the local cluster and warm data on the cloud with O(1) access time, SeaweedFS can achieve both fast local access time and elastic cloud storage capacity. What's more, the cloud storage access API cost is minimized. Faster and cheaper than direct cloud storage!
Features
Additional Features
Can choose no replication or different replication levels, rack and data center aware.
Automatic master servers failover - no single point of failure (SPOF).
Automatic Gzip compression depending on file MIME type.
Automatic compaction to reclaim disk space after deletion or update.
Automatic entry TTL expiration (see the sketch after this list).
Any server with some disk space can be added to increase the total storage space.
Adding/Removing servers does not cause any data re-balancing unless triggered by admin commands.
Optional picture resizing.
Support ETag, Accept-Range, Last-Modified, etc.
Support in-memory/leveldb/readonly mode tuning for memory/performance balance.
Support rebalancing the writable and readonly volumes.
Customizable Multiple Storage Tiers: Customizable storage disk types to balance performance and cost.
Transparent cloud integration: unlimited capacity via tiered cloud storage for warm data.
Erasure Coding for warm storage: Rack-Aware 10.4 erasure coding reduces storage cost and increases availability.
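For example, replication level and entry TTL are requested at assign time as plain query parameters. The sketch below is a hedged Go illustration assuming the default master port and documented query parameters; it asks for a file id with replication 001 and a one-day TTL, and the same ttl value is repeated on the upload request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Ask the master for a file id with replication 001 (one extra copy on
	// a different server in the same rack) and a 1-day TTL.
	resp, err := http.Get("http://localhost:9333/dir/assign?replication=001&ttl=1d")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var a struct {
		Fid string `json:"fid"`
		URL string `json:"url"`
	}
	json.NewDecoder(resp.Body).Decode(&a)

	// The ttl is repeated when uploading to the volume server, e.g.:
	//   POST http://<a.URL>/<a.Fid>?ttl=1d
	// After the TTL passes, the entry expires and its space is reclaimed.
	fmt.Printf("upload to: http://%s/%s?ttl=1d\n", a.URL, a.Fid)
}
```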
Filer Features
Filer server provides "normal" directories and files via HTTP.
File TTL automatically expires file metadata and actual file data.
Mount filer reads and writes files directly as a local directory via FUSE.
Filer Store Replication enables HA for filer metadata stores.
Active-Active Replication enables asynchronous one-way or two-way cross cluster continuous replication.
Amazon S3 compatible API accesses files with S3 tooling (see the sketch after this list).
Hadoop Compatible File System accesses files from Hadoop/Spark/Flink/etc or even runs HBase.
Async Replication To Cloud has extremely fast local access and backups to Amazon S3, Google Cloud Storage, Azure, Backblaze.
WebDAV accesses as a mapped drive on Mac and Windows, or from mobile devices.
AES256-GCM Encrypted Storage safely stores the encrypted data.
Super Large Files stores large or super large files of tens of TB.
Cloud Drive mounts cloud storage to local cluster, cached for fast read and write with asynchronous write back.
Gateway to Remote Object Store mirrors bucket operations to remote object storage, in addition to Cloud Drive.
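As an illustration of the S3-compatible API mentioned above, the following Go sketch uses the AWS SDK for Go v2 against an assumed S3 gateway on the default port 8333. The bucket name and credentials are placeholders; real deployments configure identities and keys for the gateway.

```go
package main

import (
	"context"
	"fmt"
	"strings"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/credentials"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	ctx := context.Background()

	// Placeholder credentials; swap in keys configured for the S3 gateway.
	cfg, err := config.LoadDefaultConfig(ctx,
		config.WithRegion("us-east-1"),
		config.WithCredentialsProvider(
			credentials.NewStaticCredentialsProvider("accessKey", "secretKey", "")),
	)
	if err != nil {
		panic(err)
	}

	// Point the client at the SeaweedFS S3 gateway (assumed default port
	// 8333) and use path-style addressing, as with most self-hosted
	// S3-compatible endpoints.
	client := s3.NewFromConfig(cfg, func(o *s3.Options) {
		o.BaseEndpoint = aws.String("http://localhost:8333")
		o.UsePathStyle = true
	})

	bucket := "demo" // hypothetical bucket name
	if _, err := client.CreateBucket(ctx, &s3.CreateBucketInput{Bucket: aws.String(bucket)}); err != nil {
		fmt.Println("create bucket:", err) // the bucket may already exist
	}

	_, err = client.PutObject(ctx, &s3.PutObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String("hello.txt"),
		Body:   strings.NewReader("stored via the S3 API"),
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("uploaded hello.txt to bucket", bucket)
}
```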