Warning: This is a work in progress, so this post will be updated.
TODO: Make a table of the file systems mentioned, e.g. in the related work section of the CRUSH paper.
Design space
A subset of the properties that can be considered when designing a new distributed file system:
- Is the cluster static or dynamic?
- Is the cluster heterogeneous or homogeneous?
- Object based or block based?
- Is data ever migrated once it is written? If so, under which circumstances (storage added and/or storage removed)?
- Is the allocation based on metadata or on a mapping function (consistent hashing etc.)?
- Is there a central allocator?
- Is there replication? Across failure domains?
- What are the assumptions about the workload (skew towards new or old items etc.)?
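
To make the "mapping function" and "data migration" questions more concrete, here is a minimal sketch of placement by consistent hashing. It is a toy illustration, not CRUSH and not the algorithm of any particular file system; the class `HashRing`, the node names (`osd-a` etc.), and the choice of 64 virtual nodes are all assumptions made for this example. The point it tries to show is that no per-object metadata is stored, and that adding a node only moves objects onto the new node.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    # Python's built-in hash() is randomized per process, so use SHA-256
    # and keep the first 8 bytes as a 64-bit integer.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical example)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node) tuples
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns several points on the ring to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (stable_hash(f"{node}#{i}"), node))

    def locate(self, key):
        # The owner is the first ring point clockwise from the key's hash,
        # wrapping around to the start of the list at the end of the ring.
        idx = bisect.bisect_right(self._ring, (stable_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


keys = [f"object-{i}" for i in range(10_000)]
ring = HashRing(["osd-a", "osd-b", "osd-c"])
before = {k: ring.locate(k) for k in keys}

ring.add("osd-d")  # storage added
after = {k: ring.locate(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} objects migrated")
print("all migrated objects went to the new node:",
      all(after[k] == "osd-d" for k in moved))
```

With a metadata-based allocator the same lookup would instead consult a table (object to location), which allows arbitrary placement but has to be stored, replicated, and kept consistent.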