summaryrefslogtreecommitdiffstats
path: root/src/ceph/doc/cephfs/posix.rst
diff options
context:
space:
mode:
authorQiaowei Ren <qiaowei.ren@intel.com>2018-01-04 13:43:33 +0800
committerQiaowei Ren <qiaowei.ren@intel.com>2018-01-05 11:59:39 +0800
commit812ff6ca9fcd3e629e49d4328905f33eee8ca3f5 (patch)
tree04ece7b4da00d9d2f98093774594f4057ae561d4 /src/ceph/doc/cephfs/posix.rst
parent15280273faafb77777eab341909a3f495cf248d9 (diff)
initial code repo
This patch creates initial code repo. For ceph, luminous stable release will be used for base code, and next changes and optimization for ceph will be added to it. For opensds, currently any changes can be upstreamed into original opensds repo (https://github.com/opensds/opensds), and so stor4nfv will directly clone opensds code to deploy stor4nfv environment. And the scripts for deployment based on ceph and opensds will be put into 'ci' directory. Change-Id: I46a32218884c75dda2936337604ff03c554648e4 Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Diffstat (limited to 'src/ceph/doc/cephfs/posix.rst')
-rw-r--r--src/ceph/doc/cephfs/posix.rst49
1 files changed, 49 insertions, 0 deletions
diff --git a/src/ceph/doc/cephfs/posix.rst b/src/ceph/doc/cephfs/posix.rst
new file mode 100644
index 0000000..6a62cb2
--- /dev/null
+++ b/src/ceph/doc/cephfs/posix.rst
@@ -0,0 +1,49 @@
+========================
+ Differences from POSIX
+========================
+
+CephFS aims to adhere to POSIX semantics wherever possible. For
+example, in contrast to many other common network file systems like
+NFS, CephFS maintains strong cache coherency across clients. The goal
+is for processes communicating via the file system to behave the same
+when they are on different hosts as when they are on the same host.
+
+However, there are a few places where CephFS diverges from strict
+POSIX semantics for various reasons:
+
+- If a client is writing to a file and fails, its writes are not
+ necessarily atomic. That is, the client may call write(2) on a file
+ opened with O_SYNC with an 8 MB buffer and then crash and the write
+ may be only partially applied. (Almost all file systems, even local
+ file systems, have this behavior.)
+- In shared simultaneous writer situations, a write that crosses
+ object boundaries is not necessarily atomic. This means that you
+ could have writer A write "aa|aa" and writer B write "bb|bb"
+ simultaneously (where | is the object boundary), and end up with
+ "aa|bb" rather than the proper "aa|aa" or "bb|bb".
+- POSIX includes the telldir(2) and seekdir(2) system calls that allow
+ you to obtain the current directory offset and seek back to it.
+ Because CephFS may refragment directories at any time, it is
+ difficult to return a stable integer offset for a directory. As
+ such, a seekdir to a non-zero offset may often work but is not
+ guaranteed to do so. A seekdir to offset 0 will always work (and is
+ equivalent to rewinddir(2)).
+- Sparse files propagate incorrectly to the stat(2) st_blocks field.
+ Because CephFS does not explicitly track which parts of a file are
+ allocated/written, the st_blocks field is always populated by the
+ file size divided by the block size. This will cause tools like
+ du(1) to overestimate consumed space. (The recursive size field,
+ maintained by CephFS, also includes file "holes" in its count.)
+- When a file is mapped into memory via mmap(2) on multiple hosts,
+ writes are not coherently propagated to other clients' caches. That
+ is, if a page is cached on host A, and then updated on host B, host
+ A's page is not coherently invalidated. (Shared writable mmap
+ appears to be quite rare--we have yet to here any complaints about this
+ behavior, and implementing cache coherency properly is complex.)
+- CephFS clients present a hidden ``.snap`` directory that is used to
+ access, create, delete, and rename snapshots. Although the virtual
+ directory is excluded from readdir(2), any process that tries to
+ create a file or directory with the same name will get an error
+ code. The name of this hidden directory can be changed at mount
+ time with ``-o snapdirname=.somethingelse`` (Linux) or the config
+ option ``client_snapdir`` (libcephfs, ceph-fuse).