summaryrefslogtreecommitdiffstats
path: root/src/ceph/doc/radosgw/troubleshooting.rst
diff options
context:
space:
mode:
authorQiaowei Ren <qiaowei.ren@intel.com>2018-01-04 13:43:33 +0800
committerQiaowei Ren <qiaowei.ren@intel.com>2018-01-05 11:59:39 +0800
commit812ff6ca9fcd3e629e49d4328905f33eee8ca3f5 (patch)
tree04ece7b4da00d9d2f98093774594f4057ae561d4 /src/ceph/doc/radosgw/troubleshooting.rst
parent15280273faafb77777eab341909a3f495cf248d9 (diff)
initial code repo
This patch creates initial code repo. For ceph, luminous stable release will be used for base code, and next changes and optimization for ceph will be added to it. For opensds, currently any changes can be upstreamed into original opensds repo (https://github.com/opensds/opensds), and so stor4nfv will directly clone opensds code to deploy stor4nfv environment. And the scripts for deployment based on ceph and opensds will be put into 'ci' directory. Change-Id: I46a32218884c75dda2936337604ff03c554648e4 Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Diffstat (limited to 'src/ceph/doc/radosgw/troubleshooting.rst')
-rw-r--r--src/ceph/doc/radosgw/troubleshooting.rst179
1 files changed, 179 insertions, 0 deletions
diff --git a/src/ceph/doc/radosgw/troubleshooting.rst b/src/ceph/doc/radosgw/troubleshooting.rst
new file mode 100644
index 0000000..3e4a057
--- /dev/null
+++ b/src/ceph/doc/radosgw/troubleshooting.rst
@@ -0,0 +1,179 @@
+=================
+ Troubleshooting
+=================
+
+
+The Gateway Won't Start
+=======================
+
+If you cannot start the gateway (i.e., there is no existing ``pid``),
+check to see if there is an existing ``.asok`` file from another
+user. If an ``.asok`` file from another user exists and there is no
+running ``pid``, remove the ``.asok`` file and try to start the
+process again.
+
+This may occur when you start the process as a ``root`` user and
+the startup script is trying to start the process as a
+``www-data`` or ``apache`` user and an existing ``.asok`` is
+preventing the script from starting the daemon.
+
+The radosgw init script (/etc/init.d/radosgw) also has a verbose argument that
+can provide some insight as to what could be the issue:
+
+ /etc/init.d/radosgw start -v
+
+or
+
+ /etc/init.d radosgw start --verbose
+
+HTTP Request Errors
+===================
+
+Examining the access and error logs for the web server itself is
+probably the first step in identifying what is going on. If there is
+a 500 error, that usually indicates a problem communicating with the
+``radosgw`` daemon. Ensure the daemon is running, its socket path is
+configured, and that the web server is looking for it in the proper
+location.
+
+
+Crashed ``radosgw`` process
+===========================
+
+If the ``radosgw`` process dies, you will normally see a 500 error
+from the web server (apache, nginx, etc.). In that situation, simply
+restarting radosgw will restore service.
+
+To diagnose the cause of the crash, check the log in ``/var/log/ceph``
+and/or the core file (if one was generated).
+
+
+Blocked ``radosgw`` Requests
+============================
+
+If some (or all) radosgw requests appear to be blocked, you can get
+some insight into the internal state of the ``radosgw`` daemon via
+its admin socket. By default, there will be a socket configured to
+reside in ``/var/run/ceph``, and the daemon can be queried with::
+
+ ceph daemon /var/run/ceph/client.rgw help
+
+ help list available commands
+ objecter_requests show in-progress osd requests
+ perfcounters_dump dump perfcounters value
+ perfcounters_schema dump perfcounters schema
+ version get protocol version
+
+Of particular interest::
+
+ ceph daemon /var/run/ceph/client.rgw objecter_requests
+ ...
+
+will dump information about current in-progress requests with the
+RADOS cluster. This allows one to identify if any requests are blocked
+by a non-responsive OSD. For example, one might see::
+
+ { "ops": [
+ { "tid": 1858,
+ "pg": "2.d2041a48",
+ "osd": 1,
+ "last_sent": "2012-03-08 14:56:37.949872",
+ "attempts": 1,
+ "object_id": "fatty_25647_object1857",
+ "object_locator": "@2",
+ "snapid": "head",
+ "snap_context": "0=[]",
+ "mtime": "2012-03-08 14:56:37.949813",
+ "osd_ops": [
+ "write 0~4096"]},
+ { "tid": 1873,
+ "pg": "2.695e9f8e",
+ "osd": 1,
+ "last_sent": "2012-03-08 14:56:37.970615",
+ "attempts": 1,
+ "object_id": "fatty_25647_object1872",
+ "object_locator": "@2",
+ "snapid": "head",
+ "snap_context": "0=[]",
+ "mtime": "2012-03-08 14:56:37.970555",
+ "osd_ops": [
+ "write 0~4096"]}],
+ "linger_ops": [],
+ "pool_ops": [],
+ "pool_stat_ops": [],
+ "statfs_ops": []}
+
+In this dump, two requests are in progress. The ``last_sent`` field is
+the time the RADOS request was sent. If this is a while ago, it suggests
+that the OSD is not responding. For example, for request 1858, you could
+check the OSD status with::
+
+ ceph pg map 2.d2041a48
+
+ osdmap e9 pg 2.d2041a48 (2.0) -> up [1,0] acting [1,0]
+
+This tells us to look at ``osd.1``, the primary copy for this PG::
+
+ ceph daemon osd.1 ops
+ { "num_ops": 651,
+ "ops": [
+ { "description": "osd_op(client.4124.0:1858 fatty_25647_object1857 [write 0~4096] 2.d2041a48)",
+ "received_at": "1331247573.344650",
+ "age": "25.606449",
+ "flag_point": "waiting for sub ops",
+ "client_info": { "client": "client.4124",
+ "tid": 1858}},
+ ...
+
+The ``flag_point`` field indicates that the OSD is currently waiting
+for replicas to respond, in this case ``osd.0``.
+
+
+Java S3 API Troubleshooting
+===========================
+
+
+Peer Not Authenticated
+----------------------
+
+You may receive an error that looks like this::
+
+ [java] INFO: Unable to execute HTTP request: peer not authenticated
+
+The Java SDK for S3 requires a valid certificate from a recognized certificate
+authority, because it uses HTTPS by default. If you are just testing the Ceph
+Object Storage services, you can resolve this problem in a few ways:
+
+#. Prepend the IP address or hostname with ``http://``. For example, change this::
+
+ conn.setEndpoint("myserver");
+
+ To::
+
+ conn.setEndpoint("http://myserver")
+
+#. After setting your credentials, add a client configuration and set the
+ protocol to ``Protocol.HTTP``. ::
+
+ AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
+
+ ClientConfiguration clientConfig = new ClientConfiguration();
+ clientConfig.setProtocol(Protocol.HTTP);
+
+ AmazonS3 conn = new AmazonS3Client(credentials, clientConfig);
+
+
+
+405 MethodNotAllowed
+--------------------
+
+If you receive an 405 error, check to see if you have the S3 subdomain set up correctly.
+You will need to have a wild card setting in your DNS record for subdomain functionality
+to work properly.
+
+Also, check to ensure that the default site is disabled. ::
+
+ [java] Exception in thread "main" Status Code: 405, AWS Service: Amazon S3, AWS Request ID: null, AWS Error Code: MethodNotAllowed, AWS Error Message: null, S3 Extended Request ID: null
+
+
+