diff options
Diffstat (limited to 'rubbos/app/tomcat-connectors-1.2.32-src/xdocs/generic_howto/timeouts.xml')
-rw-r--r-- | rubbos/app/tomcat-connectors-1.2.32-src/xdocs/generic_howto/timeouts.xml | 407 |
1 files changed, 407 insertions, 0 deletions
diff --git a/rubbos/app/tomcat-connectors-1.2.32-src/xdocs/generic_howto/timeouts.xml b/rubbos/app/tomcat-connectors-1.2.32-src/xdocs/generic_howto/timeouts.xml new file mode 100644 index 00000000..12208e1a --- /dev/null +++ b/rubbos/app/tomcat-connectors-1.2.32-src/xdocs/generic_howto/timeouts.xml @@ -0,0 +1,407 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!DOCTYPE document [ + <!ENTITY project SYSTEM "project.xml"> +]> +<document url="timeouts.html"> + + &project; +<copyright> + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +</copyright> +<properties> +<title>Timeouts HowTo</title> +<author email="rjung@apache.org">Rainer Jung</author> +<date>$Date: 2009-03-22 00:11:39 +0100 (Sun, 22 Mar 2009) $</date> +</properties> +<body> +<section name="Introduction"> +<br/> +<p>Setting communication timeouts is very important to improve the +communication process. They help to detect problems and stabilise +a distributed system. JK can use several different timeout types, which +can be individually configured. For historical reasons, all of them are +disabled by default. This HowTo explains their use and gives +hints how to find appropriate values. +</p> +<p>All timeouts can be configured in the workers.properties file. +For a complete reference of all worker configuration +items, please consult the worker <a href="../reference/workers.html">reference</a>. +This page assumes, that you are using at least version 1.2.16 of JK. +Dependencies on newer versions will be mentioned where necessary. +</p> +<warn> +Do not set timeouts to extreme values. Very small timeouts will likely +be counterproductive. +</warn> +<warn> +Long Garbage Collection pauses on the backend do not make a good +fit with some timeouts. Try to optimise your Java memory and GC settings. +</warn> +</section> + +<section name="JK Timeout Attributes"> +<br/> +<subsection name="CPing/CPong"> +<p> +CPing/CPong is our notion for using small test packets to check the +status of backend connections. JK can use such test packets directly after establishing +a new backend connection (connect mode) and also directly before each request gets +send to a backend (prepost mode). +Starting with version 1.2.27 it can also be used when a connection was idle +for a long time (interval mode). +The maximum waiting time (timeout) for a CPong answer to a CPing and the idle +time in interval mode can be configured. +</p> +<p> +The test packets will be answered by the backend very fast with a minimal amount of +needed processing resources. A positive answer tells us, that the backend can be reached +and is actively processing requests. It does not detect, if some context is deployed +and working. The benefit of CPing/CPong is a fast detection of a communication +problem with the backend. The downside is a slightly increased latency. +</p> +<p> +The worker attribute <b>ping_mode</b> can be set to a combination of characters +to determine, in which situations test packets are used: +<ul> +<li><b>C</b>: connect mode, timeout <b>ping_timeout</b> overwritten by <b>connect_timeout</b></li> +<li><b>P</b>: prepost mode, timeout <b>ping_timeout</b> overwritten by <b>prepost_timeout</b></li> +<li><b>I</b>: interval mode, timeout <b>ping_timeout</b>, idle time <b>connection_ping_interval</b></li> +<li><b>A</b>: all modes</li> +</ul> +</p> +<p> +Multiple values must be concatenated without any separator characters. +We recommend using all CPing tests. If your application is very latency sensitive, then +you should only use the combination of connect and interval mode. +</p> +<p> +Activating the CPing probing via <b>ping_mode</b> has been added in version 1.2.27. +For older versions only the connect and prepost modes exist and must be activated by +explicitely setting <b>connect_timeout</b> and <b>prepost_timeout</b>. +</p> +<p> +The worker attribute <b>ping_timeout</b> sets the default wait timeout +in milliseconds for CPong for all modes. By default the value is "10000" +milliseconds. The value only gets used, if you activate CPing/Cpong probes +via <b>ping_mode</b>. The default value should be fine, except if you experience +very long Java garbage collection pauses. +Depending on your network latency and stability, good custom values +often are between 5000 and 15000 milliseconds. +You can overwrite the timeout used for connect and prepost mode with +<b>connect_timeout</b> and <b>prepost_timeout</b>. +Remember: don't use extremely small values. +</p> +<p> +The worker attribute <b>connect_timeout</b> sets the wait timeout +in milliseconds for CPong during connection establishment. You can use it +if you want to overwrite the general timeout set with <b>ping_timeout</b>. +To use connect mode CPing, you need to enable it via <b>ping_mode</b>. +Since JK usually uses persistent connections, opening new connections is a +rare event. We therefore recommend activating connect mode. +Depending on your network latency and stability, good values often +are between 5000 and 15000 milliseconds. +Remember: don't use extremely small values. +</p> +<p> +The worker attribute <b>prepost_timeout</b> sets the wait timeout +in milliseconds for CPong before request forwarding. You can use it +if you want to overwrite the general timeout set with <b>ping_timeout</b>. +To use prepost mode CPing, you need to enable it via <b>ping_mode</b>. +Activating this type of CPing/CPong adds a small latency to each +request. Usually this is small enough and the benefit of CPing/CPong is more important. +So in general we also recommend using <b>prepost_timeout</b>. +Depending on your network latency and stability, good values often +are between 5000 and 10000 milliseconds. +Remember: don't use extremely small values. +</p> +<p> +Until version 1.2.27 <b>ping_mode</b> and <b>ping_timeout</b> did not +exist and to enable connect or prepost mode CPing you had to set <b>connect_timeout</b> +respectively <b>prepost_timeout</b> to some reasonable positive value. +</p> +</subsection> + +<subsection name="Low-Level TCP Timeouts"> +<p> +Some platforms allow to set timeouts for all operations on TCP sockets. +This is available for Linux and Windows, other platforms do not support this, +e.g. Solaris. If your platform supports TCP send and receive timeouts, +you can set them using the worker attribute <b>socket_timeout</b>. +You can not set the two timeouts to different values. +</p> +<p> +JK will accept this attribute even if your platform does not support +socket timeouts. In this case setting the attribute will have no effect. +By default the value is "0" and the timeout is disabled. +You can set the attribute to some seconds value (not: milliseconds). +JK will then set the send and the receive timeouts of the backend +connections to this value. The timeout is low-level, it is +used for each read and write operation on the socket individually. +</p> +<p> +Using this attribute will make JK react faster to some types of network problems. +Unfortunately socket timeouts have negative side effects, because for most +platforms, there is no good way to recover from such a timeout, once it fired. +For JK there is no way to decide, if this timeout fired because of real network +problems, or only because it didn't receive an answer packet from a backend in time. +So remember: don't use extremely small values. +</p> +<p> +For the general case of connection establishment you can use +<b>socket_connect_timeout</b>. It takes a millisecond value and works +on most platforms, even if <b>socket_timeout</b> is not supported. +We recommend using <b>socket_connect_timeout</b> because in some network +failure situations failure detection during connection establishment +can take several minutes due to TCP retransmits. Depending on the quality +of your network a timeout somewhere between 1000 and 5000 milliseconds +should be fine. Note that <code>socket_timeout</code> is in seconds, and +<code>socket_connect_timeout</code> in milliseconds. +</p> +</subsection> + +<subsection name="Connection Pools and Idle Timeouts"> +<p> +JK handles backend connections in a connection pool per web server process. +The connections are used in a persistent mode. After a request completed +successfully we keep the connection open and wait for the next +request to forward. The connection pool is able to grow according +to the number of threads that want to forward requests in parallel. +</p> +<p> +Most applications have a varying load depending on the hour of the day +or the day of the month. Other reasons for a growing connection pool +would be temporary slowness of backends, leading to an increasing +congestion of the frontends like web servers. Many backends use a dedicated +thread for each incoming connection they handle. So usually one wants the +connection pool to shrink, if the load diminishes. +</p> +<p> +JK allows connections in the pool to get closed after some idle time. +This maximum idle time can be configured with the attribute +<b>connection_pool_timeout</b> which is given in units of seconds. +The default value is "0", which disables closing idle connections. +</p> +<p> +We generally recommend values around 10 minutes, so setting +<b>connection_pool_timeout</b> to 600 (seconds). If you use this attribute, +please also set the attribute <b>connectionTimeout</b> in the AJP +Connector element of your Tomcat server.xml configuration file to +an analogous value. <b>Caution</b>: connectionTimeout is in milliseconds. +So if you set JK connection_pool_timeout to 600, you should set Tomcat +connectionTimeout to 600000. +</p> +<p> +JK connections do not get closed immediately after the timeout passed. +Instead there is an automatic internal maintenance task +running every 60 seconds, that checks the idle status of all connections. +The 60 seconds interval +can be adjusted with the global attribute worker.maintain. We do not +recommend to change this value, because it has a lot of side effects. +Until version 1.2.26, the maintenance task only runs, if requests get +processed. So if your web server has processes that do not receive any +requests for a long time, there is no way to close the idle connections +in its pool. Starting with version 1.2.27 you can configure an independent +watchdog thread when using Apache 2.x with threaded APR or IIS. +</p> +<p> +The maximum connection pool size can be configured with the +attribute <b>connection_pool_size</b>. We generally do not recommend +to use this attribute in combination with Apache httpd. For +Apache httpd we automatically detect the number of threads per +process and set the maximum pool size to this value. For IIS we use +a default value of 250 (before version 1.2.20: 10), +for the Sun Web Server the default is "1". +We strongly recommend adjusting this value for IIS and the Sun Web Server +to the number of requests one web server process should +be able to send to a backend in parallel. You should measure how many connections +you need during peak hours without performance problems, and then add some +percentage depending on your growth rate etc. Finally you should check, +whether your web server processes are able to use at least as many threads, +as you configured as the pool size. +</p> +<p> +The JK attribute <b>connection_pool_minsize</b> defines, +how many idle connections remain when the pool gets shrunken. +By default this is half of the maximum pool size. +</p> +</subsection> + +<subsection name="Firewall Connection Dropping"> +<p> +One particular problem with idle connections comes from firewalls, that +are often deployed between the web server layer and the backend. +Depending on their configuration, they will silently drop +connections from their status table if they are idle for to long. +</p> +<p> +From the point of view of JK and of the web server, the other side +simply doesn't answer any traffic. Since TCP is a reliable protocol +it detects the missing TCP ACKs and tries to resend the packets for +a relatively long time, typically several minutes. +</p> +<p> +Many firewalls will allow connection closing, even if they dropped +the connection for normal traffic. Therefore you should always use +<a href="#Connection Pools and Idle Timeouts">connection_pool_timeout and +connection_pool_minsize</a> on the JK side +and connectionTimeout on the Tomcat side. +</p> +<p> +Furthermore using the boolean attribute <b>socket_keepalive</b> you can +set a standard socket option, that automatically sends TCP keepalive packets +after some idle time on each connection. By default this is set to "False". +If you suspect idle connection drops by firewalls you should set this to +"True". +</p> +<p> +Unfortunately the default intervals and algorithms for these packets +are platform specific. You might need to inspect TCP tuning options for +your platform on how to control TCP keepalive. +Often the default intervals are much longer than the firewall timeouts +for idle connections. Nevertheless we recommend talking to your firewall +administration and your platform administration in order to make them agree +on good configuration values for the firewall and the platform TCP tuning. +</p> +<p> +In case none of our recommendations help and you are definitively having +problems with idle connection drops, you can disable the use of persistent +connections when using JK together with Apache httpd. For this you set +"JkOptions +DisableReuse" in your Apache httpd configuration. +This will have a huge negative performance impact! +</p> +</subsection> + +<subsection name="Reply Timeout"> +<p> +JK can also use a timeout on request replies. This timeout does not +measure the full processing time of the response. Instead it controls, +how much time between consecutive response packets is allowed. +</p> +<p> +In most cases, this is what one actually wants. Consider for example +long running downloads. You would not be able to set an effective global +reply timeout, because downloads could last for many minutes. +Most applications though have limited processing time before starting +to return the response. For those applications you could set an explicit +reply timeout. Applications that do not harmonise with reply timeouts +are batch type applications, data warehouse and reporting applications +which are expected to observe long processing times. +</p> +<warn> +If JK aborts waiting for a response, because a reply timeout fired, +there is no way to stop processing on the backend. Although you free +processing resources in your web server, the request +will continue to run on the backend - without any way to send back a +result once the reply timeout fired. +</warn> +<p> +JK uses the worker attribute <b>reply_timeout</b> to set reply timeouts. +The default value is "0" (timeout disabled) and you can set it to any +millisecond value. +</p> +<p> +In combination with Apache httpd, you can also set a more flexible reply_timeout +using an httpd environment variable. If you set the variable JK_REPLY_TIMEOUT +to some integer value, this value will be used instead of the value in +the worker configuration. This way you can set reply timeouts more flexible +with mod_setenvif and mod_rewrite depending on URI, query string etc. +If the environment variable JK_REPLY_TIMEOUT is not set, or is set to a +negative value, the default reply timeout of the worker will be used. If +JK_REPLY_TIMEOUT contains the value "0", then the reply timeout will be disabled +for the request. +</p> +<p> +In combination with a load balancing worker, JK will disable a member +worker of the load balancer if a reply timeout fires. The worker will then +no longer be used until it gets recovered during the next automatic +maintenance task. Starting with JK 1.2.24 you can improve this behaviour using +<b><a href="../reference/workers.html">max_reply_timeouts</a></b>. This +attribute will allow occasional long running requests without disabling the +worker. Only if those requests happen to often, the worker gets disabled by the +load balancer. +</p> +</subsection> +</section> + +<section name="Load Balancer Error Detection"> +<br/> +<subsection name="Local and Global Error States"> +<p> +A load balancer worker does not only have the ability to balance load. +It also handles stickyness and failover of requests in case of errors. +When a load balancer detects an error on one of its members, it needs to +decide, whether the error is serious, or only a temporary error or maybe +only related to the actual request that was processed. Temporary errors +are called local errors, serious errors will be called global errors. +</p> +<p> +If the load balancer decides that a backend should be put into the global error +state, then the web server will not send any more requests there. If no session +replication is used, this means that all user sessions located on the respective +backend are no longer available. The users will be send to another backend +and will have to login again. So the global error state is not transparent to the +users. The application is still available, but users might loose some work. +</p> +<p> +In some cases the decision between local error and global error is easy. +For instance if there is an error sending back the response to the client (browser), +then it is very unlikely that the backend is broken. +So this situation is a typical example of a local error. +</p> +<p> +Some situations are harder to decide though. If the load balancer can't establish +a new connection to a backend, it could be because of a temporary overload situation +(so no more free threads in the backend), or because the backend isn't alive any more. +Depending on the details, the right state could either be local error or global error. +</p> +</subsection> +<subsection name="Error Escalation Time"> +<p> +Until version 1.2.26 most errors were interpreted as global errors. +Starting with version 1.2.27 many errors which were previously interpreted as global +were switched to being local whenever the backend is still busy. Busy means, that +other concurrent requests are send to the same backend (successful or not). +</p> +<p> +In many cases there is no perfect way of making the decision +between local and global error. The load balancer simply doesn't have enough information. +In version 1.2.28 you can now tune, how fast the load balancer switches from local error to +global error. If a member of a load balancer stays in local error state for too long, +the load balancer will escalate it into global error state. +</p> +<p> +The time tolerated in local error state is controlled by the load balancer attribute +<b>error_escalation_time</b> (in seconds). The default value is half of <b>recover_time</b>, +so unless you changed <b>recover_time</b> the default is 30 seconds. +</p> +<p> +Using a smaller value for <b>error_escalation_time</b> will make the load balancer react +faster to serious errors, but also carries the risk of more often loosing sessions +in not so serious situations. You can lower <b>error_escalation_time</b> down to 0 seconds, +which means all local errors which are potentially serious are escalated to global errors +immediately. +</p> +<p> +Note that without good basic error detection the whole escalation procedure is useless. +So you should definitely use <b>socket_connect_timeout</b> and activate CPing/CPong +with <b>ping_mode</b> and <b>ping_timeout</b> before thinking about also tuning +<b>error_escalation_time</b>. +</p> +</subsection> +</section> + +</body> +</document> |