Connections in the FIN_WAIT_2 state and Apache

Warning:

This document has not been fully updated to take into account changes made in the 2.0 version of the Apache HTTP Server. Some of the information may still be relevant, but please use it with care.

Starting with the Apache 1.2 betas, people are reporting many more connections in the FIN_WAIT_2 state (as reported by netstat) than they saw using older versions. When the server closes a TCP connection, it sends a packet with the FIN bit set to the client, which then responds with a packet with the ACK bit set. The client then sends a packet with the FIN bit set to the server, which responds with an ACK and the connection is closed. The state that the connection is in during the period between when the server gets the ACK from the client and the server gets the FIN from the client is known as FIN_WAIT_2. See the TCP RFC for the technical details of the state transitions.
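The close sequence described above can be sketched as a small state table for the side that closes first (here, the server). This is only an illustration: the event names are made up for the example, simultaneous close is left out, and the TIME_WAIT state, which the RFC places between receiving the client's FIN and the connection finally being closed, is included even though the paragraph above skips it.

```python
from enum import Enum


class TcpState(Enum):
    ESTABLISHED = "ESTABLISHED"
    FIN_WAIT_1 = "FIN_WAIT_1"
    FIN_WAIT_2 = "FIN_WAIT_2"
    TIME_WAIT = "TIME_WAIT"


# (current state, event) -> next state, for the side that closes first.
# The event names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    (TcpState.ESTABLISHED, "send_fin"): TcpState.FIN_WAIT_1,
    (TcpState.FIN_WAIT_1, "recv_ack"): TcpState.FIN_WAIT_2,
    (TcpState.FIN_WAIT_2, "recv_fin"): TcpState.TIME_WAIT,
}


def walk(events, state=TcpState.ESTABLISHED):
    """Apply each event in order and return the resulting state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state
```

A server that has sent its FIN and received the ACK, but never sees the client's FIN, stays at `walk(["send_fin", "recv_ack"])`, which is FIN_WAIT_2.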

The FIN_WAIT_2 state is somewhat unusual in that there is no timeout defined in the standard for it. This means that on many operating systems, a connection in the FIN_WAIT_2 state will stay around until the system is rebooted. If the system does not have a timeout and too many FIN_WAIT_2 connections build up, it can fill up the space allocated for storing information about the connections and crash the kernel. The connections in FIN_WAIT_2 do not tie up an httpd process.
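To watch for such a build-up, the output of netstat can be tallied by state. The following is a rough sketch, not a robust parser: it assumes each TCP line starts with "tcp" and ends with the state name, which holds for common `netstat -an` layouts but not all of them (some systems print the state as FIN_WAIT2). The sample text and its addresses are invented for illustration.

```python
from collections import Counter

# Hypothetical sample in a BSD-like layout; real output varies by system.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address      Foreign Address     (state)
tcp        0      0 192.0.2.10.80      198.51.100.7.1042   FIN_WAIT_2
tcp        0      0 192.0.2.10.80      198.51.100.8.1077   ESTABLISHED
tcp        0      0 192.0.2.10.80      198.51.100.9.1101   FIN_WAIT_2
"""


def count_tcp_states(netstat_output):
    """Count connections per TCP state in netstat-style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumption: TCP lines start with "tcp" and end with the state.
        if fields and fields[0].lower().startswith("tcp"):
            counts[fields[-1]] += 1
    return counts
```

Running `count_tcp_states` periodically on live netstat output and comparing the FIN_WAIT_2 figure over time shows whether connections are accumulating.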

Why Does It Happen?
What Can I Do About it?
Appendix

Why Does It Happen?

There are numerous reasons for it happening, some of which may not yet be fully understood. What is known follows.

Buggy Clients and Persistent Connections

Several clients have a bug which pops up when dealing with persistent connections (aka keepalives). When the connection is idle and the server closes the connection (based on the KeepAliveTimeout), the client is programmed so that it does not send back a FIN and ACK to the server. This means that the connection stays in the FIN_WAIT_2 state until one of the following happens:

The client opens a new connection to the same or a different site, which causes it to fully close the older connection on that socket.
The user exits the client, which on some (most?) clients causes the OS to fully shut down the connection.
The FIN_WAIT_2 times out, on servers that have a timeout for this state.

If you are lucky, this means that the buggy client will fully close the connection and release the resources on your server. However, there are some cases where the socket is never fully closed, such as a dialup client disconnecting from their provider before closing the client. In addition, a client might sit idle for days without making another connection, and thus may hold its end of the socket open for days even though it has no further use for it. This is a bug in the browser or in its operating system's TCP implementation.

The clients on which this problem has been verified to exist:

Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE i386)
Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)
MSIE 3.01 on the Macintosh
MSIE 3.01 on Windows 95

This does not appear to be a problem on:

Mozilla/3.01 (Win95; I)

It is expected that many other clients have the same problem. What a client should do is periodically check its open socket(s) to see if they have been closed by the server, and close its side of the connection if the server has closed. This check need only occur once every few seconds, and may even be detected by an OS signal on some systems (e.g., Win95 and NT clients have this capability, but they seem to be ignoring it).
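The periodic check described above might look roughly like the following on the client side. This is a sketch under assumptions, not code from any real browser: the function names are hypothetical, and it relies on a closed peer making the socket readable with an empty read, which is how stream sockets normally report end-of-file.

```python
import select
import socket


def server_has_closed(sock):
    """Return True if the peer has closed its side of an idle connection."""
    # A zero timeout makes this a non-blocking poll.
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending: the connection is still open
    try:
        # Peek so that any real data stays queued for the normal reader.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionResetError:
        return True


def close_if_server_closed(sock):
    """Close our side if the server has already closed; report what happened."""
    if server_has_closed(sock):
        sock.close()
        return True
    return False
```

A client would call `close_if_server_closed` on each idle persistent connection every few seconds, which sends the FIN the server is waiting for and lets its connection leave FIN_WAIT_2.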

Apache cannot avoid these FIN_WAIT_2 states unless it disables persistent connections for the buggy clients, just like we recommend doing for Navigator 2.x clients due to other bugs. However, non-persistent connections increase the total number of connections needed per client and slow retrieval of an image-laden web page. Since non-persistent connections have their own resource consumption and a short waiting period after each closure, a busy server may need persistence in order to best serve its clients.

As far as we know, the client-caused FIN_WAIT_2 problem is present for all servers that support persistent connections, including Apache 1.1.x and 1.2.

A necessary bit of code introduced in 1.2

While the above bug is a problem, it is not the whole problem. Some users have observed no FIN_WAIT_2 problems with Apache 1.1.x, but with 1.2b enough connections build up in the FIN_WAIT_2 state to crash their server. The most likely source for additional FIN_WAIT_2 states is a function called lingering_close() which was added between 1.1 and 1.2. This function is necessary for the proper handling of persistent connections and any request which includes content in the message body (e.g., PUTs and POSTs). What it does is read any data sent by the client for a certain time after the server closes the connection. The exact reasons for doing this are somewhat complicated, but involve what happens if the client is making a request at the same time the server sends a response and closes the connection. Without lingering, the client might be forced to reset its TCP input buffer before it has a chance to read the server's response, and thus never learn why the connection has closed. See the appendix for more details.

The code in lingering_close() appears to cause problems for a number of reasons, including the change in traffic patterns that it causes. The code has been thoroughly reviewed and we are not aware of any bugs in it. It is possible that there is some problem in the BSD TCP stack, aside from the lack of a timeout for the FIN_WAIT_2 state, exposed by the lingering_close code that causes the observed problems.

What Can I Do About it?

There are several possible workarounds to the problem, some of which work better than others.

Add a timeout for FIN_WAIT_2

The obvious workaround is to simply have a timeout for the FIN_WAIT_2 state. This is not specified by the RFC, and could be claimed to be a violation of the RFC, but it is widely recognized as being necessary. The following systems are known to have a timeout:

FreeBSD versions starting at 2.0 or possibly earlier.
NetBSD version 1.2(?)
OpenBSD all versions(?)
BSD/OS 2.1, with the K210-027 patch installed.
Solaris as of around version 2.2. The timeout can be tuned by using ndd to modify tcp_fin_wait_2_flush_interval, but the default should be appropriate for most servers and improper tuning can have negative impacts.
Linux 2.0.x and earlier(?)
HP-UX 10.x defaults to terminating connections in the FIN_WAIT_2 state after the normal keepalive timeouts. This does not refer to the persistent connection or HTTP keepalive timeouts, but the SO_LINGER socket option which is enabled by Apache. This parameter can be adjusted by using nettune to modify parameters such as tcp_keepstart and tcp_keepstop. In later revisions, there is an explicit timer for connections in FIN_WAIT_2 that can be modified; contact HP support for details.
SGI IRIX can be patched to support a timeout. For IRIX 5.3, 6.2, and 6.3, use patches 1654, 1703 and 1778 respectively. If you have trouble locating these patches, please contact your SGI support channel for help.
NCR's MP RAS Unix 2.xx and 3.xx both have FIN_WAIT_2 timeouts. In 2.xx it is non-tunable at 600 seconds, while in 3.xx it defaults to 600 seconds and is calculated based on the tunable "max keep alive probes" (default of 8) multiplied by the "keep alive interval" (default 75 seconds).
Sequent's ptx/TCP/IP for DYNIX/ptx has had a FIN_WAIT_2 timeout since around release 4.1 in mid-1994.

The following systems are known to not have a timeout:

SunOS 4.x does not and almost certainly never will have one because it is at the very end of its development cycle for Sun. If you have the kernel source, it should be easy to patch.

There is a patch available for adding a timeout to the FIN_WAIT_2 state; it was originally intended for BSD/OS, but should be adaptable to most systems using BSD networking code. You need kernel source code to be able to use it.

Compile without using `lingering_close()`

It is possible to compile Apache 1.2 without using the lingering_close() function. This will result in that section of code being similar to that which was in 1.1. If you do this, be aware that it can cause problems with PUTs, POSTs and persistent connections, especially if the client uses pipelining. That said, it is no worse than on 1.1, and we understand that keeping your server running is quite important.

To compile without the lingering_close() function, add -DNO_LINGCLOSE to the end of the EXTRA_CFLAGS line in your Configuration file, rerun Configure and rebuild the server.

Use `SO_LINGER` as an alternative to `lingering_close()`

On most systems, there is an option called SO_LINGER that can be set with setsockopt(2). It does something very similar to lingering_close(), except that it is broken on many systems so that it causes far more problems than lingering_close. On some systems, it could possibly work better so it may be worth a try if you have no other alternatives.

To try it, add -DUSE_SO_LINGER -DNO_LINGCLOSE to the end of the EXTRA_CFLAGS line in your Configuration file, rerun Configure and rebuild the server.

NOTE

Attempting to use SO_LINGER and lingering_close() at the same time is very likely to do very bad things, so don't.
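For reference, this is roughly what setting SO_LINGER through setsockopt looks like at the socket API level, shown in Python rather than in Apache's own C code. It is a sketch, not what the -DUSE_SO_LINGER build does internally: the helper names are hypothetical, and the option value is assumed to be a struct linger made of two C ints (on/off flag, then linger time in seconds), which is the layout on Linux and the BSDs but not on every platform.

```python
import socket
import struct

# Assumed layout of struct linger: int l_onoff; int l_linger;
LINGER_FORMAT = "ii"


def enable_so_linger(sock, seconds):
    """Ask the kernel to linger on close for up to `seconds` seconds."""
    value = struct.pack(LINGER_FORMAT, 1, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_so_linger(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    size = struct.calcsize(LINGER_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    return struct.unpack(LINGER_FORMAT, raw)
```

Whether the kernel then behaves as intended is exactly the system-dependent part discussed above; reading the option back only confirms that it was accepted.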

Increase the amount of memory used for storing connection state

BSD based networking code:
BSD stores network data, such as connection states, in something called an mbuf. When you get so many connections that the kernel does not have enough mbufs to put them all in, your kernel will likely crash. You can reduce the effects of the problem by increasing the number of mbufs that are available; this will not prevent the problem, it will just make the server go longer before crashing.

The exact way to increase them may depend on your OS; look for some reference to the number of "mbufs" or "mbuf clusters". On many systems, this can be done by adding the line NMBCLUSTERS="n", where n is the number of mbuf clusters you want, to your kernel config file and rebuilding your kernel.

Disable KeepAlive

If you are unable to do any of the above then you should, as a last resort, disable KeepAlive. Edit your httpd.conf and change "KeepAlive On" to "KeepAlive Off".

Appendix

Below is a message from Roy Fielding, one of the authors of HTTP/1.1.

Why the lingering close functionality is necessary with HTTP

The need for a server to linger on a socket after a close is noted a couple times in the HTTP specs, but not explained. This explanation is based on discussions between myself, Henrik Frystyk, Robert S. Thau, Dave Raggett, and John C. Mallery in the hallways of MIT while I was at W3C.

If a server closes the input side of the connection while the client is sending data (or is planning to send data), then the server's TCP stack will signal an RST (reset) back to the client. Upon receipt of the RST, the client will flush its own incoming TCP buffer back to the un-ACKed packet indicated by the RST packet argument. If the server has sent a message, usually an error response, to the client just before the close, and the client receives the RST packet before its application code has read the error message from its incoming TCP buffer and before the server has received the ACK sent by the client upon receipt of that buffer, then the RST will flush the error message before the client application has a chance to see it. The result is that the client is left thinking that the connection failed for no apparent reason.

There are two conditions under which this is likely to occur:

sending POST or PUT data without proper authorization
sending multiple requests before each response (pipelining) and one of the middle requests resulting in an error or other break-the-connection result.

The solution in all cases is to send the response, close only the write half of the connection (what shutdown is supposed to do), and continue reading on the socket until it is either closed by the client (signifying it has finally read the response) or a timeout occurs. That is what the kernel is supposed to do if SO_LINGER is set. Unfortunately, SO_LINGER has no effect on some systems; on some other systems, it does not have its own timeout and thus the TCP memory segments just pile up until the next reboot (planned or not).
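The sequence described above (send the response, close only the write half, keep reading until the client closes or a timeout occurs) can be sketched in user space as follows. This is an illustration of the idea, not Apache's lingering_close() implementation: the function signature, the 2-second default timeout and the 4096-byte read size are all assumptions made for the example.

```python
import select
import socket
import time


def lingering_close(sock, timeout=2.0):
    """Half-close, then drain input until the peer closes or time runs out.

    Returns True if the peer closed its side before the timeout.
    """
    try:
        sock.shutdown(socket.SHUT_WR)  # send FIN; keep the read half open
    except OSError:
        sock.close()
        return False
    deadline = time.monotonic() + timeout
    peer_closed = False
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # lingered long enough
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            break  # timed out waiting for the client
        try:
            data = sock.recv(4096)  # discard whatever the client still sends
        except OSError:
            break
        if data == b"":
            peer_closed = True  # client has read the response and closed
            break
    sock.close()
    return peer_closed
```

Because late client data is read and discarded instead of arriving at a fully closed socket, the server's TCP stack has no reason to send the RST that would wipe out the response. The timeout is what keeps a silent client from holding the server's socket open indefinitely.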

Please note that simply removing the linger code will not solve the problem -- it only moves it to a different and much harder one to detect.
