From cc40af334e619bb549038238507407866f774f8f Mon Sep 17 00:00:00 2001 From: hongbotian Date: Mon, 30 Nov 2015 01:35:09 -0500 Subject: upload apache JIRA: BOTTLENECK-10 Change-Id: I67eae31de6dc824097dfa56ab454ba36fdd23a2c Signed-off-by: hongbotian --- rubbos/app/apache2/manual/developer/API.html | 5 + rubbos/app/apache2/manual/developer/API.html.en | 1223 ++++++++++++++++++++ rubbos/app/apache2/manual/developer/debugging.html | 5 + .../app/apache2/manual/developer/debugging.html.en | 196 ++++ .../app/apache2/manual/developer/documenting.html | 5 + .../apache2/manual/developer/documenting.html.en | 84 ++ rubbos/app/apache2/manual/developer/filters.html | 5 + .../app/apache2/manual/developer/filters.html.en | 210 ++++ rubbos/app/apache2/manual/developer/hooks.html | 5 + rubbos/app/apache2/manual/developer/hooks.html.en | 239 ++++ rubbos/app/apache2/manual/developer/index.html | 5 + rubbos/app/apache2/manual/developer/index.html.en | 73 ++ rubbos/app/apache2/manual/developer/modules.html | 9 + .../app/apache2/manual/developer/modules.html.en | 273 +++++ .../apache2/manual/developer/modules.html.ja.utf8 | 274 +++++ rubbos/app/apache2/manual/developer/request.html | 5 + .../app/apache2/manual/developer/request.html.en | 260 +++++ .../apache2/manual/developer/thread_safety.html | 5 + .../apache2/manual/developer/thread_safety.html.en | 285 +++++ 19 files changed, 3166 insertions(+) create mode 100644 rubbos/app/apache2/manual/developer/API.html create mode 100644 rubbos/app/apache2/manual/developer/API.html.en create mode 100644 rubbos/app/apache2/manual/developer/debugging.html create mode 100644 rubbos/app/apache2/manual/developer/debugging.html.en create mode 100644 rubbos/app/apache2/manual/developer/documenting.html create mode 100644 rubbos/app/apache2/manual/developer/documenting.html.en create mode 100644 rubbos/app/apache2/manual/developer/filters.html create mode 100644 rubbos/app/apache2/manual/developer/filters.html.en create mode 100644 rubbos/app/apache2/manual/developer/hooks.html create mode 100644 rubbos/app/apache2/manual/developer/hooks.html.en create mode 100644 rubbos/app/apache2/manual/developer/index.html create mode 100644 rubbos/app/apache2/manual/developer/index.html.en create mode 100644 rubbos/app/apache2/manual/developer/modules.html create mode 100644 rubbos/app/apache2/manual/developer/modules.html.en create mode 100644 rubbos/app/apache2/manual/developer/modules.html.ja.utf8 create mode 100644 rubbos/app/apache2/manual/developer/request.html create mode 100644 rubbos/app/apache2/manual/developer/request.html.en create mode 100644 rubbos/app/apache2/manual/developer/thread_safety.html create mode 100644 rubbos/app/apache2/manual/developer/thread_safety.html.en (limited to 'rubbos/app/apache2/manual/developer') diff --git a/rubbos/app/apache2/manual/developer/API.html b/rubbos/app/apache2/manual/developer/API.html new file mode 100644 index 00000000..b8ea3a1c --- /dev/null +++ b/rubbos/app/apache2/manual/developer/API.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: API.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/API.html.en b/rubbos/app/apache2/manual/developer/API.html.en new file mode 100644 index 00000000..db62a134 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/API.html.en @@ -0,0 +1,1223 @@ + + + +Apache 1.3 API notes - Apache HTTP Server + + + + + +
<-
+

Apache 1.3 API notes

+
+

Available Languages:  en 

+
+ +

Warning

+

This document has not been updated to take into account changes made + in the 2.0 version of the Apache HTTP Server. Some of the information may + still be relevant, but please use it with care.

+
+ +

These are some notes on the Apache API and the data structures you have + to deal with, etc. They are not yet nearly complete, but hopefully, + they will help you get your bearings. Keep in mind that the API is still + subject to change as we gain experience with it. (See the TODO file for + what might be coming). However, it will be easy to adapt modules + to any changes that are made. (We have more modules to adapt than you + do).

+ +

A few notes on general pedagogical style here. In the interest of + conciseness, all structure declarations here are incomplete -- the real + ones have more slots that I'm not telling you about. For the most part, + these are reserved to one component of the server core or another, and + should be altered by modules with caution. However, in some cases, they + really are things I just haven't gotten around to yet. Welcome to the + bleeding edge.

+ +

Finally, here's an outline, to give you some bare idea of what's coming + up, and in what order:

+ + +
+ +
top
+
+

Basic concepts

+

We begin with an overview of the basic concepts behind the API, and how + they are manifested in the code.

+ +

Handlers, Modules, and Requests

+

Apache breaks down request handling into a series of steps, more or + less the same way the Netscape server API does (although this API has a + few more stages than NetSite does, as hooks for stuff I thought might be + useful in the future). These are:

+ +
    +
  • URI -> Filename translation
  • +
  • Auth ID checking [is the user who they say they are?]
  • +
  • Auth access checking [is the user authorized here?]
  • +
  • Access checking other than auth
  • +
  • Determining MIME type of the object requested
  • +
  • `Fixups' -- there aren't any of these yet, but the phase is intended + as a hook for possible extensions like SetEnv, which don't really fit well elsewhere.
  • +
  • Actually sending a response back to the client.
  • +
  • Logging the request
  • +
+ +

These phases are handled by looking at each of a succession of + modules, looking to see if each of them has a handler for the + phase, and attempting invoking it if so. The handler can typically do one + of three things:

+ +
    +
  • Handle the request, and indicate that it has done so by + returning the magic constant OK.
  • + +
  • Decline to handle the request, by returning the magic integer + constant DECLINED. In this case, the server behaves in all + respects as if the handler simply hadn't been there.
  • + +
  • Signal an error, by returning one of the HTTP error codes. This + terminates normal handling of the request, although an ErrorDocument may + be invoked to try to mop up, and it will be logged in any case.
  • +
+ +

Most phases are terminated by the first module that handles them; + however, for logging, `fixups', and non-access authentication checking, + all handlers always run (barring an error). Also, the response phase is + unique in that modules may declare multiple handlers for it, via a + dispatch table keyed on the MIME type of the requested object. Modules may + declare a response-phase handler which can handle any request, + by giving it the key */* (i.e., a wildcard MIME type + specification). However, wildcard handlers are only invoked if the server + has already tried and failed to find a more specific response handler for + the MIME type of the requested object (either none existed, or they all + declined).

+ +

The handlers themselves are functions of one argument (a + request_rec structure. vide infra), which returns an integer, + as above.

+ + +

A brief tour of a module

+

At this point, we need to explain the structure of a module. Our + candidate will be one of the messier ones, the CGI module -- this handles + both CGI scripts and the ScriptAlias config file command. It's actually a great deal + more complicated than most modules, but if we're going to have only one + example, it might as well be the one with its fingers in every place.

+ +

Let's begin with handlers. In order to handle the CGI scripts, the + module declares a response handler for them. Because of ScriptAlias, it also has handlers for the + name translation phase (to recognize ScriptAliased URIs), the type-checking phase (any + ScriptAliased request is typed + as a CGI script).

+ +

The module needs to maintain some per (virtual) server information, + namely, the ScriptAliases in + effect; the module structure therefore contains pointers to a functions + which builds these structures, and to another which combines two of them + (in case the main server and a virtual server both have ScriptAliases declared).

+ +

Finally, this module contains code to handle the ScriptAlias command itself. This particular + module only declares one command, but there could be more, so modules have + command tables which declare their commands, and describe where + they are permitted, and how they are to be invoked.

+ +

A final note on the declared types of the arguments of some of these + commands: a pool is a pointer to a resource pool + structure; these are used by the server to keep track of the memory which + has been allocated, files opened, etc., either to service a + particular request, or to handle the process of configuring itself. That + way, when the request is over (or, for the configuration pool, when the + server is restarting), the memory can be freed, and the files closed, + en masse, without anyone having to write explicit code to track + them all down and dispose of them. Also, a cmd_parms + structure contains various information about the config file being read, + and other status information, which is sometimes of use to the function + which processes a config-file command (such as ScriptAlias). With no further ado, the + module itself:

+ +

+ /* Declarations of handlers. */
+
+ int translate_scriptalias (request_rec *);
+ int type_scriptalias (request_rec *);
+ int cgi_handler (request_rec *);
+
+ /* Subsidiary dispatch table for response-phase
+  * handlers, by MIME type */
+
+ handler_rec cgi_handlers[] = {
+ + { "application/x-httpd-cgi", cgi_handler },
+ { NULL }
+
+ };
+
+ /* Declarations of routines to manipulate the
+  * module's configuration info. Note that these are
+  * returned, and passed in, as void *'s; the server
+  * core keeps track of them, but it doesn't, and can't,
+  * know their internal structure.
+  */
+
+ void *make_cgi_server_config (pool *);
+ void *merge_cgi_server_config (pool *, void *, void *);
+
+ /* Declarations of routines to handle config-file commands */
+
+ extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake, + char *real);
+
+ command_rec cgi_cmds[] = {
+ + { "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
+ "a fakename and a realname"},
+ { NULL }
+
+ };
+
+ module cgi_module = { +

  STANDARD_MODULE_STUFF,
+  NULL,                     /* initializer */
+  NULL,                     /* dir config creator */
+  NULL,                     /* dir merger */
+  make_cgi_server_config,   /* server config */
+  merge_cgi_server_config,  /* merge server config */
+  cgi_cmds,                 /* command table */
+  cgi_handlers,             /* handlers */
+  translate_scriptalias,    /* filename translation */
+  NULL,                     /* check_user_id */
+  NULL,                     /* check auth */
+  NULL,                     /* check access */
+  type_scriptalias,         /* type_checker */
+  NULL,                     /* fixups */
+  NULL,                     /* logger */
+  NULL                      /* header parser */
+};
+ +
top
+
+

How handlers work

+

The sole argument to handlers is a request_rec structure. + This structure describes a particular request which has been made to the + server, on behalf of a client. In most cases, each connection to the + client generates only one request_rec structure.

+ +

A brief tour of the request_rec

+

The request_rec contains pointers to a resource pool + which will be cleared when the server is finished handling the request; + to structures containing per-server and per-connection information, and + most importantly, information on the request itself.

+ +

The most important such information is a small set of character strings + describing attributes of the object being requested, including its URI, + filename, content-type and content-encoding (these being filled in by the + translation and type-check handlers which handle the request, + respectively).

+ +

Other commonly used data items are tables giving the MIME headers on + the client's original request, MIME headers to be sent back with the + response (which modules can add to at will), and environment variables for + any subprocesses which are spawned off in the course of servicing the + request. These tables are manipulated using the ap_table_get + and ap_table_set routines.

+ +
+

Note that the Content-type header value cannot + be set by module content-handlers using the ap_table_*() + routines. Rather, it is set by pointing the content_type + field in the request_rec structure to an appropriate + string. e.g.,

+

+ r->content_type = "text/html"; +

+
+ +

Finally, there are pointers to two data structures which, in turn, + point to per-module configuration structures. Specifically, these hold + pointers to the data structures which the module has built to describe + the way it has been configured to operate in a given directory (via + .htaccess files or <Directory> sections), for private data it has built in the + course of servicing the request (so modules' handlers for one phase can + pass `notes' to their handlers for other phases). There is another such + configuration vector in the server_rec data structure pointed + to by the request_rec, which contains per (virtual) server + configuration data.

+ +

Here is an abridged declaration, giving the fields most commonly + used:

+ +

+ struct request_rec {
+
+ pool *pool;
+ conn_rec *connection;
+ server_rec *server;
+
+ /* What object is being requested */
+
+ char *uri;
+ char *filename;
+ char *path_info; +

char *args;           /* QUERY_ARGS, if any */
+struct stat finfo;    /* Set by server core;
+                       * st_mode set to zero if no such file */

+ char *content_type;
+ char *content_encoding;
+
+ /* MIME header environments, in and out. Also,
+  * an array containing environment variables to
+  * be passed to subprocesses, so people can write
+  * modules to add to that environment.
+  *
+  * The difference between headers_out and
+  * err_headers_out is that the latter are printed
+  * even on error, and persist across internal
+  * redirects (so the headers printed for
+  * ErrorDocument handlers will have + them).
+  */
+
+ table *headers_in;
+ table *headers_out;
+ table *err_headers_out;
+ table *subprocess_env;
+
+ /* Info about the request itself... */
+
+

int header_only;     /* HEAD request, as opposed to GET */
+char *protocol;      /* Protocol, as given to us, or HTTP/0.9 */
+char *method;        /* GET, HEAD, POST, etc. */
+int method_number;   /* M_GET, M_POST, etc. */
+
+

+ /* Info for logging */
+
+ char *the_request;
+ int bytes_sent;
+
+ /* A flag which modules can set, to indicate that
+  * the data being returned is volatile, and clients
+  * should be told not to cache it.
+  */
+
+ int no_cache;
+
+ /* Various other config info which may change
+  * with .htaccess files
+  * These are config vectors, with one void*
+  * pointer for each module (the thing pointed
+  * to being the module's business).
+  */
+
+

void *per_dir_config;   /* Options set in config files, etc. */
+void *request_config;   /* Notes on *this* request */
+
+

+ }; +

+ + +

Where request_rec structures come from

+

Most request_rec structures are built by reading an HTTP + request from a client, and filling in the fields. However, there are a + few exceptions:

+ +
    +
  • If the request is to an imagemap, a type map (i.e., a + *.var file), or a CGI script which returned a local + `Location:', then the resource which the user requested is going to be + ultimately located by some URI other than what the client originally + supplied. In this case, the server does an internal redirect, + constructing a new request_rec for the new URI, and + processing it almost exactly as if the client had requested the new URI + directly.
  • + +
  • If some handler signaled an error, and an ErrorDocument + is in scope, the same internal redirect machinery comes into play.
  • + +
  • Finally, a handler occasionally needs to investigate `what would + happen if' some other request were run. For instance, the directory + indexing module needs to know what MIME type would be assigned to a + request for each directory entry, in order to figure out what icon to + use.

    + +

    Such handlers can construct a sub-request, using the + functions ap_sub_req_lookup_file, + ap_sub_req_lookup_uri, and ap_sub_req_method_uri; + these construct a new request_rec structure and processes it + as you would expect, up to but not including the point of actually sending + a response. (These functions skip over the access checks if the + sub-request is for a file in the same directory as the original + request).

    + +

    (Server-side includes work by building sub-requests and then actually + invoking the response handler for them, via the function + ap_run_sub_req).

    +
  • +
+ + +

Handling requests, declining, and returning + error codes

+

As discussed above, each handler, when invoked to handle a particular + request_rec, has to return an int to indicate + what happened. That can either be

+ +
    +
  • OK -- the request was handled successfully. This may or + may not terminate the phase.
  • + +
  • DECLINED -- no erroneous condition exists, but the module + declines to handle the phase; the server tries to find another.
  • + +
  • an HTTP error code, which aborts handling of the request.
  • +
+ +

Note that if the error code returned is REDIRECT, then + the module should put a Location in the request's + headers_out, to indicate where the client should be + redirected to.

+ + +

Special considerations for response + handlers

+

Handlers for most phases do their work by simply setting a few fields + in the request_rec structure (or, in the case of access + checkers, simply by returning the correct error code). However, response + handlers have to actually send a request back to the client.

+ +

They should begin by sending an HTTP response header, using the + function ap_send_http_header. (You don't have to do anything + special to skip sending the header for HTTP/0.9 requests; the function + figures out on its own that it shouldn't do anything). If the request is + marked header_only, that's all they should do; they should + return after that, without attempting any further output.

+ +

Otherwise, they should produce a request body which responds to the + client as appropriate. The primitives for this are ap_rputc + and ap_rprintf, for internally generated output, and + ap_send_fd, to copy the contents of some FILE * + straight to the client.

+ +

At this point, you should more or less understand the following piece + of code, which is the handler which handles GET requests + which have no more specific handler; it also shows how conditional + GETs can be handled, if it's desirable to do so in a + particular response handler -- ap_set_last_modified checks + against the If-modified-since value supplied by the client, + if any, and returns an appropriate code (which will, if nonzero, be + USE_LOCAL_COPY). No similar considerations apply for + ap_set_content_length, but it returns an error code for + symmetry.

+ +

+ int default_handler (request_rec *r)
+ {
+ + int errstatus;
+ FILE *f;
+
+ if (r->method_number != M_GET) return DECLINED;
+ if (r->finfo.st_mode == 0) return NOT_FOUND;
+
+ if ((errstatus = ap_set_content_length (r, r->finfo.st_size))
+     || + (errstatus = ap_set_last_modified (r, r->finfo.st_mtime)))
+ return errstatus;
+
+ f = fopen (r->filename, "r");
+
+ if (f == NULL) {
+ + log_reason("file permissions deny server access", r->filename, r);
+ return FORBIDDEN;
+
+ }
+
+ register_timeout ("send", r);
+ ap_send_http_header (r);
+
+ if (!r->header_only) send_fd (f, r);
+ ap_pfclose (r->pool, f);
+ return OK;
+
+ } +

+ +

Finally, if all of this is too much of a challenge, there are a few + ways out of it. First off, as shown above, a response handler which has + not yet produced any output can simply return an error code, in which + case the server will automatically produce an error response. Secondly, + it can punt to some other handler by invoking + ap_internal_redirect, which is how the internal redirection + machinery discussed above is invoked. A response handler which has + internally redirected should always return OK.

+ +

(Invoking ap_internal_redirect from handlers which are + not response handlers will lead to serious confusion).

+ + +

Special considerations for authentication + handlers

+

Stuff that should be discussed here in detail:

+ +
    +
  • Authentication-phase handlers not invoked unless auth is + configured for the directory.
  • + +
  • Common auth configuration stored in the core per-dir + configuration; it has accessors ap_auth_type, + ap_auth_name, and ap_requires.
  • + +
  • Common routines, to handle the protocol end of things, at + least for HTTP basic authentication + (ap_get_basic_auth_pw, which sets the + connection->user structure field + automatically, and ap_note_basic_auth_failure, + which arranges for the proper WWW-Authenticate: + header to be sent back).
  • +
+ + +

Special considerations for logging + handlers

+

When a request has internally redirected, there is the question of + what to log. Apache handles this by bundling the entire chain of redirects + into a list of request_rec structures which are threaded + through the r->prev and r->next pointers. + The request_rec which is passed to the logging handlers in + such cases is the one which was originally built for the initial request + from the client; note that the bytes_sent field will only be + correct in the last request in the chain (the one for which a response was + actually sent).

+ +
top
+
+

Resource allocation and resource pools

+

One of the problems of writing and designing a server-pool server is + that of preventing leakage, that is, allocating resources (memory, open + files, etc.), without subsequently releasing them. The resource + pool machinery is designed to make it easy to prevent this from happening, + by allowing resource to be allocated in such a way that they are + automatically released when the server is done with them.

+ +

The way this works is as follows: the memory which is allocated, file + opened, etc., to deal with a particular request are tied to a + resource pool which is allocated for the request. The pool is a + data structure which itself tracks the resources in question.

+ +

When the request has been processed, the pool is cleared. At + that point, all the memory associated with it is released for reuse, all + files associated with it are closed, and any other clean-up functions which + are associated with the pool are run. When this is over, we can be confident + that all the resource tied to the pool have been released, and that none of + them have leaked.

+ +

Server restarts, and allocation of memory and resources for per-server + configuration, are handled in a similar way. There is a configuration + pool, which keeps track of resources which were allocated while reading + the server configuration files, and handling the commands therein (for + instance, the memory that was allocated for per-server module configuration, + log files and other files that were opened, and so forth). When the server + restarts, and has to reread the configuration files, the configuration pool + is cleared, and so the memory and file descriptors which were taken up by + reading them the last time are made available for reuse.

+ +

It should be noted that use of the pool machinery isn't generally + obligatory, except for situations like logging handlers, where you really + need to register cleanups to make sure that the log file gets closed when + the server restarts (this is most easily done by using the function ap_pfopen, which also arranges for the + underlying file descriptor to be closed before any child processes, such as + for CGI scripts, are execed), or in case you are using the + timeout machinery (which isn't yet even documented here). However, there are + two benefits to using it: resources allocated to a pool never leak (even if + you allocate a scratch string, and just forget about it); also, for memory + allocation, ap_palloc is generally faster than + malloc.

+ +

We begin here by describing how memory is allocated to pools, and then + discuss how other resources are tracked by the resource pool machinery.

+ +

Allocation of memory in pools

+

Memory is allocated to pools by calling the function + ap_palloc, which takes two arguments, one being a pointer to + a resource pool structure, and the other being the amount of memory to + allocate (in chars). Within handlers for handling requests, + the most common way of getting a resource pool structure is by looking at + the pool slot of the relevant request_rec; hence + the repeated appearance of the following idiom in module code:

+ +

+ int my_handler(request_rec *r)
+ {
+ + struct my_structure *foo;
+ ...
+
+ foo = (foo *)ap_palloc (r->pool, sizeof(my_structure));
+
+ } +

+ +

Note that there is no ap_pfree -- + ap_palloced memory is freed only when the associated resource + pool is cleared. This means that ap_palloc does not have to + do as much accounting as malloc(); all it does in the typical + case is to round up the size, bump a pointer, and do a range check.

+ +

(It also raises the possibility that heavy use of + ap_palloc could cause a server process to grow excessively + large. There are two ways to deal with this, which are dealt with below; + briefly, you can use malloc, and try to be sure that all of + the memory gets explicitly freed, or you can allocate a + sub-pool of the main pool, allocate your memory in the sub-pool, and clear + it out periodically. The latter technique is discussed in the section + on sub-pools below, and is used in the directory-indexing code, in order + to avoid excessive storage allocation when listing directories with + thousands of files).

+ + +

Allocating initialized memory

+

There are functions which allocate initialized memory, and are + frequently useful. The function ap_pcalloc has the same + interface as ap_palloc, but clears out the memory it + allocates before it returns it. The function ap_pstrdup + takes a resource pool and a char * as arguments, and + allocates memory for a copy of the string the pointer points to, returning + a pointer to the copy. Finally ap_pstrcat is a varargs-style + function, which takes a pointer to a resource pool, and at least two + char * arguments, the last of which must be + NULL. It allocates enough memory to fit copies of each of + the strings, as a unit; for instance:

+ +

+ ap_pstrcat (r->pool, "foo", "/", "bar", NULL); +

+ +

returns a pointer to 8 bytes worth of memory, initialized to + "foo/bar".

+ + +

Commonly-used pools in the Apache Web + server

+

A pool is really defined by its lifetime more than anything else. + There are some static pools in http_main which are passed to various + non-http_main functions as arguments at opportune times. Here they + are:

+ +
+
permanent_pool
+
never passed to anything else, this is the ancestor of all pools
+ +
pconf
+
+
    +
  • subpool of permanent_pool
  • + +
  • created at the beginning of a config "cycle"; exists + until the server is terminated or restarts; passed to all + config-time routines, either via cmd->pool, or as the + "pool *p" argument on those which don't take pools
  • + +
  • passed to the module init() functions
  • +
+
+ +
ptemp
+
+
    +
  • sorry I lie, this pool isn't called this currently in + 1.3, I renamed it this in my pthreads development. I'm + referring to the use of ptrans in the parent... contrast + this with the later definition of ptrans in the + child.
  • + +
  • subpool of permanent_pool
  • + +
  • created at the beginning of a config "cycle"; exists + until the end of config parsing; passed to config-time + routines via cmd->temp_pool. Somewhat of a + "bastard child" because it isn't available everywhere. + Used for temporary scratch space which may be needed by + some config routines but which is deleted at the end of + config.
  • +
+
+ +
pchild
+
+
    +
  • subpool of permanent_pool
  • + +
  • created when a child is spawned (or a thread is + created); lives until that child (thread) is + destroyed
  • + +
  • passed to the module child_init functions
  • + +
  • destruction happens right after the child_exit + functions are called... (which may explain why I think + child_exit is redundant and unneeded)
  • +
+
+ +
ptrans
+
+
    +
  • should be a subpool of pchild, but currently is a + subpool of permanent_pool, see above
  • + +
  • cleared by the child before going into the accept() + loop to receive a connection
  • + +
  • used as connection->pool
  • +
+
+ +
r->pool
+
+
    +
  • for the main request this is a subpool of + connection->pool; for subrequests it is a subpool of + the parent request's pool.
  • + +
  • exists until the end of the request (i.e., + ap_destroy_sub_req, or in child_main after + process_request has finished)
  • + +
  • note that r itself is allocated from r->pool; + i.e., r->pool is first created and then r is + the first thing palloc()d from it
  • +
+
+
+ +

For almost everything folks do, r->pool is the pool to + use. But you can see how other lifetimes, such as pchild, are useful to + some modules... such as modules that need to open a database connection + once per child, and wish to clean it up when the child dies.

+ +

You can also see how some bugs have manifested themself, such as + setting connection->user to a value from + r->pool -- in this case connection exists for the + lifetime of ptrans, which is longer than + r->pool (especially if r->pool is a + subrequest!). So the correct thing to do is to allocate from + connection->pool.

+ +

And there was another interesting bug in mod_include + / mod_cgi. You'll see in those that they do this test + to decide if they should use r->pool or + r->main->pool. In this case the resource that they are + registering for cleanup is a child process. If it were registered in + r->pool, then the code would wait() for the + child when the subrequest finishes. With mod_include this + could be any old #include, and the delay can be up to 3 + seconds... and happened quite frequently. Instead the subprocess is + registered in r->main->pool which causes it to be + cleaned up when the entire request is done -- i.e., after the + output has been sent to the client and logging has happened.

+ + +

Tracking open files, etc.

+

As indicated above, resource pools are also used to track other sorts + of resources besides memory. The most common are open files. The routine + which is typically used for this is ap_pfopen, which takes a + resource pool and two strings as arguments; the strings are the same as + the typical arguments to fopen, e.g.,

+ +

+ ...
+ FILE *f = ap_pfopen (r->pool, r->filename, "r");
+
+ if (f == NULL) { ... } else { ... }
+

+ +

There is also a ap_popenf routine, which parallels the + lower-level open system call. Both of these routines arrange + for the file to be closed when the resource pool in question is + cleared.

+ +

Unlike the case for memory, there are functions to close files + allocated with ap_pfopen, and ap_popenf, namely + ap_pfclose and ap_pclosef. (This is because, on + many systems, the number of files which a single process can have open is + quite limited). It is important to use these functions to close files + allocated with ap_pfopen and ap_popenf, since to + do otherwise could cause fatal errors on systems such as Linux, which + react badly if the same FILE* is closed more than once.

+ +

(Using the close functions is not mandatory, since the + file will eventually be closed regardless, but you should consider it in + cases where your module is opening, or could open, a lot of files).

+ + +

Other sorts of resources -- cleanup functions

+

More text goes here. Describe the cleanup primitives in terms of + which the file stuff is implemented; also, spawn_process.

+ +

Pool cleanups live until clear_pool() is called: + clear_pool(a) recursively calls destroy_pool() + on all subpools of a; then calls all the cleanups for + a; then releases all the memory for a. + destroy_pool(a) calls clear_pool(a) and then + releases the pool structure itself. i.e., + clear_pool(a) doesn't delete a, it just frees + up all the resources and you can start using it again immediately.

+ + +

Fine control -- creating and dealing with sub-pools, with + a note on sub-requests

+

On rare occasions, too-free use of ap_palloc() and the + associated primitives may result in undesirably profligate resource + allocation. You can deal with such a case by creating a sub-pool, + allocating within the sub-pool rather than the main pool, and clearing or + destroying the sub-pool, which releases the resources which were + associated with it. (This really is a rare situation; the only + case in which it comes up in the standard module set is in case of listing + directories, and then only with very large directories. + Unnecessary use of the primitives discussed here can hair up your code + quite a bit, with very little gain).

+ +

The primitive for creating a sub-pool is ap_make_sub_pool, + which takes another pool (the parent pool) as an argument. When the main + pool is cleared, the sub-pool will be destroyed. The sub-pool may also be + cleared or destroyed at any time, by calling the functions + ap_clear_pool and ap_destroy_pool, respectively. + (The difference is that ap_clear_pool frees resources + associated with the pool, while ap_destroy_pool also + deallocates the pool itself. In the former case, you can allocate new + resources within the pool, and clear it again, and so forth; in the + latter case, it is simply gone).

+ +

One final note -- sub-requests have their own resource pools, which are + sub-pools of the resource pool for the main request. The polite way to + reclaim the resources associated with a sub request which you have + allocated (using the ap_sub_req_... functions) is + ap_destroy_sub_req, which frees the resource pool. Before + calling this function, be sure to copy anything that you care about which + might be allocated in the sub-request's resource pool into someplace a + little less volatile (for instance, the filename in its + request_rec structure).

+ +

(Again, under most circumstances, you shouldn't feel obliged to call + this function; only 2K of memory or so are allocated for a typical sub + request, and it will be freed anyway when the main request pool is + cleared. It is only when you are allocating many, many sub-requests for a + single main request that you should seriously consider the + ap_destroy_... functions).

+ +
top
+
+

Configuration, commands and the like

+

One of the design goals for this server was to maintain external + compatibility with the NCSA 1.3 server --- that is, to read the same + configuration files, to process all the directives therein correctly, and + in general to be a drop-in replacement for NCSA. On the other hand, another + design goal was to move as much of the server's functionality into modules + which have as little as possible to do with the monolithic server core. The + only way to reconcile these goals is to move the handling of most commands + from the central server into the modules.

+ +

However, just giving the modules command tables is not enough to divorce + them completely from the server core. The server has to remember the + commands in order to act on them later. That involves maintaining data which + is private to the modules, and which can be either per-server, or + per-directory. Most things are per-directory, including in particular access + control and authorization information, but also information on how to + determine file types from suffixes, which can be modified by + AddType and DefaultType directives, and so forth. In general, + the governing philosophy is that anything which can be made + configurable by directory should be; per-server information is generally + used in the standard set of modules for information like + Aliases and Redirects which come into play before the + request is tied to a particular place in the underlying file system.

+ +

Another requirement for emulating the NCSA server is being able to handle + the per-directory configuration files, generally called + .htaccess files, though even in the NCSA server they can + contain directives which have nothing at all to do with access control. + Accordingly, after URI -> filename translation, but before performing any + other phase, the server walks down the directory hierarchy of the underlying + filesystem, following the translated pathname, to read any + .htaccess files which might be present. The information which + is read in then has to be merged with the applicable information + from the server's own config files (either from the <Directory> sections in + access.conf, or from defaults in srm.conf, which + actually behaves for most purposes almost exactly like <Directory + />).

+ +

Finally, after having served a request which involved reading + .htaccess files, we need to discard the storage allocated for + handling them. That is solved the same way it is solved wherever else + similar problems come up, by tying those structures to the per-transaction + resource pool.

+ +

Per-directory configuration structures

+

Let's look out how all of this plays out in mod_mime.c, + which defines the file typing handler which emulates the NCSA server's + behavior of determining file types from suffixes. What we'll be looking + at, here, is the code which implements the AddType and AddEncoding commands. These commands can appear in + .htaccess files, so they must be handled in the module's + private per-directory data, which in fact, consists of two separate + tables for MIME types and encoding information, and is declared as + follows:

+ +
typedef struct {
+    table *forced_types;      /* Additional AddTyped stuff */
+    table *encoding_types;    /* Added with AddEncoding... */
+} mime_dir_config;
+ +

When the server is reading a configuration file, or <Directory> section, which includes + one of the MIME module's commands, it needs to create a + mime_dir_config structure, so those commands have something + to act on. It does this by invoking the function it finds in the module's + `create per-dir config slot', with two arguments: the name of the + directory to which this configuration information applies (or + NULL for srm.conf), and a pointer to a + resource pool in which the allocation should happen.

+ +

(If we are reading a .htaccess file, that resource pool + is the per-request resource pool for the request; otherwise it is a + resource pool which is used for configuration data, and cleared on + restarts. Either way, it is important for the structure being created to + vanish when the pool is cleared, by registering a cleanup on the pool if + necessary).

+ +

For the MIME module, the per-dir config creation function just + ap_pallocs the structure above, and a creates a couple of + tables to fill it. That looks like this:

+ +

+ void *create_mime_dir_config (pool *p, char *dummy)
+ {
+ + mime_dir_config *new =
+ + (mime_dir_config *) ap_palloc (p, sizeof(mime_dir_config));
+
+
+ new->forced_types = ap_make_table (p, 4);
+ new->encoding_types = ap_make_table (p, 4);
+
+ return new;
+
+ } +

+ +

Now, suppose we've just read in a .htaccess file. We + already have the per-directory configuration structure for the next + directory up in the hierarchy. If the .htaccess file we just + read in didn't have any AddType + or AddEncoding commands, its + per-directory config structure for the MIME module is still valid, and we + can just use it. Otherwise, we need to merge the two structures + somehow.

+ +

To do that, the server invokes the module's per-directory config merge + function, if one is present. That function takes three arguments: the two + structures being merged, and a resource pool in which to allocate the + result. For the MIME module, all that needs to be done is overlay the + tables from the new per-directory config structure with those from the + parent:

+ +

+ void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
+ {
+ + mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
+ mime_dir_config *subdir = (mime_dir_config *)subdirv;
+ mime_dir_config *new =
+ + (mime_dir_config *)ap_palloc (p, sizeof(mime_dir_config));
+
+
+ new->forced_types = ap_overlay_tables (p, subdir->forced_types,
+ + parent_dir->forced_types);
+
+ new->encoding_types = ap_overlay_tables (p, subdir->encoding_types,
+ + parent_dir->encoding_types);
+
+
+ return new;
+
+ } +

+ +

As a note -- if there is no per-directory merge function present, the + server will just use the subdirectory's configuration info, and ignore + the parent's. For some modules, that works just fine (e.g., for + the includes module, whose per-directory configuration information + consists solely of the state of the XBITHACK), and for those + modules, you can just not declare one, and leave the corresponding + structure slot in the module itself NULL.

+ + +

Command handling

+

Now that we have these structures, we need to be able to figure out how + to fill them. That involves processing the actual AddType and AddEncoding commands. To find commands, the server looks in + the module's command table. That table contains information on how many + arguments the commands take, and in what formats, where it is permitted, + and so forth. That information is sufficient to allow the server to invoke + most command-handling functions with pre-parsed arguments. Without further + ado, let's look at the AddType + command handler, which looks like this (the AddEncoding command looks basically the same, and won't be + shown here):

+ +

+ char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
+ {
+ + if (*ext == '.') ++ext;
+ ap_table_set (m->forced_types, ext, ct);
+ return NULL;
+
+ } +

+ +

This command handler is unusually simple. As you can see, it takes + four arguments, two of which are pre-parsed arguments, the third being the + per-directory configuration structure for the module in question, and the + fourth being a pointer to a cmd_parms structure. That + structure contains a bunch of arguments which are frequently of use to + some, but not all, commands, including a resource pool (from which memory + can be allocated, and to which cleanups should be tied), and the (virtual) + server being configured, from which the module's per-server configuration + data can be obtained if required.

+ +

Another way in which this particular command handler is unusually + simple is that there are no error conditions which it can encounter. If + there were, it could return an error message instead of NULL; + this causes an error to be printed out on the server's + stderr, followed by a quick exit, if it is in the main config + files; for a .htaccess file, the syntax error is logged in + the server error log (along with an indication of where it came from), and + the request is bounced with a server error response (HTTP error status, + code 500).

+ +

The MIME module's command table has entries for these commands, which + look like this:

+ +

+ command_rec mime_cmds[] = {
+ + { "AddType", add_type, NULL, OR_FILEINFO, TAKE2,
+ "a mime type followed by a file extension" },
+ { "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
+ + "an encoding (e.g., gzip), followed by a file extension" },
+
+ { NULL }
+
+ }; +

+ +

The entries in these tables are:

+
    +
  • The name of the command
  • +
  • The function which handles it
  • +
  • a (void *) pointer, which is passed in the + cmd_parms structure to the command handler --- + this is useful in case many similar commands are handled by + the same function.
  • + +
  • A bit mask indicating where the command may appear. There + are mask bits corresponding to each + AllowOverride option, and an additional mask + bit, RSRC_CONF, indicating that the command may + appear in the server's own config files, but not in + any .htaccess file.
  • + +
  • A flag indicating how many arguments the command handler + wants pre-parsed, and how they should be passed in. + TAKE2 indicates two pre-parsed arguments. Other + options are TAKE1, which indicates one + pre-parsed argument, FLAG, which indicates that + the argument should be On or Off, + and is passed in as a boolean flag, RAW_ARGS, + which causes the server to give the command the raw, unparsed + arguments (everything but the command name itself). There is + also ITERATE, which means that the handler looks + the same as TAKE1, but that if multiple + arguments are present, it should be called multiple times, + and finally ITERATE2, which indicates that the + command handler looks like a TAKE2, but if more + arguments are present, then it should be called multiple + times, holding the first argument constant.
  • + +
  • Finally, we have a string which describes the arguments + that should be present. If the arguments in the actual config + file are not as required, this string will be used to help + give a more specific error message. (You can safely leave + this NULL).
  • +
+ +

Finally, having set this all up, we have to use it. This is ultimately + done in the module's handlers, specifically for its file-typing handler, + which looks more or less like this; note that the per-directory + configuration structure is extracted from the request_rec's + per-directory configuration vector by using the + ap_get_module_config function.

+ +

+ int find_ct(request_rec *r)
+ {
+ + int i;
+ char *fn = ap_pstrdup (r->pool, r->filename);
+ mime_dir_config *conf = (mime_dir_config *)
+ + ap_get_module_config(r->per_dir_config, &mime_module);
+
+ char *type;
+
+ if (S_ISDIR(r->finfo.st_mode)) {
+ + r->content_type = DIR_MAGIC_TYPE;
+ return OK;
+
+ }
+
+ if((i=ap_rind(fn,'.')) < 0) return DECLINED;
+ ++i;
+
+ if ((type = ap_table_get (conf->encoding_types, &fn[i])))
+ {
+ + r->content_encoding = type;
+
+ /* go back to previous extension to try to use it as a type */
+ fn[i-1] = '\0';
+ if((i=ap_rind(fn,'.')) < 0) return OK;
+ ++i;
+
+ }
+
+ if ((type = ap_table_get (conf->forced_types, &fn[i])))
+ {
+ + r->content_type = type;
+
+ }
+
+ return OK; +
+ } +

+ + +

Side notes -- per-server configuration, + virtual servers, etc.

+

The basic ideas behind per-server module configuration are basically + the same as those for per-directory configuration; there is a creation + function and a merge function, the latter being invoked where a virtual + server has partially overridden the base server configuration, and a + combined structure must be computed. (As with per-directory configuration, + the default if no merge function is specified, and a module is configured + in some virtual server, is that the base configuration is simply + ignored).

+ +

The only substantial difference is that when a command needs to + configure the per-server private module data, it needs to go to the + cmd_parms data to get at it. Here's an example, from the + alias module, which also indicates how a syntax error can be returned + (note that the per-directory configuration argument to the command + handler is declared as a dummy, since the module doesn't actually have + per-directory config data):

+ +

+ char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
+ {
+ + server_rec *s = cmd->server;
+ alias_server_conf *conf = (alias_server_conf *)
+ + ap_get_module_config(s->module_config,&alias_module);
+
+ alias_entry *new = ap_push_array (conf->redirects);
+
+ if (!ap_is_url (url)) return "Redirect to non-URL";
+
+ new->fake = f; new->real = url;
+ return NULL;
+
+ } +

+ +
+
+

Available Languages:  en 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/debugging.html b/rubbos/app/apache2/manual/developer/debugging.html new file mode 100644 index 00000000..6d94fa27 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/debugging.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: debugging.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/debugging.html.en b/rubbos/app/apache2/manual/developer/debugging.html.en new file mode 100644 index 00000000..3febd64c --- /dev/null +++ b/rubbos/app/apache2/manual/developer/debugging.html.en @@ -0,0 +1,196 @@ + + + +Debugging Memory Allocation in APR - Apache HTTP Server + + + + + +
<-
+

Debugging Memory Allocation in APR

+
+

Available Languages:  en 

+
+ +

The allocation mechanism's within APR have a number of debugging modes + that can be used to assist in finding memory problems. This document + describes the modes available and gives instructions on activating + them.

+
+ +
top
+
+

Available debugging options

+

Allocation Debugging - ALLOC_DEBUG

+ + +
Debugging support: Define this to enable code which + helps detect re-use of free()d memory and other such + nonsense.
+ +

The theory is simple. The FILL_BYTE (0xa5) + is written over all malloc'd memory as we receive it, and + is written over everything that we free up during a + clear_pool. We check that blocks on the free list always + have the FILL_BYTE in them, and we check during + palloc() that the bytes still have FILL_BYTE + in them. If you ever see garbage URLs or whatnot containing lots + of 0xa5s then you know something used data that's been + freed or uninitialized.

+ + +

Malloc Support - ALLOC_USE_MALLOC

+ + +
If defined all allocations will be done with + malloc() and free()d appropriately at the + end.
+ +

This is intended to be used with something like Electric + Fence or Purify to help detect memory problems. Note that if + you're using efence then you should also add in ALLOC_DEBUG. + But don't add in ALLOC_DEBUG if you're using Purify because + ALLOC_DEBUG would hide all the uninitialized read errors + that Purify can diagnose.

+ + +

Pool Debugging - POOL_DEBUG

+
This is intended to detect cases where the wrong pool is + used when assigning data to an object in another pool.
+ +

In particular, it causes the table_{set,add,merge}n + routines to check that their arguments are safe for the + apr_table_t they're being placed in. It currently only works + with the unix multiprocess model, but could be extended to others.

+ + +

Table Debugging - MAKE_TABLE_PROFILE

+ + +
Provide diagnostic information about make_table() calls + which are possibly too small.
+ +

This requires a recent gcc which supports + __builtin_return_address(). The error_log output will be a + message such as:

+

+ table_push: apr_table_t created by 0x804d874 hit limit of 10 +

+ +

Use l *0x804d874 to find the + source that corresponds to. It indicates that a apr_table_t + allocated by a call at that address has possibly too small an + initial apr_table_t size guess.

+ + +

Allocation Statistics - ALLOC_STATS

+ + +
Provide some statistics on the cost of allocations.
+ +

This requires a bit of an understanding of how alloc.c works.

+ +
top
+
+

Allowable Combinations

+ +

Not all the options outlined above can be activated at the + same time. the following table gives more information.

+ + + + + + + + + + + + + + + + +
+ ALLOC DEBUGALLOC USE MALLOCPOOL DEBUGMAKE TABLE PROFILEALLOC STATS
ALLOC DEBUG-NoYesYesYes
ALLOC USE MALLOCNo-NoNoNo
POOL DEBUGYesNo-YesYes
MAKE TABLE PROFILEYesNoYes-Yes
ALLOC STATSYesNoYesYes-
+ +

Additionally the debugging options are not suitable for + multi-threaded versions of the server. When trying to debug + with these options the server should be started in single + process mode.

+
top
+
+

Activating Debugging Options

+ +

The various options for debugging memory are now enabled in + the apr_general.h header file in APR. The various options are + enabled by uncommenting the define for the option you wish to + use. The section of the code currently looks like this + (contained in srclib/apr/include/apr_pools.h)

+ +

+ /*
+ #define ALLOC_DEBUG
+ #define POOL_DEBUG
+ #define ALLOC_USE_MALLOC
+ #define MAKE_TABLE_PROFILE
+ #define ALLOC_STATS
+ */
+
+ typedef struct ap_pool_t {
+ + union block_hdr *first;
+ union block_hdr *last;
+ struct cleanup *cleanups;
+ struct process_chain *subprocesses;
+ struct ap_pool_t *sub_pools;
+ struct ap_pool_t *sub_next;
+ struct ap_pool_t *sub_prev;
+ struct ap_pool_t *parent;
+ char *free_first_avail;
+
+ #ifdef ALLOC_USE_MALLOC
+ + void *allocation_list;
+
+ #endif
+ #ifdef POOL_DEBUG
+ + struct ap_pool_t *joined;
+
+ #endif
+ + int (*apr_abort)(int retcode);
+ struct datastruct *prog_data;
+
+ } ap_pool_t; +

+ +

To enable allocation debugging simply move the #define + ALLOC_DEBUG above the start of the comments block and rebuild + the server.

+ +

Note

+

In order to use the various options the server must + be rebuilt after editing the header file.

+
+
+
+

Available Languages:  en 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/documenting.html b/rubbos/app/apache2/manual/developer/documenting.html new file mode 100644 index 00000000..db57cef3 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/documenting.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: documenting.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/documenting.html.en b/rubbos/app/apache2/manual/developer/documenting.html.en new file mode 100644 index 00000000..204801d4 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/documenting.html.en @@ -0,0 +1,84 @@ + + + +Documenting Apache 2.0 - Apache HTTP Server + + + + + +
<-
+

Documenting Apache 2.0

+
+

Available Languages:  en 

+
+ +

Apache 2.0 uses Doxygen to + document the APIs and global variables in the code. This will explain + the basics of how to document using Doxygen.

+
+
top
+
+

Brief Description

+

To start a documentation block, use /**
+ To end a documentation block, use */

+ +

In the middle of the block, there are multiple tags we can + use:

+ +

+ Description of this functions purpose
+ @param parameter_name description
+ @return description
+ @deffunc signature of the function
+

+ +

The deffunc is not always necessary. DoxyGen does not + have a full parser in it, so any prototype that use a macro in the + return type declaration is too complex for scandoc. Those functions + require a deffunc. An example (using &gt; rather + than >):

+ +

+ /**
+  * return the final element of the pathname
+  * @param pathname The path to get the final element of
+  * @return the final element of the path
+  * @tip Examples:
+  * <pre>
+  * "/foo/bar/gum" -&gt; "gum"
+  * "/foo/bar/gum/" -&gt; ""
+  * "gum" -&gt; "gum"
+  * "wi\\n32\\stuff" -&gt; "stuff"
+  * </pre>
+  * @deffunc const char * ap_filename_of_pathname(const char *pathname)
+  */ +

+ +

At the top of the header file, always include:

+

+ /**
+  * @package Name of library header
+  */ +

+ +

Doxygen uses a new HTML file for each package. The HTML files are named + {Name_of_library_header}.html, so try to be concise with your names.

+ +

For a further discussion of the possibilities please refer to + the Doxygen site.

+
+
+

Available Languages:  en 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/filters.html b/rubbos/app/apache2/manual/developer/filters.html new file mode 100644 index 00000000..6caade35 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/filters.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: filters.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/filters.html.en b/rubbos/app/apache2/manual/developer/filters.html.en new file mode 100644 index 00000000..5f6fc54a --- /dev/null +++ b/rubbos/app/apache2/manual/developer/filters.html.en @@ -0,0 +1,210 @@ + + + +How filters work in Apache 2.0 - Apache HTTP Server + + + + + +
<-
+

How filters work in Apache 2.0

+
+

Available Languages:  en 

+
+ +

Warning

+

This is a cut 'n paste job from an email + (<022501c1c529$f63a9550$7f00000a@KOJ>) and only reformatted for + better readability. It's not up to date but may be a good start for + further research.

+
+
+ +
top
+
+

Filter Types

+

There are three basic filter types (each of these is actually broken + down into two categories, but that comes later).

+ +
+
CONNECTION
+
Filters of this type are valid for the lifetime of this connection. + (AP_FTYPE_CONNECTION, AP_FTYPE_NETWORK)
+ +
PROTOCOL
+
Filters of this type are valid for the lifetime of this request from + the point of view of the client, this means that the request is valid + from the time that the request is sent until the time that the response + is received. (AP_FTYPE_PROTOCOL, + AP_FTYPE_TRANSCODE)
+ +
RESOURCE
+
Filters of this type are valid for the time that this content is used + to satisfy a request. For simple requests, this is identical to + PROTOCOL, but internal redirects and sub-requests can change + the content without ending the request. (AP_FTYPE_RESOURCE, + AP_FTYPE_CONTENT_SET)
+
+ +

It is important to make the distinction between a protocol and a + resource filter. A resource filter is tied to a specific resource, it + may also be tied to header information, but the main binding is to a + resource. If you are writing a filter and you want to know if it is + resource or protocol, the correct question to ask is: "Can this filter + be removed if the request is redirected to a different resource?" If + the answer is yes, then it is a resource filter. If it is no, then it + is most likely a protocol or connection filter. I won't go into + connection filters, because they seem to be well understood. With this + definition, a few examples might help:

+ +
+
Byterange
+
We have coded it to be inserted for all requests, and it is removed + if not used. Because this filter is active at the beginning of all + requests, it can not be removed if it is redirected, so this is a + protocol filter.
+ +
http_header
+
This filter actually writes the headers to the network. This is + obviously a required filter (except in the asis case which is special + and will be dealt with below) and so it is a protocol filter.
+ +
Deflate
+
The administrator configures this filter based on which file has been + requested. If we do an internal redirect from an autoindex page to an + index.html page, the deflate filter may be added or removed based on + config, so this is a resource filter.
+
+ +

The further breakdown of each category into two more filter types is + strictly for ordering. We could remove it, and only allow for one + filter type, but the order would tend to be wrong, and we would need to + hack things to make it work. Currently, the RESOURCE filters + only have one filter type, but that should change.

+
top
+
+

How are filters inserted?

+

This is actually rather simple in theory, but the code is + complex. First of all, it is important that everybody realize that + there are three filter lists for each request, but they are all + concatenated together. So, the first list is + r->output_filters, then r->proto_output_filters, + and finally r->connection->output_filters. These correspond + to the RESOURCE, PROTOCOL, and + CONNECTION filters respectively. The problem previously, was + that we used a singly linked list to create the filter stack, and we + started from the "correct" location. This means that if I had a + RESOURCE filter on the stack, and I added a + CONNECTION filter, the CONNECTION filter would + be ignored. This should make sense, because we would insert the connection + filter at the top of the c->output_filters list, but the end + of r->output_filters pointed to the filter that used to be + at the front of c->output_filters. This is obviously wrong. + The new insertion code uses a doubly linked list. This has the advantage + that we never lose a filter that has been inserted. Unfortunately, it comes + with a separate set of headaches.

+ +

The problem is that we have two different cases were we use subrequests. + The first is to insert more data into a response. The second is to + replace the existing response with an internal redirect. These are two + different cases and need to be treated as such.

+ +

In the first case, we are creating the subrequest from within a handler + or filter. This means that the next filter should be passed to + make_sub_request function, and the last resource filter in the + sub-request will point to the next filter in the main request. This + makes sense, because the sub-request's data needs to flow through the + same set of filters as the main request. A graphical representation + might help:

+ +
+Default_handler --> includes_filter --> byterange --> ...
+
+ +

If the includes filter creates a sub request, then we don't want the + data from that sub-request to go through the includes filter, because it + might not be SSI data. So, the subrequest adds the following:

+ +
    
+Default_handler --> includes_filter -/-> byterange --> ...
+                                    /
+Default_handler --> sub_request_core
+
+ +

What happens if the subrequest is SSI data? Well, that's easy, the + includes_filter is a resource filter, so it will be added to + the sub request in between the Default_handler and the + sub_request_core filter.

+ +

The second case for sub-requests is when one sub-request is going to + become the real request. This happens whenever a sub-request is created + outside of a handler or filter, and NULL is passed as the next filter to + the make_sub_request function.

+ +

In this case, the resource filters no longer make sense for the new + request, because the resource has changed. So, instead of starting from + scratch, we simply point the front of the resource filters for the + sub-request to the front of the protocol filters for the old request. + This means that we won't lose any of the protocol filters, neither will + we try to send this data through a filter that shouldn't see it.

+ +

The problem is that we are using a doubly-linked list for our filter + stacks now. But, you should notice that it is possible for two lists to + intersect in this model. So, you do you handle the previous pointer? + This is a very difficult question to answer, because there is no "right" + answer, either method is equally valid. I looked at why we use the + previous pointer. The only reason for it is to allow for easier + addition of new servers. With that being said, the solution I chose was + to make the previous pointer always stay on the original request.

+ +

This causes some more complex logic, but it works for all cases. My + concern in having it move to the sub-request, is that for the more + common case (where a sub-request is used to add data to a response), the + main filter chain would be wrong. That didn't seem like a good idea to + me.

+
top
+
+

Asis

+

The final topic. :-) Mod_Asis is a bit of a hack, but the + handler needs to remove all filters except for connection filters, and + send the data. If you are using mod_asis, all other + bets are off.

+
top
+
+

Explanations

+

The absolutely last point is that the reason this code was so hard to + get right, was because we had hacked so much to force it to work. I + wrote most of the hacks originally, so I am very much to blame. + However, now that the code is right, I have started to remove some + hacks. Most people should have seen that the reset_filters + and add_required_filters functions are gone. Those inserted + protocol level filters for error conditions, in fact, both functions did + the same thing, one after the other, it was really strange. Because we + don't lose protocol filters for error cases any more, those hacks went away. + The HTTP_HEADER, Content-length, and + Byterange filters are all added in the + insert_filters phase, because if they were added earlier, we + had some interesting interactions. Now, those could all be moved to be + inserted with the HTTP_IN, CORE, and + CORE_IN filters. That would make the code easier to + follow.

+
+
+

Available Languages:  en 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/hooks.html b/rubbos/app/apache2/manual/developer/hooks.html new file mode 100644 index 00000000..a4077c72 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/hooks.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: hooks.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/hooks.html.en b/rubbos/app/apache2/manual/developer/hooks.html.en new file mode 100644 index 00000000..b89e9906 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/hooks.html.en @@ -0,0 +1,239 @@ + + + +Apache 2.0 Hook Functions - Apache HTTP Server + + + + + +
<-
+

Apache 2.0 Hook Functions

+
+

Available Languages:  en 

+
+ +

Warning

+

This document is still in development and may be partially out of + date.

+
+ +

In general, a hook function is one that Apache will call at + some point during the processing of a request. Modules can + provide functions that are called, and specify when they get + called in comparison to other modules.

+
+ +
top
+
+

Creating a hook function

+

In order to create a new hook, four things need to be + done:

+ +

Declare the hook function

+

Use the AP_DECLARE_HOOK macro, which needs to be given + the return type of the hook function, the name of the hook, and the + arguments. For example, if the hook returns an int and + takes a request_rec * and an int and is + called do_something, then declare it like this:

+

+ AP_DECLARE_HOOK(int, do_something, (request_rec *r, int n)) +

+ +

This should go in a header which modules will include if + they want to use the hook.

+ + +

Create the hook structure

+

Each source file that exports a hook has a private structure + which is used to record the module functions that use the hook. + This is declared as follows:

+ +

+ APR_HOOK_STRUCT(
+ + APR_HOOK_LINK(do_something)
+ ...
+
+ ) +

+ + +

Implement the hook caller

+

The source file that exports the hook has to implement a + function that will call the hook. There are currently three + possible ways to do this. In all cases, the calling function is + called ap_run_hookname().

+ +

Void hooks

+

If the return value of a hook is void, then all the + hooks are called, and the caller is implemented like this:

+ +

+ AP_IMPLEMENT_HOOK_VOID(do_something, (request_rec *r, int n), (r, n)) +

+ +

The second and third arguments are the dummy argument + declaration and the dummy arguments as they will be used when + calling the hook. In other words, this macro expands to + something like this:

+ +

+ void ap_run_do_something(request_rec *r, int n)
+ {
+ + ...
+ do_something(r, n);
+
+ } +

+ + +

Hooks that return a value

+

If the hook returns a value, then it can either be run until + the first hook that does something interesting, like so:

+ +

+ AP_IMPLEMENT_HOOK_RUN_FIRST(int, do_something, (request_rec *r, int n), (r, n), DECLINED) +

+ +

The first hook that does not return DECLINED + stops the loop and its return value is returned from the hook + caller. Note that DECLINED is the tradition Apache + hook return meaning "I didn't do anything", but it can be + whatever suits you.

+ +

Alternatively, all hooks can be run until an error occurs. + This boils down to permitting two return values, one of + which means "I did something, and it was OK" and the other + meaning "I did nothing". The first function that returns a + value other than one of those two stops the loop, and its + return is the return value. Declare these like so:

+ +

+ AP_IMPLEMENT_HOOK_RUN_ALL(int, do_something, (request_rec *r, int n), (r, n), OK, DECLINED) +

+ +

Again, OK and DECLINED are the traditional + values. You can use what you want.

+ + + +

Call the hook callers

+

At appropriate moments in the code, call the hook caller, + like so:

+ +

+ int n, ret;
+ request_rec *r;
+
+ ret=ap_run_do_something(r, n); +

+ +
top
+
+

Hooking the hook

+

A module that wants a hook to be called needs to do two + things.

+ +

Implement the hook function

+

Include the appropriate header, and define a static function + of the correct type:

+ +

+ static int my_something_doer(request_rec *r, int n)
+ {
+ + ...
+ return OK;
+
+ } +

+ + +

Add a hook registering function

+

During initialisation, Apache will call each modules hook + registering function, which is included in the module + structure:

+ +

+ static void my_register_hooks()
+ {
+ + ap_hook_do_something(my_something_doer, NULL, NULL, HOOK_MIDDLE);
+
+ }
+
+ mode MODULE_VAR_EXPORT my_module =
+ {
+ + ...
+ my_register_hooks /* register hooks */
+
+ }; +

+ + +

Controlling hook calling order

+

In the example above, we didn't use the three arguments in + the hook registration function that control calling order. + There are two mechanisms for doing this. The first, rather + crude, method, allows us to specify roughly where the hook is + run relative to other modules. The final argument control this. + There are three possible values: HOOK_FIRST, + HOOK_MIDDLE and HOOK_LAST.

+ +

All modules using any particular value may be run in any + order relative to each other, but, of course, all modules using + HOOK_FIRST will be run before HOOK_MIDDLE + which are before HOOK_LAST. Modules that don't care + when they are run should use HOOK_MIDDLE. (I spaced + these out so people could do stuff like HOOK_FIRST-2 + to get in slightly earlier, but is this wise? - Ben)

+ +

Note that there are two more values, + HOOK_REALLY_FIRST and HOOK_REALLY_LAST. These + should only be used by the hook exporter.

+ +

The other method allows finer control. When a module knows + that it must be run before (or after) some other modules, it + can specify them by name. The second (third) argument is a + NULL-terminated array of strings consisting of the names of + modules that must be run before (after) the current module. For + example, suppose we want "mod_xyz.c" and "mod_abc.c" to run + before we do, then we'd hook as follows:

+ +

+ static void register_hooks()
+ {
+ + static const char * const aszPre[] = { "mod_xyz.c", "mod_abc.c", NULL };
+
+ ap_hook_do_something(my_something_doer, aszPre, NULL, HOOK_MIDDLE);
+
+ } +

+ +

Note that the sort used to achieve this is stable, so + ordering set by HOOK_ORDER is preserved, as far + as is possible.

+ +

Ben Laurie, 15th August 1999

+ +
+
+

Available Languages:  en 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/index.html b/rubbos/app/apache2/manual/developer/index.html new file mode 100644 index 00000000..e4d079c3 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/index.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: index.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/index.html.en b/rubbos/app/apache2/manual/developer/index.html.en new file mode 100644 index 00000000..7006579b --- /dev/null +++ b/rubbos/app/apache2/manual/developer/index.html.en @@ -0,0 +1,73 @@ + + + +Developer Documentation for Apache 2.0 - Apache HTTP Server + + + + + +
<-
+

Developer Documentation for Apache 2.0

+
+

Available Languages:  en 

+
+ +

Many of the documents on these Developer pages are lifted + from Apache 1.3's documentation. While they are all being + updated to Apache 2.0, they are in different stages of + progress. Please be patient, and point out any discrepancies or + errors on the developer/ pages directly to the + dev@httpd.apache.org mailing list.

+
+ +
top
+
top
+
+
+

Available Languages:  en 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/modules.html b/rubbos/app/apache2/manual/developer/modules.html new file mode 100644 index 00000000..33cfd0f4 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/modules.html @@ -0,0 +1,9 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: modules.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 + +URI: modules.html.ja.utf8 +Content-Language: ja +Content-type: text/html; charset=UTF-8 diff --git a/rubbos/app/apache2/manual/developer/modules.html.en b/rubbos/app/apache2/manual/developer/modules.html.en new file mode 100644 index 00000000..176a94dd --- /dev/null +++ b/rubbos/app/apache2/manual/developer/modules.html.en @@ -0,0 +1,273 @@ + + + +Converting Modules from Apache 1.3 to Apache 2.0 - Apache HTTP Server + + + + + +
<-
+

Converting Modules from Apache 1.3 to Apache 2.0

+
+

Available Languages:  en  | + ja 

+
+ +

This is a first attempt at writing the lessons I learned + when trying to convert the mod_mmap_static module to Apache + 2.0. It's by no means definitive and probably won't even be + correct in some ways, but it's a start.

+
+ +
top
+
+

The easier changes ...

+ +

Cleanup Routines

+

These now need to be of type apr_status_t and return a + value of that type. Normally the return value will be + APR_SUCCESS unless there is some need to signal an error in + the cleanup. Be aware that even though you signal an error not all code + yet checks and acts upon the error.

+ + +

Initialisation Routines

+

These should now be renamed to better signify where they sit + in the overall process. So the name gets a small change from + mmap_init to mmap_post_config. The arguments + passed have undergone a radical change and now look like

+ +
    +
  • apr_pool_t *p
  • +
  • apr_pool_t *plog
  • +
  • apr_pool_t *ptemp
  • +
  • server_rec *s
  • +
+ + +

Data Types

+

A lot of the data types have been moved into the APR. This means that some have had + a name change, such as the one shown above. The following is a brief + list of some of the changes that you are likely to have to make.

+ +
    +
  • pool becomes apr_pool_t
  • +
  • table becomes apr_table_t
  • +
+ +
top
+
+

The messier changes...

+ +

Register Hooks

+

The new architecture uses a series of hooks to provide for + calling your functions. These you'll need to add to your module + by way of a new function, static void register_hooks(void). + The function is really reasonably straightforward once you + understand what needs to be done. Each function that needs + calling at some stage in the processing of a request needs to + be registered, handlers do not. There are a number of phases + where functions can be added, and for each you can specify with + a high degree of control the relative order that the function + will be called in.

+ +

This is the code that was added to mod_mmap_static:

+
+static void register_hooks(void)
+{
+    static const char * const aszPre[]={ "http_core.c",NULL };
+    ap_hook_post_config(mmap_post_config,NULL,NULL,HOOK_MIDDLE);
+    ap_hook_translate_name(mmap_static_xlat,aszPre,NULL,HOOK_LAST);
+};
+ +

This registers 2 functions that need to be called, one in + the post_config stage (virtually every module will need this + one) and one for the translate_name phase. note that while + there are different function names the format of each is + identical. So what is the format?

+ +

+ ap_hook_phase_name(function_name, + predecessors, successors, position); +

+ +

There are 3 hook positions defined...

+ +
    +
  • HOOK_FIRST
  • +
  • HOOK_MIDDLE
  • +
  • HOOK_LAST
  • +
+ +

To define the position you use the position and then modify + it with the predecessors and successors. Each of the modifiers + can be a list of functions that should be called, either before + the function is run (predecessors) or after the function has + run (successors).

+ +

In the mod_mmap_static case I didn't care about the + post_config stage, but the mmap_static_xlat + must be called after the core module had done it's name + translation, hence the use of the aszPre to define a modifier to the + position HOOK_LAST.

+ + +

Module Definition

+

There are now a lot fewer stages to worry about when + creating your module definition. The old defintion looked + like

+ +
+module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD_MODULE_STUFF,
+    /* initializer */
+    /* dir config creater */
+    /* dir merger --- default is to override */
+    /* server config */
+    /* merge server config */
+    /* command handlers */
+    /* handlers */
+    /* filename translation */
+    /* check_user_id */
+    /* check auth */
+    /* check access */
+    /* type_checker */
+    /* fixups */
+    /* logger */
+    /* header parser */
+    /* child_init */
+    /* child_exit */
+    /* post read-request */
+};
+ +

The new structure is a great deal simpler...

+
+module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD20_MODULE_STUFF,
+    /* create per-directory config structures */
+    /* merge per-directory config structures  */
+    /* create per-server config structures    */
+    /* merge per-server config structures     */
+    /* command handlers */
+    /* handlers */
+    /* register hooks */
+};
+ +

Some of these read directly across, some don't. I'll try to + summarise what should be done below.

+ +

The stages that read directly across :

+ +
+
/* dir config creater */
+
/* create per-directory config structures */
+ +
/* server config */
+
/* create per-server config structures */
+ +
/* dir merger */
+
/* merge per-directory config structures */
+ +
/* merge server config */
+
/* merge per-server config structures */
+ +
/* command table */
+
/* command apr_table_t */
+ +
/* handlers */
+
/* handlers */
+
+ +

The remainder of the old functions should be registered as + hooks. There are the following hook stages defined so + far...

+ +
+
ap_hook_post_config
+
this is where the old _init routines get + registered
+ +
ap_hook_http_method
+
retrieve the http method from a request. (legacy)
+ +
ap_hook_open_logs
+
open any specified logs
+ +
ap_hook_auth_checker
+
check if the resource requires authorization
+ +
ap_hook_access_checker
+
check for module-specific restrictions
+ +
ap_hook_check_user_id
+
check the user-id and password
+ +
ap_hook_default_port
+
retrieve the default port for the server
+ +
ap_hook_pre_connection
+
do any setup required just before processing, but after + accepting
+ +
ap_hook_process_connection
+
run the correct protocol
+ +
ap_hook_child_init
+
call as soon as the child is started
+ +
ap_hook_create_request
+
??
+ +
ap_hook_fixups
+
last chance to modify things before generating content
+ +
ap_hook_handler
+
generate the content
+ +
ap_hook_header_parser
+
lets modules look at the headers, not used by most modules, because + they use post_read_request for this
+ +
ap_hook_insert_filter
+
to insert filters into the filter chain
+ +
ap_hook_log_transaction
+
log information about the request
+ +
ap_hook_optional_fn_retrieve
+
retrieve any functions registered as optional
+ +
ap_hook_post_read_request
+
called after reading the request, before any other phase
+ +
ap_hook_quick_handler
+
called before any request processing, used by cache modules.
+ +
ap_hook_translate_name
+
translate the URI into a filename
+ +
ap_hook_type_checker
+
determine and/or set the doc type
+
+ +
+
+

Available Languages:  en  | + ja 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/modules.html.ja.utf8 b/rubbos/app/apache2/manual/developer/modules.html.ja.utf8 new file mode 100644 index 00000000..8297d28d --- /dev/null +++ b/rubbos/app/apache2/manual/developer/modules.html.ja.utf8 @@ -0,0 +1,274 @@ + + + +モジュールの Apache 1.3 から Apache 2.0 への移植 - Apache HTTP サーバ + + + + + +
<-
+

モジュールの Apache 1.3 から Apache 2.0 への移植

+
+

Available Languages:  en  | + ja 

+
+ +

この文書は mod_mmap_static モジュールを Apache 2.0 用に移植した時に + 学んだ経験をもとに書いた、最初の手引き書です。まだまだ完全じゃないし、 + ひょっとすると間違っている部分もあるかもしれませんが、 + 取っ掛りにはなるでしょう。

+
+ +
top
+
+

簡単な変更点

+ +

クリーンナップ ルーチン

+

クリーンナップルーチンは apr_status_t 型である必要があります。 + そして、apr_status_t 型の値を返さなくてはなりません。 + クリーンナップ中のエラーを通知する必要がなければ、返り値は普通、 + ARP_SUCCESS です。たとえエラーを通知したとしても、 + すべてのコードがその通知をチェックしたり、 + エラーに応じた動作をするわけではないことに気をつけてください。

+ + + +

初期化ルーチン

+ +

初期化ルーチンは処理全体から見てしっくりくるような意味を表すように、 + 名前が変更されました。ですから、mmap_init から mmap_post_config + のようにちょっと変更されました。 + 渡される引数は大幅に変更され、次のようになりました。

+ +
    +
  • apr_pool_t *p
  • +
  • apr_pool_t *plog
  • +
  • apr_pool_t *ptemp
  • +
  • server_rec *s
  • +
+ + +

データ型

+

データ型のほとんどは APR に移されました。つまり、 + いくつかの名前が前述のように変更されています。 + 施すべき変更点の簡単な一覧を以下に示します。

+ +
    +
  • pool becomes apr_pool_t
  • +
  • table becomes apr_table_t
  • +
+ +
top
+
+

もっと厄介な変更点…

+ +

フックの登録

+

新しいアーキテクチャでは作成した関数を呼び出すのに + 一連のフックを使用します。このフックは、新しい関数 + static void register_hooks(void) を使って登録するよう、 + モジュールに書き足さなくてはなりません。 + この関数は、なにをすべきか一旦理解してしまえば、 + 十分にわかりやすいものです。 + リクエストの処理のあるステージで呼び出さなくてはならない + 関数は登録する必要があります。ハンドラは登録する必要はありません。 + 関数を登録できるフェーズはたくさんあります。 + それぞれのフェーズで、関数を呼び出す相対的な順番は、 + かなりの程度制御できます。

+ +

以下は、mod_mmap_static に追加したコードです:

+ +
+static void register_hooks(void)
+{
+    static const char * const aszPre[]={ "http_core.c",NULL };
+    ap_hook_post_config(mmap_post_config,NULL,NULL,HOOK_MIDDLE);
+    ap_hook_translate_name(mmap_static_xlat,aszPre,NULL,HOOK_LAST);
+};
+ +

ここでは呼びだすべき二つの関数を登録しています。一つは + post_config ステージ用 (ほとんどすべてのモジュール + はこれが必要です) で、もう一つは translate_name フェーズ用です。 + それぞれの関数は名前は違うけれども形式は同じであることに注意してください。 + それでは、形式はどのようになっているでしょうか?

+ +

+ ap_hook_phase_name(function_name, + predecessors, successors, position); +

+ +

三つの位置が定義されています…

+ +
    +
  • HOOK_FIRST
  • +
  • HOOK_MIDDLE
  • +
  • HOOK_LAST
  • +
+ +

位置を定義するには、上記の「位置」を指定し、 + 修飾子である「先行」と「後行」で手を加えます。 + 「先行」「後行」は、呼ばれるべき関数のリストです。 + 「先行」は関数の実行前に呼ばれるもので、 + 「後行」は実行後に呼ばれるものです。

+ +

mod_mmap_static の場合、post_config + ステージでは必要ありませんが、 + mmap_static_xlat が core モジュールが名前の変換を実行した後に + 呼ばれなければなりません。 + そこで aszPre を使って HOOK_LAST の修飾子を定義しています。

+ + +

モジュールの定義

+

モジュールの定義を作成する際に注意しなければならない + ステージの数は激減しています。古い定義は次のようになっていました。

+ +
+module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD_MODULE_STUFF,
+    /* initializer */
+    /* dir config creater */
+    /* dir merger --- default is to override */
+    /* server config */
+    /* merge server config */
+    /* command handlers */
+    /* handlers */
+    /* filename translation */
+    /* check_user_id */
+    /* check auth */
+    /* check access */
+    /* type_checker */
+    /* fixups */
+    /* logger */
+    /* header parser */
+    /* child_init */
+    /* child_exit */
+    /* post read-request */
+};
+ +

新しい構造体はとってもシンプルです…

+
+module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD20_MODULE_STUFF,
+    /* create per-directory config structures */
+    /* merge per-directory config structures  */
+    /* create per-server config structures    */
+    /* merge per-server config structures     */
+    /* command handlers */
+    /* handlers */
+    /* register hooks */
+};
+ +

このうちのいくつかは古いものから新しいものに直接読み替えられるもので、 + いくつかはそうではありません。どうすればいいのかを要約してみます。

+ +

直接読み替えられるステージ:

+ +
+
/* ディレクトリ設定作成関数 */
+
/* ディレクトリ毎設定構造体作成 */
+ +
/* サーバ設定作成関数 */
+
/* サーバ毎設定構造体作成 */
+ +
/* ディレクトリ設定マージ関数 */
+
/* ディレクトリ毎設定構造体マージ */
+ +
/* サーバ設定マージ関数 */
+
/* サーバ毎設定構造体作成マージ */
+ +
/* コマンド・テーブル */
+
/* コマンド apr_table_t */
+ +
/* ハンドラ */
+
/* ハンドラ */
+
+ +

古い関数の残りのものはフックとして登録されるべきです。 + 現時点で次のようなフック・ステージが定義されています…

+ +
+
ap_hook_post_config
+
(以前の _init ルーチンが登録されるべき場所です)
+ +
ap_hook_http_method
+
(リクエストから HTTP メソッドを取得します (互換用))
+ +
ap_hook_open_logs
+
(特定のログのオープン)
+ +
ap_hook_auth_checker
+
(リソースが権限を必要とするかどうかの確認)
+ +
ap_hook_access_checker
+
(モジュール固有の制約の確認)
+ +
ap_hook_check_user_id
+
(ユーザ ID とパスワードの確認)
+ +
ap_hook_default_port
+
(サーバのデフォルト・ポートの取得)
+ +
ap_hook_pre_connection
+
(処理の直前に必要なことを実行。ただし accept 直後に呼ばれる)
+ +
ap_hook_process_connection
+
(プロトコルの処理)
+ +
ap_hook_child_init
+
(子プロセル起動直後)
+ +
ap_hook_create_request
+
(??)
+ +
ap_hook_fixups
+
(応答内容の生成を変更するラスト・チャンス)
+ +
ap_hook_handler
+
(応答内容の生成)
+ +
ap_hook_header_parser
+
(モジュールにヘッダの照会をさせる。ほとんどのモジュールでは使われません。post_read_request を使います)
+ +
ap_hook_insert_filter
+
(フィルタ・チェインにフィルタを挿入)
+ +
ap_hook_log_transaction
+
(リクエストについての情報を記録する)
+ +
ap_hook_optional_fn_retrieve
+
(オプションとして登録された関数の取得)
+ +
ap_hook_post_read_request
+
(リクエストを読みこんだ後、他のフェーズの前に呼ばれる)
+ +
ap_hook_quick_handler
+
リクエストの処理が始まる前に呼ばれる。キャッシュモジュールが + 使用している
+ +
ap_hook_translate_name
+
(URI をファイル名に変換する)
+ +
ap_hook_type_checker
+
(文書型の決定と設定。あるいはその片方)
+
+ +
+
+

Available Languages:  en  | + ja 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/request.html b/rubbos/app/apache2/manual/developer/request.html new file mode 100644 index 00000000..ed3694f6 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/request.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: request.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/request.html.en b/rubbos/app/apache2/manual/developer/request.html.en new file mode 100644 index 00000000..1ac89d24 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/request.html.en @@ -0,0 +1,260 @@ + + + +Request Processing in Apache 2.0 - Apache HTTP Server + + + + + +
<-
+

Request Processing in Apache 2.0

+
+

Available Languages:  en 

+
+ +

Warning

+

Warning - this is a first (fast) draft that needs further + revision!

+
+ +

Several changes in Apache 2.0 affect the internal request + processing mechanics. Module authors need to be aware of these + changes so they may take advantage of the optimizations and + security enhancements.

+ +

The first major change is to the subrequest and redirect + mechanisms. There were a number of different code paths in + Apache 1.3 to attempt to optimize subrequest or redirect + behavior. As patches were introduced to 2.0, these + optimizations (and the server behavior) were quickly broken due + to this duplication of code. All duplicate code has been folded + back into ap_process_request_internal() to prevent + the code from falling out of sync again.

+ +

This means that much of the existing code was 'unoptimized'. + It is the Apache HTTP Project's first goal to create a robust + and correct implementation of the HTTP server RFC. Additional + goals include security, scalability and optimization. New + methods were sought to optimize the server (beyond the + performance of Apache 1.3) without introducing fragile or + insecure code.

+
+ +
top
+
+

The Request Processing Cycle

+

All requests pass through ap_process_request_internal() + in request.c, including subrequests and redirects. If a module + doesn't pass generated requests through this code, the author is cautioned + that the module may be broken by future changes to request + processing.

+ +

To streamline requests, the module author can take advantage + of the hooks offered to drop out of the request cycle early, or + to bypass core Apache hooks which are irrelevant (and costly in + terms of CPU.)

+
top
+
+

The Request Parsing Phase

+

Unescapes the URL

+

The request's parsed_uri path is unescaped, once and only + once, at the beginning of internal request processing.

+ +

This step is bypassed if the proxyreq flag is set, or the + parsed_uri.path element is unset. The module has no further + control of this one-time unescape operation, either failing to + unescape or multiply unescaping the URL leads to security + reprecussions.

+ + +

Strips Parent and This Elements from the + URI

+

All /../ and /./ elements are + removed by ap_getparents(). This helps to ensure + the path is (nearly) absolute before the request processing + continues.

+ +

This step cannot be bypassed.

+ + +

Initial URI Location Walk

+

Every request is subject to an + ap_location_walk() call. This ensures that + <Location> sections + are consistently enforced for all requests. If the request is an internal + redirect or a sub-request, it may borrow some or all of the processing + from the previous or parent request's ap_location_walk, so this step + is generally very efficient after processing the main request.

+ + +

translate_name

+

Modules can determine the file name, or alter the given URI + in this step. For example, mod_vhost_alias will + translate the URI's path into the configured virtual host, + mod_alias will translate the path to an alias path, + and if the request falls back on the core, the DocumentRoot is prepended to the request resource.

+ +

If all modules DECLINE this phase, an error 500 is + returned to the browser, and a "couldn't translate name" error is logged + automatically.

+ + +

Hook: map_to_storage

+

After the file or correct URI was determined, the + appropriate per-dir configurations are merged together. For + example, mod_proxy compares and merges the appropriate + <Proxy> sections. + If the URI is nothing more than a local (non-proxy) TRACE + request, the core handles the request and returns DONE. + If no module answers this hook with OK or DONE, + the core will run the request filename against the <Directory> and <Files> sections. If the request + 'filename' isn't an absolute, legal filename, a note is set for + later termination.

+ + +

URI Location Walk

+

Every request is hardened by a second + ap_location_walk() call. This reassures that a + translated request is still subjected to the configured + <Location> sections. + The request again borrows some or all of the processing from its previous + location_walk above, so this step is almost always very + efficient unless the translated URI mapped to a substantially different + path or Virtual Host.

+ + +

Hook: header_parser

+

The main request then parses the client's headers. This + prepares the remaining request processing steps to better serve + the client's request.

+ +
top
+
+

The Security Phase

+

Needs Documentation. Code is:

+ +
+switch (ap_satisfies(r)) {
+case SATISFY_ALL:
+case SATISFY_NOSPEC:
+    if ((access_status = ap_run_access_checker(r)) != 0) {
+        return decl_die(access_status, "check access", r);
+    }
+
+    if (ap_some_auth_required(r)) {
+        if (((access_status = ap_run_check_user_id(r)) != 0)
+            || !ap_auth_type(r)) {
+            return decl_die(access_status, ap_auth_type(r)
+                          ? "check user.  No user file?"
+                          : "perform authentication. AuthType not set!",
+                          r);
+        }
+
+        if (((access_status = ap_run_auth_checker(r)) != 0)
+            || !ap_auth_type(r)) {
+            return decl_die(access_status, ap_auth_type(r)
+                          ? "check access.  No groups file?"
+                          : "perform authentication. AuthType not set!",
+                          r);
+        }
+    }
+    break;
+
+case SATISFY_ANY:
+    if (((access_status = ap_run_access_checker(r)) != 0)) {
+        if (!ap_some_auth_required(r)) {
+            return decl_die(access_status, "check access", r);
+        }
+
+        if (((access_status = ap_run_check_user_id(r)) != 0)
+            || !ap_auth_type(r)) {
+            return decl_die(access_status, ap_auth_type(r)
+                          ? "check user.  No user file?"
+                          : "perform authentication. AuthType not set!",
+                          r);
+        }
+
+        if (((access_status = ap_run_auth_checker(r)) != 0)
+            || !ap_auth_type(r)) {
+            return decl_die(access_status, ap_auth_type(r)
+                          ? "check access.  No groups file?"
+                          : "perform authentication. AuthType not set!",
+                          r);
+        }
+    }
+    break;
+}
+
top
+
+

The Preparation Phase

+

Hook: type_checker

+

The modules have an opportunity to test the URI or filename + against the target resource, and set mime information for the + request. Both mod_mime and + mod_mime_magic use this phase to compare the file + name or contents against the administrator's configuration and set the + content type, language, character set and request handler. Some modules + may set up their filters or other request handling parameters at this + time.

+ +

If all modules DECLINE this phase, an error 500 is + returned to the browser, and a "couldn't find types" error is logged + automatically.

+ + +

Hook: fixups

+

Many modules are 'trounced' by some phase above. The fixups + phase is used by modules to 'reassert' their ownership or force + the request's fields to their appropriate values. It isn't + always the cleanest mechanism, but occasionally it's the only + option.

+ +
top
+
+

The Handler Phase

+

This phase is not part of the processing in + ap_process_request_internal(). Many + modules prepare one or more subrequests prior to creating any + content at all. After the core, or a module calls + ap_process_request_internal() it then calls + ap_invoke_handler() to generate the request.

+ +

Hook: insert_filter

+

Modules that transform the content in some way can insert + their values and override existing filters, such that if the + user configured a more advanced filter out-of-order, then the + module can move its order as need be. There is no result code, + so actions in this hook better be trusted to always succeed.

+ + +

Hook: handler

+

The module finally has a chance to serve the request in its + handler hook. Note that not every prepared request is sent to + the handler hook. Many modules, such as mod_autoindex, + will create subrequests for a given URI, and then never serve the + subrequest, but simply lists it for the user. Remember not to + put required teardown from the hooks above into this module, + but register pool cleanups against the request pool to free + resources as required.

+ +
+
+

Available Languages:  en 

+
+ \ No newline at end of file diff --git a/rubbos/app/apache2/manual/developer/thread_safety.html b/rubbos/app/apache2/manual/developer/thread_safety.html new file mode 100644 index 00000000..50f07bf3 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/thread_safety.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: thread_safety.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 diff --git a/rubbos/app/apache2/manual/developer/thread_safety.html.en b/rubbos/app/apache2/manual/developer/thread_safety.html.en new file mode 100644 index 00000000..82b4e5a7 --- /dev/null +++ b/rubbos/app/apache2/manual/developer/thread_safety.html.en @@ -0,0 +1,285 @@ + + + +Apache 2.0 Thread Safety Issues - Apache HTTP Server + + + + + +
<-
+

Apache 2.0 Thread Safety Issues

+
+

Available Languages:  en 

+
+ +

When using any of the threaded mpms in Apache 2.0 it is important + that every function called from Apache be thread safe. When linking in 3rd + party extensions it can be difficult to determine whether the resulting + server will be thread safe. Casual testing generally won't tell you this + either as thread safety problems can lead to subtle race conditons that + may only show up in certain conditions under heavy load.

+
+ +
top
+
+

Global and static variables

+

When writing your module or when trying to determine if a module or + 3rd party library is thread safe there are some common things to keep in + mind.

+ +

First, you need to recognize that in a threaded model each individual + thread has its own program counter, stack and registers. Local variables + live on the stack, so those are fine. You need to watch out for any + static or global variables. This doesn't mean that you are absolutely not + allowed to use static or global variables. There are times when you + actually want something to affect all threads, but generally you need to + avoid using them if you want your code to be thread safe.

+ +

In the case where you have a global variable that needs to be global and + accessed by all threads, be very careful when you update it. If, for + example, it is an incrementing counter, you need to atomically increment + it to avoid race conditions with other threads. You do this using a mutex + (mutual exclusion). Lock the mutex, read the current value, increment it + and write it back and then unlock the mutex. Any other thread that wants + to modify the value has to first check the mutex and block until it is + cleared.

+ +

If you are using APR, have a look + at the apr_atomic_* functions and the + apr_thread_mutex_* functions.

+ +
top
+
+

errno

+

This is a common global variable that holds the error number of the + last error that occurred. If one thread calls a low-level function that + sets errno and then another thread checks it, we are bleeding error + numbers from one thread into another. To solve this, make sure your + module or library defines _REENTRANT or is compiled with + -D_REENTRANT. This will make errno a per-thread variable + and should hopefully be transparent to the code. It does this by doing + something like this:

+ +

+ #define errno (*(__errno_location())) +

+ +

which means that accessing errno will call + __errno_location() which is provided by the libc. Setting + _REENTRANT also forces redefinition of some other functions + to their *_r equivalents and sometimes changes + the common getc/putc macros into safer function + calls. Check your libc documentation for specifics. Instead of, or in + addition to _REENTRANT the symbols that may affect this are + _POSIX_C_SOURCE, _THREAD_SAFE, + _SVID_SOURCE, and _BSD_SOURCE.

+
top
+
+

Common standard troublesome functions

+

Not only do things have to be thread safe, but they also have to be + reentrant. strtok() is an obvious one. You call it the first + time with your delimiter which it then remembers and on each subsequent + call it returns the next token. Obviously if multiple threads are + calling it you will have a problem. Most systems have a reentrant version + of of the function called strtok_r() where you pass in an + extra argument which contains an allocated char * which the + function will use instead of its own static storage for maintaining + the tokenizing state. If you are using APR you can use apr_strtok().

+ +

crypt() is another function that tends to not be reentrant, + so if you run across calls to that function in a library, watch out. On + some systems it is reentrant though, so it is not always a problem. If + your system has crypt_r() chances are you should be using + that, or if possible simply avoid the whole mess by using md5 instead.

+ +
top
+
+

Common 3rd Party Libraries

+

The following is a list of common libraries that are used by 3rd party + Apache modules. You can check to see if your module is using a potentially + unsafe library by using tools such as ldd(1) and + nm(1). For PHP, for example, + try this:

+ +

+ % ldd libphp4.so
+ libsablot.so.0 => /usr/local/lib/libsablot.so.0 (0x401f6000)
+ libexpat.so.0 => /usr/lib/libexpat.so.0 (0x402da000)
+ libsnmp.so.0 => /usr/lib/libsnmp.so.0 (0x402f9000)
+ libpdf.so.1 => /usr/local/lib/libpdf.so.1 (0x40353000)
+ libz.so.1 => /usr/lib/libz.so.1 (0x403e2000)
+ libpng.so.2 => /usr/lib/libpng.so.2 (0x403f0000)
+ libmysqlclient.so.11 => /usr/lib/libmysqlclient.so.11 (0x40411000)
+ libming.so => /usr/lib/libming.so (0x40449000)
+ libm.so.6 => /lib/libm.so.6 (0x40487000)
+ libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x404a8000)
+ libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0x404e7000)
+ libcrypt.so.1 => /lib/libcrypt.so.1 (0x40505000)
+ libssl.so.2 => /lib/libssl.so.2 (0x40532000)
+ libcrypto.so.2 => /lib/libcrypto.so.2 (0x40560000)
+ libresolv.so.2 => /lib/libresolv.so.2 (0x40624000)
+ libdl.so.2 => /lib/libdl.so.2 (0x40634000)
+ libnsl.so.1 => /lib/libnsl.so.1 (0x40637000)
+ libc.so.6 => /lib/libc.so.6 (0x4064b000)
+ /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000) +

+ +

In addition to these libraries you will need to have a look at any + libraries linked statically into the module. You can use nm(1) + to look for individual symbols in the module.

+
top
+
+

Library List

+

Please drop a note to dev@httpd.apache.org + if you have additions or corrections to this list.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
LibraryVersionThread Safe?Notes
ASpell/PSpell ?
Berkeley DB3.x, 4.xYesBe careful about sharing a connection across threads.
bzip2 YesBoth low-level and high-level APIs are thread-safe. However, + high-level API requires thread-safe access to errno.
cdb ?
C-Client Perhapsc-client uses strtok() and + gethostbyname() which are not thread-safe on most C + library implementations. c-client's static data is meant to be shared + across threads. If strtok() and + gethostbyname() are thread-safe on your OS, c-client + may be thread-safe.
cpdflib ?
libcrypt ?
Expat YesNeed a separate parser instance per thread
FreeTDS ?
FreeType ?
GD 1.8.x ?
GD 2.0.x ?
gdbm NoErrors returned via a static gdbm_error + variable
ImageMagick5.2.2YesImageMagick docs claim it is thread safe since version 5.2.2 (see Change log). +
Imlib2 ?
libjpegv6b?
libmysqlclient YesUse mysqlclient_r library variant to ensure thread-safety. For + more information, please read http://www.mysql.com/doc/en/Threaded_clients.html.
Ming0.2a?
Net-SNMP5.0.x?
OpenLDAP2.1.xYesUse ldap_r library variant to ensure + thread-safety.
OpenSSL0.9.6gYesRequires proper usage of CRYPTO_num_locks, + CRYPTO_set_locking_callback, + CRYPTO_set_id_callback
liboci8 (Oracle 8+)8.x,9.x?
pdflib5.0.xYesPDFLib docs claim it is thread safe; changes.txt indicates it + has been partially thread-safe since V1.91: http://www.pdflib.com/products/pdflib/index.html.
libpng1.0.x?
libpng1.2.x?
libpq (PostgreSQL)7.xYesDon't share connections across threads and watch out for + crypt() calls
Sablotron0.95?
zlib1.1.4YesRelies upon thread-safe zalloc and zfree functions Default is to + use libc's calloc/free which are thread-safe.
+
+
+

Available Languages:  en 

+
+ \ No newline at end of file -- cgit 1.2.3-korg