From cc40af334e619bb549038238507407866f774f8f Mon Sep 17 00:00:00 2001
From: hongbotian
Date: Mon, 30 Nov 2015 01:35:09 -0500
Subject: upload apache

JIRA: BOTTLENECK-10

Change-Id: I67eae31de6dc824097dfa56ab454ba36fdd23a2c
Signed-off-by: hongbotian
---
 .../apache2/manual/rewrite/rewrite_guide.html.en | 788 +++++++++++++++++++++
 1 file changed, 788 insertions(+)
 create mode 100644 rubbos/app/apache2/manual/rewrite/rewrite_guide.html.en

diff --git a/rubbos/app/apache2/manual/rewrite/rewrite_guide.html.en b/rubbos/app/apache2/manual/rewrite/rewrite_guide.html.en
new file mode 100644
index 00000000..527cdac2
--- /dev/null
+++ b/rubbos/app/apache2/manual/rewrite/rewrite_guide.html.en
@@ -0,0 +1,788 @@
+URL Rewriting Guide - Apache HTTP Server

URL Rewriting Guide


Available Languages:  en 

+ + +

This document supplements the mod_rewrite reference documentation. It describes how one can use Apache's mod_rewrite to solve typical URL-based problems with which webmasters are commonly confronted. We give detailed descriptions on how to solve each problem by configuring URL rewriting rulesets.

+ +
ATTENTION: Depending on your server configuration it may be necessary to slightly change the examples for your situation, e.g. adding the [PT] flag when additionally using mod_alias and mod_userdir, or rewriting a ruleset to fit in .htaccess context instead of per-server context. Always try to understand what a particular ruleset really does before you use it. This avoids many problems.
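As a rough illustration of the context difference mentioned above, the sketch below shows the same rewrite written for per-server context and for a .htaccess file placed in the DocumentRoot. The file names foo.html and bar.html are placeholders chosen for this sketch; the point is only that in per-directory context the directory prefix, including the leading slash, is stripped before the pattern is matched.

```apache
# Per-server context (httpd.conf): the pattern sees the full URL-path,
# so it starts with a slash.
RewriteEngine on
RewriteRule   ^/foo\.html$  /bar.html

# Per-directory context (.htaccess in the DocumentRoot): the directory
# prefix is stripped first, so the pattern must NOT start with a slash.
RewriteEngine on
RewriteRule   ^foo\.html$   /bar.html
```

A rule copied from one context to the other without adjusting the pattern will usually just never match, which is why the examples in this guide may need adapting.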
+ +
+ +

Canonical URLs

+ + + +
+ +

On some webservers there is more than one URL for a resource. Usually there are canonical URLs (which should actually be used and distributed) and those which are just shortcuts, internal ones, etc. Independent of which URL the user supplied with the request, the user should finally see only the canonical one.

+ +
+ +

We do an external HTTP redirect for all non-canonical URLs to fix them in the location field of the browser and for all subsequent requests. In the example ruleset below we replace /~user by the canonical /u/user and fix a missing trailing slash for /u/user.

+ +
+RewriteRule   ^/~([^/]+)/?(.*)    /u/$1/$2  [R]
+RewriteRule   ^/([uge])/([^/]+)$  /$1/$2/   [R]
+ +

Canonical Hostnames

+ +
+ +
The goal of this rule is to force the use of a particular hostname, in preference to other hostnames which may be used to reach the same site. For example, if you wish to force the use of fully.qualified.domain.name instead of any other hostname that reaches the site, you might use a variant of the following recipe.
+ +
+ +

For sites running on a port other than 80:

+RewriteCond %{HTTP_HOST}   !^fully\.qualified\.domain\.name [NC]
+RewriteCond %{HTTP_HOST}   !^$
+RewriteCond %{SERVER_PORT} !^80$
+RewriteRule ^/(.*)         http://fully.qualified.domain.name:%{SERVER_PORT}/$1 [L,R]
+ +

And for a site running on port 80:

+RewriteCond %{HTTP_HOST}   !^fully\.qualified\.domain\.name [NC]
+RewriteCond %{HTTP_HOST}   !^$
+RewriteRule ^/(.*)         http://fully.qualified.domain.name/$1 [L,R]
+ +

Moved DocumentRoot

+ + + +
+ +

Usually the DocumentRoot of the webserver directly relates to the URL "/". But often this data is not really of top-level priority. For example, you may wish for visitors, on first entering a site, to go to a particular subdirectory /about/. This may be accomplished using the following ruleset:

+ +
+ +

We redirect the URL / to /about/:

+ +
+RewriteEngine on
+RewriteRule   ^/$  /about/  [R]
+ +

Note that this can also be handled using the RedirectMatch directive:

+ +

+RedirectMatch ^/$ http://example.com/about/

+ +

Trailing Slash Problem

+ + + +
+ +

The vast majority of "trailing slash" problems can be dealt with using the techniques discussed in the FAQ entry. However, occasionally, there is a need to use mod_rewrite to handle a case where a missing trailing slash causes a URL to fail. This can happen, for example, after a series of complex rewrite rules.

+ +
+ +

The solution to this subtle problem is to let the server add the trailing slash automatically. To do this correctly we have to use an external redirect, so the browser correctly requests subsequent images etc. If we only did an internal rewrite, this would only work for the directory page, but would go wrong when any images are included into this page with relative URLs, because the browser would request each in-lined object relative to the wrong directory. For instance, a request for image.gif in /~quux/foo/index.html would become /~quux/image.gif without the external redirect!

+ +

So, to do this trick we write:

+ +
+RewriteEngine  on
+RewriteBase    /~quux/
+RewriteRule    ^foo$  foo/  [R]
+ +

Alternatively, you can put the following in a top-level .htaccess file in the content directory. But note that this creates some processing overhead.

+ +
+RewriteEngine  on
+RewriteBase    /~quux/
+RewriteCond    %{REQUEST_FILENAME}  -d
+RewriteRule    ^(.+[^/])$           $1/  [R]
+ +

Move Homedirs to Different Webserver

+ + + +
+ +

Many webmasters have asked for a solution to the following situation: they want to redirect all homedirs on a webserver to another webserver. They usually need such things when establishing a newer webserver which will replace the old one over time.

+ +
+ +

The solution is trivial with mod_rewrite. On the old webserver we just redirect all /~user/anypath URLs to http://newserver/~user/anypath.

+ +
+RewriteEngine on
+RewriteRule   ^/~(.+)  http://newserver/~$1  [R,L]
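Since this case needs no rewrite conditions, a comparable redirect can probably also be expressed with mod_alias alone. The following is a minimal sketch under that assumption, reusing the placeholder host newserver from the ruleset above:

```apache
# mod_alias alternative: redirect every /~user/anypath URL to the
# same path on the new server ("newserver" is a placeholder host).
RedirectMatch ^/~(.+) http://newserver/~$1
```

Which variant fits better depends on whether other mod_rewrite rules on the old server also need to see these URLs.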
+ +

Search pages in more than one directory

+ + + +
+ +

Sometimes it is necessary to let the webserver search for pages in more than one directory. Here MultiViews or other techniques cannot help.

+ +
+ +

We program an explicit ruleset which searches for the files in the directories.

+ +
+RewriteEngine on
+#   first try to find it in dir1/...
+#   ...and if found stop and be happy:
+RewriteCond         /your/docroot/dir1/%{REQUEST_FILENAME}  -f
+RewriteRule  ^(.+)  /your/docroot/dir1/$1  [L]
+#   second try to find it in dir2/...
+#   ...and if found stop and be happy:
+RewriteCond         /your/docroot/dir2/%{REQUEST_FILENAME}  -f
+RewriteRule  ^(.+)  /your/docroot/dir2/$1  [L]
+#   else go on for other Alias or ScriptAlias directives,
+#   etc.
+RewriteRule   ^(.+)  -  [PT]
+ +

Set Environment Variables According To URL Parts

+ + + +
+ +

Perhaps you want to keep status information between requests and use the URL to encode it. But you don't want to use a CGI wrapper for all pages just to strip out this information.

+ +
+ +

We use a rewrite rule to strip out the status information and remember it via an environment variable which can be later dereferenced from within XSSI or CGI. This way a URL /foo/S=java/bar/ gets translated to /foo/bar/ and the environment variable named STATUS is set to the value "java".

+ +
+RewriteEngine on
+RewriteRule   ^(.*)/S=([^/]+)/(.*)    $1/$3 [E=STATUS:$2]
+ +

Virtual User Hosts

+ + + +
+ +

Assume that you want to provide www.username.host.com for the homepage of username via just DNS A records to the same machine and without any virtualhosts on this machine.

+ +
+ +

For HTTP/1.0 requests there is no solution, but for HTTP/1.1 requests which contain a Host: HTTP header we can use the following ruleset to rewrite http://www.username.host.com/anypath internally to /home/username/anypath:

+ +
+RewriteEngine on
+RewriteCond   %{HTTP_HOST}                 ^www\.[^.]+\.host\.com$
+RewriteRule   ^(.+)                        %{HTTP_HOST}$1          [C]
+RewriteRule   ^www\.([^.]+)\.host\.com(.*) /home/$1$2
+ +

Redirect Homedirs For Foreigners

+ + + +
+ +

We want to redirect homedir URLs to another webserver when the requesting user is not in the local domain ourdomain.com. This is sometimes used in virtual host contexts.

+ +
+ +

Just a rewrite condition:

+ +
+RewriteEngine on
+RewriteCond   %{REMOTE_HOST}  !^.+\.ourdomain\.com$
+RewriteRule   ^(/~.+)         http://www.somewhere.com$1 [R,L]
+ +

Redirecting Anchors

+ + + +
+ +

By default, redirecting to an HTML anchor doesn't work, because mod_rewrite escapes the # character, turning it into %23. This, in turn, breaks the redirection.

+ +
+ +

Use the [NE] flag on the RewriteRule. NE stands for No Escape.
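A minimal sketch of such a rule, written for per-server context, might look as follows. The page names and the anchor name are hypothetical and only serve to show where the flag goes:

```apache
RewriteEngine on
# Redirect the old page to a section of the new one. NE keeps the "#"
# from being escaped to %23 in the redirect target.
# (old-page.html, new-page.html and "section2" are placeholder names.)
RewriteRule   ^/old-page\.html$  /new-page.html#section2  [NE,R]
```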

+ +

Time-Dependent Rewriting

+ + + +
+ +

For tricks like time-dependent content, a lot of webmasters still use CGI scripts which, for instance, redirect to specialized pages. How can it be done via mod_rewrite?

+ +
+ +

There are a lot of variables named TIME_xxx for rewrite conditions. In conjunction with the special lexicographic comparison patterns <STRING, >STRING and =STRING we can do time-dependent redirects:

+ +
+RewriteEngine on
+RewriteCond   %{TIME_HOUR}%{TIME_MIN} >0700
+RewriteCond   %{TIME_HOUR}%{TIME_MIN} <1900
+RewriteRule   ^foo\.html$             foo.day.html
+RewriteRule   ^foo\.html$             foo.night.html
+ +

This provides the content of foo.day.html under the URL foo.html from 07:00-19:00 and at the remaining time the contents of foo.night.html. Just a nice feature for a homepage...
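Other TIME_xxx variables can be tested in the same way. As a hedged sketch, the ruleset below serves a different page on weekends by matching TIME_WDAY, the day of the week as a number from 0 (Sunday) to 6 (Saturday). The file name foo.weekend.html is an assumption made for this example.

```apache
RewriteEngine on
# On Saturday (6) and Sunday (0) serve the weekend variant instead.
# foo.weekend.html is a hypothetical file name.
RewriteCond   %{TIME_WDAY}  ^[06]$
RewriteRule   ^foo\.html$   foo.weekend.html
```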

+ +

Backward Compatibility for YYYY to XXXX migration

+ + + +
+ +

How can we make URLs backward compatible (still existing virtually) after migrating document.YYYY to document.XXXX, e.g. after translating a bunch of .html files to .phtml?

+ +
+ +

We just rewrite the name to its basename and test for existence of the new extension. If it exists, we take that name, else we rewrite the URL to its original state.

+ + +
+#   backward compatibility ruleset for
+#   rewriting document.html to document.phtml
+#   when and only when document.phtml exists
+#   but no longer document.html
+RewriteEngine on
+RewriteBase   /~quux/
+#   parse out basename, but remember the fact
+RewriteRule   ^(.*)\.html$              $1      [C,E=WasHTML:yes]
+#   rewrite to document.phtml if exists
+RewriteCond   %{REQUEST_FILENAME}.phtml -f
+RewriteRule   ^(.*)$ $1.phtml                   [S=1]
+#   else reverse the previous basename cutout
+RewriteCond   %{ENV:WasHTML}            ^yes$
+RewriteRule   ^(.*)$ $1.html
+ +

Content Handling

+ + + +

From Old to New (internal)

+ + + +
+ +

Assume we have recently renamed the page foo.html to bar.html and now want to provide the old URL for backward compatibility. In fact, we want users of the old URL not even to notice that the page was renamed.

+ +
+ +

We rewrite the old URL to the new one internally via the following rule:

+ +
+RewriteEngine  on
+RewriteBase    /~quux/
+RewriteRule    ^foo\.html$  bar.html
+ + + +

From Old to New (external)

+ + + +
+ +

Assume again that we have recently renamed the page foo.html to bar.html and now want to provide the old URL for backward compatibility. But this time we want the users of the old URL to be pointed to the new one, i.e. their browser's Location field should change, too.

+ +
+ +

We force an HTTP redirect to the new URL, which leads to a change in the browser's and thus the user's view:

+ +
+RewriteEngine  on
+RewriteBase    /~quux/
+RewriteRule    ^foo\.html$  bar.html  [R]
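The plain [R] flag issues a temporary redirect (HTTP status 302). If the rename is meant to be permanent, a status code can be given explicitly. The following is a sketch of that variant of the ruleset above, not a required change:

```apache
RewriteEngine  on
RewriteBase    /~quux/
# R=301 sends "301 Moved Permanently" instead of the default 302;
# L stops further rule processing once the redirect is decided.
RewriteRule    ^foo\.html$  bar.html  [R=301,L]
```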
+ + + +

From Static to Dynamic

+ + + +
+ +

How can we transform a static page foo.html into a dynamic variant foo.cgi in a seamless way, i.e. without notice by the browser/user?

+ +
+ +

We just rewrite the URL to the CGI-script and force the correct MIME-type so it gets really run as a CGI-script. This way a request to /~quux/foo.html internally leads to the invocation of /~quux/foo.cgi.

+ +
+RewriteEngine  on
+RewriteBase    /~quux/
+RewriteRule    ^foo\.html$  foo.cgi  [T=application/x-httpd-cgi]
+ + +

Access Restriction

+ + + +

Blocking of Robots

+ + + +
+ +

How can we block a really annoying robot from retrieving pages of a specific webarea? A /robots.txt file containing entries of the "Robot Exclusion Protocol" is typically not enough to get rid of such a robot.

+ +
+ +

We use a ruleset which forbids the URLs of the webarea /~quux/foo/arc/ (perhaps a very deep directory indexed area where the robot traversal would create big server load). We have to make sure that we forbid access only to the particular robot, i.e. just forbidding the host where the robot runs is not enough. This would block users from this host, too. We accomplish this by also matching the User-Agent HTTP header information.

+ +
+RewriteCond %{HTTP_USER_AGENT}   ^NameOfBadRobot.*
+RewriteCond %{REMOTE_ADDR}       ^123\.45\.67\.[8-9]$
+RewriteRule ^/~quux/foo/arc/.+   -   [F]
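Consecutive RewriteCond lines are combined with a logical AND by default, which is what the ruleset above relies on. To block several robots by name regardless of their address, the conditions can instead be joined with the [OR] flag. In the sketch below, the second robot name is a made-up placeholder and [NC] makes the match case-insensitive:

```apache
# Forbid the area to either robot, whatever host it comes from.
# "AnotherBadRobot" is a hypothetical User-Agent prefix.
RewriteCond %{HTTP_USER_AGENT}   ^NameOfBadRobot   [NC,OR]
RewriteCond %{HTTP_USER_AGENT}   ^AnotherBadRobot  [NC]
RewriteRule ^/~quux/foo/arc/.+   -   [F]
```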
+ + + +

Blocked Inline-Images

+ + + +
+ +

Assume we have under http://www.quux-corp.de/~quux/ some pages with inlined GIF graphics. These graphics are nice, so others directly incorporate them via hyperlinks to their pages. We don't like this practice because it adds useless traffic to our server.

+ +
+ +

While we cannot 100% protect the images from inclusion, we can at least restrict the cases where the browser sends an HTTP Referer header.

+ +
+RewriteCond %{HTTP_REFERER} !^$
+RewriteCond %{HTTP_REFERER} !^http://www\.quux-corp\.de/~quux/.*$ [NC]
+RewriteRule .*\.gif$        -                                    [F]
+ +
+RewriteCond %{HTTP_REFERER}         !^$
+RewriteCond %{HTTP_REFERER}         !.*/foo-with-gif\.html$
+RewriteRule ^inlined-in-foo\.gif$   -                        [F]
+ + + +

Proxy Deny

+ + + +
+ +

How can we forbid a certain host or even a user of a special host from using the Apache proxy?

+ +
+ +

We first have to make sure mod_rewrite is below(!) mod_proxy in the Configuration file when compiling the Apache webserver. This way it gets called before mod_proxy. Then we configure the following for a host-dependent deny...

+ +
+RewriteCond %{REMOTE_HOST} ^badhost\.mydomain\.com$
+RewriteRule !^http://[^/.]+\.mydomain\.com.*  - [F]
+ +

...and this one for a user@host-dependent deny:

+ +
+RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST}  ^badguy@badhost\.mydomain\.com$
+RewriteRule !^http://[^/.]+\.mydomain\.com.*  - [F]
+ + + +


+ + + +

External Rewriting Engine

+ + + +
+ +

A FAQ: How can we solve the FOO/BAR/QUUX/etc. problem? There seems to be no solution using mod_rewrite...

+ +
+ +

Use an external RewriteMap, i.e. a program which acts like a RewriteMap. It is run once on startup of Apache, receives the requested URLs on STDIN and has to put the resulting (usually rewritten) URL on STDOUT (same order!).

+ +
+RewriteEngine on
+RewriteMap    quux-map       prg:/path/to/
+RewriteRule   ^/~quux/(.*)$  /~quux/${quux-map:$1}
+ +
+#   disable buffered I/O, which would otherwise
+#   deadlock the Apache server
+$| = 1;
+#   read URLs one per line from stdin and
+#   generate substitution URL on stdout
+while (<>) {
+    s|^foo/|bar/|;
+    print $_;
+}
+ +

This is a demonstration-only example and just rewrites all URLs /~quux/foo/... to /~quux/bar/.... Actually you can program whatever you like. But notice that while such maps can also be used by an average user, only the system administrator can define them.
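For a fixed mapping like this one, an external program is not strictly needed. A plain text map may be enough, as in the sketch below. The map file path and its contents are assumptions made for this example; the `|$1` part supplies a default so that directory names without a map entry are left unchanged.

```apache
# /path/to/quux-map.txt (hypothetical file) would contain one
# "key value" pair per line, for example:
#     foo   bar
RewriteEngine on
RewriteMap    quux-map  txt:/path/to/quux-map.txt
# Look up the first path segment; fall back to it if there is no entry.
RewriteRule   ^/~quux/([^/]+)/(.*)$  /~quux/${quux-map:$1|$1}/$2
```

Like the prg: map above, the RewriteMap directive itself has to be placed in the server configuration, not in a .htaccess file.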

+ + + +


+ \ No newline at end of file -- cgit 1.2.3-korg