author | hongbotian <hongbo.tianhongbo@huawei.com> | 2015-11-30 01:45:08 -0500 |
---|---|---|
committer | hongbotian <hongbo.tianhongbo@huawei.com> | 2015-11-30 01:45:08 -0500 |
commit | e8ec7aa8e38a93f5b034ac74cebce5de23710317 (patch) | |
tree | aa031937bf856c1f8d6ad7877b8d2cb0224da5ef /rubbos/app/httpd-2.0.64/srclib/pcre/doc/pcretest.txt | |
parent | cc40af334e619bb549038238507407866f774f8f (diff) |
upload http
JIRA: BOTTLENECK-10
Change-Id: I7598427ff904df438ce77c2819ee48ac75ffa8da
Signed-off-by: hongbotian <hongbo.tianhongbo@huawei.com>
Diffstat (limited to 'rubbos/app/httpd-2.0.64/srclib/pcre/doc/pcretest.txt')
-rw-r--r-- | rubbos/app/httpd-2.0.64/srclib/pcre/doc/pcretest.txt | 319 |
1 files changed, 319 insertions, 0 deletions
diff --git a/rubbos/app/httpd-2.0.64/srclib/pcre/doc/pcretest.txt b/rubbos/app/httpd-2.0.64/srclib/pcre/doc/pcretest.txt
new file mode 100644
index 00000000..0e13b6c6
--- /dev/null
+++ b/rubbos/app/httpd-2.0.64/srclib/pcre/doc/pcretest.txt
@@ -0,0 +1,319 @@
+NAME
+     pcretest - a program for testing Perl-compatible regular
+     expressions.
+
+
+
+SYNOPSIS
+     pcretest [-d] [-i] [-m] [-o osize] [-p] [-t] [source] [des-
+     tination]
+
+     pcretest was written as a test program for the PCRE regular
+     expression library itself, but it can also be used for
+     experimenting with regular expressions. This man page
+     describes the features of the test program; for details of
+     the regular expressions themselves, see the pcre man page.
+
+
+
+OPTIONS
+     -d        Behave as if each regex had the /D modifier (see
+               below); the internal form is output after compila-
+               tion.
+
+     -i        Behave as if each regex had the /I modifier;
+               information about the compiled pattern is given
+               after compilation.
+
+     -m        Output the size of each compiled pattern after it
+               has been compiled. This is equivalent to adding /M
+               to each regular expression. For compatibility with
+               earlier versions of pcretest, -s is a synonym for
+               -m.
+
+     -o osize  Set the number of elements in the output vector
+               that is used when calling PCRE to be osize. The
+               default value is 45, which is enough for 14 cap-
+               turing subexpressions. The vector size can be
+               changed for individual matching calls by including
+               \O in the data line (see below).
+
+     -p        Behave as if each regex has /P modifier; the POSIX
+               wrapper API is used to call PCRE. None of the
+               other options has any effect when -p is set.
+
+     -t        Run each compile, study, and match 20000 times
+               with a timer, and output resulting time per com-
+               pile or match (in milliseconds). Do not set -t
+               with -m, because you will then get the size output
+               20000 times and the timing will be distorted.
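As a rough check on the -o default described above: pcre_exec() uses the first two-thirds of the output vector for (start, end) offset pairs and keeps the last third as workspace, so 45 elements give 15 pairs, that is, the whole match plus 14 captures. The Python sketch below only reproduces that arithmetic; the function name is hypothetical and is not part of pcretest.

```python
def max_capturing_groups(osize: int) -> int:
    """Return how many capturing subexpressions fit in an output vector.

    Assumes the pcre_exec() convention: one third of the vector is
    workspace, and each remaining pair of elements holds one
    (start, end) offset. Pair 0 is the whole match.
    """
    pairs = osize // 3          # usable (start, end) pairs
    return max(pairs - 1, 0)    # subtract the pair for the whole match


print(max_capturing_groups(45))  # 14, matching the documented default
```

A vector set with -o or \O that is too small for the pattern would, under the same assumption, simply report fewer captured substrings.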
+
+
+
+DESCRIPTION
+     If pcretest is given two filename arguments, it reads from
+     the first and writes to the second. If it is given only one
+
+
+
+
+SunOS 5.8 Last change: 1
+
+
+
+     filename argument, it reads from that file and writes to
+     stdout. Otherwise, it reads from stdin and writes to stdout,
+     and prompts for each line of input, using "re>" to prompt
+     for regular expressions, and "data>" to prompt for data
+     lines.
+
+     The program handles any number of sets of input on a single
+     input file. Each set starts with a regular expression, and
+     continues with any number of data lines to be matched
+     against the pattern. An empty line signals the end of the
+     data lines, at which point a new regular expression is read.
+     The regular expressions are given enclosed in any non-
+     alphameric delimiters other than backslash, for example
+
+       /(a|bc)x+yz/
+
+     White space before the initial delimiter is ignored. A regu-
+     lar expression may be continued over several input lines, in
+     which case the newline characters are included within it. It
+     is possible to include the delimiter within the pattern by
+     escaping it, for example
+
+       /abc\/def/
+
+     If you do so, the escape and the delimiter form part of the
+     pattern, but since delimiters are always non-alphameric,
+     this does not affect its interpretation. If the terminating
+     delimiter is immediately followed by a backslash, for exam-
+     ple,
+
+       /abc/\
+
+     then a backslash is added to the end of the pattern. This is
+     done to provide a way of testing the error condition that
+     arises if a pattern finishes with a backslash, because
+
+       /abc\/
+
+     is interpreted as the first line of a pattern that starts
+     with "abc/", causing pcretest to read the next line as a
+     continuation of the regular expression.
+
+
+
+PATTERN MODIFIERS
+     The pattern may be followed by i, m, s, or x to set the
+     PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED
+     options, respectively.
+     For example:
+
+       /caseless/i
+
+     These modifier letters have the same effect as they do in
+     Perl. There are others which set PCRE options that do not
+     correspond to anything in Perl: /A, /E, and /X set
+     PCRE_ANCHORED, PCRE_DOLLAR_ENDONLY, and PCRE_EXTRA respec-
+     tively.
+
+     Searching for all possible matches within each subject
+     string can be requested by the /g or /G modifier. After
+     finding a match, PCRE is called again to search the
+     remainder of the subject string. The difference between /g
+     and /G is that the former uses the startoffset argument to
+     pcre_exec() to start searching at a new point within the
+     entire string (which is in effect what Perl does), whereas
+     the latter passes over a shortened substring. This makes a
+     difference to the matching process if the pattern begins
+     with a lookbehind assertion (including \b or \B).
+
+     If any call to pcre_exec() in a /g or /G sequence matches an
+     empty string, the next call is done with the PCRE_NOTEMPTY
+     and PCRE_ANCHORED flags set in order to search for another,
+     non-empty, match at the same point. If this second match
+     fails, the start offset is advanced by one, and the normal
+     match is retried. This imitates the way Perl handles such
+     cases when using the /g modifier or the split() function.
+
+     There are a number of other modifiers for controlling the
+     way pcretest operates.
+
+     The /+ modifier requests that as well as outputting the sub-
+     string that matched the entire pattern, pcretest should in
+     addition output the remainder of the subject string. This is
+     useful for tests where the subject contains multiple copies
+     of the same substring.
+
+     The /L modifier must be followed directly by the name of a
+     locale, for example,
+
+       /pattern/Lfr
+
+     For this reason, it must be the last modifier letter.
+     The given locale is set, pcre_maketables() is called to build a
+     set of character tables for the locale, and this is then
+     passed to pcre_compile() when compiling the regular expres-
+     sion. Without an /L modifier, NULL is passed as the tables
+     pointer; that is, /L applies only to the expression on which
+     it appears.
+
+     The /I modifier requests that pcretest output information
+     about the compiled expression (whether it is anchored, has a
+     fixed first character, and so on). It does this by calling
+     pcre_fullinfo() after compiling an expression, and output-
+     ting the information it gets back. If the pattern is stu-
+     died, the results of that are also output.
+
+     The /D modifier is a PCRE debugging feature, which also
+     assumes /I. It causes the internal form of compiled regular
+     expressions to be output after compilation.
+
+     The /S modifier causes pcre_study() to be called after the
+     expression has been compiled, and the results used when the
+     expression is matched.
+
+     The /M modifier causes the size of memory block used to hold
+     the compiled pattern to be output.
+
+     The /P modifier causes pcretest to call PCRE via the POSIX
+     wrapper API rather than its native API. When this is done,
+     all other modifiers except /i, /m, and /+ are ignored.
+     REG_ICASE is set if /i is present, and REG_NEWLINE is set if
+     /m is present. The wrapper functions force
+     PCRE_DOLLAR_ENDONLY always, and PCRE_DOTALL unless
+     REG_NEWLINE is set.
+
+     The /8 modifier causes pcretest to call PCRE with the
+     PCRE_UTF8 option set. This turns on the (currently incom-
+     plete) support for UTF-8 character handling in PCRE, pro-
+     vided that it was compiled with this support enabled. This
+     modifier also causes any non-printing characters in output
+     strings to be printed using the \x{hh...} notation if they
+     are valid UTF-8 sequences.
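The /g versus /G distinction in this section can be illustrated outside PCRE. The sketch below uses Python's re module as a stand-in, on the assumption that its pos argument behaves like pcre_exec()'s startoffset (the subject stays whole, so \B and lookbehinds can still see earlier characters), while slicing the string mimics /G. The function names are hypothetical, and empty matches are handled by simply stepping one character forward rather than by the PCRE_NOTEMPTY retry that pcretest performs.

```python
import re
from typing import List


def match_all_g(pattern: str, subject: str) -> List[str]:
    """/g style: keep the whole subject and move the start offset."""
    regex = re.compile(pattern)
    results, pos = [], 0
    while pos <= len(subject):
        m = regex.search(subject, pos)
        if m is None:
            break
        results.append(m.group(0))
        # Simplification: step past an empty match instead of retrying
        # with NOTEMPTY/ANCHORED as pcretest does.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return results


def match_all_G(pattern: str, subject: str) -> List[str]:
    """/G style: search a shortened copy of the subject each time."""
    regex = re.compile(pattern)
    results, rest = [], subject
    while True:
        m = regex.search(rest)
        if m is None:
            break
        results.append(m.group(0))
        # Same simplification for empty matches as above.
        step = m.end() if m.end() > m.start() else m.end() + 1
        if step > len(rest):
            break
        rest = rest[step:]  # earlier characters are no longer visible
    return results


print(match_all_g(r"\Bi(\w\w)", "Mississippi"))  # ['iss', 'iss', 'ipp']
print(match_all_G(r"\Bi(\w\w)", "Mississippi"))  # ['iss', 'ipp']
```

In the second call the shortened string "issippi" starts with "i", so \B fails at its first character and the middle "iss" is skipped. This is the kind of difference the man page attributes to patterns that begin with a lookbehind-style assertion.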
+
+
+
+DATA LINES
+     Before each data line is passed to pcre_exec(), leading and
+     trailing whitespace is removed, and it is then scanned for \
+     escapes. The following are recognized:
+
+       \a         alarm (= BEL)
+       \b         backspace
+       \e         escape
+       \f         formfeed
+       \n         newline
+       \r         carriage return
+       \t         tab
+       \v         vertical tab
+       \nnn       octal character (up to 3 octal digits)
+       \xhh       hexadecimal character (up to 2 hex digits)
+       \x{hh...}  hexadecimal UTF-8 character
+
+       \A         pass the PCRE_ANCHORED option to pcre_exec()
+       \B         pass the PCRE_NOTBOL option to pcre_exec()
+       \Cdd       call pcre_copy_substring() for substring dd
+                   after a successful match (any decimal number
+                   less than 32)
+       \Gdd       call pcre_get_substring() for substring dd
+                   after a successful match (any decimal number
+                   less than 32)
+       \L         call pcre_get_substringlist() after a
+                   successful match
+       \N         pass the PCRE_NOTEMPTY option to pcre_exec()
+       \Odd       set the size of the output vector passed to
+                   pcre_exec() to dd (any number of decimal
+                   digits)
+       \Z         pass the PCRE_NOTEOL option to pcre_exec()
+
+     When \O is used, it may be higher or lower than the size set
+     by the -o option (or defaulted to 45); \O applies only to
+     the call of pcre_exec() for the line in which it appears.
+
+     A backslash followed by anything else just escapes the any-
+     thing else. If the very last character is a backslash, it is
+     ignored. This gives a way of passing an empty line as data,
+     since a real empty line terminates the data input.
+
+     If /P was present on the regex, causing the POSIX wrapper
+     API to be used, only \B and \Z have any effect, causing
+     REG_NOTBOL and REG_NOTEOL to be passed to regexec() respec-
+     tively.
+
+     The use of \x{hh...} to represent UTF-8 characters is not
+     dependent on the use of the /8 modifier on the pattern. It
+     is recognized always. There may be any number of hexadecimal
+     digits inside the braces. The result is from one to six
+     bytes, encoded according to the UTF-8 rules.
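The "one to six bytes" remark refers to the original UTF-8 scheme (as in RFC 2279), which covered values up to 0x7FFFFFFF; modern UTF-8 stops at four bytes. A minimal Python sketch of that encoding follows, as an illustration of what a \x{hh...} escape expands to. It is not taken from the pcretest source, and the function name is hypothetical.

```python
def encode_utf8_original(cp: int) -> bytes:
    """Encode a value using the original 1-to-6-byte UTF-8 layout."""
    if cp < 0:
        raise ValueError("code point must be non-negative")
    if cp < 0x80:
        return bytes([cp])  # single byte, plain ASCII
    # (number of bytes, exclusive upper limit, lead-byte marker bits)
    layouts = (
        (2, 0x800, 0xC0),
        (3, 0x10000, 0xE0),
        (4, 0x200000, 0xF0),
        (5, 0x4000000, 0xF8),
        (6, 0x80000000, 0xFC),
    )
    for nbytes, limit, lead in layouts:
        if cp < limit:
            out = []
            # Peel off 6 bits at a time for the continuation bytes.
            for _ in range(nbytes - 1):
                out.append(0x80 | (cp & 0x3F))
                cp >>= 6
            out.append(lead | cp)  # remaining high bits go in the lead byte
            return bytes(reversed(out))
    raise ValueError("value too large for 6-byte UTF-8")


print(encode_utf8_original(0x20AC).hex())  # e282ac
```

For example, a data line containing \x{20ac} would, under this scheme, hand the three bytes E2 82 AC to the matching function.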
+
+
+
+OUTPUT FROM PCRETEST
+     When a match succeeds, pcretest outputs the list of captured
+     substrings that pcre_exec() returns, starting with number 0
+     for the string that matched the whole pattern. Here is an
+     example of an interactive pcretest run.
+
+       $ pcretest
+       PCRE version 2.06 08-Jun-1999
+
+         re> /^abc(\d+)/
+       data> abc123
+        0: abc123
+        1: 123
+       data> xyz
+       No match
+
+     If the strings contain any non-printing characters, they are
+     output as \0x escapes, or as \x{...} escapes if the /8
+     modifier was present on the pattern. If the pattern has the
+     /+ modifier, then the output for substring 0 is followed by
+     the rest of the subject string, identified by "0+" like
+     this:
+
+         re> /cat/+
+       data> cataract
+        0: cat
+        0+ aract
+
+     If the pattern has the /g or /G modifier, the results of
+     successive matching attempts are output in sequence, like
+     this:
+
+         re> /\Bi(\w\w)/g
+       data> Mississippi
+        0: iss
+        1: ss
+        0: iss
+        1: ss
+        0: ipp
+        1: pp
+
+     "No match" is output only if the first match attempt fails.
+
+     If any of the sequences \C, \G, or \L are present in a data
+     line that is successfully matched, the substrings extracted
+     by the convenience functions are output with C, G, or L
+     after the string number instead of a colon. This is in addi-
+     tion to the normal full list. The string length (that is,
+     the return from the extraction function) is given in
+     parentheses after each string for \C and \G.
+
+     Note that while patterns can be continued over several lines
+     (a plain ">" prompt is used for continuations), data lines
+     may not. However newlines can be included in data by means
+     of the \n escape.
+
+
+
+AUTHOR
+     Philip Hazel <ph10@cam.ac.uk>
+     University Computing Service,
+     New Museums Site,
+     Cambridge CB2 3QG, England.
+     Phone: +44 1223 334714
+
+     Last updated: 15 August 2001
+     Copyright (c) 1997-2001 University of Cambridge.