author | hongbotian <hongbo.tianhongbo@huawei.com> | 2015-11-30 02:41:33 -0500 |
---|---|---|
committer | hongbotian <hongbo.tianhongbo@huawei.com> | 2015-11-30 02:43:36 -0500 |
commit | 9401f816dd0d9d550fe98a8507224bde51c4b847 (patch) | |
tree | 94f2d7a7893a787bafdca8b5ef063ea316938874 /rubbos/app/tomcat-connectors-1.2.32-src/native/TODO.txt | |
parent | e8ec7aa8e38a93f5b034ac74cebce5de23710317 (diff) |
upload tomcat
JIRA: BOTTLENECK-7
Change-Id: I875d474869efd76ca203c30b60ebc0c3ee606d0e
Signed-off-by: hongbotian <hongbo.tianhongbo@huawei.com>
Diffstat (limited to 'rubbos/app/tomcat-connectors-1.2.32-src/native/TODO.txt')
-rw-r--r-- | rubbos/app/tomcat-connectors-1.2.32-src/native/TODO.txt | 371 |
1 files changed, 371 insertions, 0 deletions
diff --git a/rubbos/app/tomcat-connectors-1.2.32-src/native/TODO.txt b/rubbos/app/tomcat-connectors-1.2.32-src/native/TODO.txt
new file mode 100644
index 00000000..73dd13f3
--- /dev/null
+++ b/rubbos/app/tomcat-connectors-1.2.32-src/native/TODO.txt
@@ -0,0 +1,371 @@
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+TODO for tomcat-connectors
+
+$Id: TODO.txt 920112 2010-03-07 20:57:35Z timw $
+
+1) Optimize "distance"
+======================
+
+Sorting the list of balanced workers by distance would be nice, but:
+How to combine the sorting with the offset implementation (especially
+useful for strategy BUSYNESS under low load).
+
+Local error states and likely other features will also break in case
+we do naive reordering.
+
+2) Reduce number of string comparisons in most_suitable
+========================================================
+
+a) redirect/domains
+
+It would be easy to improve the redirect string by an integer, giving the
+index of the worker in the lb. Then lb would not need to search for the redirect worker.
+
+The same way, one could add a list with indices to workers in the same domain.
+Whenever domain names are managed (init and status worker update) one would
+scan the worker list and update the index list.
+
+Finally one could have a list of workers, whose domain is the same as the redirect
+attribute of the worker, because that's also something we consider.
+
+What I'm not sure about, even in the existing code, is the locking between updates
+by the status worker and the process local information about the workers,
+especially in the case, when status updates a redirect or domain attribute.
+
+I would like to keep these attributes and the new index arrays process local,
+and the processes should find out about changes made by status to shm (redirect/domain)
+and then rebuild their data. No need to get these on every request from the shm,
+only the check for up-to-date should be made.
+
+b) exact matches for jvmRoutes
+
+Could we use hashes instead of string comparisons all the time?
+I'm not sure, if a good enough hash takes longer than a string comparison though.
+
+3) Code separation between factory, validate and init in lb
+============================================================
+
+The factory contains:
+
+    private_data->worker.retries = JK_RETRIES;
+    private_data->s->recover_wait_time = WAIT_BEFORE_RECOVER;
+
+I think, this should move to validate() or init().
+It might even be obsolete, because in init, we already have:
+
+    pThis->retries = jk_get_worker_retries(props, p->s->name,
+    p->s->retries = pThis->retries;
+    p->s->recover_wait_time = jk_get_worker_recover_timeout(props, p->s->name, WAIT_BEFORE_RECOVER);
+    if (p->s->recover_wait_time < WAIT_BEFORE_RECOVER)
+        p->s->recover_wait_time = WAIT_BEFORE_RECOVER;
+
+Then: In validate there is
+
+    p->lb_workers[i].s->error_time = 0;
+
+So shouldn't there also be
+
+    p->lb_workers[i].s->maintain_time = time(NULL);
+
+4) Logging
+==========
+
+a) Allow logging of request url or uuid in jk log to ease matching with access log.
+
+b) Implement log rotation for IIS. (done in 1.2.31)
+
+c) Allow adding of log notes for IIS like we do with Apache.
+
+d) Add error type info to access log notes
+
+e) Refactor: Use the same code files for the request logging functions in Apache 1.3 and 2.0.
+
+f) Refactor: Use the same code files for piped logging in Apache 1.3 and 2.0.
+
+5) ajpget
+==========
+
+Combine ajplib and Apache ab to an ajp13 commandline client ajpget.
+
+6) Parsing workers.properties
+=============================
+
+Parsing of workers.properties additionally to just looking up attributes
+would help users to detect syntax errors in the file. At the moment
+no information will be logged, e.g. when attributes contain typos.
+
+Example: worker.list vs. worker.lists.
+
+7) Persisting workers.properties
+================================
+
+Make workers.properties persist from inside status worker.
+
+Add additional properties file, that contains a journal of property changes done
+via the status worker. When initializing those overwrite the initial workers.properties.
+
+Update actions in the status worker will allow to optionally add a change to this journal.
+We can also add a comment with timestamp etc. to each journal line.
+
+8) Reduce number of uses of time(NULL)
+======================================
+
+We use time(NULL) a lot. Since it only has resolution of a second,
+I'm asking myself, if we could update the actual time in only a few
+places and get time out of some variables when needed. The same does
+not hold true for millisecond time, but in several cases we use the time,
+it's not very critical, that it is exact.
+These cases are related to:
+
+Some of this has already been done, the remaining parts are:
+
+- last_access for usage against timeout value that is ~minutes
+- error_time for usage against retry timeout that is ~minutes
+- uri_worker_map checked for usage against JK_URIMAP_RELOAD=1 minute
+
+So I think, it would suffice to set an actual time at the beginning of
+the request/response cycle (used by everything before the request is being
+sent over the socket) and maybe after the response shows up / an error occurs
+(for everything else, if there is).
+
+For which cases would it be OK, to use the time before sending to TC:
+- uri_worker_map "checked" (uri map lookup starts early)
+- setting/testing last_access in
+  - jk_ajp_common.c:ajp_connect_to_endpoint()
+  - jk_ajp_common.c:ajp_get_endpoint()
+  - jk_ajp_common.c:ajp_maintain()
+
+What about the others:
+- setting last_access in init should use the actual time in
+  jk_ajp_common.c:ajp_create_endpoint_cache()
+
+- setting last_access again after the service could also use the
+  actual time in jk_ajp_common.c:ajp_done()
+- setting error_time should better use the actual time
+  jk_lb_worker.c service(): rec->s->error_time = time(NULL);
+
+The last two cases could again use the same time, which then would be needed
+to be generated at the end or directly after service.
+
+9) Access/Modification Time in shm
+==================================
+
+a) [Discussion] What will this generally be used for? At the moment,
+only jk_status "uses" it, but it only sets the values, it never asks for them.
+
+b) [Improvement, minor] jk_shm_set_workers_time() implicitly calls
+jk_shm_sync_access_time(), but jk_status does:
+
+    jk_shm_set_workers_time(time(NULL));
+    /* Since we updated the config no need to reload
+     * on the next request
+     */
+    jk_shm_sync_access_time();
+
+two times.
+So depending on the idea of the functionality of these calls,
+either set_workers_time and sync_access_time should be independent,
+or the second call in jk_status could be removed.
+
+10) "Destroy" functionality
+===========================
+
+[Hint] Destroy on a worker never seems to free shm,
+but I think that was already a flaw without shm.
+
+11) Locks against shm
+=====================
+
+It might be an interesting experiment to implement an improved locking structure.
+It looks like one would need in fact different types of locks.
+In shm we have as read/write information:
+
+Changed only by status worker:
+- redirect, domain, lb_factor, sticky_session, sticky_session_force,
+  recover_wait_time, retries, status (is_disabled, is_stopped).
+
+These changes need some kind of reconfiguration in the threads after
+change and before the next request/response. Since changes are rare,
+here we would be better off with a simple detect change and copy from
+shm to process procedure. status updates the data in shm and after that
+the time stamp on the shm. Each process checks the time stamp before
+doing a request, and when the time stamp changed it does a writer CS
+lock and updates its local copy. All threads always do a reader CS
+lock when doing a request/response cycle. Reader CS locks are concurrent,
+writers are exclusive. So readers are not allowed, when the config data is being updated.
+
+Changed by the threads themselves (and via reset by the status worker):
+- counters needed by routing decisions (lb_value, readed, transferred, busy)
+- timers needed by maintenance functions (error_time, servic_time/maintain_time)
+- status is_busy, in_error_state
+- uncritical data with no influence on routing decisions (max_busy, elected, errors,
+  in_recovering)
+
+Here again we could improve by using reader/writer locks. I have a
+tendency for the PESSIMISTIC side of locking, but I think we could
+shrink the code blocks that should be locked.
+At the moment they are
+pretty big (most of get_most_suitable_worker).
+
+Read-only: name and id.
+
+By the way: at several places we don't check for errors on getting the lock.
+
+12) Global locks
+================
+
+We might want to make the lock technology choosable, like httpd does.
+E.g. on Solaris the default lock type is fcntl, and we can easily
+get an invalid EDEADLOCK for our jk_log_lock.
+
+The following pthread based non global locks are used:
+
+- one mutex for each AJP worker, synchronizing access to the connection
+pool, which exists per process
+
+- one mutex for each lb worker
+
+- a mutex used during dynamic update of uriworkermap.properties to
+prevent concurrent updates. Updates are done per process.
+
+- a mutex to prevent concurrent execution of the process local internal
+maintenance task
+
+- a mutex for access to the shared memory when changing or reading
+configuration parameters. That might be a little unsafe, because it
+actually should be a global mutex, not a process local, but those config
+changes are only done due to interaction with the status worker, so
+there's very little chance for unwanted concurrency here. All dynamic
+runtime data are already marked as being volatile.
+
+All except the last seem to be safe. The last might need some hybrid model
+using thread local mostly and process global when doing updates.
+
+See also: http://marc.info/?t=123394059800003&r=1&w=2
+
+13) Understand the exact behaviour of shm and restarts
+======================================================
+
+Furthermore: rotatelogs (?) and gzip (mod_mime_magic) seem to close
+the (non-existing) shm file. Maybe a problem on some platforms?
+
+14) What I didn't yet check
+===========================
+
+a) Correctness of is_busy handling
+
+b) Correctness of the reset values after reset by status worker
+
+c) What would be the exact behaviour, if shm does not work (memory case).
+   Will this be a critical failure, or will we only experience a
+   degradation in routing decisions?
+
+d) How complete is mod_proxy_ajp/mod_proxy_balancer.
+   Port changes from mod_jk to them.
+
+15) Status worker
+=================
+
+Allow managing pool and connection parameters. Add flags to
+pool and connections to signal workers and maintenance whether
+existing connections should be closed and renewed.
+
+Check completeness of attribute manageability for AJP workers.
+
+Check completeness of runtime data display, e.g.
+reset_time, recover_time, etc. Maybe also add "last error type".
+
+Work on a global display of process local data, e.g. state of the
+process local connection pools (sizes, num connected/idle).
+
+Rework the GUI:
+
+- Basic overview start page with links to the workers,
+  maybe a checkbox to decide whether you want to see config data too.
+- Detail view (what type of error was last and when etc.), detail view
+  for connection pools.
+
+16) URI mapping
+===============
+
+Add more extensions?
+
+17) Connection Pool
+===================
+
+What would a good global maximum count look like?
+Simply limit busyness? Soft limit (applies only to non-sticky requests)
+and a hard limit (applies to all requests)?
+
+18) IP V6
+=========
+
+There's a Bugzilla with a patch.
+
+19) IIS Chunked Encoding
+========================
+
+Move from alternative build to default.
+
+What about the other ifdef'd features?
+
+20) Add REMOTE_PORT as a default JkEnvVar
+=========================================
+
+... and port to IIS to fix the getRemotePort() problem.
+
+21) Rework HowTo docs
+=====================
+
+22) Add better example config
+=============================
+
+23) Remove JNI worker
+=====================
+
+24) Watchdog reload of uriworkermap.properties
+==============================================
+
+Questions about uriworkermap.properties watchdog reload (r745898):
+- shm lock needed?
+- Apache port (need to iterate over vhosts)
+
+25) Watchdog backend probing
+============================
+
+Configurable probe URL to test backend, e.g. to decide about
+recovery instead of using random real requests.
+
+26) Commandline shm
+===================
+
+Commandline tool to read data from shm file.
+
+27) Status worker property format
+=================================
+
+Check whether we return list properties as one line
+(because Java doesn't allow the same property key multiple
+times).
+
+Applies possibly to:
+
+"list",
+BALANCE_WORKERS,
+MOUNT_OF_WORKER,
+USER_OF_WORKER,
+GOOD_RATING_OF_WORKER,
+BAD_RATING_OF_WORKER,
+STATUS_FAIL_OF_WORKER,
+
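
Item 27 of the added TODO asks whether list-valued properties are returned on a single line, since a Java properties file cannot usefully repeat a key. The sketch below shows one way such a one-line form could be produced in C. The helper name `format_list_property`, the comma separator, and the key and worker names in the usage note are illustrative assumptions; this is not the status worker's actual code or output format.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of mod_jk): writes "key=v1,v2,...,vN"
 * into buf so that a list-valued property occupies exactly one line.
 * Returns the number of characters written (excluding the terminating
 * NUL), or -1 if the result does not fit into a buffer of `size` bytes.
 */
static int format_list_property(char *buf, size_t size, const char *key,
                                const char *const *values, size_t count)
{
    size_t used;
    size_t i;
    int n;

    /* Start the line with the key and the assignment sign. */
    n = snprintf(buf, size, "%s=", key);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /* Append each value, separated by commas, checking for truncation. */
    for (i = 0; i < count; i++) {
        n = snprintf(buf + used, size - used, "%s%s",
                     i > 0 ? "," : "", values[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Called with the key `worker.lb.balance_workers` and the values `node1` and `node2` (both made up for this note), the helper fills the buffer with `worker.lb.balance_workers=node1,node2`. A Java client loading the output as properties would then see a single key holding the whole list instead of only the last repeated key.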