Age | Commit message (Collapse) | Author | Files | Lines |
|
This commit converts Python assertions to a custom exception in all
places where SLA validation is checked with an assertion.
This commit also fixes all emerged pylint errors.
JIRA: YARDSTICK-966
Change-Id: If771ed03b2cbc0a43a57fcfb9293f18740b3ff80
Signed-off-by: Miikka Koistinen <miikka.koistinen@nokia.com>
|
|
The PROX tests were hanging in the duration
runner.
These are fixes for various errors:
raise error in collect_kpi if VNF is down
move prox dpdk_rebind after collectd stop
fix dpdk nicbind rebind to group by drivers
prox: raise error in collect_kpi if the VNF is down
prox: add VNF_TYPE for consistency
sample_vnf: debug and fix kill_vnf
pkill is not matching some executable names,
add some debug process dumps and try switching
back to killall until we can find the issue
sample_vnf: add default timeout, so we can override
default 3600 SSH timeout
collect_kpi is the point at which we check
the VNFs and TGs for failures or exits
queues are the problem make sure we aren't silently blocking on
non-empty queues by canceling join thread in subprocess
fixup duration runner to close queues
and other attempt to stop duration runner
from hanging
VnfdHelper: memoize port_num
resource: fail if ssh can't connect
at the end of 3600 second test our ssh connection
is dead, so we can't actually stop collectd
unless we reconnect
fix stop() logic to ignore ssh errors
Change-Id: I6c8e682a80cb9d00362e2fef4a46df080f304e55
Signed-off-by: Ross Brattain <ross.b.brattain@intel.com>
|
|
Sometimes the runners can hang. Initially
debugging lead to the queue join thread, so I thought
we could cancel all the join threads and everything would be okay.
But it turns out canceling the queue join threads can lead
to corruption of the queues, so when we go to drain the queues
the task hangs.
But it also turns out that we were not properly draining
the queues in the task process. We were waiting for all
the runners to exit, then draining the queues.
This is bad and will cause the queues to fill up and hang
and/or drop data or corrupt the queues.
The proper fix seems to be to draining the queues in a
loop before calling join with a timeout.
Also modified the queue drain loops to no block on queue.get()
Revert "cancel all queue join threads"
This reverts commit 75c0e3a54b8f6e8fd77c7d9d95decab830159929.
Revert "duration runner: add teardown and cancel all queue join threads"
This reverts commit 7eb6abb6931b24e085b139cc3500f4497cdde57d.
Change-Id: Ic4f8e814cf23615621c1250535967716b425ac18
Signed-off-by: Ross Brattain <ross.b.brattain@intel.com>
|
|
In some cases we are blocking in base.Runner join() because the
queues are not empty
call cancel_join_thread to prevent the Queue from blocking the
Process exit
https://docs.python.org/3.3/library/multiprocessing.html#all-platforms
Joining processes that use queues
Bear in mind that a process that has put items in a queue will wait
before terminating until all the buffered items are fed by the
"feeder" thread to the underlying pipe. (The child process can call
the cancel_join_thread() method of the queue to avoid this behaviour.)
This means that whenever you use a queue you need to make sure that
all items which have been put on the queue will eventually be removed
before the process is joined. Otherwise you cannot be sure that
processes which have put items on the queue will terminate. Remember
also that non-daemonic processes will be joined automatically.
Warning
As mentioned above, if a child process has put items on a queue (and
it has not used JoinableQueue.cancel_join_thread), then that process
will not terminate until all buffered items have been flushed to the
pipe.
This means that if you try joining that process you may get a deadlock
unless you are sure that all items which have been put on the queue
have been consumed. Similarly, if the child process is non-daemonic
then the parent process may hang on exit when it tries to join all its
non-daemonic children.
cancel_join_thread()
Prevent join_thread() from blocking. In particular, this prevents the
background thread from being joined automatically when the process
exits – see join_thread().
A better name for this method might be allow_exit_without_flush(). It
is likely to cause enqueued data to lost, and you almost certainly
will not need to use it. It is really only there if you need the
current process to exit immediately without waiting to flush enqueued
data to the underlying pipe, and you don’t care about lost data.
Change-Id: I345c722a752bddf9f0824a11cdf52ae9f04669af
Signed-off-by: Ross Brattain <ross.b.brattain@intel.com>
|
|
flake8 warns about indent
Change-Id: Ib7a57dcb8462de24a2ccb1fdbc60db995cc60d33
Signed-off-by: Ross Brattain <ross.b.brattain@intel.com>
|
|
A run consists of multiple (configurable) iterations. The first iteration starts from a configured packet rate. The subsequent iterations start from the observed rate from the previous run.
An iteration is a binary search for maximum pps while not exceeding 10-6 packet loss rate. The upper rate is capped to the last pps when packet loss target is missed, the bottom rate is capped to the last pps when packet loss target is met. An iteration stops when the upper rate and the lower rate are close enough (configurable) or the received rate is well below the sending rate.
The output observed rate is set to the bottom rate.
Update-1: local run of run_tests.sh is good, but two lines are reported by Jekins as too long
Update-2: Minor fix to cope with "pps" is not defined in test case yaml file.
Update-3: Add pragma to skip unit test for this patch.
JIRA: YARDSTICK-613
Change-Id: I2411b173d18d928cc1cf08f883b08bc13a125ea2
Signed-off-by: Jing Zhang <jing.c.zhang@nokia.com>
|