Diffstat (limited to 'docs/development')
-rw-r--r-- | docs/development/design/index.rst | 1
-rw-r--r-- | docs/development/design/ndrpdr.rst | 51
-rw-r--r-- | docs/development/design/traffic_desc.rst | 85
3 files changed, 131 insertions, 6 deletions
diff --git a/docs/development/design/index.rst b/docs/development/design/index.rst
index c54888a..0500ca2 100644
--- a/docs/development/design/index.rst
+++ b/docs/development/design/index.rst
@@ -12,4 +12,5 @@ OPNFV NFVbench Euphrates Design
 
    design
    versioning
+   traffic_desc
    ndrpdr
diff --git a/docs/development/design/ndrpdr.rst b/docs/development/design/ndrpdr.rst
index 5361174..e34e8ba 100644
--- a/docs/development/design/ndrpdr.rst
+++ b/docs/development/design/ndrpdr.rst
@@ -6,11 +6,15 @@
 NDR/PDR Binary Search
 =====================
 
+The NDR/PDR binary search algorithm used by NFVbench is based on the algorithm used by the
+FD.io CSIT project, with some additional optimizations.
+
 Algorithm Outline
 -----------------
 
-The ServiceChain class is responsible for calculating the NDR/PDR for all frame sizes requested in the configuration.
-Calculation for 1 frame size is delegated to the TrafficClient class.
+The ServiceChain class (nfvbench/service_chain.py) is responsible for calculating the NDR/PDR
+for all frame sizes requested in the configuration.
+Calculation for 1 frame size is delegated to the TrafficClient class (nfvbench/traffic_client.py).
 
 Call chain for calculating the NDR-PDR for a list of frame sizes:
 
@@ -22,23 +26,58 @@ Call chain for calculating the NDR-PDR for a list of frame sizes:
 - TrafficClient.__range_search() recursive binary search
 
 The search range is delimited by a left and right rate (expressed as a % of line rate per direction).
+The search always starts at line rate per port, e.g. in the case of 2x10Gbps, the first iteration
+will send 10Gbps of traffic on each port.
 
 The load_epsilon configuration parameter defines the accuracy of the result as a % of line rate.
 The default value of 0.1 indicates for example that the measured NDR and PDR are within 0.1% of line rate of the actual NDR/PDR (e.g. 0.1% of 10Gbps is 10Mbps).
 It also determines how small the search range must be in the binary search.
+Smaller values of load_epsilon will result in more iterations and will take more time but may not
+always be beneficial if the absolute value falls below the precision level of the measurement.
+For example a value of 0.01% would translate to an absolute value of 1Mbps (for a 10Gbps port) or
+around 1.5kpps (at 64 byte size) which might be too fine-grained.
 
 The recursion narrows down the range by half and stops when:
 
 - the range is smaller than the configured load_epsilon value
 - or when the search hits 100% or 0% of line rate
 
+Optimization
+------------
+
+Binary search algorithms assume that the drop rate curve is monotonically increasing with the Tx rate.
+To save time, the algorithm used by NFVbench is capable of calculating the optimal Tx rate for an
+arbitrary list of target maximum drop rates in one pass instead of the usual 1 pass per target maximum drop rate.
+This saves time in proportion to the number of target drop rates.
+For example, a typical NDR/PDR search will have 2 target maximum drop rates:
+
+- NDR = 0.001%
+- PDR = 0.1%
+
+The binary search will then start with a sorted list of 2 target drop rates: [0.1, 0.001].
+The first part of the binary search will then focus on finding the optimal rate for the first target
+drop rate (0.1%). When found, the current target drop rate is removed from the list and
+iteration continues with the next target drop rate in the list but this time
+starting from the upper/lower range of the previous target drop rate, which saves significant time.
+The binary search continues until the target maximum drop rate list is empty.
+
+Results Granularity
+-------------------
+The binary search results contain per direction stats (forward and reverse).
+In the case of multi-chaining, results contain per chain stats.
+The current code only reports aggregated stats (forward + reverse for all chains) but could be enhanced
+to report per chain stats.
+
+
+CPU Limitations
+---------------
 One particularity of using a software traffic generator is that the requested Tx rate may not always be met due to
 resource limitations (e.g. CPU is not fast enough to generate a very high load). The algorithm should take this into
 consideration:
 
-- always monitor the actual Tx rate achieved
+- always monitor the actual Tx rate achieved as reported back by the traffic generator
 - actual Tx rate is always <= requested Tx rate
 - the measured drop rate should always be relative to the actual Tx rate
-- if the actual Tx rate is < requested Tx rate and the measured drop rate is already within threshold (<NDR/PDR threshold) then the binary search must stop with proper warning
-
-
+- if the actual Tx rate is < requested Tx rate and the measured drop rate is already within threshold
+  (<NDR/PDR threshold) then the binary search must stop with proper warning because the actual NDR/PDR
+  is probably higher than the reported values
diff --git a/docs/development/design/traffic_desc.rst b/docs/development/design/traffic_desc.rst
new file mode 100644
index 0000000..2a40b6a
--- /dev/null
+++ b/docs/development/design/traffic_desc.rst
@@ -0,0 +1,85 @@
+.. This work is licensed under a Creative Commons Attribution 4.0 International
+.. License.
+.. http://creativecommons.org/licenses/by/4.0
+.. (c) Cisco Systems, Inc
+
+Traffic Description
+===================
+
+The general packet path model followed by NFVbench requires injecting traffic into an arbitrary
+number of service chains, where each service chain is identified by 2 edge networks (left and right).
+In the current multi-chaining model:
+
+- all service chains share the same left and right edge networks
+- each port associated with the traffic generator is dedicated to sending traffic to one edge network
+
+In an OpenStack deployment, this corresponds to all chains sharing the same 2 neutron networks.
+If VLAN encapsulation is used, all traffic sent to a port will have the same VLAN id.
+
+Basic Packet Description
+------------------------
+
+The code to create the UDP packet is located in TRex.create_pkt() (nfvbench/traffic_gen/trex.py).
+
+NFVbench always generates UDP packets (even when doing L2 forwarding).
+The final size of the frame containing each UDP packet will be based on the requested L2 frame size.
+When taking into account the minimum payload size requirements from the traffic generator for
+the latency streams, the minimum L2 frame size is 64 bytes (no vlan tagging) or
+68 bytes (with vlan tagging).
+
+Flows Specification
+-------------------
+
+MAC Addresses
+.............
+The source MAC address is always the local port MAC address (for each port).
+The destination MAC address is based on the configuration and can be:
+
+- the traffic generator peer port MAC address in the case of L2 loopback at the switch level
+  or when using a loopback cable
+- the dest MAC as specified by the configuration file (EXT chain no ARP)
+- the dest MAC as discovered by ARP (EXT chain)
+- the VM MAC as discovered from Neutron API (PVP, PVVP chains)
+
+NFVbench does not currently range on the MAC addresses.
+
+IP addresses
+............
+The source IP address is fixed per chain.
+The destination IP address is variable within a distinct range per chain.
+
+UDP ports
+.........
+The source and destination ports are fixed for all packets and can be set in the configuration
+file (default is 53).
+
+Payload User Data
+.................
+The length of the user data is based on the requested L2 frame size and takes into account the
+size of the L2 header - including the VLAN tag if applicable.
+
+
+IMIX Support
+------------
+In the case of IMIX, each direction is made of 4 streams:
+- 1 latency stream
+- 1 stream for each IMIX frame size
+
+The IMIX ratio is encoded into the number of consecutive packets sent by each stream in turn.
+
+Service Chains and Streams
+--------------------------
+A stream identifies one "stream" of packets with the same characteristics such as rate and destination address.
+NFVbench will create 2 streams per service chain per direction:
+
+- 1 latency stream set to 1000pps
+- 1 main traffic stream set to the requested Tx rate less the latency stream rate (1000pps)
+
+For example, a benchmark with 1 chain (fixed rate) will result in a total of 4 streams.
+A benchmark with 20 chains will result in a total of 80 streams (fixed rate, it is more with IMIX).
+
+The overall flows are split equally among the chains by using the appropriate destination
+MAC address.
+
+For example, in the case of 10 chains, 1M flows and fixed rate, there will be a total of 40 streams.
+Each of the 20 non-latency streams will generate packets corresponding to 50,000 flows (unique src/dest address tuples).
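
Illustration (not part of the patch): the one-pass, multi-target search described in the Optimization section of ndrpdr.rst can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the code in nfvbench/traffic_client.py. The names `search_rates`, `measure_drop_pct` and `toy_drop_pct` are hypothetical, the drop rate is assumed to be non-decreasing with the Tx rate and zero at 0% of line rate, and the CPU limitation handling (actual Tx rate lower than requested) is left out.

```python
from typing import Callable, Dict, Iterable, List


def search_rates(measure_drop_pct: Callable[[float], float],
                 targets: Iterable[float],
                 load_epsilon: float = 0.1) -> Dict[float, float]:
    """Return {target max drop rate (%): highest passing Tx rate (% of line rate)}.

    measure_drop_pct is a hypothetical callback that runs one iteration at the
    given Tx rate (% of line rate) and returns the measured drop rate (%).
    """
    results: Dict[float, float] = {}

    def range_search(left: float, right: float, pending: List[float]) -> None:
        # Invariant: every pending target passes at `left` and fails at `right`.
        if not pending:
            return
        if right - left < load_epsilon:
            # Range is narrow enough: `left` is the best known passing rate.
            for target in pending:
                results[target] = left
            return
        middle = (left + right) / 2.0
        drop = measure_drop_pct(middle)  # one measurement serves all pending targets
        failing = [t for t in pending if drop > t]   # must search lower rates
        passing = [t for t in pending if drop <= t]  # may search higher rates
        range_search(left, middle, failing)
        range_search(middle, right, passing)

    pending = sorted(set(targets), reverse=True)  # e.g. [0.1, 0.001]
    # The search starts at line rate; targets already met there need no search.
    drop_at_line_rate = measure_drop_pct(100.0)
    for target in pending:
        if drop_at_line_rate <= target:
            results[target] = 100.0
    range_search(0.0, 100.0, [t for t in pending if t not in results])
    return results


# Toy device model (assumption for the demo): no drops up to 60% of line rate,
# then the drop rate grows by 0.05% per additional 1% of load.
calls: List[float] = []


def toy_drop_pct(rate_pct: float) -> float:
    calls.append(rate_pct)
    return max(0.0, (rate_pct - 60.0) * 0.05)


results = search_rates(toy_drop_pct, [0.001, 0.1])  # NDR and PDR targets
shared_calls = len(calls)
```

With this toy model the two targets share every measurement until the first iteration whose drop rate falls between them, so the combined search needs fewer iterations than two independent searches. Each returned rate lies at most load_epsilon below the true threshold rate of the model.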