author | Maryam Tahhan <maryam.tahhan@intel.com> | 2016-08-10 12:22:54 +0000 |
---|---|---|
committer | Gerrit Code Review <gerrit@172.30.200.206> | 2016-08-10 12:22:54 +0000 |
commit | 5a0c6772d255aeb0ab1e6246c91bbeb56bb3a239 (patch) | |
tree | a2e21ca2d0f342df2e446b2d71cbbd0bcd58894c /docs/requirements/vswitchperf_ltd.rst | |
parent | 612090b0009574d02cf21a4fe41f036d0694b1b1 (diff) | |
parent | ac313930284dd310fe909684d66956f27181f966 (diff) |
Merge "docs: split LTD into LTP and LTD."
Diffstat (limited to 'docs/requirements/vswitchperf_ltd.rst')
-rw-r--r-- | docs/requirements/vswitchperf_ltd.rst | 1088 |
1 files changed, 38 insertions, 1050 deletions
diff --git a/docs/requirements/vswitchperf_ltd.rst b/docs/requirements/vswitchperf_ltd.rst index 9e26f637..5464c381 100644 --- a/docs/requirements/vswitchperf_ltd.rst +++ b/docs/requirements/vswitchperf_ltd.rst @@ -102,1031 +102,19 @@ Features to be tested Characterizing virtual switches (i.e. Device Under Test (DUT) in this document) includes measuring the following performance metrics: -- **Throughput** as defined by `RFC1242 - <https://www.rfc-editor.org/rfc/rfc1242.txt>`__: The maximum rate at which - **none** of the offered frames are dropped by the DUT. The maximum frame - rate and bit rate that can be transmitted by the DUT without any error - should be recorded. Note there is an equivalent bit rate and a specific - layer at which the payloads contribute to the bits. Errors and - improperly formed frames or packets are dropped. -- **Packet delay** introduced by the DUT and its cumulative effect on - E2E networks. Frame delay can be measured equivalently. -- **Packet delay variation**: measured from the perspective of the - VNF/application. Packet delay variation is sometimes called "jitter". - However, we will avoid the term "jitter" as the term holds different - meaning to different groups of people. In this document we will - simply use the term packet delay variation. The preferred form for this - metric is the PDV form of delay variation defined in `RFC5481 - <https://www.rfc-editor.org/rfc/rfc5481.txt>`__. The most relevant - measurement of PDV considers the delay variation of a single user flow, - as this will be relevant to the size of end-system buffers to compensate - for delay variation. The measurement system's ability to store the - delays of individual packets in the flow of interest is a key factor - that determines the specific measurement method. At the outset, it is - ideal to view the complete PDV distribution. 
Systems that can capture - and store packets and their delays have the freedom to calculate the - reference minimum delay and to determine various quantiles of the PDV - distribution accurately (in post-measurement processing routines). - Systems without storage must apply algorithms to calculate delay and - statistical measurements on the fly. For example, a system may store - temporary estimates of the mimimum delay and the set of (100) packets - with the longest delays during measurement (to calculate a high quantile, - and update these sets with new values periodically. - In some cases, a limited number of delay histogram bins will be - available, and the bin limits will need to be set using results from - repeated experiments. See section 8 of `RFC5481 - <https://www.rfc-editor.org/rfc/rfc5481.txt>`__. -- **Packet loss** (within a configured waiting time at the receiver): All - packets sent to the DUT should be accounted for. -- **Burst behaviour**: measures the ability of the DUT to buffer packets. -- **Packet re-ordering**: measures the ability of the device under test to - maintain sending order throughout transfer to the destination. -- **Packet correctness**: packets or Frames must be well-formed, in that - they include all required fields, conform to length requirements, pass - integrity checks, etc. -- **Availability and capacity** of the DUT i.e. when the DUT is fully “up” - and connected, following measurements should be captured for - DUT without any network packet load: - - - Includes average power consumption of the CPUs (in various power states) and - system over specified period of time. Time period should not be less - than 60 seconds. - - Includes average per core CPU utilization over specified period of time. - Time period should not be less than 60 seconds. - - Includes the number of NIC interfaces supported. - - Includes headroom of VM workload processing cores (i.e. available - for applications). 
+- Throughput +- Packet delay +- Packet delay variation +- Packet loss +- Burst behaviour +- Packet re-ordering +- Packet correctness +- Availability and capacity of the DUT .. 3.2.2 -.. _Approach: - -Approach -============== - -In order to determine the packet transfer characteristics of a virtual -switch, the tests will be broken down into the following categories: - -.. 3.2.2.1 - -Test Categories ----------------------- -- **Throughput Tests** to measure the maximum forwarding rate (in - frames per second or fps) and bit rate (in Mbps) for a constant load - (as defined by `RFC1242 <https://www.rfc-editor.org/rfc/rfc1242.txt>`__) - without traffic loss. -- **Packet and Frame Delay Tests** to measure average, min and max - packet and frame delay for constant loads. -- **Stream Performance Tests** (TCP, UDP) to measure bulk data transfer - performance, i.e. how fast systems can send and receive data through - the virtual switch. -- **Request/Response Performance** Tests (TCP, UDP) the measure the - transaction rate through the virtual switch. -- **Packet Delay Tests** to understand latency distribution for - different packet sizes and over an extended test run to uncover - outliers. -- **Scalability Tests** to understand how the virtual switch performs - as the number of flows, active ports, complexity of the forwarding - logic's configuration... it has to deal with increases. -- **Control Path and Datapath Coupling** Tests, to understand how - closely coupled the datapath and the control path are as well as the - effect of this coupling on the performance of the DUT. -- **CPU and Memory Consumption Tests** to understand the virtual - switch’s footprint on the system, this includes: - - * CPU core utilization. - * CPU cache utilization. - * Memory footprint. - * System bus (QPI, PCI, ..) utilization. - * Memory lanes utilization. - * CPU cycles consumed per packet. - * Time To Establish Flows Tests. 
- -- **Noisy Neighbour Tests**, to understand the effects of resource - sharing on the performance of a virtual switch. - -**Note:** some of the tests above can be conducted simultaneously where -the combined results would be insightful, for example Packet/Frame Delay -and Scalability. - -.. 3.2.2.2 - -Deployment Scenarios --------------------------- -The following represents possible deployment test scenarios which can -help to determine the performance of both the virtual switch and the -datapaths to physical ports (to NICs) and to logical ports (to VNFs): - -.. 3.2.2.2.1 - -Physical port → vSwitch → physical port -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. code-block:: console - - _ - +--------------------------------------------------+ | - | +--------------------+ | | - | | | | | - | | v | | Host - | +--------------+ +--------------+ | | - | | phy port | vSwitch | phy port | | | - +---+--------------+------------+--------------+---+ _| - ^ : - | | - : v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - -.. 3.2.2.2.2 - -Physical port → vSwitch → VNF → vSwitch → physical port -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. 
code-block:: console - - _ - +---------------------------------------------------+ | - | | | - | +-------------------------------------------+ | | - | | Application | | | - | +-------------------------------------------+ | | - | ^ : | | - | | | | | Guest - | : v | | - | +---------------+ +---------------+ | | - | | logical port 0| | logical port 1| | | - +---+---------------+-----------+---------------+---+ _| - ^ : - | | - : v _ - +---+---------------+----------+---------------+---+ | - | | logical port 0| | logical port 1| | | - | +---------------+ +---------------+ | | - | ^ : | | - | | | | | Host - | : v | | - | +--------------+ +--------------+ | | - | | phy port | vSwitch | phy port | | | - +---+--------------+------------+--------------+---+ _| - ^ : - | | - : v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - -.. 3.2.2.2.3 - -Physical port → vSwitch → VNF → vSwitch → VNF → vSwitch → physical port -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
code-block:: console - - _ - +----------------------+ +----------------------+ | - | Guest 1 | | Guest 2 | | - | +---------------+ | | +---------------+ | | - | | Application | | | | Application | | | - | +---------------+ | | +---------------+ | | - | ^ | | | ^ | | | - | | v | | | v | | Guests - | +---------------+ | | +---------------+ | | - | | logical ports | | | | logical ports | | | - | | 0 1 | | | | 0 1 | | | - +---+---------------+--+ +---+---------------+--+ _| - ^ : ^ : - | | | | - : v : v _ - +---+---------------+---------+---------------+--+ | - | | 0 1 | | 3 4 | | | - | | logical ports | | logical ports | | | - | +---------------+ +---------------+ | | - | ^ | ^ | | | Host - | | L-----------------+ v | | - | +--------------+ +--------------+ | | - | | phy ports | vSwitch | phy ports | | | - +---+--------------+----------+--------------+---+ _| - ^ ^ : : - | | | | - : : v v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - -.. 3.2.2.2.4 - -Physical port → VNF → vSwitch → VNF → physical port -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
code-block:: console - - _ - +----------------------+ +----------------------+ | - | Guest 1 | | Guest 2 | | - |+-------------------+ | | +-------------------+| | - || Application | | | | Application || | - |+-------------------+ | | +-------------------+| | - | ^ | | | ^ | | | Guests - | | v | | | v | | - |+-------------------+ | | +-------------------+| | - || logical ports | | | | logical ports || | - || 0 1 | | | | 0 1 || | - ++--------------------++ ++--------------------++ _| - ^ : ^ : - (PCI passthrough) | | (PCI passthrough) - | v : | _ - +--------++------------+-+------------++---------+ | - | | || 0 | | 1 || | | | - | | ||logical port| |logical port|| | | | - | | |+------------+ +------------+| | | | - | | | | ^ | | | | - | | | L-----------------+ | | | | - | | | | | | | Host - | | | vSwitch | | | | - | | +-----------------------------+ | | | - | | | | | - | | v | | - | +--------------+ +--------------+ | | - | | phy port/VF | | phy port/VF | | | - +-+--------------+--------------+--------------+-+ _| - ^ : - | | - : v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - -.. 3.2.2.2.5 - -Physical port → vSwitch → VNF -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
code-block:: console - - _ - +---------------------------------------------------+ | - | | | - | +-------------------------------------------+ | | - | | Application | | | - | +-------------------------------------------+ | | - | ^ | | - | | | | Guest - | : | | - | +---------------+ | | - | | logical port 0| | | - +---+---------------+-------------------------------+ _| - ^ - | - : _ - +---+---------------+------------------------------+ | - | | logical port 0| | | - | +---------------+ | | - | ^ | | - | | | | Host - | : | | - | +--------------+ | | - | | phy port | vSwitch | | - +---+--------------+------------ -------------- ---+ _| - ^ - | - : - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - -.. 3.2.2.2.6 - -VNF → vSwitch → physical port -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: console - - _ - +---------------------------------------------------+ | - | | | - | +-------------------------------------------+ | | - | | Application | | | - | +-------------------------------------------+ | | - | : | | - | | | | Guest - | v | | - | +---------------+ | | - | | logical port | | | - +-------------------------------+---------------+---+ _| - : - | - v _ - +------------------------------+---------------+---+ | - | | logical port | | | - | +---------------+ | | - | : | | - | | | | Host - | v | | - | +--------------+ | | - | vSwitch | phy port | | | - +-------------------------------+--------------+---+ _| - : - | - v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - -.. 3.2.2.2.7 - -VNF → vSwitch → VNF → vSwitch -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
code-block:: console - - _ - +-------------------------+ +-------------------------+ | - | Guest 1 | | Guest 2 | | - | +-----------------+ | | +-----------------+ | | - | | Application | | | | Application | | | - | +-----------------+ | | +-----------------+ | | - | : | | ^ | | - | | | | | | | Guest - | v | | : | | - | +---------------+ | | +---------------+ | | - | | logical port 0| | | | logical port 0| | | - +-----+---------------+---+ +---+---------------+-----+ _| - : ^ - | | - v : _ - +----+---------------+------------+---------------+-----+ | - | | port 0 | | port 1 | | | - | +---------------+ +---------------+ | | - | : ^ | | - | | | | | Host - | +--------------------+ | | - | | | - | vswitch | | - +-------------------------------------------------------+ _| - -.. 3.2.2.2.8 - -HOST 1(Physical port → virtual switch → VNF → virtual switch → Physical port) -→ HOST 2(Physical port → virtual switch → VNF → virtual switch → Physical port) - -HOST 1 (PVP) → HOST 2 (PVP) -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
code-block:: console - - _ - +----------------------+ +----------------------+ | - | Guest 1 | | Guest 2 | | - | +---------------+ | | +---------------+ | | - | | Application | | | | Application | | | - | +---------------+ | | +---------------+ | | - | ^ | | | ^ | | | - | | v | | | v | | Guests - | +---------------+ | | +---------------+ | | - | | logical ports | | | | logical ports | | | - | | 0 1 | | | | 0 1 | | | - +---+---------------+--+ +---+---------------+--+ _| - ^ : ^ : - | | | | - : v : v _ - +---+---------------+--+ +---+---------------+--+ | - | | 0 1 | | | | 3 4 | | | - | | logical ports | | | | logical ports | | | - | +---------------+ | | +---------------+ | | - | ^ | | | ^ | | | Hosts - | | v | | | v | | - | +--------------+ | | +--------------+ | | - | | phy ports | | | | phy ports | | | - +---+--------------+---+ +---+--------------+---+ _| - ^ : : : - | +-----------------+ | - : v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - - - -**Note:** For tests where the traffic generator and/or measurement -receiver are implemented on VM and connected to the virtual switch -through vNIC, the issues of shared resources and interactions between -the measurement devices and the device under test must be considered. - -**Note:** Some RFC 2889 tests require a full-mesh sending and receiving -pattern involving more than two ports. This possibility is illustrated in the -Physical port → vSwitch → VNF → vSwitch → VNF → vSwitch → physical port -diagram above (with 2 sending and 2 receiving ports, though all ports -could be used bi-directionally). - -**Note:** When Deployment Scenarios are used in RFC 2889 address learning -or cache capacity testing, an additional port from the vSwitch must be -connected to the test device. This port is used to listen for flooded -frames. - -.. 
3.2.2.3 - -General Methodology: --------------------------- -To establish the baseline performance of the virtual switch, tests would -initially be run with a simple workload in the VNF (the recommended -simple workload VNF would be `DPDK <http://www.dpdk.org/>`__'s testpmd -application forwarding packets in a VM or vloop\_vnf a simple kernel -module that forwards traffic between two network interfaces inside the -virtualized environment while bypassing the networking stack). -Subsequently, the tests would also be executed with a real Telco -workload running in the VNF, which would exercise the virtual switch in -the context of higher level Telco NFV use cases, and prove that its -underlying characteristics and behaviour can be measured and validated. -Suitable real Telco workload VNFs are yet to be identified. - -.. 3.2.2.3.1 - -Default Test Parameters -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The following list identifies the default parameters for suite of -tests: - -- Reference application: Simple forwarding or Open Source VNF. -- Frame size (bytes): 64, 128, 256, 512, 1024, 1280, 1518, 2K, 4k OR - Packet size based on use-case (e.g. RTP 64B, 256B) OR Mix of packet sizes as - maintained by the Functest project <https://wiki.opnfv.org/traffic_profile_management>. -- Reordering check: Tests should confirm that packets within a flow are - not reordered. -- Duplex: Unidirectional / Bidirectional. Default: Full duplex with - traffic transmitting in both directions, as network traffic generally - does not flow in a single direction. By default the data rate of - transmitted traffic should be the same in both directions, please - note that asymmetric traffic (e.g. downlink-heavy) tests will be - mentioned explicitly for the relevant test cases. -- Number of Flows: Default for non scalability tests is a single flow. - For scalability tests the goal is to test with maximum supported - flows but where possible will test up to 10 Million flows. 
Start with - a single flow and scale up. By default flows should be added - sequentially, tests that add flows simultaneously will explicitly - call out their flow addition behaviour. Packets are generated across - the flows uniformly with no burstiness. For multi-core tests should - consider the number of packet flows based on vSwitch/VNF multi-thread - implementation and behavior. - -- Traffic Types: UDP, SCTP, RTP, GTP and UDP traffic. -- Deployment scenarios are: -- Physical → virtual switch → physical. -- Physical → virtual switch → VNF → virtual switch → physical. -- Physical → virtual switch → VNF → virtual switch → VNF → virtual - switch → physical. -- Physical → VNF → virtual switch → VNF → physical. -- Physical → virtual switch → VNF. -- VNF → virtual switch → Physical. -- VNF → virtual switch → VNF. - -Tests MUST have these parameters unless otherwise stated. **Test cases -with non default parameters will be stated explicitly**. - -**Note**: For throughput tests unless stated otherwise, test -configurations should ensure that traffic traverses the installed flows -through the virtual switch, i.e. flows are installed and have an appropriate -time out that doesn't expire before packet transmission starts. - -.. 3.2.2.3.2 - -Flow Classification -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Virtual switches classify packets into flows by processing and matching -particular header fields in the packet/frame and/or the input port where -the packets/frames arrived. The vSwitch then carries out an action on -the group of packets that match the classification parameters. Thus a -flow is considered to be a sequence of packets that have a shared set of -header field values or have arrived on the same port and have the same -action applied to them. Performance results can vary based on the -parameters the vSwitch uses to match for a flow. 
The recommended flow -classification parameters for L3 vSwitch performance tests are: the -input port, the source IP address, the destination IP address and the -Ethernet protocol type field. It is essential to increase the flow -time-out time on a vSwitch before conducting any performance tests that -do not measure the flow set-up time. Normally the first packet of a -particular flow will install the flow in the vSwitch which adds an -additional latency, subsequent packets of the same flow are not subject -to this latency if the flow is already installed on the vSwitch. - -.. 3.2.2.3.3 - -Test Priority -~~~~~~~~~~~~~~~~~~~~~ - -Tests will be assigned a priority in order to determine which tests -should be implemented immediately and which tests implementations -can be deferred. - -Priority can be of following types: - Urgent: Must be implemented -immediately. - High: Must be implemented in the next release. - Medium: -May be implemented after the release. - Low: May or may not be -implemented at all. - -.. 3.2.2.3.4 - -SUT Setup -~~~~~~~~~~~~~~~~~~ - -The SUT should be configured to its "default" state. The -SUT's configuration or set-up must not change between tests in any way -other than what is required to do the test. All supported protocols must -be configured and enabled for each test set up. - -.. 3.2.2.3.5 - -Port Configuration -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The DUT should be configured with n ports where -n is a multiple of 2. Half of the ports on the DUT should be used as -ingress ports and the other half of the ports on the DUT should be used -as egress ports. Where a DUT has more than 2 ports, the ingress data -streams should be set-up so that they transmit packets to the egress -ports in sequence so that there is an even distribution of traffic -across ports. For example, if a DUT has 4 ports 0(ingress), 1(ingress), -2(egress) and 3(egress), the traffic stream directed at port 0 should -output a packet to port 2 followed by a packet to port 3. 
The traffic -stream directed at port 1 should also output a packet to port 2 followed -by a packet to port 3. - -.. 3.2.2.3.6 - -Frame Formats -~~~~~~~~~~~~~~~~~~~~~ - -**Frame formats Layer 2 (data link layer) protocols** - -- Ethernet II - -.. code-block:: console - - +---------------------------+-----------+ - | Ethernet Header | Payload | Check Sum | - +-----------------+---------+-----------+ - |_________________|_________|___________| - 14 Bytes 46 - 1500 4 Bytes - Bytes - - -**Layer 3 (network layer) protocols** - -- IPv4 - -.. code-block:: console - - +-----------------+-----------+---------+-----------+ - | Ethernet Header | IP Header | Payload | Checksum | - +-----------------+-----------+---------+-----------+ - |_________________|___________|_________|___________| - 14 Bytes 20 bytes 26 - 1480 4 Bytes - Bytes - -- IPv6 - -.. code-block:: console - - +-----------------+-----------+---------+-----------+ - | Ethernet Header | IP Header | Payload | Checksum | - +-----------------+-----------+---------+-----------+ - |_________________|___________|_________|___________| - 14 Bytes 40 bytes 26 - 1460 4 Bytes - Bytes - -**Layer 4 (transport layer) protocols** - - - TCP - - UDP - - SCTP - -.. code-block:: console - - +-----------------+-----------+-----------------+---------+-----------+ - | Ethernet Header | IP Header | Layer 4 Header | Payload | Checksum | - +-----------------+-----------+-----------------+---------+-----------+ - |_________________|___________|_________________|_________|___________| - 14 Bytes 40 bytes 20 Bytes 6 - 1460 4 Bytes - Bytes - - -**Layer 5 (application layer) protocols** - - - RTP - - GTP - -.. 
code-block:: console - - +-----------------+-----------+-----------------+---------+-----------+ - | Ethernet Header | IP Header | Layer 4 Header | Payload | Checksum | - +-----------------+-----------+-----------------+---------+-----------+ - |_________________|___________|_________________|_________|___________| - 14 Bytes 20 bytes 20 Bytes >= 6 Bytes 4 Bytes - -.. 3.2.2.3.7 - -Packet Throughput -~~~~~~~~~~~~~~~~~~~~~~~~~ -There is a difference between an Ethernet frame, -an IP packet, and a UDP datagram. In the seven-layer OSI model of -computer networking, packet refers to a data unit at layer 3 (network -layer). The correct term for a data unit at layer 2 (data link layer) is -a frame, and at layer 4 (transport layer) is a segment or datagram. - -Important concepts related to 10GbE performance are frame rate and -throughput. The MAC bit rate of 10GbE, defined in the IEEE standard 802 -.3ae, is 10 billion bits per second. Frame rate is based on the bit rate -and frame format definitions. Throughput, defined in IETF RFC 1242, is -the highest rate at which the system under test can forward the offered -load, without loss. - -The frame rate for 10GbE is determined by a formula that divides the 10 -billion bits per second by the preamble + frame length + inter-frame -gap. - -The maximum frame rate is calculated using the minimum values of the -following parameters, as described in the IEEE 802 .3ae standard: - -- Preamble: 8 bytes \* 8 = 64 bits -- Frame Length: 64 bytes (minimum) \* 8 = 512 bits -- Inter-frame Gap: 12 bytes (minimum) \* 8 = 96 bits - -Therefore, Maximum Frame Rate (64B Frames) -= MAC Transmit Bit Rate / (Preamble + Frame Length + Inter-frame Gap) -= 10,000,000,000 / (64 + 512 + 96) -= 10,000,000,000 / 672 -= 14,880,952.38 frame per second (fps) - -.. 
3.2.2.3.8 - -System isolation and validation -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -A key consideration when conducting any sort of benchmark is trying to -ensure the consistency and repeatability of test results between runs. -When benchmarking the performance of a virtual switch there are many -factors that can affect the consistency of results. This section -describes these factors and the measures that can be taken to limit -their effects. In addition, this section will outline some system tests -to validate the platform and the VNF before conducting any vSwitch -benchmarking tests. - -**System Isolation:** - -When conducting a benchmarking test on any SUT, it is essential to limit -(and if reasonable, eliminate) any noise that may interfere with the -accuracy of the metrics collected by the test. This noise may be -introduced by other hardware or software (OS, other applications), and -can result in significantly varying performance metrics being collected -between consecutive runs of the same test. In the case of characterizing -the performance of a virtual switch, there are a number of configuration -parameters that can help increase the repeatability and stability of -test results, including: - -- OS/GRUB configuration: - - - maxcpus = n where n >= 0; limits the kernel to using 'n' - processors. Only use exactly what you need. - - isolcpus: Isolate CPUs from the general scheduler. Isolate all - CPUs bar one which will be used by the OS. - - use taskset to affinitize the forwarding application and the VNFs - onto isolated cores. VNFs and the vSwitch should be allocated - their own cores, i.e. must not share the same cores. vCPUs for the - VNF should be affinitized to individual cores also. - - Limit the amount of background applications that are running and - set OS to boot to runlevel 3. Make sure to kill any unnecessary - system processes/daemons. 
- - Only enable hardware that you need to use for your test – to - ensure there are no other interrupts on the system. - - Configure NIC interrupts to only use the cores that are not - allocated to any other process (VNF/vSwitch). - -- NUMA configuration: Any unused sockets in a multi-socket system - should be disabled. -- CPU pinning: The vSwitch and the VNF should each be affinitized to - separate logical cores using a combination of maxcpus, isolcpus and - taskset. -- BIOS configuration: BIOS should be configured for performance where - an explicit option exists, sleep states should be disabled, any - virtualization optimization technologies should be enabled, and - hyperthreading should also be enabled, turbo boost and overclocking - should be disabled. - -**System Validation:** - -System validation is broken down into two sub-categories: Platform -validation and VNF validation. The validation test itself involves -verifying the forwarding capability and stability for the sub-system -under test. The rationale behind system validation is two fold. Firstly -to give a tester confidence in the stability of the platform or VNF that -is being tested; and secondly to provide base performance comparison -points to understand the overhead introduced by the virtual switch. - -* Benchmark platform forwarding capability: This is an OPTIONAL test - used to verify the platform and measure the base performance (maximum - forwarding rate in fps and latency) that can be achieved by the - platform without a vSwitch or a VNF. The following diagram outlines - the set-up for benchmarking Platform forwarding capability: - - .. 
code-block:: console - - __ - +--------------------------------------------------+ | - | +------------------------------------------+ | | - | | | | | - | | l2fw or DPDK L2FWD app | | Host - | | | | | - | +------------------------------------------+ | | - | | NIC | | | - +---+------------------------------------------+---+ __| - ^ : - | | - : v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - -* Benchmark VNF forwarding capability: This test is used to verify - the VNF and measure the base performance (maximum forwarding rate in - fps and latency) that can be achieved by the VNF without a vSwitch. - The performance metrics collected by this test will serve as a key - comparison point for NIC passthrough technologies and vSwitches. VNF - in this context refers to the hypervisor and the VM. The following - diagram outlines the set-up for benchmarking VNF forwarding - capability: - - .. code-block:: console - - __ - +--------------------------------------------------+ | - | +------------------------------------------+ | | - | | | | | - | | VNF | | | - | | | | | - | +------------------------------------------+ | | - | | Passthrough/SR-IOV | | Host - | +------------------------------------------+ | | - | | NIC | | | - +---+------------------------------------------+---+ __| - ^ : - | | - : v - +--------------------------------------------------+ - | | - | traffic generator | - | | - +--------------------------------------------------+ - - -**Methodology to benchmark Platform/VNF forwarding capability** - - -The recommended methodology for the platform/VNF validation and -benchmark is: - Run `RFC2889 <https://www.rfc-editor.org/rfc/rfc2289.txt>`__ -Maximum Forwarding Rate test, this test will produce maximum -forwarding rate and latency results that will serve as the -expected values. 
These expected values can be used in -subsequent steps or compared with in subsequent validation tests. - -Transmit bidirectional traffic at line rate/max forwarding rate -(whichever is higher) for at least 72 hours, measure throughput (fps) -and latency. - Note: Traffic should be bidirectional. - Establish a -baseline forwarding rate for what the platform can achieve. - Additional -validation: After the test has completed for 72 hours run bidirectional -traffic at the maximum forwarding rate once more to see if the system is -still functional and measure throughput (fps) and latency. Compare the -measure the new obtained values with the expected values. - -**NOTE 1**: How the Platform is configured for its forwarding capability -test (BIOS settings, GRUB configuration, runlevel...) is how the -platform should be configured for every test after this - -**NOTE 2**: How the VNF is configured for its forwarding capability test -(# of vCPUs, vNICs, Memory, affinitization…) is how it should be -configured for every test that uses a VNF after this. - -.. 3.2.2.4 - -RFCs for testing virtual switch performance --------------------------------------------------- - -The starting point for defining the suite of tests for benchmarking the -performance of a virtual switch is to take existing RFCs and standards -that were designed to test their physical counterparts and adapting them -for testing virtual switches. The rationale behind this is to establish -a fair comparison between the performance of virtual and physical -switches. This section outlines the RFCs that are used by this -specification. - -.. 3.2.2.4.1 - -RFC 1242 Benchmarking Terminology for Network Interconnection -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Devices RFC 1242 defines the terminology that is used in describing -performance benchmarking tests and their results. 
Definitions and
-discussions covered include: Back-to-back, bridge, bridge/router,
-constant load, data link frame size, frame loss rate, inter frame gap,
-latency, and many more.
-
-.. 3.2.2.4.2
-
-RFC 2544 Benchmarking Methodology for Network Interconnect Devices
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-RFC 2544 outlines a benchmarking methodology for network Interconnect
-Devices. The methodology results in performance metrics such as latency,
-frame loss percentage, and maximum data throughput.
-
-In this document network “throughput” (measured in millions of frames
-per second) is based on RFC 2544, unless otherwise noted. Frame size
-refers to Ethernet frames ranging from smallest frames of 64 bytes to
-largest frames of 9K bytes.
-
-Types of tests are:
-
-1. Throughput test defines the maximum number of frames per second
-   that can be transmitted without any error, or 0% loss ratio.
-   In some Throughput tests (and those tests with long duration),
-   evaluation of an additional frame loss ratio is suggested. The
-   current ratio (10^-7 %) is based on understanding the typical
-   user-to-user packet loss ratio needed for good application
-   performance and recognizing that a single transfer through a
-   vswitch must contribute a tiny fraction of user-to-user loss.
-   Further, the ratio 10^-7 % also recognizes practical limitations
-   when measuring loss ratio.
-
-2. Latency test measures the time required for a frame to travel from
-   the originating device through the network to the destination device.
-   Please note that RFC2544 Latency measurement will be superseded with
-   a measurement of average latency over all successfully transferred
-   packets or frames.
-
-3. Frame loss test measures the network’s
-   response in overload conditions - a critical indicator of the
-   network’s ability to support real-time applications in which a
-   large amount of frame loss will rapidly degrade service quality.
-
-4.
Burst test assesses the buffering capability of a virtual switch. It
-   measures the maximum number of frames received at full line rate
-   before a frame is lost. In carrier Ethernet networks, this
-   measurement validates the excess information rate (EIR) as defined in
-   many SLAs.
-
-5. System recovery to characterize speed of recovery from an overload
-   condition.
-
-6. Reset to characterize speed of recovery from device or software
-   reset. This type of test has been updated by `RFC6201
-   <https://www.rfc-editor.org/rfc/rfc6201.txt>`__ as such,
-   the methodology defined by this specification will be that of RFC 6201.
-
-Although not included in the defined RFC 2544 standard, another crucial
-measurement in Ethernet networking is packet delay variation. The
-definition set out by this specification comes from
-`RFC5481 <https://www.rfc-editor.org/rfc/rfc5481.txt>`__.
-
-.. 3.2.2.4.3
-
-RFC 2285 Benchmarking Terminology for LAN Switching Devices
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-RFC 2285 defines the terminology that is used to describe the
-terminology for benchmarking a LAN switching device. It extends RFC
-1242 and defines: DUTs, SUTs, Traffic orientation and distribution,
-bursts, loads, forwarding rates, etc.
-
-.. 3.2.2.4.4
-
-RFC 2889 Benchmarking Methodology for LAN Switching
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-RFC 2889 outlines a benchmarking methodology for LAN switching, it
-extends RFC 2544. The outlined methodology gathers performance
-metrics for forwarding, congestion control, latency, address handling
-and finally filtering.
-
-.. 3.2.2.4.5
-
-RFC 3918 Methodology for IP Multicast Benchmarking
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-RFC 3918 outlines a methodology for IP Multicast benchmarking.
-
-..
3.2.2.4.6
-
-RFC 4737 Packet Reordering Metrics
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-RFC 4737 describes metrics for identifying and counting re-ordered
-packets within a stream, and metrics to measure the extent each
-packet has been re-ordered.
-
-.. 3.2.2.4.7
-
-RFC 5481 Packet Delay Variation Applicability Statement
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-RFC 5481 defined two common, but different forms of delay variation
-metrics, and compares the metrics over a range of networking
-circumstances and tasks. The most suitable form for vSwitch
-benchmarking is the "PDV" form.
-
-.. 3.2.2.4.8
-
-RFC 6201 Device Reset Characterization
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-RFC 6201 extends the methodology for characterizing the speed of
-recovery of the DUT from device or software reset described in RFC
-2544.
-
-.. 3.2.2.5
-
-Details of the Test Report
----------------------------------
-
-There are a number of parameters related to the system, DUT and tests
-that can affect the repeatability of a test results and should be
-recorded. In order to minimise the variation in the results of a test,
-it is recommended that the test report includes the following information:
-
-- Hardware details including:
-
-  - Platform details.
-  - Processor details.
-  - Memory information (see below)
-  - Number of enabled cores.
-  - Number of cores used for the test.
-  - Number of physical NICs, as well as their details (manufacturer,
-    versions, type and the PCI slot they are plugged into).
-  - NIC interrupt configuration.
-  - BIOS version, release date and any configurations that were
-    modified.
-
-- Software details including:
-
-  - OS version (for host and VNF)
-  - Kernel version (for host and VNF)
-  - GRUB boot parameters (for host and VNF).
-  - Hypervisor details (Type and version).
-  - Selected vSwitch, version number or commit id used.
-  - vSwitch launch command line if it has been parameterised.
-  - Memory allocation to the vSwitch – which NUMA node it is using,
-    and how many memory channels.
-  - Where the vswitch is built from source: compiler details including
-    versions and the flags that were used to compile the vSwitch.
-  - DPDK or any other SW dependency version number or commit id used.
-  - Memory allocation to a VM - if it's from Hugpages/elsewhere.
-  - VM storage type: snapshot/independent persistent/independent
-    non-persistent.
-  - Number of VMs.
-  - Number of Virtual NICs (vNICs), versions, type and driver.
-  - Number of virtual CPUs and their core affinity on the host.
-  - Number vNIC interrupt configuration.
-  - Thread affinitization for the applications (including the vSwitch
-    itself) on the host.
-  - Details of Resource isolation, such as CPUs designated for
-    Host/Kernel (isolcpu) and CPUs designated for specific processes
-    (taskset).
-
-- Memory Details
-
-  - Total memory
-  - Type of memory
-  - Used memory
-  - Active memory
-  - Inactive memory
-  - Free memory
-  - Buffer memory
-  - Swap cache
-  - Total swap
-  - Used swap
-  - Free swap
-
-- Test duration.
-- Number of flows.
-- Traffic Information:
-
-  - Traffic type - UDP, TCP, IMIX / Other.
-  - Packet Sizes.
-
-- Deployment Scenario.
-
-**Note**: Tests that require additional parameters to be recorded will
-explicitly specify this.
-
 .. _TestIdentification:
-.. 3.2.3
-
 Test identification
 =========================
@@ -1139,7 +127,7 @@ can be achieved with a virtual switch. The list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.
-.. 3.2.3.1.1
+.. 3.2.2.1.1
 Test ID: LTD.Throughput.RFC2544.PacketLossRatio
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1189,7 +177,7 @@ Test ID: LTD.Throughput.RFC2544.PacketLossRatio
 - CPU and memory utilization may also be collected as part of this test, to determine the vSwitch's performance footprint on the system.
-.. 3.2.3.1.2
+..
3.2.2.1.2
 Test ID: LTD.Throughput.RFC2544.PacketLossRatioFrameModification
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1259,7 +247,7 @@ Test ID: LTD.Throughput.RFC2544.PacketLossRatioFrameModification
 - CPU and memory utilization may also be collected as part of this test, to determine the vSwitch's performance footprint on the system.
-.. 3.2.3.1.3
+.. 3.2.2.1.3
 Test ID: LTD.Throughput.RFC2544.Profile
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1314,7 +302,7 @@ Test ID: LTD.Throughput.RFC2544.Profile
 when the offered load is above Maximum Throughput MUST be recorded and reported with the results.
-.. 3.2.3.1.4
+.. 3.2.2.1.4
 Test ID: LTD.Throughput.RFC2544.SystemRecoveryTime
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1366,7 +354,7 @@ Test ID: LTD.Throughput.RFC2544.SystemRecoveryTime
 - Physical → virtual switch → physical.
-.. 3.2.3.1.5
+.. 3.2.2.1.5
 Test ID: LTD.Throughput.RFC2544.BackToBackFrames
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1409,7 +397,7 @@ Test ID: LTD.Throughput.RFC2544.BackToBackFrames
 - Physical → virtual switch → physical.
-.. 3.2.3.1.6
+.. 3.2.2.1.6
 Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoak
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1451,7 +439,7 @@ Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoak
 PDV form of delay variation on the traffic flow, using the 99th percentile.
-.. 3.2.3.1.7
+.. 3.2.2.1.7
 Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoakFrameModification
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1507,7 +495,7 @@ Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoakFrameModification
 PDV form of delay variation on the traffic flow, using the 99th percentile.
-.. 3.2.3.1.8
+.. 3.2.2.1.8
 Test ID: LTD.Throughput.RFC6201.ResetTime
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1598,7 +586,7 @@ Test ID: LTD.Throughput.RFC6201.ResetTime
 * Physical → virtual switch → physical.
-.. 3.2.3.1.9
+..
3.2.2.1.9
 Test ID: LTD.Throughput.RFC2889.MaxForwardingRate
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1660,7 +648,7 @@ Test ID: LTD.Throughput.RFC2889.MaxForwardingRate
 benchmarks, and scenarios with both 2 and 4 ports should be tested. In any case, the number of ports used must be reported.
-.. 3.2.3.1.10
+.. 3.2.2.1.10
 Test ID: LTD.Throughput.RFC2889.ForwardPressure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1695,7 +683,7 @@ Test ID: LTD.Throughput.RFC2889.ForwardPressure
 - Physical → virtual switch → physical.
-.. 3.2.3.1.11
+.. 3.2.2.1.11
 Test ID: LTD.Throughput.RFC2889.ErrorFramesFiltering
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1731,7 +719,7 @@ Test ID: LTD.Throughput.RFC2889.ErrorFramesFiltering
 - Physical → virtual switch → physical.
-.. 3.2.3.1.12
+.. 3.2.2.1.12
 Test ID: LTD.Throughput.RFC2889.BroadcastFrameForwarding
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1776,7 +764,7 @@ Test ID: LTD.Throughput.RFC2889.BroadcastFrameForwarding
 four test ports are required. One of the ports is connected to the test device, so it can send broadcast frames and listen for miss-routed frames.
-.. 3.2.3.1.13
+.. 3.2.2.1.13
 Test ID: LTD.Throughput.RFC2544.WorstN-BestN
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1868,7 +856,7 @@ Test ID: LTD.Throughput.RFC2544.WorstN-BestN
 - System bus (QPI, PCI, ...) utilization.
 - CPU cycles consumed per packet.
-.. 3.2.3.1.14
+.. 3.2.2.1.14
 Test ID: LTD.Throughput.Overlay.Network.<tech>.RFC2544.PacketLossRatio
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2070,7 +1058,7 @@ Test ID: LTD.Throughput.RFC2544.MatchAction.PacketLossRatio
 [http://openvswitch.org/support/dist-docs/ovs-ofctl.8.txt ]
-.. 3.2.3.2
+.. 3.2.2.2
 Packet Latency tests
 ---------------------------
@@ -2079,7 +1067,7 @@ delay variation for various packet types through the virtual switch.
The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.
-.. 3.2.3.2.1
+.. 3.2.2.2.1
 Test ID: LTD.PacketLatency.InitialPacketProcessingLatency
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2128,7 +1116,7 @@ Test ID: LTD.PacketLatency.InitialPacketProcessingLatency
 - Physical → Virtual Switch → Physical.
-.. 3.2.3.2.2
+.. 3.2.2.2.2
 Test ID: LTD.PacketDelayVariation.RFC3393.Soak
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2161,7 +1149,7 @@ Test ID: LTD.PacketDelayVariation.RFC3393.Soak
 - CPU and memory utilization may also be collected as part of this test, to determine the vSwitch's performance footprint on the system.
-.. 3.2.3.3
+.. 3.2.2.3
 Scalability tests
 ------------------------
@@ -2170,7 +1158,7 @@ table size and flow lookups on throughput. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.
-.. 3.2.3.3.1
+.. 3.2.2.3.1
 Test ID: LTD.Scalability.Flows.RFC2544.0PacketLoss
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2225,7 +1213,7 @@ Test ID: LTD.Scalability.Flows.RFC2544.0PacketLoss
 specified number of flows and the specified frame size, with zero packet loss.
-.. 3.2.3.3.2
+.. 3.2.2.3.2
 Test ID: LTD.MemoryBandwidth.RFC2544.0PacketLoss.Scalability
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2274,7 +1262,7 @@ Test ID: LTD.MemoryBandwidth.RFC2544.0PacketLoss.Scalability
 - The DUT's 0% packet loss throughput in the presence of cache sharing and memory bandwidth between processes.
-.. 3.2.3.3.3
+.. 3.2.2.3.3
 Test ID: LTD.Scalability.VNF.RFC2544.PacketLossRatio
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2355,7 +1343,7 @@ Test ID: LTD.Scalability.VNF.RFC2544.PacketLossRatio
 - CPU and memory utilization may also be collected as part of this test, to determine the vSwitch's performance footprint on the system.
-.. 3.2.3.3.4
+..
3.2.2.3.4
 Test ID: LTD.Scalability.VNF.RFC2544.PacketLossProfile
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2433,14 +1421,14 @@ Test ID: LTD.Scalability.VNF.RFC2544.PacketLossProfile
 when the offered load is above Maximum Throughput MUST be recorded and reported with the results.
-.. 3.2.3.4
+.. 3.2.2.4
 Activation tests
 ----------------
 The general aim of these tests is to understand the capacity of the and speed with which the vswitch can accommodate new flows.
-.. 3.2.3.4.1
+.. 3.2.2.4.1
 Test ID: LTD.Activation.RFC2889.AddressCachingCapacity
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2492,7 +1480,7 @@ Test ID: LTD.Activation.RFC2889.AddressCachingCapacity
 - Physical → virtual switch → 2 x physical (one receiving, one listening).
-.. 3.2.3.4.2
+.. 3.2.2.4.2
 Test ID: LTD.Activation.RFC2889.AddressLearningRate
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2529,7 +1517,7 @@ Test ID: LTD.Activation.RFC2889.AddressLearningRate
 - Physical → virtual switch → 2 x physical (one receiving, one listening).
-.. 3.2.3.5
+.. 3.2.2.5
 Coupling between control path and datapath Tests
 -------------------------------------------------------
@@ -2538,7 +1526,7 @@ and the control path are within a virtual switch. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.
-.. 3.2.3.5.1
+.. 3.2.2.5.1
 Test ID: LTD.CPDPCouplingFlowAddition
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2586,7 +1574,7 @@ Test ID: LTD.CPDPCouplingFlowAddition
 - Physical → virtual switch → physical.
-.. 3.2.3.6
+.. 3.2.2.6
 CPU and memory consumption
 ---------------------------------
@@ -2595,7 +1583,7 @@ utilization under various loads and circumstances. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.
-.. 3.2.3.6.1
+..
3.2.2.6.1
 Test ID: LTD.Stress.RFC2544.0PacketLoss
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2633,7 +1621,7 @@ Test ID: LTD.Stress.RFC2544.0PacketLoss
 component being stressed, when reporting the results: LTD.CPU.RFC2544.0PacketLoss or LTD.Memory.RFC2544.0PacketLoss
-.. 3.2.3.7
+.. 3.2.2.7
 Summary List of Tests
 ----------------------------