From 87a527a3a62d486e950dbc7a358a4678c0e18b18 Mon Sep 17 00:00:00 2001
From: Maryam Tahhan
Date: Tue, 9 Aug 2016 14:27:37 +0100
Subject: docs: update collectd information

Update collectd description and data model as well as the information
about the available collectd plugins and plugins that are a work in
progress.

Change-Id: I54d1945a9cfdd1511ca34c2c7e542b4c24b561e6
JIRA: DOCS-106
Signed-off-by: Maryam Tahhan
---
 ...easuring_telco_traffic_and_performance_KPIs.rst | 156 ++++++++++++++++++---
 1 file changed, 140 insertions(+), 16 deletions(-)

(limited to 'docs/requirements')

diff --git a/docs/requirements/02-measuring_telco_traffic_and_performance_KPIs.rst b/docs/requirements/02-measuring_telco_traffic_and_performance_KPIs.rst
index 511128c0..7f0d4861 100644
--- a/docs/requirements/02-measuring_telco_traffic_and_performance_KPIs.rst
+++ b/docs/requirements/02-measuring_telco_traffic_and_performance_KPIs.rst
@@ -4,7 +4,7 @@
 Measuring Telco Traffic and Performance KPIs
 ============================================
 
-This section will look at what SFQM has done to enable Measuring Telco Traffic
+This section will discuss the SFQM features that enable Measuring Telco Traffic
 and Performance KPIs.
 
 .. Figure:: stats_and_timestamps.png
@@ -28,30 +28,146 @@ and Performance KPIs.
   * For DPDK 2.2 the API was implemented for igb, i40e and all the Virtual
     Functions (VFs) for all drivers.
 
+  * For DPDK 16.07 the API migrated from using string value pairs to using id
+    value pairs, improving the overall performance of the API.
+
 Monitoring DPDK interfaces
 ===========================
 With the features SFQM enabled in DPDK to enable measuring Telco traffic and
 performance KPIs, we can now retrieve NIC statistics including error stats and
 relay them to a DPDK user.
 The next step is to enable monitoring of the DPDK
-interfaces based on the stats that we are retrieving from the NICs, and relay
+interfaces based on the stats that we are retrieving from the NICs, by relaying
 the information to a higher level Fault Management entity. To enable this SFQM
 has been enabling a number of plugins for collectd.
 
-collectd is is a daemon which collects system performance statistics periodically
-and provides mechanisms to store the values in a variety of ways. It supports
-more than 90 different plugins to retrieve platform information, such as CPU
-utilization, and is capable of publishing/writing the information is gathers to
-a number of endpoints through its write plugins.
+collectd
+---------
+collectd is a daemon which collects system performance statistics periodically
+and provides a variety of mechanisms to publish the collected metrics. It
+supports more than 90 different input and output plugins. Input plugins retrieve
+metrics and publish them to the collectd daemon, while output plugins publish
+the data they receive to an endpoint. collectd also has infrastructure to
+support thresholding and notification.
+
+Statistics and Notifications
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Within collectd, notifications and performance data are dispatched in the same
+way. There are producer plugins (plugins that create notifications/metrics),
+and consumer plugins (plugins that receive notifications/metrics and do
+something with them).
+
+Statistics in collectd consist of a value list. A value list includes:
+
+* Values, each of which can be one of:
+
+  * Derive: used for values where the change in the value since it was last
+    read is of interest. Can be used to calculate and store a rate.
+
+  * Counter: similar to derive values, but takes the possibility of a counter
+    wrap around into consideration.
+
+  * Gauge: used for values that are stored as is.
+
+  * Absolute: used for counters that are reset after reading.
+
+* Value length: the number of values in the data set.
+
+* Time: timestamp at which the value was collected.
+
+* Interval: interval at which to expect a new value.
+
+* Host: used to identify the host.
+
+* Plugin: used to identify the plugin.
+
+* Plugin instance (optional): used to group a set of values together, for
+  example values belonging to a DPDK interface.
+
+* Type: unit used to measure a value. In other words, it refers to a data
+  set.
+
+* Type instance (optional): used to distinguish between values that have an
+  identical type.
+
+* Meta data: an opaque data structure that enables the passing of additional
+  information about a value list. "Meta data in the global cache can be used to
+  store arbitrary information about an identifier" [7].
+
+Host, plugin, plugin instance, type and type instance uniquely identify a
+collectd value.
+
+Value lists are often accompanied by data sets that describe the values in more
+detail. Data sets consist of:
+
+* A type: a name which uniquely identifies a data set.
+
+* One or more data sources (entries in a data set) which include:
+
+  * The name of the data source. If there is only a single data source this is
+    set to "value".
+
+  * The type of the data source, one of: counter, gauge, absolute or derive.
 
-SFQM has been enabling two collectd plugins to collect DPDK NIC statistics and
-push the stats to Ceilometer:
+  * A min and a max value.
 
-* dpdkstat plugin: A read plugin that retrieve stats from the DPDK extended NIC
-  stats API.
-* ceilometer plugin: A write plugin that pushes the retrieved stats to
+Types in collectd are defined in types.db. Examples of types in types.db:
+
+.. code-block:: console
+
+    bitrate    value:GAUGE:0:4294967295
+    counter    value:COUNTER:U:U
+    if_octets  rx:COUNTER:0:4294967295, tx:COUNTER:0:4294967295
+
+In the example above, if_octets has two data sources: rx and tx.
+
+Notifications in collectd are generic messages containing:
+
+* An associated severity, which can be one of OKAY, WARNING, or FAILURE.
+
+* A time.
+
+* A message.
+
+* A host.
+
+* A plugin.
+
+* A plugin instance (optional).
+
+* A type.
+
+* A type instance (optional).
+
+* Meta data.
+
+collectd plugins
+----------------
+SFQM has enabled three collectd plugins to date:
+
+* `dpdkstat plugin`_: A read plugin that retrieves stats from the DPDK extended
+  NIC stats API.
+
+* `ceilometer plugin`_: A write plugin that pushes the retrieved stats to
   Ceilometer. It's capable of pushing any stats read through collectd to
   Ceilometer, not just the DPDK stats.
 
+* `hugepages plugin`_: A read plugin that retrieves the number of available
+  and free hugepages on a platform as well as what is available in terms of
+  hugepages per socket.
+
+Other plugins in progress:
+
+* dpdkevents: A read plugin that retrieves DPDK link status and DPDK
+  forwarding cores liveness status (DPDK Keep Alive).
+
+* Open vSwitch stats Plugin: A read plugin that retrieves flow and interface
+  stats from OVS.
+
+* Open vSwitch events Plugin: A read plugin that retrieves events from OVS.
+
+
+Monitoring Interfaces and Openstack Support
+-------------------------------------------
 .. Figure:: monitoring_interfaces.png
 
    Monitoring Interfaces and Openstack Support
@@ -63,9 +179,17 @@ and publishing the retrieved stats to Ceilometer through the ceilometer plugin.
 
 To see this demo in action please checkout: `SFQM OPNFV Summit demo`_
 
-Future enahancements to the DPDK stats plugin include:
-
-* Integration of DPDK Keep Alive functionality.
-* Implementation of the ability to retrieve link status.
+References
+----------
+[1] https://collectd.org/wiki/index.php/Naming_schema
+[2] https://github.com/collectd/collectd/blob/master/src/daemon/plugin.h
+[3] https://collectd.org/wiki/index.php/Value_list_t
+[4] https://collectd.org/wiki/index.php/Data_set
+[5] https://collectd.org/documentation/manpages/types.db.5.shtml
+[6] https://collectd.org/wiki/index.php/Data_source
+[7] https://collectd.org/wiki/index.php/Meta_Data_Interface
 
 .. _SFQM OPNFV Summit demo: https://prezi.com/kjv6o8ixs6se/software-fastpath-service-quality-metrics-demo/
+.. _dpdkstat plugin: https://github.com/maryamtahhan/collectd-with-DPDK/tree/dpdkstat
+.. _ceilometer plugin: https://github.com/openstack/collectd-ceilometer-plugin/tree/stable/mitaka
+.. _hugepages plugin: https://github.com/maryamtahhan/collectd-with-DPDK/tree/hugepages
--
cgit 1.2.3-korg
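
Note for readers of this patch (not part of the commit): the types.db entries and the identifier fields described in the added text can be illustrated with a short Python sketch. It assumes the types.db layout shown in the patch (a type name followed by ``name:TYPE:min:max`` data sources, with ``U`` meaning unbounded) and the ``host/plugin-plugin_instance/type-type_instance`` form from the naming schema in reference [1]. The names ``DataSource``, ``parse_types_db_line`` and ``format_identifier`` are hypothetical helpers, and ``node1`` and ``port0`` are made-up values; none of this is collectd's own API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DataSource:
    """One entry of a data set (hypothetical helper type, not collectd API)."""
    name: str
    ds_type: str               # COUNTER, GAUGE, DERIVE or ABSOLUTE
    minimum: Optional[float]   # None stands for "U" (unbounded)
    maximum: Optional[float]


def parse_bound(text: str) -> Optional[float]:
    """Convert a min/max field, treating "U" as "no bound"."""
    return None if text.upper() == "U" else float(text)


def parse_types_db_line(line: str) -> Tuple[str, List[DataSource]]:
    """Split one types.db line into its type name and data sources."""
    type_name, spec = line.split(None, 1)
    sources = []
    # Data sources may be separated by commas and/or whitespace.
    for field in spec.replace(",", " ").split():
        ds_name, ds_type, ds_min, ds_max = field.split(":")
        sources.append(
            DataSource(ds_name, ds_type.upper(),
                       parse_bound(ds_min), parse_bound(ds_max))
        )
    return type_name, sources


def format_identifier(host: str, plugin: str, type_name: str,
                      plugin_instance: str = "",
                      type_instance: str = "") -> str:
    """Join the five identifying fields; the optional ones are omitted if empty."""
    plugin_part = f"{plugin}-{plugin_instance}" if plugin_instance else plugin
    type_part = f"{type_name}-{type_instance}" if type_instance else type_name
    return f"{host}/{plugin_part}/{type_part}"


if __name__ == "__main__":
    name, sources = parse_types_db_line(
        "if_octets rx:COUNTER:0:4294967295, tx:COUNTER:0:4294967295"
    )
    print(name, [s.name for s in sources])
    # "node1" and "port0" are illustrative values only.
    print(format_identifier("node1", "dpdkstat", name, plugin_instance="port0"))
```

Representing ``U`` as ``None`` keeps the bounds numeric when present while leaving "unbounded" explicit, which is one possible design choice among several.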