Diffstat (limited to 'docs/release')
-rw-r--r-- | docs/release/configguide/featureconfig.rst | 61
-rw-r--r-- | docs/release/configguide/postinstall.rst | 123
-rw-r--r-- | docs/release/release-notes/release-notes.rst | 90
-rw-r--r-- | docs/release/scenarios/index.rst | 4
-rw-r--r-- | docs/release/scenarios/os-nosdn-bar-ha/scenario.description.rst | 61
-rw-r--r-- | docs/release/scenarios/os-nosdn-bar-noha/scenario.description.rst | 61
-rw-r--r-- | docs/release/userguide/feature.userguide.rst | 318
7 files changed, 445 insertions, 273 deletions
diff --git a/docs/release/configguide/featureconfig.rst b/docs/release/configguide/featureconfig.rst
index f7f7ec5e..52546178 100644
--- a/docs/release/configguide/featureconfig.rst
+++ b/docs/release/configguide/featureconfig.rst
@@ -1,14 +1,12 @@
 .. This work is licensed under a Creative Commons Attribution 4.0 International License.
 .. http://creativecommons.org/licenses/by/4.0
 
-========================
-Barometer Configuration
-========================
-This document provides guidelines on how to install and configure the Barometer
-plugin when using Fuel as a deployment tool. The plugin name is: Collectd
-Ceilometer Plugin. This plugin installs collectd on a compute node and enables
-a number of collectd plugins to collect metrics and events from the platform
-and send them to ceilometer.
+=============================
+Barometer Configuration Guide
+=============================
+This document provides guidelines on how to install and configure Barometer with Apex.
+The deployment script installs and enables a series of collectd plugins on the compute node(s),
+which collect and dispatch specific metrics and events from the platform.
 
 .. contents::
    :depth: 3
@@ -16,45 +14,28 @@ and send them to ceilometer.
 
 Pre-configuration activities
 ----------------------------
-The Barometer Fuel plugin can be found in /opt/opnfv on the fuel master.
-To enable this plugin:
+Deploying the Barometer components in Apex is done through the ``opnfv-deploy`` command by selecting
+a scenario file which contains the ``barometer: true`` option. These files are located on the
+Jump Host in the ``/etc/opnfv-apex/`` folder. Two scenarios are pre-defined to include Barometer:
+``os-nosdn-bar-ha.yaml`` and ``os-nosdn-bar-noha.yaml``.
 
 .. code:: bash
 
-    $ cd /opt/opnfv
-    $ fuel plugins --install fuel-plugin-collectd-ceilometer-1.0-1.0.0-1.noarch.rpm
-
-On the Fuel UI, create a new environment.
-* In Settings > OpenStack Services
-* Enable "Install Ceilometer and Aodh"
-* In Settings > Other
-* Enable "Deploy Collectd Ceilometer Plugin"
-* Enable the barometer plugins you'd like to deploy using the checkboxes
-* Continue with environment configuration and deployment as normal.
+    $ cd /etc/opnfv-apex
+    $ opnfv-deploy -d os-nosdn-bar-ha.yaml -n network_settings.yaml -i inventory.yaml --debug
 
 Hardware configuration
 ----------------------
-There's no specific Hardware configuration required for this the barometer fuel plugin.
+No specific hardware configuration is required. However, the ``intel_rdt`` plugin works
+only on platforms with Intel CPUs.
 
 Feature configuration
 ---------------------
-Describe the procedures to configure your feature on the platform in order
-that it is ready to use according to the feature instructions in the platform
-user guide. Where applicable you should add content in the postinstall.rst
-to validate the feature is configured for use.
-(checking components are installed correctly etc...)
-
-Upgrading the plugin
---------------------
-
-From time to time new versions of the plugin may become available.
-
-The plugin cannot be upgraded if an active environment is using the plugin.
-
-In order to upgrade the plugin:
-
-* Copy the updated plugin file to the fuel-master.
-* On the Fuel UI, reset the environment.
-* On the Fuel CLI "fuel plugins --update <fuel-plugin-file>"
-* On the Fuel UI, re-deploy the environment.
+All Barometer plugins are automatically deployed on all compute nodes. There is no option to
+selectively install only a subset of plugins. Any custom disabling or configuration must be done
+directly on the compute node(s) after the deployment is completed.
+Upgrading the plugins
+---------------------
+The Barometer components are built into the Apex ISO image and, correspondingly, the Apex RPMs. There
+is no simple way to update only the Barometer plugins in an existing deployment.
diff --git a/docs/release/configguide/postinstall.rst b/docs/release/configguide/postinstall.rst
index 5ebdc031..45a79ffb 100644
--- a/docs/release/configguide/postinstall.rst
+++ b/docs/release/configguide/postinstall.rst
@@ -1,81 +1,100 @@
 .. This work is licensed under a Creative Commons Attribution 4.0 International License.
 .. http://creativecommons.org/licenses/by/4.0
 
+======================================
 Barometer post installation procedures
 ======================================
-Add a brief introduction to the methods of validating the installation
-according to this specific installer or feature.
+This document briefly describes the methods for validating the Barometer installation.
 
 Automated post installation activities
 --------------------------------------
-Describe specific post installation activities performed by the OPNFV
-deployment pipeline including testing activities and reports. Refer to
-the relevant testing guides, results, and release notes.
-
-note: this section should be singular and derived from the test projects
-once we have one test suite to run for all deploy tools. This is not the
-case yet so each deploy tool will need to provide (hopefully very simillar)
-documentation of this.
+The Barometer test-suite in Functest is called ``barometercollectd`` and is part of the ``Features``
+tier. These tests are run automatically by the OPNFV deployment pipeline on the supported
+scenarios. The testing consists of basic verifications that each plugin is functional with its
+default configuration. Inside the Functest container, the detailed results can be found in
+``/home/opnfv/functest/results/barometercollectd.log``.
 
 Barometer post configuration procedures
---------------------------------------
-The fuel plugin installs collectd and its plugins on compute nodes.
-separate config files for each of the collectd plugins. These
-configuration files can be found on the compute node @
-`/etc/collectd/collectd.conf.d/` directory. Each collectd plugin will
-have its own configuration file with a default configuration for each
-plugin. You can override any of the plugin configurations, by modifying
-the configuration file and restarting the collectd service on the compute node.
+---------------------------------------
+The functionality of each plugin (such as enabling/disabling and configuring its capabilities)
+is controlled as described in the User Guide through its individual ``.conf`` file located in
+the ``/etc/collectd/collectd.conf.d/`` folder on the compute node(s). In order for any changes to
+take effect, the collectd service must be stopped and then started again.
 
 Platform components validation
----------------------------------
-1. SSH to a compute node and ensure that the collectd service is running.
+------------------------------
+The following steps describe how to perform simple manual testing of the Barometer components:
+
+1. Connect to any compute node and ensure that the collectd service is running. The log file
+   ``collectd.log`` should contain no errors and should indicate that each plugin was successfully
+   loaded. For example, from the Jump Host:
+
+   .. code:: bash
+
+       $ opnfv-util overcloud compute0
+       $ ls /etc/collectd/collectd.conf.d/
+       $ systemctl status collectd
+       $ vi /opt/stack/collectd.log
 
-2. On the compute node, you need to inject a corrected memory error:
+   The following plugins should be found loaded:
+   aodh, gnocchi, hugepages, intel_rdt, mcelog, ovs_events, ovs_stats, snmp, virt
 
-.. code:: bash
+2. On the compute node, induce an event monitored by the plugins; e.g. a corrected memory error:
 
-    $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
-    $ cd mce-inject
-    $ make
-    $ modprobe mce-inject
+   .. code:: bash
 
-Modify the test/corrected script to include the following:
+       $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
+       $ cd mce-inject
+       $ make
+       $ modprobe mce-inject
 
-.. code:: bash
+   Modify the test/corrected script to include the following:
 
-    CPU 0 BANK 0
-    STATUS 0xcc00008000010090
-    ADDR 0x0010FFFFFFF
+   .. code:: bash
 
-Inject the error:
+       CPU 0 BANK 0
+       STATUS 0xcc00008000010090
+       ADDR 0x0010FFFFFFF
 
-.. code:: bash
+   Inject the error:
 
-    $ ./mce-inject < test/corrected
+   .. code:: bash
 
-3. SSH to openstack controller node and query the ceilometer DB:
+       $ ./mce-inject < test/corrected
 
-.. code:: bash
+3. Connect to the controller and query the monitoring services. Make sure the overcloudrc.v3
+   file has been copied to the controller (from the undercloud VM or from the Jump Host) in order
+   to be able to authenticate for OpenStack services.
 
-    $ source openrc
-    $ ceilometer sample-list -m interface.if_packets
-    $ ceilometer sample-list -m hugepages.vmpage_number
-    $ ceilometer sample-list -m ovs_events.gauge
-    $ ceilometer sample-list -m mcelog.errors
+   .. code:: bash
 
-As you run each command above, you should see output similar to the examples below:
+       $ opnfv-util overcloud controller0
+       $ su
+       $ source overcloudrc.v3
+       $ gnocchi metric list
+       $ aodh alarm list
 
-.. code:: bash
-    | node-6.domain.tld-br-prv-link_status | ovs_events.gauge | gauge | 1.0 | None | 2017-01-20T18:18:40 |
-    | node-6.domain.tld-int-br-prv-link_status | ovs_events.gauge | gauge | 1.0 | None | 2017-01-20T18:18:39 |
-    | node-6.domain.tld-br-int-link_status | ovs_events.gauge | gauge | 0.0 | None | 2017-01-20T18:18:39 |
+   The output for the gnocchi and aodh queries should be similar to the excerpts below:
 
-    | node-6.domain.tld-mm-2048Kb-free | hugepages.vmpage_number | gauge | 0.0 | None | 2017-01-20T18:17:12 |
-    | node-6.domain.tld-mm-2048Kb-used | hugepages.vmpage_number | gauge | 0.0 | None | 2017-01-20T18:17:12 |
-    +-------------------------------------+-------------------------+-------+--------+------+---------------------+
+   .. code:: bash
 
-    | bf05daca-df41-11e6-b097-5254006ed58e | node-6.domain.tld-SOCKET_0_CHANNEL_0_DIMM_any-uncorrected_memory_errors_in_24h | mcelog.errors | gauge | 0.0 | None | 2017-01-20T18:53:34 |
-    | bf05dacb-df41-11e6-b097-5254006ed58e | node-6.domain.tld-SOCKET_0_CHANNEL_any_DIMM_any-uncorrected_memory_errors_in_24h | mcelog.errors | gauge | 0.0 | None | 2017-01-20T18:53:34 |
-    | bdcb930d-df41-11e6-b097-5254006ed58e | node-6.domain.tld-SOCKET_0_CHANNEL_any_DIMM_any-uncorrected_memory_errors | mcelog.errors | gauge | 0.0 | None | 2017-01-20T18:53:33 |
+       +--------------------------------------+---------------------+------------------------------------------------------------------------------------------------------------+-----------+-------------+
+       | id | archive_policy/name | name | unit | resource_id |
+       +--------------------------------------+---------------------+------------------------------------------------------------------------------------------------------------+-----------+-------------+
+       [...]
+       | 0550d7c1-384f-4129-83bc-03321b6ba157 | high | overcloud-novacompute-0.jf.intel.com-hugepages-mm-2048Kb@vmpage_number.free | Pages | None |
+       | 0cf9f871-0473-4059-9497-1fea96e5d83a | high | overcloud-novacompute-0.jf.intel.com-hugepages-node0-2048Kb@vmpage_number.free | Pages | None |
+       | 0d56472e-99d2-4a64-8652-81b990cd177a | high | overcloud-novacompute-0.jf.intel.com-hugepages-node1-1048576Kb@vmpage_number.used | Pages | None |
+       | 0ed71a49-6913-4e57-a475-d30ca2e8c3d2 | high | overcloud-novacompute-0.jf.intel.com-hugepages-mm-1048576Kb@vmpage_number.used | Pages | None |
+       | 11c7be53-b2c1-4c0e-bad7-3152d82c6503 | high | overcloud-novacompute-0.jf.intel.com-mcelog- | None | None |
+       | | | SOCKET_0_CHANNEL_any_DIMM_any@errors.uncorrected_memory_errors_in_24h | | |
+       | 120752d4-385e-4153-aed8-458598a2a0e0 | high | overcloud-novacompute-0.jf.intel.com-cpu-24@cpu.interrupt | jiffies | None |
+       | 1213161e-472e-4e1b-9e56-5c6ad1647c69 | high | overcloud-novacompute-0.jf.intel.com-cpu-6@cpu.softirq | jiffies | None |
+       [...]
+       +--------------------------------------+-------+------------------------------------------------------------------+-------+----------+---------+
+       | alarm_id | type | name | state | severity | enabled |
+       +--------------------------------------+-------+------------------------------------------------------------------+-------+----------+---------+
+       | fbd06539-45dd-42c5-a991-5c5dbf679730 | event | gauge.memory_erros(overcloud-novacompute-0.jf.intel.com-mcelog) | ok | moderate | True |
+       | d73251a5-1c4e-4f16-bd3d-377dd1e8cdbe | event | gauge.mcelog_status(overcloud-novacompute-0.jf.intel.com-mcelog) | ok | moderate | True |
+       [...]
diff --git a/docs/release/release-notes/release-notes.rst b/docs/release/release-notes/release-notes.rst
index 3837b1e7..73ba36d2 100644
--- a/docs/release/release-notes/release-notes.rst
+++ b/docs/release/release-notes/release-notes.rst
@@ -2,10 +2,10 @@
 ..
http://creativecommons.org/licenses/by/4.0
 
 ======================================================================
-OPNFV Barometer Release Notes
+Barometer Release Notes
 ======================================================================
 
-This document provides the release notes for Danube Release of Barometer.
+This document provides the release notes for the Euphrates release of Barometer.
 
 .. contents::
    :depth: 3
@@ -19,7 +19,10 @@ Version history
 | **Date**           | **Ver.**           | **Author**         | **Comment**        |
 |                    |                    |                    |                    |
 +--------------------+--------------------+--------------------+--------------------+
-| 2017-02-16         | 0.1.0              | Maryam Tahhan      | First draft        |
+| 2017-08-25         | 0.1.0              | Shobhi Jain        | First draft        |
+|                    |                    |                    |                    |
++--------------------+--------------------+--------------------+--------------------+
+|                    |                    |                    |                    |
 |                    |                    |                    |                    |
 +--------------------+--------------------+--------------------+--------------------+
 
@@ -31,26 +34,26 @@ Summary
 ------------
 The Barometer@OPNFV project adds a platform telemetry agent to compute nodes
 that is capable of retrieving platform statistics and events, and relaying them
-to Openstack ceilometer. The telemetry agent currently supported by Barometer
-is collectd. Some additional collectd plugin were developed to add functionality
-to retrieve statistics or events for:
+to OpenStack Gnocchi and Aodh. The telemetry agent currently supported by Barometer
+is collectd. Some additional collectd plugins and an application were developed to add
+functionality to retrieve statistics or events for:
+
+Write Plugins: aodh plugin, SNMP agent plugin, gnocchi plugin.
 
-- Hugepages
-- mcelog memory machine check exceptions
-- Open vSwitch events
-- Ceilometer
+Read Plugins/application: Intel RDT plugin, virt plugin, Open vSwitch stats plugin,
+Open vSwitch PMD stats application.
 
 Release Data
 ---------------
 
 +--------------------------------------+--------------------------------------+
-| **Project**                          | Danube/barometer/barometer@opnfv     |
+| **Project**                          | Euphrates/barometer/barometer@opnfv  |
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 | **Repo/commit-ID**                   | barometer/                           |
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
-| **Release designation**              | Danube 1.0                           |
+| **Release designation**              | Euphrates 1.0                        |
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 | **Release date**                     |                                      |
@@ -70,7 +73,7 @@ Module version changes
 Document version changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-- The Barometer@OPNFV installation guide version has changed from version 0.1 to to 0.2
+
 
 Reason for version
 ^^^^^^^^^^^^^^^^^^^^
@@ -83,16 +86,7 @@ Feature additions
 | **JIRA REFERENCE**                   | **SLOGAN**                           |
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
-| BAROMETER-38                         | RAS Collectd Plugin                  |
-|                                      |                                      |
-+--------------------------------------+--------------------------------------+
-| BAROMETER-41                         | OVS Collectd Plugin                  |
-|                                      |                                      |
-+--------------------------------------+--------------------------------------+
-| BAROMETER-43                         | Fuel Plugin for D Release            |
-|                                      |                                      |
-+--------------------------------------+--------------------------------------+
-| BAROMETER-48                         | Hugepages Plugin for Collectd        |
+| BAROMETER-78                         | Barometer + Doctor Collaboration     |
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 |                                      |                                      |
@@ -124,19 +118,6 @@ Software deliverables
 
 Features to Date
 ~~~~~~~~~~~~~~~~
-This section provides a summary of the features implemented to date and their
-relevant upstream projects.
-
-.. Figure:: Features_to_date1.png
-
-   Barometer features to date
-
-.. Figure:: Features_to_date2.png
-
-   Barometer features to date cont.
-
-Please note the timeline denotes DPDK releases.
-
 Release B
 ~~~~~~~~~~
 The features implemented for OPNFV release B (as part of SFQM) in DPDK include:
@@ -159,6 +140,18 @@ The features implemented for OPNFV release C (as part of SFQM) include:
   collectd to ceilometer.
 * Fuel plugin support for the collectd ceilometer plugin for OPNFV.
 
+Release D
+~~~~~~~~~
+The features implemented for OPNFV release D include:
+
+* collectd hugepages plugin that can retrieve the number of available and free hugepages
+  on a platform as well as what is available in terms of hugepages per socket.
+* collectd Open vSwitch Events plugin that can retrieve events from OVS.
+* collectd mcelog plugin that can use mcelog client protocol to check for memory Machine
+  Check Exceptions and sends the stats for reported exceptions.
+* collectd ceilometer plugin that can publish any statistics collected by
+  collectd to ceilometer.
+
 Documentation deliverables
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -173,21 +166,7 @@ Known Limitations, Issues and Workarounds
 
 System Limitations
 ^^^^^^^^^^^^^^^^^^^^
-Barometer has the same limiations as the fuel project in general as regards
-
-- **Max number of blades*
-
-- **Min number of blades**
-
-- **Storage**
-
-- **Max number of networks**
-
-- **L3Agent**
-
-The only additional limitiation is the following:
-
-**Telemetry:** Ceilometer service needs to be configured for compute nodes.
+For the Intel RDT plugin, the compute node needs to support Intel RDT.
 
 Known issues
 ^^^^^^^^^^^^^^^
@@ -217,13 +196,13 @@ Workarounds
 Test Result
 ---------------
 
-Barometer@OPNFV Danube RC1 has undergone QA test runs with the following results:
+Barometer@OPNFV Euphrates has undergone QA test runs with the following results:
 
 +--------------------------------------+--------------------------------------+
 | **TEST-SUITE**                       | **Results:**                         |
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
-|                                      |                                      |
+| barometercollectd                    |                                      |
 |                                      |                                      |
 |                                      |                                      |
 |                                      |                                      |
@@ -238,8 +217,3 @@ Barometer@OPNFV Danube RC1 has undergone QA test runs with the following results
 
 References
 ------------
-
-For more information on the OPNFV Danube release, please see:
-
-http://opnfv.org/danube
-
diff --git a/docs/release/scenarios/index.rst b/docs/release/scenarios/index.rst
index 12ca9933..7593434a 100644
--- a/docs/release/scenarios/index.rst
+++ b/docs/release/scenarios/index.rst
@@ -12,5 +12,5 @@ OPNFV Barometer Scenarios
 .. toctree::
    :maxdepth: 1
 
-   ./os-nosdn-kvm_ovs_dpdk_bar-ha/scenario.description
-
+   ./os-nosdn-bar-ha/scenario.description
+   ./os-nosdn-bar-noha/scenario.description
diff --git a/docs/release/scenarios/os-nosdn-bar-ha/scenario.description.rst b/docs/release/scenarios/os-nosdn-bar-ha/scenario.description.rst
new file mode 100644
index 00000000..3f31ff0d
--- /dev/null
+++ b/docs/release/scenarios/os-nosdn-bar-ha/scenario.description.rst
@@ -0,0 +1,61 @@
+.. This work is licensed under a Creative Commons Attribution 4.0 International License.
+.. http://creativecommons.org/licenses/by/4.0
+.. (c) <optionally add copywriters name>
+
+===============================================
+OPNFV os-nosdn-bar-ha overview and description
+===============================================
+
+This document provides details of the scenario for the Euphrates release of Barometer.
+
+.. contents::
+   :depth: 3
+   :local:
+
+Introduction
+---------------
+.. In this section explain the purpose of the scenario and the types of
+.. capabilities provided
+
+This scenario has the features from the Barometer project. Collectd (a telemetry agent) is installed
+on compute nodes so that their statistics, events and alarming services can be relayed to Gnocchi and Aodh.
+These are the first steps in paving the way for Platform (NFVI) Monitoring in OPNFV.
+
+Scenario components and composition
+-------------------------------------
+.. In this section describe the unique components that make up the scenario,
+.. what each component provides and why it has been included in order
+.. to communicate to the user the capabilities available in this scenario.
+
+This scenario deploys the High Availability OPNFV Cloud based on the
+configurations provided in ``os-nosdn-bar-ha.yaml``.
+This yaml file contains configurations and is passed as an
+argument to the ``overcloud-deploy-function.sh`` script.
+This scenario deploys multiple nodes: 3 Controllers, 2 Computes.
+
+Collectd is installed on compute nodes and OpenStack services run on the controller nodes.
+
+The os-nosdn-bar-ha scenario is successful when all the nodes are accessible, up and running.
+Also, verify that plugins/services are communicating with Gnocchi and Aodh on the controller nodes.
+
+Scenario usage overview
+----------------------------
+.. Provide a brief overview on how to use the scenario and the features available to the
+.. user. This should be an "introduction" to the userguide document, and explicitly link to it,
+.. where the specifics of the features are covered including examples and API's
+
+After installation, plugins will be able to read/write the stats on/from the controller node.
+A detailed list of supported plugins along with their sample configuration can be found in the userguide.
+
+Limitations, Issues and Workarounds
+---------------------------------------
+.. Explain scenario limitations here, this should be at a design level rather than discussing
+.. faults or bugs. If the system design only provide some expected functionality then provide
+.. some insight at this point.
+
+None.
+
+References
+-----------------
+
+
diff --git a/docs/release/scenarios/os-nosdn-bar-noha/scenario.description.rst b/docs/release/scenarios/os-nosdn-bar-noha/scenario.description.rst
new file mode 100644
index 00000000..d6a1184a
--- /dev/null
+++ b/docs/release/scenarios/os-nosdn-bar-noha/scenario.description.rst
@@ -0,0 +1,61 @@
+.. This work is licensed under a Creative Commons Attribution 4.0 International License.
+.. http://creativecommons.org/licenses/by/4.0
+.. (c) <optionally add copywriters name>
+
+=================================================
+OPNFV os-nosdn-bar-noha overview and description
+=================================================
+
+This document provides details of the scenario for the Euphrates release of Barometer.
+
+.. contents::
+   :depth: 3
+   :local:
+
+Introduction
+---------------
+.. In this section explain the purpose of the scenario and the types of
+.. capabilities provided
+
+This scenario has the features from the Barometer project. Collectd (a telemetry agent) is installed
+on compute nodes so that their statistics, events and alarming services can be relayed to Gnocchi and Aodh.
+These are the first steps in paving the way for Platform (NFVI) Monitoring in OPNFV.
+
+Scenario components and composition
+-------------------------------------
+.. In this section describe the unique components that make up the scenario,
+.. what each component provides and why it has been included in order
+.. to communicate to the user the capabilities available in this scenario.
+
+This scenario deploys the non-High Availability OPNFV Cloud based on the
+configurations provided in ``os-nosdn-bar-noha.yaml``.
+This yaml file contains configurations and is passed as an
+argument to the ``overcloud-deploy-function.sh`` script.
+This scenario deploys multiple nodes: 1 Controller, 2 Computes.
+
+Collectd is installed on compute nodes and OpenStack services run on the controller node.
+
+The os-nosdn-bar-noha scenario is successful when all the nodes are accessible, up and running.
+Also, verify that plugins/services are communicating with Gnocchi and Aodh on the controller node.
+
+Scenario usage overview
+----------------------------
+.. Provide a brief overview on how to use the scenario and the features available to the
+.. user. This should be an "introduction" to the userguide document, and explicitly link to it,
+.. where the specifics of the features are covered including examples and API's
+
+After installation, plugins will be able to read/write the stats on/from the controller node.
+A detailed list of supported plugins along with their sample configuration can be found in the userguide.
+
+Limitations, Issues and Workarounds
+---------------------------------------
+.. Explain scenario limitations here, this should be at a design level rather than discussing
+.. faults or bugs. If the system design only provide some expected functionality then provide
+.. some insight at this point.
+
+None.
+
+References
+-----------------
+
+
diff --git a/docs/release/userguide/feature.userguide.rst b/docs/release/userguide/feature.userguide.rst
index 29298536..099d8e27 100644
--- a/docs/release/userguide/feature.userguide.rst
+++ b/docs/release/userguide/feature.userguide.rst
@@ -15,7 +15,7 @@ Barometer collectd plugins description
 .. Describe the specific features and how it is realised in the scenario in a brief manner
 .. to ensure the user understand the context for the user guide instructions to follow.
 
-collectd is a daemon which collects system performance statistics periodically
+Collectd is a daemon which collects system performance statistics periodically
 and provides a variety of mechanisms to publish the collected metrics. It
 supports more than 90 different input and output plugins.
Input plugins retrieve metrics and publish them to the collectd deamon, while output plugins @@ -24,8 +24,8 @@ to support thresholding and notification. Barometer has enabled the following collectd plugins: -* *dpdkstat plugin*: A read plugin that retrieve stats from the DPDK extended - NIC stats API. +* *dpdkstat plugin*: A read plugin that retrieves stats from the DPDK extended + NIC stats API. * *dpdkevents plugin*: A read plugin that retrieves DPDK link status and DPDK forwarding cores liveliness status (DPDK Keep Alive). @@ -47,10 +47,13 @@ Barometer has enabled the following collectd plugins: stats from OVS. * *mcelog plugin*: A read plugin that uses mcelog client protocol to check for - memory Machine Check Exceptions and sends the stats for reported exceptions + memory Machine Check Exceptions and sends the stats for reported exceptions. + +* *PMU plugin*: A read plugin that provides performance counters data on + Intel CPUs using Linux perf interface. * *RDT plugin*: A read plugin that provides the last level cache utilization and - memory bandwidth utilization + memory bandwidth utilization. * *virt*: A read plugin that uses virtualization API *libvirt* to gather statistics about virtualized guests on a system directly from the hypervisor, @@ -84,22 +87,20 @@ Third party application in Barometer repository: * *Open vSwitch PMD stats*: An aplication that retrieves PMD stats from OVS. It is run through exec plugin. -**Plugins included in the Danube release:** +**Plugins and application included in the Euphrates release:** + +Write Plugins: aodh plugin, SNMP agent plugin, gnocchi plugin. -* Hugepages -* Open vSwitch Events -* Ceilometer -* Mcelog +Read Plugins/application: Intel RDT plugin, virt plugin, Open vSwitch stats plugin, +Open vSwitch PMD stats application. -collectd capabilities and usage +Collectd capabilities and usage ------------------------------------ .. Describe the specific capabilities and usage for <XYZ> feature. .. 
Provide enough information that a user will be able to operate the feature on a deployed scenario. -.. note:: Plugins included in the OPNFV D release will be built-in to the fuel - plugin and available in the /opt/opnfv directory on the fuel master. You don't - need to clone the barometer/collectd repos to use these, but you can configure - them as shown in the examples below. +.. note:: Plugins included in the OPNFV E release will be built-in for Apex integration + and can be configured as shown in the examples below. The collectd plugins in OPNFV are configured with reasonable defaults, but can be overridden. @@ -111,7 +112,7 @@ built and configured through the barometer repository. .. note:: * sudo permissions are required to install collectd. - * These are instructions for Ubuntu 16.04 + * These instructions are for Centos 7. To build all the upstream plugins, clone the barometer repo: @@ -135,13 +136,11 @@ Sample configuration files can be found in '/opt/collectd/etc/collectd.conf.d' sample config file from '/opt/collectd/etc/collectd.conf.d' .. note:: If you plan on using the Exec plugin (for OVS_PMD_STATS or for executing scripts - on notification generation), the plugin requires a non-root - user to execute scripts. By default, `collectd_exec` user is used in the exec.conf - provided in the sample configurations directory under src/collectd in the Barometer - repo. The scripts *DO NOT* create this user. You need to create this user before you - run build_base_machine.sh. Or modify configuration in the sample configurations - directory under src/collectd to use another existing non root user before running - run build_base_machine.sh. + on notification generation), the plugin requires a non-root user to execute scripts. + By default, `collectd_exec` user is used in the exec.conf provided in the sample + configurations directory under src/collectd in the Barometer repo. These scripts *DO NOT* create this user. 
+ You need to create this user or modify the configuration in the sample configurations directory + under src/collectd to use another existing non root user before running build_base_machine.sh. .. note:: If you are using any Open vSwitch plugins you need to run: @@ -173,7 +172,7 @@ Branch: master Dependencies: DPDK (http://dpdk.org/) -.. note:: DPDK statistics plugin requires DPDK version 16.04 or later +.. note:: DPDK statistics plugin requires DPDK version 16.04 or later. To build and install DPDK to /usr please see: https://github.com/collectd/collectd/blob/master/docs/BUILD.dpdkstat.md @@ -201,25 +200,21 @@ Example of specifying custom paths to DPDK headers and libraries: $ ./configure LIBDPDK_CPPFLAGS="path to DPDK header files" LIBDPDK_LDFLAGS="path to DPDK libraries" -This will install collectd to /opt/collectd -The collectd configuration file can be found at /opt/collectd/etc - +This will install collectd to default folder ``/opt/collectd``. The collectd +configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``. To configure the dpdkstats plugin you need to modify the configuration file to include: .. code:: bash LoadPlugin dpdkstat - <Plugin "dpdkstat"> - <EAL> - Coremask "0x2" - MemoryChannels "4" - ProcessType "secondary" - FilePrefix "rte" - </EAL> - EnabledPortMask 0xffff - PortName "interface1" - PortName "interface2" + <Plugin dpdkstat> + Coremask "0xf" + ProcessType "secondary" + FilePrefix "rte" + EnabledPortMask 0xffff + PortName "interface1" + PortName "interface2" </Plugin> @@ -228,28 +223,27 @@ include: .. 
code:: bash - LoadPlugin dpdkevents + <LoadPlugin dpdkevents> + Interval 1 + </LoadPlugin> + <Plugin "dpdkevents"> - Interval 1 - <EAL> - Coremask "0x1" - MemoryChannels "4" - ProcessType "secondary" - FilePrefix "rte" - </EAL> - <Event "link_status"> - SendEventsOnUpdate true - EnabledPortMask 0xffff - PortName "interface1" - PortName "interface2" - SendNotification false - </Event> - <Event "keep_alive"> - SendEventsOnUpdate true - LCoreMask "0xf" - KeepAliveShmName "/dpdk_keepalive_shm_name" - SendNotification false - </Event> + <EAL> + Coremask "0x1" + MemoryChannels "4" + FilePrefix "rte" + </EAL> + <Event "link_status"> + SendEventsOnUpdate false + EnabledPortMask 0xffff + SendNotification true + </Event> + <Event "keep_alive"> + SendEventsOnUpdate false + LCoreMask "0xf" + KeepAliveShmName "/dpdk_keepalive_shm_name" + SendNotification true + </Event> </Plugin> .. note:: Currently, the DPDK library doesn’t support API to de-initialize @@ -267,10 +261,10 @@ For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod .. note:: dpdkstat plugin initialization time depends on read interval. It - requires 5 read cycles to set up internal buffers and states. During that time - no statistics are submitted. Also if plugin is running and the number of DPDK + requires 5 read cycles to set up internal buffers and states, during that time + no statistics are submitted. Also, if plugin is running and the number of DPDK ports is increased, internal buffers are resized. That requires 3 read cycles - and no port statistics are submitted in that time. + and no port statistics are submitted during that time. 
 The Address-Space Layout Randomization (ASLR) security feature in Linux should be
 disabled, in order for the same hugepage memory mappings to be present in all
@@ -310,7 +304,7 @@ http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html

    When network port controlled by Linux is bound to DPDK driver, the port
    will not be available in the OS. It affects the SNMP write plugin as those
-   ports will not be present in standard IF-MIB. Thus addition work is
+   ports will not be present in standard IF-MIB. Thus, additional work is
    required to be done to support DPDK ports and statistics.

 Hugepages Plugin
@@ -325,9 +319,9 @@ To configure some hugepages:

 .. code:: bash

-    sudo mkdir -p /mnt/huge
-    sudo mount -t hugetlbfs nodev /mnt/huge
-    sudo echo 14336 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+    $ sudo mkdir -p /mnt/huge
+    $ sudo mount -t hugetlbfs nodev /mnt/huge
+    $ sudo echo 14336 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

 Building and installing collectd:

@@ -340,8 +334,8 @@ Building and installing collectd:
     $ make
     $ sudo make install

-This will install collectd to /opt/collectd
-The collectd configuration file can be found at /opt/collectd/etc
+This will install collectd to default folder ``/opt/collectd``. The collectd
+configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
 To configure the hugepages plugin you need to modify the configuration file to
 include:

@@ -359,6 +353,72 @@ include:
 For more information on the plugin parameters, please see:
 https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

+Intel PMU Plugin
+^^^^^^^^^^^^^^^^
+Repo: https://github.com/collectd/collectd
+
+Branch: master
+
+Dependencies:
+
+  * PMU tools (jevents library) https://github.com/andikleen/pmu-tools
+
+To be suitable for use in collectd plugin shared library *libjevents* should be
+compiled as position-independent code.
To do this add the following line to
+*pmu-tools/jevents/Makefile*:
+
+.. code:: bash
+
+    CFLAGS += -fPIC
+
+Building and installing *jevents* library:
+
+.. code:: bash
+
+    $ git clone https://github.com/andikleen/pmu-tools.git
+    $ cd pmu-tools/jevents/
+    $ make
+    $ sudo make install
+
+Building and installing collectd:
+
+.. code:: bash
+
+    $ git clone https://github.com/collectd/collectd.git
+    $ cd collectd
+    $ ./build.sh
+    $ ./configure --enable-syslog --enable-logfile --with-libjevents=/usr/local --enable-debug
+    $ make
+    $ sudo make install
+
+This will install collectd to default folder ``/opt/collectd``. The collectd
+configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
+To configure the PMU plugin you need to modify the configuration file to
+include:
+
+.. code:: bash
+
+    <LoadPlugin intel_pmu>
+      Interval 1
+    </LoadPlugin>
+    <Plugin "intel_pmu">
+      ReportHardwareCacheEvents true
+      ReportKernelPMUEvents true
+      ReportSoftwareEvents true
+    </Plugin>
+
+For more information on the plugin parameters, please see:
+https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod
+
+.. note::
+
+   The plugin opens file descriptors whose quantity depends on number of
+   monitored CPUs and number of monitored counters. Depending on configuration,
+   it might be required to increase the limit on the number of open file
+   descriptors allowed. This can be done using 'ulimit -n' command. If collectd
+   is executed as a service 'LimitNOFILE=' directive should be defined in
+   [Service] section of *collectd.service* file.
+
 Intel RDT Plugin
 ^^^^^^^^^^^^^^^^
 Repo: https://github.com/collectd/collectd
@@ -396,8 +456,8 @@ Building and installing collectd:
     $ make
     $ sudo make install

-This will install collectd to /opt/collectd
-The collectd configuration file can be found at /opt/collectd/etc
+This will install collectd to default folder ``/opt/collectd``. The collectd
+configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
 To configure the RDT plugin you need to modify the configuration file to
 include:

@@ -436,11 +496,11 @@ has been introduced.

 **Install dependencies**

-On Ubuntu, the OpenIPMI library can be installed via apt package manager:
+On Centos, install OpenIPMI library:

 .. code:: bash

-    $ sudo apt-get install libopenipmi-dev
+    $ sudo yum install OpenIPMI ipmitool

 Anyway, it's recommended to use the latest version of the OpenIPMI library as
 it includes fixes of known issues which aren't included in standard OpenIPMI
@@ -452,7 +512,7 @@ Remove old version of OpenIPMI library:

 .. code:: bash

-    $ sudo apt-get remove libopenipmi-dev
+    $ sudo yum remove OpenIPMI ipmitool

 Download OpenIPMI library sources:

@@ -515,8 +575,8 @@ Clone and install the collectd IPMI plugin:
 Where $BRANCH is feat_ipmi_events or feat_ipmi_analog.

 This will install collectd to default folder ``/opt/collectd``. The collectd
-configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``. To
-configure the IPMI plugin you need to modify the file to include:
+configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
+To configure the IPMI plugin you need to modify the file to include:

 .. code:: bash

@@ -556,19 +616,19 @@ Start by installing mcelog.

 .. note::
    The kernel has to have CONFIG_X86_MCE enabled. For 32bit kernels you
-   need at least a 2.6,30 kernel.
+   need atleast a 2.6,30 kernel.

-On ubuntu:
+On Centos:

 .. code:: bash

-    $ apt-get update && apt-get install mcelog
+    $ sudo yum install mcelog

 Or build from source

 .. code:: bash

-    $ git clone git://git.kernel.org/pub/scm/utils/cpu/mce/mcelog.git
+    $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mcelog.git
     $ cd mcelog
     $ make
     ... become root ...
@@ -595,6 +655,27 @@ enable:
 ..
code:: bash

     socket-path = /var/run/mcelog-client
+    [dimm]
+    dimm-tracking-enabled = yes
+    dmi-prepopulate = yes
+    uc-error-threshold = 1 / 24h
+    ce-error-threshold = 10 / 24h
+
+    [socket]
+    socket-tracking-enabled = yes
+    mem-uc-error-threshold = 100 / 24h
+    mem-ce-error-threshold = 100 / 24h
+    mem-ce-error-log = yes
+
+    [page]
+    memory-ce-threshold = 10 / 24h
+    memory-ce-log = yes
+    memory-ce-action = soft
+
+    [trigger]
+    children-max = 2
+    directory = /etc/mcelog
+

 Clone and install the collectd mcelog plugin:

@@ -602,14 +683,13 @@ Clone and install the collectd mcelog plugin:

     $ git clone https://github.com/maryamtahhan/collectd
     $ cd collectd
-    $ git checkout feat_ras
     $ ./build.sh
     $ ./configure --enable-syslog --enable-logfile --enable-debug
     $ make
     $ sudo make install

-This will install collectd to /opt/collectd
-The collectd configuration file can be found at /opt/collectd/etc
+This will install collectd to default folder ``/opt/collectd``. The collectd
+configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
 To configure the mcelog plugin you need to modify the configuration file to
 include:

@@ -618,7 +698,7 @@ include:
     <LoadPlugin mcelog>
       Interval 1
     </LoadPlugin>
-    <Plugin "mcelog">
+    <Plugin mcelog>
       McelogClientSocket "/var/run/mcelog-client"
     </Plugin>

@@ -652,7 +732,7 @@ Then you can run the mcelog test suite with

 This will inject different classes of errors and check that the mcelog triggers
 runs. There will be some kernel messages about page offlining attempts. The
-test will also lose a few pages of memory in your system (not significant)
+test will also lose a few pages of memory in your system (not significant).

 .. note:: This test will kill any running mcelog, which needs to be restarted
    manually afterwards.
@@ -685,7 +765,7 @@ Inject the error:
    The uncorrected and fatal scripts under test will cause a platform reset.
    Only the fatal script generates the memory errors**.
In order to quickly emulate
    uncorrected memory errors and avoid host reboot following test errors
-   from mce-test suite can be injected:
+   from mce-test suite can be injected:

 .. code:: bash

@@ -693,7 +773,7 @@ Inject the error:

 **mce-test:**

-In addition an more in-depth test of the Linux kernel machine check facilities
+In addition a more in-depth test of the Linux kernel machine check facilities
 can be done with the mce-test test suite. mce-test supports testing uncorrected
 error handling, real error injection, handling of different soft offlining
 cases, and other tests.
@@ -728,11 +808,14 @@ IF-MIB (http://www.net-snmp.org/docs/mibs/IF-MIB.txt)

 Dependencies: Open vSwitch, Yet Another JSON Library (https://github.com/lloyd/yajl)

-On Ubuntu, install the dependencies:
+On Centos, install the dependencies and Open vSwitch:

 .. code:: bash

-    $ sudo apt-get install libyajl-dev openvswitch-switch
+    $ sudo yum install yajl-devel
+
+Steps to install Open vSwtich can be found at
+http://docs.openvswitch.org/en/latest/intro/install/fedora/

 Start the Open vSwitch service:

@@ -740,7 +823,7 @@ Start the Open vSwitch service:

     $ sudo service openvswitch-switch start

-configure the ovsdb-server manager:
+Configure the ovsdb-server manager:

 .. code:: bash

@@ -758,21 +841,21 @@ Clone and install the collectd ovs plugin:
     $ make
     $ sudo make install

-This will install collectd to /opt/collectd. The collectd configuration file
-can be found at /opt/collectd/etc. To configure the OVS events plugin you
-need to modify the configuration file to include:
+This will install collectd to default folder ``/opt/collectd``. The collectd
+configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
+To configure the OVS events plugin you need to modify the configuration file to include:

 ..
code:: bash

     <LoadPlugin ovs_events>
       Interval 1
     </LoadPlugin>
-    <Plugin "ovs_events">
-      Port 6640
+    <Plugin ovs_events>
+      Port "6640"
+      Address "127.0.0.1"
       Socket "/var/run/openvswitch/db.sock"
       Interfaces "br0" "veth0"
-      SendNotification false
-      DispatchValues true
+      SendNotification true
     </Plugin>

 To configure the OVS stats plugin you need to modify the configuration file
@@ -787,7 +870,7 @@ to include:
       Port "6640"
       Address "127.0.0.1"
       Socket "/var/run/openvswitch/db.sock"
-      Bridges "br0" "br_ext"
+      Bridges "br0"
     </Plugin>

 For more information on the plugin parameters, please see:
@@ -815,7 +898,6 @@ to include:
     </LoadPlugin
     <Plugin exec>
       Exec "user:group" "<path to ovs_pmd_stat.sh>"
-      #NotificationExec "nobody" "/usr/lib/collectd/notify.sh"
     </Plugin>

 .. note:: Exec plugin configuration has to be changed to use appropriate user before starting collectd service.
@@ -831,28 +913,22 @@ SNMP Agent Plugin
 ^^^^^^^^^^^^^^^^^
 Repo: https://github.com/maryamtahhan/collectd/

-Branch: feat_snmp
+Branch: master

 Dependencies: NET-SNMP library

 Start by installing net-snmp and dependencies.

-On ubuntu:
+On Centos 7:

 .. code:: bash

-    $ apt-get install snmp snmp-mibs-downloader snmpd libsnmp-dev
+    $ sudo yum install net-snmp net-snmp-libs net-snmp-utils net-snmp-devel
     $ systemctl start snmpd.service

 Or build from source

-Become root to install net-snmp dependencies
-
-.. code:: bash
-
-    $ apt-get install libperl-dev
-
-Clone and build net-snmp
+Clone and build net-snmp:

 .. code:: bash

@@ -867,13 +943,13 @@ Become root

     $ make install

-Copy default configuration to persistent folder
+Copy default configuration to persistent folder:

 .. code:: bash

     $ cp EXAMPLE.conf /usr/share/snmp/snmpd.conf

-Set library path and default MIB configuration
+Set library path and default MIB configuration:

 ..
code:: bash

@@ -882,7 +958,7 @@ Set library path and default MIB configuration
     $ net-snmp-config --default-mibdirs
     $ net-snmp-config --snmpconfpath

-Configure snmpd as a service
+Configure snmpd as a service:

 .. code:: bash

@@ -917,8 +993,9 @@ Clone and install the collectd snmp_agent plugin:
     $ make
     $ sudo make install

-This will install collectd to /opt/collectd
-The collectd configuration file can be found at /opt/collectd/etc
+This will install collectd to default folder ``/opt/collectd``. The collectd
+configuration file (``collectd.conf``) can be found at ``/opt/collectd/etc``.
+
 **SNMP Agent plugin is a generic plugin and cannot work without configuration**.
 To configure the snmp_agent plugin you need to modify the configuration file to
 include OIDs mapped to collectd types. The following example maps scalar
@@ -954,15 +1031,15 @@ virt plugin
 ^^^^^^^^^^^^
 Repo: https://github.com/maryamtahhan/collectd

-Branch: feat_libvirt_upstream
+Branch: master

 Dependencies: libvirt (https://libvirt.org/), libxml2

-On Ubuntu, install the dependencies:
+On Centos, install the dependencies:

 .. code:: bash

-    $ sudo apt-get install libxml2-dev
+    $ sudo yum install libxml2-dev libpciaccess-devel yajl-devel device-mapper-devel

 Install libvirt:

@@ -1056,24 +1133,23 @@ Clone and install the collectd virt plugin:

     $ git clone $REPO
     $ cd collectd
-    $ git checkout $BRANCH
     $ ./build.sh
     $ ./configure --enable-syslog --enable-logfile --enable-debug
     $ make
     $ sudo make install

-Where ``$REPO`` and ``$BRANCH`` are equal to information provided above.
+Where ``$REPO`` is equal to information provided above.

 This will install collectd to ``/opt/collectd``. The collectd configuration file
-``collectd.conf`` can be found at ``/opt/collectd/etc``. To load the virt plugin
-user needs to modify the configuration file to include:
+``collectd.conf`` can be found at ``/opt/collectd/etc``.
+To load the virt plugin user needs to modify the configuration file to include:

 ..
code:: bash

     LoadPlugin virt

 Additionally, user can specify plugin configuration parameters in this file,
-such as connection URI, domain name and much more. By default extended virt plugin
+such as connection URL, domain name and much more. By default extended virt plugin
 statistics are disabled. They can be enabled with ``ExtraStats`` option.

 .. code:: bash
@@ -1182,7 +1258,7 @@ Monitoring Interfaces and Openstack Support
    Monitoring Interfaces and Openstack Support

 The figure above shows the DPDK L2 forwarding application running on a compute
-node, sending and receiving traffic. collectd is also running on this compute
+node, sending and receiving traffic. Collectd is also running on this compute
 node retrieving the stats periodically from DPDK through the dpdkstat plugin
 and publishing the retrieved stats to OpenStack through the
 collectd-ceilometer-plugin.