author | Stephen Wong <stephen.kf.wong@gmail.com> | 2018-11-09 18:28:57 +0000 |
---|---|---|
committer | Gerrit Code Review <gerrit@opnfv.org> | 2018-11-09 18:28:57 +0000 |
commit | 79bb4023420396130c97f0ed8da2045400c1d7a3 (patch) | |
tree | f245dc7ea4613d08596ea7394983de85cd2a4580 /docs | |
parent | 505a0bbe3d01a234cb47c9c482ed1a37b9360e26 (diff) | |
parent | 0562c9c5d261374461bf4d311af839236fd5c7f1 (diff) |
Merge "Initial commit of visibility services config guide" into stable/gambia
Diffstat (limited to 'docs')
-rw-r--r-- | docs/release/configguide/controller_services_config_guide.rst | 6 |
-rw-r--r-- | docs/release/configguide/imgs/jmeter_overview.png | bin | 0 -> 78367 bytes |
-rw-r--r-- | docs/release/configguide/imgs/visibility_discovered_active.png | bin | 0 -> 33626 bytes |
-rw-r--r-- | docs/release/configguide/imgs/visibility_distinct_counts.png | bin | 0 -> 38887 bytes |
-rw-r--r-- | docs/release/configguide/imgs/visibility_distinct_http.png | bin | 0 -> 27362 bytes |
-rw-r--r-- | docs/release/configguide/imgs/visibility_monitoring_metrics.png | bin | 0 -> 96758 bytes |
-rw-r--r-- | docs/release/configguide/imgs/visibility_overview.png | bin | 0 -> 64705 bytes |
-rw-r--r-- | docs/release/configguide/imgs/visibility_system_counts_response_times.png | bin | 0 -> 40552 bytes |
-rw-r--r-- | docs/release/configguide/jmeter_config_guide.rst | 8 |
-rw-r--r-- | docs/release/configguide/visibility_config_guide.rst | 403 |
10 files changed, 410 insertions, 7 deletions
diff --git a/docs/release/configguide/controller_services_config_guide.rst b/docs/release/configguide/controller_services_config_guide.rst index 6671458..d9ad891 100644 --- a/docs/release/configguide/controller_services_config_guide.rst +++ b/docs/release/configguide/controller_services_config_guide.rst @@ -44,10 +44,6 @@ The **clover-controller** service provides the following functions: Clover visibility services. It presents analyzed visibility data and provides basic controls such as selecting which user services visibility will track. -.. image:: imgs/controller_services.png - :align: center - :scale: 100% - The **cloverctl** CLI command syntax is similar to k8s kubectl or istio istioctl CLI tools, using a <verb> <noun> convention. @@ -106,6 +102,8 @@ namespace: NAMESPACE NAME READY STATUS clover-system clover-controller-74d8596bb5-jczqz 1/1 Running +.. _exposing_clover_controller: + Exposing clover-controller ========================== diff --git a/docs/release/configguide/imgs/jmeter_overview.png b/docs/release/configguide/imgs/jmeter_overview.png Binary files differ new file mode 100644 index 0000000..ee986e6 --- /dev/null +++ b/docs/release/configguide/imgs/jmeter_overview.png diff --git a/docs/release/configguide/imgs/visibility_discovered_active.png b/docs/release/configguide/imgs/visibility_discovered_active.png Binary files differ new file mode 100644 index 0000000..6c91559 --- /dev/null +++ b/docs/release/configguide/imgs/visibility_discovered_active.png diff --git a/docs/release/configguide/imgs/visibility_distinct_counts.png b/docs/release/configguide/imgs/visibility_distinct_counts.png Binary files differ new file mode 100644 index 0000000..57ba901 --- /dev/null +++ b/docs/release/configguide/imgs/visibility_distinct_counts.png diff --git a/docs/release/configguide/imgs/visibility_distinct_http.png b/docs/release/configguide/imgs/visibility_distinct_http.png Binary files differ new file mode 100644 index 0000000..e15333d --- /dev/null +++ 
b/docs/release/configguide/imgs/visibility_distinct_http.png diff --git a/docs/release/configguide/imgs/visibility_monitoring_metrics.png b/docs/release/configguide/imgs/visibility_monitoring_metrics.png Binary files differ new file mode 100644 index 0000000..f5c6ada --- /dev/null +++ b/docs/release/configguide/imgs/visibility_monitoring_metrics.png diff --git a/docs/release/configguide/imgs/visibility_overview.png b/docs/release/configguide/imgs/visibility_overview.png Binary files differ new file mode 100644 index 0000000..f986440 --- /dev/null +++ b/docs/release/configguide/imgs/visibility_overview.png diff --git a/docs/release/configguide/imgs/visibility_system_counts_response_times.png b/docs/release/configguide/imgs/visibility_system_counts_response_times.png Binary files differ new file mode 100644 index 0000000..a456a61 --- /dev/null +++ b/docs/release/configguide/imgs/visibility_system_counts_response_times.png diff --git a/docs/release/configguide/jmeter_config_guide.rst b/docs/release/configguide/jmeter_config_guide.rst index de1d2f5..78858d0 100644 --- a/docs/release/configguide/jmeter_config_guide.rst +++ b/docs/release/configguide/jmeter_config_guide.rst @@ -15,7 +15,7 @@ the Clover Gambia release. Overview ========= -Apache JMeter is a mature, open source application that supports web client emulation. Its +`Apache JMeter`_ is a mature, open source application that supports web client emulation. Its functionality has been integrated into the Clover project to allow various CI validations and performance tests to be performed. The system under test can either be REST services/APIs directly or a set of L7 network services. In the latter scenario, Clover nginx servers may @@ -77,7 +77,7 @@ the command shown below:: $ cloverctl create system validation -t latest -The Clover CLI will add master/slave pods to the k8s cluster in the default namespace. +The Clover CLI will add master/slave pods to the k8s cluster in the default namespace. 
The JMeter master/slave docker images will automatically be pulled from the OPNFV public Dockerhub registry. Deployments and respective services will be created with three slave @@ -147,7 +147,7 @@ Creating a test plan -------------------- To employ a test plan that can be used against the :ref:`sdc_config_guide` sample, navigate to - cloverctl yaml directory and use the sample named 'jmeter_testplan.yaml', which is shown below. +cloverctl yaml directory and use the sample named 'jmeter_testplan.yaml', which is shown below. .. code-block:: bash @@ -296,3 +296,5 @@ in the k8s cluster. $ docker rmi opnfv/clover-jmeter-master $ docker rmi opnfv/clover-jmeter-slave $ docker rmi opnfv/clover-controller + +.. _Apache JMeter: https://jmeter.apache.org/ diff --git a/docs/release/configguide/visibility_config_guide.rst b/docs/release/configguide/visibility_config_guide.rst new file mode 100644 index 0000000..77db2f7 --- /dev/null +++ b/docs/release/configguide/visibility_config_guide.rst @@ -0,0 +1,403 @@ +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 +.. SPDX-License-Identifier CC-BY-4.0 +.. (c) Authors of Clover + +.. _visibility_config_guide: + +============================================== +Clover Visibility Services Configuration Guide +============================================== + +This document provides a guide to use Clover visibility services, which are initially delivered in +the Clover Gambia release. A key assumption of this guide is that Istio 1.0.x has been deployed +to Kubernetes (k8s), as it is a foundational element for Clover visibility services. + +Overview +========= + +Clover visibility services are an integrated set of microservices that allow HTTP/gRPC traffic to +be observed and analyzed in an Istio service mesh within k8s managed clusters. 
They leverage +observability open source projects from the CNCF community such as Jaeger for distributed tracing +and Prometheus for monitoring. These tools are packaged with Istio, and service mesh sidecars have +extensive hooks built in to interface with them. They gather low-level, per-HTTP-request +data. Clover visibility services focus on enriching the data, gathering it from various sources +and analyzing it at the system or aggregate level. + +The visibility services are composed of the following microservices, all deployed within the +**clover-system** namespace in a k8s cluster: + + * **clover-controller** - exposes a REST interface external to the k8s cluster that is + used to relay messages to other Clover services via gRPC from external agents including + the **cloverctl** CLI, web browsers and other APIs, scripts or CI jobs. It incorporates a web + application with dashboard views to consume analyzed visibility data and control other + Clover services. + + * **clover-collector** - gathers data from tracing (Jaeger) and monitoring (Prometheus) + infrastructure that is integrated with Istio using a pull model. + + * **clover-spark** - a Clover-specific Apache Spark service. It leverages Spark 2.3.x native + k8s support and includes visibility services artifacts to execute Spark jobs. + + * **clover-spark-submit** - a simple service that continually submits Spark jobs, interacting + with the k8s API to spawn driver and executor pods. + + * **cassandra** - a sink for visibility data from **clover-collector** with specific schemas + for monitoring and tracing. + + * **redis** - holds configuration data and analyzed data for visibility services. Used by the + **clover-controller** web application and REST API to maintain state and exchange data. 
+ +The table below shows key details of the visibility service manifests outlined above: + ++---------------------+----------------------+---------------------------+-----------------------+ +| Service | Kubernetes | Docker Image | Ports | +| | Deployment App Name | | | ++=====================+======================+===========================+=======================+ +| Controller | clover-controller | opnfv/clover-controller | HTTP: 80 (external) | +| | | | gRPC: 50052, 50054 | +| | | | | ++---------------------+----------------------+---------------------------+-----------------------+ +| Collector | clover-collector | opnfv/clover-collector | Jaeger: 16686 | +| | | | Prometheus: 9090 | +| | | | gRPC: 50054 | +| | | | Datastore: 6379, 9042 | ++---------------------+----------------------+---------------------------+-----------------------+ +| Spark | clover-spark | opnfv/clover-spark | Datastore: 6379, 9042 | +| | clover-spark-submit | opnfv/clover-spark-submit | | +| | | | | +| | | | | +| | | | | ++---------------------+----------------------+---------------------------+-----------------------+ +| Data Stores | cassandra | cassandra:3 | 9042 | +| | redis | k8s.gcr.io/redis:v1 | 6379 | +| | | kubernetes/redis:v1 | | ++---------------------+----------------------+---------------------------+-----------------------+ + +The **redis** and **cassandra** data stores use community container images while the other +services use Clover-specific Dockerhub OPNFV images. + +Additionally, visibility services are operated with the **cloverctl** CLI. Further information on +setting up **clover-controller** and **cloverctl** can be found at +:ref:`controller_services_config_guide`. + + +.. image:: imgs/visibility_overview.png + :align: center + :scale: 100% + +The diagram above shows the flow of data through the visibility services where all blue arrows +denote the path of data ingestion originating from the observability tools. 
The +**clover-collector** reads data from these underlying tools using their REST query interfaces +and inserts it into schemas within the **cassandra** data store. + +Apache Spark jobs are used to analyze data within **cassandra**. Spark is deployed using the native +Kubernetes support added in Spark version 2.3. The **clover-spark-submit** +container continually submits jobs to the Kubernetes API. The API spawns a Spark driver pod which +in turn spawns executor pods to run Clover-specific jobs packaged in the **clover-spark** +service. + +Analyzed data from **clover-spark** jobs is written to **redis**, an in-memory data store. The +**clover-controller** provides a REST API for the analyzed visibility data to be read by other +services (**cloverctl**, CI jobs, etc.) or viewed using a Clover-provided visibility web +dashboard. + +Deploying the visibility engine +=============================== + +.. _visibility_prerequisites: + +Prerequisites +------------- + +The following prerequisites must be met before continuing on to deployment: + + * Installation of Docker has already been performed. It's preferable to install Docker CE. + * Installation of k8s in a single-node or multi-node cluster with at least + twelve cores and 16GB of memory. Google Kubernetes Engine (GKE) clusters are supported. + * Installation of Istio in the k8s cluster. See :ref:`sdc_deploy_container`. + * Clover CLI (**cloverctl**) has been downloaded and set up. Instructions to deploy can be found + at :ref:`controller_services_controller`. + + +Deploy with Clover CLI +---------------------- + +To deploy the visibility services into your k8s cluster, use the **cloverctl** CLI command +shown below:: + + $ cloverctl create system visibility + +Container images with the Gambia release tag will be pulled if the tag is unspecified. The release +tag is **opnfv-7.0.0** for the Gambia release. 
To deploy the latest containers from master, use +the command shown below:: + + $ cloverctl create system visibility -t latest + + Using config file: /home/earrage/.cloverctl.yaml + Creating visibility services + Created clover-system namespace + Created statefulset "cassandra". + Created service "cassandra" + Created pod "redis". + Created service "redis" + Created deployment "clover-collector". + Image: opnfv/clover-collector:latest + Created service "clover-collector" + Created deployment "clover-controller". + Image: opnfv/clover-controller:latest + Created service "clover-controller-internal" + Created serviceaccount "clover-spark". + Created clusterrolebinding "clover-spark-default". + Created clusterrolebinding "clover-spark". + Created deployment "clover-spark-submit". + Image: opnfv/clover-spark-submit:latest + +Verifying the deployment +------------------------ + +To verify the visibility services deployment, use the command below to confirm that the +following pods have been deployed:: + + $ kubectl get pod --all-namespaces + + NAMESPACE NAME READY STATUS + clover-system clover-collector-7dcc5d849f-6jc6m 1/1 Running + clover-system clover-controller-74d8596bb5-qrr6b 1/1 Running + clover-system cassandra-0 1/1 Running + clover-system redis 2/2 Running + clover-system clover-spark-submit-6c4d5bcdf8-kc6l9 1/1 Running + +Additionally, Spark driver and executor pods will continuously be deployed as displayed below:: + + clover-system clover-spark-0fa43841362b3f27b35eaf6112965081-driver + clover-system clover-spark-fast-d5135cdbdd8330f6b46431d9a7eb3c20-driver + clover-system clover-spark-0fa43841362b3f27b35eaf6112965081-exec-3 + clover-system clover-spark-0fa43841362b3f27b35eaf6112965081-exec-4 + +Initializing visibility services +================================ + +To set up visibility services, initialization and start commands must be +invoked from the **cloverctl** CLI. There are sample yaml files in the yaml directory +relative to the **cloverctl** binary path. 
Navigate to this directory to execute the next +sequence of commands. + +Initialize the visibility schemas in cassandra with the following command:: + + $ cloverctl init visibility + + Using config file: /home/earrage/.cloverctl.yaml + clover-controller address: http://10.145.71.21:32044 + Added visibility schemas in cassandra + +The initial configuration for the visibility services consists of the Jaeger tracing and Prometheus +connection parameters and the sample interval for **clover-collector**. To start visibility, +use the sample yaml provided and execute the command:: + + cloverctl start visibility -f start_visibility.yaml + + Started collector on pid: 44 + +The ``start_visibility.yaml`` has defaults for the tracing and monitoring modules packaged with +Istio 1.0.0. + +Configure and control visibility +================================ + +The core requirement for Clover visibility services to function is for your services to be +added to the Istio service mesh. Istio deployment and usage instructions are in the +:ref:`sdc_config_guide`, and the Service Delivery Controller (SDC) sample can be used to +evaluate the Clover visibility services initially. A user may inject their own web-based services +into the service mesh and track them separately. + +Connecting to visibility dashboard UI +------------------------------------- + +The **clover-controller** service comes packaged with a web-based UI with a visibility view. +To access the dashboard, navigate to the **clover-controller** address for either a ``NodePort`` +or ``LoadBalancer`` service: + + * http://<node or CNI IP address>:<``NodePort`` port>/ + * http://<``LoadBalancer`` IP address>/ + +See :ref:`exposing_clover_controller` to expose **clover-controller** externally with a k8s +service. + +Set runtime parameters using Clover CLI +--------------------------------------- + +The services that visibility will track are based on the deployment/pod names specified in the k8s +resources. 
Using some sample services from the SDC guide, the **proxy-access-control**, +**clover-server1**, **clover-server2** and **clover-server3** services are specified in the +``set_visibility.yaml`` sample yaml referenced below. + +To modify the configuration of the services that visibility will track, use the **cloverctl** CLI, +executing the following command:: + + cloverctl set visibility -f set_visibility.yaml + +Use the ``services:`` section of the yaml to configure service names to track. + +.. code-block:: yaml + + # set_visibility.yaml + services: + - name: proxy_access_control + - name: clover_server1 + - name: clover_server2 + - name: clover_server3 + metric_prefixes: + - prefix: envoy_cluster_outbound_9180__ + - prefix: envoy_cluster_inbound_9180__ + metric_suffixes: + - suffix: _default_svc_cluster_local_upstream_rq_2xx + - suffix: _default_svc_cluster_local_upstream_cx_active + custom_metrics: + - metric: envoy_tracing_zipkin_spans_sent + +Set runtime parameters using dashboard UI +----------------------------------------- + +The services being tracked by visibility can also be configured by selecting from the +boxes under **Discovered Services** within the dashboard, as shown in the graphic below. +Services can be multi-selected by holding the ``Ctrl`` or ``command`` (Mac OS) +key down while selecting or unselecting. The SDC services that were configured from +the **cloverctl** CLI above are currently active, denoted as the boxes with blue backgrounds. + +.. image:: imgs/visibility_discovered_active.png + :align: center + :scale: 100% + +In order for any services to be discovered from Jaeger tracing and displayed within the dashboard, +some traffic must target the services of interest. Using curl/wget to send HTTP requests +to your services will cause services to be discovered. Using Clover JMeter validation services, +as detailed in :ref:`jmeter_config_guide`, against SDC sample services will also generate a service +listing. 
The **cloverctl** CLI commands below will generate traces through the SDC service chain +with the JMeter master injected into the service mesh:: + + $ cloverctl create testplan -f yaml/jmeter_testplan.yaml # yaml located with cloverctl binary + $ cloverctl start testplan + +Clearing visibility data +------------------------- + +To clear visibility data in cassandra and redis, which truncates **cassandra** tables and +deletes or zeros out **redis** keys, use the following command:: + + $ cloverctl clear visibility + +This can be useful when analyzing or observing an issue during a particular time horizon. +The same function can be performed from the dashboard UI using the ``Clear`` button under +``Visibility Controls``, as illustrated in the graphic from the previous section. + +Viewing visibility data +======================== + +The visibility dashboard can be used to view visibility data in real time. The page will +automatically refresh every 5 seconds. To disable continuous page refresh and freeze on a +snapshot of the data, use the slider at the top of the page that defaults to ``On``. Toggling +it switches the display to ``Off``. + +The visibility dashboard displays various metrics and graphs of analyzed data described in +subsequent sections. + +System metrics +-------------- + +System metrics provide aggregate counts of cassandra tables including total traces, spans +and metrics, as depicted on the left side of the graphic below. + +.. image:: imgs/visibility_system_counts_response_times.png + :align: center + :scale: 100% + +The metrics counter will continually increase, as it is based on time series data from +Prometheus. The trace count will correspond to the number of HTTP requests sent to services +within the Istio service mesh. The span count is tied to the trace count, as a span is a child object +in the Jaeger tracing data hierarchy and is based on the service graph (number of interactions +between microservices for a given request). 
It will increase more rapidly when service graph +depths are larger. + +Per service response times +-------------------------- + +Per service response times are displayed on the right side of the graphic above and are +calculated from tracing data when visibility is started. The minimum, maximum and average +response times are reported over the entire analysis period. + +Group by span field counts +-------------------------- + +This category groups schema fields in various combinations to gain insight into the composition +of HTTP data and can be used by CI scripts to perform various validations. Metrics include: + + * Per service + * Distinct URL + * Distinct URL / HTTP status code + * Distinct user-agent (HTTP header) + * Per service / distinct URL + +The dashboard displays bar/pie charts with counts and percentages, as depicted below. Each distinct +key is displayed when hovering your mouse over a chart value. + +.. image:: imgs/visibility_distinct_counts.png + :align: center + :scale: 100% + +Distinct HTTP details +--------------------- + +A listing of distinct HTTP user-agents, request URLs and status codes is shown below, divided +into tabs. + +.. image:: imgs/visibility_distinct_http.png + :align: center + :scale: 100% + + +Monitoring Metrics +------------------ + +The Istio sidecars (Envoy) provide a lengthy set of metrics exposed through Prometheus. These +metrics can be analyzed with the visibility service by setting up metrics, as outlined in section +`Set runtime parameters using Clover CLI`_. Use the ``metric_prefixes`` and ``metric_suffixes`` +sections of the set visibility yaml for the many Envoy metrics whose key has the service name +straddled by a prefix/suffix. A row in the table and a graph will be displayed for each +combination of service, prefix and suffix. + +The metrics are displayed in the dashboard both as a table and as scatter plots over time, as +shown in the graphic below: + +.. 
image:: imgs/visibility_monitoring_metrics.png + :align: center + :scale: 100% + +Uninstall from Kubernetes environment +===================================== + +Delete with Clover CLI +---------------------- + +When you're finished working with Clover visibility services, you can uninstall them with the +following command:: + + $ cloverctl delete system visibility + +The command above will remove the SDC sample services, Istio components and Jaeger/Prometheus +tools from your Kubernetes environment. + +Uninstall from Docker environment +================================= + +The OPNFV and community docker images can be removed with the following commands: + +.. code-block:: bash + + $ docker rmi opnfv/clover-collector + $ docker rmi opnfv/clover-spark + $ docker rmi opnfv/clover-spark-submit + $ docker rmi opnfv/clover-controller + $ docker rmi k8s.gcr.io/redis + $ docker rmi kubernetes/redis + $ docker rmi cassandra:3
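The "Verifying the deployment" step in the added guide asks the reader to inspect ``kubectl get pod --all-namespaces`` output by eye. A minimal Python sketch of the same check follows; it is not part of this commit. The function names ``parse_pods`` and ``missing_services`` are hypothetical, the pod name prefixes are taken from the listing in the guide, and the parsing assumes the default column order (NAMESPACE, NAME, READY, STATUS).

```python
"""Sketch: confirm the Clover visibility pods listed in the guide are Running."""

# Pod name prefixes taken from the guide's "Verifying the deployment" listing.
EXPECTED_PREFIXES = (
    "clover-collector",
    "clover-controller",
    "cassandra",
    "redis",
    "clover-spark-submit",
)

# Sample text copied from the guide; real input would be the stdout of
# `kubectl get pod --all-namespaces`.
SAMPLE = """\
NAMESPACE      NAME                                   READY  STATUS
clover-system  clover-collector-7dcc5d849f-6jc6m      1/1    Running
clover-system  clover-controller-74d8596bb5-qrr6b     1/1    Running
clover-system  cassandra-0                            1/1    Running
clover-system  redis                                  2/2    Running
clover-system  clover-spark-submit-6c4d5bcdf8-kc6l9   1/1    Running
"""


def parse_pods(kubectl_output, namespace="clover-system"):
    """Return {pod_name: status} for pods in the given namespace."""
    pods = {}
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Assumed columns: NAMESPACE NAME READY STATUS [...]
        if len(fields) >= 4 and fields[0] == namespace:
            pods[fields[1]] = fields[3]
    return pods


def missing_services(pods):
    """Return the expected prefixes that have no pod in the Running state."""
    missing = []
    for prefix in EXPECTED_PREFIXES:
        running = any(
            name.startswith(prefix) and status == "Running"
            for name, status in pods.items()
        )
        if not running:
            missing.append(prefix)
    return missing
```

Calling ``missing_services(parse_pods(SAMPLE))`` on the sample text yields an empty list. Against a live cluster, the text could instead come from ``subprocess.run(["kubectl", "get", "pod", "--all-namespaces"], capture_output=True, text=True, check=True).stdout``. Matching on name prefixes is a deliberate simplification, since the generated pod suffixes differ on every deployment.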