aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2016-09-27Merge "Add integration with Manila CephFS Native driver"Jenkins4-0/+81
2016-09-27Merge "A few major-upgrade issues"Jenkins3-25/+46
2016-09-27Merge "Start mongod before calling ceilometer-dbsync"Jenkins1-0/+7
2016-09-27Merge "Reinstantiate parts of code that were accidentally removed"Jenkins2-0/+9
2016-09-27Merge "Neutron metadata agent worker count fix"Jenkins1-3/+10
2016-09-27Merge "Remove double definition of config_settings key in keystone"Jenkins1-1/+0
2016-09-26Merge "Bind MySQL address to hostname appropriate to its network"Jenkins2-1/+14
2016-09-26Remove double definition of config_settings key in keystoneJuan Antonio Osorio Robles1-1/+0
Change-Id: I291bfb1e5736864ea504cd82eea1d4001fcdd931
2016-09-26Bind MySQL address to hostname appropriate to its networkJuan Antonio Osorio Robles2-1/+14
This now takes into use the mysql_bind_host key, to set an appropriate fqdn for mysql to bind to. Closes-Bug: #1627060 Change-Id: I50f4082ea968d93b240b6b5541d84f27afd6e2a3 Depends-On: I316acfd514aac63b84890e20283c4ca611ccde8b
2016-09-25get_param calls with multiple arguments need brackets around themMichele Baldessari8-19/+19
This issue was spotted during major upgrade where we had calls like this: servers: {get_param: servers, Controller} These get_param calls are hanging indefinitely and make the whole upgrade end in a timeout. We need to put brackets around the get_param function when there are multiple arguments: http://docs.openstack.org/developer/heat/template_guide/hot_spec.html#get-param This is already done in most of the tree, and the few places where this was not happening were parts not under CI. After this change the following grep returns only one false positive: grep -ir get_param: |grep -v -- '\[' |grep ',' Change-Id: I65b23bb44f37b93e017dd15a5212939ffac76614 Closes-Bug: #1626628
2016-09-25A few major-upgrade issuesMichele Baldessari3-25/+46
This commit does the following: 1. We now explicitly disable/stop and then remove the resources that are moving to systemd. We do this because we want to make sure they are all stopped before doing a yum upgrade, which otherwise would take ages due to rabbitmq and galera being down. It is best if we do this via pcs while we do the HA Full -> HA NG migration because it is simpler to make sure all the services are stopped at that stage. For extra safety we can still do a check by hand. By doing it via pacemaker we have the guarantee that all the migrated services are down already when we stop the cluster (which happens to be a syncronization point between all controller nodes). That way we can be certain that they are all down on all nodes before starting the yum upgrade process. 2. We actually need to start the systemd services in major_upgrade_controller_pacemaker_2.sh and not stop them. 3. We need to use the proper bash variable name 4. Use is_bootstrap_node everywhere to make the code more consistent Change-Id: Ic565c781b80357bed9483df45a4a94ec0423487c Closes-Bug: #1627490
2016-09-25Start mongod before calling ceilometer-dbsyncMichele Baldessari1-0/+7
Currently we in major_upgrade_controller_pacemaker_2.sh we are calling ceilometer-dbsync before mongod is actually started (only galera is started at this point). This will make the dbsync hang indefinitely until the heat stack times out. Now this approach should be okay, but do note that when we start mongod via systemctl we are not guaranteed that it will be up on all nodes before we call ceilometer-dbsync. This *should* be okay because ceilometer-dbsync keeps retrying and eventually one of the nodes will be available. A completely clean fix here would be to add another step in heat to have the guarantee that all mongo servers are up and running before the dbsync call. Change-Id: I10c960b1e0efdeb1e55d77c25aebf1e3e67f17ca Closes-Bug: #1627453
2016-09-25Reinstantiate parts of code that were accidentally removedMichele Baldessari2-0/+9
With commit fb25385d34e604d2f670cebe3e03fd57c14fa6be "Rework the pacemaker_common_functions for M..N upgrades" we accidentally removed some lines that fixed M/N upgrade issues. Namely: extraconfig/tasks/major_upgrade_controller_pacemaker_1.sh -# https://bugzilla.redhat.com/show_bug.cgi?id=1284047 -# Change-Id: Ib3f6c12ff5471e1f017f28b16b1e6496a4a4b435 -crudini --set /etc/ceilometer/ceilometer.conf DEFAULT rpc_backend rabbit -# https://bugzilla.redhat.com/show_bug.cgi?id=1284058 -# Ifd1861e3df46fad0e44ff9b5cbd58711bbc87c97 Swift Ceilometer middleware no longer exists -crudini --set /etc/swift/proxy-server.conf pipeline:main pipeline "catch_errors healthcheck cache ratelimit tempurl formpost authtoken keystone staticweb proxy-logging proxy-server" -# LP: 1615035, required only for M/N upgrade. -crudini --set /etc/nova/nova.conf DEFAULT scheduler_host_manager host_manager extraconfig/tasks/major_upgrade_controller_pacemaker_2.sh nova-manage db sync - nova-manage api_db sync This patch simply puts that code back without reverting the whole commit that broke things, because that is needed. Closes-Bug: #1627448 Change-Id: I89124ead8928ff33e6b6907a7c2178169e91f4e6
2016-09-23Merge "Remove hard-coded roles in EnabledServices output"Jenkins1-5/+3
2016-09-23Add integration with Manila CephFS Native driverErno Kuvaja4-0/+81
Enables configuring CephFS Native backend for Manila. This change is based on the usage of environments like in review https://review.openstack.org/#/c/354019 for Netapp driver. Co-Authored-By: Marios Andreou <marios@redhat.com> Change-Id: If013d796bcdfe48b2c995bcab462c89c360b7367 Depends-On: I918f6f23ae0bd3542bcfe1bf0c797d4e6aa8f4d9 Depends-On: I2b537f735b8d1be8f39e8c274be3872b193c1014
2016-09-23Move keystone::auth into service_config_settingsDan Prince19-101/+152
This patch moves the keystone::auth settings for all services into the new service_config_settings section. This is important because we execute the keystone commands via puppet only on the role containing the keystone service and without these settings it will fail. Note that yaql merging/filtering is used here to ensure that service_config_settings is optional in service templates, and also that we'll only deploy hieradata for a given service on a node running the service (the key in the service_config_settings map must match the service_name in the service template for this to work). e.g the following will result in only deploying keystone: 123 in hiera on the role running the "keystone" service, regardless of which service template defines it. service_config_settings: keystone: keystone: 123 Co-Authored-By: Steven Hardy <shardy@redhat.com> Change-Id: I0c2fce037a1a38772f998d582a816b4b703f8265 Closes-bug: 1620829
2016-09-23Merge "Tolerate missing keys from role_data in service templates"Jenkins1-6/+10
2016-09-23Merge "explicitly set fluentd service_provider"Jenkins1-0/+1
2016-09-23Merge "No-op Puppet for upgrades/migrations according to composable roles"Jenkins3-15/+3
2016-09-23Remove hard-coded roles in EnabledServices outputSteven Hardy1-5/+3
This was missed during custom-roles work, and will mean deployments break if any of the existing roles are removed from roles_data.yaml Change-Id: Ia737b48a0dd272f8d706b7458764201fa47cb0bb Closes-Bug: #1625755
2016-09-23Merge "Make apache-based services use network-dependent servername"Jenkins4-1/+30
2016-09-22Neutron metadata agent worker count fixBrent Eagles1-3/+10
This patch changes the default value and type of the NeutronWorkers parameter, allowing it to be unset and let a system-dependent value to be used (e.g. processorcount or some derivate value). Change-Id: Ia385b3503fe405c4b981c451f131ac91e1af5602 Closes-Bug: #1626126
2016-09-22explicitly set fluentd service_providerLars Kellogg-Stedman1-0/+1
the konstantin-fluentd package assumes sysv init scripts, while the fluentd package in rhel(/centos/fedora) uses systemd. this can cause errors starting the service. This review explicitly sets the service_provider to "systemd". This requires https://github.com/soylent/konstantin-fluentd/pull/15, which exposes the service_provider parameter in konstantin-fluentd. Change-Id: I24332203de33f56a0e49fcc15f7fb7bb576e8752
2016-09-22Tolerate missing keys from role_data in service templatesSteven Hardy1-6/+10
Currently we have a few keys which may be considered optional, such as monitoring_subscription, logging* and global_config_settings. Currently we dereference these directly via get_attr, but this will break when heat output validation is fixed, ref bug #1599114 is fixed (patches are up for this, so it may be soon). Change-Id: If4eed1ca39c10ace9b1cb5ce2dc4b9c70a3dd2f4 Partial-Bug: #1620829
2016-09-22No-op Puppet for upgrades/migrations according to composable rolesJiri Stransky3-15/+3
Our previous no-ops stopped working because the Puppet run resources moved under a different entry in resource registry. This is now fixed to follow the latest way. Change-Id: Ia5598385ddca185bfbf10e2d3babb53f6f77d1ac Closes-Bug: #1626452
2016-09-22Merge "Make sure major upgrade script fails."Jenkins2-0/+3
2016-09-21Merge "Provide for RAM-constrained environments"Jenkins2-0/+24
2016-09-21Merge "Glance worker count fix"Jenkins2-6/+20
2016-09-21Merge "Define step input as a Number type"Jenkins5-0/+15
2016-09-21Merge "Update capabilities-map.yaml"Jenkins1-22/+282
2016-09-21Merge "Set Neutron's metadata_ip to the nova metadata VIP"Jenkins1-6/+1
2016-09-21Define step input as a Number typeSteven Hardy5-0/+15
Currently we pass numbers in (hard-coded in post.j2.yaml) but the SoftwareConfig schema defaults to String. If puppet requires an integer number, setting this type may help preserve the type for the hook. Change-Id: Ie9227d7adb58ea3c791aa459a1ab5b17ad935919
2016-09-21Glance worker count fixJoe Talerico2-6/+20
This patch changes the default value and type of the Glance worker configuration to allow it to be unset and allow a system dependent default to be used (e.g. processorcount or some derivative value). The previous default of 0 would result in a single self contained process, which while suitable for debugging and testing is not appropriate for production deployments. Partial-Bug: #1626126 Change-Id: I58a6a72a581e7083e1dc4e5ca568fdd3fdd6cdf1
2016-09-21Provide for RAM-constrained environmentsJiri Stransky2-0/+24
We hit problems in environments which don't have a lot of RAM (e.g. dev envs, could be also CI) that Apache ate too much memory due to too many worker processes being spawned. This commit allows customizing the Apache MaxRequestWorkers and ServerLimit directives via Heat parameters. The default stays 256 as that's the default in the Puppet module, to be suited for production environments with powerful machines. Also low-memory-usage.yaml environment file is added, which can be used to make dev/test/CI overclouds less memory hungry, where the limits are now set to 32. Change-Id: Ibcf1d9c3326df8bb5b380066166c4ae3c4bf8d96 Co-Authored-By: Carlos Camacho <ccamacho@redhat.com> Closes-Bug: #1619205
2016-09-21Make defaults from roles_data.yaml more robustSteven Hardy2-13/+24
The previous logic left out the default Count completely when it was zero, which breaks nested validation and it's likely similar problems would exist with the other optional defaults, so rework it so the defaulting happens in the jinja2 logic, and document the interfaces better in roles_data.yaml Change-Id: I7f2eb4a3a0b43c5d2cd0d001ed3c73f783c95c74 Closes-Bug: #1625760
2016-09-21Merge "Enable L3 HA when multiple controllers and no DVR"Jenkins1-3/+25
2016-09-21Make apache-based services use network-dependent servernameJuan Antonio Osorio Robles4-1/+30
Currently the servername is incorrectly set for the services running over apache. It currently takes the default value which is just the regular FQDN, when the services actually might be running on different IPs that require alternative FQDNs. This fixes that by filling that value from a fact in hiera that's dependant on the service's network. Closes-Bug: #1625677 Change-Id: Ib7ea5fd2d18a376eaa2f5a3fa5687cb9b719a8e2
2016-09-21Make sure major upgrade script fails.Sofer Athlan-Guyot2-0/+3
Running upgrade-non-controller.sh against compute and object storage did not fail if the /root/tripleo_upgrade_node.sh failed. This make it harder to detect error in CI system for instance. Change-Id: I12b7d640547d3b8ec1f70104d159d6052b7638ff Closes-Bug: 1620973
2016-09-20Merge "RabbitMQ threads should be configured dynamically"Jenkins1-1/+1
2016-09-20Set Neutron's metadata_ip to the nova metadata VIPBrent Eagles1-6/+1
The neutron metadata agent's metadata_ip field is meant to refer to the nova metadata service, not the local address on the NeutronApiNetwork. Change-Id: Ibb25a80ea3e66ab3f5cf63c197460d495939778d Closes-Bug: #1625504
2016-09-20Add nova-metadata templateJuan Antonio Osorio Robles3-0/+36
This is needed because currently we're not generating nova_metadata_vip or nova_metadata_nodes_ip, and a service profile is required for that. Unfortunately, currently puppet-nova only deploys osapi and metadata through the same manifest, so this profile doesn't really inject any puppet code. We can make this more elegant later. Change-Id: Id7112111f16d0c749a6203b90e29e6d9f1e4d57e Closes-Bug: #1625543
2016-09-20RabbitMQ threads should be configured dynamicallyMichele Baldessari1-1/+1
Currently in puppet/services/rabbitmq.yaml we hardcode the thread pool size to 30 (via the +A30 snippet): rabbitmq_environment: RABBITMQ_SERVER_ERL_ARGS: '"+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"' Upstream rabbit has gained the ability to dynamically configure the number of threads since 3.6.2 via the following commit: https://github.com/rabbitmq/rabbitmq-server/commit/41ce5ad808863944cd6d62ce7f7e2271f1010582 Given that the default was hardcoded in rabbit from at least 3.4.0 up until 3.6.2 (see LP bug associated to this commit), we can actually remove this hardcoded value as it overrides a sane default. Before the change: /usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -A30 -P 1048576 ... After the change: /usr/lib64/erlang/erts-7.3.1/bin/beam.smp -W w -A 64 -K true -P 1048576 ... So effectively with this change we will have the following: - With older rabbitmq versions we keep the +A30 default - With rabbitmq versions >= 3.6.2 the thread number is dynamically computed to nr_cpus * 16 Change-Id: I8d30c7d141c29fcc439d40fc767498520be7966e Closes-Bug: #1625486
2016-09-19Enable L3 HA when multiple controllers and no DVRBrent Eagles1-3/+25
This patch conditionally enables Neutron L3 HA if there are multiple controllers but DVR has not been enabled. If the conditions are false, the value of NeutronL3HA is used. Change-Id: If1ebeaf417c0da99d833450e394b71cabff2c800 Closes-Bug: #1623155
2016-09-19Merge "Add a function to upgrade from full HA to NG HA"Jenkins3-16/+137
2016-09-19Merge "Set VNC URL parameters for nova-compute"Jenkins1-0/+3
2016-09-19Add a function to upgrade from full HA to NG HAMichele Baldessari3-16/+137
This is the initial work to have a function that migrates a full HA architecture as deployed in Mitaka to the HA architecture as deployed in Newton where only a few resources are managed by pacemaker. The sequence is the following: 1) We remove the desired services from pacemaker's control. The services at this point are still running normally via the systemd service as invoked by pacemaker 2) We do a "systemctl stop <service>" on all controllers for all the services that were removed from pacemaker's control. We do this to make sure that during the yum upgrade, the %post sections that call "systemctl try-restart" do not take ages, because at this point during the upgrade rabbit is down. The only exceptions are "openstack-core" and "delay" which are dummy pacemaker resources that do not exist on the system 3) We do a "systemctl start <service>" on all nodes for all the services mentioned above. We should probably merge this patch only when newton has branched as it is very specific to the M/N upgrade. Closes-Bug: 1617520 Change-Id: I4c409ce58c1a57b6e0decc3cf168b62698b32e39
2016-09-19Use osd_pool_default_* puppet parameters when creating the poolsGiulio Fidente1-3/+6
While it is possible to override the pg_num, pgp_num and size for each pool, the defaults are hardcoded. This patch uses as default the values given via ceph::profile::params::osd_pool_default_* parameters, if any. Closes-Bug: 1623590 Change-Id: Iecde772e7f72fd9abedb54cff4b8f2605df8fedd
2016-09-17Merge "M/N upgrade sahara-api fails to restart."Jenkins1-0/+2
2016-09-17Merge "Add fluentd client service"Jenkins66-0/+678
2016-09-17Merge "Move rabbit's clustering port away from the ephemeral port range"Jenkins1-3/+3