.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
Detailed architecture and message flows
=======================================
Within the Promise project, we consider two architectural options: a
*shim-layer* based architecture and an architecture targeting full
OpenStack *integration*.
Shim-layer architecture
-----------------------
The *shim-layer architecture* uses a layer on top of OpenStack to provide
the capacity management, resource reservation, and resource allocation features.
Detailed Message Flows
^^^^^^^^^^^^^^^^^^^^^^
Note that only selected parameters of the messages are shown. Refer to
:ref:`northbound_API` and Annex :ref:`yang_schema` for a full set of message
parameters.
Resource Capacity Management
""""""""""""""""""""""""""""
.. figure:: images/figure5_new.png
   :name: figure5
   :width: 90%

   Capacity Management Scenario

:numref:`figure5` shows a detailed message flow between the consumers and the
capacity management functional blocks inside the shim-layer. It has the
following steps:
* Step 1a: The Consumer sends a *query-capacity* request to Promise
  using a filter such as a time window or resource type. The capacity is
  looked up in the shim-layer capacity map.
* Step 1b: The shim-layer will respond with information about the
total, available, reserved, and used (allocated) capacities matching the
filter.
* Step 2a: The Consumer can send *increase/decrease-capacity* requests
  to update the capacity available to the reservation system. This capacity
  can be 100% of the available capacity in the given provider/source or only
  a subset, i.e., it can leave some "buffer" in the actual NFVI to be
  used outside the Promise shim-layer or by a different reservation
  service instance. These requests can also inform the reservation system
  that, from a certain time in the future, additional resources can be
  reserved (e.g. due to a planned upgrade of the capacity) or the
  available capacity will be reduced (e.g. due to a planned downtime of
  some of the resources).
* Step 2b: The shim-layer will respond with an ACK/NACK message.
* Step 3a: Consumers can subscribe for capacity-change events using a
filter.
* Step 3b: The shim-layer responds to each successful subscription with a
  subscription_id.
* Step 4: The shim-layer monitors the capacity information for the
various types of resources by periodically querying the various
Controllers (e.g. Nova, Neutron, Cinder) or by creating event alarms in
the VIM (e.g. with Ceilometer for OpenStack) and updates capacity
information in its capacity map.
* Step 5: The shim-layer notifies the Consumer of capacity changes.
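
The capacity bookkeeping behind steps 1 to 5 can be illustrated with a minimal
Python sketch. It is not the Promise implementation: the class and method names
(``CapacityMap``, ``query_capacity`` and so on) are hypothetical, time windows
are left out, and rejecting a decrease that would cut into reserved or used
capacity is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CapacityEntry:
    """Capacity counters for one resource type (hypothetical structure)."""
    total: int = 0
    reserved: int = 0
    used: int = 0

    @property
    def available(self) -> int:
        # Capacity that is neither reserved nor allocated.
        return self.total - self.reserved - self.used


class CapacityMap:
    """Toy capacity map; time windows are deliberately left out."""

    def __init__(self) -> None:
        self._entries: Dict[str, CapacityEntry] = {}
        self._subscribers: Dict[int, Callable[[dict], None]] = {}
        self._next_subscription_id = 1

    def query_capacity(self, resource_type: str) -> dict:
        # Steps 1a/1b: look up the counters matching the filter.
        entry = self._entries.get(resource_type, CapacityEntry())
        return {"total": entry.total, "available": entry.available,
                "reserved": entry.reserved, "used": entry.used}

    def increase_capacity(self, resource_type: str, amount: int) -> bool:
        # Steps 2a/2b: True stands for ACK, False for NACK.
        if amount <= 0:
            return False
        entry = self._entries.setdefault(resource_type, CapacityEntry())
        entry.total += amount
        self._notify(resource_type)
        return True

    def decrease_capacity(self, resource_type: str, amount: int) -> bool:
        # Assumption: never shrink below what is already reserved or used.
        entry = self._entries.get(resource_type)
        if entry is None or amount <= 0 or amount > entry.available:
            return False
        entry.total -= amount
        self._notify(resource_type)
        return True

    def subscribe(self, callback: Callable[[dict], None]) -> int:
        # Steps 3a/3b: register a callback and return a subscription_id.
        subscription_id = self._next_subscription_id
        self._next_subscription_id += 1
        self._subscribers[subscription_id] = callback
        return subscription_id

    def _notify(self, resource_type: str) -> None:
        # Step 5: push the new counters to every subscriber.
        event = {"resource_type": resource_type,
                 **self.query_capacity(resource_type)}
        for callback in self._subscribers.values():
            callback(event)
```

In this sketch a Consumer would call ``subscribe`` once and then receive one
event per successful capacity change; step 4 (polling the VIM) would end up
calling the same increase and decrease methods.
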
Resource Reservation
""""""""""""""""""""
.. figure:: images/figure6_new.png
   :name: figure6
   :width: 90%

   Resource Reservation for Future Use Scenario

:numref:`figure6` shows a detailed message flow between the Consumer and the
resource reservation functional blocks inside the shim-layer. It has the
following steps:
* Step 1a: The Consumer creates a resource reservation request for
future use by setting a start and end time for the reservation as well as
more detailed information about the resources to be reserved. The Promise
shim-layer will check the free capacity in the given time window and in
case sufficient capacity exists to meet the reservation request, will
mark those resources "reserved" in its reservation map.
* Step 1b: If the reservation was successful, a reservation_id and
status of the reservation will be returned to the Consumer. In case the
reservation cannot be met, the shim-layer may return information about
the maximum capacity that could be reserved during the requested time
window and/or a potential time window where the requested (amount of)
resources would be available.
* Step 2a: Reservations can be updated using an *update-reservation*
  request that provides the reservation_id and the new reservation_data. The
  Promise Reservation Manager will check whether the reservation can be
  updated as requested.
* Step 2b: If the reservation was updated successfully, a
reservation_id and status of the reservation will be returned to the
Consumer. Otherwise, an appropriate error message will be returned.
* Step 3a: A *cancel-reservation* request can be used to withdraw an
existing reservation. Promise will update the reservation map by removing
the reservation as well as the capacity map by adding the freed capacity.
* Step 3b: The response message confirms the cancelation.
* Step 4a: Consumers can also issue *query-reservation* requests to
  receive a list of reservations. An input filter can be used to narrow down
  the query, e.g., to return only reservations in a given time window.
  Promise will query its reservation map to identify reservations matching
  the input filter.
* Step 4b: The response message contains information about all
reservations matching the input filter. It also provides information
about the utilization in the requested time window.
* Step 5a: Consumers can subscribe for reservation-change events using
a filter.
* Step 5b: Promise responds to each successful subscription with a
  subscription_id.
* Step 6a: Promise synchronizes the available and used capacity with
the underlying VIM.
* Step 6b: In certain cases, e.g., due to a failure in the underlying
  hardware, some reservations can no longer be honored and have to be
  updated or canceled. The shim-layer will identify affected reservations
  among its reservation records.
* Step 7: Subscribed Consumers will be informed about the updated
reservations. The notification contains the updated reservation_data and
new status of the reservation. It is then up to the Consumer to take
appropriate actions in order to ensure high priority reservations are
favored over lower priority reservations.
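
The feasibility check in steps 1a and 2a amounts to comparing the requested
amount with the free capacity over the whole time window. The following Python
sketch shows one way to do this for a single resource type. All names are
hypothetical, times are plain integers, the end time is treated as exclusive,
and the ``max_capacity`` field in the rejection is only one possible form of
the hint mentioned in step 1b.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass(frozen=True)
class Reservation:
    start: int   # start time, e.g. epoch seconds
    end: int     # end time (exclusive)
    amount: int  # reserved units of one resource type


class ReservationMap:
    """Toy reservation map for a single resource type."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._reservations: Dict[int, Reservation] = {}
        self._ids = count(1)

    def peak_reserved(self, start: int, end: int,
                      ignore_id: Optional[int] = None) -> int:
        # Reservations overlapping the window [start, end).
        active = [r for rid, r in self._reservations.items()
                  if rid != ignore_id and r.start < end and start < r.end]
        # Usage only rises at reservation start times, so it is enough to
        # evaluate the window start and every start time inside the window.
        points = {start} | {r.start for r in active if start <= r.start < end}
        return max(sum(r.amount for r in active if r.start <= t < r.end)
                   for t in points)

    def create_reservation(self, start: int, end: int, amount: int) -> dict:
        # Steps 1a/1b: accept only if the peak usage leaves enough room.
        free = self.capacity - self.peak_reserved(start, end)
        if amount > free:
            return {"status": "rejected", "max_capacity": free}
        reservation_id = next(self._ids)
        self._reservations[reservation_id] = Reservation(start, end, amount)
        return {"status": "reserved", "reservation_id": reservation_id}

    def update_reservation(self, reservation_id: int, start: int, end: int,
                           amount: int) -> dict:
        # Steps 2a/2b: re-run the check without counting the old reservation.
        if reservation_id not in self._reservations:
            return {"status": "error", "reason": "unknown reservation_id"}
        free = self.capacity - self.peak_reserved(start, end, reservation_id)
        if amount > free:
            return {"status": "error", "reason": "insufficient capacity"}
        self._reservations[reservation_id] = Reservation(start, end, amount)
        return {"status": "reserved", "reservation_id": reservation_id}

    def cancel_reservation(self, reservation_id: int) -> bool:
        # Steps 3a/3b: removing the record frees its capacity implicitly.
        return self._reservations.pop(reservation_id, None) is not None

    def query_reservation(self, start: int, end: int) -> Dict[int, Reservation]:
        # Steps 4a/4b: all reservations overlapping the requested window.
        return {rid: r for rid, r in self._reservations.items()
                if r.start < end and start < r.end}
```

Because free capacity is derived from the stored reservations on every call,
this sketch needs no separate counter to be adjusted on cancelation; a real
system would likely cache or index this information.
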
Resource Allocation
"""""""""""""""""""
.. figure:: images/figure7_new.png
   :name: figure7
   :width: 90%

   Resource Allocation

:numref:`figure7` shows a detailed message flow between the Consumer, the
functional blocks inside the shim-layer, and the VIM. It has the following
steps:
* Step 1a: The Consumer sends a *create-instance* request providing
  information about the resources to be allocated, i.e., provider_id
  (optional in case of only one provider), name of the instance, the
  requested flavor and image, etc. If the allocation is against an
  existing reservation, the reservation_id has to be provided.
* Step 1b: If a reservation_id was provided, Promise checks whether a
  reservation with that ID exists, the reservation start time has arrived
  (i.e. the reservation is active), and the required capacity for the
  requested flavor is within the available capacity of the reservation. If
  those conditions are met, Promise creates a record for the allocation
  (VMState="INITIALIZED") and updates its databases. If no reservation_id
  was provided in the allocation request, Promise checks whether the
  required capacity to meet the request can be provided from the available,
  non-reserved capacity. If yes, Promise creates a record for the
  allocation and updates its databases. In any other case, Promise rejects
  the *create-instance* request.
* Step 2: If the *create-instance* request was rejected,
  Promise responds with "status=rejected" and the reason for the
  rejection. This helps the Consumer to take appropriate actions, e.g.,
  send an updated *create-instance* request. The allocation workflow
  terminates at this step and the steps below are not executed.
* Step 3a: If the *create-instance* request was accepted and a related
allocation record has been created, the shim-layer issues a
*createServer* request to the VIM Controller providing all information to
create the server instance.
* Step 3b: The VIM Controller sends an immediate reply with an
instance_id and starts the VIM-internal allocation process.
* Step 4: The Consumer gets an immediate response message with
allocation status "in progress" and the assigned instance_id.
* Step 5a+b: The Consumer subscribes to receive notifications about
  allocation events related to the requested instance. Promise responds
  with an acknowledgment including a subscribe_id.
* Step 6: In parallel to the previous step, the Promise shim-layer creates
  an alarm in Aodh to receive notifications about all changes to the
  VMState for instance_id.
* Step 7a: The VIM Controller reports all instance-related events to
  Ceilometer. After the allocation has completed or failed, it sends
  an event to Ceilometer. This triggers the OpenStack alarming service Aodh
  to notify the shim-layer of the new VMState (e.g. ACTIVE or ERROR); the
  shim-layer then updates its internal allocation records.
* Step 7b: Promise sends a notification message to the subscribed
Consumer with information on the allocated resources including their new
VMState.
* Step 8a+b: Allocated instances can be terminated by the Consumer by
sending a *destroy-instance* request to the shim-layer. Promise responds
with an acknowledgment and the new status "DELETING" for the instance.
* Step 9a: Promise sends a *deleteServer* request for the instance_id
to the VIM Controller.
* Step 10a: After the instance has been deleted, an event alarm is
  sent to the shim-layer, which updates its internal allocation records and
  capacity utilization.
* Step 10b: The shim-layer also notifies the subscribed Consumer about
the successfully destroyed instance.
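
The life cycle of an allocation record across steps 1b to 10a can be pictured
as a small state machine. The Python sketch below is an illustration only: the
states INITIALIZED, ACTIVE, ERROR and DELETING and the status "in progress"
appear in the steps above, whereas ``DELETED``, the transition table and the
``AllocationRecord`` class are assumptions introduced for this example.

```python
from enum import Enum


class VMState(Enum):
    INITIALIZED = "INITIALIZED"   # record created (step 1b)
    IN_PROGRESS = "in progress"   # createServer issued (steps 3a to 4)
    ACTIVE = "ACTIVE"             # allocation completed (step 7a)
    ERROR = "ERROR"               # allocation failed (step 7a)
    DELETING = "DELETING"         # destroy-instance accepted (step 8)
    DELETED = "DELETED"           # assumed final state after step 10a


# Assumed set of legal transitions; the source does not spell these out.
ALLOWED_TRANSITIONS = {
    VMState.INITIALIZED: {VMState.IN_PROGRESS},
    VMState.IN_PROGRESS: {VMState.ACTIVE, VMState.ERROR},
    VMState.ACTIVE: {VMState.DELETING},
    VMState.ERROR: {VMState.DELETING},
    VMState.DELETING: {VMState.DELETED},
    VMState.DELETED: set(),
}


class AllocationRecord:
    """Hypothetical allocation record kept by the shim-layer."""

    def __init__(self, instance_name: str) -> None:
        self.instance_name = instance_name
        self.state = VMState.INITIALIZED
        self.history = [self.state]

    def transition(self, new_state: VMState) -> None:
        # Reject state changes that skip or reverse the expected flow.
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state
        self.history.append(new_state)
```

A notification handler for the Aodh alarms of step 7a could call
``transition`` with the reported VMState and forward the result to subscribed
Consumers as in step 7b.
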
Internal operations
^^^^^^^^^^^^^^^^^^^
.. note:: This section is to be updated
In the following, the internal logic and operations of the shim-layer will be
explained in more detail, e.g. the "check request" (step 1b in
:numref:`figure7` of the allocation workflow).
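
As a starting point, the "check request" logic of step 1b could look like the
following Python sketch. The function name, its parameters and the rejection
reasons are hypothetical. Capacity is reduced to a single integer, and the
check that the reservation has not yet ended is an assumption that goes beyond
the conditions listed in step 1b.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Reservation:
    start: int      # reservation start time
    end: int        # reservation end time (exclusive)
    capacity: int   # units reserved
    used: int = 0   # units already allocated against this reservation


def check_request(required: int, now: int, free_capacity: int,
                  reservations: Dict[int, Reservation],
                  reservation_id: Optional[int] = None) -> Tuple[bool, str]:
    """Return (accepted, reason) for a create-instance request."""
    if reservation_id is not None:
        reservation = reservations.get(reservation_id)
        # Condition 1: the reservation must exist.
        if reservation is None:
            return False, "unknown reservation_id"
        # Condition 2: the reservation must be active. The upper bound is an
        # assumption; the text only requires that the start time has arrived.
        if not (reservation.start <= now < reservation.end):
            return False, "reservation not active"
        # Condition 3: the flavor must fit into what is left of the reservation.
        if required > reservation.capacity - reservation.used:
            return False, "insufficient reserved capacity"
        return True, "accepted"
    # No reservation: serve the request from available, non-reserved capacity.
    if required > free_capacity:
        return False, "insufficient non-reserved capacity"
    return True, "accepted"
```

On acceptance, the caller would create the allocation record with
VMState="INITIALIZED" and update its databases; on rejection, the returned
reason can be passed back to the Consumer in step 2.
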
Integrated architecture
-----------------------
The *integrated architecture* aims at full integration with OpenStack.
This means that it is planned to use the already existing OpenStack APIs
extended with the reservation capabilities.
The advantage of this approach is that we don't need to re-model the
complex resource structure we have for the virtual machines and the
corresponding infrastructure.
The atomic item is the virtual machine with the minimum set of resources
it requires to start up. It is important to state that
resource reservation is handled at the VM instance level as opposed to standalone
resources like CPU, memory and so forth. Because placement is an important
aspect of being able to use the reserved resources, it imposes the
constraint that resources be handled in groups.
The placement constraint also makes it impossible to use a quota management
system to solve the base use case described earlier in this document.
OpenStack had a project called Blazar, which was created to provide
resource reservation functionality in cloud environments. It uses the Shelve
API of Nova, which provides a sub-optimal solution. Because this
feature blocks the reserved resources, this solution cannot be considered
final. Further work is needed to reach a more optimal stage, in which the
Nova scheduler is used to schedule resources for future
use in order to make the reservations.
Phases of the work
^^^^^^^^^^^^^^^^^^
The work has two main stages to reach the final solution. The following main work items
are on the roadmap for this approach:
#. Sub-optimal solution by using the shelve API of Nova through the Blazar project:
* Fix the code base of the Blazar project:
Due to integration difficulties the Blazar project got suspended. Since the last
activities in that repository the OpenStack code base and environment changed
significantly, which means that the project's code base needs to be updated to the
latest standards and has to be able to interact with the latest version of the
other OpenStack services.
* Update the Blazar API:
The REST API needs to be extended to contain the attributes for the reservation
defined in this document. This activity shall include testing towards the new API.
#. Use Nova scheduler to avoid blocking the reserved resources:
* Analyze the Nova scheduler:
The status and the possible interface between the resource reservation system and
the Nova scheduler need to be identified. It is crucial to achieve a much more
optimal solution than what the current version of Blazar can provide. The goal is
to be able to use the reserved resources before the reservation starts. In order to
be able to achieve this we need the scheduler to do scheduling for the future
considering the reservation intervals that are specified in the request.
* Define a new design based on the analysis and start the work on it:
The design for the more optimal solution can be defined only after analyzing the
structure and capabilities of the Nova scheduler.
* This phase can be started in parallel with the previous one.
Detailed Message Flows
^^^^^^^^^^^^^^^^^^^^^^
.. note:: to be done
Resource Reservation
""""""""""""""""""""
.. note:: to be specified