Node Migration

Node Migration is functionality that allows Clusterman to recycle the nodes of a pool according to various criteria, reducing the amount of manual work needed when performing infrastructure migrations.

NOTE: this is only compatible with Kubernetes clusters.

Node Migration Batch

The Node Migration batch is the entrypoint of the migration logic. It takes care of fetching migration trigger events, spawning the worker processes actually performing the node recycling procedures, and monitoring their health.

Batch specific configuration values are described as part of the main service configuration in Service Configuration.

The batch code can be invoked from the clusterman.batch.node_migration Python module.

Pool Configuration

The behaviour of the migration logic for a pool is controlled by the node_migration section of the pool configuration. The allowed values for the migration settings are as follows:

  • trigger:

    • max_uptime: if set, monitor nodes’ uptime to ensure it stays lower than the provided value; human readable time string (e.g. 30d).

    • event: if set to true, accept async migration triggers for this pool; details about event triggers are described below in Migration Event Trigger.

  • strategy:

    • rate: rate at which nodes are selected for termination; percentage or absolute value (required).

    • prescaling: if set, pool size (in nodes) is increased by this amount before performing node recycling; percentage or absolute value (0 by default). This directly sets a capacity value for the pool if autoscaling is disabled, or applies a temporary capacity offset otherwise.

    • precedence: precedence with which nodes are selected for termination:

      • highest_uptime: select older nodes first (default);

      • lowest_task_count: select nodes with fewer running tasks first;

      • az_name_alphabetical: group nodes by availability zone, and select groups in alphabetical order;

    • bootstrap_wait: indicative time necessary for a node to be ready to run workloads after boot; human readable time string (3 minutes by default).

    • bootstrap_timeout: maximum wait for nodes to be ready after boot; human readable time string (10 minutes by default).

    • allowed_failed_drains: allow up to this many nodes to fail draining and be requeued before aborting (3 by default).

  • disable_autoscaling: turn off autoscaler while recycling instances (false by default).

  • ignore_pod_health: avoid loading and checking pod information to determine pool health (false by default).

  • health_check_interval: how long to wait between checks when monitoring pool health (2 minutes by default).

  • orphan_capacity_tollerance: acceptable ratio of orphan capacity over target capacity to still consider the pool healthy (float, 0 by default, max 0.2).

  • max_uptime_worker_skips: maximum number of times the uptime monitoring worker can skip churning nodes due to unsatisfied pool capacity (6 by default, set to 0 to always allow skipping).

  • expected_duration: estimated duration for migration of the whole pool; human readable time string (1 day by default).
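Putting the settings above together, a node_migration block could look like the following. This is an illustrative sketch only: the values are placeholders, not recommendations, and the exact time-string spellings (e.g. 3m vs 3min) may differ from what your Clusterman version accepts.

```yaml
node_migration:
  trigger:
    max_uptime: 30d         # recycle nodes older than 30 days
    event: true             # also accept async nodemigration events
  strategy:
    rate: 5%                # recycle 5% of the pool at a time
    prescaling: 10%         # grow the pool by 10% before recycling
    precedence: highest_uptime
    bootstrap_wait: 3m
    bootstrap_timeout: 10m
    allowed_failed_drains: 3
  disable_autoscaling: false
  ignore_pod_health: false
  health_check_interval: 2m
  orphan_capacity_tollerance: 0.1
  expected_duration: 1d
```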

See Pool Configuration for an example configuration block.
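The precedence strategies above can be sketched as simple sort orders. The Node record and function below are hypothetical illustrations, not Clusterman's internal types:

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    uptime_seconds: int
    task_count: int
    az: str

def order_for_termination(nodes, precedence="highest_uptime"):
    """Return nodes in the order they would be selected for termination."""
    if precedence == "highest_uptime":
        # older nodes first (default)
        return sorted(nodes, key=lambda n: n.uptime_seconds, reverse=True)
    if precedence == "lowest_task_count":
        # least-loaded nodes first
        return sorted(nodes, key=lambda n: n.task_count)
    if precedence == "az_name_alphabetical":
        # group by availability zone, alphabetically
        return sorted(nodes, key=lambda n: n.az)
    raise ValueError(f"unknown precedence: {precedence}")
```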

Migration Event Trigger

Migration trigger events are submitted as Kubernetes custom resources of type nodemigration. They can be easily generated and submitted by using the clusterman migrate CLI sub-command and its related options. In case migration jobs for a pool need to be stopped, it is possible to use the clusterman migrate-stop utility. The manifest for the custom resource definition is as follows:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # the CRD group is not shown in this document; <group> is a placeholder
  name: nodemigrations.<group>
spec:
  group: <group>
  scope: Cluster
  names:
    plural: nodemigrations
    singular: nodemigration
    kind: NodeMigration
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required:
            - spec
          properties:
            spec:
              type: object
              required:
                - cluster
                - pool
                - condition
              properties:
                cluster:
                  type: string
                pool:
                  type: string
                label_selectors:
                  type: array
                  items:
                    type: string
                condition:
                  type: object
                  properties:
                    trait:
                      type: string
                      enum: [kernel, lsbrelease, instance_type, uptime]
                    target:
                      type: string
                    operator:
                      type: string
                      enum: [gt, ge, eq, ne, lt, le, in, notin]

In more readable terms, an example resource manifest would look like:

apiVersion: ""
kind: NodeMigration
metadata:
  name: my-test-migration-220912
  labels:
    # label key not shown in this document; "pending" is the migration status
    migration_status: pending
spec:
  cluster: kubestage
  pool: default
  condition:
    trait: uptime
    operator: lt
    target: 90d

The fields in each migration event control which nodes are affected by the event and what the desired final condition for them is. More specifically:

  • cluster: name of the cluster to be targeted.

  • pool: name of the pool to be targeted.

  • label_selectors: list of additional Kubernetes label selectors to filter affected nodes.

  • condition: the desired final state for the nodes, e.g. all nodes must have a kernel version higher than X.

    • trait: metadata to be compared; currently supports kernel, lsbrelease, instance_type, or uptime.

    • operator: comparison operator; supports gt, ge, eq, ne, lt, le, in, notin.

    • target: right side of the comparison expression, e.g. a kernel version or an instance type; may be a single string or a comma-separated list when using the in / notin operators.