Node Migration is a feature that allows Clusterman to recycle the nodes of a pool according to various criteria, reducing the manual work needed when performing infrastructure migrations.
NOTE: this is only compatible with Kubernetes clusters.
Node Migration Batch
The Node Migration batch is the entrypoint of the migration logic. It takes care of fetching migration trigger events, spawning the worker processes actually performing the node recycling procedures, and monitoring their health.
Batch-specific configuration values are described as part of the main service configuration in Service Configuration.
The batch code can be invoked from the clusterman.batch.node_migration Python module.
The behaviour of the migration logic for a pool is controlled by the node_migration section of the pool configuration.
The allowed values for the migration settings are as follows:
max_uptime: if set, monitor nodes’ uptime to ensure it stays lower than the provided value; human readable time string (e.g. 30d).
event: if set to true, accept async migration triggers for this pool; details about event triggers are described below in Migration Event Trigger.
rate: rate at which nodes are selected for termination; percentage or absolute value (required).
prescaling: if set, pool size (in nodes) is increased by this amount before performing node recycling; percentage or absolute value (0 by default). This directly sets a capacity value for the pool if autoscaling is disabled, or applies a temporary capacity offset otherwise.
precedence: order in which nodes are selected for termination; one of:
  highest_uptime: select older nodes first (default);
  lowest_task_count: select nodes with fewer running tasks first;
  az_name_alphabetical: group nodes by availability zone, and select groups in alphabetical order.
bootstrap_wait: indicative time necessary for a node to be ready to run workloads after boot; human readable time string (3 minutes by default).
bootstrap_timeout: maximum wait for nodes to be ready after boot; human readable time string (10 minutes by default).
allowed_failed_drains: allow up to this many nodes to fail draining and be requeued before aborting (3 by default).
disable_autoscaling: turn off autoscaler while recycling instances (false by default).
ignore_pod_health: avoid loading and checking pod information to determine pool health (false by default).
health_check_interval: how long to wait between checks when monitoring pool health (2 minutes by default).
orphan_capacity_tollerance: acceptable ratio of orphan capacity over target capacity to still consider the pool healthy (float, 0 by default, max 0.2).
max_uptime_worker_skips: maximum number of times the uptime monitoring worker can skip churning nodes due to unsatisfied pool capacity (6 by default, set to 0 to always allow skipping).
expected_duration: estimated duration for migration of the whole pool; human readable time string (1 day by default).
See Pool Configuration for what an example configuration block looks like.
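To illustrate how the rate and precedence settings described above could interact, here is a minimal Python sketch. It is not Clusterman's actual implementation: the Node dataclass, resolve_amount, and select_batch are hypothetical names, and rounding percentages up is an assumption made only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical, simplified view of a pool node (not a Clusterman type).
    name: str
    uptime_seconds: int
    task_count: int
    az: str


def resolve_amount(value, pool_size):
    """Turn a 'percentage or absolute value' setting into a node count."""
    if isinstance(value, str) and value.endswith("%"):
        # Assumption: percentages are rounded up to a whole node.
        return math.ceil(pool_size * float(value[:-1]) / 100)
    return int(value)


# Sort keys approximating the three documented precedence values.
PRECEDENCE_KEYS = {
    "highest_uptime": lambda node: -node.uptime_seconds,
    "lowest_task_count": lambda node: node.task_count,
    # Sorting by zone name keeps each zone's nodes together, in alphabetical order.
    "az_name_alphabetical": lambda node: node.az,
}


def select_batch(nodes, rate, precedence="highest_uptime"):
    """Return the next nodes to terminate, ordered by the chosen precedence."""
    ordered = sorted(nodes, key=PRECEDENCE_KEYS[precedence])
    count = min(resolve_amount(rate, len(nodes)), len(nodes))
    return ordered[:count]
```

With a four-node pool and rate set to 50%, this sketch would pick the two nodes with the highest uptime; with an absolute rate of 1 and lowest_task_count, it would pick the single least busy node.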
Migration Event Trigger
Migration trigger events are submitted as Kubernetes custom resources of type NodeMigration.
They can be easily generated and submitted by using the clusterman migrate CLI sub-command and its related options.
If jobs for a pool need to be stopped, use the clusterman migrate-stop utility.
The manifest for the custom resource definition is as follows:
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodemigrations.clusterman.yelp.com
spec:
  scope: Cluster
  group: clusterman.yelp.com
  names:
    plural: nodemigrations
    singular: nodemigration
    kind: NodeMigration
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required:
            - spec
          properties:
            spec:
              type: object
              required:
                - cluster
                - pool
                - condition
              properties:
                cluster:
                  type: string
                pool:
                  type: string
                label_selectors:
                  type: array
                  items:
                    type: string
                condition:
                  type: object
                  properties:
                    trait:
                      type: string
                      enum: [kernel, lsbrelease, instance_type, uptime]
                    target:
                      type: string
                    operator:
                      type: string
                      enum: [gt, ge, eq, ne, lt, le, in, notin]
In more readable terms, an example resource manifest would look like:
---
apiVersion: "clusterman.yelp.com/v1"
kind: NodeMigration
metadata:
  name: my-test-migration-220912
  labels:
    clusterman.yelp.com/migration_status: pending
spec:
  cluster: kubestage
  pool: default
  condition:
    trait: uptime
    operator: lt
    target: 90d
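For cases where a manifest like the one above needs to be produced programmatically, the following Python sketch builds the same structure as a plain dictionary and checks the trait and operator against the enums from the custom resource definition. The function name build_node_migration is hypothetical and not part of Clusterman; requiring all three condition fields is a simplification made for this example.

```python
# Enum values copied from the custom resource definition schema.
ALLOWED_TRAITS = {"kernel", "lsbrelease", "instance_type", "uptime"}
ALLOWED_OPERATORS = {"gt", "ge", "eq", "ne", "lt", "le", "in", "notin"}


def build_node_migration(name, cluster, pool, trait, operator, target,
                         label_selectors=None):
    """Build a NodeMigration manifest as a dict (hypothetical helper)."""
    if trait not in ALLOWED_TRAITS:
        raise ValueError(f"unsupported trait: {trait!r}")
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")

    spec = {
        "cluster": cluster,
        "pool": pool,
        "condition": {
            "trait": trait,
            "operator": operator,
            "target": str(target),  # the schema declares target as a string
        },
    }
    if label_selectors:
        # Optional field: only emitted when selectors are provided.
        spec["label_selectors"] = list(label_selectors)

    return {
        "apiVersion": "clusterman.yelp.com/v1",
        "kind": "NodeMigration",
        "metadata": {
            "name": name,
            # Label taken from the example manifest.
            "labels": {"clusterman.yelp.com/migration_status": "pending"},
        },
        "spec": spec,
    }
```

The resulting dictionary could then be serialized to JSON or YAML, or handed to a Kubernetes client for submission; the clusterman migrate sub-command remains the documented way to submit events.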
The fields in each migration event control which nodes are affected by the event and the desired final condition for them. More specifically:
cluster: name of the cluster to be targeted.
pool: name of the pool to be targeted.
label_selectors: list of additional Kubernetes label selectors to filter affected nodes.
condition: the desired final state for the nodes, e.g. all nodes must have a kernel version higher than X.
  trait: metadata to be compared; currently supports kernel, lsbrelease, instance_type and uptime.
  operator: comparison operator; supports gt, ge, eq, ne, lt, le, in and notin.
  target: right side of the comparison expression, e.g. a kernel version or an instance type; may be a single string or a comma-separated list when using the in or notin operators.
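As a rough illustration of how a condition might be evaluated against a single node, the Python sketch below maps the operator names from the custom resource definition onto standard comparisons. Everything here is an assumption made for the example rather than Clusterman's own code: the helper names, the duration suffixes accepted by parse_duration, the numeric version comparison, and the idea that a node's trait value arrives as a string.

```python
import operator as op
import re

# Ordering and equality operators from the CRD enum.
COMPARATORS = {
    "gt": op.gt, "ge": op.ge, "eq": op.eq,
    "ne": op.ne, "lt": op.lt, "le": op.le,
}


def parse_duration(text):
    """Convert strings such as '90d' to seconds (assumed suffixes: s, m, h, d)."""
    units = {"s": 1, "m": 60, "h": 3600, "d": 86400}
    return int(text[:-1]) * units[text[-1]]


def version_tuple(text):
    """Compare versions numerically, so that '5.15' sorts after '5.4'."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


# How each trait's raw string is normalized before comparison (assumption).
NORMALIZERS = {
    "uptime": parse_duration,
    "kernel": version_tuple,
    "lsbrelease": version_tuple,
    "instance_type": str,
}


def node_satisfies(trait, node_value, operator_name, target):
    """Return True if the node already meets the desired condition."""
    normalize = NORMALIZERS[trait]
    value = normalize(node_value)
    if operator_name in ("in", "notin"):
        # For set operators, target is a comma-separated list.
        targets = {normalize(item.strip()) for item in target.split(",")}
        return (value in targets) == (operator_name == "in")
    return COMPARATORS[operator_name](value, normalize(target))
```

Under these assumptions, a node with 30 days of uptime satisfies the example condition (uptime lt 90d), while one with 120 days of uptime does not.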