Consolidating Milvus Across AZs

I migrated a Milvus (standalone) deployment from a three-AZ Kubernetes setup (us-east-1a/b/c) to a single dedicated node in us-east-1b. I preserved all vector data, reduced infra to one m6a.xlarge node, consolidated all PVCs to 1b, and restored four collections with full integrity while untangling a handful of AWS, Kubernetes, Helm, and etcd knots.

This post documents the end-to-end path: decisions, traps, exact commands, and final checks. No fluff.


Context

  • Cluster: cm.montai.k8s.local (kops)
  • Namespace: milvus
  • Milvus: standalone, deployed via Helm (`zilliztech/milvus`)
  • Initial pain: PVCs spanned three AZs, forcing nodes in all three to satisfy volume affinity. Standalone Milvus didn’t need multi-AZ.

Goal: One dedicated nodegroup with taints in us-east-1b, with all PVCs in 1b, zero data loss.


Strategy in One Page

  1. Create a dedicated instance group with taints, pinned to us-east-1b.
  2. Add nodeSelectors/tolerations for Milvus, etcd, MinIO via Helm values.
  3. Snapshot the PVCs; restore into 1b (EBS volumes can’t cross AZs; snapshots can).
  4. Repair etcd membership from snapshot using ETCD_FORCE_NEW_CLUSTER=true.
  5. Bring up MinIO (object storage with vector data), then Milvus.
  6. Validate collections and segments; clean up.

Design calls:

  • Snapshots over cloning to cross AZ boundaries.
  • Preserve MinIO (data), rebuild etcd (metadata) from snapshot with FORCE_NEW_CLUSTER.
  • Single AZ for simplicity and cost (dev/test trade-off accepted).

What Went Wrong (and How I Fixed It)

1) AZ mismatch blocking scheduling

  • Symptom: volume node affinity conflict.
  • Cause: Node still in 1a; PVCs bound to 1b.
  • Fix: Move IG to 1b, apply, delete old node(s) to force recreation in 1b.

2) StatefulSet PVCs stuck in old AZs

  • Reality: PVC zone affinity is immutable.
  • Fix: VolumeSnapshot → delete PVC → recreate PVC from snapshot; let CSI bind in 1b.

3) Missing IAM for snapshot restores

  • Symptom: UnauthorizedOperation on ec2:CreateVolume from snapshot.

Fix: Add this statement to the EBS CSI controller role's policy:

{ "Effect": "Allow", "Action": "ec2:CreateVolume", "Resource": "arn:aws:ec2:*:*:snapshot/*" }

Restart EBS CSI controller.

4) etcd membership deadlock after restore

  • Symptom: CrashLoop, “No active endpoints in cluster”.
  • Cause: Restored data contained old member IPs.
  • Fix (disaster recovery):
    • Restore only the etcd-0 PVC from snapshot.
    • Start a single replica with ETCD_FORCE_NEW_CLUSTER=true so it forms a new one-member cluster from the restored data.
    • Once etcd-0 is Ready, scale to 3; etcd-1/2 join fresh.

Start the single replica with:

kubectl set env statefulset/milvus-release-etcd \
  ETCD_FORCE_NEW_CLUSTER=true ETCD_INITIAL_CLUSTER_STATE=new -n milvus
kubectl scale statefulset milvus-release-etcd -n milvus --replicas=1

5) “Missing collections” scare

  • Reality: Milvus stores metadata in etcd and vectors in MinIO.
  • Fix: Once etcd metadata was restored from snapshot, Milvus mapped names→IDs and loaded segments. Data intact.

6) PVC selector immutability

  • Lesson: Don’t try to patch PVC zone/selector. Use snapshot→recreate. With WaitForFirstConsumer, node placement determines AZ.

Step-By-Step Execution

Phase 1 - Prep

Dedicated nodegroup (1b, tainted):

kops edit ig milvus --state s3://kops-cm-montai-com-state-store
# Set subnets: [us-east-1b], taint: milvus.io/node=cpu:NoSchedule
kops update cluster cm.montai.k8s.local --state s3://... --yes
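
For reference, the fields touched in the `kops edit ig` session can be sketched as below. This is a hedged illustration, not a dump of the real InstanceGroup: `minSize`/`maxSize` of 1 are assumptions for a single-node group, and the subnet is assumed to be named after its zone (kops subnet names come from the cluster spec and may differ).

```shell
# Sketch: print the InstanceGroup fields relevant to this migration.
# render_ig_spec is a hypothetical helper; it only prints YAML and touches nothing.
render_ig_spec() {
  local subnet="$1" taint="$2"
  cat <<EOF
spec:
  machineType: m6a.xlarge
  minSize: 1        # assumption: single-node group
  maxSize: 1        # assumption: single-node group
  subnets:
    - ${subnet}
  taints:
    - ${taint}
EOF
}

render_ig_spec us-east-1b "milvus.io/node=cpu:NoSchedule"
```

Comparing this output against what `kops get ig milvus -o yaml` reports is a quick way to confirm the group is pinned to one subnet before applying.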

Helm values with selectors/tolerations (Milvus/etcd/MinIO):

# /tmp/milvus-migration-values.yaml
standalone:
  nodeSelector:
    kops.k8s.io/instancegroup: milvus
  tolerations:
    - key: milvus.io/node
      operator: Equal
      value: cpu
      effect: NoSchedule
etcd:
  nodeSelector:
    kops.k8s.io/instancegroup: milvus
  tolerations:
    - key: milvus.io/node
      operator: Equal
      value: cpu
      effect: NoSchedule
  replicaCount: 3
minio:
  nodeSelector:
    kops.k8s.io/instancegroup: milvus
  tolerations:
    - key: milvus.io/node
      operator: Equal
      value: cpu
      effect: NoSchedule
  replicaCount: 4
  persistence:
    size: 500Gi

Create VolumeSnapshots:

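# File names are placeholders: one VolumeSnapshotClass (driver ebs.csi.aws.com)
# and one VolumeSnapshot per PVC (etcd-0, etcd-2, minio-0, minio-1).
kubectl apply -f volumesnapshotclass.yaml
kubectl apply -f volumesnapshots.yaml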
kubectl wait volumesnapshot/<name> -n milvus --for=jsonpath='{.status.readyToUse}'=true --timeout=300s
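
The snapshot manifests themselves are not shown in this post. A minimal sketch of what they could look like follows, using the snapshot and PVC names that appear elsewhere in the post. The class name `ebs-snapclass` and the `Delete` deletion policy are assumptions, as are the two helper functions, which only print YAML.

```shell
# Sketch: render a VolumeSnapshotClass and per-PVC VolumeSnapshots.
# "ebs-snapclass" is a hypothetical name; deletionPolicy is an assumption
# (Delete removes the EBS snapshot when the VolumeSnapshot object is deleted).
render_snapshot_class() {
  cat <<'EOF'
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapclass
driver: ebs.csi.aws.com
deletionPolicy: Delete
EOF
}

# Args: <snapshot name> <source PVC name>
render_snapshot() {
  local snap="$1" pvc="$2"
  cat <<EOF
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${snap}
  namespace: milvus
spec:
  volumeSnapshotClassName: ebs-snapclass
  source:
    persistentVolumeClaimName: ${pvc}
EOF
}

# Example (prints only; pipe to `kubectl apply -f -` to create the objects):
render_snapshot_class
render_snapshot snapshot-etcd-0 data-milvus-release-etcd-0
render_snapshot snapshot-minio-0 export-milvus-release-minio-0
```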

Phase 2 - IAM

  • Add ec2:CreateVolume on arn:aws:ec2:*:*:snapshot/*; restart EBS CSI controller.

Phase 3 - Scale down & delete old PVCs

kubectl scale sts milvus-release-etcd -n milvus --replicas=0
kubectl scale sts milvus-release-minio -n milvus --replicas=0
kubectl delete deploy milvus-release-standalone -n milvus
kubectl delete pvc data-milvus-release-etcd-{0,2} export-milvus-release-minio-{0,1} -n milvus

Phase 4 - Restore PVCs into 1b

# Recreate PVCs from snapshots (no selector); CSI will place in 1b
kubectl apply -f restored-PVCs.yaml
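
The contents of restored-PVCs.yaml are not reproduced here. As a hedged sketch, each restored claim needs the original PVC name (so the StatefulSet picks it up), a `dataSource` pointing at the VolumeSnapshot, and no selector. The helper name, the storage class argument, and the etcd size in the example are assumptions; the requested size must be at least the size of the snapshotted volume, and the storage class should use WaitForFirstConsumer so the volume is created in the zone of the scheduled pod.

```shell
# Sketch: render a PVC restored from a VolumeSnapshot.
# Args: <pvc name> <snapshot name> <size> <storage class>
# render_restored_pvc is a hypothetical helper; it only prints YAML.
render_restored_pvc() {
  local pvc="$1" snap="$2" size="$3" sc="$4"
  cat <<EOF
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${pvc}
  namespace: milvus
spec:
  storageClassName: ${sc}
  accessModes:
    - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: ${snap}
  resources:
    requests:
      storage: ${size}
EOF
}

# Example: "10Gi" and "my-ebs-sc" are placeholders, not values from this cluster.
render_restored_pvc data-milvus-release-etcd-0 snapshot-etcd-0 10Gi my-ebs-sc
render_restored_pvc export-milvus-release-minio-0 snapshot-minio-0 500Gi my-ebs-sc
```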

Phase 5 - Lock IG to 1b & replace nodes

kops edit ig milvus  # ensure only us-east-1b
kops update cluster --yes
kubectl delete node <nodes in 1a/1c>
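
To find which nodes to delete, a small filter over `kubectl get nodes` output may help. This is a sketch: `nodes_outside_zone` is a hypothetical helper that expects whitespace-separated "name zone" lines on stdin and prints the names whose zone differs from the target.

```shell
# Sketch: print node names whose zone is not the target zone.
# Input: lines of "<node-name> <zone>" on stdin.
nodes_outside_zone() {
  local target="$1"
  awk -v z="$target" 'NF >= 2 && $2 != z { print $1 }'
}

# Usage against the cluster (not run here):
# kubectl get nodes -l kops.k8s.io/instancegroup=milvus --no-headers \
#   -o custom-columns='NAME:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone' \
#   | nodes_outside_zone us-east-1b
```

Reviewing the printed names before passing them to `kubectl delete node` keeps the step manual and reversible up to the last moment.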

Phase 6 - Bring up MinIO

kubectl scale sts milvus-release-minio -n milvus --replicas=4
kubectl wait -n milvus -l app.kubernetes.io/name=minio --for=condition=Ready pod --timeout=300s

Phase 7 - etcd recovery

kubectl delete pvc data-milvus-release-etcd-1 -n milvus
kubectl set env sts/milvus-release-etcd ETCD_FORCE_NEW_CLUSTER=true ETCD_INITIAL_CLUSTER_STATE=new -n milvus
kubectl scale sts milvus-release-etcd -n milvus --replicas=1
kubectl wait pod/milvus-release-etcd-0 -n milvus --for=condition=Ready --timeout=120s
kubectl delete pvc data-milvus-release-etcd-2 -n milvus
kubectl scale sts milvus-release-etcd -n milvus --replicas=3
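
After scaling back to 3, it is worth confirming that all members actually joined. The sketch below counts members in the `started` state from the default (comma-separated) output of `etcdctl member list`. `count_started_members` is a hypothetical helper, and the usage line assumes etcdctl is available in the etcd container without auth.

```shell
# Sketch: count etcd members reported as "started".
# Input: `etcdctl member list` default output, one member per line, e.g.
#   8e9e05c52164694d, started, name, http://peer:2380, http://client:2379, false
count_started_members() {
  awk -F', ' '$2 == "started" { n++ } END { print n + 0 }'
}

# Usage against the cluster (not run here):
# kubectl exec milvus-release-etcd-0 -n milvus -- etcdctl member list | count_started_members
#
# The etcd documentation treats force-new-cluster as a one-time recovery flag.
# If you want to drop it once the cluster is healthy, kubectl removes an env var
# with a trailing dash:
# kubectl set env sts/milvus-release-etcd ETCD_FORCE_NEW_CLUSTER- -n milvus
```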

Phase 8 - Start Milvus

helm upgrade milvus-release zilliztech/milvus -n milvus --reuse-values -f /tmp/milvus-migration-values.yaml
kubectl wait -n milvus -l app.kubernetes.io/name=milvus --for=condition=Ready pod --timeout=300s

Phase 9 - Cleanup extra nodes

  • Verify all pods on the single node; cordon/drain/delete any stragglers.

Validation Checklist

Infra:

kubectl get nodes -l kops.k8s.io/instancegroup=milvus -o custom-columns='NODE:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone,STATUS:.status.conditions[-1].type'
kubectl get pods -n milvus -o wide
for pvc in $(kubectl get pvc -n milvus -o jsonpath='{.items[*].metadata.name}'); do
  vol=$(kubectl get pvc $pvc -n milvus -o jsonpath='{.spec.volumeName}')
  zone=$(kubectl get pv $vol -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]}')
  echo "$pvc | $zone"
done
# Expect all zones = us-east-1b
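
If you would rather have the check fail loudly than eyeball the list, the loop's output can be piped through a small assertion. A sketch, assuming the "pvc | zone" line format printed above; `check_zones` is a hypothetical helper.

```shell
# Sketch: fail (exit status 1) if any "pvc | zone" line is outside the target zone.
check_zones() {
  local target="$1" bad
  bad=$(awk -F' [|] ' -v z="$target" '$2 != z { print $1 }')
  if [ -n "$bad" ]; then
    echo "PVCs not in $target:"
    echo "$bad"
    return 1
  fi
  echo "all PVCs in $target"
}

# Usage: append `| check_zones us-east-1b` after the `done` of the loop above.
```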

Milvus health & data:

kubectl get pods -n milvus
kubectl logs -l app.kubernetes.io/name=milvus -n milvus --tail=200 | grep -i "Auditor loaded segment metadata"
kubectl port-forward -n milvus svc/milvus-release 19530:19530 &
python - <<'EOF'
from pymilvus import connections, utility
connections.connect(host="localhost", port="19530")
print(utility.list_collections())
EOF
pkill -f "port-forward.*milvus"

MinIO contents (sanity):

kubectl exec -it milvus-release-minio-0 -n milvus -- ls /export/milvus-bucket/file/index_files/

Lessons You Can Reuse

Kubernetes

  • Snapshot first. It’s the only sane way to cross AZs with EBS.
  • With CSI WaitForFirstConsumer, node placement → AZ. Don’t fight PVC immutability.
  • Use values files for Helm upgrades; avoid subchart auth traps.

AWS

  • EBS volumes don’t cross AZs; snapshots do (within region).
  • IAM for CSI is granular: creating a volume from a snapshot needs explicit rights.

Milvus and etcd

  • In Milvus: MinIO = data, etcd = metadata. Protect MinIO PVCs; snapshot etcd.
  • etcd DR: Start one restored member with ETCD_FORCE_NEW_CLUSTER=true, then scale.

Final State

  • Single node (m6a.xlarge) in us-east-1b, tainted and isolated.
  • All PVCs consolidated in 1b (2,070 Gi total).
  • Services: Milvus standalone, etcd (3), MinIO (4).
  • Data: Four collections, 12 segments, full integrity.

Optional Cleanup & Monitoring

Watch stability and resources:

kubectl top node
kubectl top pods -n milvus
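# --tail=-1 is needed: with a label selector, kubectl logs shows only the last 10 lines per pod by default
kubectl logs -l app.kubernetes.io/name=milvus -n milvus --since=24h --tail=-1 | grep -i error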

Remove snapshots if policy allows:

kubectl delete volumesnapshot -n milvus snapshot-etcd-{0,2} snapshot-minio-{0,1}

Appendix - Minimal IAM Addition for Snapshot Restore

{
	"Effect": "Allow",
	"Action": "ec2:CreateVolume",
	"Resource": "arn:aws:ec2:*:*:snapshot/*"
}

Add to your EBS CSI controller role, then restart the controller.
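
One way to apply it, sketched under assumptions: the statement is wrapped in a complete policy document and attached as an inline policy. The helper name, the policy name, and the role placeholder are hypothetical, and the restart command assumes the controller runs as the `ebs-csi-controller` Deployment in `kube-system`, which may not match your install.

```shell
# Sketch: wrap the statement above in a full IAM policy document.
# render_snapshot_restore_policy is a hypothetical helper; it only prints JSON.
render_snapshot_restore_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:CreateVolume",
      "Resource": "arn:aws:ec2:*:*:snapshot/*"
    }
  ]
}
EOF
}

# Usage (not run here; <ebs-csi-controller-role> is a placeholder):
# render_snapshot_restore_policy > /tmp/snapshot-restore-policy.json
# aws iam put-role-policy --role-name <ebs-csi-controller-role> \
#   --policy-name snapshot-restore \
#   --policy-document file:///tmp/snapshot-restore-policy.json
# kubectl rollout restart deployment/ebs-csi-controller -n kube-system
```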


Outcome: single-AZ Milvus, clean scheduling, lower cost, no data loss, reproducible steps.
