Lab: Scheduling the Voting App for Production
Objectives
By the end of this lab, you will be able to:
- Label cluster nodes for scheduling experiments and verify label application
- Apply node affinity to place postgres on SSD-labeled nodes for optimal I/O performance
- Use pod anti-affinity to spread vote replicas across different nodes for high availability
- Configure taints and tolerations for dedicated workload placement
- Verify scheduling decisions using kubectl and understand how to debug scheduling failures
Prerequisites
Before starting this lab, ensure you have:
- Completed Module 0: Introduction and Getting Started
- A running KIND cluster with 2 worker nodes (standard 3-node cluster from Module 0)
- The Example Voting App deployed and functional from Module 0
- kubectl CLI configured to communicate with your cluster
- Basic understanding of Kubernetes Deployments and Services
This lab uses the standard 3-node KIND cluster (1 control-plane + 2 workers) from Module 0. We'll demonstrate scheduling concepts across these 2 worker nodes.
Setup
This lab builds on the Voting App deployment from Module 0. You do NOT need to create a new cluster or redeploy the application. We'll enhance the existing deployment with scheduling rules.
Step 1: Verify cluster status
kubectl get nodes
Expected output:
NAME STATUS ROLES AGE VERSION
voting-app-control-plane Ready control-plane 1d v1.32.0
voting-app-worker Ready <none> 1d v1.32.0
voting-app-worker2 Ready <none> 1d v1.32.0
You should see 1 control-plane node and 2 worker nodes, all in Ready status.
Step 2: Verify Voting App is running
kubectl get pods -o wide
Expected: All Voting App pods (vote, result, worker, redis, postgres) should be Running. Note which nodes they're currently on - they're likely distributed randomly.
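If you want a compact pod-to-node view, a custom-columns query is an optional alternative to -o wide (column names here are arbitrary):
# Print just each pod's name and the node it landed on
kubectl get pods -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName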
Step 3: Verify Voting App functionality
kubectl port-forward svc/vote 8080:80 &
sleep 2
curl -s -o /dev/null -w "%{http_code}" http://localhost:8080
pkill -f "port-forward svc/vote"
Expected output: 200 (Voting App is accessible)
Tasks
Task 1: Label Nodes for Scheduling
We'll simulate a production cluster where nodes have different hardware characteristics. In this case, we'll label one worker node as having SSD storage and the other as having HDD storage.
Step 1: Label voting-app-worker as an SSD node
kubectl label nodes voting-app-worker disktype=ssd
Expected output:
node/voting-app-worker labeled
Step 2: Label voting-app-worker2 as an HDD node
kubectl label nodes voting-app-worker2 disktype=hdd
Step 3: Verify labels were applied
kubectl get nodes -L disktype
Expected output:
NAME STATUS ROLES AGE VERSION DISKTYPE
voting-app-control-plane Ready control-plane 1d v1.32.0
voting-app-worker Ready <none> 1d v1.32.0 ssd
voting-app-worker2 Ready <none> 1d v1.32.0 hdd
The -L disktype flag shows the disktype label value for each node. We're simulating SSD vs HDD nodes. In production, these labels would represent real hardware differences.
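You can also filter nodes by label selector, which is an optional way to double-check the labels:
# Show only nodes that carry a disktype label (any value)
kubectl get nodes -l disktype
# Show only the node labeled as SSD
kubectl get nodes -l disktype=ssd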
Task 2: Node Affinity for Postgres
Databases perform better on fast storage. Let's ensure postgres always runs on the SSD node using node affinity.
Step 1: Create postgres deployment with node affinity
Create a file named postgres-affinity.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: voting-app
    tier: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
        tier: backend
    spec:
      affinity:
        nodeAffinity:
          # REQUIRED: Must schedule on nodes with SSD
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd
      containers:
      - name: postgres
        image: postgres:15-alpine
        env:
        - name: POSTGRES_USER
          value: "postgres"
        - name: POSTGRES_PASSWORD
          value: "postgres"
        ports:
        - containerPort: 5432
          name: postgres
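As an aside (don't add this to the file for this lab): for a simple exact-match rule like disktype=ssd, a plain nodeSelector in the pod spec expresses the same hard requirement with less YAML:
    spec:
      # Equivalent hard requirement; supports exact key/value matches only
      nodeSelector:
        disktype: ssd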
Step 2: Apply the updated deployment
kubectl apply -f postgres-affinity.yaml
Expected output:
deployment.apps/postgres configured
Step 3: Force postgres to reschedule
Since scheduling happens once at pod creation, we need to delete the existing postgres pod to trigger rescheduling:
kubectl delete pod -l app=postgres
Wait a few seconds for the new pod to start:
kubectl wait --for=condition=ready pod -l app=postgres --timeout=60s
Step 4: Verify postgres is on the SSD node
kubectl get pod -l app=postgres -o wide
Expected output shows the pod on voting-app-worker (our SSD node):
NAME READY STATUS RESTARTS AGE NODE
postgres-xxxxx-xxxxx 1/1 Running 0 30s voting-app-worker
You can also use jsonpath to extract just the node name:
kubectl get pod -l app=postgres -o jsonpath='{.items[0].spec.nodeName}'
Expected: voting-app-worker
This demonstrates required node affinity. The postgres pod MUST schedule on a node with disktype=ssd. If no such node existed, the pod would remain Pending until one appeared.
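If you wanted a soft preference instead of a hard requirement, the nodeAffinity block could use the preferred form. A sketch (not needed for this lab; the weight value is arbitrary):
      affinity:
        nodeAffinity:
          # Soft preference: favor SSD nodes, but allow scheduling elsewhere if none fit
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            preference:
              matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd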
Task 3: Pod Anti-Affinity for Vote High Availability
Right now, if all vote replicas land on the same node and that node fails, your entire voting frontend goes down. Let's spread vote replicas across different nodes using pod anti-affinity.
Step 1: Scale vote to 2 replicas
First, let's scale vote to 2 replicas to demonstrate anti-affinity across our 2 worker nodes:
kubectl scale deployment vote --replicas=2
Step 2: Check current vote pod distribution
kubectl get pods -l app=vote -o wide
You'll likely see vote pods distributed randomly, possibly both on the same node.
Step 3: Create vote deployment with pod anti-affinity
Create a file named vote-antiaffinity.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vote
  labels:
    app: voting-app
    tier: frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: vote
  template:
    metadata:
      labels:
        app: vote
        tier: frontend
    spec:
      affinity:
        podAntiAffinity:
          # PREFERRED: Try to spread replicas across different nodes
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - vote
              # Spread across different nodes (topology key = hostname)
              topologyKey: kubernetes.io/hostname
      containers:
      - name: vote
        image: dockersamples/examplevotingapp_vote
        ports:
        - containerPort: 80
          name: http
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
Step 4: Apply the updated deployment
kubectl apply -f vote-antiaffinity.yaml
Step 5: Force vote pods to reschedule
kubectl delete pods -l app=vote
Wait for new pods to start:
kubectl wait --for=condition=ready pod -l app=vote --timeout=60s
Step 6: Verify vote pods are spread across nodes
kubectl get pods -l app=vote -o wide
Expected output shows each vote pod on a different node:
NAME READY STATUS RESTARTS AGE NODE
vote-xxxxx-aaaaa 1/1 Running 0 20s voting-app-worker
vote-xxxxx-bbbbb 1/1 Running 0 20s voting-app-worker2
Now if one node goes down, 1/2 of your vote capacity survives. This is high availability through pod anti-affinity.
Note that we used preferred anti-affinity, not required. With 2 replicas and 2 worker nodes, each replica gets its own node. With required anti-affinity and more replicas than nodes, the extra replicas would stay Pending.
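A related mechanism for spreading replicas is topologySpreadConstraints. A sketch of what that would look like for the same app=vote pods (optional, not used in this lab):
      # Alternative to podAntiAffinity: keep the per-node pod count within maxSkew
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: vote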
Task 4: Taints and Tolerations
Taints repel pods from nodes. Let's taint voting-app-worker to see how it affects scheduling, then add a toleration to postgres so it can still run there.
Step 1: Taint voting-app-worker
kubectl taint nodes voting-app-worker dedicated=database:NoSchedule
Expected output:
node/voting-app-worker tainted
This taint means "no new pods can schedule on voting-app-worker unless they tolerate the taint."
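To confirm exactly what was set, you can read the taint back from the node spec (optional):
kubectl get node voting-app-worker -o jsonpath='{.spec.taints}'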
Step 2: Try to scale vote replicas to 3
kubectl scale deployment vote --replicas=3
Wait a moment, then check pod distribution:
kubectl get pods -l app=vote -o wide
Notice the new (3rd) vote pod does NOT schedule on voting-app-worker (the tainted node). It goes to voting-app-worker2 instead. The taint is working - it repels pods that don't have a matching toleration.
Step 3: Check if postgres is affected
kubectl get pod -l app=postgres -o wide
Postgres is still running on voting-app-worker because taints with the NoSchedule effect don't evict existing pods. They only prevent new pods from scheduling.
Step 4: Delete postgres to see the taint in action
kubectl delete pod -l app=postgres
Wait a moment and check:
kubectl get pod -l app=postgres -o wide
What happened? The postgres pod is Pending! Check the events:
kubectl describe pod -l app=postgres | grep -A 5 Events
You'll see events like:
Events:
Warning FailedScheduling 1s default-scheduler 0/3 nodes are available: 1 node(s) had untolerated taint {dedicated: database}, 2 node(s) didn't match Pod's node affinity/selector
The scheduler can't place postgres because:
- The node affinity requires disktype=ssd (only voting-app-worker has this)
- But voting-app-worker is tainted, and postgres doesn't have a toleration
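A quick way to pull up recent scheduling failures across the namespace, instead of describing one pod at a time (optional):
# List FailedScheduling events, newest last
kubectl get events --field-selector reason=FailedScheduling --sort-by=.lastTimestamp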
Step 5: Add toleration to postgres
Update postgres-affinity.yaml to include a toleration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: voting-app
    tier: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
        tier: backend
    spec:
      # Add toleration for the database taint
      tolerations:
      - key: dedicated
        operator: Equal
        value: database
        effect: NoSchedule
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd
      containers:
      - name: postgres
        image: postgres:15-alpine
        env:
        - name: POSTGRES_USER
          value: "postgres"
        - name: POSTGRES_PASSWORD
          value: "postgres"
        ports:
        - containerPort: 5432
          name: postgres
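As an aside (don't add this to the file): if you only cared about the taint key and not its value, an operator: Exists toleration would be a broader alternative:
      tolerations:
      - key: dedicated
        operator: Exists   # matches any value for the "dedicated" key
        effect: NoSchedule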
Step 6: Apply the updated deployment
kubectl apply -f postgres-affinity.yaml
The Pending postgres pod should now schedule successfully:
kubectl get pod -l app=postgres -o wide
Expected: Postgres is back on voting-app-worker, combining toleration (permission) and affinity (attraction).
Step 7: Verify vote pods still avoid the tainted node
kubectl get pods -l app=vote -o wide
Two vote pods are on voting-app-worker2, and the vote pod that was scheduled on voting-app-worker before the taint is still there (NoSchedule does not evict existing pods). No new vote pods land on voting-app-worker because they lack the toleration.
Key Learning: Taints REPEL pods. Tolerations grant permission but don't attract. You need affinity to attract pods to specific nodes.
Step 8: Ensure vote stays at 3 replicas for the next task
kubectl scale deployment vote --replicas=3
(If vote is already at 3 replicas, this command is a no-op.)
Task 5: Combined Strategy - Dedicated Database Node
Let's put it all together. We want voting-app-worker to be a dedicated database node: fast storage (SSD), isolated from general workloads (tainted), with postgres preferring to run there.
We already have:
- Label: disktype=ssd on voting-app-worker
- Taint: dedicated=database:NoSchedule on voting-app-worker
- Postgres affinity: requires disktype=ssd
- Postgres toleration: tolerates dedicated=database
Step 1: Verify the complete configuration
kubectl describe node voting-app-worker | grep -A 5 Labels
kubectl describe node voting-app-worker | grep Taints
Expected:
- Labels include disktype=ssd
- Taints include dedicated=database:NoSchedule
Step 2: Verify postgres placement
kubectl get pod -l app=postgres -o wide
Expected: Postgres is on voting-app-worker (the dedicated database node)
Step 3: Verify other pods avoid voting-app-worker
kubectl get pods -o wide | grep -v postgres | grep "voting-app-worker[^2]"
Expected: Only the vote pod that existed before the taint (no new non-database pods have landed on voting-app-worker)
Step 4: Check overall pod distribution
kubectl get pods -o wide
You should see:
- Postgres: voting-app-worker (dedicated database node)
- Vote (3 replicas): 1 on voting-app-worker (pre-taint), 2 on voting-app-worker2
- Result, worker, redis: voting-app-worker2
This is a realistic production pattern: dedicated nodes for stateful workloads, spread replicas for HA, keep workloads isolated.
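To eyeball this layout in one pass, sorting the pod list by node can help (optional):
# Group the output by node name
kubectl get pods -o wide --sort-by=.spec.nodeName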
Challenge: Scheduling Failure Debugging
Let's intentionally create a scheduling failure and learn how to debug it.
Step 1: Create a deployment with invalid node affinity
Create a file named broken-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: broken-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: broken-app
  template:
    metadata:
      labels:
        app: broken-app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disktype
                operator: In
                values:
                - nvme   # This label doesn't exist!
      containers:
      - name: nginx
        image: nginx:1.25
Step 2: Apply the broken deployment
kubectl apply -f broken-deployment.yaml
Step 3: Check pod status
kubectl get pods -l app=broken-app
Expected output:
NAME READY STATUS RESTARTS AGE
broken-app-xxxxx-xxxxx 0/1 Pending 0 30s
The pod is stuck in Pending state.
Step 4: Debug the scheduling failure
kubectl describe pod -l app=broken-app
Look at the Events section at the bottom:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 20s default-scheduler 0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector
The error message tells you: "didn't match Pod's node affinity." This means no nodes have the required label.
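To find every Pending pod in one query instead of checking deployments one by one (optional):
kubectl get pods --field-selector status.phase=Pending --all-namespaces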
Step 5: Verify the problem
kubectl get nodes -L disktype
You'll see nodes have disktype=ssd or disktype=hdd, but none have disktype=nvme.
Step 6: Fix the issue (Option 1: Add the label)
kubectl label nodes voting-app-worker2 disktype=nvme --overwrite
(--overwrite is required because voting-app-worker2 already carries disktype=hdd.)
kubectl get pods -l app=broken-app -w
The pod should schedule to voting-app-worker2 within seconds.
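Alternatively (an optional "Option 2", not required for the lab), you could fix the manifest instead of the node:
# Option 2: edit broken-deployment.yaml, change the affinity value "nvme"
# to "ssd" (a label that exists), then re-apply the deployment
kubectl apply -f broken-deployment.yaml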
Step 7: Clean up
kubectl delete deployment broken-app
kubectl label nodes voting-app-worker2 disktype=hdd --overwrite
Key Learning: Always verify node labels exist BEFORE writing affinity rules. Use kubectl get nodes --show-labels to check. Pod stuck in Pending + "didn't match affinity" = label mismatch.
Verification
Confirm your scheduling configuration is working correctly:
1. Check postgres placement on SSD node
kubectl get pod -l app=postgres -o jsonpath='{.items[0].spec.nodeName}'
Expected output: voting-app-worker
2. Verify vote replicas are spread across nodes
kubectl get pods -l app=vote -o wide | awk '{print $7}' | tail -n +2 | sort | uniq -c
Expected: At most 2 vote pods per node (1 on voting-app-worker from before the taint, 2 on voting-app-worker2)
3. Confirm no new non-database pods landed on the tainted node
kubectl get pods -o wide | grep -v postgres | grep "voting-app-worker[^2]"
Expected: Only the vote pod that was scheduled before the taint (no other non-database pods on voting-app-worker)
4. Test Voting App functionality
kubectl port-forward svc/vote 8080:80 &
sleep 2
curl -s -o /dev/null -w "%{http_code}" http://localhost:8080
pkill -f "port-forward svc/vote"
Expected output: 200 (Voting App still works after all scheduling changes)
Cleanup
Do NOT clean up the resources from this lab. Module 2 (Autoscaling) builds on the current state.
However, we should remove the taint from voting-app-worker to have a cleaner starting point for Module 2:
kubectl taint nodes voting-app-worker dedicated=database:NoSchedule-
The - at the end removes the taint.
Verify the taint is removed:
kubectl describe node voting-app-worker | grep Taints
Expected output: Taints: <none>
Keep all labels and affinity rules. They don't hurt and demonstrate the evolution of our Voting App configuration.
Troubleshooting
Issue: Pod Stuck in Pending with "didn't match Pod's node affinity"
Symptom: Pod shows Pending status indefinitely. kubectl describe pod shows event: "didn't match Pod's node affinity/selector"
Cause: The node affinity rule references a label that doesn't exist on any nodes in the cluster.
Solution:
# Check what labels exist on nodes
kubectl get nodes --show-labels
# Check which label your pod is looking for
kubectl get pod <pod-name> -o yaml | grep -A 10 affinity
# Either add the label to a node
kubectl label nodes <node-name> <key>=<value>
# Or update the pod's affinity rule to match existing labels
# Edit the deployment and apply the changes
Issue: Pod with Toleration Schedules to Untainted Node
Symptom: You added a toleration to a pod and expected it to schedule on the tainted node, but it went to a different node instead.
Cause: Toleration grants PERMISSION to schedule on tainted nodes, but doesn't ATTRACT pods there. You need node affinity to attract.
Solution:
# Combine toleration (permission) with node affinity (attraction)
# Update your deployment to include both:
# 1. Toleration for the taint
# 2. Node affinity targeting a label on the tainted node
See Task 5 for a complete example of combining taints, tolerations, and affinity.
Issue: Voting App Breaks After Rescheduling
Symptom: After deleting pods to trigger rescheduling, the Voting App stops working. Vote submissions don't appear in results.
Cause: Network connectivity issues between components, or postgres data loss if the pod was deleted without persistent storage.
Solution:
# Verify all pods are Running
kubectl get pods
# Check pod logs for errors
kubectl logs -l app=worker
kubectl logs -l app=redis
kubectl logs -l app=postgres
# Test connectivity between components
kubectl exec -it deploy/worker -- ping redis
kubectl exec -it deploy/worker -- ping postgres
# Restart all pods if needed
kubectl rollout restart deployment vote result worker redis postgres
Issue: Taint Doesn't Seem to Work
Symptom: You tainted a node but pods still schedule there.
Cause: Existing pods are not affected by taints with NoSchedule effect. Only new pods respect the taint.
Solution:
# Verify the taint is applied
kubectl describe node <node-name> | grep Taints
# Delete existing pods to force rescheduling
kubectl delete pods <pod-name>
# For immediate eviction, use NoExecute effect instead
kubectl taint nodes <node-name> key=value:NoExecute
# Warning: This evicts running pods immediately
Key Takeaways
- Node labels are prerequisites for affinity rules - always verify labels exist before writing affinity rules, using kubectl get nodes --show-labels
- Required vs preferred affinity - required rules block scheduling if unmet, preferred rules add weight to scoring but allow exceptions
- Pod anti-affinity spreads replicas - use topologyKey: kubernetes.io/hostname to spread pods across nodes for high availability
- Taints repel, affinity attracts - taints push pods away, tolerations grant permission, affinity pulls pods toward nodes
- Production patterns combine all three - dedicated nodes need taints (isolation), tolerations (permission), and affinity (attraction)