Scheduling¶
The Machineuse scheduler places instances on worker nodes using a weighted score of each node's available resources.
Overview¶
The scheduler runs on the control plane and makes placement decisions based on:
- Node resource availability
- User preferences (affinity)
- Load balancing goals
- Historical performance
Scheduling Algorithm¶
Weighted Score Calculation¶
Each node receives a score based on available resources:
Score = Σ (weight_i × normalized_resource_i)
Where:
- weight_i = configured weight for resource type
- normalized_resource_i = available / total (0-1)
Default Weights¶
| Resource | Weight | Rationale |
|---|---|---|
| Memory | 0.40 | Browser instances are memory-intensive |
| CPU | 0.30 | Rendering and JavaScript execution |
| Disk | 0.20 | Snapshots and temporary files |
| Network | 0.10 | Stream bandwidth |
Example Calculation¶
Node: worker-1
- CPU: 60% available → 0.60
- Memory: 40% available → 0.40
- Disk: 80% available → 0.80
- Network: 90% available → 0.90
Score = (0.30 × 0.60) + (0.40 × 0.40) + (0.20 × 0.80) + (0.10 × 0.90)
= 0.18 + 0.16 + 0.16 + 0.09
= 0.59
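The calculation above can be reproduced in a few lines of Python. This is a minimal sketch for checking the arithmetic, not the scheduler's own code; the function name `weighted_score` and the dictionary layout are assumptions made for illustration.

```python
# Default weights from the table above.
DEFAULT_WEIGHTS = {"memory": 0.40, "cpu": 0.30, "disk": 0.20, "network": 0.10}


def weighted_score(available_fraction, weights=DEFAULT_WEIGHTS):
    """Sum of weight * normalized availability over all resource types."""
    return sum(weights[name] * available_fraction[name] for name in weights)


# worker-1 from the example: fraction of each resource still available.
worker_1 = {"cpu": 0.60, "memory": 0.40, "disk": 0.80, "network": 0.90}
score = weighted_score(worker_1)
print(round(score, 2))  # 0.59
```

Because every normalized value lies between 0 and 1 and the default weights sum to 1.0, the resulting score also lies between 0 and 1.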
Placement Process¶
┌──────────────┐
│ API Request  │
│ Create Inst. │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   Validate   │
│   Request    │
└──────┬───────┘
       │
       ▼
┌──────────────┐     ┌──────────────┐
│    Filter    │────►│ Insufficient │
│  Candidates  │     │  Resources   │
└──────┬───────┘     └──────────────┘
       │
       ▼
┌──────────────┐
│    Score     │
│    Nodes     │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    Select    │
│  Best Node   │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   Dispatch   │
│  to Worker   │
└──────────────┘
Step 1: Filter Candidates¶
Eliminate nodes that cannot host the instance:
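def filter_candidates(nodes, request):
    """Return the nodes that are online and have capacity for the request."""
    candidates = []
    for node in nodes:
        # Skip nodes that are not accepting work.
        if node.status != "online":
            continue
        # Skip nodes that are already at their instance limit.
        if node.instances_count >= node.max_instances:
            continue
        # Skip nodes without enough free memory or CPU for the request.
        if node.available_memory < request.memory_mb:
            continue
        if node.available_cpu < request.cpu_cores:
            continue
        candidates.append(node)
    return candidates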
Step 2: Score Candidates¶
Calculate a weighted score for each candidate:
def score_node(node, weights):
    cpu_score = node.available_cpu / node.total_cpu
    mem_score = node.available_memory / node.total_memory
    disk_score = node.available_disk / node.total_disk
    net_score = node.available_network / node.total_network
    return (
        weights['cpu'] * cpu_score +
        weights['memory'] * mem_score +
        weights['disk'] * disk_score +
        weights['network'] * net_score
    )
Step 3: Select Best¶
Choose the node with the highest score, applying any tiebreakers:
def select_best(candidates, scores):
    # Nothing to choose from: filtering removed every node.
    if not candidates:
        raise ValueError("no candidate nodes to select from")
    # Sort by score descending
    ranked = sorted(
        zip(candidates, scores),
        key=lambda x: x[1],
        reverse=True
    )
    # Tiebreaker: prefer node with fewer instances
    best_score = ranked[0][1]
    ties = [n for n, s in ranked if s == best_score]
    return min(ties, key=lambda n: n.instances_count)
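The three steps can be combined into one runnable sketch. The `Node` and `Request` dataclasses, the `place` function, and the sample values are hypothetical and exist only to make the example self-contained; the field names follow the snippets above.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical node record; field names mirror the snippets above.
    name: str
    status: str
    instances_count: int
    max_instances: int
    available_cpu: float
    total_cpu: float
    available_memory: float
    total_memory: float
    available_disk: float
    total_disk: float
    available_network: float
    total_network: float


@dataclass
class Request:
    memory_mb: float
    cpu_cores: float


WEIGHTS = {"cpu": 0.30, "memory": 0.40, "disk": 0.20, "network": 0.10}


def place(nodes, request, weights=WEIGHTS):
    """Filter, score, and select. Returns None when no node qualifies."""
    candidates = [
        n for n in nodes
        if n.status == "online"
        and n.instances_count < n.max_instances
        and n.available_memory >= request.memory_mb
        and n.available_cpu >= request.cpu_cores
    ]
    if not candidates:
        return None

    def score(n):
        return (
            weights["cpu"] * n.available_cpu / n.total_cpu
            + weights["memory"] * n.available_memory / n.total_memory
            + weights["disk"] * n.available_disk / n.total_disk
            + weights["network"] * n.available_network / n.total_network
        )

    # Highest score wins; on equal scores the node with fewer instances wins.
    return max(candidates, key=lambda n: (score(n), -n.instances_count))


nodes = [
    Node("worker-1", "online", 15, 50, 6, 8, 8192, 16384, 80, 100, 900, 1000),
    Node("worker-2", "online", 23, 50, 2, 8, 4096, 16384, 50, 100, 500, 1000),
    Node("worker-3", "offline", 0, 50, 8, 8, 16384, 16384, 100, 100, 1000, 1000),
]
chosen = place(nodes, Request(memory_mb=2048, cpu_cores=2))
print(chosen.name)  # worker-1
```

worker-3 has the most free resources but is offline, so it never reaches the scoring step. Sorting on a `(score, -instances_count)` tuple is one compact way to express the tiebreaker.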
Affinity and Anti-Affinity¶
Node Affinity¶
Request placement on specific nodes:
Strength levels:
- required: Fail if node unavailable
- preferred: Use if available, otherwise best alternative
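The two strength levels could be applied to an already-filtered candidate list roughly as follows. This is an illustrative sketch only: the function name, its parameters, and the choice of `LookupError` are assumptions, not the Machineuse API.

```python
def apply_node_affinity(candidates, target_name, strength):
    """Narrow a list of candidate node names according to node affinity.

    candidates:  names of nodes that already passed filtering
    target_name: node the request asks for (hypothetical parameter)
    strength:    "required" or "preferred"
    """
    if strength not in ("required", "preferred"):
        raise ValueError(f"unknown affinity strength: {strength!r}")
    if target_name in candidates:
        # The requested node is usable, so both strengths select it.
        return [target_name]
    if strength == "required":
        # required: fail rather than place the instance elsewhere.
        raise LookupError(f"required node {target_name!r} is unavailable")
    # preferred: fall back to every remaining candidate.
    return list(candidates)


print(apply_node_affinity(["worker-1", "worker-2"], "worker-2", "preferred"))
# ['worker-2']
```

With `preferred`, the fallback list would then go through normal scoring to find the best alternative.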
Anti-Affinity¶
Spread instances across nodes:
Instances in the same anti-affinity group are placed on different nodes when possible.
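One way to express "different nodes when possible" is a soft filter that falls back to the full candidate list. The function below is a hypothetical sketch; the name `prefer_spread` and its inputs are assumptions made for illustration.

```python
def prefer_spread(candidates, group_nodes):
    """Prefer nodes that host no member of the anti-affinity group.

    candidates:  names of nodes that passed filtering
    group_nodes: names of nodes already hosting an instance of the group
    """
    unused = [name for name in candidates if name not in group_nodes]
    # "When possible": if every candidate already hosts a group member,
    # fall back to all candidates instead of failing the placement.
    return unused or list(candidates)


print(prefer_spread(["worker-1", "worker-2", "worker-3"], {"worker-1"}))
# ['worker-2', 'worker-3']
```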
Load Balancing¶
Even Distribution¶
The scheduler balances load as a side effect of resource-based scoring: nodes with more available resources receive higher scores, so new instances tend to land on the least-loaded nodes.
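A small simulation illustrates the effect. It is deliberately simplified to a single resource (memory) on identical nodes, and the `simulate` function is a hypothetical helper rather than part of the scheduler.

```python
def simulate(node_names, total_memory_mb, instance_memory_mb, placements):
    """Place instances one by one on the node with the highest score."""
    available = {name: total_memory_mb for name in node_names}
    counts = {name: 0 for name in node_names}
    for _ in range(placements):
        # Score simplified to one resource: available / total memory.
        best = max(node_names, key=lambda n: available[n] / total_memory_mb)
        available[best] -= instance_memory_mb
        counts[best] += 1
    return counts


counts = simulate(["worker-1", "worker-2", "worker-3"], 16384, 1024, 9)
print(counts)  # {'worker-1': 3, 'worker-2': 3, 'worker-3': 3}
```

Each placement lowers the chosen node's score, so the next instance goes to a different node and the counts stay even.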
Rebalancing¶
Rebalancing is triggered when:
- Node utilization exceeds the threshold (default: 80%)
- A node goes offline
- An operator triggers it manually
# Trigger rebalancing
machineuse-cli cluster rebalance
# Rebalance specific node
machineuse-cli cluster rebalance --source worker-1
Migration¶
Move instances between nodes:
# Migrate specific instance
machineuse-cli instance migrate abc123 --target worker-2
# Migrate all from a node
machineuse-cli node drain worker-1
Configuration¶
Scheduling Weights¶
{
  "scheduling": {
    "algorithm": "weighted_score",
    "weights": {
      "cpu": 0.30,
      "memory": 0.40,
      "disk": 0.20,
      "network": 0.10
    }
  }
}
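When editing weights, it can help to check them before deploying the configuration. The sketch below assumes that weights are meant to sum to 1.0 so that scores stay between 0 and 1; the default values do, but whether the scheduler enforces this is not stated here. The `load_weights` helper is hypothetical.

```python
import json
import math

CONFIG_TEXT = """
{
  "scheduling": {
    "algorithm": "weighted_score",
    "weights": {"cpu": 0.30, "memory": 0.40, "disk": 0.20, "network": 0.10}
  }
}
"""


def load_weights(text):
    """Parse the scheduling weights and check that they sum to 1.0."""
    weights = json.loads(text)["scheduling"]["weights"]
    if any(value < 0 for value in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    # Assumption: a total of 1.0 keeps node scores in the 0-1 range.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {total}, expected 1.0")
    return weights


weights = load_weights(CONFIG_TEXT)
print(sorted(weights))  # ['cpu', 'disk', 'memory', 'network']
```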
Rebalancing Thresholds¶
{
  "scheduling": {
    "rebalance_threshold": 0.80,
    "rebalance_interval_seconds": 300,
    "min_instances_for_rebalance": 5
  }
}
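The following sketch shows how these settings might combine into a per-node check. It assumes that `min_instances_for_rebalance` is the minimum number of instances a node must host before it is considered for rebalancing; that reading is a guess, and the `needs_rebalance` function is hypothetical.

```python
def needs_rebalance(utilization, instances_count,
                    threshold=0.80, min_instances=5):
    """Return True when a node is a rebalancing candidate.

    utilization:     fraction of the node's resources in use (0-1)
    instances_count: instances currently hosted on the node
    min_instances:   ASSUMED meaning of min_instances_for_rebalance
    """
    # "Exceeds threshold" is read as strictly greater than.
    return instances_count >= min_instances and utilization > threshold


print(needs_rebalance(0.85, 10))  # True
print(needs_rebalance(0.67, 23))  # False
```

Under the same reading, `rebalance_interval_seconds` would control how often such a check runs (every 300 seconds in the configuration above).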
Placement Constraints¶
{
  "scheduling": {
    "constraints": {
      "max_instances_per_node": 50,
      "min_memory_headroom_mb": 1024,
      "min_cpu_headroom_percent": 10
    }
  }
}
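As an illustration, the constraints could be evaluated per node like this. The sketch assumes that "headroom" means the capacity that must remain free after the new instance is placed; the source does not define it, and the `within_constraints` function and its parameters are hypothetical.

```python
CONSTRAINTS = {
    "max_instances_per_node": 50,
    "min_memory_headroom_mb": 1024,
    "min_cpu_headroom_percent": 10,
}


def within_constraints(instances_count, free_memory_mb, free_cpu_percent,
                       request_memory_mb, request_cpu_percent,
                       constraints=CONSTRAINTS):
    """Check whether placing a request keeps the node inside its limits."""
    if instances_count >= constraints["max_instances_per_node"]:
        return False
    # ASSUMPTION: headroom is what must stay free after placement.
    memory_left = free_memory_mb - request_memory_mb
    if memory_left < constraints["min_memory_headroom_mb"]:
        return False
    cpu_left = free_cpu_percent - request_cpu_percent
    if cpu_left < constraints["min_cpu_headroom_percent"]:
        return False
    return True


print(within_constraints(10, 4096, 40, 2048, 20))  # True
print(within_constraints(10, 2560, 40, 2048, 20))  # False
```

The second call fails because only 512 MB would remain free, which is below the 1024 MB headroom.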
Monitoring¶
Scheduler Metrics¶
Output:
Scheduler Status: active
Pending Requests: 0
Average Placement Time: 45ms
Placements (24h): 150
Failed Placements: 2
Node Distribution:
  worker-1: 15 instances (45% load)
  worker-2: 23 instances (67% load)
  worker-3: 12 instances (35% load)
Placement History¶
Best Practices¶
- Balance weights based on your workload characteristics
- Set appropriate thresholds to prevent resource exhaustion
- Use anti-affinity for high-availability deployments
- Monitor scheduler metrics to identify bottlenecks
- Plan capacity to ensure headroom for scheduling flexibility