Custom Resource Definitions
Introduction
Koordinator extends Kubernetes functionality through Custom Resource Definitions (CRDs) that enable advanced scheduling, resource management, and performance optimization. These CRDs provide declarative interfaces for configuring and managing various aspects of cluster behavior, including resource recommendations, colocation policies, quota management, reservations, and system-level objectives. This documentation provides comprehensive details on each CRD, including schema definitions, field descriptions, validation rules, and usage patterns.
API Groups and Versions
Koordinator's CRDs are organized into several API groups, each serving a specific purpose within the system architecture:
- analysis.koordinator.sh: Provides resource optimization recommendations
- config.koordinator.sh: Manages colocation profiles for workload placement
- quota.koordinator.sh: Handles elastic quota profiles for resource allocation
- scheduling.koordinator.sh: Manages resource reservations and scheduling policies
- slo.koordinator.sh: Defines Service Level Objectives for node performance and resource quality
All CRDs are currently at version v1alpha1, indicating they are in early development and subject to change. The CRDs follow Kubernetes API conventions with proper metadata, spec, and status fields, and leverage controller patterns for reconciliation and lifecycle management.
Recommendation CRD
The Recommendation CRD in the analysis.koordinator.sh API group provides resource optimization suggestions for workloads based on historical usage patterns.
Schema Documentation
type Recommendation struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec RecommendationSpec `json:"spec,omitempty"`
Status RecommendationStatus `json:"status,omitempty"`
}
Spec Fields
- target: Defines the analysis target with the following subfields:
  - type: Specifies the target type (workload or podSelector)
  - workload: References a Kubernetes workload object when type is "workload"
  - podSelector: Label selector for targeting specific pods when type is "podSelector"
Status Fields
- podStatus: Contains resource recommendations with:
  - containerStatuses: Array of container-specific recommendations, each with:
    - containerName: Name of the container
    - resources: Recommended resource amounts (CPU, memory)
- updateTime: Timestamp of the last recommendation update
- conditions: Lifecycle and status conditions
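A populated status might look roughly like the sketch below. It is assembled from the field descriptions above; the values are illustrative and the exact nesting may differ from the installed CRD schema.
status:
  podStatus:
    containerStatuses:
    - containerName: app
      resources:
        cpu: 500m
        memory: 1Gi
  updateTime: "2024-05-01T12:00:00Z"
  conditions:
  - type: Ready
    status: "True"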
Validation Rules
- The target field is required
- Type must be either "workload" or "podSelector"
- When type is "workload", the workload field must be specified
- When type is "podSelector", the podSelector field must be specified
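To illustrate the podSelector variant, a Recommendation targeting pods by label might look like the following sketch (the label key and value are illustrative):
apiVersion: analysis.koordinator.sh/v1alpha1
kind: Recommendation
metadata:
  name: frontend-pods-recommendation
spec:
  target:
    type: podSelector
    podSelector:
      matchLabels:
        app: frontend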
Default Values
No default values are explicitly defined in the schema.
ClusterColocationProfile CRD
The ClusterColocationProfile CRD in the config.koordinator.sh API group defines policies for workload colocation and QoS configuration.
Schema Documentation
type ClusterColocationProfile struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ClusterColocationProfileSpec `json:"spec,omitempty"`
Status ClusterColocationProfileStatus `json:"status,omitempty"`
}
Spec Fields
- namespaceSelector: Label selector to filter namespaces (optional, defaults to all)
- selector: Label selector to filter pods (optional, defaults to all)
- probability: Probability (0-100%) that the profile will be applied
- qosClass: QoS class to assign (LSE, LSR, LS, BE, SYSTEM)
- priorityClassName: Priority class to inject into pods
- koordinatorPriority: Sub-priority value for Koordinator scheduling
- labels: Key-value pairs to inject as pod labels
- annotations: Key-value pairs to inject as pod annotations
- labelKeysMapping: Maps existing label keys to new keys
- annotationKeysMapping: Maps existing annotation keys to new keys
- labelSuffixes: Appends suffixes to existing label values
- schedulerName: Scheduler to dispatch pods
- patch: Raw extension for patching pod templates
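Putting several of these fields together, a profile that injects labels, annotations, and a Koordinator priority into a fraction of matching pods might look like the sketch below. All selector keys, label values, and the probability shown are illustrative, not required conventions.
apiVersion: config.koordinator.sh/v1alpha1
kind: ClusterColocationProfile
metadata:
  name: batch-colocation
spec:
  namespaceSelector:
    matchLabels:
      koordinator.sh/enable-colocation: "true"
  selector:
    matchLabels:
      workload-type: batch
  probability: "50%"
  qosClass: BE
  koordinatorPriority: 1111
  labels:
    koordinator.sh/mutated: "true"
  annotations:
    colocation.example.com/profile: batch
  schedulerName: koord-scheduler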
Validation Rules
- qosClass must be one of: LSE, LSR, LS, BE, SYSTEM
- namespaceSelector and selector follow standard Kubernetes label selector validation
- probability must be a valid integer or string percentage
Default Values
- namespaceSelector and selector default to empty (matches all)
- probability has no default (optional)
ElasticQuotaProfile CRD
The ElasticQuotaProfile CRD in the quota.koordinator.sh API group manages quota profiles for flexible resource allocation.
Schema Documentation
type ElasticQuotaProfile struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ElasticQuotaProfileSpec `json:"spec,omitempty"`
Status ElasticQuotaProfileStatus `json:"status,omitempty"`
}
Spec Fields
- quotaName: Required name of the associated quota
- quotaLabels: Labels to apply to the quota
- resourceRatio: Ratio to address resource fragmentation (e.g., "0.9")
- nodeSelector: Required node selector to target specific nodes
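For example, a profile that tags the generated quota and reserves headroom for fragmentation might be sketched as follows (the label key, node-pool label, and ratio are illustrative assumptions):
apiVersion: quota.koordinator.sh/v1alpha1
kind: ElasticQuotaProfile
metadata:
  name: cpu-pool-profile
spec:
  quotaName: cpu-pool-quota
  quotaLabels:
    quota.scheduling.koordinator.sh/tree-id: cpu-pool
  resourceRatio: "0.9"
  nodeSelector:
    matchLabels:
      node-pool: cpu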
Validation Rules
- quotaName is required
- nodeSelector is required
- resourceRatio must be a valid string representation of a ratio
Default Values
No default values are explicitly defined.
Reservation CRD
The Reservation CRD in the scheduling.koordinator.sh API group enables resource reservation for future workload placement.
Schema Documentation
type Reservation struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ReservationSpec `json:"spec,omitempty"`
Status ReservationStatus `json:"status,omitempty"`
}
Spec Fields
- template: Pod template specifying resource requirements and scheduling constraints
- owners: Array of reservation owners (object reference, controller reference, or label selector)
- ttl: Time-to-live duration (default: 24h)
- expires: Absolute expiration timestamp
- preAllocation: Allows overcommitment when set
- allocateOnce: Limits allocation to first successful owner (default: true)
- allocatePolicy: Allocation policy (Aligned, Restricted)
- unschedulable: Controls reservation schedulability
- taints: Taints applied to the reservation
Status Fields
- phase: Current phase (Pending, Available, Succeeded, Waiting, Failed)
- conditions: Status conditions with type, status, reason, and message
- currentOwners: Currently allocated owners
- nodeName: Node where resources are reserved
- allocatable: Total allocatable resources
- allocated: Currently allocated resources
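An Available reservation's status, using the fields above, might look roughly like this sketch (node name, owner, and resource amounts are illustrative):
status:
  phase: Available
  nodeName: node-1
  allocatable:
    cpu: "8"
    memory: 32Gi
  allocated:
    cpu: "4"
    memory: 16Gi
  currentOwners:
  - namespace: default
    name: batch-pod-1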
Validation Rules
- template and owners are required
- ttl and expires are mutually exclusive
- allocatePolicy must be Aligned or Restricted
- preAllocation and allocateOnce have specific interaction rules
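For instance, a reservation that uses an absolute expires timestamp instead of ttl and requires aligned allocation might be sketched as follows (the timestamp, labels, and resource requests are illustrative):
apiVersion: scheduling.koordinator.sh/v1alpha1
kind: Reservation
metadata:
  name: aligned-reservation
spec:
  template:
    spec:
      containers:
      - name: placeholder
        resources:
          requests:
            cpu: "4"
            memory: 16Gi
  owners:
  - labelSelector:
      matchLabels:
        app: aligned-workload
  allocatePolicy: Aligned
  allocateOnce: false
  expires: "2025-01-01T00:00:00Z"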
Default Values
- ttl defaults to "24h"
- allocateOnce defaults to true
- phase defaults to "Pending"
NodeMetric CRD
The NodeMetric CRD in the slo.koordinator.sh API group collects and reports node-level resource metrics.
Schema Documentation
type NodeMetric struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec NodeMetricSpec `json:"spec,omitempty"`
Status NodeMetricStatus `json:"status,omitempty"`
}
Spec Fields
- metricCollectPolicy: Collection policy with:
  - aggregateDurationSeconds: Aggregation period in seconds
  - reportIntervalSeconds: Reporting interval in seconds
  - nodeAggregatePolicy: Target grain for node aggregation
  - nodeMemoryCollectPolicy: Memory collection method (usageWithHotPageCache, usageWithoutPageCache, usageWithPageCache)
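A spec combining these policies might look like the sketch below. Note that the durations sub-field under nodeAggregatePolicy is an assumption about that field's shape; verify it against the CRD schema installed in your cluster.
apiVersion: slo.koordinator.sh/v1alpha1
kind: NodeMetric
metadata:
  name: node-example
spec:
  metricCollectPolicy:
    aggregateDurationSeconds: 300
    reportIntervalSeconds: 60
    nodeAggregatePolicy:
      durations: ["5m", "10m", "30m"]
    nodeMemoryCollectPolicy: usageWithoutPageCache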
Status Fields
- updateTime: Last update timestamp
- nodeMetric: Node resource usage metrics
- nodeUsage: Total node resource usage
- aggregatedNodeUsages: Historical usage aggregates
- systemUsage: System-level resource usage
- aggregatedSystemUsages: Historical system usage aggregates
- podsMetric: Array of pod-level metrics
- hostApplicationMetric: Metrics for host applications
- prodReclaimableMetric: Reclaimable resource indicators
Validation Rules
- nodeMemoryCollectPolicy must be one of the defined enum values
- Duration fields must be valid time durations
Default Values
- nodeMemoryCollectPolicy defaults to usageWithoutPageCache
NodeSLO CRD
The NodeSLO CRD in the slo.koordinator.sh API group defines Service Level Objectives for node performance and resource quality.
Schema Documentation
type NodeSLO struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec NodeSLOSpec `json:"spec,omitempty"`
Status NodeSLOStatus `json:"status,omitempty"`
}
Spec Fields
- resourceUsedThresholdWithBE: Resource threshold strategy for BE workloads
- resourceQOSStrategy: QoS configuration strategy for different workload classes
- cpuBurstStrategy: CPU burst configuration
- systemStrategy: Node-level system configuration
- extensions: Third-party extensions
- hostApplications: QoS management for host applications
The resourceQOSStrategy supports detailed configuration for CPU, memory, block I/O, network, and resctrl QoS across different workload classes (LSR, LS, BE, SYSTEM).
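As a rough sketch of how the burst and system strategies sit in the spec, the fragment below shows one possible layout. The sub-field names and values are assumptions and should be checked against the CRD schema in your cluster.
spec:
  cpuBurstStrategy:
    policy: auto
    cpuBurstPercent: 1000
    cfsQuotaBurstPercent: 300
  systemStrategy:
    minFreeKbytesFactor: 100
    watermarkScaleFactor: 150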
Validation Rules
- Various percentage fields have minimum and maximum constraints (0-100)
- CPU and memory burst policies have specific value ranges
- Network bandwidth values must be non-negative
Default Values
Many fields have sensible defaults for production environments, such as:
- cpuSuppressThresholdPercent: 65
- memoryEvictThresholdPercent: 70
- cpuEvictTimeWindowSeconds: 300
Usage Examples
Creating Resources with kubectl
# Create a Recommendation
kubectl apply -f recommendation.yaml
# Create a ClusterColocationProfile
kubectl apply -f cluster-colocation-profile.yaml
# Create an ElasticQuotaProfile
kubectl apply -f elastic-quota-profile.yaml
# Create a Reservation
kubectl apply -f reservation.yaml
# Create a NodeMetric
kubectl apply -f nodemetric.yaml
# Create a NodeSLO
kubectl apply -f nodeslo.yaml
Programmatic Creation with Go Client
// Example of creating a Reservation programmatically. Assumes the usual
// imports: context, time, corev1 (k8s.io/api/core/v1), metav1
// (k8s.io/apimachinery/pkg/apis/meta/v1), and schedulingv1alpha1
// (github.com/koordinator-sh/koordinator/apis/scheduling/v1alpha1).
// `client` is assumed to be a Koordinator clientset built from the
// project's generated versioned client package.
reservation := &schedulingv1alpha1.Reservation{
    ObjectMeta: metav1.ObjectMeta{
        Name: "example-reservation",
    },
    Spec: schedulingv1alpha1.ReservationSpec{
        Template: &corev1.PodTemplateSpec{
            // Pod template describing the reserved resource requests
            // and scheduling constraints.
        },
        Owners: []schedulingv1alpha1.ReservationOwner{
            {
                LabelSelector: &metav1.LabelSelector{
                    MatchLabels: map[string]string{"app": "example"},
                },
            },
        },
        TTL: &metav1.Duration{Duration: 24 * time.Hour},
    },
}
// Reservations are cluster-scoped, so no namespace is passed to Create.
createdReservation, err := client.SchedulingV1alpha1().Reservations().Create(context.TODO(), reservation, metav1.CreateOptions{})
if err != nil {
    // handle the creation error
}
_ = createdReservation
Sample YAML Manifests
Recommendation
apiVersion: analysis.koordinator.sh/v1alpha1
kind: Recommendation
metadata:
name: example-recommendation
spec:
target:
type: workload
workload:
kind: Deployment
name: example-app
apiVersion: apps/v1
ClusterColocationProfile
apiVersion: config.koordinator.sh/v1alpha1
kind: ClusterColocationProfile
metadata:
name: high-priority
spec:
selector:
matchLabels:
priority: high
qosClass: LSE
priorityClassName: system-cluster-critical
ElasticQuotaProfile
apiVersion: quota.koordinator.sh/v1alpha1
kind: ElasticQuotaProfile
metadata:
name: gpu-quota-profile
spec:
quotaName: gpu-quota
nodeSelector:
matchLabels:
accelerator: gpu
resourceRatio: "0.8"
Reservation
apiVersion: scheduling.koordinator.sh/v1alpha1
kind: Reservation
metadata:
name: batch-job-reservation
spec:
template:
spec:
containers:
- name: placeholder
resources:
requests:
cpu: "8"
memory: "32Gi"
owners:
- labelSelector:
matchLabels:
job-type: batch
ttl: "12h"
NodeMetric
apiVersion: slo.koordinator.sh/v1alpha1
kind: NodeMetric
metadata:
name: node-sample
spec:
metricCollectPolicy:
aggregateDurationSeconds: 60
reportIntervalSeconds: 15
nodeMemoryCollectPolicy: usageWithoutPageCache
NodeSLO
apiVersion: slo.koordinator.sh/v1alpha1
kind: NodeSLO
metadata:
  name: production-node-slo
spec:
  resourceUsedThresholdWithBE:
    enable: true
    cpuSuppressThresholdPercent: 70
    memoryEvictThresholdPercent: 75
  resourceQOSStrategy:
    beClass:
      cpuQOS:
        enable: true
        groupIdentity: -1
      memoryQOS:
        enable: true
        minLimitPercent: 20
        lowLimitPercent: 40
        throttlingPercent: 80
Section sources
- examples/spark-jobs/cluster-colocation-profile.yaml
- config/samples/analysis_v1aplha1_recommendation.yaml
- config/samples/scheduling_v1alpha1_reservation.yaml
- config/samples/slo_v1alpha1_nodemetric.yaml
Controller Integration
Koordinator's CRDs integrate with the Kubernetes controller pattern through dedicated controllers that reconcile the desired state defined in the spec with the actual cluster state.
Controller Responsibilities
- RecommendationController: Analyzes workload resource usage and generates optimization recommendations
- ColocationProfileController: Applies colocation policies to matching pods during admission
- QuotaProfileController: Manages elastic quota profiles and their application to workloads
- ReservationController: Handles resource reservation lifecycle, scheduling, and allocation
- NodeMetricController: Collects and aggregates node resource metrics
- NodeSLOController: Enforces node-level SLOs and QoS policies
Reconciliation Process
Each controller follows the standard reconciliation loop:
- Watches for CRD events (create, update, delete)
- Fetches the current resource state
- Compares spec (desired state) with status (current state)
- Takes actions to reconcile differences
- Updates status to reflect current state
- Emits events and conditions for observability
Controllers use Kubernetes client-go libraries for API interactions and leverage informers for efficient event handling. They also implement proper error handling and retry mechanisms to ensure reliability.
Section sources
- pkg/controller/colocationprofile/colocationprofile_controller.go
- pkg/slo-controller/nodemetric/nodemetric_controller.go
- pkg/slo-controller/nodeslo/nodeslo_controller.go
Common Configuration Pitfalls
Invalid Field Combinations
- Setting both ttl and expires in a Reservation (they are mutually exclusive)
- Specifying conflicting QoS policies in a NodeSLO
- Using invalid enum values for qosClass or allocatePolicy
Missing Required Fields
- Omitting quotaName or nodeSelector in ElasticQuotaProfile
- Not specifying template or owners in Reservation
- Leaving namespaceSelector or selector unset in ClusterColocationProfile, which makes the profile match all namespaces and pods rather than the intended subset
Validation Errors
- Exceeding percentage limits (e.g., cpuSuppressThresholdPercent > 100)
- Using invalid duration formats in TTL fields
- Providing malformed label selectors or match expressions
Best Practices
- Use descriptive names and labels for easy identification
- Set appropriate TTL values to avoid resource leaks
- Test colocation profiles in staging environments before production
- Monitor reservation utilization to optimize resource allocation
- Regularly review and update NodeSLO configurations based on workload patterns