API Reference
Packages
inference.networking.k8s.io/v1
Package v1 contains API Schema definitions for the inference.networking.k8s.io API group.
Resource Types
EndpointPickerFailureMode
Underlying type: string
EndpointPickerFailureMode defines the options for how the parent handles the case when the Endpoint Picker extension is non-responsive.
Validation:
- Enum: [FailOpen FailClose]
Appears in:
- EndpointPickerRef
Field | Description |
---|---|
FailOpen | EndpointPickerFailOpen specifies that the parent should forward the request to an endpoint of its picking when the Endpoint Picker extension fails. |
FailClose | EndpointPickerFailClose specifies that the parent should drop the request when the Endpoint Picker extension fails. |
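
The difference between the two modes can be sketched in Python. This is only an illustration of the semantics described above: `route_request`, the `pick_endpoint` callable, and the `fallback_endpoints` list are hypothetical names that are not part of this API, and choosing the first fallback endpoint is an arbitrary assumption about what "an endpoint of its picking" might mean.

```python
from typing import Callable, Optional, Sequence


def route_request(
    pick_endpoint: Callable[[], str],
    fallback_endpoints: Sequence[str],
    failure_mode: str = "FailClose",
) -> Optional[str]:
    """Return the endpoint to forward to, or None if the request is dropped.

    pick_endpoint stands in for a call to the Endpoint Picker extension
    (hypothetical); it is expected to raise when the extension fails.
    """
    if failure_mode not in ("FailOpen", "FailClose"):
        raise ValueError(f"unsupported failureMode: {failure_mode!r}")
    try:
        return pick_endpoint()
    except Exception:
        if failure_mode == "FailOpen" and fallback_endpoints:
            # FailOpen: the parent forwards to an endpoint of its own choosing.
            # Taking the first candidate is an assumption made for this sketch.
            return fallback_endpoints[0]
        # FailClose (or no candidate available): the request is dropped.
        return None
```

With a picker that always raises, `FailOpen` returns one of the fallback endpoints while `FailClose` returns `None`.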
EndpointPickerRef
EndpointPickerRef specifies a reference to an Endpoint Picker extension and its associated configuration.
Appears in:
- InferencePoolSpec
Field | Description | Default | Validation |
---|---|---|---|
group Group | Group is the group of the referent API object. When unspecified, the default value is "", representing the Core API group. | | MaxLength: 253 MinLength: 0 Pattern: ^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$ |
kind Kind | Kind is the Kubernetes resource kind of the referent API object. When unspecified, the referent is assumed to be a "Service" kind. ExternalName services can refer to CNAME DNS records that may live outside of the cluster and as such are difficult to reason about in terms of conformance. They also may not be safe to forward to (see CVE-2021-25740 for more information). Implementations MUST NOT support ExternalName Services. | Service | MaxLength: 63 MinLength: 1 Pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$ |
name ObjectName | Name is the name of the referent API object. | | MaxLength: 253 MinLength: 1 |
portNumber PortNumber | PortNumber is the port number of the Endpoint Picker extension service. When unspecified, implementations SHOULD infer a default value of 9002 when the kind field is "Service" or unspecified (defaults to "Service"). | | Maximum: 65535 Minimum: 1 |
failureMode EndpointPickerFailureMode | FailureMode configures how the parent handles the case when the Endpoint Picker extension is non-responsive. When unspecified, defaults to "FailClose". | FailClose | Enum: [FailOpen FailClose] |
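
The defaulting rules in the table above can be summarized with a short Python sketch. The helper name `apply_endpoint_picker_defaults` and the plain-dict representation are assumptions made for illustration; actual defaulting is performed by the API server and by implementations, not by user code.

```python
def apply_endpoint_picker_defaults(ref: dict) -> dict:
    """Return a copy of an EndpointPickerRef dict with documented defaults filled in."""
    out = dict(ref)
    out.setdefault("group", "")               # "" represents the Core API group
    out.setdefault("kind", "Service")         # referent is assumed to be a Service
    out.setdefault("failureMode", "FailClose")
    # Implementations SHOULD infer port 9002 only when the kind is "Service"
    # (explicitly or by default) and no port was given.
    if "portNumber" not in out and out["kind"] == "Service":
        out["portNumber"] = 9002
    return out
```

For example, a reference that sets only `name` resolves to a core-group Service on port 9002 with `FailClose`; a reference with a non-Service kind gets no inferred port.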
Group
Underlying type: string
Group refers to a Kubernetes Group. It must either be an empty string or an RFC 1123 subdomain.
This validation is based on the corresponding Kubernetes validation: https://github.com/kubernetes/apimachinery/blob/02cfb53916346d085a6c6c7c66f882e3c6b0eca6/pkg/util/validation/validation.go#L208
Valid values include:
- "" - empty string implies core Kubernetes API group
- "gateway.networking.k8s.io"
- "foo.example.com"
Invalid values include:
- "example.com/bar" - "/" is an invalid character
Validation:
- MaxLength: 253
- MinLength: 0
- Pattern: ^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
Appears in:
- EndpointPickerRef
- ParentReference
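
As a quick way to check candidate values locally, the length limit and pattern above can be applied with Python's `re` module. This is a sketch only: the helper name `is_valid_group` is hypothetical, and Python's regex engine is used as a stand-in for the engine the API server uses, which is assumed to behave the same for this pattern.

```python
import re

# Pattern copied from the Validation list for Group.
GROUP_PATTERN = re.compile(
    r"^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$"
)


def is_valid_group(value: str) -> bool:
    """Check a candidate Group against MaxLength 253 and the documented pattern."""
    return len(value) <= 253 and GROUP_PATTERN.fullmatch(value) is not None
```

The valid and invalid values listed above behave as documented under this check: the empty string, `gateway.networking.k8s.io`, and `foo.example.com` pass, while `example.com/bar` is rejected.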
InferencePool
InferencePool is the Schema for the InferencePools API.
Field | Description | Default | Validation |
---|---|---|---|
apiVersion string | inference.networking.k8s.io/v1 | | |
kind string | InferencePool | | |
metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
spec InferencePoolSpec | Spec defines the desired state of the InferencePool. | | |
status InferencePoolStatus | Status defines the observed state of the InferencePool. | | |
InferencePoolSpec
InferencePoolSpec defines the desired state of the InferencePool.
Appears in:
- InferencePool
Field | Description | Default | Validation |
---|---|---|---|
selector LabelSelector | Selector determines which Pods are members of this inference pool. It matches Pods by their labels only within the same namespace; cross-namespace selection is not supported. The structure of this LabelSelector is intentionally simple to be compatible with Kubernetes Service selectors, as some implementations may translate this configuration into a Service resource. | | |
targetPorts Port array | TargetPorts defines a list of ports that are exposed by this InferencePool. Currently, the list may only include a single port definition. | | MaxItems: 1 MinItems: 1 |
endpointPickerRef EndpointPickerRef | EndpointPickerRef is a reference to the Endpoint Picker extension and its associated configuration. | | |
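
To make the shape of the spec concrete, the sketch below builds a spec as a Python dict and checks the item-count and port-range constraints documented on this page. The helper `validate_pool_spec`, the label `app: my-model-server`, port `8000`, and the name `my-endpoint-picker` are all hypothetical values chosen for illustration; the check covers only a subset of the validation the API server performs.

```python
from typing import List


def validate_pool_spec(spec: dict) -> List[str]:
    """Return human-readable violations; an empty list means these checks passed."""
    errors: List[str] = []

    # LabelSelector.matchLabels: MinItems 1, MaxItems 64.
    match_labels = spec.get("selector", {}).get("matchLabels", {})
    if not 1 <= len(match_labels) <= 64:
        errors.append("selector.matchLabels must have between 1 and 64 entries")

    # targetPorts: MinItems 1, MaxItems 1, so exactly one port definition.
    ports = spec.get("targetPorts", [])
    if len(ports) != 1:
        errors.append("targetPorts must contain exactly one port")
    for port in ports:
        number = port.get("number")
        if not isinstance(number, int) or not 1 <= number <= 65535:
            errors.append(f"invalid port number: {number!r}")

    return errors


# Hypothetical example values, not taken from the API reference.
example_spec = {
    "selector": {"matchLabels": {"app": "my-model-server"}},
    "targetPorts": [{"number": 8000}],
    "endpointPickerRef": {"name": "my-endpoint-picker"},
}
```

Calling `validate_pool_spec(example_spec)` yields no violations; adding a second entry to `targetPorts` produces the "exactly one port" violation.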
InferencePoolStatus
InferencePoolStatus defines the observed state of the InferencePool.
Appears in:
- InferencePool
Field | Description | Default | Validation |
---|---|---|---|
parents ParentStatus array | Parents is a list of parent resources, typically Gateways, that are associated with the InferencePool, and the status of the InferencePool with respect to each parent. A controller that manages the InferencePool must add an entry for each parent it manages and remove the parent entry when the controller no longer considers the InferencePool to be associated with that parent. A maximum of 32 parents will be represented in this list. When the list is empty, it indicates that the InferencePool is not associated with any parents. | | MaxItems: 32 |
Kind
Underlying type: string
Kind refers to a Kubernetes Kind.
Valid values include:
- "Service"
- "HTTPRoute"
Invalid values include:
- "invalid/kind" - "/" is an invalid character
Validation:
- MaxLength: 63
- MinLength: 1
- Pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$
Appears in:
- EndpointPickerRef
- ParentReference
LabelKey
Underlying type: string
LabelKey was originally copied from the Gateway API (https://github.com/kubernetes-sigs/gateway-api/blob/99a3934c6bc1ce0874f3a4c5f20cafd8977ffcb4/apis/v1/shared_types.go#L694-L731) and is duplicated here to avoid taking an unexpected dependency on that API.
LabelKey is the key of a label. This is used for validation of maps. This matches the Kubernetes "qualified name" validation that is used for labels. Labels are case sensitive, so: my-label and My-Label are considered distinct.
Valid values include:
- example
- example.com
- example.com/path
- example.com/path.html
Invalid values include:
- example~ - "~" is an invalid character
- example.com. - cannot start or end with "."
Validation:
- MaxLength: 253
- MinLength: 1
- Pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?([A-Za-z0-9][-A-Za-z0-9_.]{0,61})?[A-Za-z0-9]$
Appears in:
- LabelSelector
LabelSelector
LabelSelector defines a query for resources based on their labels. This simplified version uses only the matchLabels field.
Appears in:
- InferencePoolSpec
Field | Description | Default | Validation |
---|---|---|---|
matchLabels object (keys:LabelKey, values:LabelValue) | MatchLabels contains a set of required {key,value} pairs. An object must match every label in this map to be selected. The matching logic is an AND operation on all entries. | | MaxItems: 64 MinItems: 1 |
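
The AND semantics of matchLabels can be expressed in a few lines of Python. The function name `selects` and the dict representation of labels are assumptions for illustration; this is not how any particular implementation performs selection.

```python
from typing import Mapping


def selects(match_labels: Mapping[str, str], pod_labels: Mapping[str, str]) -> bool:
    """Return True only if every matchLabels entry is present on the Pod with the same value."""
    # Labels are case sensitive, so keys and values are compared exactly.
    return all(pod_labels.get(key) == value for key, value in match_labels.items())
```

A Pod carrying extra labels is still selected as long as every required pair is present; a Pod missing any one pair, or carrying a different value for it, is not.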
LabelValue
Underlying type: string
LabelValue is the value of a label. This is used for validation of maps. This matches the Kubernetes label validation rules:
- must be 63 characters or less (can be empty)
- unless empty, must begin and end with an alphanumeric character ([a-z0-9A-Z])
- may contain dashes (-), underscores (_), dots (.), and alphanumerics in between
Valid values include:
- MyValue
- my.name
- 123-my-value
Validation:
- MaxLength: 63
- MinLength: 0
- Pattern: ^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$
Appears in:
- LabelSelector
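
The LabelValue rules can be checked locally with the documented pattern. As a hedged sketch, the helper name `is_valid_label_value` is hypothetical and Python's `re` module stands in for the API server's regex engine.

```python
import re

# Pattern copied from the Validation list for LabelValue.
LABEL_VALUE_PATTERN = re.compile(r"^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$")


def is_valid_label_value(value: str) -> bool:
    """Check a candidate LabelValue against MaxLength 63 and the documented pattern."""
    return len(value) <= 63 and LABEL_VALUE_PATTERN.fullmatch(value) is not None
```

Under this check the listed valid values and the empty string pass, while values that start or end with a non-alphanumeric character, or exceed 63 characters, are rejected.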
Namespace
Underlying type: string
Namespace refers to a Kubernetes namespace. It must be an RFC 1123 label.
This validation is based on the corresponding Kubernetes validation: https://github.com/kubernetes/apimachinery/blob/02cfb53916346d085a6c6c7c66f882e3c6b0eca6/pkg/util/validation/validation.go#L187
This is used for Namespace name validation here: https://github.com/kubernetes/apimachinery/blob/02cfb53916346d085a6c6c7c66f882e3c6b0eca6/pkg/api/validation/generic.go#L63
Valid values include:
- "example"
Invalid values include:
- "example.com" - "." is an invalid character
Validation:
- MaxLength: 63
- MinLength: 1
- Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$
Appears in:
- ParentReference
ObjectName
Underlying type: string
ObjectName refers to the name of a Kubernetes object. Object names can have a variety of forms, including RFC 1123 subdomains, RFC 1123 labels, or RFC 1035 labels.
Validation:
- MaxLength: 253
- MinLength: 1
Appears in:
- EndpointPickerRef
- ParentReference
ParentReference
ParentReference identifies an API object. It is used to associate the InferencePool with a parent resource, such as a Gateway.
Appears in:
- ParentStatus
Field | Description | Default | Validation |
---|---|---|---|
group Group | Group is the group of the referent API object. When unspecified, the referent is assumed to be in the "gateway.networking.k8s.io" API group. | gateway.networking.k8s.io | MaxLength: 253 MinLength: 0 Pattern: ^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$ |
kind Kind | Kind is the kind of the referent API object. When unspecified, the referent is assumed to be a "Gateway" kind. | Gateway | MaxLength: 63 MinLength: 1 Pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$ |
name ObjectName | Name is the name of the referent API object. | | MaxLength: 253 MinLength: 1 |
namespace Namespace | Namespace is the namespace of the referenced object. When unspecified, the local namespace is inferred. Note that when a namespace different than the local namespace is specified, a ReferenceGrant object is required in the referent namespace to allow that namespace's owner to accept the reference. See the ReferenceGrant documentation for details: https://gateway-api.sigs.k8s.io/api-types/referencegrant/ | | MaxLength: 63 MinLength: 1 Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ |
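
The defaulting and namespace rules for a ParentReference can be sketched as follows. Both helper names (`resolve_parent_ref`, `needs_reference_grant`) and the dict representation are hypothetical; the sketch only restates the rules in the table above and does not look up or verify any ReferenceGrant.

```python
def resolve_parent_ref(ref: dict, local_namespace: str) -> dict:
    """Fill in the documented defaults for a ParentReference dict."""
    return {
        "group": ref.get("group", "gateway.networking.k8s.io"),
        "kind": ref.get("kind", "Gateway"),
        "name": ref["name"],  # name has no default and must be provided
        # When unspecified, the local namespace is inferred.
        "namespace": ref.get("namespace", local_namespace),
    }


def needs_reference_grant(ref: dict, local_namespace: str) -> bool:
    """A cross-namespace reference requires a ReferenceGrant in the referent namespace."""
    return ref.get("namespace", local_namespace) != local_namespace
```

A reference that sets only `name` resolves to a Gateway in the local namespace and needs no ReferenceGrant; one that names a different namespace does.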
ParentStatus
ParentStatus defines the observed state of the InferencePool from a parent, e.g. a Gateway.
Appears in:
- InferencePoolStatus
Field | Description | Default | Validation |
---|---|---|---|
conditions Condition array | Conditions is a list of status conditions that provide information about the observed state of the InferencePool. This field is required to be set by the controller that manages the InferencePool. Supported condition types are "Accepted" and "ResolvedRefs". | | MaxItems: 8 |
parentRef ParentReference | ParentRef is used to identify the parent resource that this status is associated with. It is used to match the InferencePool with the parent resource, such as a Gateway. | | |
Port
Port defines the network port that will be exposed by this InferencePool.
Appears in:
- InferencePoolSpec
Field | Description | Default | Validation |
---|---|---|---|
number PortNumber | Number defines the port number to access the selected model server Pods. The number must be in the range 1 to 65535. | | Maximum: 65535 Minimum: 1 |
PortNumber
Underlying type: integer
PortNumber defines a network port.
Validation:
- Maximum: 65535
- Minimum: 1
Appears in:
- EndpointPickerRef
- Port