A model server runner that launches a model server in a Kubernetes cluster.
Inherits From: BaseModelServerRunner
tfx.components.infra_validator.model_server_runners.kubernetes_runner.KubernetesRunner(
model_path: Text,
serving_binary: tfx.components.infra_validator.serving_bins.ServingBinary,
serving_spec: infra_validator_pb2.ServingSpec
)
Args | |
---|---|
model_path | An IV-flavored model path. (See model_path_utils.py) |
serving_binary | A ServingBinary to run. |
serving_spec | A ServingSpec instance. |
Methods
GetEndpoint
GetEndpoint() -> Text
Get the endpoint used to connect to the model server.
Endpoint will be available after the model server job has reached the Running state.
Raises | |
---|---|
AssertionError | If the runner hasn't reached the Running state. |
Start
Start() -> None
Start the model server in a non-blocking manner.
Start() transitions the job state from Initial to Scheduled. The serving platform will move the job into the Running state at some later point.
In Start(), the model server runner should prepare the resources the model server requires, including config files, environment variables, volumes, proper authentication, and computing resource allocation. These resources are not cleaned up automatically; if you have ever called Start(), you should call Stop() to release them.
Start() must not be called twice. If you need to restart the job, create another model server runner instance.
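The state transition and the call-once rule described above can be sketched as follows. This is an illustrative stand-in only, not the actual KubernetesRunner implementation: the `JobState` enum, the `SketchRunner` class, and the choice of `RuntimeError` for a repeated call are all assumptions made for the example.

```python
import enum


class JobState(enum.Enum):
    """Hypothetical states, named after the ones mentioned in the text."""
    INITIAL = "Initial"
    SCHEDULED = "Scheduled"
    RUNNING = "Running"


class SketchRunner:
    """Illustrative stand-in; not the real KubernetesRunner."""

    def __init__(self):
        self.state = JobState.INITIAL
        self.resources = []

    def Start(self):
        # Start() may only be called once per runner instance.
        # (RuntimeError is an assumption; the source does not name the error.)
        if self.state is not JobState.INITIAL:
            raise RuntimeError("Start() was already called; create a new runner.")
        # Prepare what the model server needs (placeholder entry only).
        self.resources.append("config")
        # Non-blocking: the job is only Scheduled when Start() returns.
        self.state = JobState.SCHEDULED


runner = SketchRunner()
runner.Start()
print(runner.state)  # JobState.SCHEDULED
```

Keeping the state on the instance is what makes "create another runner to restart" the natural rule: a used instance never returns to Initial.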
Stop
Stop() -> None
Stop the model server in a blocking manner.
The model server job is stopped gracefully once the infra validation logic is done. This is where you clean up every resource you created in Start(). Avoid raising errors in Stop(), since it is usually called in a finally block.
Stop() is guaranteed to be called if Start() is ever called, unless the process dies unexpectedly due to external factors (e.g. SIGKILL). Stop() can be called even when Start() did not complete, so Stop() should not assume that Start() completed.
Stop() is also called when a graceful shutdown of the executor (not the model server) is requested. Stop() should finish within the graceful shutdown period, and it is perfectly fine to add retry logic inside Stop() until the deadline is met.
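The Start()/Stop() pairing described above maps onto a try/finally block. The following is a minimal sketch under stated assumptions: `CleanupSketchRunner` and `run_infra_validation` are hypothetical names introduced for illustration and are not part of TFX.

```python
class CleanupSketchRunner:
    """Illustrative stand-in; not the real KubernetesRunner."""

    def __init__(self):
        self.started = False
        self.stopped = False

    def Start(self):
        self.started = True

    def Stop(self):
        # Does not assume Start() completed, and avoids raising.
        self.stopped = True


def run_infra_validation(runner, validate):
    """Runs `validate` between Start() and Stop(), always cleaning up."""
    try:
        runner.Start()
        return validate()
    finally:
        # Reached on success, on validation errors, and on partial Start().
        runner.Stop()


runner = CleanupSketchRunner()
try:
    run_infra_validation(runner, lambda: 1 / 0)  # validation step fails
except ZeroDivisionError:
    pass
print(runner.stopped)  # True
```

Placing Start() inside the try block means Stop() still runs when Start() fails partway, which is why Stop() cannot assume Start() completed.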
WaitUntilRunning
WaitUntilRunning(
deadline: float
) -> None
Wait until the model server job is running.
When this method returns without error, the model server job is in the Running state, where you can perform all the infra validation logic. This does not guarantee that the job will remain in the Running state (e.g. preemption can happen on some serving platforms), so an infra validation failure may be caused by the model server job no longer being in the Running state. Such a failure still counts as a validation failure and is attributed to the model.
Args | |
---|---|
deadline | A deadline time in UTC timestamp (in seconds). |
Returns | |
---|---|
Whether the model is available or not. |
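Because `deadline` is an absolute timestamp in seconds rather than a relative timeout, callers typically derive it from the current time. The sketch below shows one way to do that, together with a generic polling loop. `make_deadline` and `wait_until` are hypothetical helpers, and raising `TimeoutError` on expiry is an assumption; the source does not state how the real method reports a missed deadline.

```python
import time


def make_deadline(timeout_secs: float) -> float:
    """Converts a relative timeout into an absolute timestamp in seconds."""
    return time.time() + timeout_secs


def wait_until(is_running, deadline: float, poll_interval: float = 0.01) -> None:
    """Polls `is_running` until it returns True or `deadline` passes."""
    while time.time() < deadline:
        if is_running():
            return
        time.sleep(poll_interval)
    # Assumed error type for this sketch only.
    raise TimeoutError("Job did not reach the Running state before the deadline.")


# Usage: allow up to 5 seconds for a job that is already running.
wait_until(lambda: True, make_deadline(5.0))
```

An absolute deadline lets several sequential waits share one time budget without each step having to subtract the time already spent.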