Service Information¶
Prompt KEDA service information.
Architecture¶
KEDA is deployed with the KEDA operator in the Prompt Processing vClusters in the keda Kubernetes namespace.
KEDA controls autoscaling for Prompt Processing, replacing the Kubernetes autoscaler. KEDA uses scalers to determine how aggressively to scale up and down. Prompt KEDA is configured to autoscale based on the number of Fanned Out events in Prompt Redis; this count is called lag.
KEDA is not configured to scale down Scaled Jobs while they are running. KEDA provides a max run time setting, but it is left unset because Prompt Processing is configured to consume multiple rounds of Fanned Out messages before finishing. This reduces churn from new Pods being created.
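As a rough illustration of lag-based scaling, the sketch below computes how many new jobs would be started for a given lag. It is not KEDA's actual algorithm or the Prompt KEDA configuration: the function name and the parameters `target_per_job` and `max_jobs` are hypothetical, chosen only to show the idea that more pending Fanned Out events lead to more jobs, and that running jobs are never removed.

```python
import math


def jobs_to_create(lag: int, target_per_job: int, running_jobs: int, max_jobs: int) -> int:
    """Return how many new jobs to start for the current lag.

    All parameters are illustrative assumptions, not real KEDA settings:
      lag            - number of pending Fanned Out events
      target_per_job - events one job is expected to absorb
      running_jobs   - jobs already running
      max_jobs       - upper bound on concurrent jobs
    """
    if target_per_job <= 0:
        raise ValueError("target_per_job must be positive")
    # Jobs wanted for this lag, capped at the configured maximum.
    desired = min(max_jobs, math.ceil(lag / target_per_job))
    # Never negative: running jobs are left to finish, not scaled down.
    return max(0, desired - running_jobs)
```

For example, with a lag of 25, a target of 10 events per job, one job running, and a cap of 5, the sketch asks for 2 new jobs. With a lag of 0 and 4 jobs running it asks for 0, leaving the running jobs alone.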
Architecture Diagram¶
Associated Systems¶
Configuration Location¶
| Config Area | Location |
|---|---|
| Configuration | |
| Vault Secrets Dev | secret/rubin/usdf-prompt-processing-dev/prompt-keda |
| Vault Secrets Prod | secret/rubin/usdf-prompt-processing/prompt-keda |
Data Flow¶
See Data Flow
Dependencies - S3DF¶
Below are the S3DF dependencies.
- Kubernetes
- SLAC LDAP to authenticate to the vCluster
- Prompt Redis for creating and scaling Scaled Jobs
Dependencies - External¶
Below are the external dependencies.
- Internet access to pull the KEDA Docker image
Disaster Recovery¶
The application can be redeployed with no data needing to be restored.