diff --git a/release-notes.mdx b/release-notes.mdx
index 2a345628..f87c13c5 100644
--- a/release-notes.mdx
+++ b/release-notes.mdx
@@ -261,7 +261,7 @@ Flash now supports deploying endpoints to [multiple datacenters](/flash/configur
 - **Self-service worker upgrade**: Rebuild and roll workers from the dashboard without support tickets.
 - **Edit template from endpoint page**: Inline edit and redeploy the underlying template directly from the endpoint view.
 - **Improved Serverless metrics page**: Refinements to charts and filters for quicker root-cause analysis.
-- [Flex and active workers](/serverless/pricing): Discounted always-on "active" capacity for baseline load with on-demand "flex" workers for bursts.
+- [Flex and active workers](/serverless/pricing): Always-on "active" workers for baseline load with on-demand "flex" workers for bursts.
 - **Billing explorer**: Inspect costs by resource, region, and time to identify optimization opportunities.
diff --git a/serverless/development/optimization.mdx b/serverless/development/optimization.mdx
index 2997eee9..5b9b39fe 100644
--- a/serverless/development/optimization.mdx
+++ b/serverless/development/optimization.mdx
@@ -52,7 +52,7 @@ For private models, [embed them in your Docker image](/serverless/workers/create
 ### Maintain active workers
-Set [active workers](/serverless/endpoints/endpoint-configurations#active-workers) > 0 to eliminate cold starts entirely. Active workers cost up to 30% less than flex workers.
+Set [active workers](/serverless/endpoints/endpoint-configurations#active-workers) > 0 to eliminate cold starts entirely.
 **Formula**: `Active workers = (Requests/min × Request duration in seconds) / 60`
diff --git a/serverless/endpoints/endpoint-configurations.mdx b/serverless/endpoints/endpoint-configurations.mdx
index 6489caa5..e99692dc 100644
--- a/serverless/endpoints/endpoint-configurations.mdx
+++ b/serverless/endpoints/endpoint-configurations.mdx
@@ -55,7 +55,7 @@ For endpoints with fewer than five workers, all workers use the highest-priority
 ### Active workers
-Minimum number of workers that remain warm and ready at all times. Setting this to 1+ eliminates cold starts. Active workers incur charges when idle but receive a 20-30% discount.
+Minimum number of workers that remain warm and ready at all times. Setting this to 1+ eliminates cold starts. Active workers incur charges continuously, including when idle.
 ### Max workers
diff --git a/serverless/pricing.mdx b/serverless/pricing.mdx
index ae1ac737..b5ff0105 100644
--- a/serverless/pricing.mdx
+++ b/serverless/pricing.mdx
@@ -20,7 +20,7 @@ Serverless offers pay-per-second pricing with no upfront costs. You're billed fr
 | | Flex workers | Active workers |
 |---|--------------|----------------|
 | **Behavior** | Scale to zero when idle | Always running (24/7) |
-| **Pricing** | Standard per-second rate | 20–30% discount |
+| **Pricing** | Standard per-second rate | Discounts available through sales inquiry |
 | **Best for** | Variable workloads, cost optimization | Consistent traffic, low-latency requirements |
 ## GPU pricing
diff --git a/serverless/workers/overview.mdx b/serverless/workers/overview.mdx
index 173994c1..a2d22639 100644
--- a/serverless/workers/overview.mdx
+++ b/serverless/workers/overview.mdx
@@ -39,7 +39,7 @@ To deploy workers with AI/ML models, follow this order of preference:
 Workers can run in two modes depending on your latency and cost requirements:
-- **Active workers** run continuously (24/7) and are always ready to process requests instantly. They eliminate cold starts entirely and receive a discounted rate, making them ideal for latency-sensitive or high-traffic applications.
+- **Active workers** run continuously (24/7) and are always ready to process requests instantly. They eliminate cold starts entirely, making them ideal for latency-sensitive or high-traffic applications.
 - **Flex workers** scale dynamically based on demand, spinning down to zero when idle. They incur cold starts when scaling up but cost nothing when not in use, making them ideal for variable or sporadic workloads.
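
Reviewer note: the sizing formula kept as context in the `serverless/development/optimization.mdx` hunk, `Active workers = (Requests/min × Request duration in seconds) / 60`, can be sanity-checked with a small sketch. The function name `estimate_active_workers` is hypothetical, and rounding up to a whole worker is an assumption of this sketch; the docs state only the raw formula.

```python
import math


def estimate_active_workers(requests_per_minute: float,
                            request_duration_seconds: float) -> int:
    """Estimate a baseline active worker count from steady-state load.

    Applies the documented formula:
        (requests per minute * request duration in seconds) / 60
    and rounds up to a whole worker (rounding is an assumption of this
    sketch, not something the docs specify).
    """
    if requests_per_minute < 0 or request_duration_seconds < 0:
        raise ValueError("inputs must be non-negative")
    # Average number of requests in flight at any moment.
    concurrent_requests = (requests_per_minute * request_duration_seconds) / 60
    return math.ceil(concurrent_requests)


if __name__ == "__main__":
    # Example: 120 requests/min, each taking 1.5 seconds.
    print(estimate_active_workers(120, 1.5))
```

The result is an average-load estimate only; since active workers are billed continuously, including when idle, it is probably best read as a starting point rather than a target.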