Experimentation

When a function has more than one variant, the LLM Service Daemon (LSD) needs a way to pick which one serves a given request. Three strategies are available, configured per function (and optionally overridden per namespace):

[functions.generate_summary.experimentation]
type = "static"   # default | static | adaptive

Default

If a function has no experimentation config, LSD samples uniformly at random across all of its active variants.

Static (weighted A/B test)

[functions.generate_summary.experimentation]
type = "static"
candidate_variants = { gpt4o = 0.8, claude_sonnet = 0.2 }
fallback_variants = ["gpt4o"]

candidate_variants can also be a plain array (["gpt4o", "claude_sonnet"]), which gives each listed variant equal weight. fallback_variants are used only if every candidate fails.
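The weighted split and the fallback behaviour can be sketched in a few lines of Python. This is an illustration under stated assumptions, not LSD's implementation: the helper names (normalize, serve), the choice to retry the remaining candidates by weight before moving on, and the in-order traversal of fallback_variants are all hypothetical.

```python
import random


def normalize(candidates):
    """Accept {name: weight} or [name, ...] and return {name: probability}."""
    if isinstance(candidates, dict):
        weights = {name: float(w) for name, w in candidates.items() if w > 0}
    else:
        # A plain array means equal weights.
        weights = {name: 1.0 for name in candidates}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one candidate needs a positive weight")
    return {name: w / total for name, w in weights.items()}


def serve(candidates, fallbacks, call, rng=random):
    """Sample a candidate by weight; on failure try the others, then fallbacks.

    `call` is a hypothetical function that runs one variant and raises on failure.
    """
    remaining = normalize(candidates)
    while remaining:
        names = list(remaining)
        name = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        try:
            return name, call(name)
        except Exception:
            # Assumption: a failed candidate is not retried for this request.
            del remaining[name]
    for name in fallbacks:
        try:
            return name, call(name)
        except Exception:
            continue
    raise RuntimeError("all candidate and fallback variants failed")
```

With candidate_variants = { gpt4o = 0.8, claude_sonnet = 0.2 }, roughly four out of five requests would go to gpt4o in this sketch, and fallback_variants would be consulted only after both candidates had raised.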

Adaptive (multi-armed bandit)

[functions.generate_summary.experimentation]
type = "adaptive"
algorithm = "track_and_stop"

Adaptive experimentation shifts traffic toward better-performing variants over time, based on collected feedback. It uses a track-and-stop algorithm, which is currently the only one implemented. Adaptive mode needs a metric to optimize against, defined either in the function’s evaluators or via feedback.
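To give a feel for how feedback can shift traffic, here is a deliberately simplified bandit in Python. It uses Beta-Bernoulli Thompson sampling, which is a different and simpler technique than track-and-stop, and it assumes boolean (success/failure) feedback. The class and method names are hypothetical and none of this reflects LSD's internals.

```python
import random


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over named variants (illustrative only)."""

    def __init__(self, variants, seed=None):
        self.rng = random.Random(seed)
        # Each variant starts with a uniform Beta(1, 1) prior: [alpha, beta].
        self.params = {name: [1, 1] for name in variants}

    def pick(self):
        # Draw a plausible success rate per variant and serve the highest draw.
        draws = {
            name: self.rng.betavariate(alpha, beta)
            for name, (alpha, beta) in self.params.items()
        }
        return max(draws, key=draws.get)

    def record(self, variant, success):
        # Successes increment alpha, failures increment beta.
        self.params[variant][0 if success else 1] += 1
```

Before any feedback arrives, both variants are picked about equally often. As feedback accumulates, the variant with the higher observed success rate wins most draws, which is the "shift traffic toward better-performing variants" behaviour in miniature. Unlike track-and-stop, this sketch has no stopping rule.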

Namespaces

Override the experimentation strategy for a subset of traffic without touching the base config:

[functions.generate_summary.experimentation]
type = "static"
candidate_variants = ["gpt4o"]

[functions.generate_summary.experimentation.namespaces.beta_users]
type = "static"
candidate_variants = ["claude_sonnet"]

Requests tagged with a namespace use that namespace’s config; all other requests use the base config.
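The lookup rule can be sketched in Python as follows. The dictionary mirrors what a TOML parser would produce for the config above. The function name resolve_experimentation is hypothetical, and two behaviours are assumptions: that a namespace config replaces the base config wholesale rather than merging with it, and that an unknown namespace is treated like an untagged request.

```python
def resolve_experimentation(experimentation, namespace=None):
    """Return the experimentation config that applies to one request."""
    overrides = experimentation.get("namespaces", {})
    if namespace is not None and namespace in overrides:
        # Assumption: the namespace config replaces the base config entirely.
        return overrides[namespace]
    # Untagged (or unrecognized) requests get the base config without the
    # nested namespaces table.
    return {k: v for k, v in experimentation.items() if k != "namespaces"}


# Parsed form of [functions.generate_summary.experimentation] from above.
GENERATE_SUMMARY = {
    "type": "static",
    "candidate_variants": ["gpt4o"],
    "namespaces": {
        "beta_users": {"type": "static", "candidate_variants": ["claude_sonnet"]},
    },
}
```

Under these assumptions, resolve_experimentation(GENERATE_SUMMARY, "beta_users") yields the claude_sonnet config, while a call with no namespace yields the gpt4o base config.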

Inspecting the live split

To see the sampling probabilities currently in effect for a function, query the internal endpoint:

curl http://localhost:3000/internal/functions/generate_summary/variant_sampling_probabilities
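The same request can be made from Python using only the standard library. This is a small sketch: the helper names are hypothetical, the base URL is taken from the curl example, and because the response format is not described here the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen


def sampling_probabilities_url(function_name, base_url="http://localhost:3000"):
    """Build the endpoint URL for one function."""
    path = f"/internal/functions/{quote(function_name, safe='')}/variant_sampling_probabilities"
    return base_url.rstrip("/") + path


def fetch_sampling_probabilities(function_name, base_url="http://localhost:3000", timeout=5.0):
    """Fetch the current split and return the raw response body as text."""
    url = sampling_probabilities_url(function_name, base_url)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_sampling_probabilities("generate_summary") needs a running LSD instance at the given base URL.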