
Add custom-forward-backward and forward endpoints for RL custom losses #276

Open
dvmazur wants to merge 1 commit into main from dmazur/custom-losses-endpoints

Conversation


@dvmazur dvmazur commented May 11, 2026

Summary

  • Adds four new RL training operation endpoints for the custom losses feature:
    • POST /rl/training-sessions/{session_id}/operations/custom-forward-backward — submit a forward-backward pass driven by externally computed log-prob gradients
    • GET /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id} — poll status/result
    • POST /rl/training-sessions/{session_id}/operations/forward — submit a no-grad forward pass to retrieve per-token log-probabilities
    • GET /rl/training-sessions/{session_id}/operations/forward/{operation_id} — poll status/result
  • Adds corresponding schema definitions: RL.CustomForwardBackwardBody, RL.CustomForwardBackwardOperation, RL.CustomForwardBackwardResult, RL.ForwardBody, RL.ForwardOperation, RL.ForwardResult, RL.TargetLogprobs, RL.TargetLogprobGradients
  • All new schemas follow the existing RL.* naming convention and OpenAPI 3.1 style of the file

Companion PR in together-shaping: https://github.com/togethercomputer/together-shaping/pull/3333

Test plan

  • Verify new path blocks render correctly in the OpenAPI viewer
  • Confirm all $ref targets resolve (no dangling references)
  • Check that RL.DType (referenced by RL.TargetLogprobGradients) already exists in the file (it does, under the existing RL.DType schema definition)
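The dangling-reference check can be automated with a small walker over the spec. A sketch, assuming the OpenAPI document has been loaded into a Python dict (e.g. via `json.load`, or a YAML parser for a YAML spec); it collects every local `$ref` and resolves each one as a JSON Pointer:

```python
import json


def collect_refs(node, refs):
    """Recursively gather every $ref string in the document."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                refs.append(value)
            else:
                collect_refs(value, refs)
    elif isinstance(node, list):
        for item in node:
            collect_refs(item, refs)


def dangling_refs(spec: dict) -> list[str]:
    """Return local $refs that do not resolve to a node in the spec."""
    refs: list[str] = []
    collect_refs(spec, refs)
    missing = []
    for ref in refs:
        if not ref.startswith("#/"):
            continue  # only check document-local references
        node = spec
        resolved = True
        for part in ref[2:].split("/"):
            # JSON Pointer unescaping (RFC 6901): ~1 -> /, then ~0 -> ~
            part = part.replace("~1", "/").replace("~0", "~")
            if isinstance(node, dict) and part in node:
                node = node[part]
            else:
                resolved = False
                break
        if not resolved:
            missing.append(ref)
    return missing
```

Running `dangling_refs` on the updated spec should return an empty list if every `$ref` target (including `RL.DType`) resolves.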

🤖 Generated with Claude Code

Adds four new RL training operation endpoints:
- POST /rl/training-sessions/{session_id}/operations/custom-forward-backward
- GET  /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id}
- POST /rl/training-sessions/{session_id}/operations/forward
- GET  /rl/training-sessions/{session_id}/operations/forward/{operation_id}

Also adds the corresponding schema definitions:
RL.CustomForwardBackwardBody, RL.CustomForwardBackwardOperation,
RL.CustomForwardBackwardResult, RL.ForwardBody, RL.ForwardOperation,
RL.ForwardResult, RL.TargetLogprobs, RL.TargetLogprobGradients.

github-actions Bot commented May 11, 2026

✱ Stainless preview builds for togetherai

This PR will update the togetherai SDKs with the following commit messages.

go

chore(internal): regenerate SDK with no functional changes

openapi

feat(api): add forward and custom-forward-backward operations to RL training sessions

python

chore(internal): regenerate SDK with no functional changes

terraform

chore(internal): regenerate SDK with no functional changes

typescript

chore(internal): regenerate SDK with no functional changes

Edit this comment to update them; they will appear in each SDK's changelog.

togetherai-openapi studio · code · diff

Your SDK build had at least one new note diagnostic, which is a regression from the base state.
generate ✅

New diagnostics (4 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/custom-forward-backward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id}`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/forward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/forward/{operation_id}`
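These notes indicate the new routes must be registered in the Stainless config before SDK methods are generated for them. A hypothetical sketch of what such an entry might look like; the exact schema, nesting, and method names below are assumptions and should be checked against the Stainless configuration reference:

```yaml
# Sketch only: resource layout and method names are assumptions.
resources:
  rl:
    subresources:
      training_sessions:
        subresources:
          operations:
            methods:
              custom_forward_backward: post /rl/training-sessions/{session_id}/operations/custom-forward-backward
              retrieve_custom_forward_backward: get /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id}
              forward: post /rl/training-sessions/{session_id}/operations/forward
              retrieve_forward: get /rl/training-sessions/{session_id}/operations/forward/{operation_id}
```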
togetherai-go studio · code · diff

Your SDK build had at least one new note diagnostic, which is a regression from the base state.
generate ✅ → build ⏭️ (prev: ✅) → lint ✅ → test ❗

go get github.com/stainless-sdks/togetherai-go@39b6984abe29b46bc12019699639842737cbb83f
New diagnostics (4 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/custom-forward-backward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id}`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/forward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/forward/{operation_id}`
togetherai-python studio · code · diff

Your SDK build had at least one new note diagnostic, which is a regression from the base state.
generate ⚠️ → build ⏭️ (prev: ✅) → lint ⏭️ (prev: ✅) → test ⏭️

New diagnostics (4 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/custom-forward-backward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id}`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/forward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/forward/{operation_id}`
togetherai-typescript studio · code · diff

Your SDK build had at least one new note diagnostic, which is a regression from the base state.
generate ⚠️ → build ⏭️ (prev: ✅) → lint ⏭️ (prev: ✅) → test ✅

New diagnostics (4 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/custom-forward-backward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id}`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/forward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/forward/{operation_id}`
togetherai-terraform studio · code · diff

Your SDK build had at least one new note diagnostic, which is a regression from the base state.
generate ✅ → lint ✅ → test ✅

New diagnostics (4 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/custom-forward-backward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/custom-forward-backward/{operation_id}`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /rl/training-sessions/{session_id}/operations/forward`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `get /rl/training-sessions/{session_id}/operations/forward/{operation_id}`

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-05-11 18:57:01 UTC
