Skip to content

Remove durabletask-protobuf submodule and vendor the proto file#130

Open
YunchuWang wants to merge 2 commits intomicrosoft:mainfrom
YunchuWang:remove-protobuf-submodule
Open

Remove durabletask-protobuf submodule and vendor the proto file#130
YunchuWang wants to merge 2 commits intomicrosoft:mainfrom
YunchuWang:remove-protobuf-submodule

Conversation

@YunchuWang
Copy link
Copy Markdown
Member

Issue describing the changes in this PR

This PR mirrors microsoft/durabletask-java#207 for the Go SDK. It removes the submodules/durabletask-protobuf git submodule (the submodule was complex to maintain) and vendors the proto file directly with a helper script for refreshing it from upstream.

Main changes

  • Remove the legacy submodules/durabletask-protobuf git submodule (deletes .gitmodules and the submodule entry).
  • Vendor orchestrator_service.proto at internal/durabletask-protobuf/protos/orchestrator_service.proto. The file is byte-for-byte identical to the proto at the commit the submodule was pinned to (4207e1d), so the committed internal/protos/*.pb.go files remain in sync and existing builds stay deterministic.
  • Add internal/durabletask-protobuf/PROTO_SOURCE_COMMIT_HASH to track the upstream commit hash that the vendored proto was synced from (initialized to 4207e1dbd14cedc268f69c3befee60fcaad19367).
  • Add internal/durabletask-protobuf/update-proto.sh, a helper script that mirrors the Gradle downloadProtoFiles task from the Java repo. It resolves the latest commit SHA on a configurable branch (default main), downloads orchestrator_service.proto at that exact SHA (so the recorded hash always matches the downloaded file), and writes the SHA to PROTO_SOURCE_COMMIT_HASH. After running the script, regenerate the bindings with protoc and commit the updated .pb.go files.
  • Add internal/durabletask-protobuf/README.md documenting the directory layout and update workflow.
  • Update .github/workflows/pr-validation.yml to drop submodules: true from the checkout step and point protoc -I at the new vendored location (./internal/durabletask-protobuf/protos).
  • Update README.md to remove the --recurse-submodules cloning instructions and update the protoc example to point at the new path.
  • Update CHANGELOG.md with an Unreleased entry.

Why not auto-update from main in CI (deviation from Java PR)?

The Java PR wires the Gradle downloadProtoFiles task into every build so CI always pulls the latest proto from main. That works for Java because the Java sources are regenerated from the proto on every build and aren't committed.

The Go SDK is different: the generated internal/protos/*.pb.go files are committed (so go build and the Docker build don't need protoc). Auto-pulling main in CI would mean PRs are validated against a different proto than go build/Docker consume, which makes builds non-reproducible and risks compile failures on unrelated PRs whenever upstream proto evolves.

To keep builds deterministic, this PR keeps the vendored proto as the source of truth in CI. Updating to a newer upstream proto is a maintainer-driven action: run ./internal/durabletask-protobuf/update-proto.sh, regenerate the .pb.go files with protoc, run the tests, and commit all three (proto + PROTO_SOURCE_COMMIT_HASH + .pb.go) together.

Validation

  • go build ./... passes.
  • go vet ./... passes.
  • protoc --go_out=. --go-grpc_out=. -I ./internal/durabletask-protobuf/protos orchestrator_service.proto regenerates internal/protos/*.pb.go cleanly (only diff vs. committed is protoc-gen-go version skew).
  • ./internal/durabletask-protobuf/update-proto.sh main end-to-end smoke test downloaded the latest proto and updated the hash file correctly.
  • Pre-existing test suite behavior on Windows is unchanged (same tests fail on this branch as on main).

Pull request checklist

  • My changes do not require documentation changes (the documentation updates are part of this PR).
  • My changes are added to the CHANGELOG.md.
  • I have added all required tests (no new behavior; existing tests cover the gRPC bindings).

Mirrors microsoft/durabletask-java#207 for the Go SDK.

* Removes the submodules/durabletask-protobuf git submodule.

* Vendors orchestrator_service.proto under internal/durabletask-protobuf/protos/ at the previously pinned commit (4207e1d) so the existing committed .pb.go files remain in sync and CI builds stay deterministic.

* Adds internal/durabletask-protobuf/PROTO_SOURCE_COMMIT_HASH to track the upstream source commit.

* Adds internal/durabletask-protobuf/update-proto.sh, a helper script that resolves the commit SHA on a branch, downloads orchestrator_service.proto at that exact SHA, and writes the SHA to PROTO_SOURCE_COMMIT_HASH.

* Adds internal/durabletask-protobuf/README.md documenting the workflow.

* Updates the PR validation workflow to drop `submodules: true` and point the `protoc -I` path at the vendored location.

* Updates README.md and CHANGELOG.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 1, 2026 16:54
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes the durabletask-protobuf git submodule and replaces it with a vendored orchestrator_service.proto plus a small helper workflow to keep the vendored proto (and generated Go bindings) deterministic and easy to refresh.

Changes:

  • Added a vendored orchestrator_service.proto under internal/durabletask-protobuf/protos/ and a PROTO_SOURCE_COMMIT_HASH file to track upstream provenance.
  • Added internal/durabletask-protobuf/update-proto.sh and documentation (internal/durabletask-protobuf/README.md) describing how to refresh the vendored proto and regenerate .pb.go files.
  • Updated CI and repository docs/changelog to remove submodule assumptions and point protoc at the vendored proto location.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
internal/durabletask-protobuf/update-proto.sh Adds a script to download the upstream proto at an exact commit SHA and update the tracked hash.
internal/durabletask-protobuf/protos/orchestrator_service.proto Vendors the proto definition previously sourced via submodule.
internal/durabletask-protobuf/README.md Documents directory layout and update workflow for the vendored proto.
internal/durabletask-protobuf/PROTO_SOURCE_COMMIT_HASH Records the upstream commit SHA for the currently vendored proto.
README.md Updates cloning/protoc instructions to reflect removal of submodules and new proto location.
CHANGELOG.md Notes submodule removal and new vendoring/update workflow under Unreleased.
.gitmodules Removes the durabletask-protobuf submodule entry.
.github/workflows/pr-validation.yml Removes submodule checkout and updates protoc -I path to vendored proto directory.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +36 to +39
COMMIT_HASH=$(printf '%s' "${COMMIT_RESPONSE}" \
| grep -o '"sha"[[:space:]]*:[[:space:]]*"[^"]*"' \
| head -n 1 \
| sed -E 's/.*"sha"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/')
Copy link

Copilot AI May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With set -euo pipefail, the grep | head | sed pipeline that computes COMMIT_HASH will cause the script to exit immediately if grep finds no matches (exit status 1). That prevents the later COMMIT_HASH validation block from running and makes failures (rate-limits, bad branch, API response changes) harder to diagnose. Consider making the extraction step non-fatal (e.g., handle the pipeline exit status explicitly or neutralize grep’s non-match exit) so you can emit a clear error message.

Suggested change
COMMIT_HASH=$(printf '%s' "${COMMIT_RESPONSE}" \
| grep -o '"sha"[[:space:]]*:[[:space:]]*"[^"]*"' \
| head -n 1 \
| sed -E 's/.*"sha"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/')
# Use sed so a missing match leaves COMMIT_HASH empty without exiting early
# under `set -euo pipefail`; the validation block below will emit the error.
COMMIT_HASH=$(printf '%s' "${COMMIT_RESPONSE}" \
| sed -nE 's/.*"sha"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/p' \
| head -n 1)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 86e76d7 — replaced the grep | head | sed pipeline with sed -nE '...'p | head -n 1 (returns empty string instead of failing under set -euo pipefail). The validation block now emits a clear error and dumps the API response on failure.

Comment thread .github/workflows/pr-validation.yml Outdated

- name: Generate grpc code
run: protoc --go_out=. --go-grpc_out=. -I ./submodules/durabletask-protobuf/protos orchestrator_service.proto
run: protoc --go_out=. --go-grpc_out=. -I ./internal/durabletask-protobuf/protos orchestrator_service.proto
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about vendored?

Suggested change
run: protoc --go_out=. --go-grpc_out=. -I ./internal/durabletask-protobuf/protos orchestrator_service.proto
run: protoc --go_out=. --go-grpc_out=. -I ./vendored/durabletask-protobuf/protos orchestrator_service.proto

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 86e76d7 — directory renamed to vendored/durabletask-protobuf (workflow, README, CHANGELOG, and script paths all updated).

@@ -0,0 +1 @@
4207e1dbd14cedc268f69c3befee60fcaad19367 No newline at end of file
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put the full URL, including the GitHub organization, repo, and branch name so that we know what this hash applies to?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 86e76d7PROTO_SOURCE_COMMIT_HASH is now a 4-line structured file:

Source: https://github.com/microsoft/durabletask-protobuf
Branch: main
Commit: 4207e1dbd14cedc268f69c3befee60fcaad19367
URL:    https://github.com/microsoft/durabletask-protobuf/blob/4207e1dbd14cedc268f69c3befee60fcaad19367/protos/orchestrator_service.proto

update-proto.sh writes this format automatically.

- Rename internal/durabletask-protobuf -> vendored/durabletask-protobuf (cgillum)
- Expand PROTO_SOURCE_COMMIT_HASH to include Source/Branch/Commit/URL (cgillum)
- Replace 'grep -o | head | sed' pipeline in update-proto.sh with single
  'sed -nE ...p | head -n 1', so it doesn't abort silently under
  'set -euo pipefail' when the pattern is absent (Copilot bot)
- update-proto.sh now accepts any ref (branch/tag/SHA) and dumps the API
  response when no commit can be resolved
- Add .gitattributes (*.sh text eol=lf) so Windows clones don't
  reintroduce CRLF into update-proto.sh

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants