feat: add phoneme synthesis capabilities and enhance TTS functionality #53

Open
stretchyboy wants to merge 2 commits into ilesinge:master from stretchyboy:phonemes

@stretchyboy

  • Introduced new phoneme synthesis methods in Dj class: speak_phoneme and speak_phoneme_chunk for generating audio samples from ARPAbet phonemes.
  • Implemented phoneme chunking logic to maintain word boundaries during synthesis.
  • Added a new phonemize module for ARPAbet to IPA conversion and phoneme chunking.
  • Updated web routes to handle phoneme generation requests and caching of responses.
  • Enhanced HTML templates with versioning for asset management.
  • Added tests for sentence chunk planning to ensure correct isolation of words during synthesis.
  • Included pronouncing library for stress pattern extraction from ARPAbet strings.
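A minimal sketch of what chunking on word boundaries and stress extraction could look like. This is illustrative only and is not the code in this PR: the function names `stress_pattern` and `chunk_words`, the `max_phonemes` parameter, and the input shape (one ARPAbet string per word) are all assumptions made for the example.

```python
import re


def stress_pattern(arpabet: str) -> str:
    """Return the stress digits found in an ARPAbet string.

    Hypothetical helper: ARPAbet vowels carry a trailing 0, 1, or 2,
    so collecting those digits in order gives the stress pattern.
    """
    return "".join(re.findall(r"[A-Z]+([012])", arpabet))


def chunk_words(words, max_phonemes=8):
    """Group per-word ARPAbet strings into chunks without splitting a word.

    Hypothetical helper: `words` is a list with one ARPAbet string per word.
    A word longer than `max_phonemes` still ends up whole in its own chunk.
    """
    chunks, current, count = [], [], 0
    for word in words:
        n = len(word.split())
        # Close the current chunk if this word would push it past the limit.
        if current and count + n > max_phonemes:
            chunks.append(current)
            current, count = [], 0
        current.append(word)
        count += n
    if current:
        chunks.append(current)
    return chunks


words = ["HH AH0 L OW1", "W ER1 L D", "AH0 G EH1 N"]
chunks = chunk_words(words, max_phonemes=8)
patterns = [stress_pattern(w) for w in words]
```

With the sample input, the first two words (four phonemes each) share a chunk and the third starts a new one, so no word is ever cut in the middle.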

@ilesinge

Owner

OK that's really cool @stretchyboy! Thanks for this improvement suggestion, it speaks to me :)
For disclosure purposes, did you use AI? (Not that I would reject it, but I would review the PR more carefully.)
Did you deploy any test version somewhere public (that I could test) or is it just local? Could you share a screenshot?
I'll test it locally though if no public test version is deployed.
Thanks again!

@ilesinge ilesinge self-assigned this Jun 15, 2026
@ilesinge ilesinge added the enhancement New feature or request label Jun 15, 2026
@stretchyboy

stretchyboy commented Jun 16, 2026 via email

Author

@ilesinge

Owner

Sorry to hear about your hip, get well soon!
I'll look at it all when I have a bit more time, but I confirm it seems super cool. In the meantime, if you have more info to share, don't hesitate.
Alex

@stretchyboy

stretchyboy commented Jun 17, 2026

Author

I don't seem to have understood how the branches work with pull requests and it's accidentally sucked in a separate change on a totally different project. https://codeberg.org/uzu/strudel/pulls/2059/commits/aac01e4293da74d490404e739ff8048adc142a59. How do I take this back out?

@stretchyboy

Author

> I don't seem to have understood how the branches work with pull requests and it's accidentally sucked in a separate change on a totally different project. https://codeberg.org/uzu/strudel/pulls/2059/commits/aac01e4293da74d490404e739ff8048adc142a59. How do I take this back out?

I think I have now fixed that.

@stretchyboy stretchyboy marked this pull request as draft June 17, 2026 14:12
@stretchyboy

Author

I'm putting it up on Codespaces for more testing and have hit a few issues.

@stretchyboy

Author

I can't work out how to do the TTS API authorization on Codespaces, but the sentence chunking is working here:
https://bug-free-system-g5w6w7wjwc96vq-5000.app.github.dev/

@stretchyboy stretchyboy marked this pull request as ready for review June 17, 2026 14:59