Fix spelling suggestions involving Unicode characters #4232

Merged: ahejlsberg merged 2 commits into main from fix-4223 on Jun 6, 2026

Conversation

@ahejlsberg
Member

Fixes #4223.

Copilot AI review requested due to automatic review settings June 6, 2026 21:27
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a regression test for issue #4223 and adjusts the compiler’s spelling-suggestion heuristic so “Did you mean …” suggestions behave correctly when the misspelled identifier contains non-ASCII characters.

Changes:

  • Added a new compiler regression test covering Unicode identifiers that previously produced divergent TS1435 “Did you mean …” suggestions.
  • Updated GetSpellingSuggestion heuristics to base length thresholds on Unicode-aware length for the input name.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| internal/core/core.go | Adjusts spelling-suggestion length heuristics to use a Unicode-aware length for the target name. |
| testdata/tests/cases/compiler/unicodeSpellingSuggestions.ts | Adds a regression test for Unicode spelling suggestions (issue #4223). |
| testdata/baselines/reference/compiler/unicodeSpellingSuggestions.errors.txt | Captures expected diagnostic output for the new test. |
| testdata/baselines/reference/compiler/unicodeSpellingSuggestions.types | Captures expected type baseline output for the new test. |
| testdata/baselines/reference/compiler/unicodeSpellingSuggestions.symbols | Captures expected symbol baseline output for the new test. |

Comments suppressed due to low confidence (1)

internal/core/core.go:580

  • GetSpellingSuggestion now computes maximumLengthDifference/bestDistance using rune count for name, but the candidate-length filter still uses len(candidateName) (byte count). For candidates containing non-ASCII characters this mixes units and can incorrectly skip valid suggestions. It also redundantly converts candidateName to []rune for Levenshtein.
	runeName := []rune(name)
	maximumLengthDifference := max(2, int(float64(len(runeName))*0.34))
	bestDistance := math.Floor(float64(len(runeName))*0.4) + 0.9 // If the best result is worse than this, don't bother.
	buffers := levenshteinBuffersPool.Get().(*levenshteinBuffers)
	defer levenshteinBuffersPool.Put(buffers)
	var bestCandidate T
	hasBest := false
	for candidate := range candidates {
		candidateName := getName(candidate)
		maxLen := max(len(candidateName), len(runeName))
		minLen := min(len(candidateName), len(runeName))
		if candidateName != "" && maxLen-minLen <= maximumLengthDifference {
			if candidateName == name {
				continue
			}
			// Only consider candidates less than 3 characters long when they differ by case.
			// Otherwise, don't bother, since a user would usually notice differences of a 2-character name.
			if len(candidateName) < 3 && !strings.EqualFold(candidateName, name) {
				continue
			}
			distance := levenshteinWithMax(buffers, runeName, []rune(candidateName), bestDistance)
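
The unit mix described in the suppressed comment can be reproduced in isolation. The sketch below is an illustration only, under the assumption that the length filter behaves as in the snippet quoted above. The function names `lengthFilterMixed` and `lengthFilterRunes` and the sample strings are hypothetical, and the second function is one possible way to keep both lengths in the same unit, not necessarily what the repository does.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengthFilterMixed mirrors the comparison the comment describes: the name
// is measured in runes while the candidate is measured in bytes.
func lengthFilterMixed(name, candidate string) bool {
	nameLen := utf8.RuneCountInString(name)
	maxDiff := max(2, int(float64(nameLen)*0.34))
	candLen := len(candidate) // byte count
	return max(candLen, nameLen)-min(candLen, nameLen) <= maxDiff
}

// lengthFilterRunes measures both strings in runes (hypothetical variant).
func lengthFilterRunes(name, candidate string) bool {
	nameLen := utf8.RuneCountInString(name)
	maxDiff := max(2, int(float64(nameLen)*0.34))
	candLen := utf8.RuneCountInString(candidate)
	return max(candLen, nameLen)-min(candLen, nameLen) <= maxDiff
}

func main() {
	// The candidate differs from the name by a single appended character.
	name, candidate := "変数名", "変数名前"

	fmt.Println(lengthFilterMixed(name, candidate)) // false: 12 bytes vs 3 runes exceeds the limit of 2
	fmt.Println(lengthFilterRunes(name, candidate)) // true: 4 runes vs 3 runes is within the limit
}
```

With the mixed comparison, a candidate that is one character longer than the name is rejected before the Levenshtein distance is ever computed, which is the kind of skipped suggestion the comment warns about.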

@ahejlsberg added this pull request to the merge queue Jun 6, 2026
Merged via the queue into main with commit 3e477ab Jun 6, 2026
22 checks passed
@ahejlsberg deleted the fix-4223 branch June 6, 2026 22:25

Development

Successfully merging this pull request may close these issues.

Spelling suggestions ('Did you mean') diverge for non-ASCII identifiers

3 participants