v1.0.0 #15

Merged
cyclone-github merged 1 commit into main from dev
May 22, 2026

Conversation

@cyclone-github
Owner

  • added flag "-text-match" to filter page text matches
  • memory and performance optimizations for -file and -url modes
  • -file mode streams wordlists from disk instead of loading entire files into RAM
  • reduced RAM usage for large -sort wordlists
  • default -timeout increased from 1 to 10 seconds
  • progress bars, stats, and errors now write to stderr
  • sanitized URL fragments for dedup and extension checks
  • updated default User-Agent
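The fragment handling mentioned above could look roughly like the following. This is only a sketch, not the code from this PR: `dedupKeyAndExt` is a hypothetical helper name, and it assumes that dedup keys and extension checks are both derived from the parsed URL with the `#fragment` part dropped.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// dedupKeyAndExt is a hypothetical helper (not taken from spider.go).
// It strips the fragment so "page.html" and "page.html#top" share one
// dedup key, and it reads the extension from the URL path only, so a
// trailing "#page=2" or "?x=1" cannot hide the real extension.
func dedupKeyAndExt(raw string) (key, ext string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	u.Fragment = ""
	u.RawFragment = ""
	return u.String(), strings.ToLower(path.Ext(u.Path)), nil
}

func main() {
	seen := make(map[string]bool)
	for _, raw := range []string{
		"https://example.com/docs/guide.html",
		"https://example.com/docs/guide.html#install",
		"https://example.com/files/report.PDF#page=2",
	} {
		key, ext, err := dedupKeyAndExt(raw)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		if seen[key] {
			fmt.Println("duplicate:", raw)
			continue
		}
		seen[key] = true
		fmt.Println("crawl:", key, "ext:", ext)
	}
}
```

Parsing with `net/url` rather than cutting the string at `#` keeps the query string intact in the dedup key while still ignoring it for the extension check.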

@cyclone-github self-assigned this May 22, 2026
@cyclone-github added the bug and enhancement labels May 22, 2026
@cyclone-github merged commit 717d460 into main May 22, 2026
4 checks passed
@cyclone-github deleted the dev branch May 22, 2026 01:48

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 28de35e28d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread spider.go
func countNgramsFromStream(r io.Reader, fileSize int64, ngramMin, ngramMax int, uniqueWords map[string]bool, ngramCounts map[string]int, trackUnique bool, progress func(processed, total int)) error {
	cr := &countingReader{r: r}
	scanner := bufio.NewScanner(cr)
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)

P2: Remove hard 1MB word cap in file streaming mode

Using bufio.Scanner with scanner.Buffer(..., 1024*1024) imposes a hard 1MB maximum token size in -file mode. Any input containing a single whitespace-delimited token larger than 1MB (for example long base64 blobs, minified assets, or machine-generated text) now fails with "bufio.Scanner: token too long" and exits. This is a regression from the previous os.ReadFile + strings.Fields path, which had no per-token cap, and it can unexpectedly break processing of large real-world text.

Labels

bug (Something isn't working), enhancement (New feature or request)

1 participant