v1.0.0 by cyclone-github · Pull Request #15 · cyclone-github/spider

cyclone-github · 2026-05-22T01:46:06Z

added flag "-text-match" to filter page text matches
memory and performance optimizations for -file and -url modes
-file mode streams wordlists from disk instead of loading entire files into RAM
reduced RAM usage for large -sort wordlists
default -timeout increased from 1 to 10 seconds
progress bars, stats, and errors now write to stderr
sanitize url fragments for dedup and extension checks
updated default User-Agent
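
A rough sketch of what stripping URL fragments before dedup and extension checks could look like. This is not the code from this PR: `stripFragment` and `extOf` are hypothetical helper names, and only the standard `net/url` and `path` packages are assumed.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// stripFragment (hypothetical helper) returns rawURL without its "#fragment"
// part, so that links differing only by fragment collapse to one dedup key.
func stripFragment(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	u.Fragment = ""
	u.RawFragment = ""
	return u.String(), nil
}

// extOf (hypothetical helper) returns the lowercased extension of the URL
// path only, so a trailing fragment cannot hide or alter the extension.
func extOf(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	return strings.ToLower(path.Ext(u.Path)), nil
}

func main() {
	links := []string{
		"https://example.com/docs/page.html#intro",
		"https://example.com/docs/page.html#usage",
		"https://example.com/files/report.PDF#page=3",
	}
	seen := make(map[string]bool)
	for _, link := range links {
		key, err := stripFragment(link)
		if err != nil || seen[key] {
			continue // skip unparsable links and duplicates
		}
		seen[key] = true
		ext, _ := extOf(link)
		fmt.Println(key, ext)
	}
}
```

With the example links above, the two `page.html` entries share one key, so only two URLs would be visited.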

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 28de35e28d


chatgpt-codex-connector · 2026-05-22T01:50:28Z

+func countNgramsFromStream(r io.Reader, fileSize int64, ngramMin, ngramMax int, uniqueWords map[string]bool, ngramCounts map[string]int, trackUnique bool, progress func(processed, total int)) error {
+	cr := &countingReader{r: r}
+	scanner := bufio.NewScanner(cr)
+	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)


Remove hard 1MB word cap in file streaming mode

Calling scanner.Buffer(..., 1024*1024) on the bufio.Scanner caps the token size at 1MB in -file mode. Any input containing a single whitespace-delimited token larger than that (for example long base64 blobs, minified assets, or machine-generated text) now fails with "bufio.Scanner: token too long" and exits. This is a regression from the previous os.ReadFile + strings.Fields path, which imposed no per-token cap, and it can unexpectedly break processing of real-world large text.


v1.0.0

28de35e

cyclone-github self-assigned this May 22, 2026

cyclone-github added the "bug" and "enhancement" labels May 22, 2026

cyclone-github merged commit 717d460 into main May 22, 2026
4 checks passed

cyclone-github deleted the dev branch May 22, 2026 01:48

chatgpt-codex-connector Bot reviewed May 22, 2026

cyclone-github merged 1 commit into main from dev
