From 564b7aa13856cad583c203ef62569f4e5a4bfce3 Mon Sep 17 00:00:00 2001
From: Philipp Dubach
Date: Fri, 1 May 2026 18:52:34 +0200
Subject: [PATCH] Add Detection Guidance: false positives, human-writing signs,
 LLM idiolects
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Most of this skill tells the editor what to remove. This adds the inverse —
what to leave alone, and how to decide. Sourced from Wikipedia: Signs of AI
writing (revision fetched 2026-05-01), specifically the "Ineffective
indicators", "Signs of human writing", and "Differences between LLMs"
sections.

Three subsections, no new patterns:

- "What NOT to flag (false positives)" — the indicators that look AI-coded
  but are actually neutral (perfect grammar, em dashes alone, curly quotes
  alone, formal vocabulary, common transition words). The over-editing risk
  is real: if the skill is applied too aggressively, it strips legitimate
  prose. Closes with the "clusters matter, isolated signs don't" rule.

- "Signs of human writing (preserve these)" — positive markers that should
  be left untouched: specific detail, mixed feelings, era-bound references,
  sentence-length variation, parenthetical self-corrections, and the
  November 30, 2022 cutoff for ruling out AI involvement entirely.

- "LLM Idiolects" — quick triage notes per model family (ChatGPT/Grok
  verbose with artifacts; Gemini/Claude concise, no curly quotes by
  default). Tendencies, not rules.

No pattern-count change. No README changes (the README's pattern table is
unaffected since this section is meta-guidance, not new patterns). No
version bump.
---
 SKILL.md | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/SKILL.md b/SKILL.md
index 46639f02..7f2bf954 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -461,6 +461,51 @@ Avoiding AI patterns is only half the job. Sterile, voiceless writing is just as
 > 
 > When users hit a slow page, they leave. 
+
+## DETECTION GUIDANCE
+
+### What NOT to flag (false positives)
+
+A clean human writer can hit several of the patterns above without any AI involvement. Before rewriting, sanity-check that you are not gutting legitimate prose. The following are *not* reliable indicators on their own:
+
+- **Perfect grammar and consistent style.** Many writers are professionals or have been edited. Polish does not equal AI.
+- **Mixed casual and formal registers.** This often signals a person in a technical field, a young writer, or someone with neurodivergent prose habits — not a chatbot.
+- **"Bland" or "robotic" prose.** AI prose has *specific* tells. Generic dryness without those tells is just dry writing.
+- **Formal or academic vocabulary.** AI overuses *specific* fancy words (see §7), not all fancy words. Don't flatten "ostensibly" or "constituent" just because they sound brainy.
+- **Letter-style opening or closing on a comment.** Salutations and sign-offs predate ChatGPT by centuries.
+- **Common transition words in isolation.** *Additionally*, *moreover*, *consequently* are AI-coded only when piled up. One *however* is not a tell.
+- **Curly quotes alone.** macOS, Word, Google Docs, and most CMSes auto-curl by default. Curly quotes only count when stacked with other tells.
+- **Em dashes alone.** Many editors and journalists use them often. Em dashes are evidence only when paired with formulaic sales-y rhythm.
+- **Unsourced claims.** Most of the web is unsourced. Lack of citations doesn't prove anything.
+- **Correct, complex formatting.** Visual editors and templates produce clean output without any AI.
+
+When in doubt, look for **clusters** of tells, not isolated ones. A single em dash means nothing; em dashes plus rule-of-three plus *vibrant tapestry* plus a "Conclusion" section is a confession.
+
+
+### Signs of human writing (preserve these)
+
+When you see these, lean toward leaving the prose alone — they are evidence of a real person writing, and over-editing will destroy what makes the piece sound human:
+
+- **Specific, unusual, hard-to-fabricate detail.** A real address. A weird quote. The phrase "the lawyer who used to work upstairs from my dentist." LLMs round off specifics; humans hoard them.
+- **Mixed feelings and unresolved tension.** "I think this is mostly good, but it bothers me, and I can't fully explain why." LLMs default to clean takes.
+- **Dated, era-bound references.** Slang, memes, or in-jokes that map to a specific year and subculture. Models lag by a year or more.
+- **First-person editorial choices the writer can defend.** If the writer can explain *why* they made a particular cut or used a particular word, that's a strong human signal.
+- **Variety in sentence length.** Real writing alternates short and long. AI writing tends toward an even, mid-length cadence.
+- **Genuine asides, parentheticals, or self-corrections.** "(I keep wanting to say 'almost' here, but it really was certain.)" Models rarely interrupt themselves like this.
+- **Edits made before November 30, 2022.** ChatGPT's public launch. Anything older than that is, with very rare exceptions, not AI-written.
+
+
+### LLM Idiolects (which model wrote this?)
+
+Each model family writes a little differently. Useful when triaging a suspected passage:
+
+- **ChatGPT (GPT-4 / 4o / 5):** Most prevalent. Heavy on broader-context throat-clearing, "evolving landscape," media-coverage padding. Most likely to leave reference-markup artifacts. Most likely to use em dashes (suppressed in 5.1 but still leaks through).
+- **Grok:** Similar to ChatGPT in verbosity and broader-context framing. Leaves `` tags and `referrer=grok.com`.
+- **Gemini (1.5–3 Pro):** More concise than ChatGPT. Avoids curly quotes by default. Less prone to "broader trends" puffery.
+- **Claude (3.5–Opus 4.x):** Concise. Avoids curly quotes by default. Tends toward direct expository style; less likely to insert "It's important to note that..." but can fall into rule-of-three and inline-header lists when doing structured output.
+
+These are tendencies, not rules. All four families produce all the patterns in this guide given the right prompt.
+
 
 ---
 
 ## Process