feat: add AI skill to find and improve the Pythonic interface to functions by timsaucer · Pull Request #1484 · apache/datafusion-python

timsaucer · 2026-04-09T15:51:32Z

Which issue does this PR close?

None

Rationale for this change

This adds an AI agent skill that can be used to search the repository and identify cases where we can make our interface more intuitive to users. Also attached is the diff the skill recommended when used together with our existing agent directives about how to write functions.

What changes are included in this PR?

Add a skill for searching the repository for functions, investigating their upstream equivalents, and updating the function inputs where appropriate.

I ran the skill and updated many function signatures.

Are there any user-facing changes?

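Improved type hints, and more functions now accept native Python values (int, float, str) where a lit() wrapper was previously required. All changes are backward compatible.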

…uiring lit() Update 47 functions in functions.py to accept native Python types (int, float, str) for arguments that are contextually literals, eliminating verbose lit() wrapping. For example, users can now write split_part(col("a"), ",", 2) instead of split_part(col("a"), lit(","), lit(2)). All changes are backward compatible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ions Update instr and position (aliases of strpos) to accept Expr | str for the substring parameter, matching the updated primary function signature. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Alias functions that delegate to a primary function must have their type hints updated to match, even though coercion logic is only added to the primary. Added a new Step 3 to the implementation workflow for this. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
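The alias rule from the two commits above can be sketched like this. It is an illustrative sketch under assumptions: Expr is again a hypothetical stand-in rather than datafusion's class, and the function bodies only build a string so the behaviour is visible without the library.

```python
from __future__ import annotations


class Expr:
    """Hypothetical stand-in for datafusion's Expr, for illustration only."""

    def __init__(self, text: str) -> None:
        self.text = text

    @staticmethod
    def literal(value: object) -> Expr:
        return Expr(f"lit({value!r})")


def strpos(string: Expr, substring: Expr | str) -> Expr:
    """Primary function: the only place where coercion happens."""
    if not isinstance(substring, Expr):
        substring = Expr.literal(substring)
    return Expr(f"strpos({string.text}, {substring.text})")


def instr(string: Expr, substring: Expr | str) -> Expr:
    """Alias: the type hint is widened to match, the body just delegates."""
    return strpos(string, substring)


def position(string: Expr, substring: Expr | str) -> Expr:
    """Alias: same widened hint, no coercion logic of its own."""
    return strpos(string, substring)


via_alias = instr(Expr("col('a')"), "b")
via_primary = strpos(Expr("col('a')"), Expr.literal("b"))
```

Without the widened hints, the aliases would still work at runtime, but type checkers would reject a plain str passed to instr or position.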

timsaucer · 2026-04-09T16:04:25Z

Since @kevinjqliu asked about it regarding the last skill I wrote, here is an export of the chat history:
chat-export-2026-04-09.md

One of the important things to note in the chat history is that I intentionally exited the session and started a new one so that the skill would be applied without prior context. Then, as I reviewed the code it generated, I gave it feedback that it had missed the aliases, and I had the agent update the skill it was using.

I've found that this has to be an iterative process. The next step I'm going to take is to start a fresh session and have it review this PR, both the skill and the updates it makes. I'll keep iterating on the process of having the agent and myself review the code suggestions and update the skill.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adds a new AI “make-pythonic” skill and applies its recommendations to make datafusion-python function wrappers accept native Python literals (e.g., str, int, float) in places where callers previously had to wrap values with lit().

Changes:

Added a reusable AI skill definition documenting how to audit and “pythonicize” function signatures.
Updated many python/datafusion/functions.py APIs to accept native types and internally coerce them to Expr.literal(...).
Updated doctest examples to demonstrate the simplified calling convention.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

File	Description
python/datafusion/functions.py	Broad signature + coercion updates to accept native Python types; doctest examples updated accordingly.
.ai/skills/make-pythonic/SKILL.md	New skill documentation describing how to identify and implement pythonic argument coercions.


Copilot · 2026-04-09T16:13:46Z