secp256k1: Implement FieldVal64 field element. by valery-osheter-cb · Pull Request #3708 · decred/dcrd

valery-osheter-cb · 2026-06-21T08:55:44Z

Overview

This is the first slice of the FieldVal64 work proposed in #3707: a pure-Go
alternative secp256k1 base-field element that stores the value as 4×uint64
limbs (base 2^64) instead of the existing FieldVal's 10×uint32 (base 2^26).
It stays fully reduced after every operation.
This PR is purely additive — it only adds new files under
dcrec/secp256k1/ and does not touch or alter the existing FieldVal or any
default build. Nothing is wired into the curve/ECDSA/Schnorr paths yet; that
is deferred to a later PR (see #3707 for the full plan).

Crandall reduction

The secp256k1 prime is a Crandall (pseudo-Mersenne) prime,
p = 2^256 - c with c = 0x1000003D1. A 512-bit product is folded modulo p by
multiplying the high 256 bits by the small constant c and adding them back
into the low 256 bits, repeating until the result fits in 256 bits, followed
by a single final conditional subtraction of p. Because c is small this
replaces a general division with a couple of multiply-add passes.

Constant time

The reduction and the final conditional subtraction are constant time
(borrow-mask select; no secret-dependent branches or memory access).

No new dependencies

This PR adds no new module dependencies. The CPU-feature detection and
assembly fast paths discussed in #3707 are intentionally not part of this
PR and will follow separately.

Commits

secp256k1: Add pure Go FieldVal64 (4x64) field element
secp256k1: Add tests for FieldVal64
secp256k1: Add benchmarks for FieldVal64

Testing

Differential tests against a math/big oracle, plus edge cases and
randomized inputs.
go build, go vet, and go test -run Field64 pass under
dcrec/secp256k1.

AI disclosure

AI-assisted.
Refs #3707

davecgh

Thanks for the PR.

Before anything else, I preface this by saying I really appreciate the submission and I know how much work this area of the code can be!

I mention that because this is an extremely critical area of consensus code that is notoriously easy to make a small mistake that goes unnoticed due to the inexhaustible testing surface and, as a result, reviews in this area fairly are critical in nature and particularly vigilant about consistency. So, please don't take the general strictness as being unappreciative.

I'll have to spend some more time reviewing this more meticulously, but to get started, I gave it a once over, and on the positive side, most of the code looks good and tight (aside from a few cases of code that isn't constant time). It makes judicious use of bits.Add64 and bits.Sub64 which weren't available when the code was originally written. That is particularly nice in general because the compiler automatically replaces them with the single instruction hardware intrinsics when it can (which is basically always on even remotely modern hardware). I notice that it also further takes advantage of that fact by arranging for long carry chains which is nice and great for efficiency.

On the negative side, unfortunately, most of the comments look very LLM generated. They basically ignore all of the existing style, and, well, have that distinctive, less than desirable LLM style to them. They do things like boldly claiming some things are true when they aren't and change wording out willy nilly versus the existing comments (e.g. reports versus returns, etc), which is another hallmark of LLMs.

Even worse, it strips almost all of the descriptive and helpful comments and replaces them with ones that really would not at all be helpful for anyone who isn't already intimately familiar with the literature and doesn't already know what it's doing. Not only that, it fails to document extremely important considerations for any consumers that the existing one does such as whether or not they're constant time and the implications of truncation.

As it stands, and I hate to say it since I know it sounds a bit harsh, but it would probably be easier to just scrap pretty much all of the public method comments, because they have very little, if any, redeeming value to them as it stands.

As a case in point, one of the comments is:

//
// Since the secp256k1 prime is ≡ 3 (mod 4), the square root is a^((p+1)/4),
// computed via the same addition chain used by FieldVal26 (254 squarings, 13
// multiplications).
func (f *FieldVal64) SquareRootVal(val *FieldVal64) bool {

That is an exported method and thus public documentation, yet it is referencing internal implementation details that are meaningless to anyone reading the docs given they neither need to know, nor can see, the addition chain used in the 10x26 implementation. What information does the fact it's calculating it as a^((p+1)/4) give to an API consumer?

Another example is:

// Pass 2: subtract p. Since p = 2^256 - C, the low 256 bits of t - p are
// identical to t + C, i.e. the folded result for a 2^256 overflow.

The 'C' there is introduced out of the blue and completely unexplained. I happen to know what it's talking about because I'm intimately familiar with the literature and the different reduction techniques such as Solinas and, in this case Crandall, which involves primes of the form "2^k - c", so I'm sure that is where the LLM pulled it from, but I can guarantee anyone who isn't will have no idea what that comment means.

A third example:

// Inverse sets f = f^(-1) (mod p) via Fermat's little theorem (a^(p-2)).
func (f *FieldVal64) Inverse() *FieldVal64 {

Once again, why would the caller care how it's calculated? How is that in any way useful to them?

davecgh · 2026-06-22T09:10:52Z

Benchmark comparison with benchstat on 10 runs on a Ryzen 7 5800X3D (rebased on #3709):

name                          old time/op    new time/op    delta
FieldNegate                     3.92ns ± 5%    2.49ns ± 4%  -36.60%  (p=0.000 n=10+9)
FieldAdd                        4.31ns ± 2%    3.77ns ± 4%  -12.46%  (p=0.000 n=10+10)
FieldMul                        57.8ns ± 3%    19.0ns ± 2%  -67.11%  (p=0.000 n=10+10)
FieldSqrt                       9.55µs ± 4%    4.03µs ± 1%  -57.76%  (p=0.000 n=10+10)
FieldSquare                     33.9ns ± 2%    14.2ns ± 3%  -58.20%  (p=0.000 n=10+10)
FieldInverse                    9.57µs ± 2%    4.17µs ± 3%  -56.39%  (p=0.000 n=10+10)
FieldIsGtOrEqPrimeMinusOrder    8.69ns ± 4%    0.33ns ± 7%  -96.15%  (p=0.000 n=10+8)

name                          old alloc/op   new alloc/op   delta
FieldNegate                      0.00B          0.00B          ~     (all equal)
FieldAdd                         0.00B          0.00B          ~     (all equal)
FieldMul                         0.00B          0.00B          ~     (all equal)
FieldSqrt                        0.00B          0.00B          ~     (all equal)
FieldSquare                      0.00B          0.00B          ~     (all equal)
FieldInverse                     0.00B          0.00B          ~     (all equal)
FieldIsGtOrEqPrimeMinusOrder     0.00B          0.00B          ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FieldNegate                       0.00           0.00          ~     (all equal)
FieldAdd                          0.00           0.00          ~     (all equal)
FieldMul                          0.00           0.00          ~     (all equal)
FieldSqrt                         0.00           0.00          ~     (all equal)
FieldSquare                       0.00           0.00          ~     (all equal)
FieldInverse                      0.00           0.00          ~     (all equal)
FieldIsGtOrEqPrimeMinusOrder      0.00           0.00          ~     (all equal)

valery-osheter-cb · 2026-06-22T10:23:08Z

Thank you for your review.
I absolutely agree with your unflattering assessment of LLM.
I implemented saturated Field64 in both pure Go and ASM about one year ago without AI.
Recently I had an Idea to publish it, and I saw the best way is to make it as an alternative (optimized version) for existing implementation(s). As you can see, my English is poor, that why I decided to use AI to polish, but it seems bad idea, because it added bad comments. Do you want me to push my original handmade code in different branch?

davecgh · 2026-06-22T11:36:29Z

No problem. As long as you don't have a problem with it, I would be fine with going through and polishing the comments. I don't mind doing it myself. I generally don't because I know that a lot of people tend to get upset when you start rewording their comments.

The code is the most important part and it is a rather large improvement over the existing implementation. The new type is also significantly easier to use because it doesn't require manually tracking the magnitude and sprinkling in calls to Normalize at the right times to avoid getting wrong results.

valery-osheter-cb · 2026-06-22T14:16:50Z

I just pushed the commit with the fixes addressing your notes:

Reordered methods to match field.go and tidied exported doc comments.
Removed Normalize() and magnitude tracking (always fully reduced).
Added MulBy2–MulBy8, backed by a generic MulInt.
Reworked edge-case tests for the 4x64 limbs; removed TestField64ReductionCarry.
Audited for constant time, plus small fixes.

davecgh

Thanks for the updates. I've reviewed them and they look great. The comments are significantly improved, the non-constant time paths are now constant time as they should be, the addition of the MulByX is clean, and the conversion of MulInt to use a temporary + full multiply ensures that it continues to work with all ints for any consumers.

I still see various comment related things, but, as mentioned, I don't mind polishing those up separately.

I still need to go through in detail to assert there are no potential unhandled overflows and the generated assembly is constant time which I probably won't be able to get to for a couple of days since I'm working on some other things at the moment and reviewing this on my own time, but otherwise, I only spotted two small remaining things which I commented on inline.

valery-osheter-cb · 2026-06-23T08:07:26Z

I did the cosmetic fixes in the tests that you noticed

davecgh · 2026-06-23T13:17:58Z

+	p1, c = bits.Add64(p1, h0, 0)
+	p2, c = bits.Add64(p2, h1, c)
+	p3, c = bits.Add64(p3, h2, c)
+	p4 := h3 + c


Asserting overflow conditions:

Max possible value of h3 is (2^64 - 1)*(2^64 - 1) = 2^128 - 2^65 + 1, so the high word will be a max of 2^64-2. Max value of c is 1. Overflow not possible because 2^64-2+1 = 2^64-1 < 2^64.

Same for the other cases below.

✔️

davecgh · 2026-06-23T13:48:30Z

+	t1, carry = bits.Add64(t1, x[1], carry)
+	t2, carry = bits.Add64(t2, x[2], carry)
+	t3, carry = bits.Add64(t3, x[3], carry)
+	t4 += carry


Asserting overflow conditions:

Max possible value of t4 before the carries is (2^64 - 1)*(2^32 + 977), so the high word will be a max of 2^32 + 976 < 2^33 and a max of +2 added for carries. Overflow not possible.

✔️

valery-osheter-cb · 2026-06-23T18:33:15Z

I removed MulBy{5,6,7} and added the test for MulBy{2,3,4,8}
Please suggest next steps.

davecgh · 2026-06-23T19:50:26Z

Thanks for the updates. I believe the last things are just to squash the last few commits and rebase over master.

Also, if you would, please fix the description on the secp256k1: Refine FieldVal64 API and tests. commit. It's missing a blank line after the title line which makes the entire message show as if it were the summary in the oneline output as can be seen in the following output:

| * 4bc1f00b5 seck256k1: cosmetic changes in tests of Field64
| * 566f6edf8 seck256k1: cosmetic changes in tests of Field64
| * 7a8b3dec1 secp256k1: Refine FieldVal64 API and tests. Reorder methods to mirror field.go and tidy the exported doc comments so the two field backends stay consistent. Drop Normalize and magnitude tracking since FieldVal64 is always fully reduced, add MulBy2 through MulBy8 for the common small-constant multiplications, and back them with a generic MulInt. Rework the edge-case tests to exercise the 4x64 limbs rather than the 10x26 word boundaries inherited from FieldVal, remove the inconsistent TestField64ReductionCarry, and ensure the operations remain constant time, along with assorted small fixes.
| * 66df468c8 secp256k1: Add benchmarks for FieldVal64
| * f2441a82c secp256k1: Add tests for FieldVal64
| * 65e4f8f00 secp256k1: Add pure Go FieldVal64 (4x64) field element

Other than that, I have a bit more review of the generated asm to assert everything is being generated with constant time. I expect it should be fine based on the Go code, but I make it a point to double check that actual generated code.

Introduce FieldVal64, an alternative secp256k1 field element that packs the value into four uint64 limbs (base 2^64) and stays fully reduced after every operation. AI-assisted: developed with the help of an AI coding assistant; all code reviewed and tested by the author.

Add a test suite for FieldVal64 mirroring the existing FieldVal tests: serialization round-trips, arithmetic (add, negate, multiply, square), normalization, inverse, square root, and the bit predicates, plus randomized checks. Cases that only apply to the unnormalized FieldVal representation are adjusted since FieldVal64 is always fully reduced. AI-assisted: developed with the help of an AI coding assistant; all code reviewed and tested by the author.

Add micro-benchmarks for the FieldVal64 multiply, square, inverse, and related operations to track the pure Go performance. AI-assisted: developed with the help of an AI coding assistant; all code reviewed and tested by the author.

Reorder methods to mirror field.go and tidy the exported doc comments so the two field backends stay consistent. Drop Normalize and magnitude tracking since FieldVal64 is always fully reduced, add MulBy2, MulBy3, MulBy4, and MulBy8 for the common small-constant multiplications, and back them with a generic MulInt. Rework the edge-case tests to exercise the 4x64 limbs rather than the 10x26 word boundaries inherited from FieldVal, remove the inconsistent TestField64ReductionCarry, and ensure the operations remain constant time, along with assorted small fixes.

valery-osheter-cb · 2026-06-24T15:43:35Z

My implementation of Bernstein-Yang "safegcd" constant-time modular inversion is about 25% faster than current implementation that is based on Ferma's little theorem. Do you want me to publish?

davecgh

This looks good to go to me. I've manually reviewed all generated assembler to ensure constant time and asserted there are no unhandled cases where potential overflow could occur. That is, of course, in addition to reviewing all of the code and tests.

Thanks a ton for your efforts. Very nice work!

davecgh · 2026-06-24T16:31:31Z

My implementation of Bernstein-Yang "safegcd" constant-time modular inversion is about 25% faster than current implementation that is based on Ferma's little theorem. Do you want me to publish?

That would be great (in a separate PR though please)! I saw the paper a while back and had it on my list of things to implement at some point. I expect it is much simpler to implement with the new saturated field since it doesn't have to worry about dealing with denormalized forms and conversion.

That would help speed up signing and some other ways inversion is used a bit. I don't imagine it will have any impact on signature verification though because, IIRC, we avoid the expensive inversion to affine that is usually needed during verification by staying in Jacobian and taking advantage of secp256k1's cofactor of 1.

valery-osheter-cb mentioned this pull request Jun 21, 2026

secp256k1: Add alternative 64-bit field backend (FieldVal64) #3707

Open

davecgh changed the title ~~SECP256K1 Field64 purego~~ secp256k1: Implement FieldVal64 field element. Jun 22, 2026

davecgh reviewed Jun 22, 2026

View reviewed changes

davecgh added this to the 2.2.0 milestone Jun 22, 2026

davecgh mentioned this pull request Jun 22, 2026

secp256k1: Add additional benchmarks for field prime. #3709

Merged

davecgh reviewed Jun 23, 2026

View reviewed changes

Comment thread dcrec/secp256k1/field64_test.go Outdated

Comment thread dcrec/secp256k1/field64_test.go Outdated

davecgh reviewed Jun 23, 2026

View reviewed changes

Comment thread dcrec/secp256k1/field64.go Outdated

davecgh mentioned this pull request Jun 23, 2026

secp256k1: Add additional field64 benchmarks. #3710

Open

davecgh reviewed Jun 23, 2026

View reviewed changes

valery-osheter-cb added 3 commits June 24, 2026 09:24

secp256k1: Add benchmarks for FieldVal64

9db856e

Add micro-benchmarks for the FieldVal64 multiply, square, inverse, and related operations to track the pure Go performance. AI-assisted: developed with the help of an AI coding assistant; all code reviewed and tested by the author.

valery-osheter-cb force-pushed the field64-purego branch from 4bc1f00 to 5658396 Compare June 24, 2026 06:30

davecgh mentioned this pull request Jun 24, 2026

secp256k1: Refactor constant time selection function. #3711

Open

davecgh reviewed Jun 24, 2026

View reviewed changes

Comment thread dcrec/secp256k1/field64_test.go

valery-osheter-cb force-pushed the field64-purego branch from 5658396 to 5b6e6ef Compare June 24, 2026 15:19

davecgh approved these changes Jun 24, 2026

View reviewed changes

Uh oh!

Conversation

valery-osheter-cb commented Jun 21, 2026

Overview

Crandall reduction

Constant time

No new dependencies

Commits

Testing

AI disclosure

Uh oh!

davecgh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

davecgh commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

valery-osheter-cb commented Jun 22, 2026

Uh oh!

davecgh commented Jun 22, 2026

Uh oh!

valery-osheter-cb commented Jun 22, 2026

Uh oh!

davecgh left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

valery-osheter-cb commented Jun 23, 2026

Uh oh!

Uh oh!

davecgh Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

davecgh Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

valery-osheter-cb commented Jun 23, 2026

Uh oh!

davecgh commented Jun 23, 2026

Uh oh!

Uh oh!

valery-osheter-cb commented Jun 24, 2026

Uh oh!

davecgh left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

davecgh commented Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

davecgh commented Jun 22, 2026 •

edited

Loading

davecgh left a comment •

edited

Loading

davecgh left a comment •

edited

Loading