Performance

Benchmarks and optimization strategies for BLS12-381 operations.

Native Benchmarks (BLST)

Measured on Apple M1 Pro (ARM64) and Intel i9-12900K (x86_64):

G1 Operations

Operation	M1 Pro	i9-12900K	Notes
G1 Add	12 μs	15 μs	Point addition
G1 Double	8 μs	10 μs	Point doubling
G1 Mul	65 μs	80 μs	Scalar multiplication
G1 MSM (10)	0.4 ms	0.5 ms	Multi-scalar mult
G1 MSM (100)	2.5 ms	3.2 ms	Pippenger’s algorithm
G1 MSM (1000)	18 ms	22 ms	Batch verification

G2 Operations

Operation	M1 Pro	i9-12900K	Notes
G2 Add	35 μs	45 μs	Extension field
G2 Double	25 μs	32 μs	Extension field
G2 Mul	160 μs	200 μs	Scalar multiplication
G2 MSM (10)	1.2 ms	1.5 ms	Multi-scalar mult
G2 MSM (100)	8 ms	10 ms	Pippenger’s algorithm

Pairing Operations

Operation	M1 Pro	i9-12900K	Notes
Single Pairing	0.9 ms	1.2 ms	e(P, Q)
Pairing Check (2)	1.5 ms	2.0 ms	Signature verification
Pairing Check (4)	2.2 ms	3.0 ms	Batch check
Final Exponentiation	0.4 ms	0.5 ms	Part of pairing
Miller Loop	0.5 ms	0.6 ms	Part of pairing

Hash-to-Curve

Operation	M1 Pro	i9-12900K	Notes
Hash to G1	120 μs	150 μs	RFC 9380
Hash to G2	280 μs	350 μs	RFC 9380

Signature Operations

Single Signature

Operation	Time	Throughput
Sign	180 μs	5,500/sec
Verify	1.2 ms	830/sec

Aggregated Signatures

Signers	Aggregate Time	Verify Time	Speedup vs Individual Verification
10	0.1 ms	1.3 ms	9x faster
100	0.8 ms	1.5 ms	80x faster
1000	7 ms	3 ms	400x faster
10000	70 ms	20 ms	600x faster
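
The speedup column can be reproduced by comparing one aggregate verification against n single verifications at 1.2 ms each (the Verify figure from the Single Signature table). A minimal sketch of that arithmetic, with the table's timings hard-coded as assumptions:

```python
# Timings copied from the tables above (milliseconds).
SINGLE_VERIFY_MS = 1.2
AGGREGATE_VERIFY_MS = {10: 1.3, 100: 1.5, 1000: 3.0, 10000: 20.0}

def speedup_vs_individual(signers: int) -> float:
    """Ratio of n individual verifications to one aggregate verification."""
    individual_total_ms = signers * SINGLE_VERIFY_MS
    return individual_total_ms / AGGREGATE_VERIFY_MS[signers]

for n in AGGREGATE_VERIFY_MS:
    print(f"{n:>5} signers: about {speedup_vs_individual(n):.0f}x")
```

The ratios land on the rounded values in the table; the time spent aggregating is not included in the comparison.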

Batch Verification

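Batch verification with a random linear combination: each signature is scaled by a random coefficient, so a batch of n signatures on distinct messages is checked with n + 1 Miller loops and one final exponentiation instead of 2n full pairings.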

Signatures	Naive	Batched	Speedup
10	12 ms	3 ms	4x
100	120 ms	12 ms	10x
1000	1.2 s	50 ms	24x
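
To make the structure of the batched check concrete, here is a toy sketch. It is not a real pairing implementation: group elements are stored as their discrete logs modulo a small prime `R` (an assumption chosen for illustration), so the "pairing" is just multiplication. All function names are hypothetical and the model is insecure by design; only the shape of the equation carries over.

```python
import hashlib
import secrets

R = 2**61 - 1  # toy prime group order, NOT the BLS12-381 subgroup order

def pairing(a: int, b: int) -> int:
    """Toy pairing: elements are stored as discrete logs, so e(aG1, bG2) -> a*b."""
    return (a * b) % R

def hash_to_g2(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % R

def sign(sk: int, msg: bytes) -> int:
    return (sk * hash_to_g2(msg)) % R

def verify(pk: int, msg: bytes, sig: int) -> bool:
    # e(G1, sig) == e(pk, H(msg)); the G1 generator is 1 in this model
    return pairing(1, sig) == pairing(pk, hash_to_g2(msg))

def batch_verify(items) -> bool:
    """items: iterable of (pk, msg, sig) triples."""
    combined_sig = 0  # sum of r_i * sig_i (a G2 MSM in a real library)
    rhs = 0           # product of e(r_i * pk_i, H(m_i)); additive in this model
    for pk, msg, sig in items:
        r_i = secrets.randbelow(R - 1) + 1  # random nonzero weight
        combined_sig = (combined_sig + r_i * sig) % R
        rhs = (rhs + pairing(r_i * pk % R, hash_to_g2(msg))) % R
    return pairing(1, combined_sig) == rhs

keys = [secrets.randbelow(R - 1) + 1 for _ in range(5)]
msgs = [f"message {i}".encode() for i in range(5)]
batch = [(sk, m, sign(sk, m)) for sk, m in zip(keys, msgs)]  # pk == sk in the toy model
print(batch_verify(batch))
```

The random weights are what stop an attacker from submitting invalid signatures whose errors cancel each other out in the sum.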

Comparison with Other Curves

vs BN254

Operation	BLS12-381	BN254	Ratio
G1 Mul	80 μs	45 μs	1.8x slower
G2 Mul	200 μs	120 μs	1.7x slower
Pairing	1.2 ms	0.6 ms	2x slower
Security	128-bit	~100-bit	Higher

vs secp256k1

Operation	BLS12-381	secp256k1	Ratio
Sign	180 μs	50 μs	3.6x slower
Verify	1.2 ms	80 μs	15x slower
Aggregate (1000)	7 ms	N/A	Unique feature

Optimization Strategies

Multi-Scalar Multiplication (MSM)

Pippenger’s algorithm for large MSMs:

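Complexity: roughly O(b · n / log n) group operations for n points with b-bit scalars, compared with O(b · n) for n independent double-and-add multiplications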

Points	Naive	Pippenger	Speedup
100	8 ms	2.5 ms	3.2x
1000	80 ms	18 ms	4.4x
10000	800 ms	120 ms	6.7x
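
The bucket method behind Pippenger's algorithm can be sketched as follows. To stay self-contained, the sketch uses the additive group of integers modulo a prime `M` as a stand-in for G1 (an assumption for illustration), and the window width is fixed at 4 bits for readability; real implementations typically pick it based on the number of points.

```python
import random

M = 2**61 - 1  # toy additive group Z_M standing in for G1 (assumption)

def add(p: int, q: int) -> int:
    return (p + q) % M

def msm_naive(scalars, points) -> int:
    total = 0
    for k, p in zip(scalars, points):
        total = add(total, (k * p) % M)  # one full scalar multiplication per point
    return total

def msm_pippenger(scalars, points, scalar_bits=255, window=4) -> int:
    result = 0
    num_windows = (scalar_bits + window - 1) // window
    for w in reversed(range(num_windows)):   # most significant window first
        for _ in range(window):
            result = add(result, result)     # shift the accumulator by one window
        buckets = [0] * (1 << window)        # buckets[d] = sum of points whose digit is d
        for k, p in zip(scalars, points):
            digit = (k >> (w * window)) & ((1 << window) - 1)
            if digit:
                buckets[digit] = add(buckets[digit], p)
        # Compute sum(d * buckets[d]) with additions only, via running sums.
        running, window_sum = 0, 0
        for d in range(len(buckets) - 1, 0, -1):
            running = add(running, buckets[d])
            window_sum = add(window_sum, running)
        result = add(result, window_sum)
    return result

rng = random.Random(0)
scalars = [rng.getrandbits(255) for _ in range(100)]
points = [rng.randrange(1, M) for _ in range(100)]
print(msm_pippenger(scalars, points) == msm_naive(scalars, points))
```

Each point costs one addition per window regardless of its digit, which is where the saving over per-point double-and-add comes from as n grows.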

Batch Pairing

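Multi-pairing is more efficient than separate pairings because the Miller loop outputs can be multiplied together and share a single final exponentiation:

// Naive check: two full pairings (two Miller loops, two final exponentiations)
e(P1, Q1) == e(P2, Q2)

// Rewritten as a product of pairings
e(P1, Q1) * e(-P2, Q2) == 1

// Optimized: two Miller loops, one shared final exponentiation
final_exp(miller(P1, Q1) * miller(-P2, Q2)) == 1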
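
A rough cost model shows where the saving comes from. The sketch below plugs in the M1 Pro Miller loop and final exponentiation timings from the Pairing Operations table; the function names are hypothetical, and the model ignores the GT multiplications and any other overhead.

```python
# M1 Pro timings from the Pairing Operations table (milliseconds).
MILLER_LOOP_MS = 0.5
FINAL_EXP_MS = 0.4

def separate_pairings_ms(n: int) -> float:
    """n full pairings: each pays for a Miller loop and a final exponentiation."""
    return n * (MILLER_LOOP_MS + FINAL_EXP_MS)

def multi_pairing_ms(n: int) -> float:
    """n Miller loops multiplied together, then one shared final exponentiation."""
    return n * MILLER_LOOP_MS + FINAL_EXP_MS

for n in (2, 4):
    print(n, round(separate_pairings_ms(n), 1), round(multi_pairing_ms(n), 1))
```

For 2 and 4 pairs the model gives 1.4 ms and 2.4 ms, in the same range as the measured Pairing Check figures of 1.5 ms and 2.2 ms.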

Precomputation Tables

For fixed-base multiplication (e.g., generator):

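// Precompute small multiples of the generator: table[i] = i * G.
// G1.double() and scalar.toBytes() (32 big-endian bytes) are assumed helpers.
const TABLE_SIZE = 256;

fn buildGeneratorTable() [TABLE_SIZE]G1Point {
    var table: [TABLE_SIZE]G1Point = undefined;
    table[0] = G1.identity();
    table[1] = G1.generator();
    for (2..TABLE_SIZE) |i| {
        table[i] = G1.add(table[i - 1], table[1]);
    }
    return table;
}

// Fast multiplication using the table: one lookup and addition per scalar byte.
// Not constant-time, since the table index depends on the scalar.
// The ~4x figure vs. naive double-and-add is the author's; this simple
// single-table loop saves additions but still performs the doublings.
fn mulGenerator(table: *const [TABLE_SIZE]G1Point, scalar: Fr) G1Point {
    const bytes = scalar.toBytes();
    var acc = G1.identity();
    for (bytes) |byte| {
        for (0..8) |_| acc = G1.double(acc); // shift by one 8-bit window
        acc = G1.add(acc, table[byte]);
    }
    return acc;
}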
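
One common variant of fixed-base precomputation keeps a separate table for every 8-bit window of the scalar, so the multiplication needs no doublings at all. The Python sketch below illustrates that variant over a toy additive group (integers modulo a prime `M`, with `G = 7` as a stand-in generator); both are assumptions for illustration, not the library's types.

```python
M = 2**61 - 1   # toy additive group Z_M standing in for G1 (assumption)
G = 7           # toy "generator"
WINDOW = 8
SCALAR_BITS = 255
NUM_WINDOWS = (SCALAR_BITS + WINDOW - 1) // WINDOW

def add(p: int, q: int) -> int:
    return (p + q) % M

def build_tables():
    """tables[w][d] = d * 2^(8w) * G, built with additions only."""
    tables = []
    base = G
    for _ in range(NUM_WINDOWS):
        row = [0]
        for _ in range(1, 1 << WINDOW):
            row.append(add(row[-1], base))
        tables.append(row)
        base = add(row[-1], base)  # 255*base + base = 2^8 * base
    return tables

def mul_generator(tables, k: int):
    """Returns (k * G, number of group additions performed)."""
    acc, additions = 0, 0
    for w in range(NUM_WINDOWS):
        digit = (k >> (WINDOW * w)) & 0xFF
        if digit:
            acc = add(acc, tables[w][digit])
            additions += 1
    return acc, additions

tables = build_tables()
point, additions = mul_generator(tables, (1 << 254) + 12345)
print(additions, "additions, no doublings")
```

The trade-off is memory: 32 tables of 256 entries is 8,192 points, which at 96 bytes per uncompressed G1 point comes to about 768 KiB.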

Memory Requirements

Structure	Size	Notes
G1 Point (compressed)	48 bytes
G1 Point (uncompressed)	96 bytes
G2 Point (compressed)	96 bytes
G2 Point (uncompressed)	192 bytes
Scalar (Fr)	32 bytes
Public Key	48 bytes	G1 compressed
Signature	96 bytes	G2 compressed
Aggregated Signature	96 bytes	Same as single
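
Because an aggregated signature is the same size as a single one, signature storage stops growing with the number of signers. A small sketch of that arithmetic, using the sizes from the table above (the function name is hypothetical):

```python
SIGNATURE_BYTES = 96  # G2 compressed, from the table above

def signature_storage_bytes(signers: int, aggregated: bool) -> int:
    """Bytes needed to store the signature material for a group of signers."""
    return SIGNATURE_BYTES if aggregated else signers * SIGNATURE_BYTES

for n in (10, 1000):
    print(n, signature_storage_bytes(n, False), signature_storage_bytes(n, True))
```

Public keys are not aggregated away in this count; a verifier still needs the signers' keys, or their aggregate.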

Ethereum Beacon Chain

Data	Per Epoch	Storage
Attestations (naive)	~100 MB	N/A
Attestations (aggregated)	~1 MB	99% reduction
Sync committee sigs	96 bytes	Fixed

Profiling Tips

Hotspots

Typical time distribution in signature verification:

Component	Time
Hash to G2	25%
Miller loop	45%
Final exponentiation	30%

Optimization Priorities

Batch operations - Use MSM and multi-pairing
Precomputation - Cache generator multiples
Aggregation - Combine signatures before verification
Parallelization - Miller loops are independent

Hardware Acceleration

x86_64 (ADX/BMI2)

BLST uses:

MULX for widening multiplication that leaves the flags untouched
ADCX/ADOX for two independent add-with-carry chains
~30% speedup over generic implementation
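
To show why these instructions matter, the sketch below emulates them in Python. A 381-bit field element fits in six 64-bit limbs, and multiplying it by one word produces a low and a high half per limb; keeping the two halves on separate carry chains is what ADCX (carry flag) and ADOX (overflow flag) allow in hardware. This is an illustrative emulation under those assumptions, not BLST's actual code, and the helper names are hypothetical.

```python
import random

MASK = (1 << 64) - 1

def mulx(a: int, b: int):
    """Emulates MULX: 64x64 -> 128-bit product as (hi, lo), no flags involved."""
    p = a * b
    return p >> 64, p & MASK

def to_limbs(x: int, n: int):
    return [(x >> (64 * i)) & MASK for i in range(n)]

def from_limbs(limbs) -> int:
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))

def mul_accumulate(acc, a, b: int):
    """Returns acc + a*b for 8-limb acc, 6-limb a, and one 64-bit word b."""
    acc = list(acc)
    cf = of = 0  # the two carry chains
    for i in range(6):
        hi, lo = mulx(a[i], b)
        t = acc[i] + lo + cf        # low halves: ADCX chain
        acc[i], cf = t & MASK, t >> 64
        t = acc[i + 1] + hi + of    # high halves: ADOX chain
        acc[i + 1], of = t & MASK, t >> 64
    t = acc[6] + cf                 # fold the leftover carries into the top limbs
    acc[6] = t & MASK
    acc[7] += (t >> 64) + of
    return acc

rng = random.Random(7)
a, b, acc0 = rng.getrandbits(381), rng.getrandbits(64), rng.getrandbits(448)
out = mul_accumulate(to_limbs(acc0, 8), to_limbs(a, 6), b)
print(from_limbs(out) == acc0 + a * b)
```

In Python the two chains run one after the other; on hardware they do not depend on each other's flag, so the additions can overlap.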

ARM64 (NEON)

BLST uses:

Vector operations for field arithmetic
~25% speedup over generic

GPU Acceleration

For large MSMs (>10,000 points):

CUDA implementations available
~100x speedup for MSM operations
Not suitable for latency-sensitive signing

Related

BLS12-381 Overview - Curve fundamentals
Security - Security considerations
Usage Patterns - Implementation patterns
