Ternary AI Silicon

The first dedicated silicon for ternary AI inference.

Processing-in-memory architecture that eliminates multiplication. 70B-parameter LLM at 3.7 watts. No GPU required.

880x less energy per op
279x less data movement
3.7W for 70B model inference

Every AI chip wastes energy on math it doesn't need to do.

Microsoft's BitNet research showed that large language models can match full-precision accuracy using only ternary weights: {-1, 0, +1}. With ternary weights, every multiplication becomes a sign flip or a zero.
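
A minimal sketch of the idea: with weights restricted to {-1, 0, +1}, a dot product needs no multiplier at all. Each weight simply selects add, skip, or subtract.

```python
def ternary_dot(weights, activations):
    """Dot product with ternary weights: multiply-free.

    Each weight in {-1, 0, +1} selects add, skip, or subtract.
    """
    acc = 0
    for w, x in zip(weights, activations):
        if w == 1:
            acc += x      # +1: pass the activation through
        elif w == -1:
            acc -= x      # -1: sign flip becomes a subtract
        # 0: weight contributes nothing, skip entirely
    return acc

# Same result as sum(w * x), with no multiplications
print(ternary_dot([1, 0, -1, 1], [3, 7, 2, 5]))  # → 6
```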

Multiplication hardware is unnecessary, yet no one has built silicon that exploits this.

Compute inside the memory. Move nothing.

Ternary compute replaces the multiply-accumulate unit with simple bitwise operations. A ternary MAC uses ~360x fewer gates than FP16. Fewer transistors switching means less energy at any process node.
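
One way the bitwise claim can be illustrated (a sketch of the principle, not Entrit's actual gate-level design): encode a row of ternary weights as two bitmasks, one for the +1 positions and one for the -1 positions. For an activation vector restricted to {0, 1} for simplicity, the whole accumulation reduces to two ANDs and two popcounts.

```python
def ternary_matvec_row(plus_mask, minus_mask, x_bits):
    """Accumulate one output using only bitwise operations.

    plus_mask / minus_mask: bitmasks marking the +1 / -1 weight
    positions in one row. x_bits: activation bitmask, restricted
    here to {0, 1} values (an illustrative simplification).
    """
    pos = bin(plus_mask & x_bits).count("1")   # +1 weights hitting 1s
    neg = bin(minus_mask & x_bits).count("1")  # -1 weights hitting 1s
    return pos - neg

# Weights [+1, 0, -1, +1] (bit 0 = first weight), activations all 1:
# 1 + 0 - 1 + 1 = 1
print(ternary_matvec_row(0b1001, 0b0100, 0b1111))  # → 1
```

AND gates and population counts are far cheaper in silicon than FP16 multiplier arrays, which is the source of the gate-count advantage.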

Processing-in-memory computes directly inside the SRAM array. Weights never move. In conventional architectures, 90%+ of energy is spent moving data between memory and compute. Our architecture eliminates that bottleneck.

GPU (H100)

70B inference: ~30 tok/s
Power: 700W
Price: $30,000
Efficiency: 0.04 tok/s/W

Entrit (28nm, projected)

70B inference: 104 tok/s *
Power: 3.7W *
Price: ~$200 *
Efficiency: 27.9 tok/s/W *

* Projected from validated gate-level synthesis and architectural modeling.

The advantage exists at every node.

Ternary compute eliminates multipliers at every process node. The advantage scales with the shrink — no dependence on bleeding-edge fabrication.

Node    70B Power    Application
130nm   Validation   Architecture proof — first silicon
28nm    3.7W         Defense edge AI, autonomous systems
7nm     <1W          Consumer devices, wearables, IoT
3nm     <0.4W        Medical implants, sensors, robotics

Three proofs and a patent.

Hardware synthesis

Full Yosys synthesis on the SkyWater SKY130 (130nm) process. 4,308 logic cells, zero errors. 700x simpler than an FP16 MAC.

Software prototype

2B-parameter ternary LLM running at 65 tokens/second at 15W. Real model, real weights, measured performance.

Quantization codec

Custom ternary codec validated on models up to 7B parameters. 3.6 bits per weight, 4.5x compression. Larger models compress better.
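
Entrit's codec format is not described here, so as an illustration only: the ternarization step itself can follow the absmean scheme from the BitNet b1.58 literature, scaling by the mean absolute weight and rounding into {-1, 0, +1}.

```python
def ternarize(weights):
    """Quantize float weights to {-1, 0, +1} with a per-tensor scale.

    Uses the absmean threshold scheme from the BitNet b1.58 line of
    work. Illustrative sketch only; the actual Entrit codec (bit
    packing, grouping, metadata) is not shown.
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

q, s = ternarize([0.9, -0.05, -1.2, 0.4])
# q contains only values in {-1, 0, +1}; s * q[i] approximates w[i]
```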

Provisional patent filed

12 embodiments, 9 claim families, 46 figures. Full architectural coverage.

Let's talk.

Entrit Systems Inc. — Delaware C-Corp, 2026

eric@entrit.io