PhotonLayer explainer
Optical AI · pure Rust · by @ruvnet

The lens already did the thinking — so all that's left to capture is the answer.

PhotonLayer is a deterministic optical-AI front end: a learned piece of smart glass reshapes light so a tiny sensor captures the answer instead of the whole picture.

Then it signs a receipt proving exactly what it measured — a result you can re-run, not just a claim. Today it's a faithful software simulator written in Rust; the actual glass is on the roadmap.

An independent explainer for Reuven Cohen's (@ruvnet) PhotonLayer — built to help you actually implement his technology.

By Reuven Cohen · @ruvnet · Updated · source @ fe86c9f · MIT © Ruvector · github.com/ruvnet/PhotonLayer · Built with the Ruv-Explainer pipeline
A beam of plain white light enters a glowing prism and fans into a full rainbow spectrum that resolves into a few clean glowing dots — the answer.

white light in → trained optics → a few numbers out: the answer, shaped by physics

64 sensor pixels, vs 1,024 in a full image (16× fewer)
83.3% single-plane MNIST at 16× compression (gradient-trained)
88.8% 2-plane optical cascade, still 16× compression
~24s to reproduce the headline result from clean, deterministically

01 Q1 · What is this?

A camera that captures the answer, not the picture

In one plain sentence, then grounded all the way down to earth.

PhotonLayer is a deterministic optical-AI front end: a learned piece of "smart glass" reshapes incoming light so a tiny sensor captures the answer instead of the whole picture — then signs a receipt proving exactly what it measured.

Here's the everyday version. A normal camera records every pixel of a scene, then a computer reads all of them to decide what it's looking at. That's a lot of data to move, store, and process — and the picture itself can leak (it's a viewable photo of you or your scene).

PhotonLayer flips the order. It puts a specially-shaped, trained piece of glass — a phase mask — in front of a small sensor. The glass bends the incoming light so that, by the time light lands on the sensor, the useful information has already been squeezed into just a handful of numbers. A tiny program reads those numbers and gives the answer.

The repo's own analogy: it's like a translator who listens to a whole speech and hands you a one-line summary — you never needed the full transcript to act on it. The "lens" is trained by trial-and-improvement to do that summarizing in the light itself, before anything is digitized.

Friendly view
A person listens to a long wall of speech and hands over a single tiny summary card — you never needed the full transcript to act.
The translator analogy: hear the whole speech, hand back one useful line. PhotonLayer does that summarizing in the light.
Technical view — the pipeline
scene (1024 px) → phase mask → propagate (diffraction) → sensor (64 px) → decoder (tiny · 640 MAC) → decision ("paper")
⛓ receipt: BLAKE3 hash of { image · mask · config · output · metrics · build } → re-runnable proof
The real pipeline (README "How it works"): a trained mask + diffraction does analog preprocessing, the sensor records ~64 numbers, a tiny decoder reads the answer, and a BLAKE3 receipt binds every input/output.
Honest framing the repo insists on
Today this is a software simulator written in Rust: the physics, the training, and the receipts are all real and reproducible. Building the actual glass is on the roadmap. So "smart glass" currently means a faithful simulation that runs on your computer or in your browser.
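To make the 1,024 → 64 reduction in the pipeline above concrete, here is a minimal sketch of a sensor that bins a fine intensity grid into a few coarse readings. The 32×32 → 8×8 layout and the use of plain averaging are assumptions for illustration (the page only states 1,024 and 64), and `bin_sensor` is a hypothetical helper, not a function from photonlayer-core.

```rust
/// Average-pool a square `side`×`side` intensity image down to `out`×`out` bins.
/// Hypothetical helper; assumes `side` is a multiple of `out`.
fn bin_sensor(image: &[f64], side: usize, out: usize) -> Vec<f64> {
    assert_eq!(image.len(), side * side);
    assert_eq!(side % out, 0);
    let block = side / out; // fine pixels per bin along each axis
    let mut bins = vec![0.0; out * out];
    for y in 0..side {
        for x in 0..side {
            bins[(y / block) * out + (x / block)] += image[y * side + x];
        }
    }
    let area = (block * block) as f64;
    bins.iter_mut().for_each(|b| *b /= area);
    bins
}

fn main() {
    // Assumed layout: 32×32 = 1,024 samples binned to 8×8 = 64 sensor values.
    let image: Vec<f64> = (0..1024).map(|i| (i % 7) as f64).collect();
    let sensor = bin_sensor(&image, 32, 8);
    println!(
        "{} px -> {} values ({}x fewer)",
        image.len(),
        sensor.len(),
        image.len() / sensor.len()
    );
}
```

Binning alone would just blur the scene; the point of the trained mask is that the light arriving at each bin is already arranged so those 64 values separate the classes.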
02 Q2 · What problem does it solve?

Every camera-based AI pays the "capture everything, then think" tax

Bandwidth, power, storage — and a recoverable image of whatever it looked at.

Today's pattern is always the same: grab the whole picture, then compute on all of it. That single design choice creates four costs you can't avoid.

| The cost | Why it hurts |
| --- | --- |
| Bandwidth | Thousands of pixels per frame have to be read off the sensor and moved. |
| Power & compute | A processor chews through every pixel before it can decide anything. |
| Storage | Full frames pile up: more to keep, more to secure. |
| Privacy | A stored frame is a viewable photo of whatever it saw: a recoverable image of you, your documents, your scene. |

PhotonLayer is a proof-of-concept for capture-the-answer sensing: do the first chunk of "figuring out what's in the picture" in the light, so the sensor only ever captures a tiny compressed measurement (e.g. 64 numbers instead of 1,024 pixels), and the stored thing is a measurement, not a viewable photo.

Friendly view
Left: a heavy cluttered camera grabbing a full pixel-dense photo. Right: a small calm sensor behind a prismatic lens capturing just a few bright dots.
Left: capture everything, then think — heavy and cluttered. Right: the optics already did the first pass, so only a few bright numbers are left to capture.
Technical view — where the cost lives
NORMAL CAMERA + AI: read 1024 px → compute 10,240 MACs → decision + full photo stored ⚠
PHOTONLAYER: mask → read 64 px → compute 640 MACs → decision + measurement (not a photo) ✓
Same decision, but PhotonLayer reads 16× fewer sensor pixels, spends 16× fewer digital-decoder MACs (640 vs 10,240), and stores a measurement instead of a recoverable photo. Numbers from the README "Measured results" tables.
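The MAC figures above are consistent with a simple cost model, sketched here under stated assumptions: one multiply-accumulate per input value per class, and 10 classes (the MNIST digits). That model reproduces the quoted 10,240 and 640, but it is an inference from the numbers, not something the page states; `decoder_macs` is a hypothetical helper.

```rust
/// Assumed cost model for a nearest-centroid style decoder:
/// one multiply-accumulate per input value per class.
fn decoder_macs(inputs: usize, classes: usize) -> usize {
    inputs * classes
}

fn main() {
    let classes = 10; // assumption: the ten MNIST digit classes
    let full = decoder_macs(1024, classes); // full-image baseline
    let optical = decoder_macs(64, classes); // 64-value optical measurement
    println!("full: {full}, optical: {optical}, ratio: {}x", full / optical);
}
```

Note that this counts only the digital decoder; the optical transform itself is not included in either figure.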
03 Q3 · Why is that a problem now?

Cameras-with-AI are everywhere — and "trust me, that's what it measured" no longer cuts it

Edge sensing is exploding, privacy rules are tightening, and results need to be provable.

Three pressures collide right now, and the "capture everything, then think" design sits in the worst spot for all three.

  • Sensing moved to the edge. Doorbells, conveyor belts, drones, wearables — tiny battery- and bandwidth-limited devices are doing vision. Reading and crunching a full frame per item is exactly the budget they don't have.
  • Privacy is now a liability, not a footnote. A stored full frame is a recoverable photo. The cheapest way to not leak a picture is to never capture one in the first place — capture a measurement that doesn't look like the scene.
  • Results need to be provable. As AI makes more real decisions, "the model said so" isn't enough. You need to show exactly what was measured and that it wasn't tampered with — a re-runnable experiment, not a screenshot.

PhotonLayer is the only proof-of-concept in this space that ships with bit-reproducible, signed receipts, so a result is a re-runnable experiment, not a claim. It reproduces a genuine research result (a single-layer ceiling-break to 83.3% and a multi-plane 88.8%, all at 16× compression, all deterministic, from a clean build in ~24–62 seconds) and is unusually candid about exactly what it does and doesn't prove.

Friendly view
A glowing tamper-proof digital receipt with a green checkmark seal and a subtle fingerprint motif, edged with prismatic light.
A result you can prove: every run emits a signed receipt — a re-runnable experiment, stamped and tamper-evident.
Technical view — three forces, one design answer
EDGE (tiny power / bandwidth) · PRIVACY (don't store a photo) · PROVABILITY (show what was measured) → Capture the answer: few pixels · measurement, not photo + signed reproducible receipt
Edge limits, privacy, and provability all push toward the same answer: capture less, store a measurement not a photo, and prove it with a receipt.
04 Q4 · How does it solve it?

A trained phase mask does the first transform inside the light

"Phase mask," diffraction, sensor, decoder, receipt — each one in plain terms, then exactly.

"Phase mask," in plain terms: a flat piece of optics whose surface is patterned so it nudges different parts of the light wave to arrive slightly early or late — and that pattern is what gets learned for your specific task.

Friendly view
A flat patterned piece of smart glass bends incoming wavefronts of light — some parts arrive slightly early, some late — emerging as gently reshaped glowing waves.
The phase mask is patterned glass that bends the light wave: some parts arrive a touch early, some late. Train that pattern and the light arrives pre-sorted.
Technical view — the optical stack
incoming light → phase mask (learned θ(x,y); example per-region shifts +.3π, −.7π, +.1π, −.4π, +.6π, −.2π) → propagation (Fresnel / Fraunhofer / Angular-Spectrum · deterministic FFT) → sensor (bin → 64 px) → decoder (nearest-centroid, reads the class)
Accurate to photonlayer-core: a learned phase profile θ(x,y) shifts the wave per region, scalar diffraction propagates it (Fresnel / Fraunhofer / Angular-Spectrum via a deterministic FFT), the sensor bins down to ~64 values, and a tiny decoder reads the class.
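The stack described above can be sketched in one dimension with nothing but the standard library. This is an illustrative toy, not the photonlayer-core implementation: it uses a plain O(N²) DFT as a stand-in for far-field (Fraunhofer-style) propagation, and the type `C` and every function name are hypothetical. It shows the core idea that a phase-only mask changes where the light lands: a flat mask sends a plane wave to bin 0, while a linear phase ramp steers it to another bin.

```rust
use std::f64::consts::PI;

/// Minimal complex number, so the sketch needs no external crates.
#[derive(Clone, Copy)]
struct C { re: f64, im: f64 }

impl C {
    fn times(self, o: C) -> C {
        C { re: self.re * o.re - self.im * o.im, im: self.re * o.im + self.im * o.re }
    }
    fn from_phase(theta: f64) -> C { C { re: theta.cos(), im: theta.sin() } }
    fn norm_sqr(self) -> f64 { self.re * self.re + self.im * self.im }
}

/// Phase-only mask: multiply each sample by exp(i*theta), leaving amplitude unchanged.
fn apply_mask(field: &[C], theta: &[f64]) -> Vec<C> {
    field.iter().zip(theta).map(|(f, t)| (*f).times(C::from_phase(*t))).collect()
}

/// Stand-in for far-field propagation: a direct discrete Fourier transform.
fn far_field(field: &[C]) -> Vec<C> {
    let n = field.len();
    (0..n)
        .map(|k| {
            let mut acc = C { re: 0.0, im: 0.0 };
            for (j, f) in field.iter().enumerate() {
                let w = C::from_phase(-2.0 * PI * (k * j) as f64 / n as f64);
                let t = (*f).times(w);
                acc.re += t.re;
                acc.im += t.im;
            }
            acc
        })
        .collect()
}

/// Index of the brightest sensor bin for a unit plane wave behind mask `theta`.
/// The sensor sees intensity |E|^2, not the complex field.
fn peak_bin(theta: &[f64]) -> usize {
    let plane_wave = vec![C { re: 1.0, im: 0.0 }; theta.len()];
    let intensity: Vec<f64> = far_field(&apply_mask(&plane_wave, theta))
        .iter()
        .map(|c| c.norm_sqr())
        .collect();
    let mut best = 0;
    for i in 1..intensity.len() {
        if intensity[i] > intensity[best] { best = i; }
    }
    best
}

fn main() {
    let n = 16;
    let flat = vec![0.0; n];
    // A linear phase ramp of 3 cycles across the aperture steers light to bin 3.
    let ramp: Vec<f64> = (0..n).map(|j| 2.0 * PI * 3.0 * j as f64 / n as f64).collect();
    println!("flat mask peak bin: {}", peak_bin(&flat));
    println!("ramp mask peak bin: {}", peak_bin(&ramp));
}
```

A trained mask is a far richer pattern than a ramp, but the mechanism is the same: the phase profile decides which sensor bins receive energy for a given input.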

How the mask gets good: trial-and-improvement, then real gradients

A random mask already works a bit. To make it actually good at separating your classes, the repo reports three training setups, each stronger than the last:

  • Hill-climbing — try a small tweak, keep it only if accuracy improves. Simple, but it plateaus at an optimizer ceiling (~73%).
  • Analytic gradient descent — train the mask through a proven adjoint of the diffraction operator (Propagator::backward_into, validated by an exact-adjoint identity and a finite-difference grad-check). This clears the ceiling decisively, to 83.3% single-plane.
  • Multi-plane cascade — stack phase planes with free-space propagation between them, trained end-to-end through the composed adjoint. Each plane sees a genuinely different field (decorrelated to ~0.04), reaching 88.8% with 2 planes.
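The first of the approaches listed above, hill-climbing, is simple enough to sketch in full. This is a toy under loud assumptions: the `score` function is a stand-in (negative squared distance to a made-up target) where a real run would measure task accuracy, and `Lcg`, `score`, and `hill_climb` are hypothetical names, not the repo's API. The seeded generator keeps the run repeatable, in the spirit of the page's determinism theme.

```rust
/// Tiny seeded pseudo-random generator (LCG) so every re-run takes the same path.
struct Lcg(u64);

impl Lcg {
    fn next_f64(&mut self) -> f64 {
        self.0 = self.0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64 // uniform in [0, 1)
    }
}

/// Stand-in objective: higher is better. A real run would score classification accuracy.
fn score(mask: &[f64], target: &[f64]) -> f64 {
    -mask.iter().zip(target).map(|(m, t)| (m - t) * (m - t)).sum::<f64>()
}

/// Hill-climbing: tweak one mask element, keep the tweak only if the score improves.
fn hill_climb(mask: &mut [f64], target: &[f64], steps: usize, seed: u64) -> f64 {
    let mut rng = Lcg(seed);
    let mut best = score(mask, target);
    for _ in 0..steps {
        let i = (rng.next_f64() * mask.len() as f64) as usize;
        let old = mask[i];
        mask[i] += (rng.next_f64() - 0.5) * 0.2; // small random nudge
        let s = score(mask, target);
        if s > best {
            best = s; // improvement: keep it
        } else {
            mask[i] = old; // no improvement: revert
        }
    }
    best
}

fn main() {
    let target = vec![0.5; 8];
    let mut mask = vec![0.0; 8];
    let start = score(&mask, &target);
    let end = hill_climb(&mut mask, &target, 2000, 42);
    println!("score before: {start:.4}, after: {end:.4}");
}
```

Because each step only accepts strict improvements, the method cannot escape a plateau, which is the limitation the gradient-based training is described as overcoming.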
The receipt
Every run hashes every output-determining input (image, mask, config, output frame, metrics, build provenance) into one tamper-evident BLAKE3 digest. Change any one field and verification fails. That's what turns a result into a re-runnable experiment.
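The binding idea behind the receipt can be sketched with the standard library alone. Important caveat: the repo uses BLAKE3, whereas this sketch swaps in FNV-1a, a simple non-cryptographic hash, purely so the example runs without dependencies; it offers no real tamper resistance. `ToyReceipt` and its field layout are hypothetical and do not mirror the real `Receipt` type.

```rust
/// FNV-1a 64-bit: deterministic but NOT cryptographic.
/// Stand-in only; the repo's receipts use BLAKE3.
fn fnv1a(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, b| (h ^ *b as u64).wrapping_mul(0x0000_0100_0000_01b3))
}

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;

/// Hash every named field into one digest.
fn digest_fields(fields: &[(String, Vec<u8>)]) -> u64 {
    let mut h = FNV_OFFSET;
    for (name, data) in fields {
        // Length-prefix each part so bytes cannot slide between fields unnoticed.
        h = fnv1a(h, &(name.len() as u64).to_le_bytes());
        h = fnv1a(h, name.as_bytes());
        h = fnv1a(h, &(data.len() as u64).to_le_bytes());
        h = fnv1a(h, data);
    }
    h
}

/// Hypothetical receipt: the recorded fields plus the digest computed at build time.
struct ToyReceipt {
    fields: Vec<(String, Vec<u8>)>,
    digest: u64,
}

impl ToyReceipt {
    fn build(fields: Vec<(String, Vec<u8>)>) -> Self {
        let digest = digest_fields(&fields);
        ToyReceipt { fields, digest }
    }

    /// Recompute the digest from the fields and compare with the stored one.
    fn verify(&self) -> bool {
        digest_fields(&self.fields) == self.digest
    }
}

fn main() {
    let mut receipt = ToyReceipt::build(vec![
        ("image".to_string(), vec![1, 2, 3]),
        ("mask".to_string(), vec![4, 5]),
        ("config".to_string(), b"fresnel".to_vec()),
    ]);
    println!("fresh receipt verifies: {}", receipt.verify());
    receipt.fields[1].1[0] ^= 1; // flip one bit in the mask bytes
    println!("tampered receipt verifies: {}", receipt.verify());
}
```

The design point carried over from the page is that the digest covers every output-determining field together, so changing any single one invalidates the whole receipt.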
05 Q5 · What does a solved state look like?

Fewer pixels, a measurement that isn't a photo, and a receipt you can re-run

The before → after, stated as concrete, measured numbers from the repo.

"Solved" isn't abstract here — it's a set of measured numbers you can reproduce.

Before · normal camera + AI

Read 1,024 pixels per frame. Compute 10,240 MACs in the decoder. Store a full, viewable photo. Result is a claim — "trust the model."

After · PhotonLayer

Read 64 pixels (16× fewer). Compute 640 MACs (16× fewer). Store a measurement that doesn't look like the scene. Result ships with a signed, re-runnable receipt.

| Measured on real MNIST (16× compression) | Sensor px | Decoder MACs | Accuracy |
| --- | --- | --- | --- |
| Full-image baseline (same tiny decoder) | 1024 | 10,240 | 75.40% |
| Optical, hill-climbed mask | 64 | 640 | 73.05% |
| Optical, gradient-trained (single plane) | 64 | 640 | 83.30% |
| Optical, 2-plane cascade | 64 | 640 | 88.80% |
Friendly view
The before-and-after again: a heavy full-photo camera versus a light sensor capturing a few clean bright dots.
Same decision, far lighter footprint — and the thing you keep is a measurement, not a recoverable picture.
Technical view — breaking the ceiling
Bar chart, accuracy (axis 60–90%): hill-climb 73.05% · full-image (1024 px) 75.40% · gradient (64 px) 83.30% · 2-plane (64 px) 88.80%
The ceiling-break, exactly as reported: hill-climb plateaus at 73.05%; real gradient training clears it to 83.30% (single plane); a 2-plane cascade reaches 88.80% — all at 16× sensor compression with the same tiny decoder.
Read this number honestly
The 83.30% > 75.40% gap is a statement about feature separability under a deliberately weak nearest-centroid decoder, not "optics beat digital." A small CNN on the same 1,024 raw pixels reaches ~99% and beats both. PhotonLayer's win is compression + privacy-by-physics + auditability, not raw accuracy.
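To see why a nearest-centroid decoder counts as "deliberately weak," it helps to look at how little it does. The sketch below is a generic nearest-centroid classifier, not the repo's decoder code; `fit_centroids` and `predict` are hypothetical names. It stores one mean vector per class and picks the closest one, so accuracy depends almost entirely on how well the features handed to it are already separated.

```rust
/// One mean feature vector per class, computed from labelled samples.
fn fit_centroids(samples: &[(Vec<f64>, usize)], classes: usize, dim: usize) -> Vec<Vec<f64>> {
    let mut sums = vec![vec![0.0; dim]; classes];
    let mut counts = vec![0usize; classes];
    for (x, label) in samples {
        counts[*label] += 1;
        for (s, v) in sums[*label].iter_mut().zip(x) {
            *s += *v;
        }
    }
    for (sum, n) in sums.iter_mut().zip(&counts) {
        if *n > 0 {
            sum.iter_mut().for_each(|s| *s /= *n as f64);
        }
    }
    sums
}

/// Predict the class whose centroid is nearest in squared Euclidean distance.
fn predict(centroids: &[Vec<f64>], x: &[f64]) -> usize {
    let mut best = (0, f64::INFINITY);
    for (class, centroid) in centroids.iter().enumerate() {
        let d: f64 = centroid.iter().zip(x).map(|(a, b)| (a - b) * (a - b)).sum();
        if d < best.1 {
            best = (class, d);
        }
    }
    best.0
}

fn main() {
    // Toy 2-D features standing in for sensor measurements.
    let samples = vec![
        (vec![0.0, 0.0], 0),
        (vec![0.0, 2.0], 0),
        (vec![10.0, 10.0], 1),
        (vec![10.0, 12.0], 1),
    ];
    let centroids = fit_centroids(&samples, 2, 2);
    println!("[1, 1]  -> class {}", predict(&centroids, &[1.0, 1.0]));
    println!("[9, 10] -> class {}", predict(&centroids, &[9.0, 10.0]));
}
```

With a decoder this simple, an accuracy gain says the optical front end produced better-separated features, which is exactly the narrow claim the page makes.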
06 The "oh — that's what it's for" moment

Priya runs a small recycling line

One named, ordinary person. A real before → after. No optics degree required.

Priya runs a small recycling line. A camera over the conveyor belt has to spot which bin each item belongs in. She's not an optics engineer; she just needs the sorter to be fast, cheap, and not a privacy headache — the belt sometimes carries documents and personal items.

Friendly view — Priya's sorting line
A warm illustration of a small recycling line: a conveyor with a bottle, paper and can, a smart prismatic-lens camera above sorting them into four colored bins.
A bottle, a sheet of paper, a can go past. A small smart-lens camera sorts each into one of four bins — without ever keeping a photo of what went by.
Before

The camera grabs a full image of every item — thousands of pixels each — and a computer chews through all of them to decide "plastic / paper / metal / glass." That's a lot of data per second, a chunky processor, and a stored photo of everything that went past, including the personal stuff. It works, but it's heavy, power-hungry, and the stored pictures make her lawyer nervous.

After PhotonLayer

Priya's team designs a learned phase mask for exactly this 4-way sort. Now the sensor captures a few numbers per item — a measurement already shaped to separate the four classes that does not look like the item (you can't read a document off it). A tiny decoder reads "paper" off those numbers. Each run also emits a signed receipt, so she can prove to an auditor exactly what was measured.

Technical view — what Priya's measured run actually showed
item → learned (mask) → 2×2 sensor · 64× fewer → "paper"
accuracy at this 64× reduction: learned 0.988 · random 0.738 · naïve subsample: lost
source: photonlayer-core compression-sweep table (2×2 sensor, 64× reduction)
⛓ + signed receipt → prove exactly what was measured
In the repo's own measured sweep, the learned few-pixel front end hits ~99% on a 4-class sort where a random mask gets ~74% and naïve sub-sampling loses the signal entirely. Each run emits a verifiable receipt.

The "oh, that's what it's for" line: It's the difference between a camera that photographs the scene and then thinks about it, and one where the lens already did the thinking — so all that's left to capture is the answer.

The honest hedge the repo itself demands
The high-accuracy numbers come from a noise-free simulation with a deliberately weak decoder: a statement about feature separability, not "optics beat digital." A small CNN on the raw pixels still wins on raw accuracy. Priya's win is compression + privacy-by-physics + auditability, not "more accurate than a normal camera."
08 "I already have a camera + AI — why this too?"

Why this vs. the tools you already have

Answered head-on: vs a normal camera+CNN, and vs "an optical neural network."

| You might already use… | What PhotonLayer changes |
| --- | --- |
| A normal camera + a CNN | A normal camera captures the whole scene then computes. PhotonLayer captures far less by doing the first transform in the optics, and stores a measurement that need not look like the scene. The repo's words: "sees enough to decide, but captures far less than a full image." |
| "An optical neural network" | The repo deliberately positions itself as NOT that. Its wedge is "auditable optical compression for task-useful sensing": narrower and "far more defensible." Multi-layer 97–99% diffractive networks are explicitly out of scope. It doesn't compete on raw accuracy. |
| Any of the above | The one thing nothing else here gives you: bit-reproducible, signed receipts. Every run hashes its inputs/outputs into a tamper-evident digest, so a result is a re-runnable experiment, not a screenshot. That's the stated moat. |
Friendly view
The before/after split once more: heavy full-photo camera versus a calm few-dots sensor with a prismatic lens.
It's not "a better camera." It's a different deal: capture less, store a measurement not a photo, prove it with a receipt.
Technical view — the defensible wedge
Axes: data captured (→ less) vs. auditable / provable (→ more). Plotted: camera + CNN · multi-layer optical net · PhotonLayer (compressed + auditable)
PhotonLayer's wedge: it gives up the raw-accuracy race to own the corner nothing else here occupies — heavily compressed and provable.
09 Q6 · How would you implement it?

What's actually inside — the four real crates

PhotonLayer is Rust crates, not a zip-of-models. Here's the real file-tree, each part in plain English.

One Rust workspace, four crates. This is the actual layout (from the README "Crates" table and Cargo.toml) — annotated so you know what each one is for.

PhotonLayer/                 # one Rust workspace, four crates · MIT © Ruvector
├ crates/
│ ├ photonlayer-core        # the optical simulator: scalar diffraction (Fresnel/
│ │                         # Fraunhofer/angular-spectrum), deterministic FFT, phase
│ │                         # mask, sensor, metrics, signed receipts ← the heart
│ ├ photonlayer-bench       # the experiments: learned-vs-random masks, the real-MNIST
│ │                         # compression benchmark, gradient + cascade training,
│ │                         # privacy probe ← the proof
│ ├ photonlayer-wasm        # browser build: run the whole thing in your browser, no install
│ └ photonlayer-cli         # command-line driver: bench / barcode / edge / privacy-gate /
│                           # verify-receipt ← the easy front door
├ examples/                  # hello_optics, compression, receipt, propagation_modes …
├ docs/                      # ADR-260/261 (core), ADR-263 (FiberGate roadmap)
├ Cargo.toml                 # workspace · version 0.1.0 · edition 2021
└ README.md                  # the dense original; this site is its plain-English handle

Use it in your own Rust project

Add the core crate and call the simulator directly. Re-running yields a bit-identical frame_hash — that's the determinism guarantee.

$ cargo add photonlayer-core
use photonlayer_core::prelude::*;

let frame = ScalarSimulator.simulate(&img, &mask, &cfg)?;
// re-run with the same inputs → identical frame.frame_hash
// build & verify a tamper-evident receipt of exactly what ran:
let receipt = Receipt::build(&img, &mask, &cfg, &frame);
assert!(receipt.verify());
Friendly view — one engine, three ways to drive it
⚙️ core · 🧪 bench (proof) · ⌨️ cli (front door) · 🌐 wasm (no install)
Think of core as the engine and bench / cli / wasm as three ways to put your hands on it — prove it, type at it, or click it in a browser.
Technical view — how the four crates fit together
photonlayer-core: the heart · optics + receipts
photonlayer-bench: the proof / experiments
photonlayer-cli: the easy front door
photonlayer-wasm: browser, no install
core is the engine. bench proves it, cli drives it from a terminal, and wasm runs it in a browser — every front door calls the same deterministic core.
10 Q7 · How do you start?

Three ways to start — all real, all in the README

Pick by how much you want to install: zero, a 30-line tour, or a dependency.

  1. No install — try it in your browser. Open the live demo ↗ — it runs entirely via WASM. Shape light through a mask, watch it compress to a tiny measurement, and verify the deterministic receipt.
  2. 30-line local tour. Run the minimal pipeline and see that a re-run is bit-identical:
    $ cargo run --release --example hello_optics -p photonlayer-core
  3. Use it in your own Rust project. cargo add photonlayer-core, then call ScalarSimulator.simulate(&img, &mask, &cfg) — re-running yields a bit-identical frame_hash.
Friendly view
Light passing through patterned smart glass — the thing you get to play with the moment you open the live demo.
The fastest start needs nothing installed: open the live demo and shape light through a mask in your browser.
Technical view — three on-ramps, by how much you install
0 · BROWSER: live demo (WASM), no install
1 · TOUR: hello_optics, 30-line local run
2 · BUILD: cargo add, into your Rust project
each one re-runs bit-identically · the determinism guarantee
Pick your on-ramp by how much you want to install — and every one yields a bit-identical re-run.
If you want the real-data numbers
MNIST isn't bundled. Fetch the IDX files into crates/photonlayer-bench/data/mnist/ first (see each test header). Without them, the examples skip cleanly (never panic) and run on synthetic data.

Open the repo on GitHub ↗

11 Take it with you · a studio to explore, one download to keep

The NotebookLM studio & the drop-in: watch, listen, read — then take the whole thing

A public NotebookLM studio (🎧 audio · 🎬 video · 🖼 slides · 📊 infographic · 📄 report) plus a one-zip drop-in: for-humans/ (primer + studio media) and for-ai/ (a searchable KB you wire into Claude Code in three steps).

Generated with Google NotebookLM · grounded in the same source repo

A full NotebookLM studio for PhotonLayer

Audio overview, an explainer video, a slide deck, an infographic and a written report — all in one place, all built for a curious newcomer. No account, no setup: just open it and explore.

  • 🎧 Audio overview
  • 🎬 Explainer video
  • 🖼 Slide deck
  • 📊 Infographic
  • 📄 Written report

🎬 Explainer video

A short, plain-language walkthrough — what PhotonLayer is, the before→after, why it matters, how to start.

📊 The whole idea on one page

The before→after at a glance: a camera that records every pixel vs. light pre-shaped so a tiny sensor reads only the answer — 16× less data, privacy by physics, a signed receipt.

PhotonLayer infographic: AI that sees the answer, not the image. Traditional sensing captures everything and thinks later, leaking privacy and burning compute; PhotonLayer uses a learned phase mask to shape light so a tiny sensor captures the answer in a few numbers — 16x less data, 88.8% accuracy, with a signed receipt for trusted audits.

🖼 Slide deck for a newcomer

A 12-slide deck that walks the idea from "what is this?" to "how do I try it?" — built for someone seeing optical AI for the first time.

PhotonLayer slide deck cover: The camera that captures the answer, not the picture — an introduction to PhotonLayer, the deterministic optical-AI front end.
Friendly view — what you actually get when you unzip it

This is the real file-tree of the download — every file, in plain English. The studio media is highlighted: play the audio overview first, it's the gentlest way in.

photonlayer-dropin/                    # one zip · two halves · grounded in ruvnet/PhotonLayer @ fe86c9f
├ for-humans/                          # read & listen first; no setup needed
│ ├ photonlayer-primer.md              # the 7-question plain-English orientation
│ └ studio/                            # NotebookLM media; start here
│   ├ 🎧 photonlayer-audio.m4a         # AUDIO OVERVIEW: play this FIRST (Priya's story, ~deep-dive)
│   ├ 📊 photonlayer-infographic.png   # the whole idea on one page (before→after)
│   ├ 🖼 photonlayer-slides.pdf        # newcomer-friendly deck
│   ├ 📄 photonlayer-report.md         # the written deep-dive briefing (figures + honest limits)
│   └ audio-overview-prompt.md         # the optimized prompt that generated the audio
└ for-ai/                              # the searchable knowledge pack; wire into your agent
  ├ photonlayer-kb.rvf                 # single 384-dim bge-small HNSW knowledge base ← the brain
  ├ photonlayer-kb.rvf.embed.json      # embedder config, REQUIRED so queries use the right model
  ├ …passages.jsonl · …ids.json        # full passage text + per-id index
  ├ ask-kb.mjs · kb-mcp-server.mjs     # CLI query + MCP server for Claude Code ← the front doors
  └ kb.config.mjs · resolve-deps.mjs · package.json   # run `npm i` here

Inside for-humans/studio/ — start here

  • 🎧 photonlayer-audio.m4a: a NotebookLM audio overview that walks Priya's recycling-line story end to end. Play this first.
  • 📊 photonlayer-infographic.png: the whole idea on one landscape page, the before→after of raw camera data vs. pre-shaped light.
  • 🖼 photonlayer-slides.pdf: a newcomer-friendly deck covering what PhotonLayer is, why it matters, and how to start.
  • 📄 photonlayer-report.md: the written deep-dive covering the four crates, the measured results, and the honest limits, all grounded in the repo.

The explainer video is large, so it lives in the public NotebookLM studio (linked from the zip's README) rather than inside the download.

Download the drop-in zip ↓ Self-contained · KB passed Gate A (tuned 99.7 / held-out 100). No account, no cloud.

Three steps to wire the AI half into Claude Code

  1. Unzip & install:
    $ unzip photonlayer-dropin.zip && cd photonlayer-dropin/for-ai && npm install
  2. Point Claude Code at it with a .mcp.json in your project root that runs for-ai/kb-mcp-server.mjs.
  3. Add a one-line gate to CLAUDE.md: "Before answering anything about PhotonLayer, query the photonlayer-kb tool and ground the answer in what it returns."

Prefer the terminal? Query it directly, no Claude Code needed:

$ node ask-kb.mjs photonlayer "what does PhotonLayer actually do"
Technical view — how the two halves serve two audiences from one download
photonlayer-dropin.zip (one download)
→ for-humans/: primer.md · 🎧 audio overview · 📄 report → a person reads & listens
→ for-ai/: photonlayer-kb.rvf + MCP server · ask-kb.mjs → Claude Code searches it
One zip, two audiences: for-humans/ is a primer plus a studio (🎧 audio + 📄 report) a person reads and listens to; for-ai/ is the same knowledge as a searchable .rvf + MCP server your assistant queries — so it answers from the real source instead of guessing.
12 In the repo's own words — don't soften them

Honest limits

PhotonLayer is unusually candid about what it does and doesn't prove. So is this page.

Friendly view — what's real vs. what's a goal
Proven today ✓: deterministic Rust sim · BLAKE3 receipts · 16× / 64× compression · 83.3% ceiling-break · runs offline
On the roadmap ◷: real glass hardware · noise / quantization · cross-platform bits · formal privacy proof · beating a CNN
The left column is real and reproducible right now; the right column is honest future work the repo names itself.
Technical view — where each claim's confidence comes from
Outer box (dashed): real-hardware regime, not yet characterized
Inner box: simulation-verified · noise-free scalar diffraction · continuous phase · x86-64 bit-identical · accuracy = feature-separability claim
Every number on this page lives in the green inner box (simulated, reproducible). The dashed outer box — real glass, sensor noise, fabrication — is explicitly unproven.
It's a simulator, not hardware (yet)
"Today this is a software simulator written in Rust … building the actual glass is on the roadmap." Real hardware is expected to degrade the numbers.
Not a new accuracy state-of-the-art
A single task-trained optical layer + tiny decoder = competitive single-layer optical compression, not SOTA. The gap vs the full-image baseline is feature-separability under a deliberately weak nearest-centroid decoder; a small CNN on the same raw pixels reaches ~99% and beats both.
No privacy or security guarantee
It stores a learned measurement, not the raw image: "a description, not a theorem." The bundled probe measures linear invertibility only; nonlinear (CNN/U-Net) reconstruction is expected to succeed. Never read it as "cannot be reconstructed" or "privacy-preserving."
The "16× MAC reduction" counts the digital decoder only (640 vs 10,240)
The optical FFT-scale transform is passive in real hardware but not free in this simulator, and is not counted.
All figures are noise-free scalar-diffraction simulation with continuous phase
Robustness to phase quantization, sensor noise, and fabrication error is not yet characterized (a quantization/SNR ablation is on the roadmap).
Determinism is verified within runs/builds on x86-64, not yet cross-platform
Full Linux/macOS/WASM bit-identity is a design goal, not yet proven. The open obstacle is platform libm transcendentals (sin/cos/atan2) differing by a ULP.
MNIST data isn't bundled
The real-data runs need you to fetch the IDX files first; without them the examples skip cleanly and run on synthetic data.
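On the cross-platform determinism caveat above: the sketch below shows why a single-ULP difference matters for a bit-reproducible hash. It is a generic floating-point illustration, not code from the repo, and `ulp_distance` is a hypothetical helper. Two values can agree to any sensible tolerance and still differ in their raw bits, and a digest is computed over the bits.

```rust
/// Distance in units-in-the-last-place between two finite f64 values
/// of the same sign, measured on their raw bit patterns.
fn ulp_distance(a: f64, b: f64) -> u64 {
    a.to_bits().abs_diff(b.to_bits())
}

fn main() {
    // Classic example: 0.1 + 0.2 is not bit-identical to 0.3.
    let a = 0.1_f64 + 0.2_f64;
    let b = 0.3_f64;
    println!("approximately equal: {}", (a - b).abs() < 1e-12);
    println!("bit-identical:       {}", a.to_bits() == b.to_bits());
    println!("ulp distance:        {}", ulp_distance(a, b));
}
```

If one platform's sin or cos returns a value one ULP away from another's, every downstream byte fed to the hash can differ, so the receipts would no longer match even though the physics is numerically the same.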