voir / lens · i

The instrument that sees and is seen.

An instrument that addresses the room, not the screen. A presence held in space, with eye contact returned. A conversation that does not leave the device unless you decide it should.

What it does

Speak into the room. The instrument hears the field around the words — reverberation, distance, the second voice at the kitchen table. The answer is delivered in the same register the room offered up.

The presence holds eye contact when you look back. It mouths what it says — not as performance, as fidelity to the moment that produced the line.

How it works

Spatial intelligence runs first — surfaces, depth, hand pose, the body in the chair. A live transcript becomes a viseme stream. The rig drives a face that knows where you are standing. The map stays on the device; the conversation stays in the room.
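The transcript-to-viseme step can be sketched in a few lines. This is an illustration only, assuming viseme events arrive the way Azure Speech reports them: a viseme ID from 0 to 21 (0 is silence) and an audio offset in 100-nanosecond ticks. The names `VisemeEvent` and `VisemeTimeline` are hypothetical and are not taken from the voir codebase; the call that actually poses the rig is left out.

```swift
import Foundation

/// One viseme event: which mouth shape, and when it starts in the audio.
/// Hypothetical type; the field meanings follow Azure Speech viseme events.
struct VisemeEvent {
    let visemeID: Int            // 0...21, where 0 is silence
    let audioOffsetTicks: Int64  // 100-nanosecond ticks from the start of the audio

    /// Start time in seconds (10,000,000 ticks per second).
    var startSeconds: Double { Double(audioOffsetTicks) / 10_000_000.0 }
}

/// Keeps events sorted by time and answers "which viseme is active at time t?".
struct VisemeTimeline {
    private let events: [VisemeEvent]

    init(_ events: [VisemeEvent]) {
        // Events may arrive out of order, so sort once up front.
        self.events = events.sorted { $0.audioOffsetTicks < $1.audioOffsetTicks }
    }

    /// Returns the ID of the last viseme that started at or before `time`.
    /// Before the first event, the mouth rests on silence (ID 0).
    func visemeID(at time: Double) -> Int {
        var low = 0
        var high = events.count
        // Binary search for the first event that starts after `time`.
        while low < high {
            let mid = (low + high) / 2
            if events[mid].startSeconds <= time {
                low = mid + 1
            } else {
                high = mid
            }
        }
        return low == 0 ? 0 : events[low - 1].visemeID
    }
}
```

On each rendered frame, a caller would ask the timeline for `visemeID(at:)` using the current audio playback time and hand the result to whatever maps viseme IDs onto the rig's mouth poses.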

What it's built on

ARKit world tracking. RealityKit composition. The MPFB avatar rig. Azure Speech visemes. ElevenLabs synthesis. Cross-mode persistence through VoirCompanionStore.
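For the ARKit layer, a world-tracking session that also produces a room mesh and per-frame depth might be configured roughly as follows. This is a sketch using standard ARKit calls, not the instrument's actual setup; the function name `makeRoomTrackingConfiguration` is hypothetical.

```swift
import ARKit

/// Builds a world-tracking configuration, enabling mesh and depth where the device supports them.
func makeRoomTrackingConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = [.horizontal, .vertical]

    // Scene reconstruction (the room mesh) requires a LiDAR-equipped device.
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
        configuration.sceneReconstruction = .mesh
    }

    // Per-frame depth maps, also backed by the LiDAR scanner.
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
        configuration.frameSemantics.insert(.sceneDepth)
    }
    return configuration
}
```

An `ARSession` would then be started with `session.run(makeRoomTrackingConfiguration())`. The capability checks let the same code fall back to plain world tracking on devices without LiDAR.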

spatial intelligence · voice in the room · viseme-driven presence · persistent anchor across modes — every line of it runs on the device, in your space, in real time.

DOSSIER · LENS — FILE 0001 / ΔCLEARANCE · PUBLIC FACSIMILE

Lens of Record

FORM FACTOR · iPhone Pro, held at chest height.
OPTICS · One wide. One ultra-wide. One LiDAR.
OBSERVER · You. Singular.
LATENCY · Frame to viseme: under 80 ms.
RANGE · The room around you. The hand in front.
VOICE · Yours, in register. Theirs, in answer.

One person. One room. One minute kept on the record because you chose to keep it.

SPECIMEN REGISTRY · L · LENS APPARATUS
L-001 · Pinhole frustum
L-002 · Optical stack
L-003 · Iris, nine blades
L-004 · Room mesh, one chair
L-005 · Starfield, isotropic

DISCIPLINES — FOUR / SEVENTEEN

What the lens listens for.

  1. i · Optical witness
     The room is mapped before the question is asked. Light, depth, surface normal, hand pose: all kept on-device and never named back to you.
     OUTPUT · room mesh, pose graph
  2. ii · Acoustic witness
     Speech is heard inside its room. Reverb, distance, two voices in a small kitchen: the field is part of the meaning.
     OUTPUT · transcript, viseme stream
  3. iii · Embodied reply
     The avatar mouths what it says. Eye contact when you look back. Blink every six to eight seconds. The reply is not on a screen; the reply is in the room.
     OUTPUT · MPFB rig, real-time
  4. iv · Cross-mode memory
     A conversation begun in SpaciAR survives into LearnAR. The lens carries the thread. The chambers do not start from zero.
     OUTPUT · VoirCompanionStore

SPECIFICATION — REDUCED FACSIMILE

The instrument hears the room around the words, not just the words inside it. The instrument answers in your register, not its own. The conversation is private until you decide it is not.

— voir / charter, line 4