CountAR
Type, and Voir counts — screws on a workbench, almonds in a bowl, cells under a lens.
What it does
Type the noun. The camera draws a numbered halo over every instance of that noun in view. Move the camera, the halos follow. Walk to the next surface, the count resumes. The instrument tracks the same object across frames even as it leaves and re-enters the viewport.
How it works
An open-vocabulary detector takes the noun as a prompt and returns bounding boxes. A tracker assigns persistent IDs to the boxes across frames using simple IoU-based (intersection-over-union) association, so a screw counted at frame 0 stays screw #4 at frame 60, even if you nudge the workbench.
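A minimal sketch of what IoU-based association could look like, in Python for brevity. This is not the CountTracker code: the names `iou`, `IouTracker`, `update`, and the 0.3 threshold are all assumptions made for illustration, and the matching here is a plain greedy pass over box pairs sorted by overlap.

```python
from dataclasses import dataclass, field
from itertools import count

# A box is (x1, y1, x2, y2) in pixel coordinates (assumed layout).
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class IouTracker:
    """Hypothetical tracker: greedy IoU matching between consecutive frames."""
    threshold: float = 0.3  # assumed minimum overlap to count as the same object
    tracks: dict[int, Box] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def update(self, detections: list[Box]) -> dict[int, Box]:
        # Score every (existing track, new detection) pair, best overlap first.
        pairs = sorted(
            ((iou(box, det), tid, i)
             for tid, box in self.tracks.items()
             for i, det in enumerate(detections)),
            reverse=True,
        )
        assigned: dict[int, Box] = {}
        used_dets: set[int] = set()
        for score, tid, i in pairs:
            if score < self.threshold:
                break  # remaining pairs overlap too little to match
            if tid in assigned or i in used_dets:
                continue  # each track and each detection is matched at most once
            assigned[tid] = detections[i]
            used_dets.add(i)
        # Any detection left unmatched starts a new track with a fresh ID.
        for i, det in enumerate(detections):
            if i not in used_dets:
                assigned[next(self._ids)] = det
        # Tracks with no match this frame are dropped in this sketch.
        self.tracks = assigned
        return assigned
```

Calling `update` once per frame with that frame's boxes returns a mapping from ID to box. Because this sketch drops unmatched tracks immediately, an object that leaves the view and comes back would get a new ID; keeping IDs across re-entry would need extra state (for example, retaining lost tracks for some number of frames), which is not shown here.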
What it's built on
YOLO-World for open-vocab detection. A small CountTracker actor for ID persistence. Core ML for on-device inference. The COCO class vocabulary as a fallback when the prompt doesn't have a matching open-vocab head.