Want camera features that feel magical without sending frames to a server?
In 2025, you can run OCR, face blur, and pose tracking entirely on-device in React Native: fast, battery-aware, and privacy-first.
This guide gives you the architecture, code patterns, and performance tricks to ship a production-ready AI camera.
🧠 What we're building
* A Vision Camera screen with frame processors (JSI)
* OCR (text detection) for instant read/scan
* Face Blur (privacy by default)
* Pose (keypoints, angles, simple rep counter)
* Overlays (Skia/Reanimated) that don't jank
* Model packaging, GPU/NNAPI delegates, throttling & thermal safety
Tech stack: react-native-vision-camera (frame processors) + your choice of ML Kit / MediaPipe / TFLite runners behind a tiny JSI plugin.
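Here's what that wiring looks like in practice: a minimal sketch of a Camera screen with a JSI frame processor, assuming react-native-vision-camera v3+. The `detectText` plugin name is illustrative; you'd register your own native plugin (ML Kit, MediaPipe, or TFLite behind it) on the iOS/Android side.

```typescript
import React from 'react';
import {
  Camera,
  useCameraDevice,
  useFrameProcessor,
  VisionCameraProxy,
  type Frame,
} from 'react-native-vision-camera';

// Hypothetical native plugin — register the real one natively via
// a Frame Processor Plugin (see VisionCamera's plugin docs).
const plugin = VisionCameraProxy.initFrameProcessorPlugin('detectText', {});

function detectText(frame: Frame): { text: string; box: number[] }[] {
  'worklet';
  if (plugin == null) throw new Error('detectText plugin not registered');
  return plugin.call(frame) as { text: string; box: number[] }[];
}

export function AICameraScreen() {
  const device = useCameraDevice('back');

  const frameProcessor = useFrameProcessor((frame) => {
    'worklet';
    // Runs per frame on the frame-processor thread, not the JS thread.
    const results = detectText(frame);
    console.log(`found ${results.length} text blocks`);
  }, []);

  if (device == null) return null;
  return (
    <Camera
      style={{ flex: 1 }}
      device={device}
      isActive={true}
      frameProcessor={frameProcessor}
    />
  );
}
```

The `'worklet'` directives matter: they let VisionCamera run the function via JSI without hopping through the React Native bridge.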
🧱 Architecture (bird's-eye view)
[Camera (VisionCamera)]
└── Frame Processor (JSI, runs on a dedicated frame-processor thread)
    ├── OCR Runner (ML Kit / TFLite)
    ├── Face Detector (MediaPipe)
    └── Pose Detector (MoveNet / BlazePose)
        ↳ returns lightweight structs (boxes, keypoints)
[Overlays]
├── Skia / Reanimated Canvas (draw off the main JS thread where possible)
└── Throttled state bridge (SharedValues / atomic refs)
[Model Assets]
├── Packaged in the app (first run) OR lazy-downloaded (signed)
└── GPU delegates: Metal (iOS), NNAPI/GPU (Android)
Key idea: Do the heavy math inside the frame processor (JSI/native) and pass only minimal results (arrays of floats/ints) to JS for drawing. That keeps FPS high.
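Concretely, "minimal results" means flattening detector output into typed arrays and rate-limiting how often the UI state updates. A small sketch of both ideas (helper names are my own, not a library API):

```typescript
// Flatten detector output into one typed array: [x0, y0, score0, x1, ...].
// A single Float32Array crosses the JSI boundary far cheaper than
// an array of nested objects.
export interface Keypoint { x: number; y: number; score: number }

export function flattenKeypoints(kps: Keypoint[]): Float32Array {
  const out = new Float32Array(kps.length * 3);
  for (let i = 0; i < kps.length; i++) {
    out[i * 3] = kps[i].x;
    out[i * 3 + 1] = kps[i].y;
    out[i * 3 + 2] = kps[i].score;
  }
  return out;
}

// Throttle gate: returns true at most once per `intervalMs`, so overlay
// state updates at ~30 Hz even if frames arrive at 60+ fps.
// `now` is injectable for testing.
export function makeThrottleGate(intervalMs: number, now: () => number = Date.now) {
  let last = -Infinity;
  return (): boolean => {
    const t = now();
    if (t - last < intervalMs) return false;
    last = t;
    return true;
  };
}
```

Inside the frame processor you'd call the gate first and skip writing to shared state entirely on throttled frames; the detector can still run every frame if the use case needs it.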
🧪 Choosing your model runners (quick cheat sheet)
| Task | Best "it just works" | Best speed | Notes |
|------|----------------------|------------|-------|
| OCR  | ML Kit Text | TFLite OCR | ML Kit is easy, small, good accuracy |
| Face | MediaPipe Face | MediaPipe | Robust, fast, GPU-friendly |
| Pose | MoveNet (TFLite) | BlazePose | MoveNet Lightning is tiny & fast |
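Whichever pose model you pick, the "angles and simple rep counter" promised earlier is pure math over its keypoints. A minimal sketch, assuming MoveNet-style `{x, y}` keypoints (thresholds are illustrative defaults, tune per exercise):

```typescript
export interface Pt { x: number; y: number }

// Angle at joint B formed by segments B→A and B→C, in degrees (0–180).
// e.g. elbow angle = jointAngle(wrist, elbow, shoulder).
export function jointAngle(a: Pt, b: Pt, c: Pt): number {
  const abx = a.x - b.x, aby = a.y - b.y;
  const cbx = c.x - b.x, cby = c.y - b.y;
  const dot = abx * cbx + aby * cby;
  const mag = Math.hypot(abx, aby) * Math.hypot(cbx, cby);
  if (mag === 0) return 0; // degenerate: coincident points
  const cos = Math.min(1, Math.max(-1, dot / mag)); // clamp float noise
  return (Math.acos(cos) * 180) / Math.PI;
}

// Hysteresis rep counter: count a rep only on the up-transition after the
// angle dipped below `downDeg` and then rose above `upDeg`. The gap between
// the two thresholds prevents jittery keypoints from double-counting.
export function makeRepCounter(downDeg = 90, upDeg = 160) {
  let phase: 'up' | 'down' = 'up';
  let reps = 0;
  return (angleDeg: number): number => {
    if (phase === 'up' && angleDeg < downDeg) phase = 'down';
    else if (phase === 'down' && angleDeg > upDeg) { phase = 'up'; reps++; }
    return reps;
  };
}
```

Feed it the per-frame elbow (or knee) angle and render `reps` in the overlay; no extra model needed.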