Production-ready React Native library for intelligent image analysis using Google ML Kit (on-device)
```bash
npm install mediversal-rn-image-intelligence
```

A production-ready React Native library for intelligent image analysis using Google ML Kit's on-device APIs. Detect faces, extract text, and identify objects in images—all processed locally on the device for maximum privacy and performance.
This library provides a simple interface to Google ML Kit's powerful machine learning capabilities for mobile applications. You can detect human faces with detailed metadata, extract printed text using optical character recognition, and identify objects with confidence scores—all without sending any data to external servers.
Since all processing happens entirely on-device, your users' data never leaves their device. There's no need for an internet connection, no data is uploaded to external servers, and you get fast, reliable results even in offline scenarios.
Face Detection
Detect human faces in images with comprehensive metadata including smiling probability, eye open probability, and head rotation angles. Perfect for selfie validation, group photo analysis, and identity verification workflows.
Text Recognition (OCR)
Extract printed text from images with high accuracy. Ideal for document scanning, business card reading, receipt processing, and any scenario where you need to digitize text from photos.
Object Detection
Identify and classify objects in images with confidence scores. Recognize common objects like animals, vehicles, furniture, food items, and more. Each detected object includes a bounding box, tracking ID, and multiple classification labels.
Privacy-First Architecture
All image processing happens on-device using Google ML Kit. No images or extracted data are ever transmitted to external servers, ensuring complete privacy and GDPR compliance.
Cross-Platform Support
Works seamlessly on both iOS and Android with a unified API. Built with TurboModule specifications for the new React Native architecture.
Performance Optimized
Designed for mobile performance with efficient processing and the option to run face detection, text recognition, and object detection in parallel.
TypeScript Support
Fully typed API provides excellent autocomplete and type safety in your development environment.
Install the package using npm:
```bash
npm install mediversal-rn-image-intelligence
```
Or if you prefer yarn:
```bash
yarn add mediversal-rn-image-intelligence
```
After installing the package, you'll need to install the CocoaPods dependencies. Navigate to your iOS directory:
```bash
cd ios
pod install
cd ..
```
Add the required permissions to your ios/YourApp/Info.plist file:
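The exact keys depend on how your app obtains images. If it captures photos with the camera and reads them from the photo library, a typical set looks like this (adjust the usage strings for your app):

```xml
<!-- Shown to the user the first time your app requests camera access -->
<key>NSCameraUsageDescription</key>
<string>This app uses the camera to capture images for analysis.</string>
<!-- Shown to the user the first time your app requests photo library access -->
<key>NSPhotoLibraryUsageDescription</key>
<string>This app reads photos from your library so they can be analyzed.</string>
```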
Requirements for iOS:
- iOS 12.0 or higher
- Xcode 12.0 or higher
- CocoaPods 1.10 or higher
First, make sure your minimum SDK version is set to 21 or higher in android/build.gradle:
```gradle
buildscript {
  ext {
    minSdkVersion = 21
    compileSdkVersion = 33
    targetSdkVersion = 33
  }
}
```
Update your android/app/build.gradle to use Java 17:
```gradle
android {
  compileOptions {
    sourceCompatibility JavaVersion.VERSION_17
    targetCompatibility JavaVersion.VERSION_17
  }

  kotlinOptions {
    jvmTarget = "17"
  }
}
```
Add the necessary permissions to android/app/src/main/AndroidManifest.xml:
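A typical set, assuming the app captures photos and reads images from device storage (on Android 13 and above, READ_MEDIA_IMAGES replaces READ_EXTERNAL_STORAGE for image access):

```xml
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission
  android:name="android.permission.READ_EXTERNAL_STORAGE"
  android:maxSdkVersion="32" />
<uses-permission android:name="android.permission.READ_MEDIA_IMAGES" />
```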
Remember to request these permissions at runtime in your application code:
```typescript
import { PermissionsAndroid, Platform } from 'react-native';

async function requestPermissions() {
  if (Platform.OS === 'android') {
    await PermissionsAndroid.requestMultiple([
      PermissionsAndroid.PERMISSIONS.CAMERA,
      PermissionsAndroid.PERMISSIONS.READ_EXTERNAL_STORAGE,
    ]);
  }
}
```
If you see a "16KB-compatible" warning during build, add this to android/gradle.properties:
`properties`
android.bundle.enableUncompressedNativeLibs=false
The simplest way to use the library is with the analyzeImage function. Pass it an image URI and it will return the analysis results:
```typescript
import { analyzeImage } from 'mediversal-rn-image-intelligence';

async function analyzeMyImage() {
  try {
    const result = await analyzeImage('file:///path/to/image.jpg');

    console.log('Contains face:', result.containsFace);
    console.log('Contains text:', result.containsPrintedText);
    console.log('Contains objects:', result.containsObjects);

    if (result.faces) {
      console.log('Number of faces:', result.faces.length);
    }

    if (result.printedText) {
      console.log('Extracted text:', result.printedText);
    }

    if (result.objects) {
      result.objects.forEach((obj, index) => {
        console.log(`Object ${index + 1}:`);
        obj.labels.forEach((label) => {
          console.log(`  ${label.text}: ${(label.confidence * 100).toFixed(1)}%`);
        });
      });
    }
  } catch (error) {
    console.error('Analysis failed:', error);
  }
}
```
You can customize the analysis behavior by passing an options object:
```typescript
import {
  analyzeImage,
  type AnalysisOptions,
} from 'mediversal-rn-image-intelligence';

const options: AnalysisOptions = {
  detectFaces: true,
  detectPrintedText: true,
  detectObjects: true,
  faceDetectionMode: 'accurate',
  minFaceSize: 0.15,
};

const result = await analyzeImage('file:///path/to/image.jpg', options);
```
detectFaces (boolean, default: true)
Enable or disable face detection. Set to false if you only need text or object recognition to improve performance.
detectPrintedText (boolean, default: true)
Enable or disable text recognition. Set to false if you only need face or object detection to improve performance.
detectObjects (boolean, default: true)
Enable or disable object detection. Set to false if you only need face or text detection to improve performance.
faceDetectionMode ('fast' | 'accurate', default: 'fast')
Choose between fast processing for real-time scenarios or accurate mode for higher quality detection. Use 'fast' for camera previews and 'accurate' for analyzing stored images.
minFaceSize (number, default: 0.1)
The minimum face size to detect, expressed as a proportion of the image's smaller dimension. Valid range is 0.0 to 1.0. Increase this value to filter out small or distant faces.
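For example, a fast, faces-only configuration suited to a camera-preview flow might look like this (a sketch built from the options above; the minFaceSize value of 0.2 is arbitrary and should be tuned for your use case):

```typescript
import { analyzeImage, type AnalysisOptions } from 'mediversal-rn-image-intelligence';

// Fast, faces-only pass: skip OCR and object detection, ignore small background faces.
const previewOptions: AnalysisOptions = {
  detectFaces: true,
  detectPrintedText: false,
  detectObjects: false,
  faceDetectionMode: 'fast',
  minFaceSize: 0.2,
};

const preview = await analyzeImage('file:///path/to/frame.jpg', previewOptions);
```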
This library works seamlessly with React Native image picker libraries. Here are examples using react-native-image-picker:
```typescript
import { launchImageLibrary } from 'react-native-image-picker';
import { analyzeImage } from 'mediversal-rn-image-intelligence';

async function pickAndAnalyze() {
  const response = await launchImageLibrary({
    mediaType: 'photo',
    quality: 1,
  });

  if (response.assets && response.assets[0].uri) {
    const result = await analyzeImage(response.assets[0].uri);
    console.log(result);
  }
}
```
```typescript
import { launchCamera } from 'react-native-image-picker';
import { analyzeImage } from 'mediversal-rn-image-intelligence';

async function captureAndAnalyze() {
  const response = await launchCamera({
    mediaType: 'photo',
    quality: 1,
  });

  if (response.assets && response.assets[0].uri) {
    const result = await analyzeImage(response.assets[0].uri);
    console.log(result);
  }
}
```
Here's a full working example of a React component that lets users select an image and displays comprehensive analysis results:
```typescript
import React, { useState } from 'react';
import {
  Text,
  Image,
  ScrollView,
  StyleSheet,
  Alert,
  Pressable,
  ActivityIndicator,
  View,
} from 'react-native';
import { SafeAreaProvider, SafeAreaView } from 'react-native-safe-area-context';
import { launchCamera, launchImageLibrary } from 'react-native-image-picker';
import {
  analyzeImage,
  type AnalysisOptions,
} from 'mediversal-rn-image-intelligence';

type Mode = 'face' | 'ocr' | 'object';

export default function AllFeaturesDemo() {
  const [mode, setMode] = useState<Mode>('face');
  const [imageUri, setImageUri] = useState<string | null>(null);
  const [result, setResult] = useState<any>(null);
  const [loading, setLoading] = useState(false);

  const pickImage = async (fromCamera: boolean) => {
    const res = fromCamera
      ? await launchCamera({ mediaType: 'photo' })
      : await launchImageLibrary({ mediaType: 'photo' });

    if (res.didCancel || !res.assets?.[0]?.uri) return;

    const uri = res.assets[0].uri;
    setImageUri(uri);
    setResult(null);
    setLoading(true);

    try {
      const options: AnalysisOptions = {
        detectFaces: mode === 'face',
        detectPrintedText: mode === 'ocr',
        detectObjects: mode === 'object',
        faceDetectionMode: 'accurate',
        minFaceSize: 0.15,
      };
      const analysis = await analyzeImage(uri, options);
      setResult(analysis);
    } catch (e) {
      Alert.alert('Error', 'Image analysis failed');
    } finally {
      setLoading(false);
    }
  };

  return (
    <SafeAreaProvider>
      <SafeAreaView style={styles.container}>
        <ScrollView contentContainerStyle={styles.scroll}>
          <Text style={styles.title}>Image Intelligence Demo</Text>

          {/* ---------- Mode Switch ---------- */}
          <View style={styles.tabRow}>
            <Tab
              label="Face"
              active={mode === 'face'}
              onPress={() => setMode('face')}
            />
            <Tab
              label="Text"
              active={mode === 'ocr'}
              onPress={() => setMode('ocr')}
            />
            <Tab
              label="Objects"
              active={mode === 'object'}
              onPress={() => setMode('object')}
            />
          </View>

          {/* ---------- Pick Buttons ---------- */}
          <View style={styles.buttonRow}>
            <Pressable style={styles.primaryBtn} onPress={() => pickImage(true)}>
              <Text style={styles.btnText}>Take Photo</Text>
            </Pressable>
            <Pressable style={styles.secondaryBtn} onPress={() => pickImage(false)}>
              <Text style={styles.secondaryText}>Pick From Gallery</Text>
            </Pressable>
          </View>

          {loading && (
            <View style={styles.loadingBox}>
              <ActivityIndicator size="large" />
              <Text style={styles.loadingText}>Analyzing image…</Text>
            </View>
          )}

          {imageUri && (
            <View style={styles.card}>
              <Image source={{ uri: imageUri }} style={styles.image} />
            </View>
          )}

          {/* ---------- Results ---------- */}
          {result && mode === 'face' && (
            <ResultCard
              title="Face Detection"
              value={
                result.containsFace
                  ? `${result.faces?.length ?? 0} face(s) detected`
                  : 'No face detected'
              }
            />
          )}

          {result && mode === 'ocr' && (
            <View style={styles.resultCard}>
              <Text style={styles.resultTitle}>Extracted Text</Text>
              <Text style={styles.resultText}>
                {result.printedText || 'No text found'}
              </Text>
            </View>
          )}

          {result && mode === 'object' && (
            <View style={styles.resultCard}>
              <Text style={styles.resultTitle}>Detected Objects</Text>
              {result.objects?.length > 0 ? (
                result.objects.map((obj: any, index: number) => {
                  const mainLabel = obj.labels?.[0];
                  return (
                    <View key={index} style={styles.objectItem}>
                      <Text style={styles.objectName}>
                        {mainLabel?.text ?? 'Unknown object'}
                      </Text>
                      {mainLabel?.confidence !== undefined && (
                        <Text style={styles.objectConfidence}>
                          {(mainLabel.confidence * 100).toFixed(1)}%
                        </Text>
                      )}
                    </View>
                  );
                })
              ) : (
                <Text style={styles.emptyText}>No objects detected</Text>
              )}
            </View>
          )}
        </ScrollView>
      </SafeAreaView>
    </SafeAreaProvider>
  );
}

/* ---------- Small UI Components ---------- */
function Tab({
  label,
  active,
  onPress,
}: {
  label: string;
  active: boolean;
  onPress: () => void;
}) {
  return (
    <Pressable
      onPress={onPress}
      style={[styles.tab, active && styles.tabActive]}
    >
      <Text style={[styles.tabText, active && styles.tabTextActive]}>
        {label}
      </Text>
    </Pressable>
  );
}

function ResultCard({ title, value }: { title: string; value: string }) {
  return (
    <View style={styles.resultCard}>
      <Text style={styles.resultTitle}>{title}</Text>
      <Text style={styles.resultValue}>{value}</Text>
    </View>
  );
}

/* ---------- Styles ---------- */
const styles = StyleSheet.create({
  container: {
    flex: 1,
    backgroundColor: '#F8FAFC',
  },
  scroll: {
    paddingBottom: 32,
  },
  title: {
    fontSize: 24,
    fontWeight: '800',
    margin: 16,
  },
  tabRow: {
    flexDirection: 'row',
    marginHorizontal: 16,
    backgroundColor: '#E5E7EB',
    borderRadius: 14,
    padding: 4,
  },
  tab: {
    flex: 1,
    paddingVertical: 10,
    borderRadius: 12,
    alignItems: 'center',
  },
  tabActive: {
    backgroundColor: '#0F172A',
  },
  tabText: {
    fontWeight: '700',
    color: '#334155',
  },
  tabTextActive: {
    color: '#FFFFFF',
  },
  buttonRow: {
    gap: 12,
    marginHorizontal: 16,
    marginTop: 12,
  },
  primaryBtn: {
    backgroundColor: '#2563EB',
    padding: 16,
    borderRadius: 14,
    alignItems: 'center',
  },
  secondaryBtn: {
    backgroundColor: '#E5E7EB',
    padding: 16,
    borderRadius: 14,
    alignItems: 'center',
  },
  btnText: {
    color: '#FFFFFF',
    fontWeight: '700',
    fontSize: 16,
  },
  secondaryText: {
    fontWeight: '700',
    fontSize: 16,
  },
  loadingBox: {
    marginTop: 24,
    alignItems: 'center',
  },
  loadingText: {
    marginTop: 8,
    color: '#475569',
  },
  card: {
    margin: 16,
    backgroundColor: '#FFFFFF',
    borderRadius: 16,
    padding: 12,
    elevation: 3,
  },
  image: {
    width: '100%',
    height: 260,
    borderRadius: 12,
  },
  resultCard: {
    marginHorizontal: 16,
    marginTop: 8,
    backgroundColor: '#FFFFFF',
    borderRadius: 16,
    padding: 16,
    elevation: 2,
  },
  resultTitle: {
    fontSize: 16,
    fontWeight: '700',
    marginBottom: 6,
  },
  resultValue: {
    fontSize: 15,
    fontWeight: '600',
  },
  resultText: {
    fontSize: 14,
    lineHeight: 20,
  },
  objectItem: {
    marginBottom: 8,
  },
  objectName: {
    fontWeight: '700',
    fontSize: 15,
  },
  objectConfidence: {
    color: '#2563EB',
    fontWeight: '600',
  },
  emptyText: {
    color: '#475569',
    fontWeight: '600',
  },
});
```
```typescript
analyzeImage(imageUri: string, options?: AnalysisOptions): Promise<AnalysisResult>
```
Analyzes an image and returns the detection results.
Parameters:
imageUri (string, required)
The local file URI pointing to the image you want to analyze. Supported URI formats vary by platform:
- Android: file://, content://, or absolute filesystem paths
- iOS: file://, ph://, assets-library://, or absolute filesystem paths
options (AnalysisOptions, optional)
Configuration object to customize the analysis behavior. See the Advanced Configuration section for available options.
Returns:
A Promise that resolves to an AnalysisResult object containing the detection results.
Throws:
An error if the image URI is invalid or if all enabled analyses fail.
```typescript
isAvailable(): Promise<boolean>
```
Checks whether the native modules are properly linked and available for use.
Returns:
A Promise that resolves to true if the library is ready to use, false otherwise.
This is useful for verifying that the installation was successful before attempting to analyze images.
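For example, a minimal sketch of a startup check (assuming isAvailable is exported from the package root alongside analyzeImage):

```typescript
import { isAvailable, analyzeImage } from 'mediversal-rn-image-intelligence';

async function safeAnalyze(uri: string) {
  // Fail fast if the native module is not linked (e.g. pods not installed yet).
  if (!(await isAvailable())) {
    throw new Error('mediversal-rn-image-intelligence is not available on this device');
  }
  return analyzeImage(uri);
}
```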
```typescript
interface AnalysisResult {
  containsFace: boolean;
  containsPrintedText: boolean;
  containsObjects: boolean;
  faces?: FaceData[];
  printedText?: string;
  objects?: ObjectData[];
  errors?: {
    faceDetection?: string;
    textRecognition?: string;
    objectDetection?: string;
  };
}
```
containsFace
Boolean indicating whether at least one face was detected in the image.
containsPrintedText
Boolean indicating whether any text was detected in the image.
containsObjects
Boolean indicating whether any objects were detected in the image.
faces
Optional array of FaceData objects, one for each detected face. Only present if faces were detected.
printedText
Optional string containing all the text extracted from the image. Only present if text was detected.
objects
Optional array of ObjectData objects, one for each detected object. Only present if objects were detected.
errors
Optional object containing error messages if any detection method encountered issues. This allows partial results - for example, face detection might succeed while text recognition fails.
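For example, a sketch of handling a partial failure using the fields described above:

```typescript
import { analyzeImage } from 'mediversal-rn-image-intelligence';

const result = await analyzeImage(imageUri);

// Log any detector that failed, but keep using the detections that succeeded.
if (result.errors?.textRecognition) {
  console.warn('Text recognition failed:', result.errors.textRecognition);
}
if (result.containsFace) {
  console.log(`Detected ${result.faces?.length ?? 0} face(s)`);
}
```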
```typescript
interface FaceData {
  boundingBox: BoundingBox;
  smilingProbability?: number;
  leftEyeOpenProbability?: number;
  rightEyeOpenProbability?: number;
  headEulerAngleY?: number;
  headEulerAngleZ?: number;
  trackingId?: number;
}
```
boundingBox
The location and size of the detected face within the image.
smilingProbability
A number between 0.0 and 1.0 indicating the likelihood that the person is smiling. 0.0 means definitely not smiling, 1.0 means definitely smiling, and 0.5 indicates uncertainty.
leftEyeOpenProbability and rightEyeOpenProbability
Numbers between 0.0 and 1.0 indicating the likelihood that each eye is open.
headEulerAngleY (yaw)
The rotation of the head from left to right in degrees. A value of 0 means the face is looking straight at the camera, positive values mean the head is turned to the right, and negative values mean turned to the left.
headEulerAngleZ (roll)
The tilt of the head in degrees. A value of 0 means the head is upright, positive values mean tilted clockwise, and negative values mean tilted counter-clockwise.
trackingId
A unique identifier for the face, useful for tracking the same face across multiple frames in video scenarios.
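Putting these fields together, a portrait-quality check might look like the sketch below. The thresholds are arbitrary and should be tuned for your use case, and it assumes the FaceData type is exported alongside AnalysisOptions:

```typescript
// Assumes FaceData is exported from the package root; thresholds are illustrative only.
import type { FaceData } from 'mediversal-rn-image-intelligence';

function isGoodPortrait(face: FaceData): boolean {
  const eyesOpen =
    (face.leftEyeOpenProbability ?? 0) > 0.5 &&
    (face.rightEyeOpenProbability ?? 0) > 0.5;
  // Treat the face as frontal when yaw (Y) and roll (Z) stay within ±15 degrees.
  const facingCamera =
    Math.abs(face.headEulerAngleY ?? 0) < 15 &&
    Math.abs(face.headEulerAngleZ ?? 0) < 15;
  return eyesOpen && facingCamera;
}
```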
```typescript
interface ObjectData {
  boundingBox: BoundingBox;
  trackingId?: number;
  labels: ObjectLabel[];
}
```
boundingBox
The location and size of the detected object within the image.
trackingId
A unique identifier for the object, useful for tracking the same object across multiple frames in video scenarios.
labels
Array of classification labels for the detected object, ordered by confidence score (highest first).
```typescript
interface ObjectLabel {
  text: string;
  confidence: number;
  index: number;
}
```
text
The classification label (e.g., "Dog", "Car", "Person", "Food").
confidence
A number between 0.0 and 1.0 indicating the confidence of the classification. Higher values mean more confident predictions.
index
The internal index of the label in Google ML Kit's classification model.
```typescript
interface BoundingBox {
  x: number;
  y: number;
  width: number;
  height: number;
}
```
Represents the rectangular region containing a detected face or object. The coordinate system has its origin (0, 0) at the top-left corner of the image, with x increasing to the right and y increasing downward. All values are in pixels.
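Because the values are pixels in the original image, drawing an overlay on a resized preview means scaling the box into the preview's coordinate space. A minimal sketch (assuming the BoundingBox type is exported, and that you already know the original image dimensions, e.g. from Image.getSize):

```typescript
// Assumes BoundingBox is exported from the package root.
import type { BoundingBox } from 'mediversal-rn-image-intelligence';

// Scale a pixel-space box from the original image into a preview view's coordinates.
function scaleBox(
  box: BoundingBox,
  imageWidth: number,
  imageHeight: number,
  viewWidth: number,
  viewHeight: number
): BoundingBox {
  const scaleX = viewWidth / imageWidth;
  const scaleY = viewHeight / imageHeight;
  return {
    x: box.x * scaleX,
    y: box.y * scaleY,
    width: box.width * scaleX,
    height: box.height * scaleY,
  };
}
```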
```typescript
interface AnalysisOptions {
  detectFaces?: boolean;
  detectPrintedText?: boolean;
  detectObjects?: boolean;
  faceDetectionMode?: 'fast' | 'accurate';
  minFaceSize?: number;
}
```
All fields are optional and have sensible defaults. See the Advanced Configuration section for detailed descriptions.
```typescript
const result = await analyzeImage(photoUri, {
  detectFaces: true,
  detectPrintedText: false,
  detectObjects: false,
});

const faceCount = result.faces?.length || 0;
console.log(`Found ${faceCount} people in the photo`);
```
```typescript
const result = await analyzeImage(documentUri, {
  detectFaces: false,
  detectPrintedText: true,
  detectObjects: false,
});

if (result.printedText) {
  await saveToDatabase(result.printedText);
}
```
```typescript
const result = await analyzeImage(selfieUri, {
  detectFaces: true,
  detectPrintedText: false,
  detectObjects: false,
  minFaceSize: 0.3,
  faceDetectionMode: 'accurate',
});

const isValidSelfie =
  result.faces?.length === 1 && result.faces[0].smilingProbability! > 0.5;
```
```typescript
const result = await analyzeImage(idCardUri, {
  detectFaces: true,
  detectPrintedText: true,
  detectObjects: false,
  faceDetectionMode: 'accurate',
});

const hasRequiredElements = result.containsFace && result.containsPrintedText;
```
```typescript
const result = await analyzeImage(imageUri, {
  detectFaces: false,
  detectPrintedText: false,
  detectObjects: true,
});

if (result.objects && result.objects.length > 0) {
  result.objects.forEach((obj) => {
    const primaryLabel = obj.labels[0];
    console.log(
      `Detected: ${primaryLabel.text} (${(
        primaryLabel.confidence * 100
      ).toFixed(1)}% confident)`
    );
  });
}
```
```typescript
const result = await analyzeImage(photoUri, {
  detectFaces: true,
  detectPrintedText: false,
  detectObjects: true,
});

const tags: string[] = [];

if (result.faces && result.faces.length > 0) {
  tags.push(
    `${result.faces.length} person${result.faces.length > 1 ? 's' : ''}`
  );
}

if (result.objects) {
  result.objects.forEach((obj) => {
    if (obj.labels[0].confidence > 0.7) {
      tags.push(obj.labels[0].text.toLowerCase());
    }
  });
}

console.log('Photo tags:', tags.join(', '));
```
```typescript
const result = await analyzeImage(foodPhotoUri, {
  detectFaces: false,
  detectPrintedText: false,
  detectObjects: true,
});

const foodItems = result.objects
  ?.filter((obj) =>
    obj.labels.some(
      (label) =>
        label.text.toLowerCase().includes('food') && label.confidence > 0.6
    )
  )
  .map((obj) => obj.labels[0].text);

console.log('Food items detected:', foodItems);
```
```typescript
const result = await analyzeImage(petPhotoUri, {
  detectFaces: false,
  detectPrintedText: false,
  detectObjects: true,
});

const pets = result.objects?.filter((obj) =>
  obj.labels.some((label) =>
    ['dog', 'cat', 'bird', 'fish'].includes(label.text.toLowerCase())
  )
);

if (pets && pets.length > 0) {
  console.log(`Found ${pets.length} pet(s) in the image`);
}
```
Choose the Right Detection Mode
Use 'fast' mode for real-time scenarios like camera previews where speed is more important than perfect accuracy. Use 'accurate' mode when analyzing stored images where quality matters more than speed.
Disable Unused Features
If you only need face detection, set detectPrintedText: false and detectObjects: false to improve performance. Similarly, if you only need object detection, disable the other features.
Optimize Image Size
Large images take longer to process. Consider resizing images before analysis if you're working with high-resolution photos. Libraries like react-native-image-resizer can help:
```typescript
import ImageResizer from 'react-native-image-resizer';

const resized = await ImageResizer.createResizedImage(
  uri,
  1024,
  1024,
  'JPEG',
  80
);

const result = await analyzeImage(resized.uri);
```
First Run Download
The first time you use the library, Google ML Kit needs to download its models:
- Face Detection: approximately 1-2 MB
- Text Recognition: approximately 10-15 MB
- Object Detection: approximately 20-30 MB
These models are downloaded automatically and cached on the device. This only happens once per device, but the first analysis may take longer than subsequent ones.
Cache Results When Appropriate
If you're analyzing the same image multiple times, consider caching the results instead of re-processing the image.
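A simple in-memory cache keyed by image URI is often enough; here is a sketch (swap the Map for persistent storage if the cache needs to survive app restarts):

```typescript
import { analyzeImage } from 'mediversal-rn-image-intelligence';

type Analysis = Awaited<ReturnType<typeof analyzeImage>>;

// One cached analysis per image URI, kept for the lifetime of the app process.
const analysisCache = new Map<string, Analysis>();

async function analyzeCached(uri: string): Promise<Analysis> {
  const cached = analysisCache.get(uri);
  if (cached) return cached;
  const result = await analyzeImage(uri);
  analysisCache.set(uri, result);
  return result;
}
```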
Parallel vs Sequential Processing
By default, all enabled detection methods run in parallel for maximum speed. However, this uses more device resources. If battery life is a concern, you can run analyses sequentially by calling analyzeImage multiple times with different options.
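For example, a sequential sketch of the approach described above; each call enables a single detector, trading total latency for lower peak load:

```typescript
import { analyzeImage } from 'mediversal-rn-image-intelligence';

// Run one detector at a time instead of all three in parallel.
const faceResult = await analyzeImage(imageUri, {
  detectFaces: true,
  detectPrintedText: false,
  detectObjects: false,
});
const textResult = await analyzeImage(imageUri, {
  detectFaces: false,
  detectPrintedText: true,
  detectObjects: false,
});
const objectResult = await analyzeImage(imageUri, {
  detectFaces: false,
  detectPrintedText: false,
  detectObjects: true,
});
```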
If you see "Module not found" errors on iOS, make sure you've installed the CocoaPods dependencies:
```bash
cd ios
pod install
cd ..
npx react-native run-ios
```
For generic Gradle build errors, try cleaning the build:
```bash
cd android
./gradlew clean
cd ..
npx react-native run-android
```
If you encounter compilation errors related to Java compatibility, ensure your android/app/build.gradle specifies Java 17:
```gradle
android {
  compileOptions {
    sourceCompatibility JavaVersion.VERSION_17
    targetCompatibility JavaVersion.VERSION_17
  }

  kotlinOptions {
    jvmTarget = "17"
  }
}
```
If you see a warning about 16 KB page-size compatibility, add this to android/gradle.properties:
```properties
android.bundle.enableUncompressedNativeLibs=false
```
Remember that on Android, you need to request permissions at runtime, not just declare them in the manifest:
```typescript
import { PermissionsAndroid, Platform } from 'react-native';

if (Platform.OS === 'android') {
  await PermissionsAndroid.request(
    PermissionsAndroid.PERMISSIONS.READ_EXTERNAL_STORAGE
  );
}
```
Make sure you're using the correct URI format for your platform:
- Android accepts: file://, content://, or absolute paths
- iOS accepts: file://, ph://, assets-library://, or absolute paths
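If you start from a bare filesystem path, a tiny helper can normalize it before calling analyzeImage (a sketch; content://, ph://, and other scheme-prefixed URIs should pass through unchanged):

```typescript
// Prefix bare absolute paths with file://; leave scheme-prefixed URIs untouched.
function toImageUri(pathOrUri: string): string {
  return pathOrUri.startsWith('/') ? `file://${pathOrUri}` : pathOrUri;
}
```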
Object detection works best with clear, well-lit images containing common objects. If you're getting no results:
1. Ensure the image quality is good with adequate lighting
2. Try images with more prominent, well-framed objects
3. Verify that objects are not too small in the frame
4. Check that the image isn't too dark, blurry, or low resolution
5. Remember that ML Kit recognizes common objects best - very specific or unusual items may not be detected
This library is designed with privacy as a core principle. All image processing happens entirely on-device using Google ML Kit. Your users' images and the data extracted from them never leave their device. No internet connection is required for the library to function, and no data is transmitted to external servers.
This makes the library GDPR compliant by default, as user data stays under their control. The library doesn't collect any analytics or telemetry, and the source code is open for review to verify these privacy guarantees.
The library includes the following Google ML Kit components:
- Face Detection (Android: version 16.1.5, iOS: version 4.0.0)
- Text Recognition v2 (Android: version 19.0.0, iOS: version 4.0.0)
- Object Detection (Android: version 17.0.2, iOS: version 4.0.0)
It's built with TurboModule specifications to be compatible with React Native's new architecture. The package includes full TypeScript type definitions and integrates with iOS via CocoaPods and Android via Gradle.
The object detection feature can recognize a wide variety of common objects across multiple categories:
Animals
Dog, Cat, Bird, Fish, Horse, Rabbit, and various other domestic and wild animals
Vehicles
Car, Truck, Bicycle, Motorcycle, Bus, Train, Airplane, Boat
Household Items
Chair, Table, Bed, Sofa, Lamp, Television, Refrigerator, Microwave
Food and Beverages
Fruits, Vegetables, Beverages, Prepared dishes, Snacks, Desserts
Electronics
Phone, Laptop, Computer, TV, Camera, Keyboard, Mouse
Clothing and Accessories
Shirt, Pants, Shoes, Hat, Bag, Glasses
Outdoor Objects
Trees, Flowers, Buildings, Roads, Signs
Sports Equipment
Ball, Racket, Bicycle, Skateboard
And many more categories. Each detection includes multiple classification labels with confidence scores, allowing you to choose the most appropriate label for your use case or combine multiple labels for more accurate classification.
Contributions are welcome. Please read the CONTRIBUTING.md file in the repository for guidelines on how to submit issues and pull requests.
This project is licensed under the MIT License. See the LICENSE file for complete terms.
Sushant Singh
Email: sushantbibhu@gmail.com
GitHub: @thisissushant
If you need help or want to discuss the library:
Email: sushantbibhu@gmail.com
Issues: GitHub Issues (https://github.com/thisissushant/mediversal-rn-image-intelligence/issues)
Discussions: GitHub Discussions (https://github.com/thisissushant/mediversal-rn-image-intelligence/discussions)
This library is built on top of Google ML Kit and React Native. Thanks to both teams for their excellent work that makes projects like this possible.
---
If this library has been helpful for your project, consider giving it a star on GitHub (https://github.com/thisissushant/mediversal-rn-image-intelligence) to help others discover it.
Made with care for the React Native community.