A Guide to Text Recognition in React Native
Imagine converting handwritten or printed documents into digital text using your smartphone. This capability is not just a dream but a reality in the world of React Native.
Our journey started with experimentation with various third-party React Native text detection libraries and several machine learning tools, including ML Kit, TensorFlow, and Tesseract OCR.
After extensive exploration and research, we found that converting handwritten or printed text into digital text directly within React Native was not a straightforward task. We also looked for suitable JavaScript libraries for this purpose, but couldn't find any that showed significant promise or met our requirements.
Ultimately, we decided to go with native modules for both iOS and Android and successfully crafted an effective and efficient way to implement text detection from images in React Native using ML Kit.
Follow the steps below as we guide you through the process of extracting text from images.
1. Implement image capture or image picker functionality
We used the react-native-image-picker library to handle image capture and selection. If your requirements call for different camera functionality, you are free to select an alternative library that suits your specific needs.
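As a minimal sketch, assuming a recent version of react-native-image-picker with the promise-based API (the pickImage helper name is ours), picking a photo and grabbing its local URI looks like this:
import { launchImageLibrary } from 'react-native-image-picker';

// Open the gallery and return the local URI of the chosen photo.
// `assets` is undefined when the user cancels the picker.
const pickImage = async (): Promise<string | undefined> => {
  const result = await launchImageLibrary({ mediaType: 'photo' });
  return result.assets?.[0]?.uri;
};
The URI this returns is what we will hand to the native text detection modules below.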
2. Implement TextDetectionModule for native Android
In your project-level build.gradle file, add Google's Maven repository:
repositories {
  mavenLocal()
  google()
  mavenCentral()
}
Then add the dependencies for the ML Kit Android libraries to your module's app-level Gradle file, usually located at app/build.gradle:
dependencies {
  // your code
  implementation 'com.google.mlkit:text-recognition:16.0.0'
}
Now, in the android folder of your project, create TextDetectionModule.java:
package com.textdetection;
import android.net.Uri;
import android.util.Log;
import androidx.annotation.NonNull;
import com.facebook.react.bridge.Arguments;
import com.facebook.react.bridge.Promise;
import com.facebook.react.bridge.ReactApplicationContext;
import com.facebook.react.bridge.ReactContextBaseJavaModule;
import com.facebook.react.bridge.ReactMethod;
import com.facebook.react.bridge.WritableArray;
import com.facebook.react.bridge.WritableMap;
import com.google.android.gms.tasks.OnFailureListener;
import com.google.android.gms.tasks.OnSuccessListener;
import com.google.mlkit.vision.common.InputImage;
import com.google.mlkit.vision.text.Text;
import com.google.mlkit.vision.text.TextRecognition;
import com.google.mlkit.vision.text.TextRecognizer;
import com.google.mlkit.vision.text.latin.TextRecognizerOptions;
import java.io.IOException;
public class TextDetectionModule extends ReactContextBaseJavaModule {

  TextDetectionModule(ReactApplicationContext context) {
    super(context);
  }

  @Override
  public String getName() {
    return "TextDetectionModule";
  }

  @ReactMethod
  public void recognizeImage(String url, Promise promise) {
    Log.d("TextDetectionModule", "Url: " + url);
    Uri uri = Uri.parse(url);
    InputImage image;
    try {
      image = InputImage.fromFilePath(getReactApplicationContext(), uri);
      // When using the Latin script library
      TextRecognizer recognizer =
          TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS);
      recognizer.process(image)
          .addOnSuccessListener(new OnSuccessListener<Text>() {
            @Override
            public void onSuccess(Text result) {
              WritableMap response = Arguments.createMap();
              WritableArray blocks = Arguments.createArray();
              for (Text.TextBlock block : result.getTextBlocks()) {
                WritableMap blockObject = Arguments.createMap();
                blockObject.putString("text", block.getText());
                blocks.pushMap(blockObject);
              }
              response.putArray("blocks", blocks);
              promise.resolve(response);
            }
          })
          .addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
              promise.reject("text_recognition", e);
            }
          });
    } catch (IOException e) {
      // Reject instead of swallowing the error so the JS promise settles.
      promise.reject("text_recognition", e);
    }
  }
}
Here, we're passing both the application context and the file URI as arguments to the InputImage.fromFilePath() method:
InputImage image = InputImage.fromFilePath(getReactApplicationContext(), uri);
Next, initialize a TextRecognizer object from ML Kit:
TextRecognizer recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS);
Pass the image to the process method:
recognizer.process(image)
    .addOnSuccessListener(new OnSuccessListener<Text>() {
      @Override
      public void onSuccess(Text result) {
        // Task completed successfully
      }
    })
    .addOnFailureListener(new OnFailureListener() {
      @Override
      public void onFailure(@NonNull Exception e) {
        // Task failed with an exception
      }
    });
Finally, ML Kit provides you with the extracted text. When text recognition succeeds, it creates a structured response containing recognized text blocks.
public void onSuccess(Text result) {
  WritableMap response = Arguments.createMap();
  WritableArray blocks = Arguments.createArray();
  for (Text.TextBlock block : result.getTextBlocks()) {
    WritableMap blockObject = Arguments.createMap();
    blockObject.putString("text", block.getText());
    blocks.pushMap(blockObject);
  }
  response.putArray("blocks", blocks);
  promise.resolve(response);
}
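On the JavaScript side, the value the promise resolves with therefore has the following shape. This is a TypeScript description of the structure the module builds above; the type names themselves are ours:
// Shape of the object resolved by the native module (type names are illustrative).
interface TextBlock {
  text: string;
}

interface TextDetectionResult {
  blocks: TextBlock[];
}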
To summarize everything on Android, here is the full recognizeImage method that we can utilize. Remember that the module must also be registered in a ReactPackage and added to the packages list in your MainApplication; otherwise it will not be visible from JavaScript.
@ReactMethod
public void recognizeImage(String url, Promise promise) {
  Uri uri = Uri.parse(url);
  InputImage image;
  try {
    image = InputImage.fromFilePath(getReactApplicationContext(), uri);
    // When using the Latin script library
    TextRecognizer recognizer =
        TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS);
    recognizer.process(image)
        .addOnSuccessListener(new OnSuccessListener<Text>() {
          @Override
          public void onSuccess(Text result) {
            WritableMap response = Arguments.createMap();
            WritableArray blocks = Arguments.createArray();
            for (Text.TextBlock block : result.getTextBlocks()) {
              WritableMap blockObject = Arguments.createMap();
              blockObject.putString("text", block.getText());
              blocks.pushMap(blockObject);
            }
            response.putArray("blocks", blocks);
            promise.resolve(response);
          }
        })
        .addOnFailureListener(new OnFailureListener() {
          @Override
          public void onFailure(@NonNull Exception e) {
            promise.reject("text_recognition", e);
          }
        });
  } catch (IOException e) {
    // Reject instead of swallowing the error so the JS promise settles.
    promise.reject("text_recognition", e);
  }
}
3. Implement TextDetectionModule for iOS
Add the ML Kit text recognition dependency to your Podfile:
pod 'GoogleMLKit/TextRecognition', '3.2.0'
Similar to the Android approach, provide an image URL as input to your native method, which can be invoked from JavaScript. Then, create an MLKVisionImage object using a UIImage or CMSampleBuffer, ensuring that you set its .orientation property:
MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
visionImage.orientation = image.imageOrientation;
Following that, initialize a TextRecognizer instance with the options related to the SDK we previously declared as a dependency:
MLKTextRecognizerOptions *latinOptions = [[MLKTextRecognizerOptions alloc] init];
MLKTextRecognizer *textRecognizer = [MLKTextRecognizer textRecognizerWithOptions:latinOptions];
Then, process the image by passing it to the processImage:completion: method:
[textRecognizer processImage:visionImage
                  completion:^(MLKText *_Nullable result,
                               NSError *_Nullable error) {
  if (error != nil || result == nil) {
    reject(@"text_recognition", @"Text recognition failed", nil);
    return;
  }
  NSMutableDictionary *response = [NSMutableDictionary dictionary];
  // extract the recognized text blocks here (shown below)
  resolve(response);
}];
Finally, extract the text from the blocks of recognized text. If the text recognition operation succeeds, it returns an MLKText object:
NSMutableDictionary *response = [NSMutableDictionary dictionary];
NSMutableArray *blocks = [NSMutableArray array];
for (MLKTextBlock *block in result.blocks) {
  NSMutableDictionary *blockDict = [NSMutableDictionary dictionary];
  [blockDict setValue:block.text forKey:@"text"];
  [blocks addObject:blockDict];
}
[response setValue:blocks forKey:@"blocks"];
resolve(response);
Putting it all together with ML Kit, here is the complete recognizeImage native method for text detection in images on iOS. (The matching RCTTextDetectionModule.h header simply declares the class as conforming to RCTBridgeModule.)
RCTTextDetectionModule.m
#import <Foundation/Foundation.h>
#import "RCTTextDetectionModule.h"
#import <React/RCTLog.h>
#import <AVFoundation/AVFoundation.h>
#import <VisionKit/VisionKit.h>
#import <Vision/Vision.h>
@import MLKit;

@implementation RCTTextDetectionModule

RCT_EXPORT_MODULE();

RCT_EXPORT_METHOD(recognizeImage:(NSString *)url
                  resolver:(RCTPromiseResolveBlock)resolve
                  rejecter:(RCTPromiseRejectBlock)reject)
{
  RCTLogInfo(@"URL: %@", url);
  NSURL *_url = [NSURL URLWithString:url];
  NSData *imageData = [NSData dataWithContentsOfURL:_url];
  UIImage *image = [UIImage imageWithData:imageData];
  MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
  visionImage.orientation = image.imageOrientation;
  MLKTextRecognizerOptions *latinOptions = [[MLKTextRecognizerOptions alloc] init];
  MLKTextRecognizer *textRecognizer = [MLKTextRecognizer textRecognizerWithOptions:latinOptions];
  [textRecognizer processImage:visionImage
                    completion:^(MLKText *_Nullable result,
                                 NSError *_Nullable error) {
    if (error != nil || result == nil) {
      reject(@"text_recognition", @"Text recognition failed", nil);
      return;
    }
    NSMutableDictionary *response = [NSMutableDictionary dictionary];
    NSMutableArray *blocks = [NSMutableArray array];
    for (MLKTextBlock *block in result.blocks) {
      NSMutableDictionary *blockDict = [NSMutableDictionary dictionary];
      [blockDict setValue:block.text forKey:@"text"];
      [blocks addObject:blockDict];
    }
    [response setValue:blocks forKey:@"blocks"];
    resolve(response);
  }];
}

@end
4. Access the native modules from the React Native side
Now that we've finished implementing the native methods, we can access and utilize them in our React Native application via NativeModules:
import { NativeModules } from 'react-native';
const { TextDetectionModule } = NativeModules;
export const recognizeImage = (url: string) => {
  return TextDetectionModule.recognizeImage(url);
};
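To tie everything together, here is an illustrative sketch of using this helper alongside the image picker from earlier (the pickAndRecognize name and the inline block type are ours; the blocks array mirrors the shape both native modules resolve with):
// Pick a photo, run text detection on it, and log each recognized block.
const pickAndRecognize = async (): Promise<void> => {
  const uri = await pickImage(); // from the image picker sketch above
  if (!uri) {
    return; // the user cancelled the picker
  }
  const result = await recognizeImage(uri);
  result.blocks.forEach((block: { text: string }) => {
    console.log(block.text);
  });
};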