DocTR (OCR)

DocTR is an Optical Character Recognition (OCR) model.

You can use DocTR to read the text in an image.

To use DocTR with Inference, you will need a Roboflow API key. If you don't already have a Roboflow account, sign up for a free Roboflow account.

Then, retrieve your API key from the Roboflow dashboard. Learn how to retrieve your API key.

Run the following command to set your API key in your coding environment:

export ROBOFLOW_API_KEY=<your api key>
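
Before calling the API, it can help to confirm that the key is actually visible to Python. The sketch below is one way to do that; `get_api_key` is a hypothetical helper written for this example and is not part of Inference or the Roboflow SDK.

```python
import os


def get_api_key(env=os.environ):
    """Return the Roboflow API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration only; not part of inference_sdk.
    """
    key = env.get("ROBOFLOW_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ROBOFLOW_API_KEY is not set; export it before running the script."
        )
    return key
```

You could then pass `api_key=get_api_key()` when creating the client, so a missing key fails with a readable message instead of a `KeyError`.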

Let's retrieve the text in the following image:

A shipping container

Create a new Python file and add the following code:

import os
from inference_sdk import InferenceHTTPClient

CLIENT = InferenceHTTPClient(
    api_url="https://infer.roboflow.com",
    api_key=os.environ["ROBOFLOW_API_KEY"]
)

result = CLIENT.ocr_image(inference_input="./container.jpg")  # single image request
print(result)

Above, replace container.jpg with the path to the image whose text you want to read.

The results of DocTR will appear in your terminal:

{'result': 'MSKU 0439215', 'time': 3.870879542999319}
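
If you only need the recognized text, you can pull it out of the response. This is a minimal sketch, assuming the response is a dictionary with `result` and `time` keys as in the output above; `extract_text` is a hypothetical helper, and the sample values are copied from that output.

```python
def extract_text(response):
    """Return the recognized text from an OCR response dict, or None if empty.

    Hypothetical helper; assumes a 'result' key as in the sample output.
    """
    text = response.get("result", "")
    return text.strip() or None


# Sample values copied from the output shown above.
sample = {"result": "MSKU 0439215", "time": 3.870879542999319}
print(extract_text(sample))  # prints: MSKU 0439215
```

Returning `None` for an empty string makes it easy to tell "no text found" apart from a real reading.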

Benchmarking

We ran 100 inferences on an NVIDIA T4 GPU to benchmark the performance of DocTR.

DocTR ran 100 inferences in 365.22 seconds (3.65 seconds per inference, on average).
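
To take a similar measurement on your own hardware, you can time repeated calls yourself. The sketch below is not the script used for the numbers above; `benchmark` is a hypothetical helper, and the workload is a stand-in so the example runs without an API key.

```python
import time


def benchmark(fn, runs=100):
    """Call fn `runs` times and return (total_seconds, mean_seconds)."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    total = time.perf_counter() - start
    return total, total / runs


# Stand-in workload. To time DocTR, replace it with something like:
#   lambda: CLIENT.ocr_image(inference_input="./container.jpg")
total, mean = benchmark(lambda: sum(range(1000)), runs=100)
print(f"100 runs in {total:.4f} seconds ({mean:.6f} seconds per run, on average)")
```

The average is the total time divided by the number of runs, which is how 365.22 seconds over 100 inferences gives roughly 3.65 seconds each.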
