nyeoWM edited this page Jul 8, 2022 · 10 revisions

AI4WRD-OCR WIKI

AI4WRD-OCR Features

Basic OCR Work-Flow

image

Use the navigation drop-down menu to move between the sections of the app. The app currently consists of three sections:

  1. Load Frame
  2. Crop
  3. OCR Livestream

The main workflow consists of the following:

Choosing video stream -> capturing screenshots of specific screens -> cropping specific sections of each of the screens -> starting optical character recognition

1. Load Frame

Load Frame Screen

Select Language

image

Use the drop-down menu to select additional languages to detect. English is currently the default language, with the option to also detect either Traditional or Simplified Chinese at the same time.
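As an illustration of how such a language choice could be passed to the two OCR libraries mentioned later on this page, here is a minimal sketch. The function and dictionary names are hypothetical and are not taken from the app's source; only the language code strings follow the naming used by EasyOCR (`en`, `ch_sim`, `ch_tra`) and Tesseract (`eng`, `chi_sim`, `chi_tra`).

```python
# Hypothetical helper: maps a menu choice to OCR language codes.
# The names below are illustrative, not the app's actual code.
EASYOCR_CODES = {
    "English": "en",
    "Simplified Chinese": "ch_sim",
    "Traditional Chinese": "ch_tra",
}
TESSERACT_CODES = {
    "English": "eng",
    "Simplified Chinese": "chi_sim",
    "Traditional Chinese": "chi_tra",
}


def build_languages(extra=None):
    """Return (easyocr_list, tesseract_string); English is always included."""
    selected = ["English"]
    if extra is not None:
        if extra not in ("Simplified Chinese", "Traditional Chinese"):
            raise ValueError(f"Unsupported additional language: {extra}")
        selected.append(extra)
    easyocr_langs = [EASYOCR_CODES[name] for name in selected]
    # Tesseract accepts several languages joined with '+'.
    tesseract_langs = "+".join(TESSERACT_CODES[name] for name in selected)
    return easyocr_langs, tesseract_langs
```

The list would typically be handed to `easyocr.Reader(...)` and the string to the `lang` argument of `pytesseract.image_to_string(...)`. Whether the app does exactly this is an assumption.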

Select and preview Video Stream

image

Use the drop-down menu to select the video stream that you want to perform optical character recognition on, then click the run widget to preview the video stream.

Capture screenshots

image

Capture a screenshot of each screen that you want to perform optical character recognition on. Later, you will define crops for each of these screens, and the software will automatically detect the relevant screen in the video stream and crop it for optical character recognition.

Screenshots are captured using the Capture screenshot button. Captured screenshots are displayed below the button.

2. Crop

Crop Screen

Select Screenshot to Crop

image

Use the drop-down menu to select the screenshot to crop. Crops are saved automatically and associated internally with the selected screenshot.

Cropping

image

Drag the box to specify the sections of the screen to crop. The application will later perform optical character recognition on the specified crops. A preview of the selected crop is shown below.

Once you are satisfied with the crop, click the crop button. The crop will be saved and listed below. You can continue adding crops as you like.
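The idea of saving several crop boxes per screenshot and later applying them to a frame can be sketched as follows. This is a minimal illustration under stated assumptions: `CropBox`, `save_crop`, `apply_crop`, and `crops_by_screenshot` are hypothetical names, and the frame is modelled as a plain list of pixel rows so the example has no dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropBox:
    """A rectangular crop in pixel coordinates (hypothetical structure)."""
    left: int
    top: int
    width: int
    height: int


# Saved crops, keyed by the name of the screenshot they belong to.
crops_by_screenshot = {}


def save_crop(screenshot_name, box):
    """Associate a crop box with a screenshot; several crops are allowed."""
    crops_by_screenshot.setdefault(screenshot_name, []).append(box)


def apply_crop(frame, box):
    """Return the sub-rectangle of a frame given as a list of pixel rows."""
    rows = frame[box.top:box.top + box.height]
    return [row[box.left:box.left + box.width] for row in rows]
```

With a NumPy image array, such as the frames returned by OpenCV, the same crop would be the slice `frame[top:top + height, left:left + width]`.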

Additional Configurations

image

You can also specify whether to show the crop preview in real time, the color of the cropping box, and the zoom level of the crops.

3. OCR Livestream

Optical Character Recognition Livestream Screen

Click on the Done Crop check box to initialize the models.

Note: If this is your first time running the program, this step might take some time because the software needs to download the required models. Please ensure that your internet connection is stable. If the program does not respond after a significant amount of time, check the terminal output. If it still does not respond, you might need to restart the program.

Select Optical Character Recognition Library

image

Use the drop-down menu to select the optical character recognition library. Two libraries are currently available: Tesseract and EasyOCR. We recommend Tesseract for printed characters on screens and EasyOCR for streams from video cameras.

OCR Confidence Level Cut off

image

Specifies the minimum confidence level for optical character recognition. If the confidence level of a piece of recognized text drops below the specified level, that text will not be displayed.
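A cut-off of this kind amounts to a simple filter over the recognition results. The sketch below is illustrative only: the function names are hypothetical, and it assumes results are available as `(text, confidence)` pairs on a 0.0 to 1.0 scale. EasyOCR reports confidence on that scale, while pytesseract's `image_to_data` reports 0 to 100 (with -1 for entries that contain no text), so a small normalization helper is included.

```python
def normalize_tesseract_conf(conf):
    """Map a pytesseract confidence (0-100, or -1 for no text) to 0.0-1.0."""
    return max(float(conf), 0.0) / 100.0


def filter_by_confidence(results, min_confidence):
    """Keep only (text, confidence) pairs at or above the cut-off.

    `results` is an iterable of (text, confidence) pairs with confidence
    in the range 0.0 to 1.0; this input shape is an assumption.
    """
    return [(text, conf) for text, conf in results if conf >= min_confidence]
```

For example, with a cut-off of 0.6, a reading recognized at 0.93 is kept and one recognized at 0.41 is dropped.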

Video Preview and Optical Character Recognition Output

Video Preview

Preview of the current video stream.

Note that the video stream is the same video stream that was selected on the [Load Frame page](#1-load-frame). If you would like to select a different video stream, return to the Load Frame page using the navigation drop-down menu and select a different stream.

OCR Output

image

Saving and Loading Configurations

image

Output to CSV and MQTT

CSV Output

MQTT Output
