nyeoWM edited this page Jul 8, 2022 · 4 revisions
  1. AI4WRD-OCR WIKI
    1. AI4WRD-OCR Features

- [Basic OCR Work Flow](#basic-ocr-work-flow)
- [Saving and Loading Configurations](#saving-and-loading-configurations)
- [Output to CSV and MQTT](#output-to-csv-and-mqtt)

    1. Basic OCR Work-Flow ![image](https://user-images.githubusercontent.com/72961684/177920801-ee56cbc2-3484-414e-b9db-145eceb1184f.png)

Use the navigation drop down menu to navigate between the various sections of the app. The app currently consists of three sections:

1. [Load frame](#1-load-frame)
2. [Crop](#2-crop)
3. [OCR-Livestream](#3-ocr-livestream)

The main workflow consists of the following:

Choosing video stream -> capturing screenshots of specific screens -> cropping specific sections of each of the screens -> starting optical character recognition

      1. 1. Load Frame ![captureScreen1](https://user-images.githubusercontent.com/72961684/177920292-e7f2d1c4-bcbd-42fb-bf68-0eb8eeeb0836.png)
*Load Frame Screen*
        1. Select Language ![image](https://user-images.githubusercontent.com/72961684/177920417-a8123c02-7bd2-44ab-9620-dce4fd367626.png)

Use the drop-down menu to select additional languages to detect. English is currently the default language; you can optionally detect either Traditional or Simplified Chinese at the same time.

        1. Select and preview Video Stream ![image](https://user-images.githubusercontent.com/72961684/177924587-8bd24c72-02c0-43c5-97b1-e9f1d53abbc9.png)

Use the drop-down menu to select the video stream on which you want to perform optical character recognition, then click the run widget to preview the video stream.

        1. Capture screenshots ![image](https://user-images.githubusercontent.com/72961684/177923480-ecce7563-7beb-4fb4-b9aa-dabf2dcc22f1.png)

Capture a screenshot of each screen on which you would like to perform optical character recognition. Later on, you will define crops for each of these screens, and the software will then automatically detect and crop the relevant video stream to perform optical character recognition.

Screenshots are captured using the Capture screenshot button. Captured screenshots are displayed below the button.

      1. 2. Crop ![cropScreen1](https://user-images.githubusercontent.com/72961684/177924658-5a39af1b-8bf9-47c8-b85d-e7992a30f29b.png)
*Crop Screen*
        1. Select Screenshot to Crop ![image](https://user-images.githubusercontent.com/72961684/177925831-db2b2526-4807-4465-ac42-790215da7738.png)

Use the drop-down menu to select the screenshot to crop. Crops are saved automatically and associated internally with the selected screenshot.

        1. Cropping ![image](https://user-images.githubusercontent.com/72961684/177926138-4d5de4e3-e621-468c-b6de-2b062ff9cafc.png)

Drag the box to specify sections of the screen to crop. Later the application will perform Optical Character Recognition on the specified crops. A screen below will preview the selected crop.

Once you are satisfied with the crop, click the crop button; the crop will be saved and listed below. You can define as many crops as you like.
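
As a rough illustration of how crops could be stored per screenshot and then cut out of a frame, here is a minimal sketch. The `CropBox` type, the `crops_by_screenshot` mapping, the helper names, and the nested-list frame are all assumptions made for this example; they are not taken from the AI4WRD-OCR source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropBox:
    """Hypothetical crop rectangle in pixel coordinates."""
    left: int
    top: int
    width: int
    height: int


def apply_crop(frame, box):
    """Return the region of `frame` (a list of pixel rows) covered by `box`."""
    rows = frame[box.top : box.top + box.height]
    return [row[box.left : box.left + box.width] for row in rows]


# Hypothetical store: each screenshot name maps to the crops defined on it.
crops_by_screenshot = {}


def save_crop(screenshot_name, box):
    """Associate a crop with the screenshot it was drawn on."""
    crops_by_screenshot.setdefault(screenshot_name, []).append(box)


# A tiny 4x6 "frame" whose pixel values encode (row, column).
frame = [[(r, c) for c in range(6)] for r in range(4)]
save_crop("screen_1", CropBox(left=2, top=1, width=3, height=2))
regions = [apply_crop(frame, box) for box in crops_by_screenshot["screen_1"]]
```

In a real application the frame would more likely be an image array, but the slicing idea is the same: rows are selected first, then columns within each row.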

        1. Additional Configurations ![image](https://user-images.githubusercontent.com/72961684/177926622-f1000513-2fbd-40ad-8ce4-593621f22be8.png)

You can additionally specify whether you would like to see the crop preview in real time, the color of the cropping box, and the zoom level of the crops.

      1. 3. OCR Livestream ![liveStream](https://user-images.githubusercontent.com/72961684/177926830-c0897b98-a05c-4a6c-bb86-24dd6bf66e61.png)
*Optical Character Recognition Livestream Screen*

Click on the Done Crop check box to initialize the models.

*Note: If this is your first time running the program, it might take some time, as the software needs to download the required models. Please ensure that your internet connection is stable. If the program does not respond after a significant amount of time, check the terminal output. If it still does not respond, you might need to restart the program.*
        1. Select Optical Character Recognition Library ![image](https://user-images.githubusercontent.com/72961684/177927355-ed714fef-129f-4e4d-9c03-7611aee9152f.png)

Use the drop-down menu to select the optical character recognition library to be used. Currently there are two libraries available: Tesseract and EasyOCR. We recommend Tesseract for printed characters on screens and EasyOCR for streams from video cameras.

        1. OCR Confidence Level Cut off ![image](https://user-images.githubusercontent.com/72961684/177927855-1b671f0b-0705-412f-81de-e7c80a28bac5.png)

Specifies the minimum confidence level for optical character recognition. If the confidence level of a piece of recognized text drops below the specified level, the text will not be displayed.
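
The cut-off described above amounts to a simple filter. The following is a minimal sketch, assuming each OCR result is a `(text, confidence)` pair and that confidence is reported on a 0 to 100 scale; the function name, the sample data, and the scale are assumptions for illustration, not the application's actual code.

```python
def filter_by_confidence(results, cutoff):
    """Keep only (text, confidence) pairs whose confidence is at least `cutoff`."""
    return [(text, conf) for text, conf in results if conf >= cutoff]


# Hypothetical OCR results: recognized text with a confidence from 0 to 100.
results = [("120", 96.0), ("mmHg", 88.5), ("l2O", 41.0)]

# With a cut-off of 60, the low-confidence misreading "l2O" is dropped.
shown = filter_by_confidence(results, cutoff=60.0)
```

A higher cut-off hides more misreadings but can also hide correct text that the library was unsure about, so the value usually needs tuning per video source.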

        1. Video Preview and Optical Character Recognition Output
          1. Video Preview ![image](https://user-images.githubusercontent.com/72961684/177928283-3195b942-f0a2-45ef-94d0-51acd85aafb3.png)
*Preview of the current video stream*

Note that this is the same video stream that was selected on the first page.

          1. OCR Output ![image](https://user-images.githubusercontent.com/72961684/177928142-5c1035fe-24dc-4f08-b78c-0592f1725de4.png)
    1. Saving and Loading Configurations ![image](https://user-images.githubusercontent.com/72961684/177925757-af2a6124-243f-40b3-8a69-125f0065096c.png)
    1. Output to CSV and MQTT
      1. CSV Output
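
This page does not document the CSV layout, so the following is only a sketch of what writing OCR results to CSV could look like with Python's standard `csv` module. The column names (`timestamp`, `crop_name`, `text`, `confidence`), the `write_ocr_rows` helper, and the sample row are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone


def write_ocr_rows(stream, rows):
    """Write a header line followed by one CSV line per OCR result."""
    writer = csv.writer(stream)
    # Hypothetical column layout; the real application may use different fields.
    writer.writerow(["timestamp", "crop_name", "text", "confidence"])
    for row in rows:
        writer.writerow(row)


# One made-up result, timestamped in ISO 8601 format.
timestamp = datetime(2022, 7, 8, 12, 0, 0, tzinfo=timezone.utc).isoformat()
rows = [(timestamp, "heart_rate", "72", 95.2)]

# An in-memory buffer stands in for a file opened with newline="".
buffer = io.StringIO()
write_ocr_rows(buffer, rows)
csv_text = buffer.getvalue()
```

When writing to a real file, opening it with `open(path, "a", newline="")` lets the `csv` module control line endings itself.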
      1. MQTT Output