AI4WRD-OCR

AI4WRD is an application that extracts data from window-based applications using Optical Character Recognition (OCR). Features include:

Ability to use both EasyOCR and Tesseract to perform OCR
Ability to crop the video stream to perform OCR on specific sections of the video
Ability to recognise specific screens using the Scale-Invariant Feature Transform (SIFT) algorithm and associate them with specific sets of crops
Ability to save and load crop configurations
Output to both CSV and through the MQTT protocol

A more comprehensive introduction to AI4WRD is available on the AI4WRD-OCR wiki.

Installation Instructions

Using Conda

Install conda from https://www.anaconda.com/products/distribution and make sure to set the environment variables
Create an environment with Python 3.9
Install Tesseract from https://github.com/UB-Mannheim/tesseract/wiki, including the Chinese language models
Set the environment variable for Tesseract, see https://tesseract-ocr.github.io/tessdoc/Installation.html
Download git from https://git-scm.com/download/
Open the git CLI in the folder where you want to install AI4WRD
Run git clone https://github.com/msf4-0/AI4WRD-OCR
Open a command prompt with the conda environment activated and cd to the AI4WRD-OCR folder
Run pip install -r requirements.txt
Run pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio===0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
Run export PYTHONIOENCODING=utf8 on Linux or macOS, or set PYTHONIOENCODING=utf8 on Windows
Run streamlit run mainapp.py

Using Pipenv

Install Tesseract from https://github.com/UB-Mannheim/tesseract/wiki, including the Chinese language models
Set the environment variable for Tesseract, see https://tesseract-ocr.github.io/tessdoc/Installation.html
Install Python 3.9 from https://www.python.org/downloads/release/python-3913/, and either set the environment variables manually or tick the "set environment variable" option during installation
Download git from https://git-scm.com/download/
Open the git CLI in the folder where you want to install AI4WRD
Run git clone https://github.com/msf4-0/AI4WRD-OCR
Start cmd as administrator and cd to the AI4WRD-OCR folder
Run pip install pipenv
Run pipenv shell
Run pip install -r requirements.txt
Run pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio===0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
Run export PYTHONIOENCODING=utf8 on Linux or macOS, or set PYTHONIOENCODING=utf8 on Windows
Run streamlit run mainapp.py
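
Whichever installation route above was used, a quick sanity check before launching the app can catch the most common setup problems: the wrong Python version, or Tesseract or git missing from PATH. The script below is a minimal sketch and is not part of the AI4WRD-OCR repository; the function name check_environment and its parameters are made up for illustration.

```python
import shutil
import sys


def check_environment(required=(3, 9), tools=("tesseract", "git")):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper, not shipped with AI4WRD-OCR.
    """
    problems = []
    # Compare only major.minor, since the steps above ask for Python 3.9.
    if sys.version_info[:2] != tuple(required):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in required)
        problems.append(f"Python {wanted} expected, found {found}")
    # shutil.which returns None when an executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment looks ready"]:
        print(line)
```

Run it from the activated environment. If it reports that tesseract is missing, revisit the Tesseract environment variable step.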

Brief User Guide

For a more detailed guide head to the AI4WRD-OCR wiki.

Application startup

Run the application by first activating your environment, then running the following command in a terminal: streamlit run mainapp.py

Load Frame

When starting up the app, you will arrive at the home page, "Load Frame"
This page lets you select the video capture device using a dropdown menu
The "Language to detect" dropdown menu lets you choose the language to perform OCR on
Check the "Run" checkbox to preview the video capture
Click capture screenshot to take a screenshot of the current video
Proceed to the crop tool after capturing all required screenshots
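
EasyOCR and Tesseract use different codes for the same language, so a language picked once in the interface has to be translated for each engine. The sketch below shows one way that lookup could be written. The codes themselves are the ones each library documents, but the labels, the LANGUAGE_CODES table and the language_code function are assumptions for illustration and may not match how the app does it.

```python
# Hypothetical lookup table: display label -> code expected by each OCR engine.
LANGUAGE_CODES = {
    "English": {"easyocr": "en", "tesseract": "eng"},
    "Chinese (Simplified)": {"easyocr": "ch_sim", "tesseract": "chi_sim"},
    "Chinese (Traditional)": {"easyocr": "ch_tra", "tesseract": "chi_tra"},
}


def language_code(label, engine):
    """Translate a display label into the code for the given engine."""
    try:
        return LANGUAGE_CODES[label][engine]
    except KeyError:
        raise ValueError(f"No {engine!r} code known for {label!r}") from None


print(language_code("Chinese (Simplified)", "tesseract"))  # chi_sim
```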

Cropping tool

In the "Crop" page, the first frame of the livestream will be visible to you with a crop box overlaid on it
In the dropdown menu "choose image", you can choose which of your previous screenshots to define crops for
Drag the box over the section that you want to perform OCR on
The OCR will be performed on these cropped sections on a later page
Select the "Save Crop" button to save the crop; saved crops will appear at the bottom of the page
You may continue to add and store more crops
The zoom function is available on the sidebar for you to enlarge crops if the text appears to be too small
Note that the OCR accuracy can be affected depending on the size and clarity of your cropped text
Proceed to the next page when you have finished cropping for all desired screenshots
You may save or load crop configuration using the text box and buttons at the top of the page
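
To make the idea of a crop configuration concrete, the sketch below represents each crop as a rectangle, groups crops by screenshot, and round-trips them through a JSON file. It is only an illustration: the on-disk format used by saveLoad.py is not described here, so the Crop class, the JSON layout and the function names are all assumptions.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class Crop:
    """A rectangle in pixel coordinates (hypothetical representation)."""
    left: int
    top: int
    width: int
    height: int

    def apply(self, frame):
        """Cut this rectangle out of a frame given as a list of pixel rows."""
        rows = frame[self.top:self.top + self.height]
        return [row[self.left:self.left + self.width] for row in rows]


def save_crops(path, crops_by_screen):
    """Write {screen name: [Crop, ...]} to a JSON file."""
    data = {name: [asdict(c) for c in crops] for name, crops in crops_by_screen.items()}
    Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")


def load_crops(path):
    """Read a file written by save_crops back into Crop objects."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {name: [Crop(**c) for c in crops] for name, crops in data.items()}
```

Keying the crops by screenshot name mirrors the feature of associating each recognised screen with its own set of crops.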

OCR on Livestream

In the "OCR Livestream" page, check the done crop checkbox to begin the livestream
The livestream from your capture device will be visible at the top of the page
The livestream of the cropped sections will also appear below the main livestream
The OCR result of the crop is displayed below each crop
The "Choose Model" dropdown menu allows you to select the model used for OCR
The "OCR Confidence Cut-Off" widget allows you to filter out detected text below the defined threshold
The "Save Continuous to csv" button will allow you to continuously save new OCR data into a CSV file
The "Save Previous to csv" button allows you to save all the previous OCR data to a CSV file
The CSV file will be saved at the specified path
The "Publish to mqtt" button allows you to publish detected data to an MQTT broker at the Address, Port and Topic specified. Note: if there are multiple crops, each is published to its own topic, named in the format <topic specified><crop number>. For instance, with 2 crops and the topic ai4wrdOutput, the text from crop 1 is published to ai4wrdOutput1 and the text from crop 2 to ai4wrdOutput2
Note that the zoom tool is still available for you to enlarge crops
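
The confidence cut-off and the per-crop topic naming described above can be sketched in a few lines. This is an illustration under assumptions, not the app's code: detections are modelled as (text, confidence) pairs, the function names are invented, and the actual publishing step, which would need an MQTT client library, is left out.

```python
def filter_by_confidence(detections, cutoff):
    """Keep the text of detections whose confidence is not below the cutoff."""
    return [text for text, confidence in detections if confidence >= cutoff]


def topic_for_crop(base_topic, crop_number):
    """Build the topic name in the <topic specified><crop number> format."""
    return f"{base_topic}{crop_number}"


def build_messages(base_topic, results_per_crop, cutoff):
    """Return one (topic, payload) pair per crop, with crops numbered from 1."""
    messages = []
    for number, detections in enumerate(results_per_crop, start=1):
        payload = " ".join(filter_by_confidence(detections, cutoff))
        messages.append((topic_for_crop(base_topic, number), payload))
    return messages


# Made-up OCR results for two crops.
results = [
    [("Temp", 0.95), ("72", 0.91), ("?", 0.20)],
    [("OK", 0.88)],
]
for topic, payload in build_messages("ai4wrdOutput", results, cutoff=0.5):
    print(topic, payload)
# ai4wrdOutput1 Temp 72
# ai4wrdOutput2 OK
```

A subscriber interested in a single crop would subscribe to that crop's numbered topic, for example ai4wrdOutput2.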

Licensing

This software is licensed under the GNU GPLv3 LICENSE © Selangor Human Resource Development Centre. 2021. All Rights Reserved. Users that want to modify and distribute versions of AI4WRD and do not wish to conform to obligations to share the source code are free to contact SHRDC for alternative licensing options.

Contributing

We welcome any and all contributions through pull requests, whether it be bug fixes or new features.

Citation

streamlit-cropper component: https://github.com/turner-anderson/streamlit-cropper
easy-ocr library: https://github.com/JaidedAI/EasyOCR
tesseract library: https://github.com/tesseract-ocr/tesseract
