Receipt Information Extraction (RIE)

Intro

Receipt information extraction is the process of automatically identifying and extracting relevant information from receipts, such as the date, merchant name, total amount, and individual item prices.
Receipt information extraction has numerous practical applications, such as in accounting, expense tracking, and financial analysis. By automating the extraction of receipt information, businesses can save time and reduce errors associated with manual data entry.
In this project, we focus on Vietnamese-language receipts.

Contributors

Duong T. Thanh (@duongttr)
Nguyen N. Doan Hieu (@ndhieunguyen)
Khoi N. The (@nguyenthekhoig7)
Hau T. Hoang (@hautran7201)
Kiet T. Tuan

Project's flow

The receipt image is first passed through YOLOv8 to detect the bounding boxes of text regions. Each detected region is cropped and sent to an OCR engine (Pytesseract in this case) to read its content. The image and the recognized text are then combined and passed to LayoutLMv3, which assigns a class to each piece of text.
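The flow above can be sketched as three pluggable stages. This is a minimal illustration, not the project's actual code: `extract_fields`, `Field`, the stand-in functions, and the label names `merchant` and `total` are all hypothetical. The one detail taken from outside this README is that the Hugging Face implementation of LayoutLMv3 expects bounding boxes on a 0-1000 scale, which `normalize_box` handles.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


@dataclass
class Field:
    text: str
    box: Box
    label: str


def normalize_box(box: Box, width: int, height: int) -> List[int]:
    """Scale a pixel box to the 0-1000 range used by LayoutLM-family models."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


def extract_fields(
    image,
    size: Tuple[int, int],
    detect: Callable[[object], Sequence[Box]],
    read_text: Callable[[object, Box], str],
    classify: Callable[[object, List[str], List[List[int]]], List[str]],
) -> List[Field]:
    """Detect text boxes, OCR each one, then classify text and boxes together."""
    width, height = size
    boxes = list(detect(image))                         # stage 1: detector (YOLOv8 in this project)
    texts = [read_text(image, box) for box in boxes]    # stage 2: OCR on each crop
    scaled = [normalize_box(b, width, height) for b in boxes]
    labels = classify(image, texts, scaled)             # stage 3: LayoutLMv3-style classifier
    return [Field(t, b, l) for t, b, l in zip(texts, boxes, labels)]


# Stand-ins so the sketch runs without any model; all values are made up.
def fake_detect(image):
    return [(50, 100, 200, 150), (50, 600, 350, 700)]


def fake_read_text(image, box):
    return {100: "SHOP NAME", 600: "TOTAL 45.000"}[box[1]]


def fake_classify(image, texts, boxes):
    return ["total" if t.startswith("TOTAL") else "merchant" for t in texts]


fields = extract_fields(None, (400, 800), fake_detect, fake_read_text, fake_classify)
for field in fields:
    print(field.label, "->", field.text)
```

Passing the stages in as callables keeps the sketch independent of any particular model library; in practice each stand-in would be replaced by a thin wrapper around the real detector, OCR engine, and classifier.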

Evaluation

Best F1-score on the evaluation dataset: 0.93222
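For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it from raw counts. The counts are made up for illustration and are not the project's results, and this README does not state how the reported score is averaged or at which level (token or entity) it is computed.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2PR / (P + R), with P = tp / (tp + fp) and R = tp / (tp + fn)."""
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up counts, only to show the call.
print(round(f1_score(tp=90, fp=10, fn=5), 4))
```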

Presentation

See our presentation slides for more details.

Dataset

Google Drive

Pretrained models

Check the Releases section to download the latest models.

How-to-use

Fine-tuning model

Download the dataset and place it in a folder named dataset with the following structure:

images: folder containing the receipt images
train.json: training annotations
val.json: validation annotations
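Before training, the layout can be checked with a small script such as the one below. It is only a sketch: `check_dataset` is a hypothetical helper, not part of this repository, and it only verifies that the three entries exist and that the annotation files parse as JSON. It makes no assumption about the annotation schema.

```python
import json
from pathlib import Path
from typing import List


def check_dataset(root: str = "dataset") -> List[str]:
    """Return a list of problems; an empty list means the layout looks right."""
    base = Path(root)
    problems = []
    if not (base / "images").is_dir():
        problems.append("missing folder: images")
    for name in ("train.json", "val.json"):
        path = base / name
        if not path.is_file():
            problems.append(f"missing file: {name}")
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{name} is not valid JSON: {exc}")
    return problems


# Report any problems with the default "dataset" folder.
for problem in check_dataset():
    print(problem)
```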

Then run this command:

python train.py --output_dir <output_dir> \
                 --max_steps 15000 \
                 --batch_size 4

Check the source code for more parameters.

Run the program

Clone the repo

git clone https://github.com/duongttr/vireceipt-information-extraction.git
cd vireceipt-information-extraction

Install dependencies

conda env create -f environment.yml

or

pip install -r requirements.txt

Run on localhost

python -m streamlit run VIE_run.py

References

Tesseract documentation. Tesseract OCR. (n.d.). Retrieved March 31, 2023, from https://tesseract-ocr.github.io/
A new state-of-the-art computer vision model. YOLOv8. (n.d.). Retrieved March 31, 2023, from https://yolov8.com/
Huang, Y., Lv, T., Cui, L., Lu, Y., & Wei, F. (2022). LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. arXiv. https://arxiv.org/abs/2204.08387
