
# Attention Is All You Need

This repo is a work in progress toward a minimal implementation of a transformer with multi-head self-attention, built for my own curiosity and deeper understanding. The model is trained and evaluated on a toy dataset where the task is to reverse a sequence of integers.
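For context, here is a minimal sketch of how such a sequence-reversal dataset might be generated; the function name `make_batch` and the vocabulary/sequence sizes are illustrative assumptions, not taken from the repo's code:

```python
import torch

def make_batch(batch_size=64, seq_len=10, vocab_size=20):
    """Toy batch for the reversal task: the target is the input, reversed.

    All names and sizes here are hypothetical, chosen for illustration.
    """
    src = torch.randint(1, vocab_size, (batch_size, seq_len))  # random integer tokens
    tgt = torch.flip(src, dims=[1])                            # reversed copy as the label
    return src, tgt

src, tgt = make_batch()
print(src[0])  # e.g. tensor([ 3, 17,  5, ...])
print(tgt[0])  # the same tokens in reverse order
```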

## Usage

1. Clone the repository:

   ```sh
   git clone https://github.com/naivoder/AttentionIsAllYouNeed.git
   cd AttentionIsAllYouNeed
   ```

2. Install the dependencies:

   ```sh
   pip install -r requirements.txt
   ```

3. Run the training script:

   ```sh
   python main.py
   ```

## Acknowledgements

Special thanks to Aladdin Persson for his explanation of `torch.einsum` for the attention mechanism.
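For reference, a sketch of the `einsum` pattern commonly used for scaled dot-product attention across multiple heads. The tensor layout `(batch, seq_len, heads, head_dim)` and variable names are assumptions for illustration, not necessarily what this repo uses:

```python
import torch
import torch.nn.functional as F

# Assumed shapes for illustration: (batch, seq_len, heads, head_dim).
N, L, H, D = 2, 10, 4, 16
queries = torch.randn(N, L, H, D)
keys = torch.randn(N, L, H, D)
values = torch.randn(N, L, H, D)

# einsum contracts the head_dim axis, producing one score matrix per head:
# (N, L_q, H, D) x (N, L_k, H, D) -> (N, H, L_q, L_k)
scores = torch.einsum("nqhd,nkhd->nhqk", queries, keys) / D ** 0.5
attn = F.softmax(scores, dim=-1)

# A second einsum takes the attention-weighted sum over the key positions,
# recovering the (N, L_q, H, D) layout.
out = torch.einsum("nhqk,nkhd->nqhd", attn, values)
```

The appeal of `einsum` here is that the subscript string documents the contraction explicitly, avoiding the reshape/transpose bookkeeping that `torch.bmm` would otherwise require.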