
# Attention Is All You Need

This repo is a work in progress toward a minimal implementation of a transformer with multi-head self-attention, built for my own curiosity and deeper understanding. The model is trained and evaluated on a toy dataset where the task is to reverse a sequence of integers.
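For context, here is a minimal sketch of how such a sequence-reversal dataset might be generated; the function name `make_batch` and the vocabulary/sequence sizes are illustrative assumptions, not taken from the repo's code:

```python
import torch

def make_batch(batch_size=64, seq_len=10, vocab_size=20):
    """Toy batch for the reversal task: the target is the input, reversed.

    All names and sizes here are hypothetical, chosen for illustration.
    """
    src = torch.randint(1, vocab_size, (batch_size, seq_len))  # random integer tokens
    tgt = torch.flip(src, dims=[1])                            # reversed copy as the label
    return src, tgt

src, tgt = make_batch()
print(src[0])  # e.g. tensor([ 3, 17,  5, ...])
print(tgt[0])  # the same tokens in reverse order
```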

## Usage

1. Clone the repository:

   ```sh
   git clone https://github.com/naivoder/AttentionIsAllYouNeed.git
   cd AttentionIsAllYouNeed
   ```

2. Install the dependencies:

   ```sh
   pip install -r requirements.txt
   ```

3. Run the training script:

   ```sh
   python main.py
   ```

## Acknowledgements

Special thanks to Aladdin Persson for his explanation of `torch.einsum` for the attention mechanism.
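For reference, a sketch of the `einsum` pattern commonly used for scaled dot-product attention across multiple heads. The tensor layout `(batch, seq_len, heads, head_dim)` and variable names are assumptions for illustration, not necessarily what this repo uses:

```python
import torch
import torch.nn.functional as F

# Assumed shapes for illustration: (batch, seq_len, heads, head_dim).
N, L, H, D = 2, 10, 4, 16
queries = torch.randn(N, L, H, D)
keys = torch.randn(N, L, H, D)
values = torch.randn(N, L, H, D)

# einsum contracts the head_dim axis, producing one score matrix per head:
# (N, L_q, H, D) x (N, L_k, H, D) -> (N, H, L_q, L_k)
scores = torch.einsum("nqhd,nkhd->nhqk", queries, keys) / D ** 0.5
attn = F.softmax(scores, dim=-1)

# A second einsum takes the attention-weighted sum over the key positions,
# recovering the (N, L_q, H, D) layout.
out = torch.einsum("nhqk,nkhd->nqhd", attn, values)
```

The appeal of `einsum` here is that the subscript string documents the contraction explicitly, avoiding the reshape/transpose bookkeeping that `torch.bmm` would otherwise require.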