From 942ba7a681a24d99bf30e4ea04b1755fa59a64b7 Mon Sep 17 00:00:00 2001
From: Jiewen Tan
Date: Wed, 31 Jul 2024 12:52:12 -0700
Subject: [PATCH] [Pallas] Fix the doc (#7788)

---
 docs/pallas.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/pallas.md b/docs/pallas.md
index bbc968f95dd..99bf9b72496 100644
--- a/docs/pallas.md
+++ b/docs/pallas.md
@@ -1,6 +1,6 @@
 # Custom Kernels via Pallas
 
-With the rise of OpenAI [triton](https://openai.com/research/triton), custom kernels become more and more popular in the GPU community, for instance, the introduction of [FlashAttention](https://github.com/Dao-AILab/flash-attention) and [PagedAttention](https://blog.vllm.ai/2023/06/20/vllm.html). In order to provide the feature parity in the TPU world, Google has introduced [Pallas](http://go/jax-pallas) and [Mosaic](http://go/mosaic-tpu). For PyTorch/XLA to continue pushing the performance in TPU, we have to support custom kernels, and the best way is through Pallas and Mosaic. The design doc is [TBA]().
+With the rise of OpenAI [triton](https://openai.com/research/triton), custom kernels become more and more popular in the GPU community, for instance, the introduction of [FlashAttention](https://github.com/Dao-AILab/flash-attention) and [PagedAttention](https://blog.vllm.ai/2023/06/20/vllm.html). In order to provide the feature parity in the TPU world, Google has introduced [Pallas](https://jax.readthedocs.io/en/latest/pallas/index.html). For PyTorch/XLA to continue pushing the performance in TPU, we have to support custom kernels, and the best way is through Pallas. The design doc is [TBA]().
 
 Let's assume you have a Pallas kernel defined as follow:
 ```python3
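
For context on the hunk's trailing "Let's assume you have a Pallas kernel defined as follow:" line, a minimal Pallas kernel of that kind could look like the sketch below. It is adapted from the element-wise addition example in the JAX Pallas quickstart; the `add_vectors_kernel` and `add_vectors` names are illustrative and not taken from this patch.

```python3
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_vectors_kernel(x_ref, y_ref, o_ref):
    # The kernel operates on Refs: read inputs with [...] and write the output in place.
    o_ref[...] = x_ref[...] + y_ref[...]

@jax.jit
def add_vectors(x: jax.Array, y: jax.Array) -> jax.Array:
    # pallas_call wraps the kernel into a JAX-callable; out_shape declares the output buffer.
    # Requires a Pallas-supported backend (TPU/GPU); pass interpret=True to pallas_call
    # to run on CPU for debugging.
    return pl.pallas_call(
        add_vectors_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
    )(x, y)

# Example usage: adds two small integer vectors.
print(add_vectors(jnp.arange(8), jnp.arange(8)))
```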