From a6ab3fe9fb218e5a0ff294d9efb0c4de8f8d8ef5 Mon Sep 17 00:00:00 2001
From: JackCaoG <59073027+JackCaoG@users.noreply.github.com>
Date: Tue, 16 Jul 2024 16:15:04 -0700
Subject: [PATCH] Minor update to the docs (#7691)

---
 docs/ddp.md    | 4 ++--
 docs/fsdp.md   | 2 +-
 docs/fsdpv2.md | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/ddp.md b/docs/ddp.md
index 09e1c12f9d5..1fe68fa6cd5 100644
--- a/docs/ddp.md
+++ b/docs/ddp.md
@@ -1,8 +1,8 @@
-# How to do `DistributedDataParallel`
+# How to do DistributedDataParallel(DDP)
 
 This document shows how to use torch.nn.parallel.DistributedDataParallel in
 xla, and further describes its difference against the native xla data parallel
-approach.
+approach. You can find a minimum runnable example [here](https://github.com/pytorch/xla/blob/master/examples/data_parallel/train_resnet_ddp.py).
 
 ## Background / Motivation
 
diff --git a/docs/fsdp.md b/docs/fsdp.md
index f9a49812e12..3c86e99cbc9 100644
--- a/docs/fsdp.md
+++ b/docs/fsdp.md
@@ -61,7 +61,7 @@ The implementation of this class is largely inspired by and mostly follows the s
 ---
 
 ### Example training scripts on MNIST and ImageNet
-
+* Minimum example : [`examples/fsdp/train_resnet_fsdp_auto_wrap.py`](https://github.com/pytorch/xla/blob/master/examples/fsdp/train_resnet_fsdp_auto_wrap.py)
 * MNIST: [`test/test_train_mp_mnist_fsdp_with_ckpt.py`](https://github.com/pytorch/xla/blob/master/test/test_train_mp_mnist_fsdp_with_ckpt.py)
   (it also tests checkpoint consolidation)
 * ImageNet: [`test/test_train_mp_imagenet_fsdp.py`](https://github.com/pytorch/xla/blob/master/test/test_train_mp_imagenet_fsdp.py)
diff --git a/docs/fsdpv2.md b/docs/fsdpv2.md
index fe9b782a082..6ad04dc1eab 100644
--- a/docs/fsdpv2.md
+++ b/docs/fsdpv2.md
@@ -1,10 +1,10 @@
-# Fully Sharded Data Parallel via SPMD
+# Fully Sharded Data Parallel(FSDP) via SPMD
 
 Fully Sharded Data Parallel via SPMD or FSDPv2 is an utility that re-expresses the
 famous FSDP algorithm in SPMD. [This](https://github.com/pytorch/xla/blob/master/torch_xla/experimental/spmd_fully_sharded_data_parallel.py) is
 an experimental feature that aiming to offer a familiar interface for users to enjoy all the benefits that SPMD brings into
 the table. The design doc is [here](https://github.com/pytorch/xla/issues/6379).
 
-Please review the [SPMD user guide](./spmd.md) before proceeding.
+Please review the [SPMD user guide](./spmd_basic.md) before proceeding. You can also find a minimum runnable example [here](https://github.com/pytorch/xla/blob/master/examples/fsdp/train_decoder_only_fsdp_v2.py).
 
 Example usage:
 
 ```python3