
R2.1 #5651

Closed
wants to merge 24 commits into from
Conversation

ManfeiBai
Collaborator

No description provided.

vanbasten23 and others added 24 commits August 29, 2023 01:09
Update more places

Add torch_pin
Summary:
This change enables megacore_dense by default to allow asynchronous cc
ops, especially for GSPMD.

Test Plan:
CI

Co-authored-by: Jiewen Tan <jwtan@google.com>
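The commit above flips a default to "on". A minimal plain-Python sketch of that pattern, assuming a hypothetical environment-variable name `MEGACORE_DENSE` (the real gating mechanism inside torch_xla is not shown in this PR):

```python
import os

def megacore_dense_enabled() -> bool:
    # Hypothetical flag name, for illustration only. Enabled by default;
    # an explicit "0" in the environment turns it off.
    return os.environ.get("MEGACORE_DENSE", "1") != "0"

print(megacore_dense_enabled())  # True when the variable is unset
```

Defaulting to the enabled state while still honoring an explicit opt-out is the usual shape of this kind of rollout change.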
* Add option to unbundle libtpu

* Add clarifying note
* Fix FSDP not freeing frozen full params

* add test

* formatting

* remove unnecessary env var in test

Co-authored-by: Liyang90 <liyanglu@google.com>
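The FSDP fix above concerns gathered full parameters that were only released for trainable parameters, leaking memory for frozen ones. A schematic plain-Python sketch of the bug pattern and its fix; the class and function names here are illustrative stand-ins, not the actual FSDP internals:

```python
class Param:
    """Stand-in for a parameter holding a gathered full copy of its data."""

    def __init__(self, requires_grad: bool):
        self.requires_grad = requires_grad
        self.full_data = [0.0] * 4  # stand-in for the gathered full parameter

def free_full_params(params):
    for p in params:
        # The fix: free unconditionally. Gating this on p.requires_grad
        # would leave frozen parameters' full copies alive.
        p.full_data = None

params = [Param(True), Param(False)]  # the second parameter is frozen
free_full_params(params)
print(all(p.full_data is None for p in params))  # True
```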
* Update project metadata and remove useless files

* Update README

* Add manylinux platform tag

* formatting
* Add resnet50-weight-only-quant colab notebook

* update notebook with llama blog link

Co-authored-by: Siyuan Liu <lsiyuan@google.com>
Co-authored-by: Jiewen Tan <jwtan@google.com>
* Change `pjrt://` init method to `xla://` (#5560)

* Update PJRT documentation for the 2.1 release (#5557)

* Update PJRT documentation for the 2.1 release

* clarify plugins

* clarify PJRT doc

* Update `pjrt://` to `xla://`
…posing LoweringContext… (#5431) (#5580)

* Adding more explicit HLO lowering control by exposing LoweringContext (and utilities) to python for Neuron

* fixing linter issues

* fixing spacing

* apply comments and fix compilation errors

* add test for new apis

* fix linter

* update test

* update test

* modify test

* reverse back to GetIrValue()

* update test inputs with random numbers

* skip unittest because it only fails in CI

---------

Co-authored-by: aws-kingrj <78175353+aws-kingrj@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-3-186.us-west-2.compute.internal>
Co-authored-by: seanlatias <seanlatias@gmail.com>
Co-authored-by: aws-kingrj <78175353+aws-kingrj@users.noreply.github.com>
* Move where clear pending IR is called to avoid crash

* fix CI

* fix CI and add some debugging messages
* Allow downcasting RngUniform generation for Bernoulli

Co-authored-by: Yeounoh Chung <yeounoh@google.com>
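The Bernoulli sampling the commit refers to is conventionally implemented as a uniform draw thresholded against the probability; the change allows the uniform generation itself to be downcast. An illustrative plain-Python version of the semantics (not the XLA lowering):

```python
import random

def bernoulli(p: float, rng: random.Random) -> int:
    # Sample u ~ Uniform[0, 1) and threshold at p. The commit permits the
    # uniform draw to be generated at lower precision before this compare.
    return 1 if rng.random() < p else 0

rng = random.Random(0)
draws = [bernoulli(0.3, rng) for _ in range(10000)]
print(sum(draws) / len(draws))  # empirical mean, close to p for large n
```

Because `random()` returns values in [0, 1), p = 0 never fires and p = 1 always does, which is the boundary behavior any downcast implementation must preserve.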
* Enable autocast for XLA:GPU

* linter fix

* XLA autocast test for GPU and TPU

* linter fix

* Ensure that XLA autocast is properly enabled for GPU and does not crash when torch.cuda is not available.

* linter fix

* Add tests

* Support bf16

* linter fix

* exclude unsupported test cases

* increase GPU test timeout to 300

Co-authored-by: Yeounoh Chung <yeounoh@google.com>
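The autocast commits above scope a lower-precision compute dtype to a region of code. A schematic stand-in for that scoping behavior, in plain Python (this is not the real torch.autocast API, just the save/enter/restore pattern such a context follows):

```python
from contextlib import contextmanager

# Module-level state consulted by ops; a stand-in for the autocast state.
_state = {"dtype": "float32"}

@contextmanager
def autocast(dtype: str):
    prev = _state["dtype"]
    _state["dtype"] = dtype
    try:
        yield
    finally:
        _state["dtype"] = prev  # restore on exit, even if the body raises

def matmul_dtype() -> str:
    # An op would consult the active dtype to pick its compute precision.
    return _state["dtype"]

with autocast("bfloat16"):
    print(matmul_dtype())  # bfloat16 inside the scope
print(matmul_dtype())      # float32 outside
```

Restoring the previous dtype in `finally` is what keeps nested scopes and exceptions from corrupting the ambient precision, mirroring the bf16 support and crash-avoidance concerns in the commits.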
Summary:
The 20230826 version contains openxla/xla@3b8a539
which regresses our LLaMA2 GSPMD training benchmark. Let's roll back to the
version before it.

Test Plan:
CI.

Co-authored-by: Jiewen Tan <jwtan@google.com>
* Fix log spam when libtpu is loaded (#5619)

* fix conflict

* only cherry-pick

---------

Co-authored-by: Will Cromar <wcromar@google.com>
…5647)

* lower NativeDropoutBackward (#5642)

* lower NativeDropoutBackward

* fix lowering and add python test

* lower native_dropout (#5643)

* prototype version (compiling error)

* Add native_dropout manual lowering.

* fix to tensor IR and add a simple native_dropout test

* fix data type issue and update test case

* fix IR hash issue

* fix corner case when probability==0

* remove typo line

* add test case when probability=0

---------

Co-authored-by: JackCaoG <59073027+JackCaoG@users.noreply.github.com>
Co-authored-by: zpcore <piz@google.com>
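The native_dropout commits above include a corner-case fix for probability == 0. A plain-Python sketch of the reference semantics the lowering must match (keep each element with probability 1 - p, scale survivors by 1 / (1 - p), and pass the input through unchanged when p == 0); this is an illustration, not the XLA implementation:

```python
import random

def native_dropout(xs, p, rng):
    # p == 0 is the corner case fixed above: identity output, all-True mask,
    # and no division by (1 - p) path that could misbehave.
    if p == 0.0:
        return list(xs), [True] * len(xs)
    scale = 1.0 / (1.0 - p)
    mask = [rng.random() >= p for _ in xs]
    out = [x * scale if keep else 0.0 for x, keep in zip(xs, mask)]
    return out, mask

out, mask = native_dropout([1.0, 2.0, 3.0], 0.0, random.Random(0))
print(out == [1.0, 2.0, 3.0] and all(mask))  # True: p == 0 is the identity
```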
@ManfeiBai ManfeiBai closed this Sep 26, 2023