v0.16.2
What's New
1. PyTorch Nightly Support
Composer now supports PyTorch Nightly and CUDA 12! Along with new Docker images based on nightly PyTorch versions and release candidates, we've updated our PyTorch monkeypatches to support the latest version of PyTorch. These monkeypatches add finer-grained FSDP wrapping and fix bugs related to sharded checkpoints. We are in the process of upstreaming these changes into PyTorch.
Bug Fixes
1. MosaicML Logger Robustness
The MosaicML logger is now robust to platform timeouts and other errors. Additionally, it can now be disabled by setting the environment variable MOSAICML_PLATFORM to 'False' when training on the MosaicML platform.
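As a minimal sketch of the opt-out described above, the variable can be set from Python before the trainer is constructed (or exported in the shell that launches the run). The helper `mosaicml_logging_enabled` is hypothetical and only illustrates how such a flag might be parsed; it is an assumption, not Composer's actual check.

```python
import os

# Set the flag before any trainer or logger is created, so it is
# visible when the logger decides whether to attach itself.
os.environ["MOSAICML_PLATFORM"] = "False"


def mosaicml_logging_enabled() -> bool:
    """Hypothetical helper: treat any value other than 'true'
    (case-insensitive) as disabled. Composer's own parsing may differ."""
    return os.environ.get("MOSAICML_PLATFORM", "false").lower() == "true"


if __name__ == "__main__":
    print("MosaicML logging enabled:", mosaicml_logging_enabled())
```

Setting the variable in the launch environment rather than in code has the same effect and keeps the training script unchanged.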
2. GCS Integration
GCS authentication is now supported with HMAC keys, patching a bug in the previous implementation.
3. Optimizer Monitor Norm Calculation (#2531)
Previously, the optimizer monitor incorrectly reduced norms across GPUs. It now correctly computes norms in a distributed setting.
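To illustrate why the reduction step matters, the sketch below shows the general principle for combining L2 norms across shards: sum the squared local norms, then take the square root once. It uses plain Python lists as stand-ins for per-GPU parameter shards; the function names are hypothetical, and this is not Composer's implementation or a description of the exact bug fixed in #2531.

```python
import math
from typing import List


def local_squared_norm(shard: List[float]) -> float:
    """Squared L2 norm of the values held by one (simulated) GPU."""
    return sum(x * x for x in shard)


def global_l2_norm(shards: List[List[float]]) -> float:
    """Combine shards the way a SUM all-reduce would:
    add the squared local norms, then take a single square root."""
    return math.sqrt(sum(local_squared_norm(s) for s in shards))


def naive_sum_of_norms(shards: List[List[float]]) -> float:
    """Incorrect combination: adding already-rooted local norms
    overestimates the true norm whenever more than one shard is nonzero."""
    return sum(math.sqrt(local_squared_norm(s)) for s in shards)


if __name__ == "__main__":
    shards = [[3.0], [4.0]]  # one value per simulated GPU
    print(global_l2_norm(shards))      # 5.0
    print(naive_sum_of_norms(shards))  # 7.0
```

In a real distributed job the sum over shards would be a collective operation on the squared norms, with the square root applied after the reduction.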
What's Changed
- fix: when there is no train_metrics, do not checkpoint by @furkanbiten in #2502
- Remove metric saving by @mvpatel2000 in #2514
- Fix daily tests by removing gpu marker by @j316chuck in #2515
- Refactor mosaic_fsdp.py by @b-chu in #2506
- Disable slack notifications for PRs by @mvpatel2000 in #2517
- Add custom sharding to ChunkShardingSpec by @b-chu in #2507
- Update nightly docker image to torch nightly 09-03-23 by @j316chuck in #2518
- Update pre-commit in setup.py by @b-chu in #2522
- Add FSDP custom wrap with torch 2.1 by @mvpatel2000 in #2460
- Fix GCSObjectStore bug where hmac keys auth doesn't work by @eracah in #2519
- Bump gitpython from 3.1.34 to 3.1.35 by @dependabot in #2525
- Bump pytest from 7.4.0 to 7.4.2 by @dependabot in #2523
- Upgrade to MLFlow version 2.5.0 by @ngcgarcia in #2528
- Disable cifar daily test by @mvpatel2000 in #2527
- Mosaicml logger robustness improvements by @mvpatel2000 in #2530
- Fix metrics keys sort in DecoupledAdamW for OptimizerMonitor FSDP metric agreggation by @m1kol in #2531
- Fix github actions for GCS integration testing by @mvpatel2000 in #2532
- Fix GCS tests by @mvpatel2000 in #2535
- Change cast for mosaicml logger by @mvpatel2000 in #2538
- Bump Version to 0.16.2 by @mvpatel2000 in #2537
- Bump transformers version by @dakinggg in #2539
New Contributors
- @ngcgarcia made their first contribution in #2528
- @m1kol made their first contribution in #2531
Full Changelog: v0.16.1...v0.16.2