Release Release version v0.5.0 · mosaicml/composer

We are excited to share Composer v0.5, a library of speed-up methods for efficient neural network training. This release features:

Revamped checkpointing API based on community feedback
New baselines: ResNet34-SSD, GPT-3, and Vision Transformers
Additional improvements to our documentation
Support for bfloat16
Streaming dataset support
Unified functional API for our algorithms

Highlights

Checkpointing API

Checkpointing models are now a Callback, so that users can easily write and add their own callbacks. The callback is automatically appended if a save_folder is provided to the Trainer.

trainer = Trainer(
    model=model,
    algorithms=algorithms,
    save_folder="checkpoints",
    save_interval="1ep"
)

Alternatively, CheckpointSaver can be directly added as a callback:

trainer = Trainer(..., callbacks=[
    CheckpointSaver(
        save_folder='checkpoints',
        name_format="ep{epoch}-ba{batch}/rank_{rank}",
        save_latest_format="latest/rank_{rank}",
        save_interval="1ep",
        weights_only=False,
    )
])

Subclass from CheckpointSaver to add your own logic for saving the best model, or saving at specific intervals. Thanks to @mansheej @siriuslee and other users for their feedback.

bloat16

We've added experimental support for bfloat16, which can be provided via the precision argument to the Trainer:

trainer = Trainer(
    ...,
    precision="bfloat16"
)

Streaming datasets

We've added support for fast streaming datasets. For NLP-based datasets such as C4, we use the HuggingFace datasets backend, and add dataset-specific shuffling, tokenization , and grouping on-the-fly. To support data parallel training, we added specific sharding logic for efficiency. See C4Datasets for more details.

Vision streaming datasets are supported via a patched version of the webdatasets package, and added support for data sharding by workers for fast augmentations. See composer.datasets.webdataset for more details.

Baseline GPT-3, ResNet34-SSD, and Vision Transformer benchmarks

Configurations for GPT-3-like models ranging from 125m to 760m parameters are now released, and use DeepSpeed Zero Stage 0 for memory-efficient training.

We've also added the Single Shot Detection (SSD) model (Wei et al, 2016) with a ResNet34 backbone, based on the MLPerf reference implementation.

Our first Vision Transformer benchmark is the ViT-S/16 model from Touvron et al, 2021, and based on the vit-pytorch package.

See below for the full details:

What's Changed

Export Transforms in composer.algorithms by @ajaysaini725 in #603
Make batchnorm default for UNet by @dskhudia in #535
Fix no_op_model algorithm by @dskhudia in #614
Pin pre-1.0 packages by @bandish-shah in #595
Updated dark mode composer logo, and graph by @nqn in #617
Jenkins + Docker Improvements by @ravi-mosaicml in #621
update README links by @hanlint in #628
Remove all old timing calls by @ravi-mosaicml in #594
Remove state shorthand by @mvpatel2000 in #629
add bfloat16 support by @nikhilsardana in #433
v0.4.0 Hotfix: Docker documentation updates by @bandish-shah in #631
Fix wrong icons in the method cards by @hanlint in #636
fix autocast for pytorch < 1.10 by @nikhilsardana in #639
Add tutorial notebooks to the README by @moinnadeem in #630
Converted Stateless Schedulers to Classes by @ravi-mosaicml in #632
Jenkinsfile Fixes Part 2 by @ravi-mosaicml in #627
Add C4 Streaming dataset by @abhi-mosaic in #489
CONTRIBUTING.md additions by @kobindra in #648
Hide showing object as a base class; fix skipping documentation of forward; fixed docutils dependency. by @ravi-mosaicml in #643
Matthew/functional docstrings update by @growlix in #622
docstrings improvements for core modules by @dskhudia in #598
ssd-resnet34 on COCO map 0.23 by @florescl in #646
Fix broken "best practices" link by @growlix in #649
Update progressive resizing to work for semantic segmentation by @coryMosaicML in #604
Let C4 Dataset overwrite num_workers if set incorrectly by @abhi-mosaic in #655
Lazy imports for pycocotools by @abhi-mosaic in #656
W&B excludes final eval metrics when plotted as a fxn of epoch or trainer/global_step by @growlix in #633
Update GPT3-yamls for default 8xA100-40GB by @abhi-mosaic in #663
Set WandB default to log rank zero only by @abhi-mosaic in #461
Update schedulers guide by @hanlint in #661
[XS] Fix a TQDM deserialization bug by @jbloxham in #665
Add defaults to the docstrings for algorithms by @hanlint in #662
Fix ZeRO config by @jbloxham in #667
[XS] fix formatting for colout by @hanlint in #666
Composer.core docstring touch-up by @ravi-mosaicml in #657
Add Uniform bounding box sampling option for CutOut and CutMix by @coryMosaicML in #634
Update README.md by @ravi-mosaicml in #678
Fix bug in trainer test by @hanlint in #651
InMemoryLogger has get_timeseries() method by @growlix in #644
Batchwise resolution for SWA by @growlix in #654
Fixed the conda build script so it runs on jenkins by @ravi-mosaicml in #676
Yahp version update to 0.1.0 by @Averylamp in #674
Streaming vision datasets by @knighton in #284
Fix DeepSpeed checkpointing by @jbloxham in #686
Vit by @A-Jacobson in #243
[S] cleanup tldr; standardize __all__ by @hanlint in #688
Unify algorithms part 2: mixup, cutmix, label smoothing by @dblalock in #658
composer.optim docstrings by @jbloxham in #653
Fix DatasetHparams, WebDatasetHparams docstring by @growlix in #697
Models docstrings by @A-Jacobson in #469
docstrings improvements for composer.datasets by @dskhudia in #694
Updated contributing.md and the style guide by @ravi-mosaicml in #670
Ability to retry ADE20k crop transform by @Landanjs in #702
Add mmsegmentation DeepLabv3(+) by @Landanjs in #684
Unify functional API part 3 by @dblalock in #715
Update example notebooks by @coryMosaicML in #707
[Checkpointing - PR1] Store the rank_zero_seed on state by @ravi-mosaicml in #680
[Checkpointing - PR2] Added in new Checkpointing Events by @ravi-mosaicml in #690
[Checkpointing - PR3] Clean up RNG and State serialization by @ravi-mosaicml in #692
[Checkpointing - PR4] Refactored the CheckpointLoader into a load_checkpoint function by @ravi-mosaicml in #693
Update {blurpool,factorize,ghostbn} method cards by @dblalock in #711
[Checkpointing - PR 5] Move the CheckpointSaver to a callback. by @ravi-mosaicml in #687
Update datasets docstrings by @growlix in #709
add notebooks and functional api by @hanlint in #714
Migrating from PTL notebook by @florescl in #436
Docs 0.4.1: Profiler section and tutorials by @bandish-shah in #696
Improve datasets docstrings by @knighton in #695
Update C4Dataset to repeat, handle max_samples safely by @abhi-mosaic in #722
Fix docs build by @ravi-mosaicml in #773
v0.5 Release by @hanlint in #732

New Contributors

@nikhilsardana made their first contribution in #433
@knighton made their first contribution in #284

Full Changelog: v0.4.0...v0.5.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release version v0.5.0