Release Version 0.4.0
What's Changed
- Release/0.3.0 by @ravi-mosaicml in #102
- Create dataloader on trainer `__init__()` by @ravi-mosaicml in #92
- label smoothing will not work without alpha set by @A-Jacobson in #100
- Warmup and cosine annealing warm restarts combine sequentially by @jacobfulano in #99
- Moved device.prepare() to init by @ravi-mosaicml in #111
- `run_event` for callbacks, removed deferred logging by @ravi-mosaicml in #85
- Remove `composer.trainer.ddp`; replace with `composer.utils.ddp` by @ravi-mosaicml in #105
- Running callbacks before algorithms for the INIT event in the engine by @ravi-mosaicml in #113
- Replaced `atexit` with cleanup methods by @ravi-mosaicml in #112
- Deepspeed Integration by @jbloxham in #109
- Fix loss reporting by @jbloxham in #130
- Run Directory Uploader by @ravi-mosaicml in #101
- Dataloader Upgrades by @ravi-mosaicml in #114
- Synthetic Datasets and Subset Sampling by @ravi-mosaicml in #110
- Remove argparse from setup.py by @ravi-mosaicml in #131
- Fixed pickling of torch.memory_format objects by @ravi-mosaicml in #132
- Fixed issue #135; rename `total_batch_size` to `train_batch_size` by @ravi-mosaicml in #137
- Implement MosaicMLLoggerBackend by @ajaysaini725 in #81
- Add a linear learning rate decay by @moinnadeem in #142
- Apply channels last on init by @ravi-mosaicml in #147
- Update Trainer checkpointing documentation by @moinnadeem in #150
- Address crashes with DDP + Checkpointing by @moinnadeem in #151
- Sudo in the dockerimage by @ravi-mosaicml in #152
- Remove curriculum learning by @ravi-mosaicml in #164
- Remove broken symlinks by @ravi-mosaicml in #163
- Removed dataclass from state by @ravi-mosaicml in #153
- Guard artifact uploading in wandb with ddp barriers by @ravi-mosaicml in #162
- add CODE_OF_CONDUCT.md by @kobindra in #160
- [XS] Fix wandb logger by @jbloxham in #172
- Print help on `run_mosaic_trainer.py`, cleaned up verbosity. by @ravi-mosaicml in #170
- DeepSpeed ZeRO config options by @jbloxham in #166
- DDP Seeding Across Processes by @ajaysaini725 in #173
- Fixed the run directory uploader test by @ravi-mosaicml in #177
- Fix broken gpu tests by @ravi-mosaicml in #181
- Conditionally skip tests when installed with mosaicml[dev] by @ravi-mosaicml in #185
- A yapf update broke some formatting...re-running the linter by @ravi-mosaicml in #188
- Timer PR parts 1 and 2 from #146 by @ravi-mosaicml in #174
- Fixed pyright issues by @ravi-mosaicml in #198
- Additional Tests by @ravi-mosaicml in #191
- Propagate processes that were sigkilled by @ravi-mosaicml in #184
- Add the ability to load a checkpoint without restoring state by @moinnadeem in #169
- Add ResNet-9 for CIFAR-10 by @dblalock in #193
- Added helper methods for torch.distributed.broadcast by @ravi-mosaicml in #189
- Checkpointing & DeepSpeed by @jbloxham in #199
- Distinguish between `dist` and DDP by @jbloxham in #201
- DeepSpeed precision fixes for CV by @jbloxham in #197
- Fix deterministic mode (and use it for tests); simplify checkpointing tests by @ravi-mosaicml in #203
- Load checkpoints from cloud storage by @ravirahman in #200
- Updated the `DataSpec` for the timing abstraction (#146) parts 3 and 4 by @ravi-mosaicml in #178
- Add larger GPT models by @jbloxham in #213
- Add BERT Base to Composer by @moinnadeem in #195
- Integrate the timer into the training loop by @ravi-mosaicml in #210
- Dockerfile enhancements by @ravi-mosaicml in #182
- Adding checkpointing at the end of training by @moinnadeem in #219
- Adding conditional branching on data_collator by @moinnadeem in #220
- Fix apt sources bug by @Averylamp in #231
- Remove old timing calls from layer freezing by @ravi-mosaicml in #216
- Require `pip install -e` be `pip install --user -e` when running as root by @ravi-mosaicml in #232
- DeepLabv3 + ADE20k benchmark by @Landanjs in #107
- Remove old timing calls from selective backprop by @ravi-mosaicml in #221
- Clean up the tests to make them work on jenkins by @ravi-mosaicml in #233
- Make the run directory rank-local; fix checkpoints saving and restoring by @ravi-mosaicml in #215
- Cleaned Up State by @ravi-mosaicml in #223
- Fix the speed monitor by @ravi-mosaicml in #238
- Fixed loggers and callbacks by @ravi-mosaicml in #240
- Fix ade20k padding fill calculation by @Landanjs in #250
- Adding fix for NLP learning rates by @moinnadeem in #235
- Training Loop Profiler by @ravi-mosaicml in #97
- WIP: Composer Jenkinsfile by @ravi-mosaicml in #82
- Fix broken tests by @ravi-mosaicml in #257
- Fix bug with AFTER_DATALOADER event; remove microbatches from state by @ravi-mosaicml in #258
- Remove the DDP DataLoader by @ravi-mosaicml in #245
- Fix Jenkins to work on PRs from Forks by @ravi-mosaicml in #267
- add ability to specify custom run name, with rank auto-appended by @dblalock in #264
- Remove secrets from the yaml by @ravi-mosaicml in #261
- Checkpoint logging and doc fixes by @ajaysaini725 in #270
- Remove custom W&B config changes by @siriuslee in #236
- Dramatically increase default dist_timeout by @jbloxham in #272
- Add factorization by @dblalock in #53
- Allow `str` and `dict` in Trainer `__init__` signature by @hanlint in #277
- Add kwargs back to the closure by @jbloxham in #292
- Default to `num_classes=10` for `CIFAR10_ResNet56` by @hanlint in #293
- Use `tqdm.auto` for notebooks by @hanlint in #298
- Added ResNet20 by @growlix in #289
- Optimizer Surgery by @ravi-mosaicml in #249
- Don't init dist when world_size is 1 by @jbloxham in #311
- Scheduler defaults to step-wise instead of epoch-wise by @hanlint in #312
- Added the version to `composer.__init__` by @ravi-mosaicml in #315
- Rename checkpoint API by @hanlint in #281
- Update setup.py by @Averylamp in #321
- Timm support by @A-Jacobson in #262
- [XS] use correct package name in error messages by @jbloxham in #331
- Multiple Evaluator Datasets by @anisehsani in #120
- Fixed all uses of textwrap.dedent by @ravi-mosaicml in #332
- Remove explicit YAHP constructs from algorithms by @jbloxham in #317
- Configure DeepSpeed with an ordinary DeepSpeed config dict by @jbloxham in #322
- Run `Event.BATCH_END` and `Event.EPOCH_END` after the timer is increm… by @ravi-mosaicml in #310
- Guard `dist.barrier` in the checkpointer with try/finally by @ravi-mosaicml in #334
- Replace composer ResNet with torchvision ResNet by @Landanjs in #314
- Fail fast if any step fails by @ravi-mosaicml in #333
- Replace most instances of "Mosaic" with "Composer" by @jbloxham in #335
- Ensure that the training dataloader does not have an active iterator. by @ravi-mosaicml in #337
- Fully flatten checkpoint params by @ravi-mosaicml in #325
- Added Pylint and docformatter by @ravi-mosaicml in #339
- Add compression flag by @mvpatel2000 in #336
- Fix cutmix and mixup reliance on num_classes model attribute by @Landanjs in #348
- Copy `extra_init_params` to get rid of recursive config dicts by @siriuslee in #316
- Composer Style Guide by @ravi-mosaicml in #319
- Get rid of `create_from_hparams` by @jbloxham in #351
- Added In Memory Logger, Timestamp Object by @ravi-mosaicml in #352
- Fix Checkpoints by @ravi-mosaicml in #359
- Add channels last standalone function by @dblalock in #356
- Quick style guide typo fix by @ajaysaini725 in #360
- Removed `template_default` fields in hparams by @ravi-mosaicml in #369
- removed byo_trainer by @anisehsani in #374
- Fix sample SD inference multiplication by @Landanjs in #376
- Support `import composer.functional as cf` by @dblalock in #368
- Fix composer.functional page no longer showing functions by @dblalock in #379
- Testing trainer.fit on each algorithm, callback, logger, and profiler by @ravi-mosaicml in #371
- Functional API renaming part 1 by @dblalock in #380
- Updated add_dataset_transform() to have flexible insertion point by @growlix in #320
- Rename `Event.TRAINING_START` to `Event.FIT`; remove `Event.TRAINING_END` by @ravi-mosaicml in #263
- Remove requirement for `validation` and `metrics` by @hanlint in #378
- Docs Refactor by @ravi-mosaicml in #386
- Documentation Outline by @ajaysaini725 in #302
- Fix tests without DDP by @ravi-mosaicml in #389
- Use `Makefile` instead of scripts; enable easier testing by @hanlint in #387
- Address Doc Fixes for Surgery and StochasticDepth by @ajaysaini725 in #413
- Cleanup `conftest.py` by @hanlint in #390
- Move `world_size` guard to trainer by @hanlint in #392
- Add defaults to functional API / share defaults across interfaces by @dblalock in #377
- Un-deprecate `steps_per_epoch` by @jbloxham in #418
- Remove the `walkthrough` section of the docs; replace with module-level docstrings by @ravi-mosaicml in #417
- Rename Loggers by @hanlint in #427
- Alternative docs theme: furo by @nqn in #341
- Clarify DWD defaults by @abhi-mosaic in #410
- Added :ignore-module-all: to docs by @ravi-mosaicml in #431
- Configured doctest by @ravi-mosaicml in #432
- Functional API renaming part 2 by @dblalock in #426
- Pytest Refactor Part 1 by @hanlint in #391
- Deprecate scale scheduler algorithm and move to trainer by @jbloxham in #438
- Removed dead code from the public library; refactored some imports. by @ravi-mosaicml in #437
- Trainer test refactor (pytest refactor phase 2) by @hanlint in #393
- Skip saving of direct serialization fields by @ravi-mosaicml in #445
- Hide gen_interpolation_lambda in mixup like in cutmix and augmix by @dblalock in #449
- Move all AlgorithmHparams classes to shared file by @dblalock in #452
- Trainer Docs + Param ordering + Alibi Export by @ajaysaini725 in #419
- Up and Running with Composer and Speedup Algorithms Demo Notebook by @growlix in #340
- Add NLP tutorial notebook by @Landanjs in #370
- add kaggle notebook by @A-Jacobson in #381
- Refactor Profiler `__init__()` by @bandish-shah in #422
- Random doc fixes by @ravi-mosaicml in #456
- support integer arguments to Trainer by @hanlint in #458
- Make algorithm functions either public or prefixed with "_" by @dblalock in #460
- bug in train metrics by @A-Jacobson in #466
- Fixes empty log lines if no algorithms are run by @siriuslee in #462
- Add default hparam values for cutout by @dblalock in #459
- Docstrings for `composer.utils` by @ravi-mosaicml in #439
- notebook tests by @hanlint in #468
- `resize_targets` set to `False` by default by @siriuslee in #475
- Remove `dist` warnings by @hanlint in #474
- Add missing defaults for one function by @dblalock in #476
- Store `metadata` in json files for `algorithms` by @hanlint in #471
- Davis/algos intrafile organization by @dblalock in #465
- Get functional API running enough for notebook by @dblalock in #479
- Remove colons from run directory timestamps by @ravi-mosaicml in #486
- Add custom methods notebook by @coryMosaicML in #330
- Move the clean notebooks script to the scripts folder by @ravi-mosaicml in #487
- Checkpoint Usability Initial Changes by @ajaysaini725 in #455
- Removing HF XFail on model registry by @moinnadeem in #490
- Clean up Imports and Tests by @ravi-mosaicml in #482
- Ravi/docs cleanup 2 by @ravi-mosaicml in #488
- Matthew/docstrings update by @growlix in #457
- No autodoc of forward by @ravi-mosaicml in #494
- Update `__init__.py` by @growlix in #493
- allow `from composer import ComposerModel` by @hanlint in #496
- Methods landing page by @nqn in #454
- Small docs change to include timing reference by @anisehsani in #500
- docstring for callbacks by @dskhudia in #470
- Docs cleanup #3 by @ravi-mosaicml in #502
- Adding network fixes for the Run Directory Uploader by @moinnadeem in #505
- Adding network retries for downloading GLUE by @moinnadeem in #506
- Matthew/loggers docstrings by @growlix in #499
- Fix Sphinx Warnings by @ravi-mosaicml in #520
- Anaconda configuration by @ravi-mosaicml in #507
- Update docstrings for ColOut, CutOut, CutMix, Layer Freezing, Mixup, Label Smoothing, Progressive Resizing by @coryMosaicML in #483
- Stateless schedulers by @jbloxham in #463
- Rename `selective_backprop` to `select_using_loss` by @ravi-mosaicml in #532
- Update new README by @hanlint in #540
- Fix dark mode by @nqn in #573
- Fix the run directory uploader when use_procs=True and not using the … by @ravi-mosaicml in #547
- Console font too bright by @nqn in #574
- Fix pil_image_collate by @Landanjs in #514
- ADE20k DeepLabv3 optimized benchmark yaml by @Landanjs in #579
- separate hparams in module docstrings by @hanlint in #558
- Fix DataloaderHparam docs by @ravi-mosaicml in #534
- per #224, update function to use Timer and Time by @jzf2101 in #583
- Clean up Transformer models init function by @moinnadeem in #587
- Docstrings for composer.trainer by @ajaysaini725 in #522
- Additional updates to the loggers docstrings by @growlix in #544
- Profiler docstrings by @bandish-shah in #473
- Updated Model Cards by @ajaysaini725 in #375
- Unify augmentation API part 1 by @dblalock in #524
- Docstrings improvements for core.algorithm, core.callback, etc. by @dskhudia in #516
- Skip ResNet50 + DeepSpeed tests that are timing out by @hanlint in #601
- Make the default split_batch method a no-op if grad_accum is 1. by @ravi-mosaicml in #592
- Add functional/standalone API tutorial notebook by @dblalock in #326
- Merge v0.4 fixes by @hanlint in #606
- updated docstring examples by @growlix in #600
- [v0.4rc] Documentation Guides by @hanlint in #531
- Method cards by @jfrankle in #589
- Improved docstring for surgery algorithms by @dblalock in #602
- Fix Lint by @ravi-mosaicml in #611
- Fix Lint by @ravi-mosaicml in #612
- Updated 'Up and Running with Composer' by @growlix in #619
- Release v0.4.0 by @hanlint in #609
New Contributors
- @A-Jacobson made their first contribution in #100
- @jacobfulano made their first contribution in #99
- @kobindra made their first contribution in #160
- @ravirahman made their first contribution in #200
- @Landanjs made their first contribution in #107
- @siriuslee made their first contribution in #236
- @mvpatel2000 made their first contribution in #336
- @abhi-mosaic made their first contribution in #410
- @jzf2101 made their first contribution in #583
- @jfrankle made their first contribution in #589
Full Changelog: v0.3.1...v0.4.0