
🚀 Composer v0.10.0

Composer v0.10.0 is out! This release adds support for Comet experiment tracking, automatic selection of the evaluation batch size, API enhancements for evaluation, logging, and metrics, and a preview of our new streaming datasets repository!

pip install --upgrade mosaicml==0.10.0

New Features

  1. ☄️ Comet Experiment Tracking (#1490)

    We've added support for the popular Comet experiment tracker! To enable, simply create the logger and pass it to the Trainer object at initialization:

    from composer import Trainer
    from composer.loggers import CometMLLogger
    
    cometml_logger = CometMLLogger()
    
    trainer = Trainer(
        ...
        loggers=[cometml_logger],
    )

    Please see our Logging and CometMLLogger docs pages for details on usage.

  2. 🪄 Automatic Evaluation Batch Size Selection (#1417)

    Composer now supports eval_batch_size='auto', which chooses an evaluation batch size that avoids CUDA OOMs. In conjunction with grad_accum='auto', you can now run the same code on any hardware with no changes necessary, making it easy to add evaluation to a training script without hand-picking batch sizes.
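
    For example, enabling both settings together might look like this minimal sketch (the model and dataloaders are placeholders, and passing eval_batch_size directly to the Trainer is our assumption based on this note):

    from composer import Trainer

    trainer = Trainer(
        model=model,
        train_dataloader=train_dataloader,
        eval_dataloader=eval_dataloader,
        grad_accum='auto',        # automatically size training microbatches
        eval_batch_size='auto',   # automatically size evaluation batches
        max_duration='1ep',
    )
    trainer.fit()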

  3. 🎯 Evaluation API Changes (#1479)

    The Evaluation API has been updated to be consistent with the Trainer API. If the eval_dataloader was provided to the Trainer during initialization, eval can be invoked without needing to provide anything additional:

    trainer = Trainer(
        eval_dataloader=...
    )
    trainer.eval()

    Alternatively, the eval_dataloader can be passed directly to the eval() method:

    trainer = Trainer(
        ...
    )
    trainer.eval(
        eval_dataloader=...
    )

    The eval_dataloader can be a PyTorch dataloader or, for multiple metrics, a list of Evaluator objects.
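
    For example, a sketch of evaluating with multiple Evaluator objects (the labels, dataloaders, and metric names here are illustrative; see the Evaluator changes below):

    from composer.core import Evaluator

    evaluators = [
        Evaluator(label='glue_mrpc', dataloader=mrpc_dataloader, metric_names=['BinaryF1Score']),
        Evaluator(label='glue_sst2', dataloader=sst2_dataloader, metric_names=['Accuracy']),
    ]
    trainer.eval(eval_dataloader=evaluators)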

  4. 🪵 Simplified Logging (#1416)

    We've significantly simplified our internal logging interface:

    • Removed the use of LogLevel throughout the logging system; it was a mostly unused feature, and filtering logs is now the responsibility of each logger.
    • For better compatibility with external logging interfaces such as CometML or Weights & Biases, loggers now support the following methods: log_metrics, log_hyperparameters, and log_artifacts. Previous calls to data_fit, data_epoch, etc. have been removed. A sketch of a custom logger under the new interface is shown below.
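
    Under the new interface, a custom logger destination might look like the following minimal sketch (this assumes the LoggerDestination base class and these method signatures; the print-based behavior is purely illustrative):

    from typing import Any, Dict, Optional

    from composer.loggers import LoggerDestination

    class PrintLogger(LoggerDestination):
        """Illustrative destination that prints whatever it receives."""

        def log_metrics(self, metrics: Dict[str, float], step: Optional[int] = None) -> None:
            print(f'metrics at step {step}: {metrics}')

        def log_hyperparameters(self, hyperparameters: Dict[str, Any]) -> None:
            print(f'hyperparameters: {hyperparameters}')
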
  5. 🎯 validate → eval_forward (#1411, #1419)

    Previously, ComposerModel implemented a validate(batch: Any) -> Tuple[Any, Any] method that returned an (input, target) tuple, and the Trainer handled updating the metrics. In v0.10, we return control of metric updates to the user.

    Now, models instead implement def eval_forward(batch: Any), which returns the outputs of evaluation, and def update_metric(batch, outputs, metric), which updates the metric.

    An example implementation for classification can be found in our ComposerClassifier base class:

        def update_metric(self, batch: Any, outputs: Any, metric: Metric) -> None:
            _, targets = batch
            metric.update(outputs, targets)
    
        def eval_forward(self, batch: Any, outputs: Optional[Any] = None) -> Any:
            return outputs if outputs is not None else self.forward(batch)
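
    Migrating a custom model follows the same pattern. A minimal sketch (the model, names, and shapes here are hypothetical):

    from typing import Any, Optional

    import torch
    import torch.nn.functional as F
    from torchmetrics import Metric

    from composer.models import ComposerModel

    class TinyClassifier(ComposerModel):
        """Illustrative two-class model migrated to the new evaluation API."""

        def __init__(self):
            super().__init__()
            self.net = torch.nn.Linear(16, 2)

        def forward(self, batch: Any) -> torch.Tensor:
            inputs, _ = batch
            return self.net(inputs)

        def loss(self, outputs: torch.Tensor, batch: Any) -> torch.Tensor:
            _, targets = batch
            return F.cross_entropy(outputs, targets)

        def eval_forward(self, batch: Any, outputs: Optional[Any] = None) -> Any:
            # Reuse outputs if the Trainer already computed them.
            return outputs if outputs is not None else self(batch)

        def update_metric(self, batch: Any, outputs: Any, metric: Metric) -> None:
            _, targets = batch
            metric.update(outputs, targets)
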
  6. 🕵️‍♀️ Evaluator changes

    The Evaluator class now stores evaluation metric names instead of metric instances. For example:

    glue_mrpc_task = Evaluator(
        label='glue_mrpc',
        dataloader=mrpc_dataloader,
        metric_names=['BinaryF1Score', 'Accuracy']
    )

    These metric names are matched against the metrics returned by the ComposerModel. The metric instances are now stored as deep copies in the State class as state.train_metrics or state.eval_metrics.

  7. 🚧 Streaming Datasets Repository Preview

    We're in the process of splitting out streaming datasets into its own repository! Streaming datasets are a high-performance drop-in replacement for Torch IterableDataset objects that let you stream training data from cloud-based object stores. For an early preview, please check out the Streaming repo.

  8. YAHP deprecation

    We are deprecating support for yahp, our hyperparameter configuration tool. Support will be removed in the next minor release of Composer. We recommend that users migrate to OmegaConf or Hydra.
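
    For example, a minimal OmegaConf sketch (the config file, keys, and placeholders here are hypothetical):

    # config.yaml (hypothetical):
    #   max_duration: 10ep
    #   seed: 42
    from omegaconf import OmegaConf

    from composer import Trainer

    cfg = OmegaConf.load('config.yaml')
    trainer = Trainer(
        model=model,  # your ComposerModel (placeholder)
        train_dataloader=train_dataloader,
        max_duration=cfg.max_duration,
        seed=cfg.seed,
    )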

Full Changelog: v0.9.0...v0.10.0