
Releases: pytorch/tensordict

v0.5.0: `consolidate`, compile compatibility and better non-tensor support

30 Jul 21:39

This release is packed with new features and performance improvements.

What's new


There is now a TensorDict.consolidate method that puts all the tensors in a single storage. This greatly speeds up serialization in multiprocessed and distributed settings.

PT2 support

TensorDict common ops (get, set, index, arithmetic ops etc) now work within torch.compile.
The list of supported operations can be found in test/ We encourage users to report any graph break caused by tensordict, as we are willing to improve the coverage as much as possible.

Python 3.12 support

#807 enables Python 3.12 support, a long-awaited feature!

Global reduction for mean, std and other reduction methods

It is now possible to get the grand average of a tensordict's content using tensordict.mean(reduce=True).
This applies to mean, nanmean, prod, std, sum, nansum and var.

from_pytree and to_pytree

We made it easy to convert a tensordict to a given pytree structure and build it from any pytree using to_pytree and from_pytree. #832
Similarly, conversion to namedtuple is now made easy thanks to #788.


One can now iterate through a TensorDict batch dimension and apply a function on a separate process thanks to map_iter.
This should enable the construction of datasets using TensorDict, where the preprocessing step is executed on a separate process. #847

Using flatten and unflatten, flatten_keys and unflatten_keys as context managers

It is now possible to use flatten_keys and flatten as context managers (#908, #779):

with td.flatten_keys() as flat_td:
    flat_td["flat.key"] = 0
assert td["flat", "key"] == 0

Building a tensordict using keyword arguments

We made it easy to build tensordicts with simple keyword arguments, like a dict is built in python:

td = TensorDict(a=0, b=1)
assert td["a"] == torch.tensor(0)
assert td["b"] == torch.tensor(1)

The batch_size is now optional for both tensordict and tensorclasses. #905

Load tensordicts directly on device

Thanks to #769, it is now possible to load a tensordict directly on a destination device (including "meta" device):

td = TensorDict.load(path, device=device)

New features

  • [Feature,Performance] to(device, pin_memory, num_threads) by @vmoens in #846
  • [Feature] Allow calls to get_mode, get_mean and get_median in case mode, mean or median is not present by @vmoens in #804
  • [Feature] Arithmetic ops for tensorclass by @vmoens in #786
  • [Feature] Best attempt to densely stack sub-tds when LazyStacked TDS are passed to maybe_dense_stack by @vmoens in #799
  • [Feature] Better dtype coverage by @vmoens in #834
  • [Feature] Change default interaction types to DETERMINISTIC by @vmoens in #825
  • [Feature] DETERMINISTIC interaction mode by @vmoens in #824
  • [Feature] Expose call_on_nested to apply and named_apply by @vmoens in #768
  • [Feature] Expose stack / cat as class methods by @vmoens in #793
  • [Feature] Load tensordicts on device, incl. meta by @vmoens in #769
  • [Feature] Make Probabilistic modules aware of CompositeDistributions out_keys by @vmoens in #810
  • [Feature] Memory-mapped nested tensors by @vmoens in #618
  • [Feature] Multithreaded apply by @vmoens in #844
  • [Feature] Multithreaded pin_memory by @vmoens in #845
  • [Feature] Support for non tensor data in h5 by @vmoens in #772
  • [Feature] TensorDict.consolidate by @vmoens in #814
  • [Feature] TensorDict.numpy() by @vmoens in #787
  • [Feature] TensorDict.replace by @vmoens in #774
  • [Feature] out argument in apply by @vmoens in #794
  • [Feature] to for consolidated TDs by @vmoens in #851
  • [Feature] zero_grad and requires_grad_ by @vmoens in #901
  • [Feature] add_custom_mapping and NPE refactors by @vmoens in #910
  • [Feature] construct tds with kwargs by @vmoens in #905
  • [Feature] deterministic_sample for composite dist by @vmoens in #827
  • [Feature] expand_as by @vmoens in #792
  • [Feature] flatten and unflatten as decorators by @vmoens in #779
  • [Feature] from and to_pytree by @vmoens in #832
  • [Feature] from_modules expand_identical kwarg by @vmoens in #911
  • [Feature] grad and data for tensorclasses by @vmoens in #904
  • [Feature] isfinite, isnan, isreal by @vmoens in #829
  • [Feature] map_iter by @vmoens in #847
  • [Feature] map_names for composite dists by @vmoens in #809
  • [Feature] online edition of memory mapped tensordicts by @vmoens in #775
  • [Feature] remove distutils dependency and enable 3.12 support by @GaetanLepage in #807
  • [Feature] to_namedtuple and from_namedtuple by @vmoens in #788
  • [Feature] view(dtype) by @vmoens in #835


  • [Performance] Faster getattr in TC by @vmoens in #912
  • [Performance] Faster lock_/unlock_ when sub-tds are already locked by @vmoens in #816
  • [Performance] Faster multithreaded pin_memory by @vmoens in #919
  • [Performance] Faster tensorclass by @vmoens in #791
  • [Performance] Faster tensorclass set by @vmoens in #880
  • [Performance] Faster to-module by @vmoens in #914

Bug Fixes

  • [BugFix,CI] Fix storage filename tests by @vmoens in #850
  • [BugFix] @property setter in tensorclass by @vmoens in #813
  • [BugFix] Allow any tensorclass to have a data field by @vmoens in #906
  • [BugFix] Allow fake-tensor detection pass through in torch 2.0 by @vmoens in #802
  • [BugFix] Avoid collapsing NonTensorStack when calling where by @vmoens in #837
  • [BugFix] Check if the current user has write access by @MateuszGuzek in #781
  • [BugFix] Ensure dtype is preserved with autocast by @vmoens in #773
  • [BugFix] Fix non-tensor writing in modules by @vmoens in #822
  • [BugFix] Fix (keys, values) in sub by @vmoens in #907
  • [BugFix] Fix _make_dtype_promotion backward compat by @vmoens in #842
  • [BugFix] Fix pad_sequence behavior for non-tensor attributes of tensorclass by @kurtamohler in #884
  • [BugFix] Fix builds by @vmoens in #849
  • [BugFix] Fix compile + vmap by @vmoens in #924
  • [BugFix] Fix deterministic fallback when the dist has no support by @vmoens in #830
  • [BugFix] Fix device parsing in augmented funcs by @vmoens in #770
  • [BugFix] Fix empty tuple index by @vmoens in #811
  • [BugFix] Fix fallback of deterministic samples when mean is not available by @vmoens in #828
  • [BugFix] Fix functorch dim mock by @vmoens in #777
  • [BugFix] Fix gather device by @vmoens in #815
  • [BugFix] Fix h5 auto batch size by @vmoens in #798
  • [BugFix] Fix key ordering in pointwise ops by @vmoens in #855
  • [BugFix] Fix lazy stack features (where and norm) by @vmoens in #795
  • [BugFix] Fix map by @vmoens in #862
  • [BugFix] Fix map test with fork on cuda by @vmoens in #765
  • [BugFix] Fix pad_sequence for non tensors by @vmoens in #784
  • [BugFix] Fix setting non-tensors as data in NonTensorData by @vmoens in


25 Apr 16:14

What's Changed

This new version of tensordict comes with a great deal of new features:

  • You can now perform pointwise arithmetic operations on tensordicts. For locked tensordicts and in-place operations such as += or data.mul_, fused CUDA kernels are used, which drastically improves the runtime.

    • [Feature] Pointwise arithmetic operations using foreach by @vmoens in #722
    • [Feature] Mean, std, var, prod, sum by @vmoens in #751
  • Casting tensordicts to device is now much faster out of the box, as data is cast asynchronously (and it's safe too!)
    [BugFix,Feature] Optional non_blocking in set, to_module and update by @vmoens in #718
    [BugFix] consistent use of non_blocking in tensordict and torch.Tensor by @vmoens in #734
    [Feature] non_blocking=None by default by @vmoens in #748

  • The non-tensor data API has also been improved, see
    [BugFix] Allow inplace modification of non-tensor data in locked tds by @vmoens in #694
    [BugFix] Fix inheritance from non-tensor by @vmoens in #709
    [Feature] Allow non-tensordata to be shared across processes + memmap by @vmoens in #699
    [Feature] Better detection of non-tensor data by @vmoens in #685

  • @tensorclass now supports automatic type casting: annotating a value as a tensor or an int ensures that the value is cast to that type when the tensorclass decorator takes the autocast=True argument.
    [Feature] Type casting for tensorclass by @vmoens in #735

  • now supports the "fork" start method. Preallocated outputs are also a possibility.
    [Feature] mp_start_method in tensordict map by @vmoens in #695
    [Feature] map with preallocated output by @vmoens in #667

  • Miscellaneous performance improvements
    [Performance] Faster flatten_keys by @vmoens in #727
    [Performance] Faster update_ by @vmoens in #705
    [Performance] Minor efficiency improvements by @vmoens in #703
    [Performance] Random speedups by @albanD in #728
    [Feature] Faster to(device) by @vmoens in #740

  • Finally, we have opened a discord channel for tensordict!
    [Badge] Discord shield by @vmoens in #736

  • We cleaned up the API a bit, adding save and load methods as well as some utils such as fromkeys. One can also check whether a key belongs to a tensordict, as with a regular dictionary, using key in tensordict!
    [Feature] contains, clear and fromkeys by @vmoens in #721

Thanks to all our contributors and community for the support!

Other PRs


v0.3.2: Minor release

07 Apr 13:39

[BugFix,Feature] Optional non_blocking in set, to_module and update (#718)
[Refactor] Refactor contiguous (#716)
[Test] Add proper tests for torch.stack with lazy stacks (#715)
[BugFix] Fix dense stack usage in torch.stack (#714)
[BugFix] Dense stack lazy tds defaults to dense_stack_tds (#713)
[Feature] Store non tensor stacks in a single json (#711)
[Feature] TensorDict logger (#710)
[BugFix, Feature] tensorclass.to_dict and from_dict (#707)
[BugFix] Fix inheritance from non-tensor (#709)
[Performance] Faster update_ (#705)
[Benchmark] Benchmark update_ (#704)
[Performance] Minor efficiency improvements (#703)
[Feature] Allow non-tensordata to be shared across processes + memmap (#699)
[CI] Unpin mpmath (#702)
[CI] Remove snapshot from CI (#701)
[BugFix] Support empty tuple in lazy stack indexing (#696)
[CI] Pinning mpmath (#697)
[BugFix] Allow inplace modification of non-tensor data in locked tds (#694)
[Feature] Better detection of non-tensor data (#685)
[Feature] Warn when reset_parameters_recursive is a no-op (#693)
[BugFix,Feature] filter_empty in apply (#661)

See the release on PyPI:


27 Feb 01:33

Solves several bugs and performance issues.

List of changes:

v0.3.0: `MemoryMappedTensor`, pickle-free multithreaded serialization and more!

31 Jan 14:07

In this release we introduce a bunch of exciting features to TensorDict:

  • We deprecate MemmapTensor in favour of MemoryMappedTensor, which is fully backed by torch.Tensor and not numpy anymore. The new API is faster and far less bug-prone than it used to be. See #541

  • Saving tensordicts on disk can now be done via memmap, memmap_ and memmap_like, which all support multithreading. If possible, serialization is pickle-free (memmap + json); pickle is only used for classes that fail to be serialized with json. Serializing models using tensordict is now 3-10x faster than using, even for SOTA LLMs such as LLAMA.

  • TensorDict can now carry non tensor data through the NonTensorData class. Assigning non-tensor data can also be done via __setitem__ and they can be retrieved via __getitem__. #601

  • A bunch of new operations have appeared too, such as named_apply (apply with key names) or tensordict.auto_batch_size_(), and operations like update can now be applied to only a subset of keys.

  • Almost all operations in the library are now faster!

  • We are slowly deprecating lazy classes except for LazyStackedTensorDict. Whereas torch.stack used to systematically return a lazy stack, it now returns a dense stack if the set_lazy_legacy(mode) decorator is set to False (which will be the default in the next release). The old behaviour can be set with set_lazy_legacy(True). Lazy stacks can still be obtained using LazyStackedTensorDict.lazy_stack. Appropriate warnings are raised unless you have patched your code accordingly.

What's Changed

  • [Refactor] MemoryMappedTensor by @vmoens in #541
  • [Feature] Multithread memmap by @vmoens in #592
  • [Refactor] Graceful as_tensor by @vmoens in #549
  • [Test] Fix as_tensor test by @vmoens in #551
  • Fix assignment of str-typed value to _device attribute in MemmapTensor by @kurt-stolle in #552
  • [Refactor] Refactor split by @vmoens in #555
  • [Refactor] Refactor implement_for by @vmoens in #556
  • [Feature] Better constructors for MemoryMappedTensors by @vmoens in #557
  • [CI] Fix benchmark on gpu by @vmoens in #560
  • [CI] Add regular benchmarks to CI in PRs without upload by @vmoens in #561
  • [Refactor] Major refactoring of codebase by @vmoens in #559
  • [Benchmark] Benchmark split and chunk by @vmoens in #564
  • [Performance] Faster split, chunk and unbind by @vmoens in #563
  • [Feature] Consolidate functional calls by @vmoens in #565
  • [Refactor] Improve functional call efficiency by @vmoens in #567
  • [Refactor] Do not lock nested tensordict in tensordictparams by @vmoens in #568
  • [Performance] Faster params and buffer registration in TensorDictParams by @vmoens in #569
  • [BugFix] Graceful attribute error exit in TensorDictParams by @vmoens in #571
  • [Refactor] Upgrade pytree import by @vmoens in #573
  • [BugFix] Compatibility with missing _global_parameter_registration_hooks by @vmoens in #574
  • [Feature] Seed workers in by @vmoens in #562
  • [Performance] Faster update by @vmoens in #572
  • [Performance] Faster to_module by @vmoens in #575
  • [BugFix] _FileHandler for windows by @vmoens in #577
  • [Performance] Faster __init__ by @vmoens in #576
  • [Feature, Test] Add tests for partial update by @vmoens in #578
  • [BugFix] No fallback on TensorDictModule.__getattr__ for private attributes by @vmoens in #579
  • [BugFix] Fix deepcopy of TensorDictParams by @vmoens in #580
  • Add by @vmoens in #581
  • [BugFix] Delete parameter/buffer before setting it with regular setattr in to_module by @vmoens in #583
  • [Feature] named_apply and default value in apply by @vmoens in #584
  • [BugFix] Faster empty_like for MemoryMappedTensor by @vmoens in #585
  • [BugFix] Faster empty_like for MemoryMappedTensor (dup) by @vmoens in #586
  • [BugFix] Adapt MemoryMappedTensor for torch < 2.0 by @vmoens in #587
  • [Performance] Make copy_ a no-op if tensors are identical by @vmoens in #588
  • [BugFix] Fix non-blocking arg in copy_ by @vmoens in #590
  • [Feature] Unbind and stack tds in map with chunksize=0 by @vmoens in #589
  • [Performance] Faster dispatch by @vmoens in #487
  • [Feature] Saving metadata of tensorclass by @vmoens in #582
  • [BugFix] Fix osx tests by @vmoens in #591
  • [Feature] Weakref for unlocking tds by @vmoens in #595
  • [BugFix] Fix pickling of weakrefs by @vmoens in #597
  • [Feature] Return early a tensordict created through memmap with multiple threads by @vmoens in #598
  • [CI] Depend on torch nightly for nightly releases by @vmoens in #599
  • [Feature] Storing non-tensor data in tensordicts by @vmoens in #601
  • [Feature, Test] FSDP and DTensors by @vmoens in #600
  • [Minor] Fix type deletion in tensorclass load_memmap by @vmoens in #602
  • [BugFix] Fix ellipsis check by @vmoens in #604
  • [Feature] Best intention stack by @vmoens in #605
  • [Feature] Remove and check for prints in codebase using flake8-print by @vmoens in #603
  • [Doc] Doc revamp by @vmoens in #593
  • [BugFix, Doc] Fix tutorial by @vmoens in #606
  • [BugFix] Fix gh-pages upload by @vmoens in #607
  • [BugFix] Upload content of html directly by @vmoens in #608
  • [Feature] Improve in-place ops for TensorDictParams by @vmoens in #609
  • [BugFix, CI] Fix GPU benchmarks by @vmoens in #611
  • [Feature] inplace to_module by @vmoens in #610
  • [Versioning] Bump v0.3.0 by @vmoens in #613
  • [Feature] Support group specification by @lucifer1004 in #616
  • [Refactor] Remove remaining MemmapTensor references by @vmoens in #617
  • [Tests] Reorder and regroup tests by @vmoens in #614
  • [Performance] Faster set by @vmoens in #619
  • [Performance] Better shared/memmap inheritance and faster exclude by @vmoens in #621
  • [Benchmark] Benchmark select, exclude and empty by @vmoens in #623
  • [Feature] Improve the empty method by @vmoens in #622
  • [BugFix] Fix is_memmap attribute for memmap_like and memmap by @vmoens in #625
  • Bump jinja2 from 3.1.2 to 3.1.3 in /docs by @dependabot in #626
  • [BugFix] Remove shared/memmap inheritance from clone / select / exclude by @vmoens in #624
  • [BugFix] Fix index in list error by @vmoens in #627
  • [Refactor] Make unbind call tensor.unbind by @vmoens in #628
  • [Feature] auto_batch_size_ by @vmoens in #630
  • [BugFix] Fix NonTensorData interaction by @vmoens in #631
  • [Doc] More doc on how to set and get non-tensor data by @vmoens in #632
  • [Feature] _auto_make_functional and _dispatch_td_nn_modules by @vmoens in #633
  • [BugFix] Fix exclude indent by @vmoens in #637
  • [BugFix] Limit number of threads in workers for .map() by @vmoens in #638
  • [Feature] Robust to lazy_legacy set to false and context managers for reshape ops by @vmoens in #634
  • [Minor] Typo in lazy legacy warnings by @Vmoe...


26 Oct 20:57

What's Changed

Full Changelog: v0.2.0...v0.2.1


05 Oct 06:54

New features

What's Changed



09 May 15:38

What's Changed

Full Changelog: v0.1.1...v0.1.2


06 May 21:08

What's Changed

  • [CI] Added workflow to let contributors self-assign issues by @sugatoray in #281
  • [BugFix] Fix reshape with non-expanded sizes by @vmoens in #283
  • [BugFix] Fix reshape with empty shape by @vmoens in #284
  • [BugFix] Improve utils (pad_sequence and make_tensordict) by @vmoens in #285
  • [BugFix] make_tensordict batch-size with tuple keys by @vmoens in #286
  • [BugFix] Fix memmap ownership to make it process-wise and allow for indexed memmap persistence by @vmoens in #288
  • [Refactor] Deprecate set_default by @tcbegley in #236
  • [BugFix] Fix get_functional and functional call with stateful envs by @vmoens in #287
  • [BugFix] Fix irecv for lazy tensordicts by @vmoens in #274
  • [Feature] h5 compatibility by @vmoens in #289
  • [Test, Refactor, Doc] add explicit test on set + remove hardcoded values by @apbard in #294
  • [Refactor] Make TensorDictBase available at root by @vmoens in #295
  • [Test, BugFix] execute h5 tests only if h5py is installed by @apbard in #298
  • [Refactor] Some improvement on modules by @vmoens in #296
  • [BugFix] Remove functionalized check in _decorate_funs by @vmoens in #300
  • [BugFix] deprecate CLASSES_DICT and _get_typed_output by @apbard in #299
  • [BugFix] add set/get and set_at,get_at methods to tensorclass by @apbard in #293
  • [BugFix] Fix function signature by @vmoens in #304
  • [Refactor] Faster functional module by @vmoens in #303
  • [Feature] forward getattr to wrapped module by @apbard in #290
  • [Feature] support tensorclasses in call by @apbard in #291
  • [Test] consolidate test_tensorclass and test_tensorclass_nofuture by @apbard in #302
  • [Test, Validation] Validate input model and add tests on input checks by @apbard in #305
  • [BugFix] Fix sequential calls to make_functional by @vmoens in #306
  • [Refactor] avoid adding _TENSORCLASS flag by @apbard in #301
  • [BugFix] Fix slow functional calls by @vmoens in #309
  • [BugFix] Fix deepcopy in benchmarks by @vmoens in #310
  • [BugFix] Fix sub-stack of td modules by @vmoens in #311
  • [Feature] Promote tensorclass by @vmoens in #307
  • [Test] Increase timeout for distributed and memmap tests by @apbard in #312
  • [BugFix] dispatch with empty batch-size by @vmoens in #315
  • [BugFix] Fix __setitem__ with broadcasting of tensordicts by @vmoens in #316
  • [BugFix] Allow for optional disabling of auto-batch size determination in dispatch by @vmoens in #317
  • [Refactor] td.set and sampling efficiency by @vmoens in #318
  • [CI] RL pipeline by @vmoens in #319
  • [BugFix] Fix sub-tensordict indexing and updating by @vmoens in #320
  • [Benchmark] More item benchmarks by @vmoens in #323
  • [Refactor] Use slots for faster creation by @vmoens in #321
  • [CI] Continuous benchmark trigger by @vmoens in #325
  • [CI] Continuous benchmark trigger (2) by @vmoens in #326
  • [Refactor] No check on batch-size when _run_checks=False by @vmoens in #322
  • [BugFix,CI] Codecov SHA error by @vmoens in #330
  • [Doc] Updated Docs with conda installation instruction by @sugatoray in #329
  • [Refactor] Compatibility with np.bool_ by @vmoens in #331
  • Deprecate interaction_mode with interaction_type by @Goldspear in #332
  • [CI] add benchmark test under regular pipeline by @apbard in #327
  • [Refactor] Make NormalParamExtractor available at tensordict.nn level by @vmoens in #334
  • [Refactor] Introduce InteractionType Enum by @Goldspear in #333
  • [Feature] Recursive key selection for sequences by @vmoens in #335
  • [BugFix] nested tds in persistent tds may have the wrong batch-size by @vmoens in #336
  • [Refactor] TensorDictModuleBase by @vmoens in #337
  • [Minor] Doc and vmap fixes by @vmoens in #338
  • [Feature] Close for h5 tds by @vmoens in #339
  • [Benchmark] TDModule benchmarks by @vmoens in #343
  • [BugFix] Key checks in TensorDictSequential by @tcbegley in #340
  • [Feature] set_skip_existing and related by @vmoens in #342
  • [Refactor] copy _contextlib by @vmoens in #344
  • [BugFix] Add dispatch decorator to probabilistic modules by @tcbegley in #345
  • [BugFix] Add sample_log_prob to out_keys when return_log_prob=True by @tcbegley in #346
  • [BugFix] Fix missing "sample_log_prob" when no sample is needed by @vmoens in #347
  • [Doc] Fix doc workflow by @vmoens in #348
  • [Feature] select_out_keys by @vmoens in #350
  • [BugFix] Fix ModuleBase __new__ attribute and property creation by @vmoens in #353
  • [Feature] tensordict.flatten by @vmoens in #354
  • [BugFix] Fix none indexing by @vmoens in #357
  • [Feature] Named dims by @vmoens in #356
  • [BugFix] Fixing set_at_ with names by @vmoens in #359
  • [BugFix] Changing tensordict batch size with names by @vmoens in #360
  • [BugFix] Populate tensordict without names by @vmoens in #361
  • [BugFix] Fix nested names, to(device) names and other bugs by @vmoens in #362
  • [Refactor] Upgrade vmap imports by @vmoens in #308
  • [Feature] as_tensor by @vmoens in #363
  • [BugFix] Fix contiguous names by @vmoens in #364
  • [CI] Upgrade ubuntu version in GHA by @vmoens in #365
  • [Feature] tensordict.reduce by @vmoens in #366
  • [BugFix] Assigning None to names in lazy stacked td by @vmoens in #367
  • [Feature] Modules that output dicts by @vmoens in #368
  • [BugFix] Fix functional calls by @vmoens in #369
  • [BugFix] Fix functional check for non TensorDictModuleBase modules by @vmoens in #370
  • [BugFix] Fix pop for stacked tds by @vmoens in #371
  • [Feature] Keep dimension names in vmap by @vmoens in #372
  • [Versioning] v0.1.1 by @vmoens in #373

New Contributors

Full Changelog: 0.1.0...v0.1.1

0.1.0 [Beta]

16 Mar 12:20

First official release of tensordict!

What's Changed

Full Changelog: 0.0.3...v0.1.0