
Dynamic dimensions in neural network layers? #755

Closed
bplevin36 opened this issue Apr 29, 2023 · 1 comment · Fixed by #854
Labels
new feature New feature or request

Comments

@bplevin36

The new dynamic dimension support for Tensors is great! Are there any plans to support it for Modules as well, to allow for e.g. a Linear layer where the input size is a const generic but the output size is dynamic?

I tried hacking this together myself, but ran into problems with allocation. It seems like the existing Device APIs are pretty strongly angled towards being able to allocate a Module based purely on its type. I can probably implement allocation and Module on my type manually, but it would be great to have a more ergonomic workaround.

@coreylowman (Owner)

Glad you like it! Yeah, aligning the existing nn layers to support both compile-time & run-time dimensions would complicate things quite a bit. It's definitely possible; for example, we could have Linear defined as:

struct Linear<In: Dim, Out: Dim, E: Dtype, D: DeviceStorage> {
    // In/Out can be either Const<N> or usize, since both implement Dim
    weight: Tensor<(In, Out), E, D>,
    bias: Tensor<(Out,), E, D>,
}

and then you could create type aliases like this:

type ConstLinear<const I: usize, const O: usize, E, D> = Linear<Const<I>, Const<O>, E, D>;
type DynLinear<E, D> = Linear<usize, usize, E, D>;

As far as instantiation, you're right that the existing method assumes sizes known at compile time. We'd likely have to do something similar to what tensors do, where we have one version of creation for compile-time shapes (dev.zeros()) and a separate version for runtime sizes (dev.zeros_like(&(3, 5))).
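
For a rough sense of what that could look like, here's a minimal sketch. The constructor names below are made up (not dfdx API); it assumes the Linear struct sketched above and the ZerosTensor trait that backs dev.zeros() / dev.zeros_like():

use dfdx::prelude::*;

// Hypothetical constructors, not actual dfdx API. The const version can rely
// entirely on the type, while the dynamic one has to take the sizes as values.
fn const_linear<const I: usize, const O: usize, E: Dtype, D: DeviceStorage + ZerosTensor<E>>(
    dev: &D,
) -> Linear<Const<I>, Const<O>, E, D> {
    Linear {
        weight: dev.zeros(), // shape (Const<I>, Const<O>) inferred from the field type
        bias: dev.zeros(),
    }
}

fn dyn_linear<E: Dtype, D: DeviceStorage + ZerosTensor<E>>(
    dev: &D,
    in_dim: usize,
    out_dim: usize,
) -> Linear<usize, usize, E, D> {
    Linear {
        weight: dev.zeros_like(&(in_dim, out_dim)), // runtime shape passed as a value
        bias: dev.zeros_like(&(out_dim,)),
    }
}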

I think my main concern would be the increased complexity in adding these, both from an internals perspective and from an external usability perspective.

The other part of this is that, currently in the deep learning ecosystem, most training/inference libraries redefine neural network types in their own libraries; huggingface does this all over the place. As far as dfdx goes, hopefully people can define their own nn layers in a similar way. So if someone wanted to implement linear/transformer/etc. with runtime shapes outside of dfdx, that should be possible (see the sketch below).
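
As a purely illustrative sketch of that: MyDynLinear and its forward below are hypothetical user code, not dfdx API; it uses concrete f32/Cpu types to keep the trait bounds simple, and leaves out bias handling and Module/optimizer integration:

use dfdx::prelude::*;

// Hypothetical user-defined layer, outside of dfdx: a linear layer whose
// sizes are only known at run time.
struct MyDynLinear {
    weight: Tensor<(usize, usize), f32, Cpu>, // (in, out), both runtime values
}

impl MyDynLinear {
    // (batch, in) -> (batch, out); the inner dims are checked at run time
    fn forward(&self, x: Tensor<(usize, usize), f32, Cpu>) -> Tensor<(usize, usize), f32, Cpu> {
        x.matmul(self.weight.clone())
    }
}

fn main() {
    let dev = Cpu::default();
    let (in_dim, out_dim): (usize, usize) = (512, 1000);
    let layer = MyDynLinear {
        weight: dev.zeros_like(&(in_dim, out_dim)), // zeros just for brevity
    };
    let x: Tensor<(usize, usize), f32, Cpu> = dev.zeros_like(&(8usize, in_dim));
    let _y = layer.forward(x); // shape (8, 1000), verified at run time
}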

Thoughts?
