
[Breaking] Rewrite of nn to enable runtime layer sizes, proc macro declarations, and more #854

Merged: 62 commits into main from nn-rewrite on Oct 25, 2023
Conversation

@coreylowman (Owner) commented on Aug 18, 2023

Summary

This PR is a rewrite of the nn layer to make two things possible:

  1. Create networks that have both compile time known shapes and runtime known shapes.
  2. Create networks that are structs instead of tuples. This makes error messages easier to read and fields easier to access.

Here's an example of both of these in action:

```rust
#[derive(Default, Clone, Sequential)]
#[built(Mlp)]
pub struct MlpConfig {
    pub l1: LinearConfig<Const<3>, usize>,
    pub act1: ReLU,
    pub l2: LinearConfig<usize, Const<10>>,
    pub act2: ReLU,
}
```

Here we define the MLP to have input and output sizes known at compile time, while the interior hidden dimension is only known at runtime. Since the struct has #[derive(Sequential)], the layers are executed in order of declaration.

Also notice the #[built(Mlp)] attribute, which indicates the name of the new type defined alongside this struct (the type that contains the actual modules).
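For intuition, here is a rough sketch of the kind of struct the derive could generate for #[built(Mlp)]; the field types are assumptions inferred from the config above, not the macro's actual output:

```rust
// Hypothetical sketch of the generated built type; the real macro output may differ.
// Each Config field becomes the corresponding built module, parameterized by dtype and device.
pub struct Mlp<E: Dtype, D: Device<E>> {
    pub l1: Linear<Const<3>, usize, E, D>,
    pub act1: ReLU,
    pub l2: Linear<usize, Const<10>, E, D>,
    pub act2: ReLU,
}
```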

Instantiating this object is also pretty straightforward:

```rust
// NOTE: if this was all compile time, we could just do `Default::default()`
let structure = MlpConfig {
    l1: LinearConfig::new(Const, 5),
    act1: Default::default(),
    l2: LinearConfig::new(5, Const),
    act2: Default::default(),
};
let module: Mlp<f32, Cpu> = dev.build_module_ext::<f32>(structure);
```

Note that you now have to instantiate the architecture as an object, instead of it being a type. This is to support runtime values.
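Once built, the module should be usable like any other layer. A minimal usage sketch, assuming the existing Module::forward API carries over and using the hidden size of 5 from above:

```rust
// Sketch: forward pass through the built Mlp (API assumed, not taken from this PR).
let x: Tensor<Rank1<3>, f32, Cpu> = dev.sample_normal();
let y: Tensor<Rank1<10>, f32, Cpu> = module.forward(x);
```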

Breaking changes

  1. dfdx has been renamed to dfdx-core and no longer includes the nn items. dfdx now contains the new nn items and re-exports everything from dfdx-core.
  2. dfdx::optim has been moved under dfdx::nn::optim (see the import sketch after this list)
  3. EMA functionality is removed (it can be added back in the future, but will require more proc macros)
  4. TensorCollection is removed
  5. Saving nn layers to npy is removed; only safetensors is supported now.
  6. The old builder structs are now structs suffixed with "Config" (e.g. LinearConfig instead of builders::Linear)
  7. to_device functionality is removed
  8. to_dtype functionality is removed
  9. UnbiasedLinear is renamed to MatMul
  10. GeneralizedResidual is renamed to GeneralizedAdd
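For items 1 and 2, a hedged sketch of what migrating imports might look like (exact paths beyond those named above are assumptions):

```rust
// Before: optimizers lived at the crate root.
// use dfdx::optim::Adam;

// After: nn items (including optimizers) live under dfdx::nn,
// and dfdx re-exports everything from dfdx-core.
use dfdx::nn::optim::Adam; // moved per breaking change 2
use dfdx::tensor::Cpu;     // assumed re-export from dfdx-core
```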

@coreylowman coreylowman changed the title [Breaking] [WIP] Rewrite of nn to enable runtime layer sizes, proc macro declarations, and more [Breaking] Rewrite of nn to enable runtime layer sizes, proc macro declarations, and more Oct 25, 2023
@coreylowman coreylowman merged commit 5e0c3dd into main Oct 25, 2023
8 checks passed
@coreylowman coreylowman deleted the nn-rewrite branch October 25, 2023 15:14