CNN training on MNIST does not converge #145

* The `cnn_mnist` example, which trains a CNN on MNIST data, stays at chance-level (10%) accuracy across epochs.
* The `cnn_from_keras` example, which loads a pre-trained CNN from Keras, achieves the expected high accuracy (90.14%).

Because the pre-trained network reaches the expected accuracy using inference alone, the forward passes of the `conv2d`, `maxpool2d`, and `flatten` layers appear to be implemented correctly.

The culprit may be in the implementation of the `backward` methods of any of these layers, or in how gradients are passed between layers during the backward pass.
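
One way to narrow this down is a numerical gradient check on each layer in isolation: compare the gradient returned by the `backward` method against a central-difference estimate computed from the forward pass alone. The sketch below illustrates the idea in plain Python for a 2x2 max-pooling layer. The functions `maxpool2d_forward`, `maxpool2d_backward`, and `numerical_grad` are hypothetical stand-ins written for this illustration, not the library's code, and the input values are arbitrary.

```python
def maxpool2d_forward(x, pool=2):
    """Non-overlapping max pooling over a 2-D list of floats (illustrative)."""
    rows, cols = len(x) // pool, len(x[0]) // pool
    return [[max(x[i * pool + di][j * pool + dj]
                 for di in range(pool) for dj in range(pool))
             for j in range(cols)]
            for i in range(rows)]


def maxpool2d_backward(x, grad_out, pool=2):
    """Route each upstream gradient to the position of its window maximum."""
    grad_in = [[0.0] * len(x[0]) for _ in x]
    for i in range(len(grad_out)):
        for j in range(len(grad_out[0])):
            window = [(x[i * pool + di][j * pool + dj], i * pool + di, j * pool + dj)
                      for di in range(pool) for dj in range(pool)]
            _, r, c = max(window, key=lambda t: t[0])
            grad_in[r][c] += grad_out[i][j]
    return grad_in


def numerical_grad(f, x, eps=1e-5):
    """Central-difference gradient of a scalar function f with respect to x."""
    grad = [[0.0] * len(x[0]) for _ in x]
    for r in range(len(x)):
        for c in range(len(x[0])):
            saved = x[r][c]
            x[r][c] = saved + eps
            plus = f(x)
            x[r][c] = saved - eps
            minus = f(x)
            x[r][c] = saved  # restore the original value
            grad[r][c] = (plus - minus) / (2 * eps)
    return grad


# Arbitrary test input with a unique maximum in every 2x2 window.
x = [[0.10, 0.70, 0.30, 0.20],
     [0.50, 0.40, 0.90, 0.60],
     [0.80, 0.05, 0.15, 0.25],
     [0.35, 0.45, 0.55, 0.65]]

# Arbitrary upstream gradient; the scalar loss is sum(upstream * output).
upstream = [[1.0, 2.0],
            [3.0, 4.0]]


def loss(inp):
    out = maxpool2d_forward(inp)
    return sum(u * o for u_row, o_row in zip(upstream, out)
               for u, o in zip(u_row, o_row))


analytic = maxpool2d_backward(x, upstream)
numeric = numerical_grad(loss, x)
error = max(abs(a - n) for a_row, n_row in zip(analytic, numeric)
            for a, n in zip(a_row, n_row))
print("max abs difference:", error)
```

A layer whose `backward` is correct should give a difference close to zero. A large difference on one layer points to that layer's `backward`, while small differences on every layer would shift suspicion to how gradients are handed from one layer to the next.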

This should be fixed before the release of v0.13.0.
