clearState breaks nn.MV #390

@Morpheus3000

I am trying to save my network at specific intervals, but training fails on the backward pass of the next iteration, immediately after the model is saved, with the following error:

> ~/torch/install/bin/luajit: ~/torch/install/share/lua/5.1/nn/Container.lua:67: 
> In 1 module of nn.Sequential:
> ~/torch/install/share/lua/5.1/nn/MV.lua:50: attempt to index a nil value
> stack traceback:
> 	~/torch/install/share/lua/5.1/nn/MV.lua:50: in function <~/torch/install/share/lua/5.1/nn/MV.lua:47>
> 	[C]: in function 'xpcall'
> 	~/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
> 	~/torch/install/share/lua/5.1/nn/Sequential.lua:58: in function 'updateGradInput'
> 	~/torch/install/share/lua/5.1/nngraph/gmodule.lua:420: in function 'neteval'
> 	~/torch/install/share/lua/5.1/nngraph/gmodule.lua:454: in function 'updateGradInput'
> 	~/torch/install/share/lua/5.1/nn/Module.lua:31: in function 'backward'
> 	trainLoop.lua:225: in function 'opfunc'
> 	~/torch/install/share/lua/5.1/optim/adadelta.lua:31: in function 'solver'
> 	trainLoop.lua:228: in main chunk
> 	[C]: in function 'dofile'
> 	~/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
> 	[C]: at 0x00405d50
> 
> WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
> stack traceback:
> 	[C]: in function 'error'
> 	~/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
> 	~/torch/install/share/lua/5.1/nn/Sequential.lua:58: in function 'updateGradInput'
> 	~/torch/install/share/lua/5.1/nngraph/gmodule.lua:420: in function 'neteval'
> 	~/torch/install/share/lua/5.1/nngraph/gmodule.lua:454: in function 'updateGradInput'
> 	~/torch/install/share/lua/5.1/nn/Module.lua:31: in function 'backward'
> 	trainLoop.lua:225: in function 'opfunc'
> 	~/torch/install/share/lua/5.1/optim/adadelta.lua:31: in function 'solver'
> 	trainLoop.lua:228: in main chunk
> 	[C]: in function 'dofile'
> 	~/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
> 	[C]: at 0x00405d50

After some digging, I found that removing the call to network:clearState() before saving makes the error go away.
I searched for this problem, but every issue that reports it has been closed, so I assumed it had been fixed. Yet I still hit this error.
Is there a workaround until it gets fixed?
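
One possible workaround is sketched below. It rests on an assumption that is not confirmed in this report: that the nil value at MV.lua:50 is `self.gradInput[1]`, because clearState() replaced the table that nn.MV keeps in gradInput with an empty table. Under that assumption, putting two empty tensors back into gradInput after saving should let the next backward pass run. The helper name `restoreMVGradInput` is hypothetical and not part of nn.

```lua
require 'nn'

-- Hypothetical helper (not part of nn): give every nn.MV module a
-- two-tensor gradInput table again after clearState().
-- Assumption: clearState() left gradInput as an empty table.
local function restoreMVGradInput(model)
   local nodes = model:findModules('nn.MV')
   for _, m in ipairs(nodes) do
      if type(m.gradInput) ~= 'table' or #m.gradInput ~= 2 then
         -- output.new() creates an empty tensor of the module's tensor type
         m.gradInput = {m.output.new(), m.output.new()}
      end
   end
   return model
end

-- Usage sketch with a toy network standing in for the real one.
local network = nn.Sequential():add(nn.MV())
local input = {torch.randn(3, 4), torch.randn(4)}

network:forward(input)
network:clearState()            -- normally followed by torch.save(...)
restoreMVGradInput(network)     -- undo the damage before training resumes

network:forward(input)
local gradInput = network:backward(input, torch.randn(3))
```

An alternative that avoids touching module internals is to call clearState() on a clone of the network (`network:clone()`) and save the clone, at the cost of a temporary second copy of the model in memory.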
