Make basic BC/IC classes correctly trainable #2077
kyouma wants to merge 8 commits into lululxvi:master
Conversation
It was possible to include trainable variables in BC/IC only by using `OperatorBC` (at least for the main types of BC/IC), but this was not explicitly stated in the documentation or FAQ. Now the basic BC/IC classes require the user to specify whether they depend on trainable variables (which disables function caching for them) or not. Backward compatibility will not be broken, as the order of arguments is the same; the new argument simply requires explicit setting at runtime.
Use the current changes, but set the default value of this new flag to False and remove the boolean check. - Old scripts will work as they do now. - If someone needs to train a BC/IC, then they will have to specify it, just as they do the trainable variables. Co-authored-by: Edwin Chen <echen1ffa@gmail.com>
Set the default value of the `depends_on_trainable_variables` flag to False and remove the boolean check. - Old scripts will work as they do now. - If someone needs to train a BC/IC, then they will have to specify it, just as they do the trainable variables.
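In practice, usage of the new flag would look roughly like this (a sketch; `geomtime` and the IC functions are placeholders, and the exact signature may differ from the merged version):

```python
import deepxde as dde

# Existing scripts keep working: the flag defaults to False and caching stays on.
ic_plain = dde.icbc.IC(geomtime, ic_func, lambda _, on_initial: on_initial)

# Opt-in for inverse problems: an IC whose function involves a trainable
# variable must disable caching, otherwise the gradients with respect to that
# variable are computed from stale cached values.
ic_trainable = dde.icbc.IC(
    geomtime,
    ic_func_with_trainable_variable,
    lambda _, on_initial: on_initial,
    depends_on_trainable_variables=True,
)
```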
Hello. Please check the changes. I have updated the default value, removed the boolean check, and updated the docstrings to reflect the changes in the documentation. The next step is to add an example and there emphasize the need to set the `depends_on_trainable_variables` flag.
echen5503
left a comment
It might be a good idea to warn the user if `depends_on_trainable_variables` isn't set.
```python
        [on_boundary(x[i], on[i]) for i in range(len(x))]
    )
    self.component = component
    self.depends_on_trainable_variables = depends_on_trainable_variables
```
Perhaps it might be a good idea to warn the user of the issue if `self.depends_on_trainable_variables` is not explicitly given.
I can see two ways to check whether `depends_on_trainable_variables` is set explicitly or not and print the warning:

- inside the BC/IC classes' `__init__()`: this will bombard even the users who do not solve inverse problems with the warning, but it does not create any interference between the BC/IC classes and, for example, the `Model` class, which can also check the BC properties;
- inside `Model.compile()` (i.e., examine the elements of `self.data.bcs`) if and only if `external_trainable_variables` is not empty (a sketch of this check follows below): this will not work for the `tensorflow.compat.v1` backend (there this list is ignored by `Model.compile()` and, subsequently, by the users), and it may require stricter control over future additions of classes connectible to `Model.data` (if they are not children of the `Data` class).

Edit: For `tensorflow.compat.v1` there is no explicit caching in BCs and ICs, so maybe the second option is also viable.
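A hypothetical sketch of the second option; `self.data.bcs` and `external_trainable_variables` are actual DeepXDE names, but the check itself is illustrative and was not part of this PR:

```python
# Inside Model.compile(), after external_trainable_variables is stored:
if self.external_trainable_variables:
    for bc in self.data.bcs:
        # None would mean the user never set the flag explicitly.
        if getattr(bc, "depends_on_trainable_variables", None) is None:
            print(
                "Warning: external trainable variables are used, but "
                "depends_on_trainable_variables is not set for a BC/IC."
            )
```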
I have made a commit with the first option. Following the DeepXDE style, the warning is done with a plain `print`, so that it will not distract the users unaffected by the bug too much. Besides, I will make an example of an inverse problem with a trainable IC and explicitly state the necessity of setting `depends_on_trainable_variables` to True there. And maybe even add a new entry to the FAQ section.
```python
elif utils.get_num_args(func) == 2:
    return wrapper_nocache_auxiliary
```
Use `@cache` here, so it only prints once, and maybe add a docstring that tells future maintainers that it only prints once.
Not sure that `@cache` is the best solution for this, but we shouldn't spam the user with warnings.
Nice, `lru_cache(1)` is cleaner than `@cache` by itself. As a final test, can you try the non-converging code from this issue and verify that it now converges?
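For reference, a sketch of what such a once-only warning helper could look like; the helper name `_warn_dependance_on_trainable_variables` appears in the diff further down, while the message text here is illustrative:

```python
import functools


@functools.lru_cache(1)
def _warn_dependance_on_trainable_variables():
    """Print the warning only once: the single no-argument call is cached."""
    print(
        "Warning: depends_on_trainable_variables is not set for a BC/IC and "
        "defaults to False. If the BC/IC depends on trainable variables, set "
        "it to True to disable function caching."
    )
```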
I was trying to modify the diffusion equation inverse problem example and move the unknown constant C from the PDE to the IC [code screenshots and convergence plots omitted]. I found out that if I put it into the IC function, it fails to run with caching on. And if I place the unknown constant C (initial guess 1.0) into the bias of the IC, then it runs even with caching on and, as expected, does not converge (it must become zero). When I disable caching for the IC, everything works, and also for C inside the IC function. I have also found the discussion #1727, where the author uses the same trick to make the IC learnable.
When you say "When I disable caching for the IC," you mean that you passed in the `depends_on_trainable_variables=True` flag?
Yes, I do. Here it is. I am planning to add this as an example of a trainable IC. This is the same problem as in the DeepXDE examples, but I have moved the unknown constant C from the PDE to the IC.
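A hedged sketch of what such a script might look like, modeled on the DeepXDE diffusion 1D inverse example with the trainable C moved into the IC bias as described above; the hyperparameters are placeholders, and the true value of C is 0:

```python
import numpy as np
import deepxde as dde
from deepxde.backend import tf

C = dde.Variable(1.0)  # unknown IC bias; the true value is 0

geom = dde.geometry.Interval(-1, 1)
timedomain = dde.geometry.TimeDomain(0, 1)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)


def pde(x, y):
    # Diffusion residual with a source term; the exact solution is
    # exp(-t) * sin(pi * x).
    dy_t = dde.grad.jacobian(y, x, i=0, j=1)
    dy_xx = dde.grad.hessian(y, x, i=0, j=0)
    return (
        dy_t
        - dy_xx
        + tf.exp(-x[:, 1:])
        * (tf.sin(np.pi * x[:, 0:1]) - np.pi ** 2 * tf.sin(np.pi * x[:, 0:1]))
    )


bc = dde.icbc.DirichletBC(geomtime, lambda x: 0, lambda _, on_boundary: on_boundary)
# The trainable C sits in the IC function, so caching must be disabled for it;
# without depends_on_trainable_variables=True the gradient w.r.t. C is lost.
ic = dde.icbc.IC(
    geomtime,
    lambda x: tf.sin(np.pi * x[:, 0:1]) + C,
    lambda _, on_initial: on_initial,
    depends_on_trainable_variables=True,
)

# Observations of the exact solution at t = 1 drive the recovery of C.
observe_x = np.vstack((np.linspace(-1, 1, num=10), np.full(10, 1))).T
observe_y = dde.icbc.PointSetBC(
    observe_x, np.exp(-1) * np.sin(np.pi * observe_x[:, 0:1]), component=0
)

data = dde.data.TimePDE(
    geomtime,
    pde,
    [bc, ic, observe_y],
    num_domain=40,
    num_boundary=20,
    num_initial=10,
    anchors=observe_x,
)
net = dde.nn.FNN([2] + [32] * 3 + [1], "tanh", "Glorot uniform")
model = dde.Model(data, net)
model.compile("adam", lr=0.001, external_trainable_variables=[C])
model.train(iterations=20000)
```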
echen5503
left a comment
Looks good, just a few small changes. Does this bug apply to PDE as well?
```diff
 if utils.get_num_args(func) == 1:
     return wrapper_cache
-if utils.get_num_args(func) == 2:
+elif utils.get_num_args(func) == 2:
```
What's the point of turning this into `elif`? Isn't this a common early-return design pattern, where you don't need `elif`?
I thought that because of the `... or disable_caching` part there can be a situation where both conditions are True, and if in the future the returns are changed into assignments for some reason, a bug may be overlooked. That is why I decided to mark all these branches as mutually exclusive.
The outer `elif` seems necessary by this reasoning, but what about the inner one?
For consistency and explicitness.
Ok, I guess it's just a style choice.
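For context, the dispatch under discussion looks roughly like this. This is a simplified sketch of `npfunc_range_autocache`: the wrapper definitions are omitted, the wrapper names follow those visible in the diffs, and `nocache_backend` is a hypothetical stand-in for the real backend condition:

```python
def npfunc_range_autocache(func, disable_caching=False):
    # func taking only x: cache unless the backend or the new flag forbids it.
    if utils.get_num_args(func) == 1:
        if nocache_backend or disable_caching:
            return wrapper_nocache
        return wrapper_cache
    # func taking x and auxiliary variables: same choice, auxiliary wrappers.
    elif utils.get_num_args(func) == 2:
        if nocache_backend or disable_caching:
            return wrapper_nocache_auxiliary
        return wrapper_cache_auxiliary
```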
As far as I understand, it only applies to IC and also to BCs that wrap the user `func` with `npfunc_range_autocache`.
That's my understanding too; the PDE classes don't seem to use caching.
echen5503
left a comment
Need a few clarifications
```python
        depends_on_trainable_variables = False
    self.depends_on_trainable_variables = depends_on_trainable_variables

    self.boundary_normal = npfunc_range_autocache(
```
Aren't you forgetting to add the flag here?
If DeepXDE supports non-constant geometry, then it is indeed safer, or even mandatory, to set the caching flag for the boundary normal calculation, too.
Correct me if I am wrong, but as far as I know, DeepXDE does not support such geometry: even parametrized geometry is reduced to some normalized constant geometry, with the parameters transferred into the functions.
If there is any concern that something may change in the future, it is better to apply the flag here, too, just like I have changed some lines with `if`/`elif` statements to make them more bug-proof against possible future changes.
DeepXDE only supports constant geometry, it seems, and to my knowledge there is no plan to support non-constant geometry in the near future; the current focus is fixing existing geometry bugs.
Then it may be OK to leave the boundary normal calculation as it is now. I can add a comment to each creation of `self.boundary_normal` saying that to support learnable boundaries the "disable caching" flag must be passed inside it, too.
Or actually, because any changes for learnable geometry support might be made in other places of the library and this comment might be overlooked, I will test how disabling caching for `self.boundary_normal` affects performance. If it is ~5%, I will add the flag to `self.boundary_normal`, too.
Actually, this is a more difficult question. If learnable geometry is introduced, it is likely to be done via the settings of the geometry objects, so one cannot pass the `depends_on_trainable_variables` of a BC object to manage the boundary normal caching. Thus, I have only added comments to warn future contributors.
```python
if depends_on_trainable_variables is None:
    _warn_dependance_on_trainable_variables()
    depends_on_trainable_variables = False
self.depends_on_trainable_variables = depends_on_trainable_variables
```
The rationale behind this is just consistency, right? Because this is not used.
Consistency and similar "interfaces" in the API docs are nice, but I have added it here mainly to make the code bug-proof in case some type of caching is applied here in the future, too. Apart from that, it is not necessary here at all; it is not even a style preference.
Sorry, when I said "style choice" I meant "design/robustness choice".
Right now there is indeed no need to add these lines here (and in one or two other places where the BC does not use caching for `func` at all), but if someone wanted to change this behavior in the future for some reason, this might be of help. As an alternative, a comment about the need to add the ability to disable caching, should caching be applied here in the future, could be written, but I have decided to just implement part of this mechanism here.
I think adding a comment for the areas where the flag is added but isn't used is sufficient.
Force-pushed from 36cdc53 to b1e820b
```python
        depends_on_trainable_variables = False
    self.depends_on_trainable_variables = depends_on_trainable_variables

    # If learnable geometry is introduced, the boundary normal caching must be disabled
```
What do you mean by learnable geometry? Finding the shape of the domain for an inverse problem?
Yes. Or actually, I should have written "learnable or variable" (for example, a domain that does not depend on any trainable variables but still changes its shape or size over time). But the introduction of such things into DeepXDE seems to be a very distant prospect.
```diff
-    def __init__(self, geom, func, on_boundary):
+    def __init__(self, geom, func, on_boundary, depends_on_trainable_variables=None):
+        # `depends_on_trainable_variables` is here in order to be consistent
         super().__init__(geom, on_boundary, 0)
```
I agree with not putting a warning for those that don't use autocache.
Should I change the default `depends_on_trainable_variables` argument value to False here and in other similar classes?
Hello.
In relation to issue #2074.

Currently it is possible to include trainable variables in BC/IC for inverse problems only by using `OperatorBC` (at least for the main types of BC/IC), but I did not manage to find an explicit statement about this in the documentation or FAQ. If one tries to use `DirichletBC`, `NeumannBC`, `IC`, etc. that depend on trainable variables, the function values are cached and the gradients with respect to the trainable variables are incorrect, which leads to terrible convergence behavior.

This PR proposes to require the basic BC/IC classes to specify whether they depend on trainable variables (which disables function caching for them) or not. It is done independently for each BC/IC to protect performance.

I have put the new argument at the end of the class signatures in order to preserve backward compatibility for scripts with positional arguments. But the default value is None, which will cause an exception saying that the user needs to set it correctly.

An alternative to these (or any other) code changes would be to add a warning to the documentation (at least the BC, IC, and inverse-problem examples sections) saying that for inverse problems with trainable variables in BC/IC one must use `OperatorBC`.
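For comparison, the existing `OperatorBC` workaround looks roughly like this (a sketch; `geom` is a placeholder). `OperatorBC` evaluates its function on the live network inputs and outputs, so a trainable variable in the residual stays in the computational graph and no caching is involved:

```python
import deepxde as dde

C = dde.Variable(1.0)  # trainable boundary value

# Enforce y = C on the boundary by driving the residual y - C to zero.
bc = dde.icbc.OperatorBC(
    geom,
    lambda x, y, _: y - C,
    lambda x, on_boundary: on_boundary,
)
```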