Translation (48): What does model.train() do in PyTorch? (updated 2022/12/21)
If you spot a translation problem, feel free to point it out in the comments. Thanks.
Update 2022/12/21: thanks to BirB for the correction; the mistranslated passage has been fixed.
Ten months after writing this, here is my updated understanding:
If a behavior is enabled only during training, gate it with if self.training:, for example dropout, which improves robustness by randomly masking some nodes during training (the built-in dropout layer needs no extra check on training; it handles this itself).
If training and testing take two different output paths, also branch on if self.training:, for example BNNeck, where training uses one more fully connected layer than testing does.
model.train() sets self.training = True, and model.eval() sets self.training = False, so even though training and testing share one model, self.training tells you which phase the model is currently in.
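The idea above can be sketched as follows. This is a minimal illustration only: the BNNeckHead name, feature size, and class count are made up for the example, not taken from any particular BNNeck implementation.

```python
import torch
import torch.nn as nn

# Illustrative sketch: a BNNeck-style head whose output path differs
# between training and evaluation, selected via self.training.
class BNNeckHead(nn.Module):
    def __init__(self, feat_dim=512, num_classes=10):
        super().__init__()
        self.bn = nn.BatchNorm1d(feat_dim)
        # This classifier layer is only used on the training path.
        self.fc = nn.Linear(feat_dim, num_classes, bias=False)

    def forward(self, x):
        feat = self.bn(x)
        if self.training:
            # Training path: one extra fully connected layer, producing logits.
            return self.fc(feat)
        # Evaluation path: return the normalized feature directly.
        return feat

head = BNNeckHead()
x = torch.randn(4, 512)

head.train()
print(head(x).shape)  # torch.Size([4, 10])

head.eval()
print(head(x).shape)  # torch.Size([4, 512])
```

Because model.train() and model.eval() recursively set self.training on every submodule, the same forward() transparently picks the right path for the current phase.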
What does model.train() do in PyTorch?
aerin asked:
- Does it call forward() in nn.Module? I thought when we call the model, the forward method is being used. Why do we need to specify train()?
Answers:
Umang Gupta - vote: 192
model.train() tells your model that you are training the model. So effectively layers like dropout, batchnorm etc., which behave differently on the train and test procedures, know what is going on and hence can behave accordingly.
More details: It sets the mode to train (see source code). You can call either model.eval() or model.train(mode=False) to tell that you are testing. It is somewhat intuitive to expect the train function to train the model, but it does not do that. It just sets the mode.
Translator's note: thanks to BirB for confirming the meaning of the last sentence: intuitively we might expect the train method to train the model, but it does not do that; it only switches between training and evaluation modes.
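A minimal sketch of what this means in practice: the same Dropout layer produces different outputs depending on which mode model.train() / model.eval() has set.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Dropout(p=0.5)
x = torch.ones(8)

model.train()    # training mode: elements are randomly zeroed,
out = model(x)   # survivors scaled by 1/(1-p) = 2.0
print(out)       # a mix of 0.0 and 2.0 values

model.eval()     # evaluation mode: dropout becomes the identity
print(model(x))  # tensor of ones, unchanged
```

Note that calling train() or eval() here trains nothing and runs no forward pass; it only flips the mode that the next forward call will observe.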
prosti - vote: 75
Here is the code of module.train():

    def train(self, mode=True):
        r"""Sets the module in training mode."""
        self.training = mode
        for module in self.children():
            module.train(mode)
        return self

And here is module.eval():

    def eval(self):
        r"""Sets the module in evaluation mode."""
        return self.train(False)
Modes train and eval are the only two modes we can set the module in, and they are exactly opposite.
That's just a self.training flag, and currently only Dropout and BatchNorm care about that flag.
By default, this flag is set to True.
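The recursion in the source above can be observed directly. A small sketch: train() and eval() just flip the self.training flag on the module and on every child module.

```python
import torch.nn as nn

# A container with three children; train()/eval() recurse into all of them.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(0.5), nn.BatchNorm1d(4))

model.eval()
print([m.training for m in model.modules()])  # [False, False, False, False]

model.train()  # mode=True by default
print([m.training for m in model.modules()])  # [True, True, True, True]
```

modules() yields the Sequential itself plus its three children, which is why four flags are printed.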
iacob - vote: 14
model.train(): Sets your model in training mode, i.e.
• BatchNorm layers use per-batch statistics
• Dropout layers activated, etc.
model.eval(): Sets your model in evaluation (inference) mode, i.e.
• BatchNorm layers use running statistics
• Dropout layers de-activated, etc.
Equivalent to model.train(False).
- Note: neither of these function calls runs forward / backward passes. They tell the model how to act when run.
- This is important as some modules (layers) (e.g. Dropout, BatchNorm) are designed to behave differently during training vs inference, and hence the model will produce unexpected results if run in the wrong mode.
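The BatchNorm difference can be sketched as follows (the batch shape and statistics here are arbitrary, chosen only to make the two modes visibly disagree):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(3)
x = torch.randn(16, 3) * 5 + 10  # a batch far from the default running stats

bn.train()
y_train = bn(x)   # normalized with this batch's own mean/variance:
                  # per-feature output mean is (numerically) zero

bn.eval()
y_eval = bn(x)    # normalized with the running statistics accumulated
                  # so far, so the same input maps to different values
```

Running the same batch through the wrong mode therefore shifts and rescales the activations, which is exactly the kind of unexpected result the note above warns about.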