📜  Python API Autograd and Initializer

📅  Last modified: 2020-12-10 04:54:36             🧑  Author: Mango


This chapter covers the autograd and initializer APIs in MXNet.

mxnet.autograd

This is MXNet's autograd (automatic differentiation) API for NDArray. It has the following class:

Class: Function()

It is used for customized differentiation in autograd. It can be written as mxnet.autograd.Function. If, for any reason, the user does not want to use the gradients computed by the default chain rule, he/she can use the Function class of mxnet.autograd to customize differentiation for computation. It has two methods, namely forward() and backward().

Let us understand the working of this class with the help of the following points:

  • First, we need to define our computation in the forward method.

  • Then, we need to provide the customized differentiation in the backward method.

  • During gradient computation, mxnet.autograd will use the backward function defined by the user instead of the default chain-rule backward function. We can also cast to a numpy array and back for performing some operations in forward as well as backward.

Before using the mxnet.autograd.Function class, let us define a stable sigmoid function with forward and backward methods as follows:

import mxnet as mx

class sigmoid(mx.autograd.Function):
    def forward(self, x):
        y = 1 / (1 + mx.nd.exp(-x))
        self.save_for_backward(y)          # stash y so backward can reuse it
        return y

    def backward(self, dy):
        y, = self.saved_tensors
        return dy * y * (1 - y)            # derivative of sigmoid: y * (1 - y)

Now, the Function class can be used as follows:

func = sigmoid()
x = mx.nd.random.uniform(shape=(10,))
x.attach_grad()                            # allocate a gradient buffer for x
with mx.autograd.record():
    m = func(x)
m.backward()
dx_grad = x.grad.asnumpy()
dx_grad

Output

When you run the code, you will see the following output:

array([0.21458015, 0.21291625, 0.23330082, 0.2361367 , 0.23086983,
       0.24060014, 0.20326573, 0.21093895, 0.24968489, 0.24301809],
      dtype=float32)

Methods and their parameters

Following are the methods of the mxnet.autograd module and their parameters:

  • forward(heads[, head_grads, retain_graph, …]) - This method is used for forward computation.
  • backward(heads[, head_grads, retain_graph, …]) - This method is used for backward computation. It computes the gradients of heads with respect to previously marked variables. This method takes as many inputs as forward's output and returns as many NDArrays as forward's inputs.
  • get_symbol(x) - This method is used to retrieve the recorded computation history as a Symbol.
  • grad(heads, variables[, head_grads, …]) - This method computes the gradients of heads with respect to variables. Once computed, instead of being stored into variable.grad, the gradients are returned as new NDArrays.
  • is_recording() - With the help of this method we can get the status on recording and not recording.
  • is_training() - With the help of this method we can get the status on training and predicting.
  • mark_variables(variables, gradients[, grad_reqs]) - This method marks NDArrays as variables for which autograd should compute gradients. It is the same as calling .attach_grad() on a variable, with the only difference that with this call we can set the gradient to any value.
  • pause([train_mode]) - This method returns a scope context to be used in a 'with' statement for code that does not need gradients to be calculated.
  • predict_mode() - This method returns a scope context to be used in a 'with' statement in which forward-pass behavior is set to inference mode, without changing the recording state.
  • record([train_mode]) - It returns an autograd recording scope context to be used in a 'with' statement and captures code that needs gradients to be calculated.
  • set_recording(is_recording) - The counterpart of is_recording(); with the help of this method we can set the status to recording or not recording.
  • set_training(is_training) - The counterpart of is_training(); with the help of this method we can set the status to training or predicting.
  • train_mode() - This method returns a scope context to be used in a 'with' statement in which forward-pass behavior is set to training mode, without changing the recording state.
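
The scope and status methods above work together. Below is a minimal sketch (not from the original source) showing how record() and pause() change what is_recording() and is_training() report, and which operations end up contributing to the gradient:

import mxnet as mx

x = mx.nd.ones((2,))
x.attach_grad()

print(mx.autograd.is_recording())        # False: no scope is active yet
with mx.autograd.record():
    print(mx.autograd.is_recording())    # True inside record()
    print(mx.autograd.is_training())     # True: record() defaults to train_mode
    y = x * 2                            # recorded, contributes to the gradient
    with mx.autograd.pause():
        z = y * 3                        # not recorded, excluded from the graph
y.backward()
print(x.grad.asnumpy())                  # [2. 2.]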

Implementation example

In the following example, we will use the mxnet.autograd.grad() method to compute the gradient of heads with respect to variables:

x = mx.nd.ones((2,))
x.attach_grad()
with mx.autograd.record():
    z = mx.nd.elemwise_add(mx.nd.exp(x), x)
dx_grad = mx.autograd.grad(z, [x], create_graph=True)
dx_grad

Output

The output is mentioned below:

[
[3.7182817 3.7182817]
]

We can use the mxnet.autograd.predict_mode() method to return a scope to be used in a 'with' statement:

with mx.autograd.record():
    y = model(x)                         # model is a user-defined network (placeholder)
    with mx.autograd.predict_mode():
        y = sampling(y)                  # sampling runs in inference mode (placeholder)
    mx.autograd.backward([y])
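
The mark_variables() method listed in the table above can be used in place of attach_grad() when we want the gradient written into a buffer we supply ourselves. A minimal sketch (the names x and gx are illustrative):

x = mx.nd.ones((3,))
gx = mx.nd.zeros_like(x)                 # buffer that will receive the gradient
mx.autograd.mark_variables([x], [gx])

with mx.autograd.record():
    y = x * x
y.backward()

print(gx.asnumpy())                      # d(x*x)/dx at x = 1 -> [2. 2. 2.]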

mxnet.initializer

This is MXNet's API for weight initializers. It has the following classes:

Classes and their parameters

Following are the classes of the mxnet.initializer module and their parameters:

  • Bilinear() - With the help of this class we can initialize weights for up-sampling layers.
  • Constant(value) - This class initializes the weights to a given value. The value can be a scalar as well as an NDArray that matches the shape of the parameter to be set.
  • FusedRNN(init, num_hidden, num_layers, mode) - As the name implies, this class initializes parameters for fused Recurrent Neural Network (RNN) layers.
  • InitDesc - It acts as the descriptor for the initialization pattern.
  • Initializer(**kwargs) - This is the base class of an initializer.
  • LSTMBias([forget_bias]) - This class initializes all biases of an LSTMCell to 0.0, except for the forget gate, whose bias is set to a custom value.
  • Load(param[, default_init, verbose]) - This class initializes the variables by loading data from a file or dictionary.
  • MSRAPrelu([factor_type, slope]) - As the name implies, this class initializes the weight according to an MSRA paper.
  • Mixed(patterns, initializers) - It initializes the parameters using multiple initializers.
  • Normal([sigma]) - The Normal() class initializes weights with random values sampled from a normal distribution with a mean of zero and a standard deviation (SD) of sigma.
  • One() - It initializes the weights of a parameter to one.
  • Orthogonal([scale, rand_type]) - As the name implies, this class initializes the weight as an orthogonal matrix.
  • Uniform([scale]) - It initializes weights with random values uniformly sampled from a given range.
  • Xavier([rnd_type, factor_type, magnitude]) - It returns an initializer that performs "Xavier" initialization for weights.
  • Zero() - It initializes the weights of a parameter to zero.
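
In practice these classes are usually passed to a network's initialize() call rather than instantiated on their own. A minimal sketch with a Gluon layer (the layer and shapes are illustrative, not from the original source):

import mxnet as mx
from mxnet.gluon import nn

net = nn.Dense(3)                        # weight shape is inferred lazily
net.initialize(mx.init.Xavier(magnitude=2.24))

x = mx.nd.random.uniform(shape=(2, 4))
y = net(x)                               # the first forward pass triggers initialization
print(net.weight.data().shape)           # (3, 4)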

Implementation example

In the following example, we will use the mxnet.init.Normal() class to create an initializer and retrieve its parameters:

init = mx.init.Normal(0.8)
init.dumps()

Output

The output is given below:

'["normal", {"sigma": 0.8}]'

Similarly, we can create a Xavier initializer and retrieve its parameters:

init = mx.init.Xavier(factor_type="in", magnitude=2.45)
init.dumps()

Output

The output is shown below:

'["xavier", {"rnd_type": "uniform", "factor_type": "in", "magnitude": 2.45}]'

In the following example, we will use the mxnet.initializer.Mixed() class to initialize parameters using multiple initializers:

init = mx.initializer.Mixed(['bias', '.*'],
                            [mx.init.Zero(), mx.init.Uniform(0.1)])
module.init_params(init)                 # module is an already-bound mx.mod.Module

for dictionary in module.get_params():
    for key in dictionary:
        print(key)
        print(dictionary[key].asnumpy())

Output

The output is shown below:

fullyconnected1_weight
[[ 0.0097627 0.01856892 0.04303787]]
fullyconnected1_bias
[ 0.]