
Detailed Specification for Optimizer Methods

In the mini-torch framework, the Optimizer class maintains references to each module's (layer's) parameters and their gradient information. It follows the Single Responsibility Principle (SRP) of object-oriented design: the optimizer alone is responsible for updating parameters, so modules do not need to know anything about the optimization algorithm.

__init__()

The constructor for the Optimizer is responsible for receiving the components it needs to update and storing the hyperparameters for the optimization algorithm.

Method Signature

def __init__(self, modules, lr=0.01):

Specification

- modules: a list of the network's modules (layers) whose parameters the optimizer will update. The optimizer stores references to these modules rather than copies of their parameters.
- lr: the learning rate, stored as a hyperparameter for the update rule (default 0.01).

Example Implementation (for an SGD Optimizer)

def __init__(self, modules, lr=0.01):
    self.modules = modules
    self.lr = lr
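A brief usage sketch of the constructor. The ToyLayer class below is a hypothetical stand-in for a mini-torch module, used only to show that the optimizer stores references to the layers rather than copying them:

```python
class Optimizer:
    def __init__(self, modules, lr=0.01):
        self.modules = modules   # references to the layers to update
        self.lr = lr             # learning-rate hyperparameter

class ToyLayer:
    """Hypothetical stand-in for a mini-torch module."""
    pass

layers = [ToyLayer(), ToyLayer()]
opt = Optimizer(layers, lr=0.1)

# The optimizer holds the same list object, not a copy.
assert opt.modules is layers
assert opt.lr == 0.1
```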

step()

The step() method is the core engine of the optimizer. It is called once per batch to apply the gradients computed during the backward pass and update the network's weights.

Method Signature

def step(self):

Specification

- Iterate over every stored module, fetch its parameters and the matching gradients, and apply the update rule to each parameter in place.
- For SGD, each parameter is updated as p = p - lr * grad.
- Must be called only after the backward pass has populated the gradient caches.

Example Implementation (for an SGD Optimizer)

def step(self):
    for module in self.modules:
        params = module.parameters()
        grads = module.grads()
        
        # Update each parameter: p = p - lr * grad
        for i in range(len(params)):
            params[i] -= self.lr * grads[i]
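To see the update rule end to end, here is a runnable sketch of the SGD step() against a hypothetical ToyModule whose parameters() and grads() return the module's own internal lists (so in-place updates persist). The module here is an illustration, not part of mini-torch:

```python
class ToyModule:
    """Hypothetical module exposing parameters() and grads()."""
    def __init__(self):
        self._params = [1.0, -2.0]
        self._grads = [0.5, 0.5]

    def parameters(self):
        return self._params  # returned by reference, so updates persist

    def grads(self):
        return self._grads

class SGD:
    def __init__(self, modules, lr=0.01):
        self.modules = modules
        self.lr = lr

    def step(self):
        for module in self.modules:
            params = module.parameters()
            grads = module.grads()
            # p = p - lr * grad, written back into the module's own list
            for i in range(len(params)):
                params[i] -= self.lr * grads[i]

m = ToyModule()
opt = SGD([m], lr=0.1)
opt.step()
# 1.0 - 0.1 * 0.5 = 0.95 ;  -2.0 - 0.1 * 0.5 = -2.05
```

Note that the in-place `params[i] -= ...` works because parameters() hands back the module's own storage; if it returned a copy, the update would be silently lost.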

zero_grad()

The zero_grad() method clears the gradient caches left over from the previous training step, so gradients do not accumulate across batches.

Method Signature

def zero_grad(self):

Specification

- Reset (or clear) every module's gradient caches so gradients from the previous batch do not accumulate into the next one.
- Should be called once per batch, before the backward pass.

Example Implementation

def zero_grad(self):
    for module in self.modules:
        if hasattr(module, 'zero_grad'):
            module.zero_grad()
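The three methods above come together in the standard per-batch ordering: zero_grad(), then backward, then step(). The following sketch demonstrates that loop with a hypothetical one-parameter module that minimizes (w - 3)^2; the module and its gradient logic are illustrative assumptions, not mini-torch code:

```python
class ScalarModule:
    """Hypothetical one-parameter module minimizing (w - 3)^2."""
    def __init__(self):
        self._params = [0.0]
        self._grads = [0.0]

    def parameters(self):
        return self._params

    def grads(self):
        return self._grads

    def backward(self):
        # d/dw (w - 3)^2 = 2 * (w - 3); accumulate, as autograd would
        self._grads[0] += 2.0 * (self._params[0] - 3.0)

    def zero_grad(self):
        self._grads = [0.0] * len(self._grads)

class Optimizer:
    def __init__(self, modules, lr=0.01):
        self.modules = modules
        self.lr = lr

    def step(self):
        for module in self.modules:
            params, grads = module.parameters(), module.grads()
            for i in range(len(params)):
                params[i] -= self.lr * grads[i]

    def zero_grad(self):
        for module in self.modules:
            if hasattr(module, 'zero_grad'):
                module.zero_grad()

model = ScalarModule()
opt = Optimizer([model], lr=0.1)
for _ in range(100):
    opt.zero_grad()   # 1. clear stale gradients first
    model.backward()  # 2. compute fresh gradients
    opt.step()        # 3. apply the update p -= lr * grad
# w converges toward the minimum at 3.0
```

Skipping zero_grad() here would let backward() keep accumulating into the same cache, so each step() would apply the sum of all past gradients rather than the current one.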