Memory issue #34

Hi @samleoqh 

Thank you for releasing the source code. It has helped me a lot.

During training, I ran into a memory problem.
![image](https://github.com/samleoqh/MSCG-Net/assets/34616024/cecef4f0-de34-4368-b8fc-efc3a657a4ed)

The process consumes a lot of memory, over 150 GB of RAM. I think the problem is in the `validate` function, because it appends the data from every validation batch to the lists `inputs_all`, `gts_all`, and `predictions_all`, and these stay in host memory until validation finishes.

```python
def validate(net, val_set, val_loader, criterion, optimizer, epoch, new_ep):
    net.eval()
    val_loss = AverageMeter()
    inputs_all, gts_all, predictions_all = [], [], []

    with torch.no_grad():
        for vi, (inputs, gts) in enumerate(val_loader):
            inputs, gts = inputs.cuda(), gts.cuda()
            N = inputs.size(0) * inputs.size(2) * inputs.size(3)
            outputs = net(inputs)

            val_loss.update(criterion(outputs, gts).item(), N)
            # val_loss.update(criterion(gts, outputs).item(), N)
            if random.random() > train_args.save_rate:
                inputs_all.append(None)
            else:
                inputs_all.append(inputs.data.squeeze(0).cpu())

            gts_all.append(gts.data.squeeze(0).cpu().numpy())
            predictions = outputs.data.max(1)[1].squeeze(1).squeeze(0).cpu().numpy()
            predictions_all.append(predictions)

    update_ckpt(net, optimizer, epoch, new_ep, val_loss,
                inputs_all, gts_all, predictions_all)

    net.train()
    return val_loss, inputs_all, gts_all, predictions_all
```
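
One possible way around this, as a rough sketch only: if the accumulated lists are mainly used to compute segmentation metrics (I am assuming this is what `update_ckpt` does with them, I have not verified it), the per-batch results could be folded into a running confusion matrix instead of being stored. The helper names `update_confusion` and `mean_iou` and the `ignore_index=255` default are my own placeholders, not part of the repository.

```python
import numpy as np


def update_confusion(conf, gt, pred, num_classes, ignore_index=255):
    """Add one batch of labels to a running confusion matrix (rows: gt, cols: pred).

    Hypothetical helper. Assumes `pred` values lie in [0, num_classes).
    """
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    # Keep only pixels with a valid ground-truth class.
    valid = (gt != ignore_index) & (gt >= 0) & (gt < num_classes)
    # Encode each (gt, pred) pair as a single index, then count occurrences.
    idx = num_classes * gt[valid].astype(np.int64) + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    conf += counts.reshape(num_classes, num_classes)
    return conf


def mean_iou(conf):
    """Mean IoU over the classes that appear in gt or pred."""
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    return float((tp[present] / union[present]).mean())


# Tiny demonstration with 3 classes and two "batches".
conf = np.zeros((3, 3), dtype=np.int64)
conf = update_confusion(conf, [[0, 1], [2, 2]], [[0, 1], [2, 1]], num_classes=3)
conf = update_confusion(conf, [[255, 255]], [[0, 0]], num_classes=3)  # all ignored
miou = mean_iou(conf)
```

Inside the validation loop, `gts_all.append(...)` and `predictions_all.append(...)` would then be replaced by one `update_confusion(...)` call per batch, so memory use stays at `num_classes ** 2` integers regardless of the size of the validation set. The sampled images for visualization could still be kept, but only the ones actually selected by `save_rate`.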
