
VAE.fit(X,y), Providing Y results in calling Torch.to_device on a list. Y is not explicitly ignored for unsupervised. #591

Open
wyler0 opened this issue Jun 27, 2024 · 0 comments

wyler0 commented Jun 27, 2024

The documentation states that y_train is ignored when passed to an unsupervised model's fit function, but that does not seem to be the case. Based on the code, this appears to affect all classes that inherit from BaseDeepLearningDetector, though I have only confirmed it on VAE.

I only ran into this when upgrading from PyOD 1.x to 2.x, because my 1.x code started failing.

import numpy as np
from pyod.models.vae import VAE

model = VAE()
X_train = np.array(...)
y_train = np.array(...)

model.fit(X_train, y_train)

Error:

  File "[...]/lib/python3.10/site-packages/pyod/models/base_dl.py", line 194, in fit
    self.train(train_loader)
  File "[...]/lib/python3.10/site-packages/pyod/models/base_dl.py", line 229, in train
    loss = self.training_forward(batch_data)
  File "[...]/lib/python3.10/site-packages/pyod/models/vae.py", line 246, in training_forward
    x = x.to(self.device)
AttributeError: 'list' object has no attribute 'to'

Offending code:

    def training_forward(self, batch_data):
        x = batch_data
        x = x.to(self.device)
        self.optimizer.zero_grad()
        x_recon, z_mu, z_logvar = self.model(x)
        loss = self.criterion(x, x_recon, z_mu, z_logvar,
                              beta=self.beta, capacity=self.capacity)
        loss.backward()
        self.optimizer.step()
        return loss.item()
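
One way to handle this on the VAE side (just a sketch of mine, not existing PyOD code) would be to have training_forward unpack the batch whenever the loader yields a (data, labels) pair and keep only the data:

    # Sketch of a possible fix -- mirrors vae.py's training_forward; only the
    # isinstance branch is my addition, the rest is unchanged.
    def training_forward(self, batch_data):
        if isinstance(batch_data, (list, tuple)):
            x = batch_data[0]  # drop the labels; unsupervised training only needs X
        else:
            x = batch_data
        x = x.to(self.device)
        self.optimizer.zero_grad()
        x_recon, z_mu, z_logvar = self.model(x)
        loss = self.criterion(x, x_recon, z_mu, z_logvar,
                              beta=self.beta, capacity=self.capacity)
        loss.backward()
        self.optimizer.step()
        return loss.item()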

The batch_data is assigned in BaseDeepLearningDetector as follows:

        if self.preprocessing:
            self.X_mean = np.mean(X, axis=0)
            self.X_std = np.std(X, axis=0)
            train_set = TorchDataset(X=X, y=y,
                                     mean=self.X_mean, std=self.X_std)
        else:
            train_set = TorchDataset(X=X, y=y)
       
        [...]

        train_loader = torch.utils.data.DataLoader(
            dataset=train_set, batch_size=self.batch_size,
            shuffle=True, drop_last=True)
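
This is where things go wrong when y is passed: a dataset that yields (x, y) pairs makes PyTorch's default collate_fn return each batch as a list of two tensors rather than a single tensor. A minimal standalone illustration (using a plain TensorDataset to stand in for PyOD's TorchDataset, whose exact behaviour I'm assuming here):

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    X = torch.randn(8, 3)
    y = torch.zeros(8)

    # When the dataset yields (x, y) pairs, the default collate_fn batches
    # them into a list of two tensors: [x_batch, y_batch].
    loader = DataLoader(TensorDataset(X, y), batch_size=4)
    batch_data = next(iter(loader))
    print(type(batch_data))     # <class 'list'>
    print(batch_data[0].shape)  # torch.Size([4, 3])

    # Calling batch_data.to(device) therefore fails, because a list has no
    # .to() method -- exactly the AttributeError above.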

And in train we do the following:

    def train(self, train_loader):
        """Train the deep learning model.

        Parameters
        ----------
        train_loader : torch.utils.data.DataLoader
            The data loader for training the model.
        """
        for epoch in tqdm.trange(self.epoch_num,
                                 desc=f'Training: ',
                                 disable=not self.verbose == 1):
            start_time = time.time()
            overall_loss = []
            for batch_data in train_loader:
                loss = self.training_forward(batch_data)

So it seems that either x = x.to(self.device) needs to handle the case where batch_data contains both the data and the labels (as in the sketch after the offending code above), or y needs to be explicitly ignored when train_loader is instantiated.
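
For the second option, this is roughly what it could look like in BaseDeepLearningDetector (my suggestion only; it assumes TorchDataset yields a bare tensor when y is None, which matches the fact that calling fit(X) without labels already works):

        # Sketch of the alternative fix: always build the training set from X
        # only, so the DataLoader yields plain tensors regardless of what was
        # passed as y.
        if self.preprocessing:
            self.X_mean = np.mean(X, axis=0)
            self.X_std = np.std(X, axis=0)
            train_set = TorchDataset(X=X, y=None,  # labels dropped: unsupervised
                                     mean=self.X_mean, std=self.X_std)
        else:
            train_set = TorchDataset(X=X, y=None)  # labels dropped: unsupervised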
