
[Bug]: AnomalyScoreDistribution args, kwargs #2482

Open · 1 task done

Arman5563 opened this issue Jan 3, 2025 · 1 comment

Comments

Arman5563 commented Jan 3, 2025

Describe the bug

In the metrics.py file, we have the line

    image_metric.update(output["pred_scores"], output["label"].int())

On the other hand, the update() method of AnomalyScoreDistribution reads:

    del args, kwargs  # These variables are not used.

    if anomaly_maps is not None:
        self.anomaly_maps.append(anomaly_maps)
    if anomaly_scores is not None:
        self.anomaly_scores.append(anomaly_scores)

Essentially, the required scores arrive in args (they are passed positionally), but update() simply deletes args, so the metric never receives any data; this breaks whenever I try to use this metric.
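For illustration, here is a minimal, self-contained sketch of the mismatch (the class and variable names below are made up for the example, not the actual anomalib implementation):

    # Minimal sketch of the mismatch (hypothetical class, not the anomalib source):
    # the caller passes the scores positionally, but update() only reads the
    # keyword-only arguments and discards *args, so nothing is ever stored and
    # compute() later fails on an empty list.
    import torch

    class DistributionLike:
        def __init__(self) -> None:
            self.anomaly_scores: list[torch.Tensor] = []

        def update(self, *args, anomaly_scores=None, anomaly_maps=None, **kwargs) -> None:
            del args, kwargs  # positional inputs are silently thrown away
            if anomaly_scores is not None:
                self.anomaly_scores.append(anomaly_scores)

    metric = DistributionLike()
    # Mirrors the positional call in metrics.py: pred_scores and labels land in *args.
    metric.update(torch.rand(8), torch.randint(0, 2, (8,)))
    print(metric.anomaly_scores)  # [] -> torch.hstack([]) raises "hstack expects a non-empty TensorList"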

Dataset

Folder

Model

N/A

Steps to reproduce the behavior

The exact model does not seem to be relevant, at least for 'classification' tasks; I also tried Ganomaly with the same result.

    model_dfkde = Dfkde(backbone='efficientnet_b1', layers=['blocks.2'])
    engine_dfkde = Engine(
        image_metrics=["AUROC", "AUPR", "BinaryF1Score", "F1AdaptiveThreshold", "AnomalyScoreDistribution"],
        task="classification", max_epochs=1, logger=csv_logger, log_every_n_steps=7,
    )
    engine_dfkde.fit(datamodule=datamodule, model=model_dfkde)

OS information

  • OS: N/A, Google Colab environment
  • Python version: [3.10]
  • Anomalib version: [1.2.0]
  • PyTorch version: [2.5.1+cu121]
  • PyTorch Lightning version: [2.5.0.post0]

Expected behavior

I expected the run to go smoothly. It is worth noting that when I remove the "AnomalyScoreDistribution" metric, everything works fine.

Screenshots

No response

Pip/GitHub

pip

What version/branch did you use?

No response

Configuration YAML

I do not have one.

Logs

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-3-902c33e2027f> in <cell line: 10>()
      8                 task="classification",  max_epochs=1, logger=csv_logger, log_every_n_steps=7)#, "BinaryPrecisionRecallCurve", "AnomalyScoreDistribution",
      9 # Train the model
---> 10 engine_dfkde.fit(datamodule=datamodule, model=model_dfkde)
     11 test_dict=engine_dfkde.test(datamodule=datamodule, model=model_dfkde)

21 frames
/usr/local/lib/python3.10/dist-packages/anomalib/engine/engine.py in fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    547             self.trainer.validate(model, val_dataloaders, datamodule=datamodule, ckpt_path=ckpt_path)
    548         else:
--> 549             self.trainer.fit(model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    550 
    551     def validate(

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py in fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    537         self.state.status = TrainerStatus.RUNNING
    538         self.training = True
--> 539         call._call_and_handle_interrupt(
    540             self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
    541         )

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/call.py in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
     45         if trainer.strategy.launcher is not None:
     46             return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
---> 47         return trainer_fn(*args, **kwargs)
     48 
     49     except _TunerExitException:

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py in _fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    573             model_connected=self.lightning_module is not None,
    574         )
--> 575         self._run(model, ckpt_path=ckpt_path)
    576 
    577         assert self.state.stopped

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py in _run(self, model, ckpt_path)
    980         # RUN THE TRAINER
    981         # ----------------------------
--> 982         results = self._run_stage()
    983 
    984         # ----------------------------

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py in _run_stage(self)
   1024                 self._run_sanity_check()
   1025             with torch.autograd.set_detect_anomaly(self._detect_anomaly):
-> 1026                 self.fit_loop.run()
   1027             return None
   1028         raise RuntimeError(f"Unexpected state {self.state}")

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/fit_loop.py in run(self)
    214             try:
    215                 self.on_advance_start()
--> 216                 self.advance()
    217                 self.on_advance_end()
    218             except StopIteration:

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/fit_loop.py in advance(self)
    453         with self.trainer.profiler.profile("run_training_epoch"):
    454             assert self._data_fetcher is not None
--> 455             self.epoch_loop.run(self._data_fetcher)
    456 
    457     def on_advance_end(self) -> None:

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/training_epoch_loop.py in run(self, data_fetcher)
    149             try:
    150                 self.advance(data_fetcher)
--> 151                 self.on_advance_end(data_fetcher)
    152             except StopIteration:
    153                 break

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/training_epoch_loop.py in on_advance_end(self, data_fetcher)
    368                 call._call_lightning_module_hook(self.trainer, "on_validation_model_zero_grad")
    369 
--> 370             self.val_loop.run()
    371             self.trainer.training = True
    372             self.trainer._logger_connector._first_loop_iter = first_loop_iter

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/utilities.py in _decorator(self, *args, **kwargs)
    177             context_manager = torch.no_grad
    178         with context_manager():
--> 179             return loop_run(self, *args, **kwargs)
    180 
    181     return _decorator

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/evaluation_loop.py in run(self)
    149                 self.on_iteration_done()
    150         self._store_dataloader_outputs()
--> 151         return self.on_run_end()
    152 
    153     def setup_data(self) -> None:

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/evaluation_loop.py in on_run_end(self)
    289 
    290         # hook
--> 291         self._on_evaluation_epoch_end()
    292 
    293         logged_outputs, self._logged_outputs = self._logged_outputs, []  # free memory

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/evaluation_loop.py in _on_evaluation_epoch_end(self)
    371         call._call_lightning_module_hook(trainer, hook_name)
    372 
--> 373         trainer._logger_connector.on_epoch_end()
    374 
    375     def _store_dataloader_outputs(self) -> None:

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/connectors/logger_connector/logger_connector.py in on_epoch_end(self)
    194     def on_epoch_end(self) -> None:
    195         assert self._first_loop_iter is None
--> 196         metrics = self.metrics
    197         self._progress_bar_metrics.update(metrics["pbar"])
    198         self._callback_metrics.update(metrics["callback"])

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/connectors/logger_connector/logger_connector.py in metrics(self)
    233         on_step = self._first_loop_iter is not None
    234         assert self.trainer._results is not None
--> 235         return self.trainer._results.metrics(on_step)
    236 
    237     @property

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/connectors/logger_connector/result.py in metrics(self, on_step)
    474         for _, result_metric in self.valid_items():
    475             # extract forward_cache or computed from the _ResultMetric
--> 476             value = self._get_cache(result_metric, on_step)
    477             if not isinstance(value, Tensor):
    478                 continue

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/connectors/logger_connector/result.py in _get_cache(result_metric, on_step)
    438                         category=PossibleUserWarning,
    439                     )
--> 440                 result_metric.compute()
    441                 result_metric.meta.sync.should = should
    442 

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/connectors/logger_connector/result.py in wrapped_func(*args, **kwargs)
    287             if self._computed is not None:
    288                 return self._computed
--> 289             self._computed = compute(*args, **kwargs)
    290             return self._computed
    291 

/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/connectors/logger_connector/result.py in compute(self)
    252                 return value / cumulated_batch_size
    253             return value
--> 254         return self.value.compute()
    255 
    256     @override

/usr/local/lib/python3.10/dist-packages/torchmetrics/metric.py in wrapped_func(*args, **kwargs)
    698                 should_unsync=self._should_unsync,
    699             ):
--> 700                 value = _squeeze_if_scalar(compute(*args, **kwargs))
    701                 # clone tensor to avoid in-place operations after compute, altering already computed results
    702                 value = apply_to_collection(value, Tensor, lambda x: x.clone())

/usr/local/lib/python3.10/dist-packages/anomalib/metrics/anomaly_score_distribution.py in compute(self)
     46     def compute(self) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
     47         """Compute stats."""
---> 48         anomaly_scores = torch.hstack(self.anomaly_scores)
     49         anomaly_scores = torch.log(anomaly_scores)
     50 

RuntimeError: hstack expects a non-empty TensorList

Code of Conduct

  • I agree to follow this project's Code of Conduct
Arman5563 (Author) commented:

Update: I tried to override the class in the following way:
    from anomalib.metrics import AnomalyScoreDistribution
    import torch
    import torchmetrics  # needed for the setattr registration below


    class fixed_AnomalyScoreDistribution(AnomalyScoreDistribution):
        def update(
            self,
            *args,
            anomaly_scores: torch.Tensor | None = None,
            anomaly_maps: torch.Tensor | None = None,
            **kwargs,
        ) -> None:
            """Update the anomaly score distribution metric."""
            # Fall back to the first positional argument when no keyword is given.
            if args and anomaly_scores is None and anomaly_maps is None:
                anomaly_scores = args[0]
                print(anomaly_scores.size())
                # print(self.anomaly_scores)
            del args, kwargs  # These variables are not used.

            if anomaly_maps is not None:
                self.anomaly_maps.append(anomaly_maps)
            if anomaly_scores is not None:
                self.anomaly_scores.append(anomaly_scores)

        def compute(self) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
            """Compute stats."""
            anomaly_scores = torch.hstack(self.anomaly_scores)
            anomaly_scores = torch.log(anomaly_scores)

            self.image_mean = anomaly_scores.mean()
            self.image_std = anomaly_scores.std()

            if self.anomaly_maps:
                anomaly_maps = torch.vstack(self.anomaly_maps)
                anomaly_maps = torch.log(anomaly_maps).cpu()

                self.pixel_mean = anomaly_maps.mean(dim=0).squeeze()
                self.pixel_std = anomaly_maps.std(dim=0).squeeze()

            return self.image_mean.item(), self.image_std.item(), self.pixel_mean, self.pixel_std


    setattr(torchmetrics, "fixed_AnomalyScoreDistribution", fixed_AnomalyScoreDistribution)

Along with

    from pytorch_lightning.loggers import CSVLogger

    csv_logger = CSVLogger(save_dir="logs", name="dfkde_metrics")

    model_dfkde = Dfkde(backbone='efficientnet_b1', layers=['blocks.2'])
    engine_dfkde = Engine(
        image_metrics=["AUROC", "AUPR", "BinaryF1Score", "F1AdaptiveThreshold", "fixed_AnomalyScoreDistribution"],
        task="classification", max_epochs=1, logger=csv_logger, log_every_n_steps=7,
    )
    engine_dfkde.fit(datamodule=datamodule, model=model_dfkde)
    test_dict = engine_dfkde.test(datamodule=datamodule, model=model_dfkde)

And I get

ValueError                                Traceback (most recent call last)
in <cell line: 10>()
      8                 task="classification",  max_epochs=1, logger=csv_logger, log_every_n_steps=7)#, "BinaryPrecisionRecallCurve", "AnomalyScoreDistribution",
      9 # Train the model
---> 10 engine_dfkde.fit(datamodule=datamodule, model=model_dfkde)
     11 test_dict=engine_dfkde.test(datamodule=datamodule, model=model_dfkde)

17 frames
/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/connectors/logger_connector/result.py in _get_cache(result_metric, on_step)
    445         if cache is not None:
    446             if not isinstance(cache, Tensor):
--> 447                 raise ValueError(
    448                     f"The .compute() return of the metric logged as {result_metric.meta.name!r} must be a tensor."
    449                     f" Found {cache}"

ValueError: The .compute() return of the metric logged as 'image_fixed_AnomalyScoreDistribution' must be a tensor. Found (-0.7639049887657166, 0.18638668954372406, tensor([]), tensor([]))

This is with the same code as above. Notably, the previous error disappears once I use my customized metric.
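One possible follow-up adjustment (untested, only a sketch) would be to have the override's compute() return a tensor instead of a tuple, since the logger connector rejects non-tensor results:

    # Hedged sketch, untested: stack the image statistics into a single tensor
    # so that the isinstance(cache, Tensor) check in Lightning's logger
    # connector passes. Whether the CSV logger then handles a non-scalar
    # tensor gracefully is a separate question.
    def compute(self) -> torch.Tensor:
        anomaly_scores = torch.hstack(self.anomaly_scores)
        anomaly_scores = torch.log(anomaly_scores)
        self.image_mean = anomaly_scores.mean()
        self.image_std = anomaly_scores.std()
        return torch.stack([self.image_mean, self.image_std])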
