dirpath isn't updated when logger changes dir after first run #20092

@ScarWar

Bug description

I'm using the great library https://github.com/SkafteNicki/pl_crossvalidate for cross-validation in my project. The library overrides some of the trainer's internal behavior, including the logs directory.

The checkpoint path is resolved once, during the first fold. On later folds the resolution is short-circuited, so the path never resolves to the new fold's directory.

I suggest moving some of the initialization into the setup method.
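To make the short circuit described above concrete, here is a minimal toy sketch in plain Python. It does not use or reproduce Lightning's actual code: ToyLogger, CachingCheckpoint, ReResolvingCheckpoint, and run_folds are hypothetical names invented for illustration, and the "logs/fold_N/checkpoints" layout is an assumption. The first class caches the resolved directory on the attribute it later checks, so every fold reuses the first fold's path. The second keeps the user-supplied value separate and re-resolves on every setup call, which is one possible reading of the suggestion above.

```python
from typing import List, Optional


class ToyLogger:
    """Hypothetical stand-in for a logger whose log directory changes per fold."""

    def __init__(self, log_dir: str) -> None:
        self.log_dir = log_dir


class CachingCheckpoint:
    """Toy model of the reported behavior: resolve once, then short-circuit."""

    def __init__(self, dirpath: Optional[str] = None) -> None:
        self.dirpath = dirpath

    def setup(self, logger: ToyLogger) -> None:
        if self.dirpath is not None:
            # After the first fold this branch is always taken,
            # so the new fold's directory is never picked up.
            return
        self.dirpath = f"{logger.log_dir}/checkpoints"


class ReResolvingCheckpoint:
    """Toy model of the suggestion: keep the user's choice apart from the resolved path."""

    def __init__(self, dirpath: Optional[str] = None) -> None:
        self._user_dirpath = dirpath  # only what the user explicitly passed
        self.dirpath = dirpath

    def setup(self, logger: ToyLogger) -> None:
        if self._user_dirpath is not None:
            # An explicit dirpath is still respected on every run.
            self.dirpath = self._user_dirpath
        else:
            # Otherwise resolve again from the current logger on every run.
            self.dirpath = f"{logger.log_dir}/checkpoints"


def run_folds(callback, num_folds: int) -> List[str]:
    """Call setup once per fold with a fresh logger and record the resolved path."""
    seen = []
    for fold in range(num_folds):
        callback.setup(ToyLogger(f"logs/fold_{fold}"))
        seen.append(callback.dirpath)
    return seen


if __name__ == "__main__":
    print(run_folds(CachingCheckpoint(), 3))
    print(run_folds(ReResolvingCheckpoint(), 3))
```

In this toy, the caching variant reports the fold 0 directory for all three folds, while the re-resolving variant reports a different directory per fold. Whether the same separation is the right fix inside ModelCheckpoint is for the maintainers to judge.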

What version are you seeing the problem on?

master

How to reproduce the bug

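# Imports added for completeness; use pytorch_lightning.callbacks instead
# if that is the package your pl_crossvalidate install is built on.
from lightning.pytorch.callbacks import ModelCheckpoint
from pl_crossvalidate import KFoldTrainer

# training_args is the reporter's own config dict (not shown in the report).
trainer = KFoldTrainer(
    num_folds=training_args['folds'],
    max_epochs=training_args['epochs'],
    accelerator="gpu",
    callbacks=[
        # No dirpath is passed, so the checkpoint directory is derived at run time.
        ModelCheckpoint(
            monitor=training_args['monitor_metric'],
            save_top_k=1,
            mode=training_args['metric_mode'],
            verbose=True,
        )
    ],
)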

Error messages and logs

No response

Environment

Current environment
  • CUDA:
    - GPU:
    - NVIDIA GeForce RTX 3050 Laptop GPU
    - available: True
    - version: 11.8
  • Lightning:
    - lightning: 2.3.1
    - lightning-cloud: 0.5.37
    - lightning-utilities: 0.11.3.post0
    - pytorch-lightning: 2.0.4
    - torch: 2.0.0
    - torch-cluster: 1.6.1
    - torch-geometric: 2.3.0
    - torch-scatter: 2.1.1
    - torchaudio: 2.0.0
    - torchmetrics: 1.4.0.post0
    - torchvision: 0.15.2a0
  • Packages:
    - absl-py: 1.4.0
    - addict: 2.4.0
    - aiofiles: 22.1.0
    - aiohttp: 3.8.5
    - aiosignal: 1.3.1
    - aiosqlite: 0.18.0
    - albumentations: 1.3.1
    - alphashape: 1.3.1
    - anyio: 3.5.0
    - appdirs: 1.4.4
    - argon2-cffi: 21.3.0
    - argon2-cffi-bindings: 21.2.0
    - arrow: 1.2.3
    - asttokens: 2.2.1
    - async-timeout: 4.0.2
    - attrs: 23.1.0
    - azure-core: 1.29.4
    - azure-identity: 1.14.0
    - azure-storage-blob: 12.18.2
    - babel: 2.11.0
    - backcall: 0.2.0
    - beautifulsoup4: 4.12.2
    - binaryornot: 0.4.4
    - bleach: 4.1.0
    - blessed: 1.19.1
    - blinker: 1.6.2
    - bottleneck: 1.3.5
    - branca: 0.6.0
    - brotlipy: 0.7.0
    - build: 0.10.0
    - cachecontrol: 0.13.1
    - cached-property: 1.5.2
    - cachetools: 5.3.1
    - certifi: 2024.6.2
    - cffi: 1.16.0
    - chardet: 5.2.0
    - charset-normalizer: 3.3.0
    - chex: 0.1.83
    - cleo: 2.0.1
    - click: 8.1.3
    - click-log: 0.4.0
    - click-plugins: 1.1.1
    - cligj: 0.7.2
    - colorama: 0.4.6
    - coloredlogs: 15.0.1
    - comm: 0.1.2
    - configargparse: 1.5.3
    - contourpy: 1.0.7
    - cookiecutter: 2.4.0
    - crashtest: 0.4.1
    - croniter: 1.3.15
    - cryptography: 41.0.4
    - cycler: 0.11.0
    - cython: 0.29.37
    - daal: 2024.5.0
    - daal4py: 2024.5.0
    - dash: 2.9.1
    - dash-core-components: 2.0.0
    - dash-html-components: 2.0.0
    - dash-table: 5.0.0
    - dataclasses: 0.8
    - dateutils: 0.6.12
    - debugpy: 1.6.6
    - decorator: 5.1.1
    - deepdiff: 6.3.1
    - defusedxml: 0.7.1
    - dill: 0.3.6
    - distlib: 0.3.7
    - dm-tree: 0.1.8
    - docopt: 0.6.2
    - docutils: 0.20.1
    - dulwich: 0.21.6
    - easydict: 1.10
    - elastic-transport: 8.4.1
    - elasticsearch: 8.10.0
    - entrypoints: 0.4
    - exceptiongroup: 1.0.4
    - executing: 1.2.0
    - fastapi: 0.100.0
    - fastjsonschema: 2.16.3
    - filelock: 3.12.4
    - fiona: 1.8.22
    - flask: 2.2.3
    - flatbuffers: 23.5.26
    - flax: 0.6.1
    - folium: 0.14.0
    - fonttools: 4.39.2
    - freetype-py: 2.4.0
    - frozenlist: 1.4.0
    - fsspec: 2023.6.0
    - future: 1.0.0
    - gdal: 3.5.3
    - geomloss: 0.2.6
    - geopandas: 0.12.2
    - gmpy2: 2.1.2
    - google-auth: 2.22.0
    - google-auth-oauthlib: 1.0.0
    - gpustat: 1.0.0
    - grpcio: 1.54.2
    - h11: 0.14.0
    - h5py: 3.8.0
    - hdbscan: 0.8.37
    - html5lib: 1.1
    - humanfriendly: 10.0
    - idna: 3.4
    - imageio: 2.31.1
    - importlib-metadata: 6.8.0
    - importlib-resources: 5.12.0
    - iniconfig: 1.1.1
    - inquirer: 3.1.3
    - insightface: 0.7.3
    - installer: 0.7.0
    - ipykernel: 6.22.0
    - ipython: 8.11.0
    - ipython-genutils: 0.2.0
    - ipywidgets: 8.0.4
    - isodate: 0.6.1
    - itsdangerous: 2.1.2
    - jaraco.classes: 3.3.0
    - jax: 0.4.13
    - jaxlib: 0.4.12
    - jedi: 0.18.2
    - jeepney: 0.8.0
    - jinja2: 3.1.2
    - joblib: 1.2.0
    - json5: 0.9.6
    - jsonpatch: 1.32
    - jsonpointer: 2.1
    - jsons: 1.6.3
    - jsonschema: 4.17.3
    - jupyter-client: 8.1.0
    - jupyter-core: 5.3.0
    - jupyter-events: 0.6.3
    - jupyter-server: 2.5.0
    - jupyter-server-fileid: 0.9.0
    - jupyter-server-terminals: 0.4.4
    - jupyter-server-ydoc: 0.8.0
    - jupyter-ydoc: 0.2.4
    - jupyterlab: 3.6.3
    - jupyterlab-pygments: 0.1.2
    - jupyterlab-server: 2.22.0
    - jupyterlab-widgets: 3.0.5
    - keyring: 24.2.0
    - kivy: 2.2.1
    - kiwisolver: 1.4.4
    - kneed: 0.8.2
    - lazy-loader: 0.3
    - lightning: 2.3.1
    - lightning-cloud: 0.5.37
    - lightning-utilities: 0.11.3.post0
    - llvmlite: 0.40.1
    - lockfile: 0.12.2
    - lxml: 4.9.1
    - mamba-gator: 5.2.0
    - mapclassify: 2.5.0
    - markdown: 3.4.4
    - markdown-it-py: 2.2.0
    - markupsafe: 2.1.2
    - mat73: 0.60
    - matplotlib: 3.8.4
    - matplotlib-inline: 0.1.6
    - mdurl: 0.1.0
    - mistune: 0.8.4
    - mkl-fft: 1.3.6
    - mkl-random: 1.2.2
    - mkl-service: 2.4.0
    - ml-dtypes: 0.4.0
    - more-itertools: 10.1.0
    - mpi4py: 3.1.4
    - mpmath: 1.3.0
    - msal: 1.24.1
    - msal-extensions: 1.0.0
    - msgpack: 1.0.7
    - multidict: 6.0.4
    - munch: 2.5.0
    - munkres: 1.1.4
    - nbclassic: 0.5.5
    - nbclient: 0.5.13
    - nbconvert: 6.5.4
    - nbformat: 5.7.0
    - nest-asyncio: 1.5.6
    - networkx: 3.1
    - notebook: 6.5.4
    - notebook-shim: 0.2.2
    - numba: 0.57.1
    - numexpr: 2.8.4
    - numpy: 1.24.3
    - nvidia-ml-py: 11.495.46
    - oauthlib: 3.2.2
    - onnx: 1.14.1
    - onnxruntime-gpu: 1.16.0
    - open3d: 0.17.0
    - opencv-python-headless: 4.7.0.72
    - opt-einsum: 3.3.0
    - optax: 0.2.2
    - ordered-set: 4.1.0
    - orjson: 3.9.2
    - packaging: 23.2
    - pandas: 1.5.3
    - pandocfilters: 1.5.0
    - parso: 0.8.3
    - patsy: 0.5.3
    - pexpect: 4.8.0
    - pickleshare: 0.7.5
    - pillow: 9.4.0
    - pip: 23.2.1
    - pipreqs: 0.4.11
    - pkginfo: 1.9.6
    - pl-crossvalidate: 0.1.0
    - platformdirs: 3.11.0
    - plotly: 5.13.1
    - pluggy: 1.0.0
    - poetry: 1.6.1
    - poetry-core: 1.7.0
    - poetry-plugin-export: 1.5.0
    - pooch: 1.4.0
    - portalocker: 2.8.2
    - pot: 0.9.0
    - pretty-errors: 1.2.25
    - prettytable: 3.9.0
    - prometheus-client: 0.14.1
    - prompt-toolkit: 3.0.38
    - protobuf: 4.21.12
    - psutil: 5.9.4
    - ptyprocess: 0.7.0
    - pure-eval: 0.2.2
    - py: 1.11.0
    - pyasn1: 0.4.8
    - pyasn1-modules: 0.2.8
    - pybind11: 2.11.1
    - pycparser: 2.21
    - pydantic: 1.10.10
    - pydiffmap: 0.2.0.1
    - pyglet: 1.5.27
    - pygments: 2.14.0
    - pygsp: 0.5.1
    - pyjwt: 2.7.0
    - pynndescent: 0.5.10
    - pyopengl: 3.1.6
    - pyopenssl: 23.2.0
    - pyparsing: 3.0.9
    - pyproj: 3.5.0
    - pyproject-hooks: 1.0.0
    - pyquaternion: 0.9.9
    - pyrender: 0.1.45
    - pyrsistent: 0.19.3
    - pysocks: 1.7.1
    - pytest: 7.4.0
    - python-dateutil: 2.8.2
    - python-editor: 1.0.4
    - python-json-logger: 2.0.7
    - python-multipart: 0.0.6
    - python-slugify: 8.0.1
    - pytorch-lightning: 2.0.4
    - pytz: 2022.7.1
    - pyu2f: 0.1.5
    - pyyaml: 6.0
    - pyzmq: 25.0.2
    - qudida: 0.0.4
    - rapidfuzz: 2.15.2
    - readchar: 4.0.5.dev0
    - requests: 2.31.0
    - requests-oauthlib: 1.3.1
    - requests-toolbelt: 1.0.0
    - rfc3339-validator: 0.1.4
    - rfc3986-validator: 0.1.1
    - rich: 13.3.5
    - rsa: 4.9
    - rtree: 1.0.1
    - scienceplots: 2.1.1
    - scikit-image: 0.22.0
    - scikit-learn: 1.3.0
    - scikit-learn-intelex: 20230131.200013
    - scipy: 1.10.1
    - seaborn: 0.13.2
    - secretstorage: 3.3.3
    - send2trash: 1.8.0
    - setuptools: 68.0.0
    - shapely: 2.0.1
    - shellingham: 1.5.3
    - six: 1.16.0
    - sniffio: 1.2.0
    - soupsieve: 2.4
    - stack-data: 0.6.2
    - starlette: 0.27.0
    - starsessions: 1.3.0
    - statsmodels: 0.14.0
    - sympy: 1.11.1
    - tabulate: 0.9.0
    - tbb: 2021.13.0
    - tenacity: 8.2.2
    - tensorboard: 2.13.0
    - tensorboard-data-server: 0.7.0
    - terminado: 0.17.1
    - text-unidecode: 1.3
    - threadpoolctl: 3.1.0
    - tifffile: 2023.9.26
    - tinycss2: 1.2.1
    - tomli: 2.0.1
    - tomlkit: 0.12.1
    - toolz: 0.12.1
    - torch: 2.0.0
    - torch-cluster: 1.6.1
    - torch-geometric: 2.3.0
    - torch-scatter: 2.1.1
    - torchaudio: 2.0.0
    - torchmetrics: 1.4.0.post0
    - torchvision: 0.15.2a0
    - tornado: 6.2
    - tqdm: 4.66.4
    - traitlets: 5.9.0
    - transforms3d: 0.4.1
    - trimesh: 3.21.5
    - triton: 2.0.0
    - trove-classifiers: 2023.10.17
    - typing-extensions: 4.5.0
    - typish: 1.9.3
    - umap-learn: 0.5.3
    - urllib3: 2.0.7
    - uvicorn: 0.22.0
    - virtualenv: 20.24.5
    - visdom: 0.2.4
    - wcwidth: 0.2.6
    - webencodings: 0.5.1
    - websocket-client: 0.58.0
    - websockets: 11.0.3
    - werkzeug: 2.2.3
    - wheel: 0.38.4
    - widgetsnbextension: 4.0.5
    - xyzservices: 2023.2.0
    - y-py: 0.5.9
    - yarg: 0.1.9
    - yarl: 1.9.2
    - ypy-websocket: 0.8.2
    - zipp: 3.17.0
  • System:
    - OS: Linux
    - architecture:
    - 64bit
    - ELF
    - processor: x86_64
    - python: 3.9.16
    - release: 5.15.153.1-microsoft-standard-WSL2
    - version: #1 SMP Fri Mar 29 23:14:13 UTC 2024

More info

Expected behavior

When a new fold is created, the checkpoint path should change to the new fold's directory.

Current behavior

The checkpoint path is resolved once, during the first fold. On later folds the resolution is short-circuited, so the path never resolves to the new fold's directory.
