Releases: CogStack/cogstack-nlp
MedCAT v2.5.3
🩺 MedCAT v2.5.3 Release Notes
This release focuses on dependency correctness, plugin system improvements, better offline behaviour, and performance and robustness fixes across MedCAT v2.
🚀 New Features & Enhancements
- Lazy Plugin Component Registration – External components can be registered lazily by plugins (previously confusingly called addons), just like core components.
This avoids unnecessary imports and enables cleaner setups when multiple optional plugins are available but not immediately needed.
- Plugin System Improvements – The plugin system has been extended to:
- Track which components are provided by which plugins (or core library)
- Record plugin metadata (author, URL, provided components)
- Display pipeline layout and required plugins in the model card (see #272 for example)
- Raise clear exceptions when loading models without required plugins (see #272 for example)
- Improved Embedding Linker Filters – Refactored filter handling in the embedding-based linker, resulting in up to ~70× speedups in scenarios where filter configuration is applied frequently.
🐛 Bug Fixes
- Core Dependency Fixes – Fixed multiple issues where required packages were used but not declared:
- Added missing core dependencies (packaging, pyyaml, requests)
- Fixed optional extras declaring incomplete dependency sets
- Added automated checks and CI workflows to ensure all modules are importable with the correct dependency configuration
- Offline v1 → v2 Model Conversion – Fixed failures when converting models without internet access:
- spaCy models are now correctly transferred during conversion
- Fallback spaCy downloads no longer block conversion when offline
- Missing Dependency Detection – Fixed logic that previously marked optional extras as available if any dependency was installed.
Extras are now only considered available if all required dependencies are present.
- Pydantic Configuration Serialisation – Fixed issues with serialising Pydantic models when extra attributes are present, and added regression tests.
- MetaCAT / Plugin Import Robustness – Fixed multiple cases where optional components caused failures due to eager imports or missing extras.
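The difference described under Missing Dependency Detection comes down to an "any" versus "all" check. The sketch below illustrates that distinction under stated assumptions: the function names are hypothetical and this is not the library's actual implementation.

```python
from importlib.util import find_spec
from typing import Iterable


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def extra_available_buggy(required: Iterable[str]) -> bool:
    # Old behaviour as described: a single installed dependency was enough.
    return any(is_installed(name) for name in required)


def extra_available(required: Iterable[str]) -> bool:
    # Fixed behaviour: every required dependency must be present.
    return all(is_installed(name) for name in required)


# "json" ships with Python; the second name is a made-up module that should not exist.
deps = ["json", "surely_not_a_real_module_xyz"]
print(extra_available_buggy(deps))  # True
print(extra_available(deps))        # False
```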
🧰 Other Improvements
- Terminology Cleanup – Clarified terminology by consistently using:
- Core components (e.g. NER, EL)
- Addon components (e.g. MetaCAT, RelCAT)
- Plugins for external codebases that provide components
This removes previous ambiguity around the term “addon”.
- Internal Refactoring – Introduced a cleaner abstraction layer for entity-providing components (NER, linkers, DeID), improving internal consistency and test coverage.
- CI & Stability Improvements – Improved stability workflows with better logging, HF cache warmups, and clearer diagnostics.
- Documentation & Tutorials – Updated tutorials, migration guides, and examples to reflect lazy registration, plugin usage, and current APIs.
This release significantly improves reliability, offline usability, and extensibility, particularly for users relying on optional components, plugins, or automated deployment pipelines.
What's Changed
- CU-869bag9m3: Fix embedding linker config component name. by @mart-r in #246
- feat(medcat): CU-869b9h7y6 Add faster linker by @mart-r in #243
- bug(medcat): CU-869banbwt Fix issues with installation from source by providing package directory by @mart-r in #247
- feat(medcat): CU-869b9n4mq Allow faster spacy tokenization by @mart-r in #244
- bug(medcat): CU-869bbj5u4 Fix local installs again by @mart-r in #248
- bug/medcat: CU-869bbj5u4 Fix core dependencies by @mart-r in #251
- bug(medcat): CU-869bdqfg4 Fix stability workflow by @mart-r in #255
- feat(medcat): CU-869bfagqw Add entry point addons and allow lazy component registration by @mart-r in #259
- feat(medcat/medcat-tutorial):CU-869bfagqw: Fix a few typos and add lazy registration to tutorials by @mart-r in #262
- bug(medcat): CU-869bfagqw Rename addons to plugins when referring to external code by @mart-r in #263
- bug(medcat): CU-869bj8g9k Fix hardcoded requirement for spacy model download by @mart-r in #273
- Bug (medcat): CU-869bj3d4g Missing dependency finder doesn't work properly by @mart-r in #271
- feat(medcat): CU-869bhknfm Refactor setting of filters for embedding linker by @mart-r in #268
- feat(medcat): CU-869bhm1zy Improve plugins by @mart-r in #272
- Allow release workflow for case-sensitive release tag by @mart-r in #281
- Fix MedCAT v2 release workflow by @mart-r in #283
- Make all release workflow stuff happen on the lower case tag by @mart-r in #284
Full Changelog: medcat/v2.4.0...medcat/v2.5.3
MedCAT-den v0.4.0
This is a rather major change for MedCAT-den. It adds support for multiple den back ends (#276).
Notably, there are still things to add on top of the existing changes (such as moving models between back ends and sync), but these will come later.
What's Changed
- build(medcat-den): CU-869awy4ux Set pretend version before installation of package. by @mart-r in #192
- bug(medcat den): CU-869bc4adt Improve local cache by @mart-r in #252
- build: bump the actions-deps group with 3 updates by @dependabot[bot] in #261
- feat(medcat-den):CU-869bnqjuw Support multiple back ends by @mart-r in #276
Full Changelog: medcat-den/v0.3.0...medcat-den/v0.4.0
medcat-trainer/v3.3.0
What's Changed
- Remote model service option, and frontend fixes for overlapping, multiple annos per spans by @tomolopolis in #264
- fix(medcat-trainer): Decouple app image pushes from client package by @tomolopolis in #267
Full Changelog: medcat-trainer/v3.2.1...medcat-trainer/v3.3.0
medcat-trainer/v3.2.1 - patch to fix client build.
What's Changed
- chore: medcat-trainer: update client version for new release. by @tomolopolis in #249
Full Changelog: medcat-trainer/v3.2.0...medcat-trainer/v3.2.1
medcat-trainer/v3.2.0 - overlapping annos feature and cleanups
What's Changed
- CU-869b5ncen: fix bugs, beginnings of a test for api backend by @tomolopolis in #224
- Trainer remove medcat utils by @tomolopolis in #225
- Allow overlapping annos in Trainer by @tomolopolis in #245
Full Changelog: medcat-trainer/v3.1.0...medcat-trainer/v3.2.0
medcat/v2.4.0
🩺 MedCAT v2.4.0 Release Notes
This release focuses on internal changes and a small bug fix.
🚀 New Feature
- No new features in this release
🐛 Bug Fixes
- Serialisation of Pydantic models – Fixed an issue where serialising Pydantic models wouldn't save extra attributes even if they were set. (#242)
🧰 Other Improvements
- Refactor component internals – Refactor the internals of components to make it easier to implement new NER and linker components. (#219)
What's Changed
- refactor(medcat): CU-869b44wz8 Better internal components by @mart-r in #219
- bug(medcat): CU-869b9dx49 Fix pydantic model serialisation by @mart-r in #242
Full Changelog: medcat/v2.3.0...medcat/v2.4.0
medcat/v2.3.0
🩺 MedCAT v2.3.0 Release Notes
This release focuses on improving robustness, fixing key import and model loading issues, and introducing a new CLI for downloading scripts compatible with MedCAT v2.
🚀 New Feature
- MedCAT Scripts Download CLI – Added a dedicated medcat-scripts download command that ensures compatibility between script versions and MedCAT v2. These scripts (previously in working_with_cogstack) support model fine-tuning, evaluation, and related workflows. The new CLI automatically fetches the correct script version for your MedCAT installation. (#206, #210)
🐛 Bug Fixes
- Tokenizer Loading from Disk – Fixed an issue where models failed to load tokenizers correctly from disk. Previously, fallback to locally available spaCy models sometimes masked this problem or caused errors during model load. (#213)
- Legacy Conversion Imports – Made imports in the legacy model converter fully dynamic, allowing NER-only models to convert successfully even when optional extras like DeID, MetaCAT, or RelCAT are not installed. (#198, #205)
- Model Card Generation – Fixed an import error that occurred when generating model cards without MetaCAT installed. The process now skips MetaCAT-specific sections gracefully if the extra isn’t relevant. (#217)
- Embedding Linker Extras – Added missing optional extra for the embedding linker to ensure dependency correctness. (#209)
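The dynamic-import approach mentioned under Legacy Conversion Imports can be sketched roughly as below. The mapping, the module names, and `load_converters` are hypothetical and chosen for illustration; standard-library modules stand in for the real converter modules so the example is runnable.

```python
import importlib
from types import ModuleType
from typing import Dict, Iterable

# Hypothetical mapping from component kind to the module implementing its converter.
# "json" stands in for an installed module; the second name is deliberately missing.
CONVERTER_MODULES: Dict[str, str] = {
    "ner": "json",
    "meta_cat": "not_installed_meta_cat_module",
}


def load_converters(kinds: Iterable[str]) -> Dict[str, ModuleType]:
    """Import converter modules only for the component kinds a model contains."""
    loaded: Dict[str, ModuleType] = {}
    for kind in kinds:
        module_name = CONVERTER_MODULES[kind]
        try:
            loaded[kind] = importlib.import_module(module_name)
        except ImportError as err:
            raise ImportError(
                f"Converting a model with {kind!r} requires the optional "
                f"module {module_name!r}, which is not installed"
            ) from err
    return loaded


# An NER-only model never touches the missing optional module.
print(sorted(load_converters(["ner"])))  # ['ner']
```

Because the import happens inside the function, a missing optional module only causes an error for models that actually need it.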
🧰 Other Improvements
- Elasticsearch Utilities – Moved Elasticsearch-related code (formerly cogstack.py in working_with_cogstack) into a separate package, cogstack-es. It’s now available as three optional extras: es8, es9, and os (for OpenSearch). (#123)
- Install Target Updates – Updated install targets (in docs and derivatives) for improved consistency and clarity. (#185)
- Dependency Cleanup – Removed duplicate lines for the transformers dependency in pyproject.toml. (#204)
- Release Script Improvements – Enhanced patch release scripts for greater flexibility. (#182)
- Documentation – Updated migration guide for clarity and accuracy. (#214)
What's Changed
- build(medcat related): CU-869awn3fm Update install targets by @mart-r in #185
- feat(medcat): CU-869azdc7x: Dynamic imports for legacy conversion by @mart-r in #198
- CU-869aa22g2 Add ElasticSearch bits from working_with_cogstack by @mart-r in #123
- fix(medcat): CU-869azdc7x: Dynamic imports for legacy conversion (#198) by @alhendrickson in #205
- CU-869azvxkn: Remove duplicate lines for transformers dependency. by @mart-r in #204
- feat(medcat): CU-869azeyvz Add scripts download CLI by @mart-r in #206
- bug(medcat): CU-869b07hr0 Add optional extra for embedding linker by @mart-r in #209
- feat(medcat): CU-869b09dk4 Update scripts download by @mart-r in #210
- docs(medcat): CU-869b2fcrc Update migration guide by @mart-r in #214
- bug(medcat): CU-869b36xv7 Avoid meta cat issue when getting model card by @mart-r in #217
- bug(medcat): CU-869b2hpam Fix issue loading tokenizers off disk by @mart-r in #213
- CU-869awf45h: Update patch release script for more flexibility. by @mart-r in #182
Full Changelog: medcat/v2.2.0...medcat/v2.3.0
medcat/v2.2.0
This minor release brings several bug fixes, new features, and maintenance improvements across MedCAT v2.
🚀 New Features
- Ontology Mapping Enhancements – Added support for mapping to additional ontologies, enabling better interoperability with external systems. (#147, #160)
- Parallel Entity Saving – Introduced a new method to save entities when multiprocessing. (#144)
- Nested Entities Support – Improved handling and display of nested entities in annotations. (#159)
- PyPI Callback – Added automatic version checks and callback functionality. (#166)
- Embedding Linker – Added a new linker using MLM-based embeddings for more flexible linking. (#65)
- Bulk CUI Removal – Added a convenient method to remove multiple CUIs at once from the CDB. (#175)
- CDB Utilities – Added new utilities for CDB merging and navigation using parent-to-child (pt2ch) relations. (#176)
🐛 Bug Fixes
- Tokenizer Dash Handling – Fixed an issue in the regex tokenizer that included dashes along with words. (#138)
- MetaCAT Behaviour – Ported MetaCAT fixes from v1 to address incorrect context window handling and related issues. (#162; see #148 and #155)
- CUI Original Name Resolution – Added CUIInfo['original_names'] in converted legacy models. (#177)
🧰 Other Improvements
- Type Hints & Refactoring – Added missing type hints to various utility methods. (#146)
- Configuration Cleanup – Updated to the new Pydantic config format to remove warnings. (#169)
- Python Version Support – Dropped Python 3.9 support and added support for Python 3.13. (#172, #167)
- Testing and CI – Hotfix for component tests and added a nightly workflow to check library stability. (#125, #171)
- Documentation – Fixed links to demo and example models in documentation. (#164, #180)
What's Changed
- Medcat v2 components test hotfix by @mart-r in #125
- Bug(medcat): CU-869ag0tqj Fix regex tokenizer dashes by @mart-r in #138
- CU-869ahw0mw: Add argument to control data flow when saving results. by @mart-r in #144
- Refactor(medcat):CU-869ak0v7n Add type hints to util methods by @mart-r in #146
- feat(medcat): CU-869aknekf Add mapping to ontologies by @mart-r in #147
- fix for config.general.show_nested_entities by @pisong314 in #159
- feat(medcat): CU-869apb8ju Better ontology mapping by @mart-r in #160
- Bug(medcat)CU-869aprnhg: Port meta cat fixes from v1 by @mart-r in #162
- docs(medcat): CU-869ar9dcf Fix demo and example model links by @mart-r in #164
- feat(medcat): CU-869ary4dq Add PyPI callback by @mart-r in #166
- Bug(medcat): Move to new config for pydantic by @mart-r in #169
- CU-869aupp8v: Remove python 3.9 support by @mart-r in #172
- build(medcat): CU-869atpd59 Add python 3.13 support to MedCAT by @mart-r in #167
- Embedding Linker using MLM based embeddings by @adam-sutton-1992 in #65
- feat (MedCAT): CU-869auz1ck Add a bulk CUI removal method for the CDB by @mart-r in #175
- medact(feat): cdb utils for merging and navigation using pt2ch relations by @tomolopolis in #176
- bug(medcat): CU-869avau57 Fix cui to original names by @mart-r in #177
- build(medcat): CU-869aujr7h Add nightly workflow to check library stability by @mart-r in #171
- docs(medcat): CU-869avu9pv Fix docs by @mart-r in #180
New Contributors
- @pisong314 made their first contribution in #159
- @adam-sutton-1992 made their first contribution in #65
Full Changelog: medcat/v2.1.0...medcat/v2.2.0rc1
medcatrainer/v3.1.0 - medcat upgrade
New header, small fixes, and MedCAT v2 upgrade.
What's Changed
- New header #179
- CU-869awj4r1: medcat-trainer (chore): update dep by @tomolopolis in #183
- CU-869awpaf3: (chore): medcat-trainer: remove cdb_utils, fix vestigia… by @tomolopolis in #188
- medcattrainer (chore): client update release, compose cfgs by @tomolopolis in #190
Full Changelog: medcat-trainer/v3.0.0...medcat-trainer/v3.1.0
MedCAT den v0.3.0
Minor release.
The most notable difference is the addition of a remote API to the den.
What's Changed
- docs (medcat-den): Fix homepage and repo links in pyproject.toml by @mart-r in #168
- build(medcat-den): CU-869auqkgc Fix duplicates in TestPyPI publish by @mart-r in #174
- feat(medcat-den): CU-869an5f00 Add remote api by @mart-r in #163
- build: bump the actions-deps group with 2 updates by @dependabot[bot] in #170
- build(medcat-den): Fix duplicate push versions to TestPyPI by @mart-r in #178
Full Changelog: medcat-den/v0.2.1...medcat-den/v0.3.0