adding citation info + toc
hsmaan committed Mar 1, 2024
1 parent aa62487 commit 01e9ce3
Showing 1 changed file with 20 additions and 2 deletions.
22 changes: 20 additions & 2 deletions README.md
@@ -1,8 +1,16 @@
# balanced-clustering
## Assessing clustering performance in imbalanced data contexts
# balanced-clustering <!-- omit in toc -->
## Assessing clustering performance in imbalanced data contexts <!-- omit in toc -->

Class imbalance is prevalent across real-world datasets, including images, natural language, and biological data. In unsupervised learning, clustering performance is often assessed with respect to a ground-truth set of labels using metrics such as the Adjusted Rand Index (ARI). Akin to overall accuracy in classification, these clustering metrics fail to capture information about class imbalance. imbalanced-clustering presents *balanced* clustering metrics that take class imbalance into account and reweigh the results accordingly. Combined with vanilla clustering metrics (https://scikit-learn.org/stable/modules/clustering.html), imbalanced-clustering offers a more complete perspective on clustering and related tasks.
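
To make the motivation concrete, here is a minimal standard-library sketch of the problem described above. It is not this package's implementation or API. `adjusted_rand_index` is a hand-written version of the standard pair-counting ARI, and `per_class_recovery` is a hypothetical helper written only for this illustration. The toy labels are made up: a rare class `C` is absorbed entirely into the cluster of the majority class `A`.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard (unweighted) ARI via pair counting; assumes at least two samples."""
    n = len(labels_true)
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = pairs_true * pairs_pred / comb(n, 2)
    max_index = (pairs_true + pairs_pred) / 2
    if max_index == expected:  # degenerate case, e.g. a single class and a single cluster
        return 1.0
    return (pairs_joint - expected) / (max_index - expected)


def per_class_recovery(labels_true, labels_pred):
    """Hypothetical helper: fraction of each class that sits in a cluster it dominates."""
    majority = {}
    for cluster in set(labels_pred):
        members = Counter(t for t, p in zip(labels_true, labels_pred) if p == cluster)
        majority[cluster] = members.most_common(1)[0][0]
    sizes = Counter(labels_true)
    hits = Counter(t for t, p in zip(labels_true, labels_pred) if majority[p] == t)
    return {cls: hits[cls] / sizes[cls] for cls in sizes}


# Toy imbalanced data: 900 x A, 90 x B, 10 x C; class C is merged into A's cluster.
labels_true = ["A"] * 900 + ["B"] * 90 + ["C"] * 10
labels_pred = [0] * 900 + [1] * 90 + [0] * 10

ari = adjusted_rand_index(labels_true, labels_pred)
recovery = per_class_recovery(labels_true, labels_pred)
mean_recovery = sum(recovery.values()) / len(recovery)

print(f"ARI: {ari:.3f}")                      # stays high (above 0.9)
print(f"Per-class recovery: {recovery}")      # class C is not recovered at all
print(f"Class-averaged recovery: {mean_recovery:.3f}")
```

The unweighted ARI remains high even though one class is lost completely, while the class-averaged view drops noticeably. Weighting every class equally is the general idea behind balanced metrics; the exact reweighting used by this package may differ from this sketch.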

## Table of contents <!-- omit in toc -->
- [Installation via pip](#installation-via-pip)
- [Usage](#usage)
- [Detailed example](#detailed-example)
- [Notebooks](#notebooks)
- [Issues/bugs](#issuesbugs)
- [Citation information](#citation-information)

## Installation via pip

```
@@ -75,3 +83,13 @@ For more details on the implementation of the balanced clustering metrics, mathe
## Issues/bugs

If you run into any problems with installation or usage, please open an issue and include a reproducible example.

## Citation information

If you use the balanced clustering metrics in your research, please reference the following publication:

> The differential impacts of dataset imbalance in single-cell data integration
>
> Hassaan Maan, Lin Zhang, Chengxin Yu, Michael Geuenich, Kieran R. Campbell, Bo Wang
>
> bioRxiv December 19, 2022; doi: https://doi.org/10.1101/2022.10.06.511156
