Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contribution to project quality by mitigating code smells. #11240

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

viniEng
Copy link

@viniEng viniEng commented Nov 28, 2024

Hello, cBioPortal Community!
We are excited to contribute to your incredible project. As part of a Software Maintenance discipline in our academic program, we were encouraged to engage with open-source projects by identifying and addressing code smells, while analyzing and improving their quality. Leveraging Understand, a static analysis tool, we performed an in-depth review of the codebase and implemented targeted improvements. These efforts resulted in notable enhancements, particularly in terms of cyclomatic complexity reduction, modularity, and maintainability.

This pull request introduces refactorings across multiple classes and methods, focusing on enhancing readability, improving cohesion, and ensuring adherence to software engineering best practices. Through this contribution, we hope to make the codebase more sustainable for the community and easier for new contributors to navigate. Below, we provide a detailed account of the changes, their motivations, and the outcomes achieved.

1. Feature Envy:
Classes: MolecularProfileRepository and related components.
Issues:

  • Strong coupling between methods and unrelated classes (MolecularProfileRepository and AlterationCountsMapper).
    Misplaced responsibilities, with methods handling molecular profile logic instead of focusing on persistence.

Solution: Extracted molecular profile-related logic into a new utility class, MolecularProfileUtil.
Encapsulated grouping and interaction with molecular profiles in dedicated methods (groupIdentifiersByProfileType).
Impact: Increased cohesion in MolecularProfileRepository.
Centralized and reusable logic in MolecularProfileUtil.

2. Large Class and Long Method:
Classes: CustomEhcacheProvider and SessionServiceController.
Issues: Classes with multiple responsibilities and overly complex methods.
Difficulties in understanding and maintaining high-cyclomatic complexity code.
Solution: Split responsibilities into dedicated components:
CustomEhcacheProvider was modularized into:

  1. CacheConfigValidator
  2. CacheConfigurationBuilderUtil
  3. CacheManagerFactory

Extracted large methods (e.g., detectCacheConfigurationErrorsAndLog) into smaller, testable utility functions (CacheValidationUtils).
SessionServiceController was split into two controllers: VirtualStudyController for virtual study operations. CustomDataController for custom session data management.
Impact:

  • Reduced cyclomatic complexity:
  • detectCacheConfigurationErrorsAndLog decreased from 18 to 4.
  • SessionServiceController is now modular and follows the Single Responsibility Principle.
  • Improved readability, testability, and extensibility.

3. Long Method:
Class: DataBiner.
Issues: High-cyclomatic complexity in the calcNumericalDataBins method.
Muddled logic due to multiple responsibilities within a single method.
Solution:

  1. Applied the method extraction refactoring pattern.
  2. Segmented logic into auxiliary methods, such as:
  3. sortAndPrepareNumericalValues
  4. filterOutliers
  5. calculateBins

Impact:

  1. Cyclomatic complexity reduced from 27 to 13.
  2. Improved modularity and readability of binning logic.

Code Quality Improvements:
The refactorings resulted in measurable improvements in the following metrics:

Cyclomatic Complexity: Reduced in multiple classes and methods, leading to easier comprehension and maintenance.

Modularity: Improved by isolating responsibilities into smaller, reusable classes and methods.

Cohesion: Increased by ensuring each class has a focused responsibility.

Testability: Smaller methods and classes can now be independently tested, improving code reliability.

Tools and Process:
Static Analysis:

  • Used Understand to identify problem areas and assess code smells.
    Refactoring Techniques: Method extraction - Class extraction.
    Responsibility reallocation to utility classes.

Validation: All refactorings were tested to ensure no change in functionality.
Existing tests were run successfully, and new unit tests were added for the extracted utilities.

Challenges Faced: Understanding the legacy codebase and identifying areas for improvement required significant time due to the large and complex architecture.
Balancing modularization while maintaining existing interfaces and avoiding disruptions to other parts of the system.
Final Notes: These refactorings align the codebase with modern software design principles, ensuring that:

Code is easier to read and maintain.
Logic is modular and reusable. The project is better prepared for future enhancements.
This contribution follows the criteria of improving code quality based on collected metrics and addressing code smells. We are open to further feedback from the maintainers and community to refine this work. Thank you for the opportunity to collaborate!

@alisman alisman requested a review from haynescd December 20, 2024 15:00
@alisman
Copy link
Contributor

alisman commented Dec 20, 2024

@haynescd can you review this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants