Skip to content

Add category module for measure units with associated traits and implementations #6535

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Jun 10, 2025

Conversation

younies
Copy link
Member

@younies younies commented May 6, 2025

Description

  • Introducing a new category.rs file that defines the MeasureUnitCategory trait and the CategorizedMeasureUnit struct. It also enumerates categories for measure units, including Length, Area, Volume, and Mass, with associated methods for each category.

  • introduces ids.rs that holds the ids of the units (which should be implemented in the next datagen versions)

Additionally, unit tests are included to verify the functionality of the new categories.

Design Doc: https://bit.ly/icu4x-number-formatter
Related Issue: #6161

…ementations

This commit introduces a new `category.rs` file that defines the `MeasureUnitCategory` trait and the `CategorizedMeasureUnit` struct. It also enumerates categories for measure units, including Length, Area, Volume, and Mass, with associated methods for each category. Additionally, unit tests are included to verify the functionality of the new categories.
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @younies, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces a new category.rs file that defines the MeasureUnitCategory trait and the CategorizedMeasureUnit struct. It also enumerates categories for measure units, including Length, Area, Volume, and Mass, with associated methods for each category. Additionally, the pull request includes unit tests to verify the functionality of the new categories. Finally, the pull request also modifies components/experimental/src/measure/mod.rs to include the new category module.

Highlights

  • New Module: Introduces a new category module for handling measure unit categories.
  • Traits and Structs: Defines the MeasureUnitCategory trait and the CategorizedMeasureUnit struct for associating measure units with specific categories.
  • Category Enumeration: Enumerates common measure unit categories like Length, Area, Volume, and Mass.
  • Unit Tests: Includes unit tests to verify the functionality of the new categories.

Changelog

  • components/experimental/src/measure/category.rs
    • Created a new file category.rs.
    • Defined the MeasureUnitCategory trait.
    • Defined the CategorizedMeasureUnit struct.
    • Enumerated Categories (Length, Area, Volume, Mass).
    • Implemented MeasureUnitCategory for each category.
    • Added methods to each category to return a static MeasureUnit (currently using todo!()).
    • Added unit tests for each category.
  • components/experimental/src/measure/mod.rs
    • Added pub mod category; to include the new module.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A unit's domain,
In categories we find,
Length, mass, and more.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR introduces a new category.rs file that defines the MeasureUnitCategory trait and the CategorizedMeasureUnit struct, along with categories for measure units. The addition of unit tests is also a good practice. Overall, the changes seem well-structured, but there are a few areas that could be improved.

Summary of Findings

  • Incomplete Implementations: The todo!() macro is used in several methods, indicating that the implementations are not yet complete. These need to be addressed before the code can be considered functional.
  • Category Enum Usage: The Categories enum is used in a way that might not be idiomatic Rust. Consider using a more Rust-like approach for defining and implementing the categories.
  • MeasureUnitCategory Trait: The MeasureUnitCategory trait is implemented on the Categories enum variants, which might not be the most intuitive design. Consider if the trait should be implemented on a separate struct or enum.

Merge Readiness

The code introduces a new module for measure unit categories, which is a valuable addition. However, the presence of todo!() macros and potential design improvements suggest that this PR is not yet ready for merging. Addressing the incomplete implementations and considering the suggested design changes would significantly improve the quality and functionality of the code. I am unable to approve this pull request, and recommend that others review and approve this code before merging. At a minimum, the todo!() implementations should be addressed before merging.

… Area, Volume, and Mass

This commit introduces separate structs for each measure unit category, implementing the `MeasureUnitCategory` trait for each. The previous enum-based approach has been replaced with a more structured design, enhancing clarity and maintainability. Unit tests have been updated accordingly to reflect these changes.
}

/// The categories of [`MeasureUnit`]s.
pub enum Categories {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure we need this type yet, and it should be called Category probably.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

/// A [`MeasureUnit`] that is related to the length category.
pub struct LengthMeasureUnit;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we should have a categories module with individual types in them so they don't need longer names

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

younies added 6 commits May 13, 2025 14:22
…sentation

This commit modifies the `CategorizedMeasureUnit` struct to utilize `PhantomData` for the category type, removing the need for the `MeasureUnitCategory` trait and the `Categories` enum. The category-related structs for Length, Area, Volume, and Mass have been moved into a dedicated `categories` module, enhancing type safety and organization. Unit tests have been updated to reflect these structural changes.
This commit introduces a new module `ids.rs` that defines a constant array of reserved unit identifiers with fixed IDs, ensuring stability for unit creation and parsing. Additionally, the unit ID mappings in the JSON files have been updated to reflect the new reserved IDs, including changes to the trie and conversion information structures. The `SourceDataProvider` has been modified to incorporate these reserved unit IDs into the conversion and trie processing logic, enhancing the overall consistency of unit handling across the library.
impl category::Length {
pub fn meter() -> &'static MeasureUnit {
// static METER: MeasureUnit = MeasureUnit {
// single_units: vec![SingleUnit {
Copy link
Member Author

@younies younies May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sffc , short_vec can be used.

@robertbastian , we can use meter then call square for more efficiency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test cases, so, I do not think we need to add overhead of calling series of function of power() and si_prefix().

@younies younies marked this pull request as ready for review June 2, 2025 20:33
@younies younies requested review from sffc, robertbastian and a team as code owners June 2, 2025 20:33
@younies younies requested a review from Manishearth June 2, 2025 20:34
pub mod mass;
pub mod volume;

pub trait MeasureUnitCategory {}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sffc , I do not know why do we need these right now?

Copy link
Member Author

@younies younies Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what we need : 3ad67fc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not strictly required right now, but if you want to add any functionality to CategorizedMeasureUnit you'll need this

pub struct CategorizedMeasureUnit<T: MeasureUnitCategory> {
_category: PhantomData<T>,
pub unit: MeasureUnit,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sffc , ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@younies younies dismissed gemini-code-assist[bot]’s stale review June 3, 2025 14:50

I already addressed it

pub mod mass;
pub mod volume;

pub trait MeasureUnitCategory {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not strictly required right now, but if you want to add any functionality to CategorizedMeasureUnit you'll need this

/// A [`MeasureUnit`] that is related to a specific category.
///
/// This is useful for type inference and for ensuring that the correct units are used.
pub struct CategorizedMeasureUnit<T: MeasureUnitCategory> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to MeasureUnit and (ErasedMeasureUnit) in a follow-up

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// A [`MeasureUnit`] that is related to a specific category.
///
/// This is useful for type inference and for ensuring that the correct units are used.
pub struct CategorizedMeasureUnit<T: MeasureUnitCategory> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should be some (unchecked) way of creating a CategorizedMeasureUnit from a MeasureUnit, otherwise these aren't constructible without compiled data. Something like

impl MeasureUnit {
  pub fn with_category_unchecked<T: MeasureUnitCategory>(self) -> CategorizedMeasureUnit<T> { ... }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think users should use our functions to construct them, and not build their own for now. I do not want someone to build mass unit with meter-per-second

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then it's impossible to build a CategorizedMeasureUnit without compiled data; that cannot be the goal

a unit could store its category in data, and with_category can fail if it's the wrong one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed, postpone this for now

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architecturally this looks aligned with my understanding of the design

@younies younies requested a review from robertbastian June 10, 2025 13:06
Comment on lines +24 to +27
si_prefix: SiPrefix {
power: 0,
base: Base::Decimal,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a private constructor like SiPrefix::unit() if there's no prefix

Comment on lines +42 to +46
si_prefix: SiPrefix {
power: 0,
base: Base::Decimal,
},
unit_id: *crate::provider::Baked::UNIT_IDS_V1_UND_KILOGRAM,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: if CLDR treats kilogram as a different unit than gram, can we normalize this in datagen?

Suggested change
si_prefix: SiPrefix {
power: 0,
base: Base::Decimal,
},
unit_id: *crate::provider::Baked::UNIT_IDS_V1_UND_KILOGRAM,
si_prefix: SiPrefix {
power: 3,
base: Base::Decimal,
},
unit_id: *crate::provider::Baked::UNIT_IDS_V1_UND_GRAM,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this should be the behaviour after that

@younies younies merged commit 07f2e00 into unicode-org:main Jun 10, 2025
29 checks passed
@younies younies deleted the implement-measureunit branch June 10, 2025 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants