Add new iter_counts method #23

HarrisonMc555 · 2019-10-16T02:44:57Z

The `iter` method returns an iterator that yields each key once per occurrence. The new `iter_counts` method returns an iterator that yields tuples containing the key and the number of times it appears, one tuple per distinct key.

This implements another feature referenced in #17.

Let me know what you think of the name/whatever.
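To make the difference between the two iterators concrete, here is a minimal sketch. `CountedSet` is a hypothetical stand-in for the crate's multiset, not its actual implementation; it assumes the counts live in a `HashMap<K, usize>` field named `elem_counts`, as the review snippet further down suggests.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stand-in for the multiset: a map from key to occurrence count.
struct CountedSet<K> {
    elem_counts: HashMap<K, usize>,
}

impl<K: Eq + Hash> CountedSet<K> {
    fn new() -> Self {
        CountedSet {
            elem_counts: HashMap::new(),
        }
    }

    /// Record one more occurrence of `key`.
    fn insert(&mut self, key: K) {
        *self.elem_counts.entry(key).or_insert(0) += 1;
    }

    /// Yields each key once per occurrence.
    fn iter(&self) -> impl Iterator<Item = &K> {
        self.elem_counts
            .iter()
            .flat_map(|(key, &count)| std::iter::repeat(key).take(count))
    }

    /// Yields each distinct key once, paired with its count.
    fn iter_counts(&self) -> impl Iterator<Item = (&K, usize)> {
        self.elem_counts.iter().map(|(key, &count)| (key, count))
    }
}
```

With the keys 0, 0, 1 inserted, `iter` produces three items and `iter_counts` produces two pairs, in arbitrary order in both cases.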

mashedcode · 2019-10-19T21:18:07Z

If one wants iter_counts why would they not just use HashMap? I guess I'm just interested in your use-case.

HarrisonMc555 · 2019-10-20T17:24:14Z

I ran into this crate when I was trying to keep track of the factors for a number. Conceptually, the order didn't matter and I wanted constant-time lookup for the factors. However, there also needed to be duplicates.

I could have constructed the HashMap manually, but it felt more natural to use a data structure that models that relationship directly.

However, I then needed to loop through the number of times each factor appeared, and so I needed an iter_counts method.

I'll admit that it may not be the most common occurrence, but I think that it would be nice to have. I'm not in love with the naming scheme, but it was the best I could come up with at the time and seemed to make sense to me...
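As an illustration of this use case, the sketch below factors a number into a plain `HashMap<u64, usize>` and then loops over (factor, count) pairs. The function names and the divisor-count computation are assumptions chosen for the example; the thread does not say what the counts were used for.

```rust
use std::collections::HashMap;

/// Trial-division factorization: maps each prime factor to its multiplicity.
/// Intended for n >= 1; returns an empty map for 0 and 1.
fn prime_factor_counts(mut n: u64) -> HashMap<u64, usize> {
    let mut counts = HashMap::new();
    let mut p = 2;
    // `p <= n / p` is equivalent to `p * p <= n` but cannot overflow.
    while p <= n / p {
        while n % p == 0 {
            *counts.entry(p).or_insert(0) += 1;
            n /= p;
        }
        p += 1;
    }
    if n > 1 {
        // Whatever remains is a single prime factor.
        *counts.entry(n).or_insert(0) += 1;
    }
    counts
}

/// Number of divisors of n: the product of (multiplicity + 1) over the
/// distinct prime factors. Only one pass over the distinct keys is needed.
fn divisor_count(n: u64) -> usize {
    let factors = prime_factor_counts(n);
    let mut total = 1;
    for (_prime, &count) in &factors {
        total *= count + 1;
    }
    total
}
```

The loop in `divisor_count` is the kind of iteration an `iter_counts` method would serve: it needs each factor's multiplicity once, not each factor repeated.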

mashedcode · 2019-10-20T23:46:24Z

The Python multiset also implements an equivalent to iter_counts, which it calls items. Naming it with an iter_ prefix is probably a good idea since we're in Rust. On the other hand, the C++ std::multiset has no such thing, only a count method that returns the number of occurrences of a given element.

mashedcode · 2019-10-21T00:23:38Z

src/multiset.rs

+    /// keys_with_counts.sort(); // Order is arbitrary
+    /// assert_eq!(keys_with_counts, vec![(&0, 2), (&1, 1)]);
+    /// ```
+    pub fn iter_counts(&self) -> IterCounts<K> {


This can probably be done with less code. Something like this should work:

pub fn iter_counts(&self) -> impl Iterator<Item = (&K, usize)> { self.elem_counts.iter().map(|(key, &counts)| (key, counts)) }

It's true. The general practice in the standard library has been to return a concrete struct. However, returning impl Trait is a relatively recent language feature, so that may come to be considered the new best practice. I'd be willing to go that way. The concrete struct may still be worth considering for the sake of consistency, since we do need custom structs for several of the other methods (e.g. UnionCounts).
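For comparison, the two styles discussed above can be sketched side by side. `CountedSet`, `from_keys`, and the two method names are hypothetical; `IterCounts` borrows the name from the diff but is not the PR's actual implementation.

```rust
use std::collections::hash_map;
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stand-in for the multiset.
struct CountedSet<K> {
    elem_counts: HashMap<K, usize>,
}

/// Concrete iterator type, in the style the standard library uses.
struct IterCounts<'a, K> {
    inner: hash_map::Iter<'a, K, usize>,
}

impl<'a, K> Iterator for IterCounts<'a, K> {
    type Item = (&'a K, usize);

    fn next(&mut self) -> Option<Self::Item> {
        // Copy the count out of the map so callers get a plain usize.
        self.inner.next().map(|(key, &count)| (key, count))
    }
}

impl<K: Eq + Hash> CountedSet<K> {
    /// Illustrative constructor: counts every key in `keys`.
    fn from_keys<I: IntoIterator<Item = K>>(keys: I) -> Self {
        let mut elem_counts = HashMap::new();
        for key in keys {
            *elem_counts.entry(key).or_insert(0) += 1;
        }
        CountedSet { elem_counts }
    }

    /// Style 1: return a named struct that implements Iterator.
    fn iter_counts_struct(&self) -> IterCounts<'_, K> {
        IterCounts {
            inner: self.elem_counts.iter(),
        }
    }

    /// Style 2: return an opaque `impl Iterator`.
    fn iter_counts_impl(&self) -> impl Iterator<Item = (&K, usize)> {
        self.elem_counts.iter().map(|(key, &count)| (key, count))
    }
}
```

Both methods yield the same pairs. The named struct can appear in other type signatures and struct fields, while the `impl Iterator` version needs less code but hides the concrete type from callers.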

HarrisonMc555 · 2019-10-21T07:52:20Z

I think there should be some way of iterating through the multiset that is O(keys) instead of O(keys×counts). That could be an iter_counts, as proposed here, or an iter_values that only yields the values.

I forgot, we already have a distinct_elements that returns just the values/elements. I guess that gets us most of what we need.

The benefit of iter_counts over distinct_elements is that, with only distinct_elements, you would have to write something like multiset.distinct_elements().map(|elem| (elem, multiset.get(elem).unwrap())) to recreate iter_counts.

Never mind. With the HashMultiset, it would just be multiset.distinct_elements().map(|elem| (elem, multiset.count_of(elem))). I was going to say that you could avoid calls to unwrap, but since we're returning a usize (not an Option<T>) this works a lot better.

Well, considering the solution is fairly simple, I would be ok with not having this. It would be consistent with the union_counts and intersection_counts implementations if those get accepted from #22, but either way it's fine if this doesn't make it in.

The equivalent approach for anyone who Googles this in the future:

multiset.distinct_elements().map(|elem| (elem, multiset.count_of(elem)))
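A runnable sketch of that one-liner follows. `CountedSet` is again a hypothetical stand-in that only mirrors the `distinct_elements` and `count_of` method names mentioned in this thread; the real crate's signatures may differ.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stand-in for the multiset.
struct CountedSet<K> {
    elem_counts: HashMap<K, usize>,
}

impl<K: Eq + Hash> CountedSet<K> {
    /// Illustrative constructor: counts every key in `keys`.
    fn from_keys<I: IntoIterator<Item = K>>(keys: I) -> Self {
        let mut elem_counts = HashMap::new();
        for key in keys {
            *elem_counts.entry(key).or_insert(0) += 1;
        }
        CountedSet { elem_counts }
    }

    /// Each distinct key exactly once, in arbitrary order.
    fn distinct_elements(&self) -> impl Iterator<Item = &K> {
        self.elem_counts.keys()
    }

    /// Number of occurrences of `key`; 0 if it is absent, so no Option.
    fn count_of(&self, key: &K) -> usize {
        self.elem_counts.get(key).copied().unwrap_or(0)
    }
}

/// Recreates (key, count) pairs from the two existing methods.
fn counts_via_lookup<K: Eq + Hash>(set: &CountedSet<K>) -> Vec<(&K, usize)> {
    set.distinct_elements()
        .map(|elem| (elem, set.count_of(elem)))
        .collect()
}
```

Because `count_of` returns a plain `usize`, the closure needs no `unwrap`.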

ma2bd · 2021-05-03T21:59:57Z

Hi. count_of certainly works, but I can't help noticing that it costs an additional hashmap access per element. I would personally argue that a HashMultiSet<K> is fundamentally also a HashMap<K, usize>: many operations on multisets (add, sub, union, intersection, etc.) can be seen as pointwise (possibly lossy) operations on the map values (+, checked_sub, max, min).
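The pointwise view can be sketched directly on `HashMap<K, usize>`. The `Counts` alias and the function names are hypothetical, and dropping keys whose resulting count is zero is an assumption about what "lossy" means here.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// A multiset viewed as a map from key to count (illustrative alias).
type Counts<K> = HashMap<K, usize>;

/// Sum: pointwise `+`.
fn sum<K: Eq + Hash + Clone>(a: &Counts<K>, b: &Counts<K>) -> Counts<K> {
    let mut out = a.clone();
    for (key, &count) in b {
        *out.entry(key.clone()).or_insert(0) += count;
    }
    out
}

/// Union: pointwise `max`.
fn union<K: Eq + Hash + Clone>(a: &Counts<K>, b: &Counts<K>) -> Counts<K> {
    let mut out = a.clone();
    for (key, &count) in b {
        let slot = out.entry(key.clone()).or_insert(0);
        *slot = (*slot).max(count);
    }
    out
}

/// Intersection: pointwise `min`; keys missing from either side are dropped.
fn intersection<K: Eq + Hash + Clone>(a: &Counts<K>, b: &Counts<K>) -> Counts<K> {
    a.iter()
        .filter_map(|(key, &count)| {
            let shared = count.min(*b.get(key)?);
            if shared > 0 {
                Some((key.clone(), shared))
            } else {
                None
            }
        })
        .collect()
}

/// Difference: pointwise `checked_sub`; lossy because results that would be
/// zero or negative are dropped instead of stored.
fn difference<K: Eq + Hash + Clone>(a: &Counts<K>, b: &Counts<K>) -> Counts<K> {
    a.iter()
        .filter_map(|(key, &count)| {
            let other = b.get(key).copied().unwrap_or(0);
            match count.checked_sub(other) {
                Some(rest) if rest > 0 => Some((key.clone(), rest)),
                _ => None,
            }
        })
        .collect()
}
```

Each function visits every stored entry once and reads the count from the entry it is already visiting, so no second lookup per element is needed on the side being iterated.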

Implement iter_counts

dc2d992

The `iter` method returns an iterator that returns each key the appropriate number of times. The new `iter_counts` method returns an iterator that returns tuples that contain the key and the number of times it appears.
