Hash Map Lesson: Make use of terms more consistent
softy-dev committed Oct 26, 2024
1 parent 0124936 commit a12d73a
Showing 1 changed file with 6 additions and 6 deletions: javascript/computer_science/hash_map_data_structure.md
@@ -86,7 +86,7 @@ You might be thinking, wouldn't it just be better to save the whole name as a ha

### Buckets

- Buckets are storage that we need to store our elements. Simply, it's an array. For a specific key, we decide which bucket to use for storage through our hash function. The hash function returns a number that serves as the index of the array at which we store this specific key value pair. Let's say we wanted to store a person's full name as a key "Fred" with a value of "Smith":
+ Buckets are storage that we need to store our elements. We can consider each index of an array to have a bucket. For a specific key, we decide which bucket to use for storage through our hash function. The hash function returns a number that serves as the index of the array at which we store this specific key-value pair. Let's say we wanted to store a person's full name as a key "Fred" with a value of "Smith":

1. Pass "Fred" into the hash function to get the hash code which is `385`.
1. Find the bucket at index `385`.
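To make this concrete, here is a minimal sketch of what such a hash function might look like. The function name, the prime `31`, and the modulo step that keeps the index within the array's bounds are illustrative assumptions, not this lesson's exact implementation:

```javascript
// An illustrative hash function: converts a string key into a bucket index.
function hash(key, capacity) {
  let hashCode = 0;
  const prime = 31; // a small prime number helps spread keys across buckets
  for (let i = 0; i < key.length; i++) {
    // Apply the modulo at each step so hashCode stays a safe, small integer.
    hashCode = (prime * hashCode + key.charCodeAt(i)) % capacity;
  }
  return hashCode; // always in the range 0 to capacity - 1
}

hash("Fred", 16); // → 11 with this particular function and 16 buckets
```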
@@ -98,7 +98,7 @@ This is an oversimplified explanation; we'll discuss more internal mechanics later in the lesson.

Now if we wanted to get a value using a key:

- 1. To retrieve the value, we hash the key and calculate its bucket number.
+ 1. To retrieve the value, we hash the key and calculate its bucket's index.
1. If the bucket is not empty, then we go to that bucket.
1. Now we check if the node's key is the same key that was used for retrieval.
1. If it is, then we can return the node's value. Otherwise, we return `null`.
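Under the same one-node-per-bucket simplification, a retrieval sketch might look like the following. The `buckets` array and the reuse of the `hash` function sketched above are assumptions for illustration:

```javascript
const buckets = new Array(16).fill(null); // null marks an empty bucket

// An illustrative get: returns the stored value, or null if the key is absent.
function get(key) {
  const index = hash(key, buckets.length); // hash the key to get the bucket's index
  const node = buckets[index];
  if (node === null) return null;          // empty bucket: the key was never stored
  if (node.key === key) return node.value; // same key: return this node's value
  return null;                             // a different key landed in this bucket
}
```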
@@ -154,7 +154,7 @@ You probably understand by this point why we must write a good hashing function

### Growth of a hash table

- Let's talk about the growth of our buckets. We don't have infinite memory, so we can't have infinite number of buckets. We need to start somewhere, but starting too big is also a waste of memory if we're only going to have a hash map with a single value in it. So to deal with this issue, we should start with a small array for our buckets. We'll use an array of size `16`.
+ Let's talk about our number of buckets. We don't have infinite memory, so we can't have an infinite number of them. We need to start somewhere, but starting too big is also a waste of memory if we're only going to have a hash map with a single value in it. So to deal with this issue, we should start with a small array for our buckets. We'll use an array of size `16`.
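As a concrete sketch (assuming, as above, that `null` marks an empty bucket):

```javascript
// Start small: an array of 16 empty buckets.
let buckets = new Array(16).fill(null);
```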

<div class="lesson-note lesson-note--tip" markdown="1">

@@ -168,17 +168,17 @@ For example, if we are to find the bucket where the value `"Manon"` will land, t

As we continue to add nodes into our buckets, collisions get more and more likely. Eventually, however, there will be more nodes than there are buckets, which guarantees a collision (check the additional resources section for an explanation of this fact if you're curious).

- Remember we don't want collisions. In a perfect world each bucket will either have 0 or 1 node only, so we grow our buckets to have more chance that our nodes will spread and not stack up in the same buckets. To grow our buckets, we create a new buckets list that is double the size of the old buckets list, then we copy all nodes over to the new buckets.
+ Remember we don't want collisions. In a perfect world, each bucket will either have 0 or 1 node only, so we grow our hash table to improve the chance that our nodes will spread out instead of stacking up in the same buckets. To grow our hash table, we create a new one that is double its size and then copy all existing nodes over to the buckets of this new table, hashing their keys again.
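A sketch of that growth step, still under the one-node-per-bucket simplification; the function name `grow` and the `hash` helper from earlier are illustrative assumptions:

```javascript
// Double the capacity and re-hash every key: a key's bucket index depends on
// the current number of buckets, so the old indexes are no longer valid.
function grow(oldBuckets) {
  const newBuckets = new Array(oldBuckets.length * 2).fill(null);
  for (const node of oldBuckets) {
    if (node === null) continue; // nothing was stored in this bucket
    const index = hash(node.key, newBuckets.length); // new capacity, new index
    newBuckets[index] = node;
  }
  return newBuckets;
}
```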

- #### When do we know that it's time to grow our buckets size?
+ #### When do we know that it's time to grow our hash table?

To deal with this, our hash map class needs to keep track of two new fields: the `capacity` and the `load factor`.

- The `capacity` is the total number of buckets we currently have.

- The `load factor` is a number that we assign to our hash map at the start. It's the factor that will determine when it is a good time to grow our buckets. Hash map implementations across various languages use a load factor between `0.75` and `1`.

- The product of these two numbers gives us a number, and we know it's time to grow when there are more entries in the hash map than that number. For example, if there are `16` buckets, and the load factor is `0.8`, then we need to grow the buckets when there are more than `16 * 0.8 = 12.8` entries - which happens on the 13th entry. Setting it too low will consume too much memory by having too many empty buckets, while setting it too high will allow our buckets to have many collisions before we grow them.
+ The product of these two numbers gives us a threshold, and we know it's time to grow when there are more entries in the hash map than that threshold. For example, if there are `16` buckets, and the load factor is `0.8`, then we need to grow the hash table when there are more than `16 * 0.8 = 12.8` entries, which happens on the 13th entry. Setting the load factor too low will consume too much memory by having too many empty buckets, while setting it too high will allow our buckets to have many collisions before we resize the hash table.
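A quick sketch of that check, with names chosen purely for illustration:

```javascript
const capacity = 16;    // total number of buckets
const loadFactor = 0.8; // grow once entries exceed capacity * loadFactor
const size = 13;        // entries stored after the latest insertion

// 13 > 16 * 0.8 = 12.8, so the 13th entry is the one that triggers growth.
const needsGrowth = size > capacity * loadFactor;
console.log(needsGrowth); // true
```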

### Computation complexity
