2 rows × 27 columns
\n", + "5 rows × 27 columns
\n", "" ], "text/plain": [ - " Name Region state summit_elev vertical_drop \\\n", - "104 Crystal Mountain Michigan Michigan 1132 375 \n", - "295 Crystal Mountain Washington Washington 7012 3100 \n", + " Name Region state summit_elev vertical_drop \\\n", + "0 Alyeska Resort Alaska Alaska 3939 2500 \n", + "1 Eaglecrest Ski Area Alaska Alaska 2600 1540 \n", + "2 Hilltop Ski Area Alaska Alaska 2090 294 \n", + "3 Arizona Snowbowl Arizona Arizona 11500 2300 \n", + "4 Sunrise Park Resort Arizona Arizona 11100 1800 \n", "\n", - " base_elev trams fastEight fastSixes fastQuads ... LongestRun_mi \\\n", - "104 757 0 0.0 0 1 ... 0.3 \n", - "295 4400 1 NaN 2 2 ... 2.5 \n", + " base_elev trams fastEight fastSixes fastQuads ... LongestRun_mi \\\n", + "0 250 1 0.0 0 2 ... 1.0 \n", + "1 1200 0 0.0 0 0 ... 2.0 \n", + "2 1796 0 0.0 0 0 ... 1.0 \n", + "3 9200 0 0.0 1 0 ... 2.0 \n", + "4 9200 0 NaN 0 1 ... 1.2 \n", "\n", - " SkiableTerrain_ac Snow Making_ac daysOpenLastYear yearsOpen \\\n", - "104 102.0 96.0 120.0 63.0 \n", - "295 2600.0 10.0 NaN 57.0 \n", + " SkiableTerrain_ac Snow Making_ac daysOpenLastYear yearsOpen \\\n", + "0 1610.0 113.0 150.0 60.0 \n", + "1 640.0 60.0 45.0 44.0 \n", + "2 30.0 30.0 150.0 36.0 \n", + "3 777.0 104.0 122.0 81.0 \n", + "4 800.0 80.0 115.0 49.0 \n", "\n", - " averageSnowfall AdultWeekday AdultWeekend projectedDaysOpen \\\n", - "104 132.0 54.0 64.0 135.0 \n", - "295 486.0 99.0 99.0 NaN \n", + " averageSnowfall AdultWeekday AdultWeekend projectedDaysOpen \\\n", + "0 669.0 65.0 85.0 150.0 \n", + "1 350.0 47.0 53.0 90.0 \n", + "2 69.0 30.0 34.0 152.0 \n", + "3 260.0 89.0 89.0 122.0 \n", + "4 250.0 74.0 78.0 104.0 \n", "\n", - " NightSkiing_ac \n", - "104 56.0 \n", - "295 NaN \n", + " NightSkiing_ac \n", + "0 550.0 \n", + "1 NaN \n", + "2 30.0 \n", + "3 NaN \n", + "4 80.0 \n", "\n", - "[2 rows x 27 columns]" + "[5 rows x 27 columns]" ] }, - "execution_count": 11, + "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ - 
"ski_data[ski_data['Name'] == 'Crystal Mountain']" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "So there are two Crystal Mountain resorts, but they are clearly two different resorts in two different states. This is a powerful signal that you have unique records on each row." + "#Code task 3#\n", + "#Call the head method on ski_data to print the first several rows of the data\n", + "ski_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "#### 2.6.3.2 Region And State" + "The output above suggests you've made a good start getting the ski resort data organized. You have plausible column headings. You can already see you have a missing value in the `fastEight` column" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "What's the relationship between region and state?" + "## 2.6 Explore The Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "You know they are the same in many cases (e.g. both the Region and the state are given as 'Michigan'). In how many cases do they differ?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Code task 10#\n", - "#Calculate the number of times Region does not equal state\n", - "(ski_data.Region ___ ski_data.state).___" + "### 2.6.1 Find Your Resort Of Interest" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "You know what a state is. What is a region? You can tabulate the distinct values along with their respective frequencies using `value_counts()`." + "Your resort of interest is called Big Mountain Resort. Check it's in the data:" ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 28, "metadata": {}, "outputs": [ { "data": { + "text/html": [ + "\n", + " | 151 | \n", + "
|---|---|
| Name | Big Mountain Resort |
| Region | Montana |
| state | Montana |
| summit_elev | 6817 |
| vertical_drop | 2353 |
| base_elev | 4464 |
| trams | 0 |
| fastEight | 0.0 |
| fastSixes | 0 |
| fastQuads | 3 |
| quad | 2 |
| triple | 6 |
| double | 0 |
| surface | 3 |
| total_chairs | 14 |
| Runs | 105.0 |
| TerrainParks | 4.0 |
| LongestRun_mi | 3.3 |
| SkiableTerrain_ac | 3000.0 |
| Snow Making_ac | 600.0 |
| daysOpenLastYear | 123.0 |
| yearsOpen | 72.0 |
| averageSnowfall | 333.0 |
| AdultWeekday | 81.0 |
| AdultWeekend | 81.0 |
| projectedDaysOpen | 123.0 |
| NightSkiing_ac | 600.0 |
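The single-resort view above (one column labelled with the row index, features listed down the side) is the shape you get by filtering on `Name` and transposing the result. The following is a minimal sketch of that idea, not the notebook's own cell: `toy` is a hypothetical three-row stand-in for `ski_data`, using values that appear in the outputs here.

```python
import pandas as pd

# Hypothetical stand-in for ski_data; values are copied from the notebook output,
# and the index labels mimic the original row numbers.
toy = pd.DataFrame(
    {
        "Name": ["Alyeska Resort", "Big Mountain Resort", "Arizona Snowbowl"],
        "state": ["Alaska", "Montana", "Arizona"],
        "summit_elev": [3939, 6817, 11500],
    },
    index=[0, 151, 3],
)

# The boolean mask keeps only the resort of interest; .T turns the single
# matching row into a single column so every feature can be read top to bottom.
big_mountain = toy[toy["Name"] == "Big Mountain Resort"].T
print(big_mountain)
```

An empty result (zero columns after the transpose) would mean the resort is not in the data under that exact name.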
330 rows × 3 columns
\n", "" ], "text/plain": [ - " state Ticket Price\n", - "0 Alaska AdultWeekday 65.0\n", - "1 Alaska AdultWeekday 47.0\n", - "2 Alaska AdultWeekday 30.0\n", - "3 Arizona AdultWeekday 89.0\n", - "4 Arizona AdultWeekday 74.0" + " Name Region state\n", + "0 Alyeska Resort Alaska Alaska\n", + "1 Eaglecrest Ski Area Alaska Alaska\n", + "2 Hilltop Ski Area Alaska Alaska\n", + "3 Arizona Snowbowl Arizona Arizona\n", + "4 Sunrise Park Resort Arizona Arizona\n", + ".. ... ... ...\n", + "325 Meadowlark Ski Lodge Wyoming Wyoming\n", + "326 Sleeping Giant Ski Resort Wyoming Wyoming\n", + "327 Snow King Resort Wyoming Wyoming\n", + "328 Snowy Range Ski & Recreation Area Wyoming Wyoming\n", + "329 White Pine Ski Area Wyoming Wyoming\n", + "\n", + "[330 rows x 3 columns]" ] }, - "execution_count": 20, + "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "ticket_prices.head()" + "#Code task 6#\n", + "#Use ski_data's `select_dtypes` method to select columns of dtype 'object'\n", + "ski_data.select_dtypes(object)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "This is now in a format we can pass to [seaborn](https://seaborn.pydata.org/)'s [boxplot](https://seaborn.pydata.org/generated/seaborn.boxplot.html) function to create boxplots of the ticket price distributions for each ticket type for each state." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Code task 16#\n", - "#Create a seaborn boxplot of the ticket price dataframe we created above,\n", - "#with 'state' on the x-axis, 'Price' as the y-value, and a hue that indicates 'Ticket'\n", - "#This will use boxplot's x, y, hue, and data arguments.\n", - "plt.subplots(figsize=(12, 8))\n", - "sns.boxplot(x=___, y=___, hue=___, data=ticket_prices)\n", - "plt.xticks(rotation='vertical')\n", - "plt.ylabel('Price ($)')\n", - "plt.xlabel('State');" + "#### 2.6.3.1 Unique Resort Names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Aside from some relatively expensive ticket prices in California, Colorado, and Utah, most prices appear to lie in a broad band from around 25 to over 100 dollars. Some States show more variability than others. Montana and South Dakota, for example, both show fairly small variability as well as matching weekend and weekday ticket prices. Nevada and Utah, on the other hand, show the most range in prices. Some States, notably North Carolina and Virginia, have weekend prices far higher than weekday prices. You could be inspired from this exploration to consider a few potential groupings of resorts, those with low spread, those with lower averages, and those that charge a premium for weekend tickets. However, you're told that you are taking all resorts to be part of the same market share, you could argue against further segment the resorts. Nevertheless, ways to consider using the State information in your modelling include:\n", - "\n", - "* disregard State completely\n", - "* retain all State information\n", - "* retain State in the form of Montana vs not Montana, as our target resort is in Montana\n", + "You saw earlier on that these three columns had no missing values. But are there any other issues with these columns? 
Sensible questions to ask here include:\n", "\n", - "You've also noted another effect above: some States show a marked difference between weekday and weekend ticket prices. It may make sense to allow a model to take into account not just State but also weekend vs weekday." + "* Is `Name` (or at least a combination of Name/Region/State) unique?\n", + "* Is `Region` always the same as `state`?" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 39, "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Name\n", + "Crystal Mountain 2\n", + "Alyeska Resort 1\n", + "Brandywine 1\n", + "Boston Mills 1\n", + "Alpine Valley 1\n", + "Name: count, dtype: int64\n" + ] + } + ], "source": [ - "Thus we currently have two main questions you want to resolve:\n", - "\n", - "* What do you do about the two types of ticket price?\n", - "* What do you do about the state information?" + "#Code task 7#\n", + "#Use pandas' Series method `value_counts` to find any duplicated resort names\n", + "print(ski_data['Name'].value_counts().head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### 2.6.4 Numeric Features" + "You have a duplicated resort name: Crystal Mountain." ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "Having decided to reserve judgement on how exactly you utilize the State, turn your attention to cleaning the numeric features." + "**Q: 1** Is this resort duplicated if you take into account Region and/or state as well?" 
] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 42, "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Alyeska Resort, Alaska 1\n", + "Snow Trails, Ohio 1\n", + "Brandywine, Ohio 1\n", + "Boston Mills, Ohio 1\n", + "Alpine Valley, Ohio 1\n", + "Name: count, dtype: int64" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "#### 2.6.4.1 Numeric data summary" + "#Code task 8#\n", + "#Concatenate the string columns 'Name' and 'Region' and count the values again (as above)\n", + "(ski_data['Name'] + ', ' + ski_data['Region']).value_counts().head()" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 43, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "Alyeska Resort, Alaska 1\n", + "Snow Trails, Ohio 1\n", + "Brandywine, Ohio 1\n", + "Boston Mills, Ohio 1\n", + "Alpine Valley, Ohio 1\n", + "Name: count, dtype: int64" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "#Code task 17#\n", - "#Call ski_data's `describe` method for a statistical summary of the numerical columns\n", - "#Hint: there are fewer summary stat columns than features, so displaying the transpose\n", - "#will be useful again\n", - "ski_data.___.___" + "#Code task 9#\n", + "#Concatenate 'Name' and 'state' and count the values again (as above)\n", + "(ski_data['Name'] + ', ' + ski_data['state']).value_counts().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Recall you're missing the ticket prices for some 16% of resorts. This is a fundamental problem that means you simply lack the required data for those resorts and will have to drop those records. But you may have a weekend price and not a weekday price, or vice versa. You want to keep any price you have." 
+ "##**NB** because you know `value_counts()` sorts descending, you can use the `head()` method and know the rest of the counts must be 1." ] }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 45, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "104 Crystal Mountain, Michigan, Michigan\n", + "295 Crystal Mountain, Washington, Washington\n", + "dtype: object\n" + ] + } + ], + "source": [ + "#I also wanted to find out what region and state Crystal Mountain were in.\n", + "print(ski_data.loc[ski_data['Name'] == 'Crystal Mountain', 'Name'] + ', '+ ski_data.loc[ski_data['Name'] == 'Crystal Mountain', 'Region'] + ', '+ ski_data.loc[ski_data['Name'] == 'Crystal Mountain', 'state'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**A: 1** Your answer here\n", + "\n", + "**Becky's Answer:** Yes they are 2 unique resorts - One in Michigan and another in Washington" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "So there are two Crystal Mountain resorts, but they are clearly two different resorts in two different states. This is a powerful signal that you have unique records on each row." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### 2.6.3.2 Region And State" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What's the relationship between region and state?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You know they are the same in many cases (e.g. both the Region and the state are given as 'Michigan'). In how many cases do they differ?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "0 82.424242\n", - "2 14.242424\n", - "1 3.333333\n", - "dtype: float64" + "33" ] }, - "execution_count": 23, + "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "missing_price = ski_data[['AdultWeekend', 'AdultWeekday']].isnull().sum(axis=1)\n", - "missing_price.value_counts()/len(missing_price) * 100" + "#Code task 10#\n", + "#Calculate the number of times Region does not equal state\n", + "(ski_data.Region != ski_data.state).sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Just over 82% of resorts have no missing ticket price, 3% are missing one value, and 14% are missing both. You will definitely want to drop the records for which you have no price information, however you will not do so just yet. There may still be useful information about the distributions of other features in that 14% of the data." + "**Becky's Answer:** There are 33 entries where the Region and state are different." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "#### 2.6.4.2 Distributions Of Feature Values" + "You know what a state is. What is a region? You can tabulate the distinct values along with their respective frequencies using `value_counts()`." + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Region\n", + "New York 33\n", + "Michigan 29\n", + "Sierra Nevada 22\n", + "Colorado 22\n", + "Pennsylvania 19\n", + "Wisconsin 16\n", + "New Hampshire 16\n", + "Vermont 15\n", + "Minnesota 14\n", + "Idaho 12\n", + "Montana 12\n", + "Massachusetts 11\n", + "Washington 10\n", + "New Mexico 9\n", + "Maine 9\n", + "Wyoming 8\n", + "Utah 7\n", + "Salt Lake City 6\n", + "North Carolina 6\n", + "Oregon 6\n", + "Connecticut 5\n", + "Ohio 5\n", + "Virginia 4\n", + "West Virginia 4\n", + "Illinois 4\n", + "Mt. 
Hood 4\n", + "Alaska 3\n", + "Iowa 3\n", + "South Dakota 2\n", + "Arizona 2\n", + "Nevada 2\n", + "Missouri 2\n", + "Indiana 2\n", + "New Jersey 2\n", + "Rhode Island 1\n", + "Tennessee 1\n", + "Maryland 1\n", + "Northern California 1\n", + "Name: count, dtype: int64" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ski_data['Region'].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Note that, although we are still in the 'data wrangling and cleaning' phase rather than exploratory data analysis, looking at distributions of features is immensely useful in getting a feel for whether the values look sensible and whether there are any obvious outliers to investigate. Some exploratory data analysis belongs here, and data wrangling will inevitably occur later on. It's more a matter of emphasis. Here, we're interesting in focusing on whether distributions look plausible or wrong. Later on, we're more interested in relationships and patterns." + "A casual inspection by eye reveals some non-state names such as Sierra Nevada, Salt Lake City, and Northern California. Tabulate the differences between Region and state. On a note regarding scaling to larger data sets, you might wonder how you could spot such cases when presented with millions of rows. This is an interesting point. Imagine you have access to a database with a Region and state column in a table and there are millions of rows. You wouldn't eyeball all the rows looking for differences! Bear in mind that our first interest lies in establishing the answer to the question \"Are they always the same?\" One approach might be to ask the database to return records where they differ, but limit the output to 10 rows. If there were differences, you'd only get up to 10 results, and so you wouldn't know whether you'd located all differences, but you'd know that there were 'a nonzero number' of differences. 
If you got an empty result set back, then you would know that the two columns always had the same value. At the risk of digressing, some values in one column only might be NULL (missing) and different databases treat NULL differently, so be aware that on many an occasion a seemingly 'simple' question gets very interesting to answer very quickly!" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 56, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "state Region \n", + "California Sierra Nevada 20\n", + " Northern California 1\n", + "Nevada Sierra Nevada 2\n", + "Oregon Mt. Hood 4\n", + "Utah Salt Lake City 6\n", + "Name: count, dtype: int64" + ] + }, + "execution_count": 56, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "#Code task 18#\n", - "#Call ski_data's `hist` method to plot histograms of each of the numeric features\n", - "#Try passing it an argument figsize=(15,10)\n", - "#Try calling plt.subplots_adjust() with an argument hspace=0.5 to adjust the spacing\n", - "#It's important you create legible and easy-to-read plots\n", - "ski_data.___(___)\n", - "#plt.subplots_adjust(hspace=___);\n", - "#Hint: notice how the terminating ';' \"swallows\" some messy output and leads to a tidier notebook" + "#Code task 11#\n", + "#Filter the ski_data dataframe for rows where 'Region' and 'state' are different,\n", + "#group that by 'state' and perform `value_counts` on the 'Region'\n", + "(ski_data[ski_data.Region != ski_data.state]\n", + " .groupby('state')\n", + " ['Region'].value_counts())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "What features do we have possible cause for concern about and why?\n", - "\n", - "* SkiableTerrain_ac because values are clustered down the low end,\n", - "* Snow Making_ac for the same reason,\n", - "* fastEight because all but one value is 0 so it has very little variance, and half the values are missing,\n", - "* fastSixes raises
an amber flag; it has more variability, but still mostly 0,\n", - "* trams also may get an amber flag for the same reason,\n", - "* yearsOpen because most values are low but it has a maximum of 2019, which strongly suggests someone recorded calendar year rather than number of years." + "The vast majority of the differences are in California, with most Regions being called Sierra Nevada and just one referred to as Northern California." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "##### 2.6.4.2.1 SkiableTerrain_ac" + "#### 2.6.3.3 Number of distinct regions and states" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 59, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "Region 38\n", + "state 35\n", + "dtype: int64" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "#Code task 19#\n", - "#Filter the 'SkiableTerrain_ac' column to print the values greater than 10000\n", - "ski_data.___[ski_data.___ > ___]" + "#Code task 12#\n", + "#Select the 'Region' and 'state' columns from ski_data and use the `nunique` method to calculate\n", + "#the number of unique values in each\n", + "ski_data[['Region', 'state']].nunique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "**Q: 2** One resort has an incredibly large skiable terrain area! Which is it?" + "Because a few states are split across multiple named regions, there are slightly more unique regions than states." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### 2.6.3.4 Distribution Of Resorts By Region And State" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If this is your first time using [matplotlib](https://matplotlib.org/3.2.2/index.html)'s [subplots](https://matplotlib.org/3.2.2/api/_as_gen/matplotlib.pyplot.subplots.html), you may find the online documentation useful." 
] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 63, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "\n", + " | AdultWeekday | \n", + "AdultWeekend | \n", + "
|---|---|---|
| state | | |
| Alaska | 47.333333 | 57.333333 |
| Arizona | 81.500000 | 83.500000 |
| California | 78.214286 | 81.416667 |
| Colorado | 90.714286 | 90.714286 |
| Connecticut | 47.800000 | 56.800000 |
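The per-state averages above can be reproduced with a `groupby`. This sketch assumes the table came from averaging `AdultWeekday` and `AdultWeekend` by `state`. The five input rows are the Alaska and Arizona resorts shown by `ski_data.head()`, and the names `head_rows` and `state_price_means` are chosen here for illustration.

```python
import pandas as pd

# First five resorts from ski_data.head(): three in Alaska, two in Arizona.
head_rows = pd.DataFrame(
    {
        "state": ["Alaska", "Alaska", "Alaska", "Arizona", "Arizona"],
        "AdultWeekday": [65.0, 47.0, 30.0, 89.0, 74.0],
        "AdultWeekend": [85.0, 53.0, 34.0, 89.0, 78.0],
    }
)

# Group by state and average each ticket type; mean() skips NaN by default,
# so a resort missing one price still contributes its other price.
state_price_means = head_rows.groupby("state")[["AdultWeekday", "AdultWeekend"]].mean()
print(state_price_means)
```

On these rows the Alaska and Arizona means match the first two rows of the table, since those are the only resorts listed for the two states.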
| | state | Ticket | Price |
|---|---|---|---|
| 0 | Alaska | AdultWeekday | 65.0 |
| 1 | Alaska | AdultWeekday | 47.0 |
| 2 | Alaska | AdultWeekday | 30.0 |
| 3 | Arizona | AdultWeekday | 89.0 |
| 4 | Arizona | AdultWeekday | 74.0 |
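The long format above, with one row per resort per ticket type, is the shape `pandas.melt` produces and the shape seaborn's `boxplot` expects for its `x`, `y`, and `hue` arguments. A minimal sketch, assuming the `Ticket` and `Price` column names from the table and the same five `head()` rows as input; `wide` is a hypothetical name:

```python
import pandas as pd

# Wide format: one row per resort, one column per ticket type.
wide = pd.DataFrame(
    {
        "state": ["Alaska", "Alaska", "Alaska", "Arizona", "Arizona"],
        "AdultWeekday": [65.0, 47.0, 30.0, 89.0, 74.0],
        "AdultWeekend": [85.0, 53.0, 34.0, 89.0, 78.0],
    }
)

# melt stacks the two price columns: the old column name goes into 'Ticket'
# and the old cell value goes into 'Price', while 'state' is repeated per row.
ticket_prices = pd.melt(
    wide,
    id_vars="state",
    value_vars=["AdultWeekday", "AdultWeekend"],
    var_name="Ticket",
    value_name="Price",
)
print(ticket_prices.head())
```

All `AdultWeekday` rows come first, followed by all `AdultWeekend` rows, which is why `head()` shows only weekday prices.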
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| summit_elev | 330.0 | 4591.818182 | 3735.535934 | 315.0 | 1403.75 | 3127.5 | 7806.00 | 13487.0 |
| vertical_drop | 330.0 | 1215.427273 | 947.864557 | 60.0 | 461.25 | 964.5 | 1800.00 | 4425.0 |
| base_elev | 330.0 | 3374.000000 | 3117.121621 | 70.0 | 869.00 | 1561.5 | 6325.25 | 10800.0 |
| trams | 330.0 | 0.172727 | 0.559946 | 0.0 | 0.00 | 0.0 | 0.00 | 4.0 |
| fastEight | 164.0 | 0.006098 | 0.078087 | 0.0 | 0.00 | 0.0 | 0.00 | 1.0 |
| fastSixes | 330.0 | 0.184848 | 0.651685 | 0.0 | 0.00 | 0.0 | 0.00 | 6.0 |
| fastQuads | 330.0 | 1.018182 | 2.198294 | 0.0 | 0.00 | 0.0 | 1.00 | 15.0 |
| quad | 330.0 | 0.933333 | 1.312245 | 0.0 | 0.00 | 0.0 | 1.00 | 8.0 |
| triple | 330.0 | 1.500000 | 1.619130 | 0.0 | 0.00 | 1.0 | 2.00 | 8.0 |
| double | 330.0 | 1.833333 | 1.815028 | 0.0 | 1.00 | 1.0 | 3.00 | 14.0 |
| surface | 330.0 | 2.621212 | 2.059636 | 0.0 | 1.00 | 2.0 | 3.00 | 15.0 |
| total_chairs | 330.0 | 8.266667 | 5.798683 | 0.0 | 5.00 | 7.0 | 10.00 | 41.0 |
| Runs | 326.0 | 48.214724 | 46.364077 | 3.0 | 19.00 | 33.0 | 60.00 | 341.0 |
| TerrainParks | 279.0 | 2.820789 | 2.008113 | 1.0 | 1.00 | 2.0 | 4.00 | 14.0 |
| LongestRun_mi | 325.0 | 1.433231 | 1.156171 | 0.0 | 0.50 | 1.0 | 2.00 | 6.0 |
| SkiableTerrain_ac | 327.0 | 739.801223 | 1816.167441 | 8.0 | 85.00 | 200.0 | 690.00 | 26819.0 |
| Snow Making_ac | 284.0 | 174.873239 | 261.336125 | 2.0 | 50.00 | 100.0 | 200.50 | 3379.0 |
| daysOpenLastYear | 279.0 | 115.103943 | 35.063251 | 3.0 | 97.00 | 114.0 | 135.00 | 305.0 |
| yearsOpen | 329.0 | 63.656535 | 109.429928 | 6.0 | 50.00 | 58.0 | 69.00 | 2019.0 |
| averageSnowfall | 316.0 | 185.316456 | 136.356842 | 18.0 | 69.00 | 150.0 | 300.00 | 669.0 |
| AdultWeekday | 276.0 | 57.916957 | 26.140126 | 15.0 | 40.00 | 50.0 | 71.00 | 179.0 |
| AdultWeekend | 279.0 | 64.166810 | 24.554584 | 17.0 | 47.00 | 60.0 | 77.50 | 179.0 |
| projectedDaysOpen | 283.0 | 120.053004 | 31.045963 | 30.0 | 100.00 | 120.0 | 139.50 | 305.0 |
| NightSkiing_ac | 187.0 | 100.395722 | 105.169620 | 2.0 | 40.00 | 72.0 | 114.00 | 650.0 |
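`describe()` returns one column per feature, so transposing it gives the one-row-per-feature layout shown above. The `count` column also works as a quick missing-value check: any feature whose count is below the number of rows has gaps. This is a sketch on a hypothetical five-row frame (`toy`), with values taken from the `head()` output:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for ski_data; fastEight has one missing value,
# as in the fifth row of the head() output.
toy = pd.DataFrame(
    {
        "fastEight": [0.0, 0.0, 0.0, 0.0, np.nan],
        "AdultWeekday": [65.0, 47.0, 30.0, 89.0, 74.0],
        "AdultWeekend": [85.0, 53.0, 34.0, 89.0, 78.0],
    }
)

# Transpose so each feature is a row and each summary statistic a column.
summary = toy.describe().T

# count excludes NaN, so rows minus count is the number of missing values.
missing = len(toy) - summary["count"]

print(summary)
print(missing[missing > 0])
```

Scanning the `min` and `max` columns the same way is how an implausible value, such as the `yearsOpen` maximum of 2019 in the table above, stands out.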
| | Name | Region | state | summit_elev | vertical_drop | base_elev | trams | fastEight | fastSixes | fastQuads | ... | LongestRun_mi | SkiableTerrain_ac | Snow Making_ac | daysOpenLastYear | yearsOpen | averageSnowfall | AdultWeekday | AdultWeekend | projectedDaysOpen | NightSkiing_ac |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Alyeska Resort | Alaska | Alaska | 3939 | 2500 | 250 | 1 | 0.0 | 0 | 2 | ... | 1.0 | 1610.0 | 113.0 | 150.0 | 60.0 | 669.0 | 65.0 | 85.0 | 150.0 | 550.0 |
| 7 | Bear Valley | Sierra Nevada | California | 8500 | 1900 | 6600 | 0 | 0.0 | 1 | 1 | ... | 1.2 | 1680.0 | 100.0 | 165.0 | 52.0 | 359.0 | NaN | NaN | 151.0 | NaN |
| 11 | Heavenly Mountain Resort | Sierra Nevada | California | 10067 | 3500 | 7170 | 2 | 0.0 | 2 | 7 | ... | 5.5 | 4800.0 | 3379.0 | 155.0 | 64.0 | 360.0 | NaN | NaN | 157.0 | NaN |
| 12 | June Mountain | Sierra Nevada | California | 10090 | 2590 | 7545 | 0 | NaN | 0 | 2 | ... | 2.0 | 1500.0 | NaN | NaN | 58.0 | 250.0 | NaN | NaN | 128.0 | NaN |
| 13 | Kirkwood | Sierra Nevada | California | 9800 | 2000 | 7800 | 0 | 0.0 | 0 | 2 | ... | 2.5 | 2300.0 | 200.0 | 200.0 | 47.0 | 354.0 | NaN | NaN | 167.0 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 299 | Stevens Pass Resort | Washington | Washington | 5845 | 1800 | 4061 | 0 | 0.0 | 0 | 3 | ... | 1.0 | 1125.0 | NaN | 116.0 | 82.0 | 460.0 | NaN | NaN | 145.0 | 450.0 |
| 300 | The Summit at Snoqualmie | Washington | Washington | 3865 | 1025 | 2840 | 0 | NaN | 0 | 3 | ... | 0.8 | 1994.0 | 5.0 | 120.0 | 82.0 | 428.0 | 85.0 | 95.0 | 140.0 | 541.0 |
| 301 | White Pass | Washington | Washington | 6550 | 2050 | 4500 | 0 | NaN | 0 | 2 | ... | 2.5 | 1402.0 | 30.0 | 148.0 | 67.0 | 400.0 | 69.0 | 69.0 | 144.0 | 90.0 |
| 322 | Grand Targhee Resort | Wyoming | Wyoming | 9920 | 2270 | 7851 | 0 | 0.0 | 0 | 2 | ... | 2.7 | 2602.0 | NaN | 152.0 | 50.0 | 500.0 | 90.0 | 90.0 | 152.0 | NaN |
| 324 | Jackson Hole | Wyoming | Wyoming | 10450 | 4139 | 6311 | 3 | 0.0 | 0 | 4 | ... | 4.5 | 2500.0 | 195.0 | 130.0 | 54.0 | 459.0 | NaN | NaN | 133.0 | NaN |
66 rows × 27 columns
 | 11 | 27 | 39 | 45 | 140 | 231 | 266 | 267
---|---|---|---|---|---|---|---|---
Name | Heavenly Mountain Resort | Aspen / Snowmass | Silverton Mountain | Vail | Big Sky Resort | Mt. Bachelor | Park City | Powder Mountain
Region | Sierra Nevada | Colorado | Colorado | Colorado | Montana | Oregon | Salt Lake City | Utah
state | California | Colorado | Colorado | Colorado | Montana | Oregon | Utah | Utah
summit_elev | 10067 | 12510 | 13487 | 11570 | 11166 | 9065 | 10000 | 9422
vertical_drop | 3500 | 4406 | 3087 | 3450 | 4350 | 3365 | 3200 | 2522
base_elev | 7170 | 8104 | 10400 | 8120 | 7500 | 5700 | 6800 | 6900
trams | 2 | 3 | 0 | 2 | 1 | 0 | 4 | 0
fastEight | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0
fastSixes | 2 | 1 | 0 | 3 | 2 | 0 | 6 | 0
fastQuads | 7 | 15 | 0 | 15 | 5 | 8 | 10 | 1
quad | 1 | 4 | 0 | 1 | 3 | 0 | 4 | 4
triple | 5 | 3 | 0 | 1 | 7 | 3 | 7 | 1
double | 3 | 5 | 1 | 0 | 5 | 0 | 4 | 0
surface | 8 | 9 | 0 | 9 | 12 | 0 | 6 | 3
total_chairs | 28 | 40 | 1 | 31 | 36 | 11 | 41 | 9
Runs | 97.0 | 336.0 | NaN | 195.0 | 317.0 | 101.0 | 341.0 | 167.0
TerrainParks | 3.0 | 10.0 | NaN | 3.0 | 8.0 | 5.0 | 8.0 | 2.0
LongestRun_mi | 5.5 | 5.3 | 1.5 | 4.0 | 6.0 | 4.0 | 3.5 | 3.5
SkiableTerrain_ac | 4800.0 | 5517.0 | 26819.0 | 5289.0 | 5800.0 | 4318.0 | 7300.0 | 8464.0
Snow Making_ac | 3379.0 | 658.0 | NaN | 461.0 | 400.0 | 20.0 | 750.0 | NaN
daysOpenLastYear | 155.0 | 138.0 | 175.0 | 149.0 | 144.0 | 185.0 | 142.0 | 120.0
yearsOpen | 64.0 | 72.0 | 17.0 | 57.0 | 46.0 | 61.0 | 56.0 | 47.0
averageSnowfall | 360.0 | 300.0 | 400.0 | 354.0 | 400.0 | 462.0 | 355.0 | 500.0
AdultWeekday | NaN | 179.0 | 79.0 | NaN | NaN | 99.0 | NaN | 88.0
AdultWeekend | NaN | 179.0 | 79.0 | NaN | NaN | 99.0 | NaN | 88.0
projectedDaysOpen | 157.0 | 138.0 | 181.0 | 142.0 | 144.0 | 185.0 | 143.0 | 146.0
NightSkiing_ac | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 300.0
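As a quick consistency check on the table above, the sketch below sums the individual lift columns for each of the eight resorts and compares the result with the reported `total_chairs`. The values are copied from the table; the dictionary layout and the helper name `total_chairs` are illustrative assumptions, not code from the notebook.

```python
# Lift counts copied from the table above, in the column order:
# trams, fastEight, fastSixes, fastQuads, quad, triple, double, surface.
lift_counts = {
    "Heavenly Mountain Resort": [2, 0, 2, 7, 1, 5, 3, 8],
    "Aspen / Snowmass": [3, 0, 1, 15, 4, 3, 5, 9],
    "Silverton Mountain": [0, 0, 0, 0, 0, 0, 1, 0],
    "Vail": [2, 0, 3, 15, 1, 1, 0, 9],
    "Big Sky Resort": [1, 1, 2, 5, 3, 7, 5, 12],
    "Mt. Bachelor": [0, 0, 0, 8, 0, 3, 0, 0],
    "Park City": [4, 0, 6, 10, 4, 7, 4, 6],
    "Powder Mountain": [0, 0, 0, 1, 4, 1, 0, 3],
}

# The total_chairs row as reported in the same table.
reported_total_chairs = {
    "Heavenly Mountain Resort": 28,
    "Aspen / Snowmass": 40,
    "Silverton Mountain": 1,
    "Vail": 31,
    "Big Sky Resort": 36,
    "Mt. Bachelor": 11,
    "Park City": 41,
    "Powder Mountain": 9,
}


def total_chairs(counts):
    """Sum the per-type lift counts for one resort (hypothetical helper)."""
    return sum(counts)


# Collect any resort whose summed lifts disagree with the reported total.
mismatches = {
    name: (total_chairs(counts), reported_total_chairs[name])
    for name, counts in lift_counts.items()
    if total_chairs(counts) != reported_total_chairs[name]
}
print(mismatches)
```

For these eight rows the sums agree with the reported totals, so `total_chairs` looks like a derived column rather than an independent measurement.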
 | Name | Region | state | summit_elev | vertical_drop | base_elev | trams | fastSixes | fastQuads | quad | ... | LongestRun_mi | SkiableTerrain_ac | Snow Making_ac | daysOpenLastYear | yearsOpen | averageSnowfall | AdultWeekday | AdultWeekend | projectedDaysOpen | NightSkiing_ac
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
34 | Howelsen Hill | Colorado | Colorado | 7136 | 440 | 6696 | 0 | 0 | 0 | 0 | ... | 6.0 | 50.0 | 25.0 | 100.0 | 104.0 | 150.0 | 25.0 | 25.0 | 100.0 | 10.0
115 | Pine Knob Ski Resort | Michigan | Michigan | 1308 | 300 | 1009 | 0 | 0 | 0 | 0 | ... | 1.0 | 80.0 | 80.0 | NaN | 2019.0 | NaN | 49.0 | 57.0 | NaN | NaN
2 rows × 26 columns
 | state | resorts_per_state | state_total_skiable_area_ac | state_total_days_open | state_total_terrain_parks | state_total_nightskiing_ac
---|---|---|---|---|---|---
0 | Alaska | 3 | 2280.0 | 345.0 | 4.0 | 580.0
1 | Arizona | 2 | 1577.0 | 237.0 | 6.0 | 80.0
2 | California | 21 | 25948.0 | 2738.0 | 81.0 | 587.0
3 | Colorado | 22 | 43682.0 | 3258.0 | 74.0 | 428.0
4 | Connecticut | 5 | 358.0 | 353.0 | 10.0 | 256.0
 | state | state_population | state_area_sq_miles
---|---|---|---
0 | Alabama | 4903185 | 52420
1 | Alaska | 731545 | 665384
2 | Arizona | 7278717 | 113990
3 | Arkansas | 3017804 | 53179
4 | California | 39512223 | 163695
 | state | resorts_per_state | state_total_skiable_area_ac | state_total_days_open | state_total_terrain_parks | state_total_nightskiing_ac | state_population | state_area_sq_miles
---|---|---|---|---|---|---|---|---
0 | Alaska | 3 | 2280.0 | 345.0 | 4.0 | 580.0 | 731545 | 665384
1 | Arizona | 2 | 1577.0 | 237.0 | 6.0 | 80.0 | 7278717 | 113990
2 | California | 21 | 25948.0 | 2738.0 | 81.0 | 587.0 | 39512223 | 163695
3 | Colorado | 22 | 43682.0 | 3258.0 | 74.0 | 428.0 | 5758736 | 104094
4 | Connecticut | 5 | 358.0 | 353.0 | 10.0 | 256.0 | 3565278 | 5543
 | AdultWeekend | AdultWeekday
---|---|---
141 | 42.0 | 42.0
142 | 63.0 | 63.0
143 | 49.0 | 49.0
144 | 48.0 | 48.0
145 | 46.0 | 46.0
146 | 39.0 | 39.0
147 | 50.0 | 50.0
148 | 67.0 | 67.0
149 | 47.0 | 47.0
150 | 39.0 | 39.0
151 | 81.0 | 81.0
state | resorts_per_state | state_total_skiable_area_ac | state_total_days_open | state_total_terrain_parks | state_total_nightskiing_ac | resorts_per_100kcapita | resorts_per_100ksq_mile
---|---|---|---|---|---|---|---
Alaska | 3 | 2280.0 | 345.0 | 4.0 | 580.0 | 0.410091 | 0.450867
Arizona | 2 | 1577.0 | 237.0 | 6.0 | 80.0 | 0.027477 | 1.754540
California | 21 | 25948.0 | 2738.0 | 81.0 | 587.0 | 0.053148 | 12.828736
Colorado | 22 | 43682.0 | 3258.0 | 74.0 | 428.0 | 0.382028 | 21.134744
Connecticut | 5 | 358.0 | 353.0 | 10.0 | 256.0 | 0.140242 | 90.203861
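The two density columns in the table above can be reproduced from the resort counts, populations, and areas shown in the earlier state tables. The sketch below is a plain-Python illustration of that arithmetic; the tuple layout and the helper name `per_100k` are assumptions made for this example, and the notebook itself presumably computes the columns with pandas.

```python
# (state, resorts_per_state, state_population, state_area_sq_miles),
# copied from the merged state summary table.
state_rows = [
    ("Alaska", 3, 731545, 665384),
    ("Arizona", 2, 7278717, 113990),
    ("California", 21, 39512223, 163695),
    ("Colorado", 22, 5758736, 104094),
    ("Connecticut", 5, 3565278, 5543),
]


def per_100k(count, denominator):
    """Scale a count to 'per 100,000 units' of the denominator (hypothetical helper)."""
    return 100_000 * count / denominator


# Build the two ratio features for each state.
density = {
    state: {
        "resorts_per_100kcapita": per_100k(resorts, population),
        "resorts_per_100ksq_mile": per_100k(resorts, area),
    }
    for state, resorts, population, area in state_rows
}

for state, ratios in density.items():
    print(state, ratios)
```

Rounded to six decimal places, these values agree with the `resorts_per_100kcapita` and `resorts_per_100ksq_mile` entries shown for the five states.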
 | resorts_per_state | state_total_skiable_area_ac | state_total_days_open | state_total_terrain_parks | state_total_nightskiing_ac | resorts_per_100kcapita | resorts_per_100ksq_mile
---|---|---|---|---|---|---|---
0 | -0.806912 | -0.392012 | -0.689059 | -0.816118 | 0.069410 | 0.139593 | -0.689999
1 | -0.933558 | -0.462424 | -0.819038 | -0.726994 | -0.701326 | -0.644706 | -0.658125
2 | 1.472706 | 1.978574 | 2.190933 | 2.615141 | 0.080201 | -0.592085 | -0.387368
3 | 1.599351 | 3.754811 | 2.816757 | 2.303209 | -0.164893 | 0.082069 | -0.184291
4 | -0.553622 | -0.584519 | -0.679431 | -0.548747 | -0.430027 | -0.413557 | 1.504408
state | PC1 | PC2
---|---|---
Alaska | -1.336533 | -0.182208
Arizona | -1.839049 | -0.387959
California | 3.537857 | -1.282509
Colorado | 4.402210 | -0.898855
Connecticut | -0.988027 | 1.020218
state | PC1 | PC2 | AdultWeekend
---|---|---|---
Alaska | -1.336533 | -0.182208 | 57.333333
Arizona | -1.839049 | -0.387959 | 83.500000
California | 3.537857 | -1.282509 | 81.416667
Colorado | 4.402210 | -0.898855 | 90.714286
Connecticut | -0.988027 | 1.020218 | 56.800000
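The `AdultWeekend` column in the table above appears to be the mean weekend ticket price per state. As a hedged illustration, the sketch below recomputes it for Alaska and Arizona from the five resort rows shown in the preview at the top of this output. The list of tuples and the variable names are assumptions for this example; the notebook most likely uses a pandas `groupby`, which would also skip missing prices.

```python
from collections import defaultdict
from statistics import mean

# (resort name, state, AdultWeekend) copied from the five-row preview.
resort_prices = [
    ("Alyeska Resort", "Alaska", 85.0),
    ("Eaglecrest Ski Area", "Alaska", 53.0),
    ("Hilltop Ski Area", "Alaska", 34.0),
    ("Arizona Snowbowl", "Arizona", 89.0),
    ("Sunrise Park Resort", "Arizona", 78.0),
]

# Group the weekend prices by state.
prices_by_state = defaultdict(list)
for _name, state, price in resort_prices:
    prices_by_state[state].append(price)

# Average within each state.
state_avg_price = {state: mean(prices) for state, prices in prices_by_state.items()}
print(state_avg_price)
```

The results, roughly 57.33 for Alaska and 83.5 for Arizona, agree with the first two rows of the table.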
Pipeline(steps=[('simpleimputer', SimpleImputer(strategy='median')),
                ('standardscaler', StandardScaler()),
                ('linearregression', LinearRegression())])
SimpleImputer(strategy='median')
StandardScaler()
LinearRegression()
Pipeline(steps=[('simpleimputer', SimpleImputer(strategy='median')),
                ('standardscaler', StandardScaler()),
                ('selectkbest',
                 SelectKBest(score_func=<function f_regression at 0x00000295311C3A60>)),
                ('linearregression', LinearRegression())])
SimpleImputer(strategy='median')
StandardScaler()
SelectKBest(score_func=<function f_regression at 0x00000295311C3A60>)
LinearRegression()
Pipeline(steps=[('simpleimputer', SimpleImputer(strategy='median')),
                ('standardscaler', StandardScaler()),
                ('selectkbest',
                 SelectKBest(k=15,
                             score_func=<function f_regression at 0x00000295311C3A60>)),
                ('linearregression', LinearRegression())])
SimpleImputer(strategy='median')
StandardScaler()
SelectKBest(k=15, score_func=<function f_regression at 0x00000295311C3A60>)
LinearRegression()
GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('simpleimputer',
                                        SimpleImputer(strategy='median')),
                                       ('standardscaler', StandardScaler()),
                                       ('selectkbest',
                                        SelectKBest(score_func=<function f_regression at 0x00000295311C3A60>)),
                                       ('linearregression',
                                        LinearRegression())]),
             n_jobs=-1,
             param_grid={'selectkbest__k': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
                                            12, 13, 14, 15, 16, 17, 18, 19, 20,
                                            21, 22, 23, 24, 25, 26, 27, 28, 29,
                                            30, ...]})
Pipeline(steps=[('simpleimputer', SimpleImputer(strategy='median')),
                ('standardscaler', StandardScaler()),
                ('selectkbest',
                 SelectKBest(k=8,
                             score_func=<function f_regression at 0x00000295311C3A60>)),
                ('linearregression', LinearRegression())])
SimpleImputer(strategy='median')
StandardScaler()
SelectKBest(k=8, score_func=<function f_regression at 0x00000295311C3A60>)
LinearRegression()
GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('simpleimputer',
                                        SimpleImputer(strategy='median')),
                                       ('standardscaler', StandardScaler()),
                                       ('randomforestregressor',
                                        RandomForestRegressor(random_state=47))]),
             n_jobs=-1,
             param_grid={'randomforestregressor__n_estimators': [10, 12, 16, 20,
                                                                 26, 33, 42, 54,
                                                                 69, 88, 112,
                                                                 143, 183, 233,
                                                                 297, 379, 483,
                                                                 615, 784,
                                                                 1000],
                         'simpleimputer__strategy': ['mean', 'median'],
                         'standardscaler': [StandardScaler(), None]})
Pipeline(steps=[('simpleimputer', SimpleImputer(strategy='median')),
                ('standardscaler', StandardScaler()),
                ('randomforestregressor',
                 RandomForestRegressor(random_state=47))])
SimpleImputer(strategy='median')
StandardScaler()
RandomForestRegressor(random_state=47)
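The `randomforestregressor__n_estimators` grid above is not arbitrary: its 20 values are consistent with points spaced evenly on a log scale between 10 and 1000 and then truncated to integers. The sketch below reproduces that list in plain Python. The function name `log_spaced_ints` is hypothetical, and this output does not show how the notebook actually built the grid (NumPy's `logspace` would be a natural choice, but that is an assumption).

```python
def log_spaced_ints(start_exp, stop_exp, num):
    """Return `num` integers truncated from 10**e for e evenly spaced in [start_exp, stop_exp].

    Hypothetical helper. The tiny epsilon guards against a floating-point
    result such as 999.9999999999999 being truncated to 999.
    """
    span = stop_exp - start_exp
    return [
        int(10 ** (start_exp + span * i / (num - 1)) + 1e-9)
        for i in range(num)
    ]


# Candidate forest sizes: 20 log-spaced values from 10**1 to 10**3.
n_estimators_grid = log_spaced_ints(1, 3, 20)
print(n_estimators_grid)
```

Log spacing concentrates candidates at small forest sizes, where adding trees tends to change the score most, while still covering large forests with only a few fits.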
GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('simpleimputer',
                                        SimpleImputer(strategy='median')),
                                       ('standardscaler', StandardScaler()),
                                       ('randomforestregressor',
                                        RandomForestRegressor(random_state=47))]),
             n_jobs=-1,
             param_grid={'randomforestregressor__n_estimators': [10, 12, 16, 20,
                                                                 26, 33, 42, 54,
                                                                 69, 88, 112,
                                                                 143, 183, 233,
                                                                 297, 379, 483,
                                                                 615, 784,
                                                                 1000],
                         'simpleimputer__strategy': ['mean', 'median'],
                         'standardscaler': [StandardScaler(), None]})
Pipeline(steps=[('simpleimputer', SimpleImputer(strategy='median')),
                ('standardscaler', None),
                ('randomforestregressor',
                 RandomForestRegressor(n_estimators=69, random_state=47))])
SimpleImputer(strategy='median')
None
RandomForestRegressor(n_estimators=69, random_state=47)
Pipeline(steps=[('simpleimputer', SimpleImputer(strategy='median')),
                ('standardscaler', None),
                ('randomforestregressor',
                 RandomForestRegressor(n_estimators=69, random_state=47))])
SimpleImputer(strategy='median')
None
RandomForestRegressor(n_estimators=69, random_state=47)
 | summit_elev | vertical_drop | base_elev | trams | fastSixes | fastQuads | quad | triple | double | surface | ... | resorts_per_100kcapita | resorts_per_100ksq_mile | resort_skiable_area_ac_state_ratio | resort_days_open_state_ratio | resort_terrain_park_state_ratio | resort_night_skiing_state_ratio | total_chairs_runs_ratio | total_chairs_skiable_ratio | fastQuads_runs_ratio | fastQuads_skiable_ratio
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
124 | 6817 | 2353 | 4464 | 0 | 0 | 3 | 2 | 6 | 0 | 3 | ... | 1.122778 | 8.161045 | 0.140121 | 0.129338 | 0.148148 | 0.84507 | 0.133333 | 0.004667 | 0.028571 | 0.001
1 rows × 32 columns