- Marijose Cavazos - My github
- Paola Aleman - My github
- Javier Robles - My github
- Cesar Cruz - My github
The purpose of this project is to analyze the behavior of the real estate market in Austin, Texas, from 2018 to 2021. Various variables that influence the sale price will be examined, such as the number of bathrooms, bedrooms, and parking spaces, the land area, and the construction area, among others. This information will be used to predict, through linear regression and neural networks, whether the sale prices were in line with the market. The same machine learning methods will be used to estimate a sale price from the data that users enter about their own house.
The results will be presented in a web page built with Python Flask, HTML/CSS, and JavaScript.
- Did I buy my house above or below the market price?
- Did I sell my house above or below the market price?
- What is the price at which I could sell my house according to the market?
The data was downloaded from Kaggle. The dataset collects information about houses that were on the real estate market in Austin, Texas, in recent years.
The dataset has 46 columns, including: City, streetAddress, Zipcode, Latitude, Longitude, propertyTaxRate, garageSpaces, hasAssociation, hasCooling, hasGarage, hasHeating, hasSpa, hasView, homeType, parkingSpaces, yearBuilt, latestPrice, numPriceChanges, latest_saledate, latest_salemonth, latest_saleyear, latestPriceSource, numOfPhotos, numOfAccessibilityFeatures, numOfAppliances, numOfParkingFeatures, numOfPatioAndPorchFeatures, numOfSecurityFeatures, numOfWaterfrontFeatures, numOfWindowFeatures, numOfCommunityFeatures, lotSizeSqFt, livingAreaSqFt, numOfPrimarySchools, numOfElementarySchools, numOfMiddleSchools, numOfHighSchools, avgSchoolDistance, avgSchoolRating, avgSchoolSize, MedianStudentsPerTeacher, numOfBathrooms, numOfBedrooms, numOfStories, homeImage.
The dataset was retrieved from https://www.kaggle.com/code/threnjen/austin-housing-eda-nlp-models-visualizations/input
The first step was to import the file "austin_housing.csv" to Jupyter Notebook and analyze it in order to be able to select the most relevant variables for the project.
The dataset was filtered to 30 variables, and "numOfSchools" was created as a result of summing the different school levels. All this data was exported to the "austin_housing_reduced.csv" file, which contains these columns and their related information: city, streetAddress, zipcode, latitude, longitude, propertyTaxRate, garageSpaces, hasCooling, hasGarage, hasHeating, hasSpa, hasView, homeType, yearBuilt, latestPrice, numPriceChanges, numOfAccessibilityFeatures, numOfAppliances, numOfParkingFeatures, numOfPatioAndPorchFeatures, numOfSecurityFeatures, numOfWaterfrontFeatures, numOfWindowFeatures, numOfCommunityFeatures, lotSizeSqFt, livingAreaSqFt, avgSchoolRating, numOfBathrooms, numOfBedrooms, numOfStories, numOfSchools.
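The filtering step described above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the function name `reduce_housing` is hypothetical, and it assumes that "numOfSchools" is the sum of the four `numOf...Schools` columns and that the column names are spelled as in the reduced list.

```python
import pandas as pd

# The 30 columns kept from the original file (names as listed above).
KEEP_COLUMNS = [
    "city", "streetAddress", "zipcode", "latitude", "longitude",
    "propertyTaxRate", "garageSpaces", "hasCooling", "hasGarage", "hasHeating",
    "hasSpa", "hasView", "homeType", "yearBuilt", "latestPrice",
    "numPriceChanges", "numOfAccessibilityFeatures", "numOfAppliances",
    "numOfParkingFeatures", "numOfPatioAndPorchFeatures",
    "numOfSecurityFeatures", "numOfWaterfrontFeatures", "numOfWindowFeatures",
    "numOfCommunityFeatures", "lotSizeSqFt", "livingAreaSqFt",
    "avgSchoolRating", "numOfBathrooms", "numOfBedrooms", "numOfStories",
]

# Assumption: these are the "school level" columns that get summed.
SCHOOL_COLUMNS = [
    "numOfPrimarySchools", "numOfElementarySchools",
    "numOfMiddleSchools", "numOfHighSchools",
]


def reduce_housing(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the selected columns and add numOfSchools as a row-wise sum."""
    reduced = df[KEEP_COLUMNS].copy()
    reduced["numOfSchools"] = df[SCHOOL_COLUMNS].sum(axis=1)
    return reduced
```

With the real file, this could be used as `reduce_housing(pd.read_csv("austin_housing.csv")).to_csv("austin_housing_reduced.csv", index=False)`.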
The Austin House Pricing Project utilized Flask to develop a web application consisting of app.py, index.html, and JavaScript files. The primary objective of this project was to display a map using Leaflet, showcasing information about houses sold between 2018 and 2021.
To enhance the functionality, machine learning techniques were employed. Both linear regression and neural networks were utilized to predict house prices based on user input.
The Flask application seamlessly integrated the machine learning models, enabling users to input specific parameters related to the house they were interested in. The models would then provide an estimated price based on the given inputs, utilizing the predictive capabilities of the trained models.
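As a rough sketch of how a Flask route can pass user input to a model and return an estimate: the `/predict` route, the two input fields, and the `predict_price` stand-in with its made-up coefficients are all assumptions for illustration. The project's real `app.py` and trained models are not shown here.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical subset of inputs; the real form may ask for more fields.
FEATURES = ["livingAreaSqFt", "numOfBathrooms"]


def predict_price(features):
    """Stand-in for a trained model. The coefficients are invented."""
    return (
        50_000
        + 150 * features["livingAreaSqFt"]
        + 10_000 * features["numOfBathrooms"]
    )


@app.route("/predict", methods=["POST"])
def predict():
    # Parse the JSON body; treat a missing or invalid body as empty input.
    payload = request.get_json(silent=True) or {}
    try:
        features = {name: float(payload[name]) for name in FEATURES}
    except (KeyError, TypeError, ValueError):
        return jsonify(error="expected numeric values for: " + ", ".join(FEATURES)), 400
    return jsonify(predicted_price=predict_price(features))
```

In a real application the stand-in function would be replaced by a call to the loaded linear regression or neural network model.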
Overall, the project combined web development, data visualization, and machine learning techniques to create an interactive platform that empowered users to explore and obtain estimated prices for houses in Austin, Texas.
- Leaflet showing information about the selected home.
- Map filtering.
- Price prediction.
- Graphs.
- JavaScript
- HTML/CSS
- JSON
- GitHub and GitHub Pages
- console.log
- Matplotlib.pyplot
- Flask
- Jupyter Notebook
- CORS
- LiveServer JS
- The neural network model fit our dataset best: it had the lower MAE (Mean Absolute Error) of the two models.
- The linear regression model does not take into consideration relevant fields like zipcode and year of construction.
- The predicted prices were categorized as good, bad, or neutral by comparing them to the listed price.
- Users were offered different options for displaying visualizations that compare both models.
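The MAE comparison and the good/bad/neutral labels mentioned above can be illustrated as follows. This is a sketch under assumptions: the function names, the 5% tolerance, and the choice to call a listing "good" when it is priced below the prediction (the buyer's point of view) are not taken from the project.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted prices."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rate_price(listed_price, predicted_price, tolerance=0.05):
    """Label a listing by comparing it to the model's predicted price.

    Assumption: within +/- tolerance of the prediction is "neutral",
    below that band is "good", and above it is "bad".
    """
    lower = predicted_price * (1 - tolerance)
    upper = predicted_price * (1 + tolerance)
    if listed_price < lower:
        return "good"
    if listed_price > upper:
        return "bad"
    return "neutral"
```

Computing `mean_absolute_error` on the same test prices for each model gives one number per model, and the model with the smaller value has the smaller average error.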
Austin Housing - EDA, NLP, Models, Visualizations. (2021). Retrieved from https://www.kaggle.com/code/threnjen/austin-housing-eda-nlp-models-visualizations/input
Copyright © 2023 Aleman Paola, Javier Robles, Cavazos Maria Jose, César Cruz, BootCamp Tecnologico de Monterrey. All Rights Reserved.