In this project, I analyze a dataset of tweets for election prediction. I used a Jupyter Notebook with Python 3.6. The external libraries used are NumPy, Pandas, Seaborn, Matplotlib, Tweepy, and TextBlob.
The Anaconda (https://www.anaconda.com) environment I used to create this is included in the repository.
This study is an exercise in using the foundations of data science to import, study, visualize, and present raw data in a way that is easy for any user to digest and understand.
First, the raw comma-separated values (.csv) data will be loaded into a Python (Pandas) dataframe.
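The loading step might look like the minimal sketch below. The column names (`text`, `party`) and the inline sample rows are assumptions for illustration, not the project's actual schema. In the notebook, the path to the real .csv file would take the place of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real .csv file; the column
# names "text" and "party" are assumptions, not the project's schema.
SAMPLE_CSV = io.StringIO(
    "text,party\n"
    "Great rally today,A\n"
    "Terrible policy decision,B\n"
    "Proud to vote tomorrow,A\n"
)

# pd.read_csv accepts a file path or any file-like object.
tweets = pd.read_csv(SAMPLE_CSV)

print(tweets.shape)
print(tweets.head())
```

Reading from a buffer keeps the sketch self-contained; nothing else changes when a file path is passed instead.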
Second, there will be some data exploration. This will be done mostly by plotting different slices of the data in order to understand it better. Visualizing the data makes generating a hypothesis easier.
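As a rough illustration of plotting data slices, the sketch below groups a small dataframe two ways and draws one bar chart per slice. The `party` and `polarity` columns and their values are made up for the example and are not taken from the project's data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: the "party" and "polarity" columns are assumed.
tweets = pd.DataFrame(
    {
        "party": ["A", "B", "A", "B", "A"],
        "polarity": [0.5, -0.2, 0.1, 0.4, 0.3],
    }
)

# Slice 1: number of tweets per party.
counts = tweets["party"].value_counts().sort_index()

# Slice 2: mean polarity per party.
mean_polarity = tweets.groupby("party")["polarity"].mean()

# One bar chart per slice, side by side.
fig, (ax_counts, ax_polarity) = plt.subplots(1, 2, figsize=(8, 3))
counts.plot.bar(ax=ax_counts, title="Tweets per party")
mean_polarity.plot.bar(ax=ax_polarity, title="Mean polarity per party")
fig.tight_layout()
```

In a Jupyter Notebook the figure renders inline, so the backend line can be dropped.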
Third, the data will be analyzed.
Lastly, a function has been created where a user can input their personal information to see the predicted probability of each party winning the election.
Algorithms: Naive Bayes classification, SVM.
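To make the first of these algorithms concrete, here is a from-scratch sketch of multinomial Naive Bayes with add-one smoothing on toy text. It is not the project's implementation, which may well rely on a library. The function names, the whitespace tokenizer, and the labels "A" and "B" are all hypothetical. SVM is not covered by this sketch.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized text."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_proba(model, doc):
    """Return {label: posterior probability} for one document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in doc.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize log scores into probabilities, shifting by the maximum
    # for numerical stability.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


# Hypothetical toy data; labels "A" and "B" stand in for parties.
docs = ["great rally great crowd", "bad policy bad debate",
        "great speech", "bad turnout"]
labels = ["A", "B", "A", "B"]
model = train_naive_bayes(docs, labels)
probs = predict_proba(model, "great crowd")
print(max(probs, key=probs.get))
```

Working in log space keeps the product of many small word probabilities from underflowing on longer documents.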