ProblemStatement1.txt
Now it's time for you to have a go at this. For starters, you will work with the same data set that we discussed in the lessons: the sales data. You will write some Mappers and Reducers yourself and then answer the questions about the data that follow. You will do the data processing on your local pseudo-distributed cluster, but you can check whether your solution is correct by submitting your results to our system.
The three questions that you have to answer about this data set are:
Instead of breaking the sales down by store, give us a sales breakdown by product category across all of our stores.
Find the monetary value for the highest individual sale for each separate store.
Find the total sales value across all the stores, and the total number of sales. Assume there is only one reducer.
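
The three questions above could be approached along the following lines. This is only a minimal sketch, not a reference solution. It assumes that each input line holds six tab-separated fields (date, time, store, category, cost, payment); that layout is not stated in this problem statement, so check it against your copy of the data. All function names here are hypothetical, and the shuffle function is a local stand-in for the sort-and-group step that Hadoop performs between the Mapper and the Reducer.

```python
from itertools import groupby

# ASSUMPTION: six tab-separated fields per line:
# date, time, store, category, cost, payment
def parse(line):
    """Return (store, category, cost), or None for a malformed line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        return None
    _date, _time, store, category, cost, _payment = fields
    try:
        return store, category, float(cost)
    except ValueError:
        return None

# Mappers: each one emits (key, cost) pairs.
def map_by_category(lines):          # question 1
    for rec in filter(None, map(parse, lines)):
        yield rec[1], rec[2]

def map_by_store(lines):             # question 2
    for rec in filter(None, map(parse, lines)):
        yield rec[0], rec[2]

def map_all(lines):                  # question 3: one constant key
    for rec in filter(None, map(parse, lines)):
        yield "all", rec[2]

def shuffle(pairs):
    """Local stand-in for Hadoop's sort/shuffle: group values by key."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    for key, group in groupby(ordered, key=lambda kv: kv[0]):
        yield key, [value for _, value in group]

# Reducers: each one receives (key, [costs]) and emits one result per key.
def reduce_sum(grouped):             # question 1: total per category
    for key, values in grouped:
        yield key, sum(values)

def reduce_max(grouped):             # question 2: highest sale per store
    for key, values in grouped:
        yield key, max(values)

def reduce_sum_and_count(grouped):   # question 3: total value and sale count
    for key, values in grouped:
        yield key, (sum(values), len(values))

def run(mapper, reducer, lines):
    """Run one map -> shuffle -> reduce pass and collect the results."""
    return dict(reducer(shuffle(mapper(lines))))
```

For example, run(map_by_store, reduce_max, open("purchases.txt")) would return a dictionary of the highest sale per store, where purchases.txt is a placeholder file name. On the cluster, each mapper and reducer would instead be a separate script that reads lines from standard input and prints tab-separated key/value pairs; the reducer there sees sorted lines rather than pre-grouped lists, so it has to detect key changes itself. Using a single constant key in the third job relies on the stated assumption that there is only one reducer.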
When you have finished writing and running your MapReduce jobs, press 'Next' to submit and check your answers.