SimulateHungryAcademic.m
94 lines (67 loc) · 2.83 KB
clear all; close all;
% we'll look at our HA model where each trial is a day; this "more trials"
% version runs well past the original one-month simulation
nTrials = 300;
% initializing all option values to be equal, sitting between our two
% possible reward values (e.g. in this case, 0 or 1)
% (nTrials+1 rows, so the final V(t+1,:) update inside the loop stays in bounds)
V = 1/2*ones(nTrials+1,4);
%
% pick some parms; our simple model has two (assuming V_init fixed):
% learning rate and softmax temperature
learnRate = 0.5;
betaParm = 5; % more on what these mean tomorrow.
% let's assign some reward probabilities to the options (i.e. how likely is
% HA to have a good meal and no food poisoning at each place)
% we'll do this using a vector, rProbs, with one row and four columns,
% where each element is the probability of reward for one restaurant
rProbs = [0.5 0.2 0.65 0.9]; % we'll use these to calculate reward
% time to let modelHA play
% preallocate trial-by-trial variables so MATLAB doesn't grow them in the loop
choiceProb = zeros(nTrials,4);
Chosen = zeros(1,nTrials);
Reward = zeros(1,nTrials);
% iterate across trials
for t = 1:nTrials
% get choice probabilities for each of the four options based on values
% this is the softmax equation from the slides!
% (you can experiment w/ this and make HA be a hard maximizer, instead,
% or use epsilon-greedy, or some mix)
choiceProb(t,:) = exp(V(t,:)*betaParm)/sum(exp(V(t,:)*betaParm));
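% for intuition, a hypothetical worked example (numbers not from this run):
% with V(t,:) = [0.5 0.5 0.5 0.9] and betaParm = 5,
% exp(5*V) = [12.18 12.18 12.18 90.02], so
% choiceProb is approximately [0.096 0.096 0.096 0.711]: the best option
% dominates, but the others still get sampled occasionally (exploration)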
% decide what modelHA chooses by flipping a coin
Chosen(t) = modelHPChoose(choiceProb(t,:),4);
% decide what reward modelHA got based on choice
% i.e., flip another coin, with reward probability of whatever
% restaurant HA chose, and see if it turns up heads (1) or tails (0)
Reward(t) = rand<rProbs(Chosen(t));
% update values based on reward, using Rescorla-Wagner update rule
V(t+1,:) = V(t,:);
V(t+1,Chosen(t)) = V(t,Chosen(t)) + learnRate*(Reward(t)-V(t,Chosen(t)));
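% worked example (hypothetical numbers): if V(t,Chosen(t)) = 0.5,
% Reward(t) = 1, and learnRate = 0.5, the prediction error is 1-0.5 = 0.5,
% so the new value is 0.5 + 0.5*0.5 = 0.75, i.e. halfway to the outcome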
end
% make a new matrix for your simulated HAData: for instance, each row can
% be one trial, and each column a different variable: trial number, choice,
% reward
simHADataMoreTrials(:,1) = 1:nTrials;
simHADataMoreTrials(:,2) = Chosen;
simHADataMoreTrials(:,3) = Reward;
save('simHADataMoreTrials','simHADataMoreTrials')
%
% plot some stuff to see what modelHA is really doing here
% to start, try making a bar plot for how often HA chose each of the four
% options (hint: you can use a FOR loop to get % choices for each)
avgOptChoice = zeros(1,4);
for option = 1:4
avgOptChoice(option) = sum(Chosen==option)/nTrials;
end
close all
bar(avgOptChoice)
xlabel('Options')
ylabel('Choice frequency')
set(gca,'fontsize',18,'xticklabel',{'BGood','Chipotle','Falafel','Bagels'})
ylim([0 1])
% do the model choices make sense, given the reward probabilities for each
% option? If not, why? If yes, great!
%%
% you can also look in more detail at this, for instance by plotting which
% option HA chose on each day (e.g. each trial); you can do this with a bar
% plot, a line plot, anything you want
close all
bar(Chosen)
set(gca,'fontsize',20)
xlabel('Trial')
ylabel('Choice')
xlim([0 nTrials+1]) % show all trials, not just the first month
%%
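% modelHPChoose is defined in a separate file not shown here. A minimal
% sketch of what such a helper presumably does (sampling one of nOptions
% according to the given probabilities, via the cumulative-sum trick);
% the actual course version may differ:
%
% function choice = modelHPChoose(choiceProbs, nOptions)
% cumProbs = cumsum(choiceProbs(1:nOptions)); % cumulative choice probabilities
% choice = find(rand < cumProbs, 1, 'first'); % first bin the coin flip lands in
% end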