Update format (#160)
* Add ref

* Change `name` keyword to `typename`

* Remove `layer` key from model configuration

* Move model config generation to agent
mthrok authored Feb 4, 2017
1 parent 9468f7a commit c0120d4
Showing 71 changed files with 454 additions and 465 deletions.
62 changes: 28 additions & 34 deletions README.md
@@ -19,11 +19,11 @@ from luchador.episode_runner import EpisodeRunner

 def main(env, agent, episodes, steps):
     # Create environment
-    Environment = get_env(env['name'])
-    env = Environment(**env['args'])
+    Environment = get_env(env['typename'])
+    env = Environment(**env['args'])

     # Create agent
-    Agent = get_agent(agent['name'])
+    Agent = get_agent(agent['typename'])
     agent = Agent(**agent['args'])
     agent.init(env)

@@ -109,7 +109,7 @@ $ python
 To use this new agent, we need to create a configuration file. Since this agent does not take any constructor arguments, the configuration file is as simple as the following.

 ```yaml
-name: MyRandomAgent
+typename: MyRandomAgent
 args: {}
 ```
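For illustration, here is a minimal sketch (an editor's addition, not part of this commit) of how such a config file could be loaded and turned into an agent, mirroring the `main` function shown above. The import path for `get_agent` is an assumption, since the README snippet uses it without showing where it comes from:

```python
import yaml  # assumes PyYAML is installed

from luchador import get_agent  # assumed import path; the README calls get_agent directly

# Load the agent configuration shown above
with open('example/MyRandomAgent.yml') as f:
    agent_config = yaml.safe_load(f)  # -> {'typename': 'MyRandomAgent', 'args': {}}

# Resolve the class by its `typename` and instantiate it with `args`
Agent = get_agent(agent_config['typename'])
agent = Agent(**agent_config.get('args', {}))
```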
@@ -163,23 +163,20 @@ Network architecture can be described using a set of layer configurations, and w
 model_type: Sequential
 layer_configs:
   - scope: layer1
-    layer:
-      name: Conv2D
-      args:
-        n_filters: 32
-        filter_width: 8
-        filter_height: 8
-        strides: 4
-        padding: valid
+    typename: Conv2D
+    args:
+      n_filters: 32
+      filter_width: 8
+      filter_height: 8
+      strides: 4
+      padding: valid
   - scope: layer2
-    layer:
-      name: ReLU
-      args: {}
+    typename: ReLU
+    args: {}
   - scope: layer3
-    layer:
-      name: Dense
-      args:
-        n_nodes: 3
+    typename: Dense
+    args:
+      n_nodes: 3
 ```

 You can feed this configuration to `luchador.nn.util.make_model`, and the function will return the corresponding network architecture.
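As a quick usage sketch (an editor's illustration, not from the commit), assuming the configuration above is saved as `model.yml` in the working directory:

```python
from luchador.nn.util import make_model

# make_model parses the layer_configs entries and assembles the
# corresponding Sequential model, as described above.
model = make_model('model.yml')
```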
@@ -190,23 +187,20 @@ But having static parameters is sometimes inconvenient. For example, although th
 model_type: Sequential
 layer_configs:
   - scope: layer1
-    layer:
-      name: Conv2D
-      args:
-        n_filters: 32
-        filter_width: 8
-        filter_height: 8
-        strides: 4
-        padding: valid
+    typename: Conv2D
+    args:
+      n_filters: 32
+      filter_width: 8
+      filter_height: 8
+      strides: 4
+      padding: valid
   - scope: layer2
-    layer:
-      name: ReLU
-      args: {{}}
+    typename: ReLU
+    args: {{}}
   - scope: layer3
-    layer:
-      name: Dense
-      args:
-        n_nodes: {n_actions}
+    typename: Dense
+    args:
+      n_nodes: {n_actions}
 ```

 When you load this file with `luchador.nn.util.make_model('model.yml', n_actions=5)`, 5 is substituted for `{n_actions}`. Notice that `ReLU`'s `args` value changed from `{}` to `{{}}` so that Python's `format` function renders it as a literal `{}`.
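The brace escaping is plain Python `str.format` behavior, as this short sketch (not part of the commit) demonstrates:

```python
# `{{` and `}}` are literal braces to str.format; `{n_actions}` is a placeholder.
template = 'args: {{}}\nn_nodes: {n_actions}'
print(template.format(n_actions=5))
# Output:
# args: {}
# n_nodes: 5
```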
2 changes: 1 addition & 1 deletion example/ALEEnvironment_test.yml
@@ -1,4 +1,4 @@
-name: ALEEnvironment
+typename: ALEEnvironment
 args:
   rom: space_invaders

2 changes: 1 addition & 1 deletion example/ALEEnvironment_train.yml
@@ -1,4 +1,4 @@
-name: ALEEnvironment
+typename: ALEEnvironment
 args:
   rom: space_invaders

2 changes: 1 addition & 1 deletion example/CartPole_agent.yml
@@ -1 +1 @@
-name: CartPoleAgent
+typename: CartPoleAgent
2 changes: 1 addition & 1 deletion example/CartPole_env.yml
@@ -1,4 +1,4 @@
-name: CartPole
+typename: CartPole
 args:
   angle_limit: 12 # Degree
   distance_limit: 2.4 # meter
19 changes: 10 additions & 9 deletions example/DQNAgent_test.yml
@@ -7,7 +7,7 @@ alias:
   save_prefix: &save_prefix DQN_integration_test
   initial_parameter: &initial_parameter example/space_invaders_vanilla_dqn_99000.h5

-name: DQNAgent
+typename: DQNAgent
 args:
   recorder_config:
     memory_size: 100
@@ -27,26 +27,27 @@ args:
     terminal:
       dtype: bool

-  model_config:
-    model_file: example/vanilla_dqn.yml
-    initial_parameter: *initial_parameter
-    input_channel: *stack
-    input_height: *height
-    input_width: *width

+  q_network_config:
+    model_config:
+      name: example/vanilla_dqn.yml
+      initial_parameter: *initial_parameter
+      input_channel: *stack
+      input_height: *height
+      input_width: *width
   q_learning_config:
     discount_rate: 0.99
     # reward is clipped between the following min and max
     min_reward: -1.0
     max_reward: 1.0
   cost_config:
-    name: SSE2
+    typename: SSE2
     args:
       # error between predicted Q value and target Q value is clipped by the following min and max
       min_delta: -1.0
       max_delta: 1.0
   optimizer_config:
-    name: NeonRMSProp
+    typename: NeonRMSProp
     args:
       decay: 0.95
       epsilon: 0.000001
19 changes: 10 additions & 9 deletions example/DQNAgent_train.yml
@@ -7,7 +7,7 @@ alias:
   save_prefix: &save_prefix DQN
   initial_parameter: &initial_parameter null

-name: DQNAgent
+typename: DQNAgent
 args:
   recorder_config:
     memory_size: 1000000
@@ -27,26 +27,27 @@ args:
     terminal:
       dtype: bool

-  model_config:
-    model_file: vanilla_dqn
-    initial_parameter: *initial_parameter
-    input_channel: *stack
-    input_height: *height
-    input_width: *width

+  q_network_config:
+    model_config:
+      name: vanilla_dqn
+      initial_parameter: *initial_parameter
+      input_channel: *stack
+      input_height: *height
+      input_width: *width
   q_learning_config:
     discount_rate: 0.99
     # reward is clipped between the following min and max
     min_reward: -1.0
     max_reward: 1.0
   cost_config:
-    name: SSE2
+    typename: SSE2
     args:
       # error between predicted Q value and target Q value is clipped by the following min and max
       min_delta: -1.0
       max_delta: 1.0
   optimizer_config:
-    name: NeonRMSProp
+    typename: NeonRMSProp
     args:
       decay: 0.95
       epsilon: 0.000001
2 changes: 1 addition & 1 deletion example/FlappyBirdEnv.yml
@@ -1,4 +1,4 @@
-name: FlappyBird
+typename: FlappyBird
 args:
   random_seed: null

2 changes: 1 addition & 1 deletion example/MyRandomAgent.yml
@@ -1 +1 @@
-name: MyRandomAgent
+typename: MyRandomAgent
2 changes: 1 addition & 1 deletion example/RemoteEnv.yml
@@ -1,4 +1,4 @@
-name: RemoteEnv
+typename: RemoteEnv
 args:
   host: 0.0.0.0
   port: 12345
163 changes: 76 additions & 87 deletions example/vanilla_dqn.yml
@@ -5,100 +5,89 @@ input:
   name: state
 layer_configs:
   - scope: layer0/preprocessing
-    layer:
-      name: TrueDiv
-      args:
-        denom: 255
+    typename: TrueDiv
+    args:
+      denom: 255
   - scope: layer1/conv2D
-    layer:
-      name: Conv2D
-      args:
-        n_filters: 32
-        filter_width: 8
-        filter_height: 8
-        strides: 4
-        padding: valid
-        initializers:
-          bias: &initializer1
-            name: Uniform
-            args:
-              # 1 / sqrt(8 * 8 * 32) = 0.022097
-              maxval: 0.022
-              minval: -0.022
-          weight: *initializer1
+    typename: Conv2D
+    args:
+      n_filters: 32
+      filter_width: 8
+      filter_height: 8
+      strides: 4
+      padding: valid
+      initializers:
+        bias: &initializer1
+          typename: Uniform
+          args:
+            # 1 / sqrt(8 * 8 * 32) = 0.022097
+            maxval: 0.022
+            minval: -0.022
+        weight: *initializer1
   - scope: layer1/ReLU
-    layer:
-      name: ReLU
+    typename: ReLU
   - scope: layer2/conv2D
-    layer:
-      name: Conv2D
-      args:
-        n_filters: 64
-        filter_width: 4
-        filter_height: 4
-        strides: 2
-        padding: valid
-        initializers:
-          bias: &initializer2
-            name: Uniform
-            args:
-              # 1 / sqrt(4 * 4 * 64) = 0.03125
-              maxval: 0.031
-              minval: -0.031
-          weight: *initializer2
+    typename: Conv2D
+    args:
+      n_filters: 64
+      filter_width: 4
+      filter_height: 4
+      strides: 2
+      padding: valid
+      initializers:
+        bias: &initializer2
+          typename: Uniform
+          args:
+            # 1 / sqrt(4 * 4 * 64) = 0.03125
+            maxval: 0.031
+            minval: -0.031
+        weight: *initializer2
   - scope: layer2/ReLU
-    layer:
-      name: ReLU
+    typename: ReLU
  - scope: layer3/conv2D
-    layer:
-      name: Conv2D
-      args:
-        filter_width: 3
-        filter_height: 3
-        n_filters: 64
-        strides: 1
-        padding: valid
-        initializers:
-          bias: &initializer3
-            name: Uniform
-            args:
-              # 1 / sqrt(3 * 3 * 64) = 0.04166
-              maxval: 0.042
-              minval: -0.042
-          weight: *initializer3
+    typename: Conv2D
+    args:
+      filter_width: 3
+      filter_height: 3
+      n_filters: 64
+      strides: 1
+      padding: valid
+      initializers:
+        bias: &initializer3
+          typename: Uniform
+          args:
+            # 1 / sqrt(3 * 3 * 64) = 0.04166
+            maxval: 0.042
+            minval: -0.042
+        weight: *initializer3
   - scope: layer3/ReLU
-    layer:
-      name: ReLU
+    typename: ReLU
   - scope: layer4/flatten
-    layer:
-      name: Flatten
+    typename: Flatten
   - scope: layer5/dense
-    layer:
-      name: Dense
-      args:
-        n_nodes: 512
-        initializers:
-          bias: &initializer5
-            name: Uniform
-            args:
-              # 1 / sqrt(3136) = 0.01785
-              # 3136 is expected #inputs to this layer when the input size to layer0 is 84 * 84 * 4
-              maxval: 0.018
-              minval: -0.018
-          weight: *initializer5
+    typename: Dense
+    args:
+      n_nodes: 512
+      initializers:
+        bias: &initializer5
+          typename: Uniform
+          args:
+            # 1 / sqrt(3136) = 0.01785
+            # 3136 is expected #inputs to this layer when the input size to layer0 is 84 * 84 * 4
+            maxval: 0.018
+            minval: -0.018
+        weight: *initializer5
   - scope: layer5/ReLU
-    layer:
-      name: ReLU
+    typename: ReLU
   - scope: layer6/dense
-    layer:
-      name: Dense
-      args:
-        n_nodes: {n_actions}
-        initializers:
-          bias: &initializer6
-            name: Uniform
-            args:
-              # 1 / sqrt(512) = 0.04419
-              maxval: 0.044
-              minval: -0.044
-          weight: *initializer6
+    typename: Dense
+    args:
+      n_nodes: {n_actions}
+      initializers:
+        bias: &initializer6
+          typename: Uniform
+          args:
+            # 1 / sqrt(512) = 0.04419
+            maxval: 0.044
+            minval: -0.044
+        weight: *initializer6
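The `Uniform` initializer bounds in the comments above all follow a `1 / sqrt(n)` rule, where `n` is the product given in each comment (filter area times `n_filters` for the conv layers, or the expected number of inputs for the dense layers). A quick check of that arithmetic (an editor's sketch, not part of the commit):

```python
from math import sqrt

# n values taken from the comments in vanilla_dqn.yml above
for scope, n in [('layer1', 8 * 8 * 32), ('layer2', 4 * 4 * 64),
                 ('layer3', 3 * 3 * 64), ('layer5', 3136), ('layer6', 512)]:
    print(f'{scope}: 1/sqrt({n}) = {1 / sqrt(n):.3f}')
# layer1: 1/sqrt(2048) = 0.022
# layer2: 1/sqrt(1024) = 0.031
# layer3: 1/sqrt(576) = 0.042
# layer5: 1/sqrt(3136) = 0.018
# layer6: 1/sqrt(512) = 0.044
```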
