Conversation
    return ObjectTypeNames


def get_object_type_ascii():
    return ObjectTypeAscii
There was a problem hiding this comment.
I use ascii symbols from Cython render_ascii code, so I needed these to decode the symbols back to names.
This is not 100% safe, btw - in ascii, agent types are lost. Would be good to clean up later.
There was a problem hiding this comment.
we should clean this up. i used ascii in the env for fast debugging, and it was never meant to be used for anything else. instead of getting ascii from the env, can you add a "symbols" config somewhere, and use that instead?
There was a problem hiding this comment.
Moved to tools/, as requested by David in Asana
@@ -0,0 +1,11 @@
defaults:
There was a problem hiding this comment.
do we need to index, can we just read the dir content from s3?
There was a problem hiding this comment.
It might be somewhat slow with 10k+ maps: S3 returns at most 1k keys per list request, so listing would require pagination.
aws s3 ls s3://softmax-public/maps/ takes 1 second for me. That might be because I'm far from the US or because of an HTTPS cold start, but I didn't want to risk waiting 10 seconds for each map. Maybe I should measure it, in case pagination is well-optimized?
Even keeping the index on s3 is somewhat slow and could be 10x multiplier on data sent... Which is why I tried to keep that part agnostic.
I could add lazy indexing & local cache if you think it'd be better (but then I'd need to come up with conventions on where to cache, which is another layer of configuration, so I wanted to avoid that).
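To make the pagination cost concrete, here is a minimal sketch of listing every key under a prefix. It assumes a boto3-style S3 client (boto3 does provide `get_paginator("list_objects_v2")`, and each page holds at most 1000 keys); the function name `list_map_keys` is hypothetical and not part of this PR.

```python
from typing import Any, Iterator


def list_map_keys(s3_client: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every object key under `prefix`, following S3 pagination.

    `s3_client` is assumed to be a boto3 S3 client (or anything exposing the
    same `get_paginator` interface). Each page is one request and contains
    at most 1000 keys, so 10k maps cost roughly 10 requests.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page (or the whole prefix) is empty.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

With boto3 installed, timing `list(list_map_keys(boto3.client("s3"), "softmax-public", "maps/"))` would be one way to measure whether listing is fast enough to replace the index.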
@hydra.main(version_base=None, config_path="../configs", config_name="mapgen")
def main(cfg):
There was a problem hiding this comment.
can we use argparse here instead of hydra? i think it's a better match for utilities. we can still use hydra.compose() to load the map config, but other than that, i think this adds unnecessary configs and complexity.
python -m tools.mapgen --output-uri=s3://softmax-public/maps --max-maps=5 --num-workers=4 --env=env/mettagrid/simple
it would loop while output-uri has fewer than max-maps, creating and dumping a map in there with a random name. it would use a multiprocessing pool to do this in parallel with --num-workers
i don't want to maintain an index file, because it makes parallel generation harder, and listing a directory is easy
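A minimal sketch of the loop described above, for local directories only. The flag names come from the comment; everything else is an assumption: `build_parser`, `make_map`, and `fill_directory` are hypothetical names, the defaults are placeholders, and `make_map` writes a dummy file instead of generating a real map.

```python
import argparse
import multiprocessing
import os
import uuid


def build_parser() -> argparse.ArgumentParser:
    # Defaults below are placeholders chosen for the sketch.
    parser = argparse.ArgumentParser(description="Fill a directory with generated maps")
    parser.add_argument("--output-uri", default="/tmp/metta-maps")
    parser.add_argument("--max-maps", type=int, default=5)
    parser.add_argument("--num-workers", type=int, default=1)
    parser.add_argument("--env", default="env/mettagrid/simple")
    return parser


def make_map(output_dir: str) -> str:
    """Placeholder for real map generation: writes one file with a random name."""
    path = os.path.join(output_dir, f"{uuid.uuid4().hex}.yaml")
    with open(path, "w") as f:
        f.write("placeholder map\n")
    return path


def fill_directory(output_dir: str, max_maps: int, num_workers: int) -> int:
    """Create maps until `output_dir` holds `max_maps` files; return how many were added."""
    os.makedirs(output_dir, exist_ok=True)
    # Listing the directory takes the place of an index file.
    missing = max(0, max_maps - len(os.listdir(output_dir)))
    jobs = [output_dir] * missing
    if num_workers <= 1:
        for job in jobs:
            make_map(job)
    else:
        with multiprocessing.Pool(num_workers) as pool:
            pool.map(make_map, jobs)
    return missing
```

Because each worker picks its own random file name, workers never coordinate through shared state, which is the property an index file would break.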
ascii_map.save(target_uri)

# Show the map if requested
show_env(env, cfg.mapgen.show)
There was a problem hiding this comment.
you can add --view flag that shows each map. by default output-uri can be /tmp/metta-maps and max-maps can be inf, so that when view is used, it dumps a new random map into /tmp/ and displays it
        return len(self.lines)

    @staticmethod
    def from_env(env: MettaGridEnv, gen_time: float) -> "StorableMap":
There was a problem hiding this comment.
this seems unnecessary. we should not need an env to get a map, since maps are needed to make an env.
from mettagrid.map.scene import Scene, TypedChild


class RemoveAgents(Scene):
There was a problem hiding this comment.
i don't love this pattern. i don't think i understand the problem you're trying to solve. we don't have to define game.num_agents, but we do need to compute it from the map, and make sure it doesn't change during a training run. open to ideas on how to do that best.
There was a problem hiding this comment.
I'm not happy about game.num_agents config duplication either (previously discussed here), but I think this handles a valid scenario even if we get rid of num_agents.
If we have pre-generated maps, it's reasonable to want to run modifications of them without regenerating the map from scratch. Compare: in Starcraft, you can start the same map made for N players with M<N players.
Scenes already support drawing on top of existing maps, so this is not that much of a hack. (My initial take was to call this ForceAgentCount and support both increasing and decreasing the number of agents, but sharing code with Random scene was too annoying and I used two scene calls instead.)
In general, my policy for adding new scenes is to add things that might be useful, and we can remove the stuff that's not used later.
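The "two scene calls" idea (clear the existing agents, then place a fresh set) can be sketched on a plain grid. This does not use the mettagrid Scene API: the grid representation, the cell names "agent", "empty", and "wall", and both function names are assumptions made for illustration.

```python
import random
from typing import List

Grid = List[List[str]]


def remove_agents(grid: Grid) -> None:
    """Replace every agent cell (any name starting with "agent") with "empty", in place."""
    for row in grid:
        for x, cell in enumerate(row):
            if cell.startswith("agent"):
                row[x] = "empty"


def place_agents(grid: Grid, count: int, rng: random.Random) -> None:
    """Put `count` agents on randomly chosen empty cells, in place."""
    empty = [
        (y, x)
        for y, row in enumerate(grid)
        for x, cell in enumerate(row)
        if cell == "empty"
    ]
    if count > len(empty):
        raise ValueError("not enough empty cells for the requested agents")
    for y, x in rng.sample(empty, count):
        grid[y][x] = "agent"
```

Calling `remove_agents(grid)` followed by `place_agents(grid, n, rng)` forces the agent count to `n` whether the stored map had more or fewer agents.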
There was a problem hiding this comment.
Okay, let's ship this for now. Wanting to update pre-existing maps makes sense. Depending on how much that happens we may want a generic set of "update map" utilities, but this seems like a good tool to figure out what we'll want.
@@ -0,0 +1,14 @@
from mettagrid.map.node import Node
There was a problem hiding this comment.
do we need this if we don't removeAgents?
There was a problem hiding this comment.
Technically we don't need Nop even with RemoveAgents, it's here just for a bit more clear config.
Load random uses this:
extra_root:
  _target_: mettagrid.map.scenes.nop.Nop
  children:
    - where: full
      scene:
        _target_: mettagrid.map.scenes.remove_agents.RemoveAgents
    - where: full
      scene:
        _target_: mettagrid.map.scenes.random.Random
        agents: 40
Instead, it could be:
extra_root:
  _target_: mettagrid.map.scenes.remove_agents.RemoveAgents
  children:
    - where: full
      scene:
        _target_: mettagrid.map.scenes.random.Random
        agents: 40
These are equivalent. I think the former reads a bit better, but I'm not too hung up on it.
(I'm not sure extra_root is the best option here, maybe Load and LoadRandom generators should accept the plain list of scenes, or maybe even MapGen generator should take the list of scenes instead of a single root scene. where: full is kind of ugly.)
    python -m tools.index_s3_maps index_s3_maps.dir=s3://...
    """

    def __init__(self, index_uri: str, extra_root: SceneCfg | None = None):
 else:
     # Type check and handling
-    if isinstance(where, dict) and "tags" in where:
+    if (isinstance(where, dict) or isinstance(where, DictConfig)) and "tags" in where:
There was a problem hiding this comment.
(a) This smells funny, and makes me want to move in the direction of having a "where" object. I don't think we need to do that yet.
(b) isinstance's second argument can be a tuple of classes. So this could be if isinstance(where, (dict, DictConfig)): ...
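A small self-contained illustration of point (b). To avoid depending on omegaconf here, `DictConfigStandIn` is a hypothetical stand-in for `DictConfig`: a dict-like class that is not a `dict` subclass. The helper name `has_tags` is also made up.

```python
from collections import UserDict


class DictConfigStandIn(UserDict):
    """Hypothetical stand-in for omegaconf's DictConfig: dict-like, not a dict subclass."""


def has_tags(where: object) -> bool:
    # One isinstance call with a tuple of classes replaces the chained `or`.
    return isinstance(where, (dict, DictConfigStandIn)) and "tags" in where


print(has_tags({"tags": ["room"]}))                  # True
print(has_tags(DictConfigStandIn({"tags": ["a"]})))  # True
print(has_tags("full"))                              # False
```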
    s3 = get_s3_client()
    s3.put_object(Bucket=bucket, Key=key, Body=text)
else:
    with open(uri, "w") as f:
There was a problem hiding this comment.
Unless I'm somehow really confused, this won't open uri as a uri, but as a file. For example, you should be passing path/my_file.txt rather than file://path/my_file.txt.
My preference would be for us to be using uris, vs mixing s3 uris and filenames, so I think this should be changed to expect and strip the file:// prefix.
There was a problem hiding this comment.
I'm treating uri here in a loose sense, "the parameter that can be one of these syntax versions".
I tried to cover some subset of other config options in metta that are called "uri" - policy_uri and eval_db_uri. AFAICT both of these support plain file names.
I've changed the code to strip file:// but I'd prefer to keep the support for filenames. I like running mapgen with python -m tools.mapgen --output-dir=., and there'd be no auto-completion with file:// version.
Btw, there's a whole rabbit hole with "how relative paths are treated in file URIs" - file://path/to/file is supposed to be the absolute /path/to/file, and none of uri functions in metta repo do this... My code does things in the same incorrect way, for now (blindly strips file://).
Maybe we'll be able to extract these common functions under mettautils after we get monorepo?
There was a problem hiding this comment.
> I've changed the code to strip file:// but I'd prefer to keep the support for filenames. I like running mapgen with python -m tools.mapgen --output-dir=., and there'd be no auto-completion with file:// version.
Okay, this sounds good. I still feel squeamish about the mixing of bona fide uris and not-quite-uris, but don't need to be dogmatic.
> Maybe we'll be able to extract these common functions under mettautils after we get monorepo?
Agreed.
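A sketch of the loose "uri" handling agreed on in this thread: s3:// URIs, file:// URIs, and plain filenames are all accepted. `ParsedUri` and `parse_loose_uri` are hypothetical names, and the file:// branch deliberately reproduces the blind strip described above rather than standard file-URI semantics.

```python
from typing import NamedTuple, Optional


class ParsedUri(NamedTuple):
    scheme: str            # "s3" or "file"
    bucket: Optional[str]  # only set for s3
    path: str              # object key for s3, filesystem path otherwise


def parse_loose_uri(uri: str) -> ParsedUri:
    """Accept s3://bucket/key, file://path, or a plain filename."""
    if uri.startswith("s3://"):
        bucket, _, key = uri[len("s3://"):].partition("/")
        return ParsedUri("s3", bucket, key)
    if uri.startswith("file://"):
        # Blind strip: file://path/to/file becomes the relative path
        # "path/to/file", which is not how file URIs are meant to resolve.
        return ParsedUri("file", None, uri[len("file://"):])
    # No scheme: treat as a local path, so shell completion keeps working.
    return ParsedUri("file", None, uri)
```

Keeping the parsing in one function like this would also make it easier to move into a shared utility module later.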
Preview a random map:

```bash
python -m tools.mapgen ./configs/game/map_builder/load_random.yaml --overrides='dir=s3://BUCKET/DIR'
```
There was a problem hiding this comment.
This is confusing, since nothing in this command would make me expect I'm going to view a map.
For now, let's separate mapgen commands from view commands, since it's easier to reason about.
logger = logging.getLogger(__name__)

ascii_symbols = {
There was a problem hiding this comment.
Okay for this diff, but let's fast-follow with consolidating these into mettagrid.yaml. E.g.,
objects:
  altar:
    hp: 30
    input_battery: 3
    output_heart: 1
    max_output: 5
    conversion_ticks: 1
    cooldown: ${sampling:1, 20, 10}
    initial_items: 1
    ascii_symbol: a
  ...
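If the symbols live next to each object's config as suggested, the lookup tables could be derived from the parsed config rather than hard-coded. A minimal sketch, assuming the config is already loaded into nested mappings; `build_symbol_tables` is a hypothetical name, and the "wall" entry is invented for the example.

```python
from typing import Dict, Mapping, Tuple


def build_symbol_tables(
    objects_cfg: Mapping[str, Mapping[str, object]],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (name -> symbol, symbol -> name) from per-object config entries."""
    name_to_symbol: Dict[str, str] = {}
    symbol_to_name: Dict[str, str] = {}
    for name, cfg in objects_cfg.items():
        symbol = cfg.get("ascii_symbol")
        if symbol is None:
            continue  # this object has no ascii representation
        symbol = str(symbol)
        if symbol in symbol_to_name:
            raise ValueError(
                f"symbol {symbol!r} used by both {symbol_to_name[symbol]} and {name}"
            )
        name_to_symbol[name] = symbol
        symbol_to_name[symbol] = name
    return name_to_symbol, symbol_to_name


# Already-parsed config; "altar" follows the example above, "wall" is made up.
objects = {
    "altar": {"hp": 30, "ascii_symbol": "a"},
    "wall": {"hp": 10, "ascii_symbol": "W"},
}
name_to_symbol, symbol_to_name = build_symbol_tables(objects)
print(symbol_to_name["a"])  # altar
```

Rejecting duplicate symbols at load time guards against the decoding ambiguity that makes ascii maps lossy.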
There was a problem hiding this comment.
Several quick thoughts:
- ascii is not a good format to store maps anyway (see all the exceptions in grid_object_to_ascii function in this file; no support for colored mines and agents)
- DCSS maps have metadata on symbols in maps, SUBST: 1=agent.prey, 2=agent.predator, that might be interesting to try
- or just dump the full CSV with long names? sacrifice readability, but more robust
- generally, I really want to unify all our serialization formats and/or have tools for converting between them, I know we have at least four by now (this ascii, another completely different ascii in mettagrid.config.room.ascii, full names, int type ids...) - this is something I'm keeping in mind as high priority
There was a problem hiding this comment.
Oh, yeah. We're with you on not using ascii in the long run, and were considering asking you to rip it out -- but it's still being used by folks.
ShowMode = Literal["raylib", "ascii", "ascii_border"]


def show_map(storable_map: StorableMap, mode: ShowMode | None):
There was a problem hiding this comment.
As per the comment on the readme, can we split this into a separate entry point (mapview.py / mapshow / whatever)? Maybe with a subdir, so we get python -m tools.map.gen and python -m tools.map.view.
There was a problem hiding this comment.
Done; tools.map.gen can still do everything it could before, because "gen-and-view" is convenient. tools.map.view supports the same heuristics for dir vs file and reuses "load random" code for dirs.
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--output-dir", type=str, help="Output directory, e.g. ./maps or s3://.../dir")
There was a problem hiding this comment.
Can we figure out how to reasonably collapse --output-dir and --output-uri? Sorry for the back and forth, and for at least some level of inconsistency on my part. Dave and I chatted, and where we've landed is
- no output-dir, just output-uri. output-uri is optional. If not provided, then assert count is 1 and dump to stdout.
- If output-uri has a scheme, it should be s3 or file. If it's not provided, assume it's file.
- If output-uri ends with a file suffix (let's say . and <= 4 characters? Seems like .yaml matters the most, and .txt / .map / similar also seem reasonable), then assume it's the full target, assert count is 1, and write the output there.
- Otherwise, assume output-uri is a directory (/ prefix)
--output-uri can be
There was a problem hiding this comment.
I feel like these heuristics are going to backfire for someone eventually, but ok :) Changed according to your spec. (I don't have any better ideas...)
Note: I realize that map.gen.view could be a bit more clever and test "is this a dir?" empirically instead of relying on heuristics. I went with heuristics for now (the same uri_is_file function in both scripts), because all combinations of file/dir and local/s3 would be annoying to handle; on s3 I think a name can be both a dir and a file, because dirs don't really exist. And also because I expect that if the heuristics misfire, tools.map.gen will break too.
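A sketch of what a suffix-based uri_is_file heuristic could look like. The function name comes from the comment above, but this body is a guess and not the code in the PR. It assumes "<= 4 characters" counts the characters after the dot (so .yaml qualifies) and that a trailing slash always means a directory.

```python
import posixpath


def uri_is_file(uri: str, max_suffix_len: int = 4) -> bool:
    """Guess whether `uri` names a single file rather than a directory."""
    if uri.endswith("/"):
        return False  # an explicit trailing slash always means "directory"
    last_segment = uri.rsplit("/", 1)[-1]
    # splitext keeps the leading dot in `ext`, e.g. ".yaml"; names such as
    # "." or ".hidden" yield an empty extension.
    _, ext = posixpath.splitext(last_segment)
    return 1 <= len(ext) - 1 <= max_suffix_len


print(uri_is_file("s3://softmax-public/maps/map1.yaml"))  # True
print(uri_is_file("s3://softmax-public/maps"))            # False
print(uri_is_file("."))                                   # False
print(uri_is_file("./maps.v2"))                           # True (a misfire if maps.v2 is a directory)
```

The last call shows the kind of backfire mentioned above: a directory whose name happens to end in a short dotted suffix is classified as a file.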
Addressed everything + added CLI tests.
sasmith
left a comment
There was a problem hiding this comment.
Approving and merging, and I'll also bring this to the metta repo.
We've discussed and are ready to merge.
Plus documentation - check out https://github.com/Metta-AI/mettagrid/blob/78921f7856b72d740dcf76322232c3fe6fbb08bc/docs/mapgen.md