Skip to content
This repository was archived by the owner on Apr 3, 2026. It is now read-only.

S3 maps generation and loading#108

Merged
sasmith merged 20 commits into
mainfrom
s3-mapgen
Apr 23, 2025
Merged

S3 maps generation and loading#108
sasmith merged 20 commits into
mainfrom
s3-mapgen

Conversation

@berekuk
Copy link
Copy Markdown
Contributor

@berekuk berekuk commented Apr 20, 2025

@berekuk berekuk requested a review from sasmith April 20, 2025 01:57
Comment thread mettagrid/objects/constants.pyx Outdated
return ObjectTypeNames

def get_object_type_ascii():
return ObjectTypeAscii
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use ascii symbols from Cython render_ascii code, so I needed these to decode the symbols back to names.

This is not 100% safe, btw - in ascii, agent types are lost. Would be good to clean up later.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should clean this up. i used ascii in the env for fast debugging, and it was never meant to be used for anything else. instead of getting ascii from the env, can you add a "symbols" config somewhere, and use that instead?

Comment thread tests/mapgen.py
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to tools/, as requested by David in Asana

daveey
daveey previously requested changes Apr 20, 2025
Comment thread configs/index_s3_maps.yaml Outdated
@@ -0,0 +1,11 @@
defaults:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to index, can we just read the dir content from s3?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be somewhat slow with 10k+ maps - s3 only allows to list 1k maps at a time, so would require pagination.

aws s3 ls s3://softmax-public/maps/ for me takes 1 second. might be because I'm far from the US or because of https cold start, but I didn't want to take a risk and waiting for 10 seconds for each map. Maybe I should measure it and pagination is well-optimized?

Even keeping the index on s3 is somewhat slow and could be 10x multiplier on data sent... Which is why I tried to keep that part agnostic.

I could add lazy indexing & local cache if you think it'd be better (but then I'd need to come up with conventions on where to cache, which is another layer of configuration, so I wanted to avoid that).

Comment thread tools/mapgen.py Outdated


@hydra.main(version_base=None, config_path="../configs", config_name="mapgen")
def main(cfg):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use argparse here isntead of hydra? i think it's a better match for utilities. we can still use hydra.compose() to load the map config, but other than that, i think this adds unnecessary configs and complexity.

python -m tools.mapgen --output-uri=s3://softmax-public/maps --max-maps=5 --num-workers=4 --env=env/mettagrid/simple

it would loop while output-uri has fewer than max-maps, creating and dumping a map in there with a random name. it would use a multiprocessing pool to do this in parallel with --num-workers

i don't want to maintain an index file, because it makes paralell generation harder, and listing a directory is easy

Comment thread tools/mapgen.py Outdated
ascii_map.save(target_uri)

# Show the map if requested
show_env(env, cfg.mapgen.show)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can add --view flag that shows each map. by default output-uri can be /tmp/metta-maps and max-maps can be inf, so that when view is used, it dumps a new random map into /tmp/ and displays it

Comment thread mettagrid/objects/constants.pyx Outdated
return ObjectTypeNames

def get_object_type_ascii():
return ObjectTypeAscii
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should clean this up. i used ascii in the env for fast debugging, and it was never meant to be used for anything else. instead of getting ascii from the env, can you add a "symbols" config somewhere, and use that instead?

Comment thread mettagrid/map/utils/serialization.py Outdated
return len(self.lines)

@staticmethod
def from_env(env: MettaGridEnv, gen_time: float) -> "StorableMap":
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems un-nessary. we should not need an env to get a map, since maps are needed to make an env.

from mettagrid.map.scene import Scene, TypedChild


class RemoveAgents(Scene):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't love this pattern. i don't think i understand the problem you're trying to solve. we don't have to define game.num_agents, but we do need to compute it from the map, and make sure it doesn't change during a training run. open to ideas on how to do that best.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not happy about game.num_agents config duplication either (previously discussed here), but I think this handles a valid scenario even if we get rid of num_agents.

If we have the pre-generated maps, it's reasonable to want to run modifications of them without regenerating the map from scratch. Compare: in Starcraft, you can start the same for N players with M<N players.

Scenes already support drawing on top of existing maps, so this is not that much of a hack. (My initial take was to call this ForceAgentCount and support both increasing and decreasing the number of agents, but sharing code with Random scene was too annoying and I used two scene calls instead.)

In general, my policy for adding new scenes is to add things that might be useful, and we can remove the stuff that's not used later.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's ship this for now. Wanting to update pre-existing maps makes sense. Depending on how much that happens we may want a generic set of "update map" utilities, but this seems like a good tool to figure out what we'll want.

@@ -0,0 +1,14 @@
from mettagrid.map.node import Node
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this if we don't removeAgents?

Copy link
Copy Markdown
Contributor Author

@berekuk berekuk Apr 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically we don't need Nop even with RemoveAgents, it's here just for a bit more clear config.

Load random uses this:

extra_root:
  _target_: mettagrid.map.scenes.nop.Nop

  children:
    - where: full
      scene:
        _target_: mettagrid.map.scenes.remove_agents.RemoveAgents

    - where: full
      scene:
        _target_: mettagrid.map.scenes.random.Random
        agents: 40

Instead, it could be:

extra_root:
  _target_: mettagrid.map.scenes.remove_agents.RemoveAgents

  children:
    - where: full
      scene:
        _target_: mettagrid.map.scenes.random.Random
        agents: 40

These are equivalent, I think the former reads a bit better, but not too stuck up on it.

(I'm not sure extra_root is the best option here, maybe Load and LoadRandom generators should accept the plain list of scenes, or maybe even MapGen generator should take the list of scenes instead of a single root scene. where: full is kind of ugly.)

Comment thread mettagrid/map/load_random.py Outdated
python -m tools.index_s3_maps index_s3_maps.dir=s3://...
"""

def __init__(self, index_uri: str, extra_root: SceneCfg | None = None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just support list-dir

@berekuk berekuk requested a review from daveey April 21, 2025 22:17
@berekuk
Copy link
Copy Markdown
Contributor Author

berekuk commented Apr 21, 2025

  • rewrote mapgen.py and index_s3_maps.py with argparse
  • changed LoadRandom to list s3 directly; moved previous version with index file to LoadRandomFromIndex in case we need it later
  • changed StorableMap so that it works from numpy grid instead of env's render_ascii (not all symbols are supported; I'll upgrade the format for storing maps soon, want to avoid pickle, I guess the easiest option is CSV? I don't love that it's less readable though)
  • remove a few places where Hydra was used; scenes in mapgen configs are now addressed by the path (unfortunately this means that scene paths are absolute from the mettagrid root... will continue to think how to improve, ideally we want to track DictConfig source somehow, wrap OmegaConf.load so that it returns our patched version of DictConfig that's file source-aware?)

Comment thread mettagrid/map/node.py Outdated
else:
# Type check and handling
if isinstance(where, dict) and "tags" in where:
if (isinstance(where, dict) or isinstance(where, DictConfig)) and "tags" in where:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a) This smells funny, and makes me want to move in the direction of having a "where" object. I don't think we need to do that yet.
(b) isinstance's second argument can be a tuple of classes. So this could be if isinstance(where, (dict, DictConfig)): ...

Comment thread mettagrid/map/utils/s3utils.py
Comment thread mettagrid/map/utils/s3utils.py Outdated
Comment thread mettagrid/map/utils/s3utils.py Outdated
s3 = get_s3_client()
s3.put_object(Bucket=bucket, Key=key, Body=text)
else:
with open(uri, "w") as f:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm somehow really confused, this won't open uri as a uri, but as a file. For example, you should be passing path/my_file.txt rather than file://path/my_file.txt.

My preference would be for us to be using uris, vs mixing s3 uris and filenames, so I think this should be changed to expect and strip the file:// prefix.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm treating uri here in a loose sense, "the parameter that can be one of these syntax versions".

I tried to cover some subset of other config options in metta that are called "uri" - policy_uri and eval_db_uri. AFAICT both of these support plain file names.

I've changed to code to strip file:// but I'd prefer to keep the support for filenames. I like running mapgen with python -m tools.mapgen --output-dir=., and there'd be no auto-completion with file:// version.


Btw, there's a whole rabbit hole with "how relative paths are treated in file URIs" - file://path/to/file is supposed to be the absolute /path/to/file, and none of uri functions in metta repo do this... My code does things in the same incorrect way, for now (blindly strips file://).

Maybe we'll be able to extract these common functions under mettautils after we get monorepo?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed to code to strip file:// but I'd prefer to keep the support for filenames. I like running mapgen with python -m tools.mapgen --output-dir=., and there'd be no auto-completion with file:// version.

Okay, this sounds good. I still feel squeamish about the mixing of bonafide uris and not-quite-uris, but don't need to be dogmatic.

Maybe we'll be able to extract these common functions under mettautils after we get monorepo?

Agreed.

Comment thread configs/index_s3_maps.yaml Outdated
Comment thread configs/index_s3_maps.yaml Outdated
Comment thread docs/mapgen.md Outdated
Comment thread tools/mapgen.py Outdated
@berekuk berekuk requested a review from sasmith April 22, 2025 13:50
Comment thread docs/mapgen.md Outdated
Preview a random map:

```bash
python -m tools.mapgen ./configs/game/map_builder/load_random.yaml --overrides='dir=s3://BUCKET/DIR'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing, since nothing in this command would make me expect I'm going to view a map.

For now, let's separate mapgen commands from view commands, since it's easier to reason about.


logger = logging.getLogger(__name__)

ascii_symbols = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay for this diff, but let's fast-follow with consolidating these into mettagrid.yaml. E.g.,

  objects:
    altar:
      hp: 30
      input_battery: 3
      output_heart: 1
      max_output: 5
      conversion_ticks: 1
      cooldown: ${sampling:1, 20, 10}
      initial_items: 1
      ascii_symbol: a
...

Copy link
Copy Markdown
Contributor Author

@berekuk berekuk Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several quick thoughts:

  • ascii is not a good format to store maps anyway (see all the exceptions in grid_object_to_ascii function in this file; no support for colored mines and agents)
  • DCSS maps have metadata on symbols in maps, SUBST: 1=agent.prey, 2=agent.predator, that might be interesting to try
  • or just dump the full CSV with long names? sacrifice readability, but more robust
  • generally, I really want to unify all our serialization formats and/or have tools for converting between them, I know we have at least four by now (this ascii, another completely different ascii in mettagrid.config.room.ascii, full names, int type ids...) - this is something I'm keeping in mind as high priority

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah. We're with you on not using ascii in the long run, and were considering asking you to rip it out -- but it's still being used by folks.

Comment thread tools/mapgen.py Outdated
ShowMode = Literal["raylib", "ascii", "ascii_border"]


def show_map(storable_map: StorableMap, mode: ShowMode | None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the comment on the readme, can we split this into a separate entry point (mapview.py / mapshow / whatever). Maybe with a subdir, so we get

python -m tools.map.gen and python -m tools.map.view.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; tools.map.gen can still do everything it could before, because "gen-and-view" is convenient. tools.map.view supports the same heuristics for dir vs file and reuses "load random" code for dirs.

Comment thread tools/mapgen.py Outdated

def main():
parser = argparse.ArgumentParser()
parser.add_argument("--output-dir", type=str, help="Output directory, e.g. ./maps or s3://.../dir")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we figure out how to reasonably collapse --output-dir and --output-uri? Sorry for the back and forth, and for at least some level of inconsistency on my part. Dave and I chatted, and where we've landed is

  1. no output-dir, just output-uri.
  2. output-uri is optional. If not provided, then assert count is 1 and dump to stdout.
  3. If output-uri has a scheme, it should be s3 or file. If it's not provided, assume it's file.
  4. If output-uri ends with a file suffix (let's say . and <= 4 characters? Seems like .yaml matters the most, and .txt / .map / similar also seem reasonable), then assume it's the full target, assert count is 1, and write the output there.
  5. Otherwise, assume output-uri is a directory (/ prefix)
    --output-uri can be

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like these heuristics are going to backfire for someone eventually, but ok :) Changed according to your spec. (I don't have any better ideas...)

Note: I realize that map.gen.view could be a bit more clever and test "is this a dir?" empirically instead of relying on heuristics. I did it with heuristics for now (same uri_is_file function in both scripts), because all combinations of file/dir and local/s3 would be annoying to handle; on s3 I think it's possible that name is both a dir and a file, because dirs don't really exist. And also because I expect that if heuristics will misfire then tools.map.gen will break too.

from mettagrid.map.scene import Scene, TypedChild


class RemoveAgents(Scene):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's ship this for now. Wanting to update pre-existing maps makes sense. Depending on how much that happens we may want a generic set of "update map" utilities, but this seems like a good tool to figure out what we'll want.

Comment thread docs/mapgen.md
@berekuk
Copy link
Copy Markdown
Contributor Author

berekuk commented Apr 22, 2025

Addressed everything + added CLI tests.

@berekuk berekuk requested a review from sasmith April 22, 2025 22:24
Copy link
Copy Markdown
Contributor

@sasmith sasmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving and merging, and I'll also bring this to the metta repo.

@sasmith sasmith dismissed daveey’s stale review April 23, 2025 07:30

We've discussed and are ready to merge.

@sasmith sasmith merged commit 851931e into main Apr 23, 2025
1 check passed
@berekuk berekuk mentioned this pull request Apr 23, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants