Skip to content

Created metrication for inter-proclet communication#1

Open
Bokai-Bi wants to merge 31 commits intoNu-NSDI23:mainfrom
Bokai-Bi:pull-to-original
Open

Created metrication for inter-proclet communication#1
Bokai-Bi wants to merge 31 commits intoNu-NSDI23:mainfrom
Bokai-Bi:pull-to-original

Conversation

@Bokai-Bi
Copy link

  • Added instrumentation for inter-proclet communication. For local communication, each proclet logs the total amount of function calls. For remote communication, each proclet logs the total amount of calls and the total size of data transferred on a per-target-machine basis. Data about local communication is stored in a Counter in the caller's header while data about remote communication is stored in a std::unordered_map also in the caller's header synchronized by its spin_lock. No data is logged on the callee's side.
  • Added a new benchmark bench/bench_proclet_logging, which benchmarks the performance of remote proclet communication. Only works when the main server and remote server are started under specific IPs specified in code.


caller_header->spin_lock.lock();

auto target_kvpair = caller_header->remote_call_map.find(target_ip);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the key is the "target_ip" not simply the "callee_id"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For locality improvements we are interested in the total aggregate communication from a proclet to all other machines, using the ip of the callee allows us to collect data on a per-target-machine basis. Using callee_id as key will not be optimal since proclets can migrate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually feel that using the callee_id might be better as it tells us whether should be colocate certain proclet pairs to improve locality. Using the per-machine metric hides all these details.

ProcletSlabGuard slab_guard(&caller_header->slab);
NodeIP target_ip = get_runtime()->rpc_client_mgr()->get_ip_by_proclet_id(id);

caller_header->spin_lock.lock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will limit the invocation tput of the caller proclet to roughly 1MOPS

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that the case? (out of interest)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because of the call to get_ip_by_proclet_id or the locking? The locking can theoretically be removed by replacing the map with an array that can be unsafely modified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the locking. Yeah making it lockless would be better.

Copy link
Contributor

@zainryan zainryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Bokai, it looks functionally correct! I have some comments wrt to the metric and synchronization, please look at the embedded comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants