diff --git a/graphs/images/command_graph-state.svg b/graphs/images/command_graph-state.svg
new file mode 100644
index 0000000..f3ed6a1
--- /dev/null
+++ b/graphs/images/command_graph-state.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="321px" height="81px" viewBox="-0.5 -0.5 321 81" content="&lt;mxfile host=&quot;Electron&quot; modified=&quot;2022-09-27T13:11:49.328Z&quot; agent=&quot;5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/20.3.0 Chrome/104.0.5112.114 Electron/20.1.3 Safari/537.36&quot; etag=&quot;UhkI7qaPpP-mJElBKW2Z&quot; version=&quot;20.3.0&quot; type=&quot;device&quot;&gt;&lt;diagram id=&quot;U5Sqd6-8AIvC4OhiIFR2&quot; name=&quot;Page-1&quot;&gt;zVVNk5NAEP01HNcKwxqTo8km60Gr1JRlPG2N0MAo0DgMCfjrbZieAOKm3MvWnmb69cfMvH40XrDNm3sty/QDRpB5YhE1XnDnCeGv1wtaOqRlRAQriyRaRYwNwEH9BgY5MalVBNUk0CBmRpVTMMSigNBMMKk1nqdhMWbTU0uZwAw4hDKbo19VZFKLrsSbAX8HKkndyf5ybT25dMH8kiqVEZ5HULDzgq1GNHaXN1vIOvYcLzZv/4j3cjENhfmfBEj2x/ph9enhp6/begPi85fy5tZWOcms5gd7YplRvU2MVJZubVqmYvmrRue4qfpGvaUAf1k2g5N2SbfuVSGzLoar0bVsQetmRi61BUTEOpuoTYoJUoHdgG401kUE3VsWZA0x7xFLAn0Cf4AxLUtI1gYJSk2esRcaZY6j/beu1KvXbN01XLk3WmcURrfHsTHK6swhrbdc3rw33K4Kax3ClYY4jUudgLkSJ2xcx9voAO78PWAOdB8K4O/OiVBDJo06TcUt+RtJLmmXSh9R9TJwIRjHFd1rpDPajA4coF59T1CiP1MiTRMVK/md2v+3XM6pMnAoZU/lmUbPtNWyKu0siFXTSebxfpxAG2iuMsjeFTPIhN6yeR4NBYbS0Txw2L8onjD4VLrEjK5dA2FtXgpdgXguvsgchqhV4/AvCnZ/AA==&lt;/diagram&gt;&lt;/mxfile&gt;"><defs/><g><path d="M 80 40 L 233.63 40" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 238.88 40 L 231.88 43.5 L 233.63 40 L 231.88 36.5 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" style="overflow: visible; text-align: left;"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 20px; margin-left: 160px;"><div data-drawio-colors="color: rgb(0, 0, 
0); background-color: rgb(255, 255, 255); " style="box-sizing: border-box; font-size: 0px; text-align: center;"><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;"><font style="font-size: 16px">Finalize</font></div></div></div></foreignObject><text x="160" y="23" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="11px" text-anchor="middle">Finalize</text></switch></g><rect x="0" y="0" width="80" height="80" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" style="overflow: visible; text-align: left;"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 40px; margin-left: 1px;"><div data-drawio-colors="color: rgb(0, 0, 0); " style="box-sizing: border-box; font-size: 0px; text-align: center;"><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Modifiable</div></div></div></foreignObject><text x="40" y="44" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Modifiable</text></switch></g><rect x="240" y="0" width="80" height="80" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility" style="overflow: visible; text-align: left;"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; 
padding-top: 40px; margin-left: 241px;"><div data-drawio-colors="color: rgb(0, 0, 0); " style="box-sizing: border-box; font-size: 0px; text-align: center;"><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Executable</div></div></div></foreignObject><text x="280" y="44" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Executable</text></switch></g></g><switch><g requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"/><a transform="translate(0,-5)" xlink:href="https://www.diagrams.net/doc/faq/svg-export-text-problems" target="_blank"><text text-anchor="middle" font-size="10px" x="50%" y="100%">Text is not SVG - cannot display</text></a></switch></svg>
\ No newline at end of file
diff --git a/graphs/images/queue-state.svg b/graphs/images/queue-state.svg
new file mode 100644
index 0000000..d51956d
--- /dev/null
+++ b/graphs/images/queue-state.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="261px" height="85px" viewBox="-0.5 -0.5 261 85" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2022-02-23T12:06:49.025Z&quot; agent=&quot;5.0 (Windows)&quot; etag=&quot;O6wDKUc8lfXwM-PAlDJA&quot; version=&quot;16.6.2&quot; type=&quot;device&quot;&gt;&lt;diagram id=&quot;08eg8oAtyddRzy_uZ1ND&quot; name=&quot;Page-1&quot;&gt;xVbbcpswEP0aHtPBUN8e40ubB2cmrTtT51GGNSiRtVQIG/r1lUACVF8madrmxdYe7Uqr3bNn8ML5vvwsSJbeYwzMC/y49MKFFwTjSaB+NVA1QDg1QCJo3ECDDljTn2BA36AFjSF3HCUikzRzwQg5h0g6GBECj67bDpl7a0YSOAHWEWGn6Hcay7RBJ8G4w++AJqm9eTCaNjt7Yp3NS/KUxHjsQeHSC+cCUTarfTkHpmtn69LEfbqw2yYmgMuXBCwev02/PJPV6m53PxTD1eZp8nATNqccCCvMg71gxNR5s7TOWVamEKMfhU50tkMub/K6TbfKYeBnZbepVon+n0FCudr9ChGKmPLEHroV1sUiKl99kwXdSwOIVTOMiUKmmCAnbNmhM4EFj0E/0VdW57NCzBQ4UOATSFkZZpFCon6c3DOzCyWVm976UR/1IRgac1Gao2ujsgaXotr0jX6Ytru42rKBjGyBzUj0nNSJz5GhUFscObTv14++2GMD5ViICK401s4KEQnIK35By0Q1wYB7UNmquMqdQgGMSHpw0yJmtJI2zJx0KwSpeg4ZUi5zl5wPGuv54G6Xq0R7PmrRS6mDalq/guKDE4ovS4gKWdPyN74dUyphnZG6tEclaC5XSJ41GrOjpeacadgBhITyestOS2wDfFNjU/KRMY89tTFQ2hMai51rglPB15br40VF0KP/55rQSoAFljw+JxEq6e0ZkWguf0eZsBM/cCd+/LaJt+rjO+ozvq4+f1EmghfKxAUOv0km/qcIBCes7hHv/UWgHed/LgLK7D45mup2323h8hc=&lt;/diagram&gt;&lt;/mxfile&gt;"><defs/><g><path d="M 80 24 L 173.63 24" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 178.88 24 L 171.88 27.5 L 173.63 24 L 171.88 20.5 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 14px; margin-left: 130px;"><div 
style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: nowrap;"><h2 style="font-size: 10px">Begin Recording<br /></h2></div></div></div></foreignObject><text x="130" y="17" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="11px" text-anchor="middle">Begin Recording&#xa;</text></switch></g><rect x="0" y="4" width="80" height="80" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 44px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Executing</div></div></div></foreignObject><text x="40" y="48" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Executing</text></switch></g><path d="M 180 64 L 86.37 64" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 81.12 64 L 88.12 60.5 L 86.37 64 L 88.12 67.5 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div 
xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 74px; margin-left: 130px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: nowrap;"><font style="font-size: 10px"><b>End Recording</b></font></div></div></div></foreignObject><text x="130" y="77" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="11px" text-anchor="middle">End Recording</text></switch></g><rect x="180" y="4" width="80" height="80" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 44px; margin-left: 181px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Recording</div></div></div></foreignObject><text x="220" y="48" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Recording</text></switch></g></g><switch><g requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"/><a transform="translate(0,-5)" xlink:href="https://www.diagrams.net/doc/faq/svg-export-text-problems" target="_blank"><text text-anchor="middle" font-size="10px" x="50%" y="100%">Text is not SVG - cannot 
display</text></a></switch></svg>
\ No newline at end of file
diff --git a/graphs/index.md b/graphs/index.md
new file mode 100644
index 0000000..ed64df9
--- /dev/null
+++ b/graphs/index.md
@@ -0,0 +1,706 @@
+# SYCL_EXT_CODEPLAY_GRAPHS
+
+| Proposal ID | CP030 |
+|-------------|--------|
+| Name | SYCL_EXT_CODEPLAY_GRAPHS |
+| Date of Creation | 08 February 2022 |
+| Last Update | 13 September 2022 |
+| Version | v1.0 |
+| Target | SYCL 2020 vendor extension |
+| Current Status | _Work In Progress_ |
+| Implemented in | ComputeCpp prototype |
+| Reply-to | Ewan Crawford <ewan@codeplay.com> |
+| Original authors | Ewan Crawford <ewan@codeplay.com>, Duncan McBain <duncan@codeplay.com>, Ben Tracy <ben.tracy@codeplay.com> |
+| Contributors | Ewan Crawford <ewan@codeplay.com>, Duncan McBain <duncan@codeplay.com>, Ben Tracy <ben.tracy@codeplay.com>, Peter Žužek <peter@codeplay.com>,  Ruyman Reyes <ruyman@codeplay.com>, Gordon Brown <gordon@codeplay.com>, Erik Tomusk <erik@codeplay.com>,  Bjoern Knafla <bjoern@codeplay.com>,  Lukas Sommer <lukas.sommer@codeplay.com> |
+
+Paragraphs written in _italics_ are used for non-normative text adding editorial
+comments.
+
+## Motivation
+
+### Introduction
+
+Through the use of command groups, SYCL is already able to create a DAG of kernel
+execution at runtime, as a command group object defines a set of requisites
+(edges) which must be satisfied for kernels (nodes) to be executed. However,
+command-group submission is tied to execution on the queue, with no separate
+construction step before execution starts. Because the runtime does not know
+the complete dependency graph ahead of execution, optimization opportunities
+are missed.
+
+The following benefits would become available if the user could define a
+dependency graph to the SYCL runtime prior to execution:
+
+* Reduction in runtime overhead by only submitting a single graph object, rather
+  than many individual commands.
+
+* Enable more work to be done offline. In particular, producing a graph ahead of
+  time improves performance at runtime by reducing overhead.
+
+* Unlock DMA hardware features through graph analysis by the runtime.
+
+* Whole graph optimizations become available, including but not limited to:
+    * Kernel fusion/fission.
+    * Inter-node memory reuse from data staying resident on device.
+    * Identification of the peak intermediate output memory requirement, used
+      for more optimal memory allocation.
+
+As well as benefits to the SYCL runtime, there are also advantages to the user
+developing SYCL applications, as repetitive workloads no longer have to
+redundantly issue the same sequence of commands. Instead, a graph is only
+constructed once and submitted for execution as many times as is necessary, only
+changing the data in input buffers or USM allocations. For machine learning
+applications where the same command group pattern is run repeatedly for
+different inputs, this is particularly useful.
+
+### Vision
+
+This extension currently takes the form of a Codeplay vendor extension to
+make it easier to prototype and distribute an implementation for feedback.
+However, the long term goal is collaboration towards a KHR extension or
+integration into SYCL Next.
+
+Despite being a vendor extension the intention is that SYCL graphs should be
+implementable on multiple backends as enabled by SYCL 2020. Although in the
+short term Codeplay's prototype ComputeCpp implementation primarily targets an
+OpenCL backend with the [command-buffer][opencl-command-buffers] and
+[layered mutable dispatch][opencl-mutable-dispatch] extensions, we are keen not
+to encode any backend assumptions in this SYCL extension.
+
+### Requirements
+
+In order to achieve the goals described in previous sections, the following
+requirements were considered:
+
+1. Ability to update inputs/outputs of the graph between submissions, without
+   changing the overall graph structure.
+2. Enable low effort porting of existing applications to use the extension.
+3. Profiling, debugging, and tracing functionality at the granularity of graph
+   nodes.
+4. Integrate sub-graphs (previously constructed graphs) when constructing a new
+   graph.
+5. Support the USM model of memory as well as buffer model.
+6. Compatible with other SYCL extensions and features, e.g. kernel fusion and
+   built-in kernels.
+7. Ability to record a graph with commands submitted to different devices in the
+   same context.
+8. A graph constructed using a device queue may be executed on another compatible
+   queue.
+9. Capability to serialize graphs to a binary format which can then be
+   de-serialized and executed. This is helpful for offline cases where a graph
+   can be created by an offline tool to be loaded and run without the end-user
+   incurring the overheads of graph creation.
+10. Backend interoperability: the ability to retrieve a native graph object from
+    the graph and use that in a native backend API.
+
+To allow prototype implementations of this extension to be developed
+quickly for evaluation, the scope of this proposal was limited to a subset
+of these requirements. In particular, the serialization functionality (9),
+backend interoperability (10), and a profiling/debugging interface (3) were
+omitted, as these are not easy to abstract over a number of backends without
+significant investigation. It is also hoped these features can be exposed as
+additive changes to the API, and so be introduced in future versions of the
+extension.
+
+Another reason for deferring a serialize/deserialize API (9) is that its scope
+could extend from emitting the graph in a binary format, to emitting a
+standardized IR format that enables further device specific graph optimizations.
+
+As described in [Design Discussion](#no-explicit-graph-api) this proposal
+defines an implicit recording design for constructing the graph rather than an
+explicit graph construction API. It is envisioned that an explicit API could
+better express several of the requirements above, in particular the ability
+for a user to specify exactly what device they'd like a node to run on when
+constructing multiple device graphs (7). See the section on
+[single queue submit](#single-queue-submit) for more discussion on the multiple
+device use case.
+
+## Extension
+
+### SYCL Graph Definition
+
+A SYCL graph is a collection of nodes and edges. From the SYCL perspective, this
+graph will be acyclic and directed (DAG) as users cannot express a cycle in the API.
+
+#### Node
+
+Each command group submission of the program defines a node in a SYCL graph.
+Each submission encompasses either one or both of a.) some data movement,
+b.) a single asynchronous kernel launch. Nodes cannot define forward edges, only
+backward edges (i.e. kernels can only create dependencies on things that have
+already happened). This means that a node can transparently depend on a previously
+recorded graph (sub-graph), which works by creating edges to the individual nodes
+in the old graph. Explicit memory operations without kernels, such as a memory
+copy, are still classed as nodes under this definition, as the [SYCL 2020
+specification states][explicit-memory-ops] that these can be seen as specialized
+kernels executing on the device.
+
+#### Edge
+
+An edge in a SYCL graph represents a data dependency between two nodes. These
+dependencies are expressed by the user code through buffer accessors. There is
+also a partial ability to track USM data dependencies, provided the pointers
+used in the graph nodes are the same. The limitation is that a node taking
+an offset USM pointer input will not be identified as having an edge to
+another node taking a pointer input to the base address of the same USM
+allocation.
+
+### API Modifications
+
+```cpp
+namespace sycl {
+namespace ext::codeplay {
+
+// State of a queue, returned by info::queue::state
+enum class queue_state {
+  executing,
+  recording
+};
+
+// State of a graph
+enum class graph_state {
+  modifiable,
+  executable
+};
+
+// New object representing graph
+template<graph_state State = graph_state::modifiable>
+class command_graph {
+public:
+  /* Available only when: (State == graph_state::modifiable) */
+  command_graph(const property_list &propList = {});
+  /* Available only when: (State == graph_state::modifiable) */
+  command_graph<graph_state::executable> finalize(context &syclContext) const;
+  /* Available only when: (State == graph_state::executable) */
+  void update(const command_graph<graph_state::modifiable> &graph);
+};
+}  // namespace ext::codeplay
+
+// New methods added to the sycl::queue class
+class queue {
+public:
+  bool begin_recording(command_graph<graph_state::modifiable> &graph);
+  bool end_recording();
+  event submit(command_graph<graph_state::executable> graph);
+};
+}  // namespace sycl
+```
+
+#### Feature Test Macro
+
+This extension defines the [feature test macro][feature test macro]
+`SYCL_EXT_CODEPLAY_GRAPHS` which will be set to one of the following values
+depending on the version of the specification supported.
+
+| Value | Description            |
+|-------|------------------------|
+| 1     | First released version |
+
+#### New Graph Class
+
+This extension adds a new `command_graph` object which follows the
+[common reference semantics][common reference semantics] of other SYCL runtime
+objects.
+
+The `command_graph` object represents a collection of
+[command-group nodes](#node) and [their execution dependencies](#edge). A
+graph is built up by recording queue submissions, then once the user is happy
+that the graph is complete, the graph instance is finalized into an executable
+variant which can have no more nodes added to it. Finalization may be a
+computationally expensive operation as the runtime is able to perform
+optimizations based on the graph structure. After finalization the graph can be
+submitted for execution on a queue one or more times with reduced overhead.
+
+It is possible to submit an executable graph to a queue which is in the
+recording state. This will cause the graph being submitted to be added as a node
+in the graph being recorded to by the queue. This graph node is referred to as a
+*sub-graph*.
+
+##### Graph State
+
+An instance of a `command_graph` object can be in one of two states:
+
+* Modifiable - Graph is under construction and new nodes may be added to it.
+* Executable - Graph topology is fixed after finalization and graph is ready to
+  be submitted for execution.
+
+A `command_graph` object is constructed in the *modifiable* state and is made
+*executable* by the user invoking `command_graph::finalize()` to create a
+new executable instance of the graph. An *executable* graph cannot be converted
+to a *modifiable* graph. After finalizing a graph in the
+*modifiable* state it is valid for a user to add additional nodes and finalize
+again to create subsequent *executable* graphs. The state of a `command_graph` object
+is made explicit by templating on state to make the class strongly typed, with
+the default template argument being `graph_state::modifiable` to reduce code
+verbosity on construction.
+
+| ![](images/command_graph-state.svg) |
+|:--:|
+| <center> **Graph State Diagram** </center> |
+
+##### Graph Update
+
+A graph in the *executable* state can have each node's inputs and outputs updated
+using the `command_graph::update()` method. This takes a graph in the
+*modifiable* state and updates the executable graph to use the node inputs and
+outputs of the modifiable graph. The modifiable graph must have the same
+topology as the graph originally used to create the executable graph, with the
+nodes added in the same order.
+
+##### Graph Member Functions
+
+```cpp
+command_graph<graph_state::modifiable>::command_graph(const property_list &propList = {});
+```
+
+Creates a SYCL `command_graph` object in the *modifiable* state.
+Zero or more properties can be provided to the constructed SYCL `command_graph`
+via an instance of `property_list`.
+
+Preconditions:
+
+* This constructor is only available when the `command_graph` state is
+  `graph_state::modifiable`.
+
+Parameters:
+
+* `propList` - Optional parameter for passing properties. No new properties are
+  defined by this extension.
+
+```cpp
+command_graph<graph_state::executable> command_graph<graph_state::modifiable>::finalize(context &syclContext) const;
+```
+Synchronous operation that creates a graph in the *executable* state with a
+fixed topology that can be submitted for execution on any queue sharing the supplied context.
+It is valid to call this method multiple times to create subsequent executable graphs. It is also
+valid to continue to add new nodes to the modifiable graph instance after calling this
+function. It is valid to finalize an empty graph instance with no recorded
+commands.
+
+Preconditions:
+
+* This member function is only available when the `command_graph` state is
+  `graph_state::modifiable`.
+
+Parameters:
+
+* `syclContext` - The context associated with the queues to which the executable graph can be submitted.
+
+Returns: An executable graph object which can be submitted to a queue.
+
+```cpp
+void command_graph<graph_state::executable>::update(const command_graph<graph_state::modifiable> &graph);
+```
+
+Updates the executable graph node inputs & outputs from a topologically
+identical modifiable graph. The effects of the update will be visible
+on the next submission of the executable graph without the need for additional
+user synchronization.
+
+Parameters:
+
+* `graph` - Modifiable graph object to update graph node inputs & outputs with.
+  This graph must have the same topology as the original graph used on
+  executable graph creation.
+
+Preconditions:
+
+* This member function is only available when the `command_graph` state is
+  `graph_state::executable`.
+
+Exceptions:
+
+* Throws synchronously with error code `invalid` if the topology of `graph` is
+  not the same as the existing graph topology, or if the nodes were not added in
+  the same order.
+
+#### Queue Class Modifications
+
+This extension modifies the [SYCL queue class][queue class] such that
+[state](#queue-state) is introduced to queue objects, allowing an instance to be
+put into a mode where command-groups are recorded to a graph rather than
+submitted immediately for execution.
+
+[Three new member functions](#new-queue-member-functions) are also added to the
+`sycl::queue` class with this extension: two functions for selecting the state
+of the queue, and another function for submitting a graph to the queue.
+
+##### Queue State
+
+The `sycl::queue` object can be in either of two states. The default
+`queue_state::executing` state is where the queue has its normal semantics of
+submitted command-groups being immediately scheduled for asynchronous execution.
+
+The alternative `queue_state::recording` state is used for graph construction.
+Instead of being scheduled for execution, command-groups submitted to the queue
+are recorded to a graph object as new nodes for each submission. After recording
+has finished and the queue returns to the executing state, the recorded commands are
+not then executed; they are transparent to any following queue operations.
+
+| ![Queue State](images/queue-state.svg) |
+|:--:|
+| <center> **Queue State Diagram** </center> |
+
+The state of a queue can be queried with `queue::get_info` using template parameter
+`info::queue::state`. The following entry is added to the
+[queue info table][queue info table] to define this query:
+
+| Queue Descriptors    | Return Type                  | Description                    |
+|----------------------|------------------------------|--------------------------------|
+| `info::queue::state` | `ext::codeplay::queue_state` | Returns the state of the queue |
+
+A default constructed event is returned when a user submits a command-group to
+a queue in the recording state. These events have status
+`info::event_command_status::complete`, and waiting on them returns
+immediately.
+
+##### Queue Properties
+
+There are [two properties][queue properties] defined by the core SYCL
+specification that can be passed to a `sycl::queue` on construction via the
+property list parameter. They interact with this extension in the following
+ways:
+
+1. `property::queue::in_order` - When a queue is created with the in-order
+   property, recording its operations results in a straight-line graph, as each
+   operation has an implicit dependency on the previous operation. However,
+   a graph submitted to an in-order queue will keep its existing structure such
+   that the complete graph executes in-order with respect to the other
+   command-groups submitted to the queue.
+
+2. `property::queue::enable_profiling` - This property has no effect on graph
+   recording. However, when set on the queue that a graph is submitted to, it
+   allows profiling information to be obtained from the event returned by the
+   graph submission.
+
+For any other queue property that is defined by an extension, it is the
+responsibility of the extension to define the relationship between that queue
+property and this graph extension.
+
+##### New Queue Member Functions
+
+```cpp
+bool queue::begin_recording(ext::codeplay::command_graph<graph_state::modifiable> &graph)
+```
+
+Synchronously changes the state of the queue to the `queue_state::recording`
+state.
+
+Parameters:
+
+* `graph` - Graph object to start recording commands to.
+
+_The `command_graph` object is passed by reference in this entry point,
+rather than by value, to make it clear that it is not correct behavior
+for a graph instance to be destroyed while a queue is recording commands
+to that graph._
+
+Returns: `true` if the queue was previously in the `queue_state::executing`
+state, `false` otherwise.
+
+Exceptions:
+
+* Throws synchronously with error code `invalid` if the queue is already
+  recording to a different graph.
+
+```cpp
+bool queue::end_recording()
+```
+
+Synchronously changes the state of the queue to the `queue_state::executing`
+state.
+
+Returns: `true` if the queue was previously in the `queue_state::recording`
+state, `false` otherwise.
+
+```cpp
+event queue::submit(ext::codeplay::command_graph<graph_state::executable> graph)
+```
+
+When invoked with the queue in the `queue_state::recording` state, a graph is
+added as a sub-graph node. When invoked with the queue in the default
+`queue_state::executing` state, the graph is submitted for execution. There are
+no guarantees that more than one instance of `graph` will execute concurrently.
+Submitting a graph for execution before a previous execution has completed
+may result in serialized execution, depending on the SYCL backend and
+characteristics of the graph.
+
+Parameters:
+
+* `graph` - Executable graph object to submit for execution, or to add as a sub-graph node when the queue is recording.
+
+When the queue is in the executing state, an `event` object used to synchronize
+graph submission with other command-groups is returned. Otherwise the queue is
+in the recording state, and a default event is returned.
+
+_The `command_graph` object is passed by value in this entry point,
+rather than by reference, to support the use case where a graph instance is
+submitted to a queue and the graph is then destroyed. Copying the graph allows
+the queue to still be able to execute the graph submission._
+
+#### Error Handling
+
+Errors are reported through exceptions, as usual in the SYCL API. For new APIs,
+submitting a graph for execution can generate unspecified asynchronous errors,
+while `command_graph::finalize()` may throw unspecified synchronous exceptions.
+Synchronous exception error codes are defined for both
+`queue::begin_recording()` and `command_graph::update()`.
+
+When a queue is in recording mode asynchronous exceptions will not be
+generated, as no device execution is occurring. Synchronous errors specified as
+being thrown in the default queue executing state will still be thrown when a
+queue is in the recording state.
+
+The `queue::begin_recording` and `queue::end_recording` entry-points return a
+`bool` value informing the user whether a state change occurred. False is
+returned rather than throwing an exception when the state isn't changed. This
+design was chosen because the queue is already in the state the user desires, so if the
+function threw an exception in this case, the application would likely swallow
+it and then proceed.
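+
+The return-value convention can be sketched with a toy model in plain C++. The
+`toy_state` and `toy_queue` names below are hypothetical stand-ins used only for
+illustration; they are not part of this extension or of SYCL, and the sketch
+assumes nothing beyond the two states and the `bool` results described above.
+
+```cpp
+#include <cassert>
+
+// Hypothetical stand-in for the two queue states; not the SYCL API.
+enum class toy_state { executing, recording };
+
+// Hypothetical single-threaded model of the state transitions only.
+class toy_queue {
+public:
+  // Returns true only if this call moved the queue into the recording state.
+  bool begin_recording() {
+    if (state_ == toy_state::recording) {
+      return false; // already recording: report "no change", do not throw
+    }
+    state_ = toy_state::recording;
+    return true;
+  }
+
+  // Returns true only if this call moved the queue back to the executing state.
+  bool end_recording() {
+    if (state_ == toy_state::executing) {
+      return false; // already executing: report "no change", do not throw
+    }
+    state_ = toy_state::executing;
+    return true;
+  }
+
+  toy_state state() const { return state_; }
+
+private:
+  toy_state state_ = toy_state::executing; // default state
+};
+
+// Exercises each transition once, plus the two "no change" cases.
+inline void demo() {
+  toy_queue q;
+  assert(q.begin_recording());  // executing -> recording
+  assert(!q.begin_recording()); // no state change
+  assert(q.end_recording());    // recording -> executing
+  assert(!q.end_recording());   // no state change
+}
+```
+
+With this convention, a caller that only needs the queue to be recording can
+ignore a `false` result, while a caller that must know whether it started the
+recording can branch on the result without a try/catch block.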
+
+#### Thread Safety in new functionality
+
+The new functions in this extension are thread-safe, as are the member
+functions of classes in the base SYCL specification. If user code does
+not synchronize two threads accessing the same queue,
+there is no strong ordering between events on that queue, and kernel
+submissions, recording, and finalization will happen in an undefined order.
+
+In particular, when one thread ends recording on a queue while another
+thread is submitting work, which kernels will be part of the subsequent
+graph is undefined. If user code enforces a total order on the queue
+events, then the behavior is well-defined and will match the observable
+total order.
+
+The value returned by the `info::queue::state` query should be considered
+immediately stale in multi-threaded usage, as another thread could have
+changed the state of the queue in the meantime.
+
+#### Storage lifetimes in graph submissions
+
+The lifetime of any buffer recorded as part of a submission
+to a command graph will be extended in keeping with the common reference
+semantics and buffer synchronization rules in the SYCL specification. It will be
+extended either for the lifetime of the graph (including both modifiable graphs
+and the executable graphs created from them) or until the buffer is no longer
+required by the graph (such as after being replaced through whole graph update).
+
+#### Host tasks
+
+A [host task][host task] is a native C++ callable, scheduled according to SYCL
+dependency rules. It is valid to record a host task as part of a graph, though
+it may lead to sub-optimal graph performance because a host task node prevents
+the SYCL runtime from submitting the whole graph to the device at once.
+
+Host tasks can be updated as part of [whole graph update](#whole-graph-update)
+by replacing the whole node with the new callable. In a future explicit graph
+building API we envisage the updating of individual inputs/outputs of
+a host task node to be disallowed, as swapping out lambda captures is not
+possible in C++. Instead the whole host task callable would be replaced.
+
+### Example Usage
+
+The following snippet of code shows how a SYCL `queue` can be put into a
+recording state, which allows a `command_graph` object to be populated
+by the command-groups submitted to the queue. Once the graph is complete,
+recording finishes on the queue to put it back into the default executing
+state. The graph is then finalized so that no more nodes can be added. Lastly,
+the graph is submitted as a whole for execution via
+`queue::submit(command_graph<graph_state::executable>)`.
+
+```cpp
+  queue q{default_selector{}};
+
+  // New object representing graph of command-groups
+  ext::codeplay::command_graph<graph_state::modifiable> graph;
+  {
+    buffer<T> bufferA{dataA.data(), range<1>{elements}};
+    buffer<T> bufferB{dataB.data(), range<1>{elements}};
+    buffer<T> bufferC{dataC.data(), range<1>{elements}};
+
+    // `q` will be put in the recording state where commands are recorded to
+    // `graph` rather than submitted for execution immediately.
+    q.begin_recording(graph);
+
+    // Record commands to `graph` with the following topology.
+    //
+    //      increment_kernel
+    //       /         \
+    //   A->/        A->\
+    //     /             \
+    //   add_kernel  subtract_kernel
+    //     \             /
+    //   B->\        C->/
+    //       \         /
+    //     decrement_kernel
+
+    q.submit([&](handler &cgh) {
+      auto pData = bufferA.get_access<access::mode::read_write>(cgh);
+      cgh.parallel_for<increment_kernel>(range<1>(elements),
+                                         [=](item<1> id) { pData[id]++; });
+    });
+
+    q.submit([&](handler &cgh) {
+      auto pData1 = bufferA.get_access<access::mode::read>(cgh);
+      auto pData2 = bufferB.get_access<access::mode::read_write>(cgh);
+      cgh.parallel_for<add_kernel>(range<1>(elements),
+                                   [=](item<1> id) { pData2[id] += pData1[id]; });
+    });
+
+    q.submit([&](handler &cgh) {
+      auto pData1 = bufferA.get_access<access::mode::read>(cgh);
+      auto pData2 = bufferC.get_access<access::mode::read_write>(cgh);
+      cgh.parallel_for<subtract_kernel>(
+          range<1>(elements), [=](item<1> id) { pData2[id] -= pData1[id]; });
+    });
+
+    q.submit([&](handler &cgh) {
+      auto pData1 = bufferB.get_access<access::mode::read_write>(cgh);
+      auto pData2 = bufferC.get_access<access::mode::read_write>(cgh);
+      cgh.parallel_for<decrement_kernel>(range<1>(elements), [=](item<1> id) {
+        pData1[id]--;
+        pData2[id]--;
+      });
+    });
+
+    // `q` will be returned to the executing state where commands are
+    // submitted immediately for execution.
+    q.end_recording();
+  }
+
+  // Finalize the modifiable graph to create an executable graph that can be
+  // submitted for execution.
+  ext::codeplay::command_graph<graph_state::executable> exec_graph = graph.finalize(q.get_context());
+
+  // Execute graph
+  q.submit(exec_graph);
+```
+
+### Design Discussion
+
+#### No Explicit Graph API
+
+The current proposal focuses on an API which implicitly creates a graph by
+recording commands, rather than an explicit graph creation API. The main
+benefits of this design are that users can switch their code to use SYCL graphs
+with minimal changes, and that it places less pressure on initial
+implementations.
+
+It is envisioned that in the future an explicit graph API will be
+additionally included in the extension and both interfaces would be available to
+the user, as with [CUDA Graphs][cuda-graphs]. The design of this explicit API
+will be informed by early implementations of the current proposal. This is an
+area we are looking to collaborate with other vendors on towards a KHR extension
+or SYCL Next.
+
+#### Single Queue Submit
+
+This extension allows a graph to be recorded from multiple different queues,
+but only submitted to a single queue. The device associated with a queue is not
+captured by the graph recording, only the commands submitted and their
+dependencies. The device used for graph execution is the device associated with
+the queue the graph is submitted to, and this device will execute the complete
+graph regardless of how the graph was composed from different queues during
+recording.
+
+The single queue submission design allows reuse of the normal SYCL
+`queue::submit()` mechanism, which is also a natural way to express sub-graphs.
+An alternative approach of calling a `command_graph::run()` method on the graph
+object itself was proposed, but this made sub-graph capture more difficult and
+is inconsistent with the normal SYCL execution model.
+
+Users who would like an execution graph to contain nodes which execute on
+different devices in their platform can construct a separate executable graph
+object, representing a sub-graph, for each device they'd like to target. The
+user must then manually create and schedule the larger graph by using events as
+dependencies to connect the individual device sub-graphs as they are submitted.
+This approach loses the separation of concerns between defining and scheduling
+a complete graph, and some of the advantages which go with it. However, it is
+intended that better support for the multi-device graph use case will be
+provided by a future explicit graph building API.
+
+Additionally, if support for multiple devices per queue changes in future SYCL
+versions, then this extension can potentially take advantage of it to allow the
+above, rather than prematurely rolling its own support.
+
+#### Whole Graph Update
+
+Providing a way to update the inputs & outputs to a graph between submissions,
+without modifying the overall topology of the graph, is important functionality
+for enabling our target Machine Learning use cases.
+
+Methods for explicitly replacing inputs/edges in an already constructed graph
+present many potential problems for both API design and implementers. A simple
+find/replace API, e.g. `graph.update(oldBufferList, newBufferList)`, was
+considered, but the potential implementation was deemed too problematic and
+would require the application to do onerous bookkeeping to track updates made
+to the graph.
+
+Explicit input/edge replacement would also make more sense as part of an
+[explicit graph creation API](#no-explicit-graph-api), rather than the implicit
+capture API from this version of the proposal. For that reason we have decided
+to defer adding it until the explicit graph creation API is introduced.
+
+Instead, this proposal defines a whole graph update mechanism for updating
+inputs/edges. This relies on the computationally cheap graph recording being
+separated from the computationally heavy finalization, and allows the user to
+easily reuse code with new inputs to re-record and update an existing graph all
+at once. This is particularly useful for code with many inputs that need
+updating, where a manual update could be very complex.
+
+Whole graph update is an idea influenced by
+[CUDA Graphs - Whole Graph Update][cuda-whole-graph-update], and the
+`command_graph::update()` method for performing this operation is equivalent
+to `cudaGraphExecUpdate`.
+
+If a SYCL backend does not support a mechanism for updating the arguments of
+existing commands, then this functionality can be emulated in the SYCL runtime
+by constructing a new backend graph object.
+
+#### Explicit Finalize
+
+Separating graph creation into two stages, begin/end recording and explicit
+finalization, formalizes the separation of concerns between the expensive
+operations involved in executable graph creation & optimization, and the actual
+recording of commands which should be low overhead.
+
+An explicit finalize entry-point, rather than returning an executable graph on
+end of recording, gives the user more granular control of when these stages
+happen in their application. It also allows the recording of a queue to a
+graph to resume after it has been stopped by an application.
+
+A separate finalize entry-point also lets a user effectively clone a graph:
+a modifiable graph can be reused to create multiple executable graphs by
+repeatedly calling `finalize()`.
+
+## Issues
+
+### Mark Internal Memory
+
+To enable optimizations to remove memory objects internal to the graph, do we
+need an interface to identify buffers and USM allocations not used outside of
+the graph?
+
+**Outcome:** Intended for next version of the extension.
+
+## Changelog
+
+| Version | Change | Author | Date |
+|---------|--------|--------|------|
+| 0.1 | Initial Revision | Ewan Crawford | 08/02/2022 |
+| 0.2 | Action feedback from end of Q2 2022 | Ewan Crawford | 16/05/2022 |
+| 0.3 | Action feedback from end of Q3 2022 | Ben Tracy | 20/07/2022 |
+| 1.0 | First public version | Ewan Crawford | 13/09/2022 |
+
+[cuda-graphs]: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs
+[cuda-whole-graph-update]: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#whole-graph-update
+[cudaGraphClone]: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__GRAPH.html#group__CUDART__GRAPH_1g711477355655f00773a0885fbf2891d6
+[explicit-memory-ops]: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:explicitmemory
+[opencl-command-buffers]: https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer
+[opencl-mutable-dispatch]: https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer_mutable_dispatch
+[common reference semantics]: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:reference-semantics
+[queue class]: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:interface.queue.class
+[queue properties]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:queue-properties
+[feature test macro]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_feature_test_macros
+[host task]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:interfaces.hosttasks
+[queue info table]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#table.queue.info