
Commit c26814d

Merge pull request #242 from neo4j/gql-and-neo4j-simplified-interface
gql and neo4j simplified interface
2 parents fc5bfeb + 138a99b commit c26814d

13 files changed

+149
-483
lines changed

changelog.md

Lines changed: 5 additions & 5 deletions
@@ -4,12 +4,12 @@
44

55
- Do not automatically derive size and caption for `from_neo4j` and `from_gql_create`. Use the `size_property` and `node_caption` parameters to explicitly configure them.
66
- Change API of integrations to only provide basic parameters. Any further configuration should happen on the Visualization Graph object:
7-
- `from_gds`
8-
- Drop parameters size_property, node_radius_min_max. `Use VG.resize_nodes(property=...)` instead
9-
- rename additional_node_properties to node_properties
10-
- Don't derive fields from properties. Use `VG.map_properties_to_fields` instead
117
- `from_pandas`
128
- Drop `node_radius_min_max` parameter. `VG.resize_nodes(...)` instead
9+
- `from_neo4j`, `from_gds`, `from_gql_create`
10+
- Drop parameters `size_property`, `node_radius_min_max`. Use `VG.resize_nodes(property=...)` instead
11+
- rename additional_node_properties to node_properties
12+
- Don't derive fields from properties. Use `VG.map_properties_to_fields` instead
1313

1414
## New features
1515

@@ -25,7 +25,7 @@
2525

2626
- Validate fields of a node and relationship not only at construction but also on assignment.
2727
- Allow resizing per node property such as `VG.resize_nodes(property="score")`.
28-
- Color nodes by label in `from_gds`.
28+
- Color nodes by label in `from_gds` and `from_gql_create`.
2929
- Add `table` property to nodes and relationships created by `from_snowflake`. This is used as a default caption.
3030

3131
## Other changes

docs/source/integration.rst

Lines changed: 6 additions & 58 deletions
@@ -164,22 +164,9 @@ The ``from_neo4j`` method takes one mandatory positional parameter:
164164
A ``data`` argument representing either a query result in the shape of a ``neo4j.graph.Graph`` or ``neo4j.Result``, or a
165165
``neo4j.Driver`` in which case a simple default query will be executed internally to retrieve the graph data.
166166

167-
We can also provide an optional ``size_property`` parameter, which should refer to a node property,
168-
and will be used to determine the sizes of the nodes in the visualization.
169-
170-
The ``node_caption`` and ``relationship_caption`` parameters are also optional, and indicate the node and relationship
171-
properties to use for the captions of each element in the visualization.
172-
By default, the captions will be set to the node labels relationship types, but you can specify any property that
173-
exists on these entities.
174-
175-
The last optional property, ``node_radius_min_max``, can be used (and is used by default) to scale the node sizes for
176-
the visualization.
177-
It is a tuple of two numbers, representing the radii (sizes) in pixels of the smallest and largest nodes respectively in
178-
the visualization.
179-
The node sizes will be scaled such that the smallest node will have the size of the first value, and the largest node
180-
will have the size of the second value.
181-
The other nodes will be scaled linearly between these two values according to their relative size.
182-
This can be useful if node sizes vary a lot, or are all very small or very big.
167+
The optional ``max_rows`` parameter can be used to limit the number of relationships shown in the visualization.
168+
By default, it is set to 10,000, meaning that if the database has more than 10,000 rows, a warning will be raised.
169+
Note, this only applies if the ``data`` parameter is a ``neo4j.Driver``.
183170

184171

185172
Example
@@ -222,20 +209,6 @@ The ``from_gql_create`` method takes one mandatory positional parameter:
222209

223210
* A valid ``query`` representing a GQL ``CREATE`` query as a string.
224211

225-
We can also provide an optional ``size_property`` parameter, which should refer to a node property,
226-
and will be used to determine the sizes of the nodes in the visualization.
227-
228-
The ``node_caption`` and ``relationship_caption`` parameters are also optional, and indicate the node and relationship properties to use for the captions of each element in the visualization.
229-
230-
The last optional property, ``node_radius_min_max``, can be used (and is used by default) to scale the node sizes for
231-
the visualization.
232-
It is a tuple of two numbers, representing the radii (sizes) in pixels of the smallest and largest nodes respectively in
233-
the visualization.
234-
The node sizes will be scaled such that the smallest node will have the size of the first value, and the largest node
235-
will have the size of the second value.
236-
The other nodes will be scaled linearly between these two values according to their relative size.
237-
This can be useful if node sizes vary a lot, or are all very small or very big.
238-
239212

240213
Example
241214
~~~~~~~
@@ -283,39 +256,14 @@ The ``from_snowflake`` method takes two mandatory positional parameters:
283256
* A `project configuration <https://neo4j.com/docs/snowflake-graph-analytics/current/jobs/#jobs-project>`_ as a dictionary, that specifies how you want your tables to be projected as a graph.
284257
This configuration is the same as the project configuration of the `Neo4j Snowflake Graph Analytics application <https://neo4j.com/docs/snowflake-graph-analytics/current/>`_.
285258

286-
``from_snowflake`` also takes an optional property, ``node_radius_min_max``, that can be used (and is used by default) to
287-
scale the node sizes for the visualization.
288-
It is a tuple of two numbers, representing the radii (sizes) in pixels of the smallest and largest nodes respectively in
289-
the visualization.
290-
The node sizes will be scaled such that the smallest node will have the size of the first value, and the largest node
291-
will have the size of the second value.
292-
The other nodes will be scaled linearly between these two values according to their relative size.
293-
This can be useful if node sizes vary a lot, or are all very small or very big.
294-
295-
296-
Special columns
297-
~~~~~~~~~~~~~~~
298-
299-
It is possible to modify the visualization directly by including columns of certain specific names in the node and relationship tables.
300-
301-
All such special columns can be found :doc:`here <./api-reference/node>` for nodes and :doc:`here <./api-reference/relationship>` for relationships.
302-
Though listed in ``snake_case`` here, ``SCREAMING_SNAKE_CASE`` and ``camelCase`` are also supported.
303-
Some of the most commonly used special columns are:
304-
305-
* **Node sizes**: The sizes of nodes can be controlled by including a column named "SIZE" in node tables.
306-
The values in these columns should be of a numeric type. This can be useful for visualizing the relative importance or size of nodes in the graph, for example using a computed centrality score.
307-
308-
* **Captions**: The caption text of nodes and relationships can be controlled by including a column named "CAPTION" in the tables.
309-
The values in these columns should be of a string type. This can be useful for displaying additional information about the nodes, such as their names or labels. If no "CAPTION" column is provided, the default captions in the visualization will be the names of the corresponding node and relationship tables.
310-
311-
Please also note that you can further customize the visualization after the `VisualizationGraph` has been created, by using the methods described in the :doc:`Customizing the visualization <./customizing>` section.
259+
You can further customize the visualization after the `VisualizationGraph` has been created, by using the methods described in the :doc:`Customizing the visualization <./customizing>` section.
312260

313261

314262
Default behavior
315263
~~~~~~~~~~~~~~~~
316264

317-
Unless there are "CAPTION" columns in the tables, the node and relationship captions will be set to the names of the corresponding tables.
318-
Similarly, if there are are no "COLOR" node table columns, the nodes will be colored be colored so that nodes from the same table have the same color, and different tables have different colors.
265+
The node and relationship captions will be set to the names of the corresponding tables.
266+
The nodes will be colored so that nodes from the same table have the same color, and different tables have different colors.
319267

320268

321269
Example

python-wrapper/src/neo4j_viz/gds.py

Lines changed: 1 addition & 1 deletion
@@ -167,7 +167,7 @@ def from_gds(
167167
VG = _from_dfs(node_df, rel_dfs, dropna=True)
168168

169169
for node in VG.nodes:
170-
node.caption = str(node.properties.get("labels"))
170+
node.caption = ":".join([label for label in node.properties["labels"]])
171171
for rel in VG.relationships:
172172
rel.caption = rel.properties.get("relationshipType")
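
The effect of the caption change in this hunk can be seen in isolation with plain Python. The sketch below uses a made-up label list (`["Person", "Actor"]` is an assumption for illustration, not data from the commit) and compares the old `str(...)` caption with the new colon-joined one.

```python
# Hypothetical label list, standing in for node.properties["labels"].
labels = ["Person", "Actor"]

# Old behaviour: str() on a list yields its Python repr, brackets and quotes included.
old_caption = str(labels)

# New behaviour: the labels are joined with a colon, as in Cypher's (:Person:Actor).
new_caption = ":".join(labels)

print(old_caption)  # ['Person', 'Actor']
print(new_caption)  # Person:Actor
```

Note that the new line indexes `node.properties["labels"]` directly, so a node without a `labels` property would raise a `KeyError`, where the old `.get(...)` call produced the caption `"None"`. An empty label list yields an empty caption.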
173173

python-wrapper/src/neo4j_viz/gql_create.py

Lines changed: 32 additions & 78 deletions
@@ -5,6 +5,7 @@
55
from pydantic import BaseModel, ValidationError
66

77
from neo4j_viz import Node, Relationship, VisualizationGraph
8+
from neo4j_viz.colors import NEO4J_COLORS_DISCRETE, ColorSpace
89

910

1011
def _parse_value(value_str: str) -> Any:
@@ -91,10 +92,7 @@ def _parse_value(value_str: str) -> Any:
9192
return value_str.strip("'\"")
9293

9394

94-
def _parse_prop_str(
95-
query: str, prop_str: str, prop_start: int, top_level_keys: set[str]
96-
) -> tuple[dict[str, Any], dict[str, Any]]:
97-
top_level: dict[str, Any] = {}
95+
def _parse_prop_str(query: str, prop_str: str, prop_start: int) -> dict[str, Any]:
9896
props: dict[str, Any] = {}
9997
depth = 0
10098
in_string = None
@@ -115,10 +113,7 @@ def _parse_prop_str(
115113
k, v = pair.split(":", 1)
116114
k = k.strip().strip("'\"")
117115

118-
if k in top_level_keys:
119-
top_level[k] = _parse_value(v)
120-
else:
121-
props[k] = _parse_value(v)
116+
props[k] = _parse_value(v)
122117

123118
start_idx = i + 1
124119
else:
@@ -133,17 +128,12 @@ def _parse_prop_str(
133128
k, v = pair.split(":", 1)
134129
k = k.strip().strip("'\"")
135130

136-
if k in top_level_keys:
137-
top_level[k] = _parse_value(v)
138-
else:
139-
props[k] = _parse_value(v)
131+
props[k] = _parse_value(v)
140132

141-
return top_level, props
133+
return props
142134

143135

144-
def _parse_labels_and_props(
145-
query: str, s: str, top_level_keys: set[str]
146-
) -> tuple[Optional[str], dict[str, Any], dict[str, Any]]:
136+
def _parse_labels_and_props(query: str, s: str) -> tuple[Optional[str], dict[str, Any]]:
147137
prop_match = re.search(r"\{(.*)\}", s)
148138
prop_str = ""
149139
if prop_match:
@@ -155,17 +145,16 @@ def _parse_labels_and_props(
155145
final_alias = raw_alias if raw_alias else None
156146

157147
if prop_str:
158-
top_level, props = _parse_prop_str(query, prop_str, prop_start, top_level_keys)
148+
props = _parse_prop_str(query, prop_str, prop_start)
159149
else:
160-
top_level = {}
161150
props = {}
162151

163152
label_list = [lbl.strip() for lbl in alias_labels[1:]]
164153
if "labels" in props:
165154
props["__labels"] = props["labels"]
166155
props["labels"] = sorted(label_list)
167156

168-
return final_alias, top_level, props
157+
return final_alias, props
169158

170159

171160
def _get_snippet(q: str, idx: int, context: int = 15) -> str:
@@ -175,21 +164,20 @@ def _get_snippet(q: str, idx: int, context: int = 15) -> str:
175164
return q[start:end].replace("\n", " ")
176165

177166

178-
def from_gql_create(
179-
query: str,
180-
size_property: Optional[str] = None,
181-
node_caption: Optional[str] = "labels",
182-
relationship_caption: Optional[str] = "type",
183-
node_radius_min_max: Optional[tuple[float, float]] = (3, 60),
184-
) -> VisualizationGraph:
167+
def from_gql_create(query: str) -> VisualizationGraph:
185168
"""
186169
Parse a GQL CREATE query and return a VisualizationGraph object representing the graph it creates.
187170
188171
All node and relationship properties will be included in the visualization graph.
189-
If the properties are named as the fields of the `Node` or `Relationship` classes, they will be included as
190-
top level fields of the respective objects. Otherwise, they will be included in the `properties` dictionary.
172+
All properties of nodes and relationships will be included in the `properties` dictionary of the respective objects.
191173
Additionally, a "labels" property will be added for nodes and a "type" property for relationships.
192174
175+
By default:
176+
177+
* the caption of a node will be based on its `labels`.
178+
* the caption of a relationship will be based on its `type`.
179+
* the color of nodes will be set based on their label, unless there are more than 12 unique labels.
180+
193181
Please note that this function is not a full GQL parser, it only handles CREATE queries that do not contain
194182
other clauses like MATCH, WHERE, RETURN, etc, or any Cypher function calls.
195183
It also does not handle all possible GQL syntax, but it should work for most common cases.
@@ -199,15 +187,6 @@ def from_gql_create(
199187
----------
200188
query : str
201189
The GQL CREATE query to parse
202-
size_property : str, optional
203-
Property to use for node size, by default None.
204-
node_caption : str, optional
205-
Property to use as the node caption, by default the node labels will be used.
206-
relationship_caption : str, optional
207-
Property to use as the relationship caption, by default the relationship type will be used.
208-
node_radius_min_max : tuple[float, float], optional
209-
Minimum and maximum node radius, by default (3, 60).
210-
To avoid tiny or huge nodes in the visualization, the node sizes are scaled to fit in the given range.
211190
"""
212191

213192
query = query.strip()
@@ -251,19 +230,9 @@ def from_gql_create(
251230
node_pattern = re.compile(r"^\(([^)]*)\)$")
252231
rel_pattern = re.compile(r"^\(([^)]*)\)-\s*\[\s*:(\w+)\s*(\{[^}]*\})?\s*\]->\(([^)]*)\)$")
253232

254-
node_top_level_keys = Node.all_validation_aliases(exempted_fields=["id", "size", "caption"])
255-
rel_top_level_keys = Relationship.all_validation_aliases(exempted_fields=["id", "source", "target", "caption"])
256-
257233
def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> None:
258234
for err in e.errors():
259235
loc = err["loc"][0]
260-
if (loc == "size") and size_property is not None:
261-
loc = size_property
262-
if loc == "caption":
263-
if (entity_type == Node) and (node_caption is not None):
264-
loc = node_caption
265-
elif (entity_type == Relationship) and (relationship_caption is not None):
266-
loc = relationship_caption
267236
raise ValueError(
268237
f"Error for {entity_type.__name__.lower()} property '{loc}' with provided input '{err['input']}'. Reason: {err['msg']}"
269238
)
@@ -277,14 +246,14 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) ->
277246
node_m = node_pattern.match(part)
278247
if node_m:
279248
alias_labels_props = node_m.group(1).strip()
280-
alias, top_level, props = _parse_labels_and_props(query, alias_labels_props, node_top_level_keys)
249+
alias, props = _parse_labels_and_props(query, alias_labels_props)
281250
if not alias:
282251
alias = f"_anon_{anonymous_count}"
283252
anonymous_count += 1
284253
if alias not in alias_to_id:
285254
alias_to_id[alias] = str(uuid.uuid4())
286255
try:
287-
nodes.append(Node(id=alias_to_id[alias], **top_level, properties=props))
256+
nodes.append(Node(id=alias_to_id[alias], properties=props))
288257
except ValidationError as e:
289258
_parse_validation_error(e, Node)
290259

@@ -296,29 +265,29 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) ->
296265
right_node = rel_m.group(4).strip()
297266

298267
# Parse left node pattern
299-
left_alias, left_top_level, left_props = _parse_labels_and_props(query, left_node, node_top_level_keys)
268+
left_alias, left_props = _parse_labels_and_props(query, left_node)
300269
if not left_alias:
301270
left_alias = f"_anon_{anonymous_count}"
302271
anonymous_count += 1
303272
if left_alias not in alias_to_id:
304273
alias_to_id[left_alias] = str(uuid.uuid4())
305274
try:
306-
nodes.append(Node(id=alias_to_id[left_alias], **left_top_level, properties=left_props))
275+
nodes.append(Node(id=alias_to_id[left_alias], properties=left_props))
307276
except ValidationError as e:
308277
_parse_validation_error(e, Node)
309278
elif left_alias not in alias_to_id:
310279
snippet = _get_snippet(query, query.index(left_node))
311280
raise ValueError(f"Relationship references unknown node alias: '{left_alias}' near: `{snippet}`.")
312281

313282
# Parse right node pattern
314-
right_alias, right_top_level, right_props = _parse_labels_and_props(query, right_node, node_top_level_keys)
283+
right_alias, right_props = _parse_labels_and_props(query, right_node)
315284
if not right_alias:
316285
right_alias = f"_anon_{anonymous_count}"
317286
anonymous_count += 1
318287
if right_alias not in alias_to_id:
319288
alias_to_id[right_alias] = str(uuid.uuid4())
320289
try:
321-
nodes.append(Node(id=alias_to_id[right_alias], **right_top_level, properties=right_props))
290+
nodes.append(Node(id=alias_to_id[right_alias], properties=right_props))
322291
except ValidationError as e:
323292
_parse_validation_error(e, Node)
324293
elif right_alias not in alias_to_id:
@@ -331,9 +300,8 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) ->
331300
if rel_props_str:
332301
inner_str = rel_props_str.strip("{}").strip()
333302
prop_start = query.index(inner_str, query.index(inner_str))
334-
top_level, props = _parse_prop_str(query, inner_str, prop_start, rel_top_level_keys)
303+
props = _parse_prop_str(query, inner_str, prop_start)
335304
else:
336-
top_level = {}
337305
props = {}
338306
if "type" in props:
339307
props["__type"] = props["type"]
@@ -345,7 +313,6 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) ->
345313
id=rel_id,
346314
source=alias_to_id[left_alias],
347315
target=alias_to_id[right_alias],
348-
**top_level,
349316
properties=props,
350317
)
351318
)
@@ -357,28 +324,15 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) ->
357324
snippet = part[:30]
358325
raise ValueError(f"Invalid element in CREATE near: `{snippet}`.")
359326

360-
if size_property is not None:
361-
try:
362-
for node in nodes:
363-
node.size = node.properties.get(size_property)
364-
except ValidationError as e:
365-
_parse_validation_error(e, Node)
366-
if node_caption is not None:
367-
for node in nodes:
368-
if node_caption == "labels":
369-
if len(node.properties["labels"]) > 0:
370-
node.caption = ":".join([label for label in node.properties["labels"]])
371-
else:
372-
node.caption = str(node.properties.get(node_caption))
373-
if relationship_caption is not None:
374-
for rel in relationships:
375-
if relationship_caption == "type":
376-
rel.caption = rel.properties["type"]
377-
else:
378-
rel.caption = str(rel.properties.get(relationship_caption))
379-
380327
VG = VisualizationGraph(nodes=nodes, relationships=relationships)
381-
if (node_radius_min_max is not None) and (size_property is not None):
382-
VG.resize_nodes(node_radius_min_max=node_radius_min_max)
328+
329+
for node in VG.nodes:
330+
node.caption = ":".join([label for label in node.properties["labels"]])
331+
for rel in VG.relationships:
332+
rel.caption = rel.properties.get("type")
333+
334+
number_of_colors = len({str(n.properties.get("labels")) for n in VG.nodes})
335+
if number_of_colors <= len(NEO4J_COLORS_DISCRETE):
336+
VG.color_nodes(property="labels", color_space=ColorSpace.DISCRETE)
383337

384338
return VG
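
The coloring guard added at the end of `from_gql_create` can be sketched without the library. In the sketch below, `PALETTE_SIZE` is a hypothetical stand-in for `len(NEO4J_COLORS_DISCRETE)`; the value 12 is taken from the docstring in this diff and is otherwise an assumption. The helper name `should_color_by_label` is also invented for illustration.

```python
# Assumed palette size, standing in for len(NEO4J_COLORS_DISCRETE).
PALETTE_SIZE = 12


def should_color_by_label(node_labels: list[list[str]], palette_size: int = PALETTE_SIZE) -> bool:
    """Return True if every distinct label combination can get its own color."""
    # Mirrors the diff: each node contributes str(labels), so a node with
    # labels ["A", "B"] counts as one key, not as two separate labels.
    distinct = {str(labels) for labels in node_labels}
    return len(distinct) <= palette_size


# Three nodes, two distinct label combinations: coloring is applied.
print(should_color_by_label([["A"], ["B"], ["A"]]))

# Thirteen distinct label combinations: coloring is skipped.
print(should_color_by_label([[f"L{i}"] for i in range(13)]))
```

Because the key is the string form of the whole label list, the threshold appears to apply to unique label combinations rather than to individual labels, which is slightly broader than the docstring's wording of "12 unique labels".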
