diff --git a/ChangeLog.md b/ChangeLog.md index 248dfe12..2278f012 100644 --- a/ChangeLog.md +++ b/ChangeLog.md @@ -5,6 +5,8 @@ Starting with v1.31.6, this file will contain a record of major features and upd ## Upcoming - Updated `create-graph` CLI commands in Neptune Analytics samples ([Link to PR](https://github.com/aws/graph-notebook/pull/565)) - Added `@neptune_graph_only` magics decorator ([Link to PR](https://github.com/aws/graph-notebook/pull/569)) +- New Northwind Use Case notebook ([Link to PR](https://github.com/aws/graph-notebook/pull/572)) + - Path: 01-Neptune-Database > 03-Sample-Applications > 07-Northwind-Use-Case ## Release 4.1.0 (February 1, 2024) - New Neptune Analytics notebook - Vector Similarity Algorithms ([Link to PR](https://github.com/aws/graph-notebook/pull/555)) diff --git a/src/graph_notebook/notebooks/01-Neptune-Database/03-Sample-Applications/07-Northwind-Use-Case/northwind-use-case.ipynb b/src/graph_notebook/notebooks/01-Neptune-Database/03-Sample-Applications/07-Northwind-Use-Case/northwind-use-case.ipynb new file mode 100644 index 00000000..597ed9cf --- /dev/null +++ b/src/graph_notebook/notebooks/01-Neptune-Database/03-Sample-Applications/07-Northwind-Use-Case/northwind-use-case.ipynb @@ -0,0 +1,1774 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "8e759b20", + "metadata": {}, + "source": [ + "# Northwind Use Case\n", + "\n", + "\n", + "## Introduction\n", + "\n", + "\n", + "### Northwind Database\n", + "Northwind is a well-known e-commerce database which is largely used for training purposes across various database platforms.\n", + "In this demonstration we are going to use the RDF Knowledge Graph version of the Northwind database to provide hands-on experience on Amazon Neptune by executing use case stories (SPARQL queries) on a Jupyter Notebook. 
It's a practical, learn-as-you-go experience.\n", + "\n", + "\n", + "\n", + "### EKGF Guiding Principles\n", + "The Enterprise Knowledge Graph Forum (EKGF) is now part of the Object Management Group (OMG).\n", + "The EKGF was established to define best practice and mature the marketplace for EKG adoption and provides 10 guiding principles which are intended to provide guidelines for the development and deployment of an Enterprise Knowledge Graph (EKG). The principles emphasise shared meaning and content reuse that are the cornerstone of operating in complex and interconnected environments.\n", + "The Northwind Use Case demonstrates Principle 7, which says:\n", + "*All artefacts around and information in the EKG are linked to defined and prioritised use cases. Nothing in the EKG exists without a known business justification and purpose.*\n", + "\n", + "\n", + "### SPARQL \n", + "\n", + "This demonstration covers a great deal of the syntax and semantics of the SPARQL query language, including FILTER, UNION, LIMIT, OFFSET, GROUP BY, ORDER BY, DISTINCT, OPTIONAL, BIND, BOUND, MINUS, FILTER NOT EXISTS, INSERT, DELETE, DESCRIBE, CONSTRUCT, REGEX, CONTAINS, HAVING, as well as String Matching and Manipulation, Aggregation Functions, Subqueries, and Property Paths, among others.\n", + "For more detailed information on the stories below, please refer to this [Medium article](https://medium.com/@mbarbieri77/northwind-use-case-on-amazon-neptune-0e85378307a7)." + ] + }, + { + "cell_type": "markdown", + "id": "4abf0bec", + "metadata": {}, + "source": [ + "## Loading the Northwind Dataset\n", + "Download, unzip and copy the [Northwind n-triple file](https://github.com/mbarbieri77/EKG/blob/master/Northwind/SPARQL/SampleDatabase/dumpdataNTRIPLE7.nt.zip) to your S3 bucket. \n", + "\n", + "\n", + "Complete the instructions in the `%load` magic below in order to load the data into your Neptune Instance. 
\n", + "You will need to run it once to display the load form, which you then need to fill in. \n", + "Note that the file **Format** must be `ntriples`, the **Named Graph URI** `http://www.mysparql.com/resource/northwind/NorthwindGraph`, and the `Source` the file `S3 URI` that you copy from your S3 bucket. You may also need to set up the appropriate permissions for the Neptune Cluster so it can read files from the S3 bucket. Please refer to the AWS documentation [here](https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-IAM-CreateRole.html). \n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ccd5ed09", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%load" + ] + }, + { + "cell_type": "markdown", + "id": "3e402279", + "metadata": {}, + "source": [ + "## Use Case Template\n", + "\n", + "Please find a complete use case template at the end of this notebook.\n", + "For the Northwind Use Case, we only filled in some of the sections (Outcome, Personas, Concepts and Stories) for simplicity. \n", + "\n", + " \n", + " \n", + "### Outcome\n", + "\n", + "\n", + "The required and desired short and long term business outcomes.\n", + "\n", + "#### Primary business outcomes\n", + "\n", + "- Enable Sales Analysis and Product Insights\n", + "- Enable Marketing Analysis and Reporting\n", + "- Improve sales decision-making by optimizing product recommendations, gained through analysis of customer preferences based on product co-purchases\n", + "- Improve HR Management Efficiency\n", + "\n", + "\n", + "#### Secondary business outcomes\n", + "\n", + "Answers to the user stories. In this case for the first story further below. 
\n", + "- \\\n", + " \n", + "\n", + "### Personas\n", + "\n", + "\n", + "All roles, titles and personas that stakeholders and users play in the context of this use case.\n", + "\n", + "- Human Resources Manager\n", + "- Sales Manager\n", + "- Marketing Manager\n", + "- Data Steward\n", + "\n", + "\n", + "### Concepts\n", + "\n", + "\n", + "Concepts referenced by stories that belong to a given use case or linked to use cases that don't have stories yet. This is not an exhaustive list and does not include personas, which are also concepts. \n", + "\n", + "- Employee Title\n", + "- Customer Company Address\n", + "- Product Unit Price\n", + "- Supplier Company Name\n", + "- Order Customer\n", + "- Order Date\n" + ] + }, + { + "cell_type": "markdown", + "id": "e4d20d28", + "metadata": {}, + "source": [ + "## Stories" + ] + }, + { + "cell_type": "markdown", + "id": "6f7749c6", + "metadata": {}, + "source": [ + "### Create a concise report listing all the employees in the company\n", + "\n", + "***\n", + "\n", + "Full story:\n", + "> As a **\\**,
\n", + "> I want to **\\**
\n", + "> in order to **\\**\n", + "\n", + "Main Concepts:\n", + "> \\, \\\n", + "\n", + "Query:\n", + "> Given a Human Resources Manager persona, WHEN they want to create a report with all employees in the company, THEN the system should execute a SPARQL query to retrieve values for the rdfs:label, foaf:title, foaf:lastName, and foaf:firstName properties of each Employee.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ca016e40", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "PREFIX foaf: \n", + "PREFIX rdfs: \n", + "\n", + "SELECT\n", + " ?label\n", + " ?lastName\n", + " ?firstName\n", + " ?title\n", + "WHERE {\n", + " ?emp a :Employee ;\n", + " rdfs:label ?label ;\n", + " foaf:title ?title ;\n", + " foaf:lastName ?lastName ;\n", + " foaf:firstName ?firstName .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "57bb7883", + "metadata": {}, + "source": [ + ">In this demonstration, we will skip the template for the remaining stories and provide only brief descriptions." + ] + }, + { + "cell_type": "markdown", + "id": "5170df4f", + "metadata": {}, + "source": [ + ">### As a Human Resources Manager, I want to know all the employees located in the USA." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "72ca32e1", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql \n", + "\n", + "PREFIX : \n", + "PREFIX foaf: \n", + "PREFIX rdfs: \n", + "\n", + "SELECT\n", + " ?label\n", + " ?lastName\n", + " ?firstName\n", + " ?title\n", + "WHERE {\n", + " ?emp a :Employee ;\n", + " rdfs:label ?label ;\n", + " foaf:lastName ?lastName ;\n", + " foaf:firstName ?firstName ;\n", + " foaf:title ?title ;\n", + " :country ?country .\n", + " FILTER(?country = \"USA\")\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "90336309", + "metadata": {}, + "source": [ + ">Note that the same filter can be applied directly as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "301efbf4", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "PREFIX foaf: \n", + "PREFIX rdfs: \n", + "\n", + "SELECT\n", + " ?label\n", + " ?lastName\n", + " ?firstName\n", + " ?title\n", + "WHERE {\n", + " ?emp a :Employee ;\n", + " rdfs:label ?label ;\n", + " foaf:lastName ?lastName ;\n", + " foaf:firstName ?firstName ;\n", + " foaf:title ?title ;\n", + " :country \"USA\" .\n", + "}\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "68e327a7", + "metadata": {}, + "source": [ + ">### As a Human Resources Manager, I want to know if the company has employees in the UK.\n", + ">This query returns a boolean indicating whether a query pattern matches any triples." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "522dc170", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + " \n", + "ASK {\n", + " ?emp a :Employee ;\n", + " :country \"UK\" .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "27e98624", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to be able to search companies by name.\n", + ">Note that the query below shows two ways of implementing the filter. You can comment out the first filter and uncomment the second one to verify its result." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bf92bce0", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?companyName\n", + " ?contactName\n", + " ?address\n", + " ?city\n", + " ?phone\n", + "WHERE {\n", + " ?s a :Customer ;\n", + " rdfs:label ?companyLabel ;\n", + " :companyName ?companyName ;\n", + " :contactName ?contactName ;\n", + " :address ?address ;\n", + " :city ?city ;\n", + " :phone ?phone .\n", + " FILTER (REGEX(?companyName, \"Rest\" , \"i\" )) # Case Insensitive\n", + " # FILTER CONTAINS (LCASE(?companyName), \"rest\") # Alternatively, you can use the string function CONTAINS.\n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "id": "9c572f3e", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to create a basic report showing products supplied by companies located in the USA." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cd0feb22", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productID\n", + " ?productName\n", + " ?unitsInStock\n", + " ?unitPrice\n", + " ?categoryName\n", + " ?contactName\n", + "WHERE\n", + "{\n", + " ?product a :Product ;\n", + " :productID ?productID ;\n", + " :productName ?productName ;\n", + " :unitsInStock ?unitsInStock ;\n", + " :unitPrice ?unitPrice ;\n", + " :hasCategory ?category ;\n", + " :hasSupplier ?supplier .\n", + " ?category a :Category ;\n", + " :name ?categoryName .\n", + " ?supplier a :Supplier ;\n", + " :contactName ?contactName ;\n", + " :country \"USA\" .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "c9cba224", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to create a basic report showing customers who placed at least one order." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d084bfef", + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT DISTINCT\n", + " ?customer\n", + " ?companyName\n", + " ?postalCode\n", + " ?city\n", + " ?country\n", + "WHERE {\n", + " ?order a :Order .\n", + " ?customer a :Customer .\n", + " ?order :hasCustomer ?customer .\n", + " ?customer :customerID ?customerID ;\n", + " :companyName ?companyName ;\n", + " :city ?city ;\n", + " :country ?country .\n", + " OPTIONAL {?customer :postalCode ?postalCode} . # Some regions don't use PostalCode.\n", + "}\n", + "ORDER BY\n", + " ?customer\n" + ] + }, + { + "cell_type": "markdown", + "id": "09ddd7f5", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to create a basic report showing customers who never placed an order." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39a2c604", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT DISTINCT\n", + " ?customer\n", + " ?companyName\n", + " ?postalCode\n", + " ?city\n", + " ?country\n", + "WHERE {\n", + " ?customer a :Customer .\n", + " ?customer :customerID ?customerID ;\n", + " :companyName ?companyName ;\n", + " :city ?city ;\n", + " :country ?country .\n", + " OPTIONAL {?customer :postalCode ?postalCode} . # Some regions don't use PostalCode.\n", + " OPTIONAL {\n", + " ?order a :Order .\n", + " ?customer ^:hasCustomer ?order # for customers with no orders, ?order variable will be empty (not bound).\n", + "}\n", + " FILTER (!BOUND(?order)) # Checks if variable is not bound to a value.\n", + "}\n", + "ORDER BY\n", + " ?customer\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "82053b37", + "metadata": {}, + "source": [ + ">The same result can be obtained by using MINUS." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e8c91874", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?customer\n", + " ?companyName\n", + " ?postalCode\n", + " ?city\n", + " ?country\n", + "WHERE {\n", + " {\n", + " ?customer a :Customer ; # All customers\n", + " :customerID ?customerID ;\n", + " :companyName ?companyName ;\n", + " :city ?city ;\n", + " :country ?country .\n", + " } MINUS {\n", + " ?customer a :Customer . # Customers who placed orders\n", + " ?order a :Order .\n", + " ?order :hasCustomer ?customer .\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "6db9eaae", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to search products by name or a combination of identification number and price." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "75dcce9a", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productName\n", + " (STR(?unitPrice) AS ?strUnitPrice) # converting integer to string\n", + " ?supplierName\n", + " ?region\n", + " ?country\n", + "WHERE {\n", + " ?s a :Product ;\n", + " :productName ?productName ;\n", + " :productID ?productID ;\n", + " :hasSupplier ?supplier ; # Joining on supplier\n", + " :unitPrice ?unitPrice .\n", + " # getting supplier properties\n", + " ?supplier :companyName ?supplierName ;\n", + " :country ?country ;\n", + " OPTIONAL {?supplier :region ?region }. # not all suppliers have region\n", + " FILTER((REGEX(?productName, \"^T\", \"i\")) || (?productID = 46 && ?unitPrice > 16)) . # Logical operators\n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "id": "2acb8436", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to know which products are in a given price range." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9a303626", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productName\n", + " ?companyName\n", + " ?unitPrice\n", + "WHERE {\n", + " ?s a :Product ;\n", + " :productID ?productID ;\n", + " :productName ?productName ;\n", + " :hasSupplier ?supplier ;\n", + " :unitPrice ?unitPrice .\n", + " ?supplier a :Supplier ;\n", + " :companyName ?companyName ;\n", + " :supplierID ?supplierID .\n", + " FILTER (?unitPrice >= 18 && ?unitPrice <= 20)\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "cc65710a", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to create a list of all suppliers located in Japan or Italy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0bf2584d", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?companyName\n", + " ?country\n", + "WHERE {\n", + " ?s a :Supplier ;\n", + " :companyName ?companyName ;\n", + " :country ?country .\n", + " FILTER (UCASE(?country) = \"JAPAN\" || ?country = \"Italy\") # case sensitive\n", + "}\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "9f3f6222", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to create a report containing all suppliers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cc07961b", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?companyName\n", + " ?fax\n", + "WHERE {\n", + " ?s a :Supplier ;\n", + " :companyName ?companyName ;\n", + " OPTIONAL {?s :fax ?fax} .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "9ecc3bfe", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to create a report containing all suppliers that have a fax number.\n", + ">Note: Fax was a machine from the 90s able to scan and transmit a document over the phone line :-)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7c2c5c97", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?companyName\n", + " ?fax\n", + "WHERE {\n", + " ?s a :Supplier ;\n", + " :companyName ?companyName ;\n", + " :fax ?fax .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "92d4d3f9", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to create a report containing all suppliers that don't have a fax number." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e3ce57aa", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?companyName\n", + " ?fax\n", + "WHERE {\n", + " ?s a :Supplier ;\n", + " :companyName ?companyName ;\n", + " OPTIONAL {?s :fax ?fax} .\n", + " FILTER (!BOUND(?fax))\n", + "}\n", + "ORDER BY \n", + " ?companyName" + ] + }, + { + "cell_type": "markdown", + "id": "ec939f48", + "metadata": {}, + "source": [ + ">The same result can be obtained by using the NOT EXISTS filter below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6cd45b40", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?companyName\n", + " ?fax\n", + "WHERE {\n", + " ?s a :Supplier ;\n", + " :companyName ?companyName .\n", + " FILTER NOT EXISTS {\n", + " SELECT\n", + " ?companyName\n", + " WHERE {\n", + " ?s a :Supplier ;\n", + " :companyName ?companyName ;\n", + " :fax ?fax .\n", + " }\n", + " }\n", + "}\n", + "ORDER BY \n", + " ?companyName" + ] + }, + { + "cell_type": "markdown", + "id": "dad66b69", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to create a report of products grouped by category and sorted by unit price descending." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3baa05f9", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productName\n", + " ?categoryName\n", + " ?unitPrice\n", + "WHERE {\n", + " ?s a :Product ;\n", + " :productID ?productID ;\n", + " :productName ?productName ;\n", + " :unitPrice ?unitPrice ;\n", + " :hasCategory ?category .\n", + " ?category :name ?categoryName .\n", + "}\n", + "ORDER BY\n", + " ASC(?categoryName)\n", + " DESC(?unitPrice)" + ] + }, + { + "cell_type": "markdown", + "id": "cb6e6a06", + "metadata": {}, + "source": [ + ">### As a Marketing Manager, I want to create a report with all countries I buy from." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "548eedd4", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT DISTINCT\n", + " ?country\n", + "WHERE{\n", + " ?s a :Supplier ;\n", + " :country ?country .\n", + "}\n", + "ORDER BY\n", + " ?country # Default sorting" + ] + }, + { + "cell_type": "markdown", + "id": "56a7ed41", + "metadata": {}, + "source": [ + ">### As a Data Steward, I want to generate an identification code for each of our employees." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0e189e03", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "PREFIX foaf: \n", + "\n", + "SELECT\n", + " (CONCAT (?firstName, \" \", ?lastName) AS ?fullName)\n", + " ?code\n", + "WHERE {\n", + " ?s a :Employee ;\n", + " foaf:firstName ?firstName ;\n", + " foaf:lastName ?lastName ;\n", + " rdfs:label ?employeeLabel ;\n", + " :extension ?extension ;\n", + " :country ?country ;\n", + " OPTIONAL {?s :region ?region } .\n", + " BIND(CONCAT(SUBSTR(?firstName,1,1), SUBSTR(?lastName,1,3), \"-\", ?extension, \"-\", IF(!BOUND(?region),\n", + " CONCAT(\"INT-\", ?country), ?region)) AS ?code)\n", + "}\n", + "ORDER BY\n", + " ?lastName" + ] + }, + { + "cell_type": "markdown", + "id": "fa89e37f", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to create a report with the top 5 largest quantity of a product sold in a single order." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "37ab00fa", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productName\n", + " ?orderID\n", + " ?orderDate\n", + " ?quantity\n", + " ?unitsInStock\n", + "WHERE {\n", + " ?orderDetail a :OrderDetail .\n", + " ?order a :Order .\n", + " ?product a :Product .\n", + " ?orderDetail :quantity ?quantity ;\n", + " :belongsToOrder ?order ;\n", + " :hasProduct ?product .\n", + " ?order :orderID ?orderID ;\n", + " :orderDate ?orderDate .\n", + " ?product :unitsInStock ?unitsInStock ;\n", + " :productName ?productName .\n", + "}\n", + "ORDER BY\n", + " DESC(?quantity)\n", + " DESC(?orderDate)\n", + "LIMIT 5" + ] + }, + { + "cell_type": "markdown", + "id": "773253f0", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to retrieve the second page of a report with the top largest quantity of a product sold in a single order." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4dbddb31", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productName\n", + " ?orderID\n", + " ?orderDate\n", + " ?quantity\n", + " ?unitsInStock\n", + "WHERE {\n", + " ?orderDetail a :OrderDetail .\n", + " ?order a :Order .\n", + " ?product a :Product .\n", + " ?orderDetail :quantity ?quantity ;\n", + " :belongsToOrder ?order ;\n", + " :hasProduct ?product .\n", + " ?order :orderID ?orderID ;\n", + " :orderDate ?orderDate .\n", + " ?product :unitsInStock ?unitsInStock ;\n", + " :productName ?productName .\n", + "}\n", + "ORDER BY\n", + " DESC(?quantity)\n", + " DESC(?orderDate)\n", + "OFFSET 5\n", + "LIMIT 5" + ] + }, + { + "cell_type": "markdown", + "id": "4ac75ade", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know the total number of suppliers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7803c6f7", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT (COUNT(1) AS ?supplierCount)\n", + "WHERE{\n", + " ?s a :Supplier .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "ef1a5aa5", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know the number of countries I buy from." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b2b2ae50", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT (COUNT(DISTINCT ?country) AS ?countryCount)\n", + "WHERE{\n", + " ?s a :Supplier ;\n", + " :country ?country .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "80d7cd76", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know the top 5 most sold products." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8a8cdb14", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productID\n", + " (SUM(?quantity) AS ?totalQtySold)\n", + "WHERE {\n", + " ?order a :OrderDetail ;\n", + " :quantity ?quantity ;\n", + " :hasProduct ?product .\n", + " ?product :productID ?productID .\n", + "}\n", + "GROUP BY\n", + " ?productID\n", + "ORDER BY\n", + " DESC(?totalQtySold)\n", + "LIMIT 5" + ] + }, + { + "cell_type": "markdown", + "id": "ee8bf4a4", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know the top 5 largest orders shipped to the USA." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ae32e37d", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?orderID\n", + " (ROUND(SUM(?unitPrice * ?quantity * (1 - ?discount))) AS ?total)\n", + "WHERE {\n", + " ?order a :Order ;\n", + " :orderID ?orderID ;\n", + " :shipCountry \"USA\" .\n", + " ?orderDetail a :OrderDetail ;\n", + " :belongsToOrder ?order ;\n", + " :unitPrice ?unitPrice ;\n", + " :quantity ?quantity ;\n", + " :discount ?discount .\n", + "}\n", + "GROUP BY\n", + " ?orderID\n", + "ORDER BY\n", + " DESC(?total)\n", + "LIMIT 5\n" + ] + }, + { + "cell_type": "markdown", + "id": "60e6393f", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know the orders over 10K shipped to the USA." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e3ac1f76", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + "?orderID\n", + "(ROUND(SUM(?unitPrice * ?quantity * (1 - ?discount))) AS ?total)\n", + "WHERE {\n", + " ?order a :Order ;\n", + " :orderID ?orderID ;\n", + " :shipCountry \"USA\" .\n", + " ?orderDetail a :OrderDetail ;\n", + " :belongsToOrder ?order ;\n", + " :unitPrice ?unitPrice ;\n", + " :quantity ?quantity ;\n", + " :discount ?discount .\n", + "}\n", + "GROUP BY\n", + " ?orderID\n", + "HAVING (SUM(?unitPrice * ?quantity * (1 - ?discount)) > 10000)\n", + "ORDER BY\n", + " DESC(?total)\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "dd70c340", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know the top 5 supplier representatives by number of products sold." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "156af26f", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?supplierContactName\n", + " (COUNT(?product) as ?productCount)\n", + "WHERE\n", + "{\n", + " ?product a :Product ;\n", + " :hasSupplier ?supplier .\n", + " ?supplier a :Supplier ;\n", + " :contactName ?supplierContactName .\n", + "}\n", + "GROUP BY\n", + " ?supplierContactName\n", + "ORDER BY\n", + " DESC(?productCount)\n", + "LIMIT 5" + ] + }, + { + "cell_type": "markdown", + "id": "772f62da", + "metadata": {}, + "source": [ + "## Recommendation Stories" + ] + }, + { + "cell_type": "markdown", + "id": "d949514a", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know which products were bought together in the same order.\n", + ">Query: Customers who bought product-61 also bought which products in the same order and how many times?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39071115", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productA\n", + " ?productB\n", + " (COUNT (*) AS ?productBCount)\n", + "WHERE {\n", + " ?productA ^:hasProduct/:belongsToOrder/^(^:hasProduct/:belongsToOrder) ?productB ;\n", + " :productID ?productID .\n", + " FILTER (?productA != ?productB && ?productA = :product-61) # Filtering on product-61 for testing\n", + "}\n", + "GROUP BY\n", + " ?productA\n", + " ?productB\n", + "ORDER BY\n", + " DESC(?productBCount) ?productA ?productB # Most frequent at the top" + ] + }, + { + "cell_type": "markdown", + "id": "58bfc778", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know which products were bought together across all orders.\n", + ">Query: Customers who bought product-61 also bought which products across all orders and how many times?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "49b4d16e", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productA\n", + " ?productB\n", + " (COUNT (*) AS ?productBCount)\n", + "WHERE {\n", + " ?productA ^:hasProduct/:belongsToOrder/:hasCustomer/^(^:hasProduct/:belongsToOrder/:hasCustomer) ?productB ;\n", + " :productID ?productID .\n", + " FILTER (?productA != ?productB && ?productA = :product-61) # Filtering on product-61 for testing purposes\n", + "}\n", + "GROUP BY\n", + " ?productA\n", + " ?productB\n", + "ORDER BY\n", + " DESC(?productBCount) ?productA ?productB" + ] + }, + { + "cell_type": "markdown", + "id": "2d60ace7", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know how many times two given products were bought by the same customer. \n", + ">Query: How many times were products 2 and 61 bought by the same customer?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "609854ee", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT (COUNT (1) AS ?Count)\n", + "WHERE { \n", + " :product-2 ^:hasProduct/:belongsToOrder/:hasCustomer/^(^:hasProduct/:belongsToOrder/:hasCustomer) :product-61 \n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "id": "fa23ef4c", + "metadata": {}, + "source": [] + }, + { + "cell_type": "markdown", + "id": "76d1c9f2", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know the contact details of suppliers, customers and employees to send out Xmas cards." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5c48f65c", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "PREFIX foaf: \n", + "\n", + "SELECT\n", + " ?contactName\n", + " ?address\n", + " ?city\n", + " ?postalCode\n", + " ?country\n", + "WHERE {\n", + " {\n", + " ?supplier a :Supplier ;\n", + " :contactName ?contactName ;\n", + " :address ?address ;\n", + " :city ?city ;\n", + " :postalCode ?postalCode ;\n", + " :country ?country .\n", + " } UNION {\n", + " ?customer a :Customer ;\n", + " :contactName ?contactName ;\n", + " :address ?address ;\n", + " :city ?city ;\n", + " :postalCode ?postalCode ;\n", + " :country ?country .\n", + " } UNION {\n", + " ?employee a :Employee ;\n", + " foaf:firstName ?firstName ;\n", + " foaf:lastName ?lastName ;\n", + " :address ?address ;\n", + " :city ?city ;\n", + " :postalCode ?postalCode ;\n", + " :country ?country .\n", + " BIND (CONCAT (?firstName, \" \", ?lastName) AS ?contactName)\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "18a6d8c7", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to know all products that belong to the Seafood category and their quantity in stock." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b962e726", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " ?productName\n", + " ?unitPrice\n", + " ?unitsInStock\n", + "WHERE { # outer query\n", + " ?product a :Product ;\n", + " :productName ?productName ;\n", + " :unitPrice ?unitPrice ;\n", + " :unitsInStock ?unitsInStock ;\n", + " :hasCategory ?category .\n", + " { # inner query\n", + " SELECT\n", + " ?category\n", + " WHERE {\n", + " ?category a :Category ;\n", + " :categoryID ?categoryID ;\n", + " :name \"Seafood\" .\n", + " }\n", + " }\n", + "}\n", + "ORDER BY\n", + " ?productName" + ] + }, + { + "cell_type": "markdown", + "id": "bcc90397", + "metadata": {}, + "source": [ + ">### As a Sales Manager, I want to calculate the average number of orders processed per year." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e01012ec", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "SELECT\n", + " (AVG(?orderCount) AS ?avgCount)\n", + " (MIN(?orderYear) AS ?startYear)\n", + " (MAX(?orderYear) AS ?endYear)\n", + "{\n", + " SELECT ?orderYear (count(?order) AS ?orderCount)\n", + " WHERE {\n", + " ?order a :Order ;\n", + " :orderDate ?orderDate ;\n", + " BIND(year(?orderDate) AS ?orderYear)\n", + " }\n", + " GROUP BY\n", + " ?orderYear\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "bfd458ce", + "metadata": {}, + "source": [ + ">### As a Sales Representative, I want to be able to insert a new customer." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5a9a0cf0", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "INSERT DATA {\n", + " :customer-AAAAA a :Customer ;\n", + " rdfs:label \"customer-AAAAA\" ;\n", + " :customerID \"AAAAA\" ;\n", + " :companyName \"Northwind\" ;\n", + " :contactName \"John Lennon\" ;\n", + " :contactTitle \"CTO\" ;\n", + " :address \"Abbey Road\" ;\n", + " :city \"London\" .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "da20429f", + "metadata": {}, + "source": [ + "Checking if new customer has been added successfully" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cd3dfb0c", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "DESCRIBE :customer-AAAAA" + ] + }, + { + "cell_type": "markdown", + "id": "ee1e399c", + "metadata": {}, + "source": [ + ">### As a Sales Representative, I want to be able to update an existing customer." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d442d971", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "# Step 1: Insert new triples for the properties not included in the original insert query\n", + "\n", + "PREFIX : \n", + "\n", + "INSERT DATA {\n", + " :customer-AAAAA a :Customer ;\n", + " :country \"UK\" ;\n", + " :postalCode \"SW1A 2AA\" .\n", + "}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d9b4b6ed", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "# Step 2: Update the value of an existing property (here, the address).\n", + "\n", + "PREFIX : \n", + "\n", + "DELETE {\n", + " :customer-AAAAA :address ?oldAddress\n", + "}\n", + "INSERT {\n", + " :customer-AAAAA :address ?newAddress\n", + "}\n", + "WHERE {\n", + " :customer-AAAAA a :Customer ;\n", + " :address ?oldAddress ;\n", + " BIND(\"10 Downing Road\" AS ?newAddress) .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "73fdc759", + "metadata": {}, + "source": [ + ">Checking if existing customer has been updated successfully" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4c418dc4", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "DESCRIBE :customer-AAAAA" + ] + }, + { + "cell_type": "markdown", + "id": "91eea78e", + "metadata": {}, + "source": [ + ">### As a Sales Representative, I want to be able to delete an existing customer." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8e0a1182", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "\n", + "DELETE {\n", + " :customer-AAAAA ?p ?s \n", + "}\n", + "WHERE {\n", + " :customer-AAAAA ?p ?s .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "0d7d3737", + "metadata": {}, + "source": [ + ">Checking if existing customer has been deleted successfully" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d4ed3bf2", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "DESCRIBE :customer-AAAAA" + ] + }, + { + "cell_type": "markdown", + "id": "e4c2f636", + "metadata": {}, + "source": [ + "## Visualisation" + ] + }, + { + "cell_type": "markdown", + "id": "fcc65a1d", + "metadata": {}, + "source": [ + ">### As a Data Engineer, I want to visualise a graph representation of a given Order.\n", + "Select the `Graph` tab to visualise the graph. Click on a node, e.g. `order-10370`, and select the `Details` icon to see its properties." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4b074adc", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX : \n", + "PREFIX rdf: \n", + "PREFIX rdfs: \n", + "\n", + "CONSTRUCT {\n", + " ?order ?ordPredicate ?ordObject .\n", + " ?orderDetail ?oddPredicate ?oddObject .\n", + " ?product ?prdPredicate ?prdObject .\n", + "} \n", + "WHERE {\n", + " VALUES ?order { :order-10370 }\n", + " ?order a :Order ;\n", + " :hasEmployee ?employee ;\n", + " :hasShipper ?shipper ;\n", + " ?ordPredicate ?ordObject .\n", + " ?orderDetail a :OrderDetail ;\n", + " :belongsToOrder ?order ;\n", + " :hasProduct ?product ;\n", + " ?oddPredicate ?oddObject .\n", + " ?product a :Product ;\n", + " :hasCategory ?category ;\n", + " :hasSupplier ?supplier ;\n", + " ?prdPredicate ?prdObject .\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "4a5514d8", + "metadata": {}, + "source": [ + "## Cleaning up" + ] + }, + { + "cell_type": "markdown", + "id": "a14d56c2", + "metadata": {}, + "source": [ + "### Cleaning up the Northwind data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3ac621ea", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX nwGraph: \n", + "\n", + "DROP GRAPH nwGraph:NorthwindGraph ;" + ] + }, + { + "cell_type": "markdown", + "id": "95af188b", + "metadata": {}, + "source": [ + "### Checking if Northwind Graph is empty" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cbac6e3d", + "metadata": {}, + "outputs": [], + "source": [ + "%%sparql\n", + "\n", + "PREFIX nwGraph: \n", + "\n", + "ASK { GRAPH nwGraph:NorthwindGraph { ?s ?p ?o } }\n" + ] + }, + { + "cell_type": "markdown", + "id": "d9ad69bf", + "metadata": {}, + "source": [ + "## Use Case Template\n", + "\n", + "### Contents\n", + "\n", + "- Outcome\n", + " - Primary business outcomes\n", + " - Secondary business outcomes\n", + "- Personas\n", + "- Concepts\n", + "- Stories\n", + "- 
Workflows\n", + "- Owner\n", + "- Lifecycle state\n", + "- Projects\n", + "- Sub-use cases / Dependent use cases\n", + "- Super use cases\n", + "- Ontologies (including shapes)\n", + "- Datasets (this comes last, not first)\n", + "\n", + "\n", + "### Use Case \\\n", + "\n", + "#### Outcome:\n", + "\n", + "- \\\n", + " - Describe “the why,” what do we want to achieve?\n", + " - Define success, desired/required outcomes.\n", + "- \\\n", + " - All outcomes that are mentioned in the stories.\n", + "\n", + "#### Personas:\n", + "\n", + "- \\ (not actual people names)\n", + "- \\\n", + "\n", + "#### Concepts: (not including personas which are also concepts)\n", + "\n", + "- \\\n", + " - Will later (in the lifecycle of the use case) be linked to ontology axioms (OWL), shape definitions (SHACL) or concepts in generic taxonomies (SKOS)\n", + "\n", + "#### Stories:\n", + "\n", + "- As a \\, I want \\ in order to achieve \\\n", + " - Plain English\n", + " - We will put it in RDF later when your stories have gone through initial agreement with the business\n", + " - i.e., get them agreed first.\n", + " - In the \\ clause, all nouns must be defined as \\’s\n", + " - Next level:\n", + " - Mandatory and Optional Input Concepts\n", + " - Output Concepts (not to the level of JSON API output schemas, just the concepts that occur in the output of a given story)\n", + " - Additional entitlement restrictions\n", + " - Test scenarios with actual test data for each story, every story has at least 1 test scenario in a “Given, When, Then” format\n", + "\n", + "#### Workflows:\n", + "\n", + "- Initially, in plain English\n", + "\n", + "#### Owner:\n", + "\n", + "- \\ / \\ / \\\n", + "- \\\n", + "\n", + "#### Lifecycle state:\n", + "\n", + "- \\ | \\ | \\\n", + "- \\ in the roadmap\n", + " - Required maturity level for the given \\ (data/tech/org/business maturity)\n", + "- \\\n", + "\n", + "#### Projects:\n", + "\n", + "- JIRA projects, issues tied to this use case etc\n", + "- Roadmap tied 
to \\\n", + "- Budgets\n", + "- Teams\n", + "\n", + "#### Sub-use cases / Dependent use cases\n", + "- Which sub use cases need to be delivered in which order first?\n", + "\n", + "#### Super use cases\n", + "- Higher level use case (usually one but could be multiple)\n", + "- The higher level you go in the Use Case Tree the more abstract / broader the type of use case is (business capability, business domain, etc)\n", + "\n", + "#### Ontologies (including shapes):\n", + "- Based on the agreed stories and the agreed list of \\, select the appropriate ontologies to be used\n", + "- Not a discussion with the business, it is an “implementation detail” for specialists.\n", + "- Per use case at least 1 ontology, usually multiple\n", + "- Per concept and linked ontology axiom (class, data, or object property) or per shape: create test instances in test datasets to be used for automated execution of all story test scenarios.\n", + "\n", + "#### Datasets: (this comes last, not first)\n", + "- Identify which (logical) datasets can deliver on the agreed list of concepts and can be mapped to the agreed list of ontologies\n", + "- Work with the DTops team to implement the pipelines that will deliver these datasets into the knowledge graph\n", + "- Define criteria for transform, validate and enrich steps in these pipelines" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}