Northwind Graph
From RDBMS to Graph, using a classic dataset
TheNorthwind Graph demonstrates how to migrate from a relational database to Neo4j. The transformation is iterative and deliberate, emphasizing the conceptual shift from relational tables to the nodes and relationships of a graph.
This guide will show you how to:
- Load: create data from external CSV files
- Index: index nodes based on label
- Relate: transform foreign key references into data relationships
- Promote: transform join records into relationships
Product Catalog
Northwind sells food products in a few categories, provided by suppliers. Let's start by loading the product catalog tables.
The load statements to the right require public internet access.LOAD CSV
will retrieve a CSV file from a valid URL, applying a Cypher statement to
each row using a named map (here we're using the name `row`).
Load records
LOAD CSV WITH HEADERS FROM "http://data.neo4j.com/northwind/products.csv" AS row CREATE (n:Product) SET n = row, n.unitPrice = toFloat(row.unitPrice), n.unitsInStock = toInt(row.unitsInStock), n.unitsOnOrder = toInt(row.unitsOnOrder), n.reorderLevel = toInt(row.reorderLevel), n.discontinued = (row.discontinued <> "0")
LOAD CSV WITH HEADERS FROM "http://data.neo4j.com/northwind/categories.csv" AS row CREATE (n:Category) SET n = row
LOAD CSV WITH HEADERS FROM "http://data.neo4j.com/northwind/suppliers.csv" AS row CREATE (n:Supplier) SET n = row
Create indexes
CREATE INDEX ON :Product(productID)
CREATE INDEX ON :Category(categoryID)
CREATE INDEX ON :Supplier(supplierID)
Product Catalog Graph
The products, categories and suppliers are related through foreign key references. Let's promote those to data relationships to realize the graph.
Create data relationships
MATCH (p:Product),(c:Category) WHERE p.categoryID = c.categoryID CREATE (p)-[:PART_OF]->(c)
MATCH (p:Product),(s:Supplier) WHERE p.supplierID = s.supplierID CREATE (s)-[:SUPPLIES]->(p)
Query using patterns
MATCH (s:Supplier)-->(:Product)-->(c:Category) RETURN s.companyName as Company, collect(distinct c.categoryName) as Categories
MATCH (c:Category {categoryName:"Produce"})<--(:Product)<--(s:Supplier) RETURN DISTINCT s.companyName as ProduceSuppliers
Customer Orders
Northwind customers place orders which may detail multiple
products.
Load and index records
LOAD CSV WITH HEADERS FROM "http://data.neo4j.com/northwind/customers.csv" AS row CREATE (n:Customer) SET n = row
LOAD CSV WITH HEADERS FROM "http://data.neo4j.com/northwind/orders.csv" AS row CREATE (n:Order) SET n = row
CREATE INDEX ON :Customer(customerID)
CREATE INDEX ON :Order(orderID)
Create data relationships
MATCH (c:Customer),(o:Order) WHERE c.customerID = o.customerID CREATE (c)-[:PURCHASED]->(o)
Customer Order Graph
Notice that Order Details are always part of an Order and that theyrelate the Order to a Product — they're a join table. Join tables are always a sign of a data relationship, indicating shared information between two other records.
Here, we'll directly promote each OrderDetail record into a relationship in the graph.
Load and index records
LOAD CSV WITH HEADERS FROM "http://data.neo4j.com/northwind/order-details.csv" AS row MATCH (p:Product), (o:Order) WHERE p.productID = row.productID AND o.orderID = row.orderID CREATE (o)-[details:ORDERS]->(p) SET details = row, details.quantity = toInt(row.quantity)
Query using patterns
MATCH (cust:Customer)-[:PURCHASED]->(:Order)-[o:ORDERS]->(p:Product), (p)-[:PART_OF]->(c:Category {categoryName:"Produce"}) RETURN DISTINCT cust.contactName as CustomerName, SUM(o.quantity) AS TotalProductsPurchased
Northwind Graph
Next steps
More code
- Movie Graph - actors & movies
- Query Templates - common ad-hoc queries
- Cypher - query language fundamentals