
Build a Movie Database with Neo4j’s Knowledge Graph Sandbox

For this tutorial, we'll show how to amass otherwise disparate databases of actors, movies and directors and show how they are connected, using Neo4j’s Sandbox.
Jul 29th, 2023 9:00am by
Feature image by David Mark from Pixabay.

A lot has been written about the rise of the citizen developer or even the citizen data scientist, and when it comes to the creation and use of knowledge graphs, they can do some pretty amazing things.

Definitions of knowledge graphs vary, but for the purposes of this article in which we walk you through the build of a Neo4j graph, a knowledge graph is a visualization of the connected nature of different data sets. It can be thought of as an augmented model view of a master data-management solution that shows how different groups, objects or other data points are connected.

For this tutorial, we’ll show how to bring together otherwise disparate databases of actors, movies and directors and show how they are connected. The idea is to show users how simple graph tools such as those Neo4j provides can be. Specifically, we’ll use Neo4j’s Sandbox with Neo4j’s Cypher language to visualize, as data graphs, movies released after 2000 while limiting the results to a specific number, such as five movies. Actor, producer and other connections to those movies are also visualized. This data graph can be generated with just a few clicks on the Sandbox site, using Cypher to query the Neo4j Database.

Once these databases have been selected, Neo4j automates the sandbox’s build. It’s really that simple, as we’ll see. So let’s get started creating our movie database by first accessing Neo4j’s Sandbox page and either registering or logging on.

Select Neo4j Sandbox Under Get Started:

Sign up or create an account with Google, Twitter or LinkedIn:

 

Select “For Developers,” which will automatically un-select “For Data Scientist,” and then select the Movies dataset:

Select Create at the bottom left-hand corner of the screen:

Select Open with Browser:

The text on the left of the screen, which remains as you complete all of the exercises in the Sandbox, guides you through the commands in Cypher, a graph query language used to query the Neo4j Database. Just as SQL is used to query a MySQL database, you use Cypher to query the Neo4j Database. All queries and commands can be copied and pasted directly into the command line, as the accompanying text on the side of the screen indicates.
We begin by creating a sample query to return all the movies in the database released after 2000 but limited to five items:
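
The Sandbox guide supplies the exact query. As a minimal sketch, assuming the Movies dataset's Movie label and its released property, a query of this shape would look like:

```cypher
// Match every node labeled Movie, keep those released after 2000,
// and cap the result at five nodes.
MATCH (m:Movie)
WHERE m.released > 2000
RETURN m
LIMIT 5
```

Without an ORDER BY clause, which five movies come back is not guaranteed.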

The next steps in the Sandbox offer a summary of Cypher and related descriptions of knowledge graph terms such as Nodes and Relationships, Labels and Properties and how they are used. Click Next for each:

–Use the CREATE clause to create your personal node:
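
A minimal sketch of such a statement, assuming the Person label and name property used by the Movies dataset; the value 'Your Name' is only a placeholder:

```cypher
// Create one Person node and return it so it shows up in the result pane.
// 'Your Name' is a placeholder, not a value from the dataset.
CREATE (p:Person {name: 'Your Name'})
RETURN p
```

Note that running CREATE twice produces two separate nodes, because CREATE does not check for an existing match.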


–Use the MATCH clause for Node matches with actor Tom Hanks:
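
One way to write this match, assuming the actor is stored as a Person node whose name property is 'Tom Hanks':

```cypher
// The property map inside the braces acts as an equality filter on name.
MATCH (p:Person {name: 'Tom Hanks'})
RETURN p
```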


You can also use a WHERE clause which allows for more complex filtering including >, <, STARTS WITH, ENDS WITH, etc. with the Match clause:

Here, we find the movie Cloud Atlas by its title with the MATCH clause, and movies released between 2010 and 2015 by adding a WHERE clause:
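
The two lookups might be sketched as follows, assuming the Movie nodes carry title and released properties. Whether the range bounds are inclusive is an assumption here; this sketch treats them as exclusive. Run each statement on its own.

```cypher
// 1) Find a single movie by its exact title.
MATCH (m:Movie {title: 'Cloud Atlas'})
RETURN m
```

```cypher
// 2) Find movies released strictly between 2010 and 2015.
MATCH (m:Movie)
WHERE m.released > 2010 AND m.released < 2015
RETURN m
```

String predicates follow the same pattern, for example WHERE m.title STARTS WITH 'Cloud'.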

–Write a query using MERGE to create a movie node with title “Greyhound.” As noted in the Sandbox’s sidebar documentation, if the node does not exist then set its released property to 2020 and lastUpdatedAt property to the current time stamp. If the node already exists, then only set lastUpdatedAt to the current time stamp. Return the movie node:
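
A sketch of a query that follows those requirements, using Cypher's ON CREATE and ON MATCH subclauses and the built-in timestamp() function:

```cypher
// MERGE finds the node if it exists, otherwise creates it.
MERGE (m:Movie {title: 'Greyhound'})
// Runs only when the node was just created.
ON CREATE SET m.released = 2020, m.lastUpdatedAt = timestamp()
// Runs only when the node already existed.
ON MATCH SET m.lastUpdatedAt = timestamp()
RETURN m
```

Running it a second time should leave released unchanged and only refresh lastUpdatedAt.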


Relationships have a direction, either outgoing or incoming, denoted in Cypher by the arrows -> or <-. In this query, Person (Tom Hanks) has an outgoing relationship and Movie has an incoming relationship:
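
As a minimal sketch, assuming the relationship in question is the dataset's ACTED_IN type, the direction is written with an ASCII arrow pointing from Person to Movie:

```cypher
// The arrow leaves the Person node (outgoing) and arrives at the Movie node (incoming).
MATCH (p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie)
RETURN p, r, m
```

Returning p, r and m together lets the graph view draw both the nodes and the edges between them.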

The results, zoomed out.

The results, zoomed in.

Find the Person and Movie nodes that are connected by a REVIEWED relationship that is outgoing from the Person node and incoming to the Movie node:
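
A sketch of this pattern, with the variable names p, r and m chosen here for illustration:

```cypher
// REVIEWED points from the reviewer (Person) to the reviewed Movie.
MATCH (p:Person)-[r:REVIEWED]->(m:Movie)
RETURN p, r, m
```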


Find all actors who have co-acted with Tom Hanks in any movie, viewing the results in Table mode:
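
One way to express this, assuming co-acting means two Person nodes share an ACTED_IN relationship to the same Movie; the alias coActor is a name chosen for this sketch:

```cypher
// Walk from Tom Hanks to each movie he acted in, then back out to the other actors.
MATCH (tom:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
// DISTINCT collapses actors who appear with him in more than one movie.
RETURN DISTINCT coActor.name AS coActor
```

Because the query returns a property rather than whole nodes, the output is a list of names, which suits Table mode.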


A Range of Possibilities

These are just a sample of the commands the Sandbox offers for the Movies Database. The Sandbox offers 20 databases in all, each with accompanying explanations to help you get started using Cypher for Neo4j; 13 are oriented toward developers and the rest are geared toward data scientists. The datasets range from the Offshore Leaks dataset and guide from the International Consortium of Investigative Journalists (ICIJ) to Stack Overflow questions, including answers, tags and comments and the relationships between them.

Most of all, have fun!
