What I Learned at Neo4j’s NODES 22 Conference
Let’s just say there is a lot happening these days in data science, and in particular in data science assisted by artificial intelligence and machine learning (AI/ML). It is already having profound effects on real-world computing applications in a number of areas, spanning the sciences, economics, industrial applications and health care.
In many ways, we are just on the cusp of learning what’s possible. Gartner predicts that by 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision-making across the organization on a business level.
In parallel, the tools used for data science — and more specifically, data visualization and graphical representation — are fueling these advances.
“Graph data science is when you want to answer questions, not just with your data, but also the connections between the data points,” Luke Gannon, product manager at Neo4j, said during his talk at NODES 22, Neo4j’s annual virtual developers’ conference, held in November.
“And this is really important because when you have these connections, they allow you to answer new questions, like who is the most important, or what’s the best choice for one thing or what might happen next?”
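To make the “who is the most important” question concrete, here is a minimal sketch of a PageRank-style centrality score computed over relationships rather than over individual records. It is plain Python for illustration only, not Neo4j’s Graph Data Science library; the `pagerank` function, the `follows` data and the names in it are all invented for this example.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Return a PageRank-style score per node for directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for source, target in edges:
        out_links[source].append(target)

    # Start with the score spread evenly across all nodes.
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing edges is shared evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                # Each node passes its score along its outgoing relationships.
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical "who follows whom" relationships, invented for illustration.
follows = [("ana", "cy"), ("bo", "cy"), ("dee", "cy"), ("cy", "ana"), ("dee", "bo")]
scores = pagerank(follows)
most_important = max(scores, key=scores.get)
```

In this toy network `cy` ends up with the highest score because most relationships point there. The answer comes from the connections, not from any attribute stored on the nodes themselves.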
Graph data science is but one of the more exciting subjects that fall under the data science umbrella. Graph databases, data visualization, AI/ML pipelines and applications were among the topics and use cases covered during NODES 22. The conference also served as a launch party of sorts for Neo4j 5, a major new release of the company’s signature technology.
Here’s what I learned at NODES 22 about what’s possible from data graphs and data science.
Social Connections: More Powerful Than You Think
@Yale professor’s @NAChristakis: “It’s not who you are that matters. it’s how the network is organized and structured that might elicit this property of cooperation.” @neo4j‘s conference NODES 22. #livetweet #NODES2022 pic.twitter.com/cUuue9Gjaa
— BC Gain (@bcamerongain) November 16, 2022
Many of the data use cases described during the conference show how ML, applied to analysis of data involving human subjects, illuminates how interactions in a network can have major influences on human behavior. The term “social contagion” aptly applies in these cases, showing how the direct and indirect actions between social connections can alter not only individual human behavior but also the general behavior of entire groups.
The ability to draw inferences and patterns based on relationship data in groups’ networks may serve either noble or nefarious purposes. Point-of-contact data was particularly useful to help trace the source of contagion for Covid-19 patients during the earlier stages of the pandemic.
On the other hand, Russian government-backed groups often use bots to influence behavior by targeting people in the U.S. and Europe who are susceptible to propaganda — and who might have strong influence over others in their network.
Most use cases fall somewhere in between these extremes of benevolence and malice, such as using data graphs to pinpoint those in a social network who have the most influence over others to purchase products.
This is insane: The more bots interacting with humans to solve problems, the better. And large quantities of dumb bots work very well! @Yale professor’s @NAChristakis @neo4j‘s conference NODES 22. #livetweet #NODES2022 @thenewstack pic.twitter.com/mZOi2Ltlag
— BC Gain (@bcamerongain) November 16, 2022
“Human beings are embedded in social networks. These networks obey very particular biological, psychological, sociological and mathematical principles. And taking this into account offers us tremendous opportunities to gain new insights into behaviors and also to change,” said Nicholas A. Christakis, a Yale University professor and freelance scientific adviser, during a keynote speech.
“We can use an understanding of social network structure and function for good to intervene in both online and offline worlds in order to enhance our health and our well-being, our public policy and our business.”
Taking a closer look, through data science, at how humans and their networks are embedded involves a shift in focus to the “externalities of intervention,” Christakis said. “It engages us in the exploration of how it is that when we intervene in a group, how we affect not just the people that we target, but also all the other people around them.”
Using data graphs and analysis and visualization tools, Christakis has uncovered some often startling results about how people’s behavior affects others who are not directly connected to them.
Indeed, he said, “one of the most bizarre results that has come out of my lab in the last few years” came after he used simple artificial linear networks to trace back the sequence of interactions that individuals had. The study analyzed how acts of altruism affect both direct and indirect social connections.
“What we were able to find is that this kind of altruistic effect could spread from person to person,” he said.
Social contagion can apply to any social setting. How two conference attendees treat each other may depend on how two other attendees treat each other, even though no member of one pair ever interacted with any member of the other pair, Christakis said.
“This is experimental documentation of social contagion,” he said.
App Interfaces Can Be Simple but Powerful
The rapid adoption of data science, and in particular of data visualization and data graph tools, is largely attributable not only to their power but also to their ease of use. Neo4j has capitalized on this trend by supporting projects outside the traditional sphere of data science, for applications beyond the IT sector.
One case in point: A group of journalists used data science tools such as Neo4j to trace connections among more than 400,000 individuals connected with secret offshore accounts. Frederik Obermaier and Bastian Obermayer, both German reporters, wrote the project, called the “Paradise Papers,” for the Süddeutsche Zeitung newspaper.
The journalists’ use of Neo4j graph databases allowed them to better visualize and analyze connections between individuals and organizations, such as hidden offshore banks and companies. These data points can now be accessed with just a few commands, in a more intuitive and intelligent way than SQL, NoSQL or other kinds of databases would have provided.
Through demos and talks, NODES 22 showed how Neo4j continues to improve its platform and tools so that graph-data calculations and visualizations can be achieved in a scalable way.
By leveraging machine learning, Neo4j can now be used to run different algorithms and processes involving billions of nodes and relationships. The ML pipelines can be integrated with Python and other ML frameworks, while different sets of databases can be integrated into a single data visualization panel.
Neo4j’s capabilities, such as data inferences and governance, may be innovative, but graph and other data visualizations should be simple to use. The graph data in the Neo4j deployment at J.B. Hunt Transport Services, a supply chain and transportation services provider, could be described as “boring,” acknowledged Donovan Bergin, a technical solutions architect at J.B. Hunt, during his talk.
But that’s exactly what his company sometimes needs, he added: “We built a boring graph: You just got your equipment. You’ve got operations and other stuff that I’m not allowed to talk about today.”
@JBHuntDrivers‘s Donovan Bergin how J.B. Hunt employs telemetry for governance of its transportation fleet. Interesting how inferances from data allow for increasing switching of truck to train transport for lower CO2. #NODES2022 @neo4j #NODES2022 @thenewstack pic.twitter.com/Bd5Uzvv6Ts
— BC Gain (@bcamerongain) November 16, 2022
J.B. Hunt uses the graph data to monitor equipment, with telemetry and sensor readings for location tracking, device alerts if voltages are too low, and sensor readings if temperatures are too high or too low or if “we’re going in the wrong direction,” Bergin said.
Other data provided can include visualizations of different hubs, or nodes, and their connections by rail or other links. Data science can be used to, among other things, determine and predict how critical different nodes are for logistics.
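One simple way to think about how critical a node is: take it offline and see how much of the network gets cut off. The sketch below does that in plain Python. It illustrates the general idea only and is not J.B. Hunt’s method or a Neo4j API; the hub names and links are hypothetical.

```python
from collections import deque


def reachable(adjacency, start, removed):
    """Return the hubs reachable from start when the `removed` hub is offline."""
    seen = {start}
    queue = deque([start])
    while queue:
        hub = queue.popleft()
        for neighbor in adjacency[hub]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def criticality(links):
    """For each hub, count the hubs left outside the largest connected group if it fails."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    scores = {}
    for hub in adjacency:
        remaining = set(adjacency) - {hub}
        unvisited = set(remaining)
        largest = 0
        while unvisited:
            # Explore one connected group of the network without the failed hub.
            component = reachable(adjacency, next(iter(unvisited)), removed=hub)
            largest = max(largest, len(component))
            unvisited -= component
        scores[hub] = len(remaining) - largest
    return scores


# Hypothetical rail and road links between hubs, invented for illustration.
links = [
    ("north", "central"),
    ("central", "south"),
    ("central", "east"),
    ("north", "west"),
    ("west", "south"),
]
scores = criticality(links)
```

Here only `central` has a nonzero score, because `east` has no other route into the network. A production analysis would more likely weight links by capacity or traffic, which this sketch ignores.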
Neo4j Has Made Significant Improvements
Neo4j 5, a major release, offers a number of new features, such as the ability to integrate multiple data graphs, along with improvements to scalability and flexibility. Improvements have also been made to drivers, to query functionality and graphs using Neo4j’s graph query language, Cypher, and to indexing. The latest version of the platform was also designed to make it easier to run and manage Neo4j clusters.
The autonomous clustering functionality, which enables elasticity within a cluster, is “perhaps one of the most sophisticated clustering architectures in the database,” according to Stu Moore, product manager at Neo4j.
“The key kind of innovation and change within string technology has been that you no longer need to run a copy of the database on every single server within the cluster,” Moore said.
Another key feature is that server-side routing is turned on by default in Neo4j 5, for use with load balancers and other network technologies in the cloud. With it, queries are internally routed to the appropriate database management servers.
Neo4j 5 represents a new release model for the graph database management provider. Previously, each new release was followed by incremental bug fixes or security patches that represented subsequent versions of the main release.
However, the new release model “is really going to be more like what you would expect from a cloud-first vendor that releases software in a continuous fashion,” said John Stegeman, graph database product specialist, during his conference talk.
The new release model will follow that of Neo4j’s managed cloud platform, AuraDB. “This is always what we’ve done with our Aura product, as new features of 5.1, 5.2, 5.3, etc. are coming more or less on a continuous basis over time,” Stegeman said.
“What’s changing with Neo4j 5 is we’re bringing the experience to the self-hosted community: people who are running Neo4j in their own data center, or are self-managing their own deployments on the cloud vendors,” Stegeman said.
Neo4j 5 will no longer support entry and exit (B-Tree) indexes for database queries, which have been replaced with more elegant and nimble range and point indexes. This represents a “significant” change in Neo4j 5, Stegeman said.
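For readers who have not yet used the newer index types, the sketch below builds the Cypher statements that create a range index and a point index in Neo4j 5. The Python helper, the index names and the `Truck` label with its `id` and `location` properties are all hypothetical, chosen only to illustrate the statement shape; the statements would still need to be run against a database through a driver or the browser.

```python
def create_index_statement(kind, name, label, prop):
    """Build a Cypher CREATE INDEX statement for a single-property node index."""
    if kind not in ("RANGE", "POINT"):
        raise ValueError(f"unsupported index kind: {kind}")
    # Identifiers are interpolated directly, so only pass trusted, fixed names.
    return f"CREATE {kind} INDEX {name} IF NOT EXISTS FOR (n:{label}) ON (n.{prop})"


# Hypothetical label and property names, invented for illustration.
statements = [
    create_index_statement("RANGE", "truck_id_range", "Truck", "id"),
    create_index_statement("POINT", "truck_location_point", "Truck", "location"),
]

for statement in statements:
    print(statement)
```

A range index suits equality and range lookups on ordinary values, while a point index targets spatial point values such as coordinates.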
Neo4j 5 is intended to serve as a point of departure for the company, as it seeks to improve and keep up with the scaling and performance demands required for many of the use cases described at NODES 22.
Said Moore during his talk, “This is really an exciting release for us — not just thanks to the features that we’re delivering — but because this is the first time we’ve released our entire Neo4j product platform at the same time.”
Data Science Is Beautiful and Scary
Anyone remotely familiar with data science these days knows how it has transformed computing and applications in a number of sectors. But, in many ways, what stood out at this conference was not so much what was said as what was left unsaid.
Christakis noted how inferences and analyses used to affect social connections and influences can be used for the good of society — without delving into how these very high-powered ML-assisted applications can also be used for nefarious purposes.
Also, as the conference talks reflected, data science should have an even more profound effect on a number of industries in the future than it has in the past. It will certainly be exciting to see what comes next.
Where to Go from Here
- See any portion of the entire NODES 22 conference from Neo4j’s NODES playlist on YouTube.
- Neo4j’s new guide, “Graph Databases Explained,” introduces you to the transformative power of relationships in problem-solving.
- Learn more about the underappreciated aspects of graph databases in this article from The New Stack.
- Neo4j GraphAcademy offers complete, self-paced, multipart video courses designed to build your skills to the point you can add them to your résumé.
- Register right now for Neo4j AuraDB Free and take your time learning about graphs and Cypher. There will never be a fee to use anything you’ve built on the Free tier.
- Explore more about the Cypher language in The Neo4j Cypher Manual.
- Watch Neo4j’s Chris Gioran explain the principles and value of graph databases step-by-step, in his comprehensive video series, “Under the Hood.”
- Download your free copy of “Full Stack GraphQL Applications with React, Node.js, and Neo4j.”