Graph Data Science 101: Using Neo4j as a Graph Data Science Platform

Tech First
Nov 29, 2020

If you’re going to use graph data science (GDS), you should run it on a platform. In this chapter, we show you the platform pieces Neo4j offers to help you. Neo4j is a graph technology company whose enterprise-grade GDS platform includes four components.

Neo4j supports transactional processing and analytical processing of graph data as well as visualization. It also includes graph storage and compute with data management and analytics tooling. The set of integrated tools includes a common protocol, API, and query language (Cypher) to provide effective access for different uses. In this chapter, we cover each of the four areas of the Neo4j platform in a bit more detail to help you see how your GDS solution fits together.

Neo4j GDS Library

The Neo4j GDS Library offers an enterprise-ready approach to running sophisticated graph algorithms on connected data at scale. Graph analytics and feature engineering add highly predictive relationships to your machine learning (ML) for better results. Algorithms are executed in an analytics workspace that scales computations to handle graphs that contain tens of billions of nodes and relationships. For examples, training, and details on how to use the Neo4j GDS Library, visit neo4j.com/developer/graph-algorithms. You can also go directly to the Neo4j GDS Library at neo4j.com/graph-data-science-library.
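To make the idea of graph feature engineering more concrete, here is a minimal sketch in plain Python. It is not the GDS Library's implementation or API; it only shows, on a made-up toy graph, how a graph algorithm (a simple power-iteration PageRank) can turn relationships into a numeric column that an ML model could use. The function name, the account names, and the damping and iteration settings are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def pagerank(
    edges: List[Tuple[str, str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({name for edge in edges for name in edge})
    out_links: Dict[str, List[str]] = {name: [] for name in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    scores = {name: 1.0 / n for name in nodes}
    for _ in range(iterations):
        # Every node keeps a small base share; the rest flows along relationships.
        new_scores = {name: (1.0 - damping) / n for name in nodes}
        for name in nodes:
            targets = out_links[name]
            if targets:
                share = damping * scores[name] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A node with no outgoing links spreads its score evenly.
                share = damping * scores[name] / n
                for other in nodes:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical toy graph: which account sends money to which.
EDGES = [("alice", "bob"), ("carol", "bob"), ("dave", "bob"), ("bob", "alice")]
SCORES = pagerank(EDGES)

# Attach each score to a (made-up) feature row as an extra column.
FEATURE_ROWS = [
    {"account": name, "pagerank": round(score, 4)}
    for name, score in sorted(SCORES.items())
]
```

In this toy graph, bob ends up with the highest score because three accounts point at it. A real project would run the library's own algorithms against the database rather than a hand-written loop like this one.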

Neo4j Graph Database Management System

The Neo4j Database Management System (DBMS) supports multiple databases that can be run in standalone or clustered installations, and it supports sharding and federated access to databases. Neo4j graph databases are designed to treat the relationships between data as equally important as the data itself. Neo4j is considered a native graph database because the data is stored together with how each individual entity connects with or is related to others. You can find more information about the Neo4j Graph DBMS at neo4j.com/developer/graph-database.
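The idea of storing data together with its connections can be sketched in a few lines of Python. This is only a conceptual illustration, not Neo4j's actual storage format: the Node and Relationship classes and the connect and neighbors helpers are hypothetical names invented for this example. The point is that each node carries direct references to its own relationships, so finding related items means following those references rather than searching a separate table.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class Node:
    label: str
    properties: Dict[str, str]
    # Each node keeps direct references to its own outgoing relationships.
    outgoing: List["Relationship"] = field(default_factory=list)


@dataclass(eq=False)
class Relationship:
    rel_type: str
    start: Node
    end: Node


def connect(start: Node, rel_type: str, end: Node) -> Relationship:
    """Create a relationship and store it on the start node itself."""
    rel = Relationship(rel_type, start, end)
    start.outgoing.append(rel)
    return rel


def neighbors(node: Node, rel_type: str) -> List[Node]:
    """Follow the node's own references; no lookup in a separate table."""
    return [rel.end for rel in node.outgoing if rel.rel_type == rel_type]


jennifer = Node("Person", {"name": "Jennifer"})
graphs = Node("Technology", {"type": "Graphs"})
connect(jennifer, "LIKES", graphs)

liked = [node.properties["type"] for node in neighbors(jennifer, "LIKES")]
```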

Cypher Declarative Query Language

Cypher is the most widely adopted, fully defined, and open query language for property graph databases. It is a declarative, SQL-inspired language for describing visual patterns in graphs by using ASCII-art syntax. You can state what you want to select, insert, update, or delete from your graph data without describing how to do it. Cypher is intended to be readable. For example, the phrase “Jennifer likes graph technology” would be written as

(p:Person {name: "Jennifer"})-[rel:LIKES]->(g:Technology {type: "Graphs"})

Cypher basics and learning resources can be found on the Cypher page for Neo4j developers at neo4j.com/developer/cypher-query-language.
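The pattern above describes a shape in the graph; to do something with it, you place it inside a full Cypher statement. The Python sketch below shows one way that might look. The helper name likes_statement and the parameter names person and technology are our own assumptions, not part of any Neo4j API. The helper only builds the statement text and its parameters, so it runs without a database.

```python
from typing import Any, Dict, Tuple


def likes_statement(person: str, technology: str) -> Tuple[str, Dict[str, Any]]:
    """Return a Cypher statement plus the parameters it expects."""
    query = (
        # MERGE creates each node or relationship only if it is missing.
        "MERGE (p:Person {name: $person}) "
        "MERGE (g:Technology {type: $technology}) "
        "MERGE (p)-[rel:LIKES]->(g) "
        "RETURN p.name AS person, type(rel) AS relationship, g.type AS technology"
    )
    # Values travel separately as parameters instead of being pasted into the text.
    return query, {"person": person, "technology": technology}


QUERY, PARAMS = likes_statement("Jennifer", "Graphs")
```

If you have the official Neo4j Python driver installed and a database to connect to, a pair like this could be handed to a session's run method. Keeping the values in parameters rather than in the statement text also means the same statement can be reused for other people and technologies.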

Neo4j Desktop and Neo4j Browser

Neo4j Desktop is a user interface for operating local databases. Neo4j Browser is a general-purpose user interface for working with the Neo4j database and is a core component of Neo4j Desktop. Developers and data scientists can use this tool to query, visualize, administer, and monitor their databases. The diagram in Figure 4–1 shows the Neo4j Browser being used against a fraud graph.

Neo4j Bloom

Neo4j Bloom is a graph visualization and exploration tool that allows you to find patterns in a Neo4j graph by using a codeless search paradigm. It uses an interactive point-and-click interface to expand and refine results, find interesting paths, and share insights with others.

Bloom is intended for ad hoc visual exploration and fast prototyping, with type-ahead search suggestions and direct editing of nodes and relationships. The visual presentation has flexible color, size, and icon schemes to help differentiate influential items, and the styling can be based on the results of running algorithms from the GDS Library (see the earlier section in this chapter titled “Neo4j GDS Library”).

Figure 4–2 shows the Bloom interface for an example of restaurant reviews that can be exported and shared.
