What I took away from GraphConnect Europe 2016

Mingling in the exhibition hall at GraphConnect.

The conference took place at the QEII Centre, London. The day began in the Flemming Hall, filled with close to 700 attendees, with a welcoming talk by Neo Technology’s Rik Van Bruggen and Holger Temme, followed by Emil Eifrem’s opening keynote. Emil took us on a 45-minute journey starting from the challenges that many companies were facing around the turn of the century with SQL database management systems, setting the scene for the “napkin model” (how the graph database idea was first drafted) that would eventually become today’s Neo4j. The keynote concluded enthusiastically with the announcement of Neo4j’s latest release, v3.0.

During the break we got a chance to walk around the venue, meet the sponsors and mingle with other attendees. What caught everyone’s eye were the people in white lab coats: Neo Technology’s engineers, who were there to answer any question one could have about Neo4j. Among them was Nicole White, a Data Scientist specialising in recommendations. I was able to briefly discuss some ideas with her about applying community detection algorithms to graphs. The problem I was interested in solving was how to use such algorithms to account not only for a node’s topology (its position in the graph and the relationships it shares with its neighbours) but also for whether it shares common features with other nodes. Her response gave me a short peek into the content of her presentation, which was part of the Neo4j Deep Dive track.

The conference was essentially broken into four separate tracks: the Business Impact track covered use cases showing how Neo4j is being used in the market; the Neo4j Deep Dive track covered technical talks on various Neo4j aspects, functionalities, new features and integrations; the Neo4j in Action track covered novel ideas that best illustrate the impact of exploiting Neo4j in practice; and the Lightning Talks covered innovative new ideas on how to use Neo4j.

I started my day with Mark Needham and Michael Hunger in the Deep Dive track. Their talk was on importing data: through a series of examples, they described the different tools available for importing both transactional data and bulk initial data sets. In the pre-v3.0 editions, one could import CSV files into Neo4j or, more generally, text files with different delimiters. With Neo4j v3.0, importing files reached a whole new level. Michael introduced a number of procedures, from the long list of APOC procedures, that can be used to import JSON files or to load data from RDBMSs and web APIs (JSON, XML, CSV). [Geek alert: APOC stands for “Awesome Procedures On Cypher”, but Apoc is also the technician and driver on board the “Ne-bu-cha-dne-zzar” (couldn’t read this without the hyphens) in the Matrix movie, who was killed by (guess who) Cypher.]

One of the most important new features of Neo4j 3.0 is the official language drivers, which were introduced by Nigel Small and Stefan Plantikow in the next half-hour session I followed, again in the Deep Dive track. The talk was about the new uniform drivers for Java, Python, .NET and JavaScript that were released as part of v3.0, as well as the binary protocol that powers them. It also covered the procedures feature again, in more detail, explaining how it gives one the ability to streamline imperative graph operations and thus to extend functionality beyond the graph database, something which I consider to be an important milestone for Neo4j (I’ll revisit this shortly).

The next session I followed was given by Stelios Georgiannakis, a Senior Engineer VP at the Royal Bank of Scotland, and was about managing dependencies using Neo4j. Stelios introduced Zambezi, an HTML5 and Java micro-services framework used for application development, and explained how rolling out changes in the framework affects numerous applications and the developers responsible for maintaining them. Identifying and resolving these dependencies was anything but trivial. However, by leveraging Neo4j’s intuitive structure and dynamic schema (entities and relationships can be added to, or removed from, the current schema without significant costs or undesired effects on the system), dependency management became a lot easier. As Stelios puts it, “we just add nodes and relationships and… it works!”
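To make the dependency idea a little more concrete, here is a minimal Python sketch, not the RBS implementation. It assumes a toy "depends on" graph whose component names are entirely hypothetical (they are not Zambezi's actual modules), and it answers the question the talk revolved around: if one component changes, which others are affected, directly or transitively?

```python
from collections import deque

# Hypothetical DEPENDS_ON edges: each key depends on the components in its list.
# The names are invented for illustration only.
DEPENDS_ON = {
    "trading-app": ["ui-widgets", "auth-service"],
    "reporting-app": ["ui-widgets"],
    "ui-widgets": ["core-framework"],
    "auth-service": ["core-framework"],
    "core-framework": [],
}


def affected_by(depends_on, changed):
    """Return every component that directly or transitively depends on `changed`."""
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first traversal over the inverted edges.
    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


if __name__ == "__main__":
    print(affected_by(DEPENDS_ON, "core-framework"))
    # "Just add nodes and relationships": a new application needs no schema change.
    DEPENDS_ON["new-app"] = ["auth-service"]
    print(affected_by(DEPENDS_ON, "auth-service"))
```

In a graph database the same question would typically be a single variable-length path query rather than hand-written traversal code, which is presumably part of the appeal.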

I wasn’t able to attend any other talks from the Business Impact track, but I had a chance to watch most of them on the GraphConnect website. It seems to me that similar needs and problems recurred across all the talks in the track, which explains why these teams turned to graphs and graph database management systems to solve some important business use cases. Whether one is concerned with developing a product information management solution, increasing comprehension through intuitive data visualisations, or managing trust and achieving end-to-end transparency in product supply chains, shifting one’s perspective towards graphs often appears to be part of the solution.

I spent the rest of my day in the Deep Dive track, following Nicole White’s talk on recommendation approaches, ranging from simple generalisations of the triadic closure to machine learning algorithms for clustering such as k-means, followed by William Lyon’s talk on NoSQL polyglot persistence. The latter is the idea of using multiple database technologies to power a single application, enhancing the application by taking advantage of the strengths of each technology. How to tackle the added complexity that follows from such integrations was also discussed, and this is what really caught my attention. William referred to Bolt, the binary protocol for Neo4j, and the language drivers that come with the v3.0 update and build on it, which allow one to do things like build a Neo4j Spark connector. They also allow one to build embeddable graph visualisations like the one below, which represents [Geek alert!] the Game of Thrones interactions graph with centrality measures indicating the importance of each node/character in the series.

Into graphs and Game of Thrones? Then check out this paper.
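Coming back to the recommendation approaches from Nicole White’s talk: the simplest one, triadic closure, boils down to "if two people share many neighbours, they should probably be connected". The following is a minimal Python sketch of that idea, not code from the talk. The graph, the names and the `recommend` helper are all made up for illustration.

```python
from collections import Counter

# Toy undirected "knows" graph; names and edges are invented for illustration.
FRIENDS = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan", "eve"},
    "dan": {"bob", "cat"},
    "eve": {"cat"},
}


def recommend(graph, person, limit=3):
    """Rank people `person` is not yet connected to by number of shared neighbours."""
    direct = graph[person]
    counts = Counter()
    for friend in direct:
        for candidate in graph[friend]:
            # Skip the person themselves and anyone they already know.
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Highest shared-neighbour count first; ties broken by name for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


if __name__ == "__main__":
    for name, shared in recommend(FRIENDS, "ann"):
        print(f"suggest {name} ({shared} shared connection(s))")
```

Each suggestion, once accepted, closes one or more open triangles in the graph, which is where the name comes from. The clustering-based approaches from the talk go further by also taking node features into account.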

The four tracks were followed by a talk from Mar Cabra on how the ICIJ used Neo4j to unravel the Panama Papers, analysing and revealing hidden, potentially illegal, relationships between bank accounts and individuals in what is believed to be the biggest leak in journalism history. Take a look here: it is fascinating.

The conference closing keynote was given by Dr. Jim Webber (Chief Scientist, Neo Technology), who took us on a 50-minute journey: from Amazon transactions and the TfL journey planner, to a fully connected graph with stable triadic closures that helps explain how World War II came to be, to how USA-UK love relationships are affected by France (a funny way to explain, in graphs, that reliability is a far greater concern than availability).

Overall, the conference was both informative and fun, offering attendees an all-round perspective of the upgraded version of Neo4j as well as a chance to find out about many inspiring use cases. Looking forward to the next one.