Making a Graph DB ep2


In the first article about my Graph Database, I laid out some starting ideas and set myself a few tasks. I did my homework, read the basics about all the databases I mentioned, and discovered lots of new things.

In this episode, I’ll focus on learning more about who uses graphs, what they are doing with them, why they chose graphs, and maybe something about their implementations (but that’s for the next episodes).

Apparently there’s quite a lot of interest in Graph Databases everywhere, and most IT giants have implemented something “in house”, using the databases they already had and some “software duct-tape”.
Obviously, when you already have a huge infrastructure, you want to migrate your data as gradually and painlessly as possible, so you build on top of the old system.

There are many enterprise solutions, but they are crazy expensive and need weeks of training. There are also a number of free & open-source options, but most of them are impossible to deploy on my tiny MacBook Air, so I’ll only use them for inspiration.

Here are some relevant questions & answers on Quora:

And a few articles:

Use-cases from the big boys:

Things to study more:

  • Google’s Knowledge Graph
  • Facebook’s Social Graph
  • Twitter’s Interest Graph
  • Microsoft’s Graph

So, to wrap this up, here are some answers to my questions:

❓: Who uses graphs?
❗️: Everybody who is somebody :D

❓: What are they using graphs for?
❗️: Diverse stuff. General interaction network analysis, social networks (following / followers / friends), genomics / biology, route finding, modelling risks in financial systems, fraud detection, identity and access management, network and IT operations, recommendation engines, internet of things, machine learning, etc.

❓: Why are they using graphs, instead of other types of DBs?
❗️: Because some use-cases are more efficient this way. Graphs tend to scale more naturally to large, highly connected data sets, because following a relationship typically doesn’t need a costly join operation. Densely connected structures, like deep hierarchies or many-to-many relationships, are difficult to model with a table-based or document-based technology. To cite Dave Stagner’s answer: “You should try to use data persistence that actually reflects the structure of the data. Otherwise, you’re introducing a layer of translation in your code that reflects your choice of tools, not your choice of problems.” Perfect answer.
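
To make the "no costly joins" point a bit more concrete, here is a toy sketch in Python. It is not code for my database and not how any real graph engine works internally; the names (`follows_rows`, `two_hops_with_join`, `two_hops_with_graph`) and the sample data are made up for illustration. It just compares answering "who is two hops away?" from a flat table of rows versus from a structure where every node already points at its neighbours.

```python
from collections import defaultdict

# Hypothetical "follows" relationships, stored the table way: one row per edge.
follows_rows = [
    ("ana", "bob"),
    ("bob", "cat"),
    ("bob", "dan"),
    ("cat", "dan"),
]


def two_hops_with_join(rows, start):
    """Self-join style: every hop scans the whole edge table again."""
    first_hop = {dst for src, dst in rows if src == start}
    return {dst for src, dst in rows if src in first_hop} - {start}


# Graph style: each node keeps direct references to its neighbours.
adjacency = defaultdict(set)
for src, dst in follows_rows:
    adjacency[src].add(dst)


def two_hops_with_graph(adj, start):
    """Traversal style: only the edges of the nodes we actually visit are touched."""
    result = set()
    for neighbour in adj.get(start, ()):
        result |= adj.get(neighbour, set())
    return result - {start}


print(two_hops_with_join(follows_rows, "ana"))
print(two_hops_with_graph(adjacency, "ana"))
```

Both functions give the same answer. The difference is that the table version looks at every row for every hop, while the graph version only walks the edges hanging off the nodes it visits, so the work depends on the neighbourhood size rather than on the total amount of data.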

I haven’t written any code for the database yet. Patience is a virtue 🙏
I want to understand what I have to do first.

Until next time!

@articles #software #programming #graph #db