Social media uses graph databases to ‘find your friends’, and businesses are starting to use them to interrogate their supply chains – even when the data sits in simple spreadsheets
Users of Facebook and LinkedIn will be familiar with the sometimes uncanny – and spooky – way these services and algorithms suggest new friends to you. How did they know I know this person? But what if you used the same technology to run your supply chain? That’s exactly what some organisations are starting to do.
Imagine that you are responsible for the supply chain of a carmaker. You look into your database of suppliers and it tells you that three of your tyre suppliers are dependent on the same company for a critical component – thus revealing a potential risk to your supply chain that you may need to address. This is the promise of supply chain solutions built on a type of database called graph.
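The tyre-supplier scenario can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular graph product works: the supplier names are invented, and a plain list of "depends on" links stands in for a real graph database.

```python
from collections import defaultdict

# Hypothetical supplier names, invented for illustration only.
# Each pair reads "supplier DEPENDS_ON upstream company".
depends_on = [
    ("TyreCo A", "RubberChem"),
    ("TyreCo B", "RubberChem"),
    ("TyreCo C", "RubberChem"),
    ("TyreCo C", "SteelCord Ltd"),
    ("TyreCo D", "PolyFibre"),
]


def shared_dependencies(edges, threshold=2):
    """Return upstream companies that at least `threshold` suppliers rely on."""
    dependants = defaultdict(set)
    for supplier, upstream in edges:
        dependants[upstream].add(supplier)
    return {
        upstream: sorted(suppliers)
        for upstream, suppliers in dependants.items()
        if len(suppliers) >= threshold
    }


risks = shared_dependencies(depends_on, threshold=3)
print(risks)
```

With this sample data, the only company flagged is the one that all three tyre suppliers share, which is the kind of hidden concentration risk described above.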
And not only does it analyse risk, it can also help optimise costs. Continuing the automotive example, say you need to analyse your suppliers of sunroof components. You have volume commitments with each vendor, but where do you stand against them in terms of volume and value?
A graph database will not only tell you how much business you have with each supplier, it also allows you to run detailed scenarios, look at a sales forecast in relation to the operations and production of the business, and calculate whether you need to stimulate demand with a special offer to increase consumption, or order ahead to avoid underbuying penalties, explains Gaurav Deshpande, vice president of marketing at TigerGraph.
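The underbuying check itself is simple arithmetic once the figures are linked to each vendor. The sketch below assumes hypothetical vendor names and unit volumes, and a made-up helper called `commitment_gap`; it shows only the calculation, not how a graph product would expose it.

```python
def commitment_gap(committed_units, purchased_units, forecast_units):
    """Units still missing at period end if the forecast holds (0 if on track)."""
    projected = purchased_units + forecast_units
    return max(0, committed_units - projected)


# Hypothetical vendors and volumes, invented for illustration only.
vendors = {
    "Sunroof Glass Co": {"committed": 10_000, "purchased": 6_000, "forecast": 3_000},
    "Seal & Trim Ltd": {"committed": 5_000, "purchased": 4_000, "forecast": 1_500},
}

gaps = {
    name: commitment_gap(v["committed"], v["purchased"], v["forecast"])
    for name, v in vendors.items()
}
print(gaps)
```

A positive gap suggests ordering ahead or stimulating demand; a gap of zero means the commitment is on course to be met.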
Essentially, graph organises data differently from traditional databases. While relational databases, which are queried with SQL, force you to organise data into tables, sort of like spreadsheets, graph organises data in a more intuitive way that allows you to model data based on real-life relationships between objects. While the basic unit of data in a relational database is a record consisting of a series of columns of information, the basic unit of data in graph is two objects (known as nodes) and a link that describes the relationship between them.
And that is how Facebook analyses the relationships between its users. The nodes represent users, and these nodes are linked to other nodes by links labelled ‘is friends with’ – so ‘X is friends with Y’ becomes the basic unit of data in the Facebook database. Of course, X can have many friends and Y can have many friends, so in its constant quest to expand users’ circles of friends, Facebook mines the data to recommend X and Y’s friends to each other.
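To make the contrast concrete, here is a minimal sketch of the two shapes of data in Python. The company names, the `BUYS_FROM` label and the `Edge` class are all assumptions made for illustration; real graph databases have their own storage and query languages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str    # node where the relationship starts
    relation: str  # label describing the relationship
    target: str    # node where the relationship ends


# Relational style: one record (row) per supplier, with fixed columns.
supplier_rows = [
    {"id": 1, "name": "Acme Tyres", "country": "DE"},
    {"id": 2, "name": "Rubber Works", "country": "MY"},
]

# Graph style: the basic unit is two nodes plus the link between them.
edges = [
    Edge("Carmaker", "BUYS_FROM", "Acme Tyres"),
    Edge("Acme Tyres", "BUYS_FROM", "Rubber Works"),
]


def neighbours(node, relation):
    """Follow every outgoing link with the given label from `node`."""
    return [e.target for e in edges if e.source == node and e.relation == relation]


print(neighbours("Carmaker", "BUYS_FROM"))
```

The rows describe each supplier in isolation, while each edge states a relationship directly, so following the chain is a matter of hopping from node to node.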
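A toy version of this idea is a friends-of-friends lookup that ranks candidates by the number of friends they share with you. The names and data below are invented, and this is a simplified sketch of the general technique rather than Facebook's actual method.

```python
from collections import Counter

# Hypothetical friendship data; each pair means "X is friends with Y".
friendships = [
    ("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Dan"),
    ("Cat", "Dan"), ("Cat", "Eve"),
]


def build_adjacency(pairs):
    """Turn undirected friendship pairs into a node -> set-of-friends map."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def suggest_friends(adjacency, person):
    """Rank people two links away by how many friends they share with `person`."""
    shared = Counter()
    for friend in adjacency[person]:
        for candidate in adjacency[friend]:
            if candidate != person and candidate not in adjacency[person]:
                shared[candidate] += 1
    # Most shared friends first; ties broken alphabetically.
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


graph = build_adjacency(friendships)
print(suggest_friends(graph, "Ann"))
```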
Of course, this is not as straightforward as simply swapping lists, because friend recommendations are based on a slew of different factors including common interests, number of shared friends and keywords taken from your messages. But Facebook makes sense of this complicated mess of information with algorithms, mathematical formulae for finding such things as the shortest paths between two people, communities of like-minded people and social influencers.
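One of the algorithms mentioned, finding the shortest path between two people, can be sketched with a breadth-first search. The network below is hypothetical and tiny; production graph systems use far more optimised implementations, so treat this as an illustration of the principle only.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search: return the shortest chain of nodes, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # remembers how each node was first reached
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = current
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical network, invented for illustration only.
network = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Dan"],
    "Cat": ["Ann"],
    "Dan": ["Bob", "Eve"],
    "Eve": ["Dan"],
}
print(shortest_path(network, "Ann", "Eve"))
```

The same search applied to a supplier network would show how many hops separate a manufacturer from a given raw-material source.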
When algorithms like these are applied to supply chains, they can reveal supply flows and bottlenecks, true costs, points of vulnerability, contract issues, standards compliance and more.
Speedy scenarios
Graph has been adopted to manage one of the world’s biggest supply chains, run by the US Army. The Army maintains a vast amount of equipment used by its 1 million soldiers and 200,000 civilian staff, covering everything from guns and vehicles to tanks and aircraft.
Managing this supply chain not only involves buying equipment and deploying it around the world, but also providing the millions of parts necessary to maintain it all. Procurement, budgeting and logistics accounted for 80% of the lifecycle costs of the equipment, so the objective was to collect data on every component and its cost, what equipment it related to and the time before it would fail.
Scenario building was another key requirement, with the ability to query the data for budget forecasting and to project deployment costs to different locations, taking into account not only logistics costs but also the expected impact of local conditions on equipment.
All of this was managed on an ageing mainframe computer that the Army wanted to replace with a new procurement management system. It decided to move to a graph database, which currently hosts 2.1 billion nodes and 5.9 billion relationships.
Loading the data for analysis, which would have taken 60 man-hours on the old mainframe, now takes as little as seven hours, according to project leader Preston Hendrickson. “One file for the parts of a tank involved 10 million parts, creating more than 15 million possible relationships among the components,” he says, adding, “And that’s just a small piece of the overall graph.”
The graph system has allowed the Army to better anticipate the demand for spare parts and spread out its orders, resulting in better ordering and more predictable costs. It can also now use modern computer languages to query the data more efficiently. “If an analyst has a question they get an answer immediately, instead of having to figure out how to assemble the question before they actually ask it,” Hendrickson says. Previously, running what-if scenarios would have involved reloading and recomputing the data for each case, but now queries and analysis can be done in the same day.
“Answers are immediate. As a result, the parts delivery is more accurate and order turnaround is much faster,” Hendrickson says.
But you don’t have to introduce a brand new software system, or strip out existing systems to implement graph. It can run in parallel with anything, either drawing data from other applications with linking scripts or populated with data manually. It can deliver different insights from your existing systems, whether ERP systems like SAP or Oracle, or less sophisticated spreadsheet-driven systems, by giving you an overview of your supply chain.
As well as commercial graph solutions, graph is available as open source for organisations to experiment with the technology. Graph can be the master database, but frequently isn’t – at least not initially, says Rik Van Bruggen, regional vice president at graph vendor Neo4j, whose clients start implementing the graph database for a particular graph-oriented workload that can complement their current system. “Over time, it is likely that more and more graph-beneficial use cases are discovered and implemented, and then the balance between the different system technologies might shift,” he says.
Graph tends to be used by larger companies at the moment, partly because these organisations have the resources to experiment with new technology, but also because the complexity of the problems they are dealing with lends itself well to graph. But deciding whether to use graph or not doesn’t hinge so much on the size of the organisation as the complexity of the data you are dealing with, as measured by the number of connections between the data nodes and the types of questions you want to ask.
Graphs are not so good at answering questions such as the average product cost in a supply chain because the answer involves aggregating large amounts of data across an entire database. For this type of question, even graph vendors concede that traditional relational databases queried with SQL can be a better fit.
But the need to understand the supply chain grows daily: to streamline the business and save money; to comply with anti-slavery legislation by understanding the origins and chain of custody of all components; to meet consumer demand to know the provenance of materials going into their food and other goods; and to respond to any kind of supply chain disruption.
According to GEODIS, companies estimate they spend 5-15% of turnover on their supply chain. For companies at the top end of that bracket, there are considerable wins to be made by streamlining.
Both Transparency-One (see case study below) and Trace Labs use graph to help clients understand the provenance of materials in their supply chains, collecting and analysing supplier data on behalf of multiple clients to ensure there is no forced labour, that organic and environmental standards are being met and companies are getting a true picture of their carbon footprint. “The graph enables us to connect datasets of different structures in one single multimodal graph database to get holistic knowledge about all events in the supply chain,” says Branimir Rakic, co-founder and chief technology officer at Trace Labs OriginTrail. “One of the uses of the linked data can be seen in an application for consumers who can scan a QR code on the product, meet the farmer and get product-specific information.”
Graph databases are currently exploding in popularity “not only because the world got more complex, but also because it is rather easy to get started with graph databases,” says Jan Stücke at software firm ArangoDB. “You have this mental model already in your mind and can then also explore the data visually and then see, okay, this doesn’t work right – and then you can rather easily find what is not working right and fix it.”
Case study: The T-shirt journey

Supply chain tracking firm Transparency-One adopted graph in response to the increasing complexity of supply chains, says Frédéric Daniel, chief technology officer. “We used to have very simple supply chains, from the tomato in the field to the distributor to the consumer. Now, is it going through five different countries, giving you five, six or even seven different levels within your supply chain? Yet how many companies know about their tier two, three or four suppliers?”
Using graph can help companies visualise their chain, as above. A cotton T-shirt made in the US may use fabric sourced from China and Brazil, made with thread from Poland, Turkey, India and Vietnam, using cotton grown in Pakistan and India. Even with eight suppliers, tracking the supply chain begins to look complicated, but when scaled up to hundreds of products and suppliers, it creates a data problem that traditional systems would struggle to cope with.
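The T-shirt chain can be sketched as a small graph to show how supplier tiers fall out of the structure. The countries and materials come from the example above, but the article does not say which supplier feeds which, so the individual links below are assumptions made for illustration.

```python
from collections import deque

# Countries and materials are from the T-shirt example; the specific
# supplier-to-supplier links are assumed, not taken from the article.
sources_from = {
    "T-shirt maker (US)": ["Fabric (China)", "Fabric (Brazil)"],
    "Fabric (China)": ["Thread (Vietnam)", "Thread (India)"],
    "Fabric (Brazil)": ["Thread (Poland)", "Thread (Turkey)"],
    "Thread (Vietnam)": ["Cotton (Pakistan)"],
    "Thread (India)": ["Cotton (India)"],
    "Thread (Poland)": ["Cotton (Pakistan)"],
    "Thread (Turkey)": ["Cotton (India)"],
}


def supplier_tiers(graph, root):
    """Map each upstream supplier to its tier (1 = direct supplier of `root`)."""
    tiers = {}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        for supplier in graph.get(node, []):
            if supplier not in tiers:
                tiers[supplier] = depth + 1
                queue.append((supplier, depth + 1))
    return tiers


tiers = supplier_tiers(sources_from, "T-shirt maker (US)")
for name, tier in sorted(tiers.items(), key=lambda item: (item[1], item[0])):
    print(f"tier {tier}: {name}")
```

Walking outwards from the manufacturer assigns every supplier a tier, which is exactly the visibility into tier two and tier three that the case study says many companies lack.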
Working in tandem with other databases, graph creates a hybrid system in which data is stored in a relational database, while the relationships between the data are captured in graph. Data about each supplier and its products in the T-shirt chain can be stored in an SQL database with the relationships held in a graph.
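A rough sketch of such a hybrid arrangement, using Python's built-in sqlite3 module for the relational side and a plain dictionary standing in for the graph. The company names, table layout and helper functions are hypothetical; a real deployment would use a dedicated graph database rather than an in-memory dictionary.

```python
import sqlite3

# Relational side: supplier attributes live in an SQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO supplier (id, name, country) VALUES (?, ?, ?)",
    [
        (1, "Shirt maker", "US"),          # hypothetical company names
        (2, "Fabric mill", "China"),
        (3, "Thread spinner", "Vietnam"),
        (4, "Cotton grower", "Pakistan"),
    ],
)

# Graph side: only the relationships, keyed by supplier id.
sources_from = {1: [2], 2: [3], 3: [4]}


def upstream_ids(graph, start):
    """Collect every supplier id reachable upstream of `start`."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(seen)


def upstream_details(conn, graph, start):
    """Traverse the graph for ids, then fetch their attributes from SQL."""
    ids = upstream_ids(graph, start)
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    query = f"SELECT name, country FROM supplier WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(query, ids).fetchall()


print(upstream_details(conn, sources_from, 1))
```

The graph answers the "who is connected to whom" question, and the relational table fills in the details for whichever suppliers the traversal returns.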