Getting started with SylvaDB

Introduction

What is a graph database

A graph database stores data in a graph, the most generic of data structures, capable of elegantly representing any kind of data in a highly accessible way. Basically, a graph states that something is related to something else. It's a natural way of specifying relationships among a collection of items.

A graph uses nodes, edges and properties to represent the data. Typically, nodes are represented as circles and edges as arrows.

This is an example of a simple graph:

simple graph

Nodes represent entities such as people, books, songs, or any other item you might want to keep track of. Edges represent the relationships between the nodes. Properties are pieces of information attached to nodes or relationships. The graph above doesn't have any properties... yet.

This is an example of a simple graph with some properties:

simple property graph

In this graph there are three nodes representing two kinds of entities: the entity Person and the entity Group. As you can see, each node has a set of properties with information about the entity. Also, there are six relationships between the nodes, but only two relationships have properties. Easy!
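
If you are comfortable with a little Python, the same idea can be sketched in a few lines. This is only an illustration of the property graph model, not how SylvaDB stores data internally; the class names, IDs, labels and property values are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    type: str                      # the kind of entity, e.g. "Person" or "Group"
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    source: int                    # id of the node the arrow starts from
    target: int                    # id of the node the arrow points to
    label: str
    properties: dict = field(default_factory=dict)


# Hypothetical data: two Person nodes and one Group node.
nodes = [
    Node(1, "Person", {"name": "Alice"}),
    Node(2, "Person", {"name": "Bob"}),
    Node(3, "Group", {"name": "Chess club"}),
]
relationships = [
    Relationship(1, 2, "knows", {"since": "2001"}),   # a relationship with a property
    Relationship(1, 3, "is member of"),               # relationships without properties
    Relationship(2, 3, "is member of"),
]

# "Who is a member of the Chess club?" is answered by following edges.
names = {n.id: n.properties["name"] for n in nodes}
members = [names[r.source] for r in relationships
           if r.label == "is member of" and r.target == 3]
print(members)
```

Notice that the question is answered by walking the relationships directly, which is the kind of operation a graph database is built around.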

A graph is a flexible and powerful tool for describing relations between elements of any kind. For this reason, graph databases are often faster than traditional databases for associative data sets with highly complex relationships. They can scale more naturally as they do not typically require expensive join operations. Also, they are suitable for managing ad hoc and changing data as they depend less on a rigid schema to describe relations between elements.

Why use SylvaDB

SylvaDB is an easy-to-use, flexible, and scalable graph database management system that helps you collect, collaborate, visualize and query large data sets. It provides all the benefits of using a graph database. And the best thing is that you will need no programming knowledge to use it.

An overview

The Dashboard

This is the panel where you manage your graphs in SylvaDB. It's divided into three columns: Graphs, Collaborations and Statistics.

the dashboard

In Graphs you will find all the graphs you have created. Collaborations is for all those graphs where you are contributing as a collaborator. And Statistics shows some statistics about your graphs.

We haven't created any graph yet... So, let's create a new one!

Designing the graph

Creating a graph

We are going to create a graph about music. So, first of all, we need to choose a name for our graph. We will call it My music collection. It's useful to also provide a description, although it's not required. Finally, we choose whether our graph will be public (available to other people) or not.

new graph

After clicking on Create new graph, we will be redirected to the Dashboard, where we can see an overview of our new graph.

graph overview

Some useful information is shown in this overview. For example, you can see the number of nodes and relationships of the graph, the number of collaborators, and other information that we will see later, such as Strict Schema... What does it mean? Let's see!

Creating a schema

A schema is the place where you describe how your data is organized: the structure of your database. In a graph database you use the schema to define Types, the kinds of data you can store in your nodes, and also the allowed relationships between the nodes.

So, clicking on strict schema you will see there isn't any Type defined yet. Let's create a new Type.

After clicking on New Type, you will see a form where you can add a Name and a Description for your Type.

New Type

You should see Types as data entities. For example: a book, a person, an album, etc. Here we are creating a Type to represent an Album entity.

Then, we define the properties of the Type. We need to think about what the most relevant information about a music album is. For example, we could store: Title, Duration, Number of tracks and Release date. Those are what we call the keys. After adding a property for each key, we get something like this:

Type properties

You can see that we have marked the property Title as the label of this Type. This way, we will use this property to identify each node.

We finish by clicking on Save Type. Then you will see an overview of our schema:

Schema diagram

Also, we want to create the Type Artist, so move to the bottom of the page and click New Type.

More Types

We will define three properties for the Type Artist: Active, Name and Size. Note that Name will be the label.

Artist properties

By default, properties are defined as strings, but sometimes we'll want to restrict the kind of data we store: a number, a date, a boolean, and so on. In those cases, we can define the type of a property using the Advanced Mode tool. So, clicking on Advanced Mode, you will see a set of fields for each property:

Advanced Mode

We are interested in using just the Data type field. So, clicking on that field, you will see a list of property types. Now select the type Boolean for the property Active and Number for the property Size. This way, we are limiting the possible values of the Active property to be just True or False (visualized as a check box), and the possible values of the Size property to be just numbers, avoiding possible errors and mistakes when storing data.
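
To make the benefit of typed properties concrete, here is a small Python sketch of the kind of check a data type enables. It is not SylvaDB code: the converter functions and the `validate` helper are hypothetical, and treating Number as a Python float is an assumption made only for this example.

```python
# Hypothetical converters, one per data type used for Artist in this guide.
def to_boolean(text):
    if text not in ("True", "False"):
        raise ValueError(f"not a boolean: {text!r}")
    return text == "True"


CONVERTERS = {
    "String": str,
    "Number": float,      # assumption: any numeric value is accepted
    "Boolean": to_boolean,
}

# The Artist Type from this guide: property name -> data type.
ARTIST = {"Name": "String", "Active": "Boolean", "Size": "Number"}


def validate(type_schema, values):
    """Return the values converted to their declared types, or raise ValueError."""
    return {name: CONVERTERS[type_schema[name]](text)
            for name, text in values.items()}


print(validate(ARTIST, {"Name": "Led Zeppelin", "Active": "False", "Size": "4"}))

# A value that does not fit the declared type is rejected instead of stored.
try:
    validate(ARTIST, {"Size": "four"})
except ValueError as error:
    print("rejected:", error)
```

With plain strings, "four" would have been stored without complaint; with a Number type the mistake is caught at the moment the data is entered.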

We finish by clicking on Save Type.

Album-Artist schema overview

So, it's time to change the Type Album, updating the types of its properties using the Advanced Mode tool. This time, we want to use the type Date for the property Release date and the type Number for the property Number of tracks.

Album change properties

Finally, repeating the steps to create also the Type Concert, we get the following schema:

Types listed

We have defined three Types in our schema: Artist, Album and Concert, so we can already create nodes in our graph using these Types. But this way, our graph would contain only nodes, without any relations between them. These three entities are clearly connected, so we also need to define Relationships in our schema.

Here you can see the relationships we want to create:

Music Collection schema graph

Note that we have two different relationships between Album and Concert.

We start by defining the relationship between Artist and Album. So, in the Artist section of the schema overview we saw before, we click on outgoing:

Artist-Album relationship

Then, we select Album as the target of the relationship:

Artist-Album relationship name

We also want to add a property to this relationship. We want to store the Date:

Artist-Album relationship properties

As you can see, we have marked the property Date as the label of the relationship. This way, we will use this property to identify each relationship when we visualize the graph.

After clicking Save Type, you will see that a new relationship has been created in the Schema diagram (you can drag the boxes to see the schema better):

Artist-Album diagram

When you create a relationship in the schema between two Types, you will no longer see the incoming and outgoing links. So, you will need to use the link New allowed Relationship at the bottom of the page if you want to create more relationships using either of these Types.

More Types

This time, we don't want to define any properties in these relationships. So, repeating the process to create the rest of the relationships, we finally get our database schema:

Schema diagram

We have created a schema for the Music Collection graph. But, we haven't added any data yet, so in the next topics we will see how to add some nodes and relationships to the graph.
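
One way to picture what the allowed relationships in a schema do is as a list of permitted (source Type, label, target Type) combinations. The Python sketch below is only an illustration of that idea, not SylvaDB's implementation. The label "records" comes from the CSV example later in this guide and "was recorded in" from the Concert form; the direction of "was recorded in" and the two labels marked as hypothetical are assumptions.

```python
# Allowed relationships as (source Type, label, target Type) triples.
ALLOWED = {
    ("Artist", "records", "Album"),
    ("Album", "was recorded in", "Concert"),    # direction is an assumption
    ("Album", "was presented in", "Concert"),   # hypothetical label
    ("Artist", "plays in", "Concert"),          # hypothetical label
}


def is_allowed(source_type, label, target_type):
    """Tell whether the schema permits this relationship between two Types."""
    return (source_type, label, target_type) in ALLOWED


print(is_allowed("Artist", "records", "Album"))   # True
print(is_allowed("Album", "records", "Artist"))   # False: wrong direction
```

Because relationships are directed, the same label between the same two Types is only accepted in the direction declared in the schema.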

Adding data

Creating nodes

In order to store some data in our graph, we need to create nodes. So, move to the top menu at the right corner of the screen:

Data menu

Then add a new Artist by clicking on the '+' symbol:

Data Artist new link

You will see a form to fill in with data about the Artist. This form is composed of four sections:

  • Properties
  • Relationships
  • Files
  • Links

The Properties section represents the properties we defined in the schema for the Type Artist. For example, we will add information about Led Zeppelin:

Data properties section

We will see the rest of the form sections later. So, let's save the artist by clicking on the save button. You will see a data summary with the nodes for this entity. Currently, this list only shows one node.

You can change the order of properties listed using the field Order from the Advanced Mode tool we saw before in the Schema definition. So, let's return to the schema definition for Artist and activate the Advanced Mode tool. You simply use numbers to indicate the order:

Schema properties ordered

Now the properties are listed in the order we set:

Data properties section ordered

Let's add some information about an Album. Again, move to the top menu and click on the '+' icon:

Data Album new link

We repeat the same process filling in the properties for Album.

Data Album properties

We also add some information about a Concert:

Data Concert properties

We have just created three nodes in our graph for the types Artist, Album and Concert.

Creating relationships

After saving the Concert, you will see a data summary with the nodes for this entity. Currently, this list only shows one node.

Data Concert list

Since we haven't added any relationship yet, clicking on the icon will not show any relationship. So, let's add a relationship.

Clicking on the icon, the form for Concert will be shown again, but this time we just want to add relationships:

Data Concert relationships

Note that the little arrows show you the direction of the relationship. Also, you can see in parentheses the type of the node at the source or target of the relationship.

Adding a target or source to each relationship is really easy. Just start typing the label property of the node (remember you marked the label checkbox in some properties), and you will be shown a list of results matching the criteria.

Data Concert relationships

We don't need to fill in the relationship was recorded in, so we will remove it.

Finally, we finish by clicking on Save Concert. We can repeat the same process with the rest of the nodes and relationships.

Adding multimedia

In the previous form you can also add links and files to your node:

Data multimedia

Visualizing the graph

Graph visualization

Clicking on Graph in the top menu, you will visualize the graph. You can choose between two kinds of views from the select box: inspect graph and whole graph.

With the inspect graph view, you can get a more detailed view of your graph. This view is useful for small graphs:

Visualization inspect graph

The whole graph view is the best option when you want to visualize big graphs (graphs with more than 50 nodes). It also provides some useful tools to improve the analysis and visualization of your graph:

Visualization whole graph

Interacting with graphs

As we have seen, SylvaDB is a powerful tool for representing data in graphs. But the real power comes from the interaction with that data. The SylvaDB user has many tools for that purpose, including graph analytics and the query builder.

Analytics

SylvaDB's analytics give users the ability to find information or patterns that are not obvious at first sight. Using the GraphLab algorithms, we can obtain very useful results for every type of user and project.

Let's see how to use the analytics. When we are in the graph visualization view, we have the option to analyze our graph by clicking on the Analyze button.

Once clicked, we access the full-screen view. In this view we have the tools to interact with the whole graph. In this section, we will focus on the Analytics.

In the Analytics section we have the different algorithms that we can execute over our entire graph or a subgraph. The currently available algorithms are connected components, triangle counting, graph coloring, betweenness centrality, PageRank and degeneracy (k-core).

When we click on an algorithm, we have different options:

  • A play button that runs the algorithm using the entire graph.
  • A checkbox, "On selected nodes", to run the algorithm over a subgraph of the whole graph.
  • A select field that allows you to access the result history of the specified algorithm.

Once an algorithm is executed, we obtain a chart representing the values it returns along with the number of nodes corresponding to that value. We can interact with this chart to find nodes with a specific value in the graph visualization.
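
To give an idea of what such a result looks like, here is a small Python sketch of one of the listed algorithms, connected components, followed by the "value versus number of nodes" count that the chart represents. It is a plain breadth-first search written for illustration, not the implementation SylvaDB runs, and the graph is made up.

```python
from collections import Counter, deque

# A small, made-up undirected graph as an adjacency list.
graph = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a"],
    "d": ["e"],
    "e": ["d"],
    "f": [],
}


def connected_components(adjacency):
    """Map every node to the index of the component it belongs to."""
    component = {}
    next_index = 0
    for start in adjacency:
        if start in component:
            continue
        # Breadth-first search labels everything reachable from `start`.
        component[start] = next_index
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component[neighbour] = next_index
                    queue.append(neighbour)
        next_index += 1
    return component


result = connected_components(graph)
histogram = Counter(result.values())   # value -> number of nodes with that value
print(result)
print(histogram)
```

Here the value is the component index, so the histogram says that one component has three nodes, another has two, and one node is isolated.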

Query builder

The query builder is a feature that provides us with an easy and intuitive way to create queries. Using the query builder, every SylvaDB user can extract information from their graph in a very simple way.

To create a query, we will follow these steps:

  • The first step is to select a node type, according to the list of available types obtained from the graph. Also, there is a special type called Wildcard.
  • Once we click on a node type, we will see the type box on the grid. The node type box is the basic element of the query builder. It allows us to modify the fields of the type to create the desired query. We can divide the node type box in three parts:
    • The first part is the title. Here we see the name of the type and the buttons to close the box, minimize/maximize the box, and show the box's advanced mode. Also, if we had more than one box of the same type, we could see a select field to choose an alias for the box.
    • The next part is the fields. The fields are the conditions that we have to take into account when we execute our query. So, it is in these fields that we select the properties, their lookups, their values, etc.
    • The last part is the relationships available for that type. We will see a select with these relationships, and when we click on an option we will see one of two different behaviours:
      • If we have the two node type boxes involved, we have to drag and drop the relationship from the source box to the target box. We can see the hover behaviour in the destination zone where the relationship is dropped.
      • If we don't have the target box in the grid, the target box is loaded automatically when we select the desired relationship.
      In the relationships, we have the same behaviour as in the node type boxes for the fields and for the alias selects.
  • The last step is to run the query. We can see the results table by clicking on the results tab.
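
Conceptually, a query built this way combines a condition on a source box, a relationship, and a condition on a target box. The Python sketch below mimics that on a tiny data set based on the music collection in this guide. The `query` function and the way conditions are passed are hypothetical, and this is not how SylvaDB executes queries.

```python
# Sample data based on the music collection used in this guide.
nodes = {
    "1": {"type": "Artist", "Name": "Led Zeppelin", "Size": 4},
    "2": {"type": "Album", "Name": "Houses of the Holy", "Number of tracks": 8},
    "3": {"type": "Album", "Name": "Physical Graffiti", "Number of tracks": 15},
}
relationships = [("1", "records", "2"), ("1", "records", "3")]


def query(source_type, source_cond, label, target_type, target_cond):
    """Yield (source, target) node pairs matching both boxes and the relationship."""
    for source_id, rel_label, target_id in relationships:
        source, target = nodes[source_id], nodes[target_id]
        if (rel_label == label
                and source["type"] == source_type and source_cond(source)
                and target["type"] == target_type and target_cond(target)):
            yield source, target


# "Albums with more than 10 tracks recorded by Led Zeppelin".
results = [
    (artist["Name"], album["Name"])
    for artist, album in query(
        "Artist", lambda n: n["Name"] == "Led Zeppelin",
        "records",
        "Album", lambda n: n["Number of tracks"] > 10,
    )
]
print(results)
```

Each lambda plays the role of the fields in a box (property, lookup and value), and the label plays the role of the relationship dragged between the two boxes.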

Exporting graphs

You can back up your graphs using the Export tools. In order to export a graph, you need to save the graph schema and the graph data (nodes and relationships). You have two ways of saving your graph data: as a GEXF (Gephi) file or as CSV files. Unfortunately, you can't save your attachments or links at the moment.

Export tools

Exporting the schema

You can save your graph schema using the Export schema tool. This tool will save your graph schema as a JSON file.

Exporting data as GEXF (Gephi)

This is the preferred way of saving your graph data in SylvaDB. This tool will save your nodes and relationships as a GEXF (Gephi) file.
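
GEXF is an XML format, so an exported file can also be read by your own scripts. The Python sketch below parses a tiny hand-written GEXF 1.2 document with the standard library. The sample document is an assumption made for illustration: a file exported by SylvaDB will probably carry more elements and attributes than shown here.

```python
import xml.etree.ElementTree as ET

# A hand-written, minimal GEXF 1.2 document (not an actual SylvaDB export).
GEXF = """
<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="1" label="Led Zeppelin"/>
      <node id="2" label="Houses of the Holy"/>
    </nodes>
    <edges>
      <edge id="0" source="1" target="2" label="records"/>
    </edges>
  </graph>
</gexf>
"""

# GEXF elements live in an XML namespace, so lookups need a prefix mapping.
NS = {"g": "http://www.gexf.net/1.2draft"}

root = ET.fromstring(GEXF.strip())
labels = {node.get("id"): node.get("label")
          for node in root.findall("g:graph/g:nodes/g:node", NS)}
edges = [(labels[edge.get("source")], edge.get("label"), labels[edge.get("target")])
         for edge in root.findall("g:graph/g:edges/g:edge", NS)]
print(edges)
```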

Exporting data as CSV

You can also save your graph data as CSV files. This tool will save your nodes and relationships in separate CSV files (one file for each node type and relationship type).

Importing graphs

You can import graphs in SylvaDB using the Import tools. You have to import the graph schema and optionally the graph data. If you want to import data, you can use two file formats: GEXF (Gephi) and CSV. Note that you need an empty graph in order to import the schema and data.

Importing the schema

After creating a new graph, open the Tools menu and click Import schema. Then select the JSON file that holds your graph schema.

Import schema

Importing from GEXF (Gephi)

This is the preferred way of loading nodes and relationships into your graphs. Just move to the Tools menu and click Import data. Then select Load GEXF. Finally, you only have to drag the GEXF file into the box.

Import gexf

Importing from CSV

You can also load nodes and relationships from CSV files. Move to the Tools menu and click Import data. Then select Load CSV. Finally, select all your nodes (CSV files) and drag them into the box. Wait for instructions and repeat the same operation with your relationships. Remember that you need one CSV file for each node type and relationship type.

CSV files

Editing CSV files

A CSV file contains any number of rows, where each row consists of fields separated by commas and enclosed by double quotes (this is mandatory in SylvaDB). A field can contain line breaks, commas and escaped double quotes (represented as a pair of double quotes). The first row of a CSV is the header and contains a list of field names. The other rows are data, and in SylvaDB they represent nodes or relationships, depending on the CSV file.

This is an example of CSV for the node type Artist:

                "id", "type", "Active", "Name", "Size"
                "1", "Artist", "False", "Led Zeppelin", "4"
              

This is an example of CSV for the node type Album:

                "id", "type", "Duration", "Name", "Number of tracks", "Release date"
                "2", "Album", "40:58", "Houses of the Holy", "8", "1973-3-28"
                "3", "Album", "1:22:15", "Physical Graffiti", "15", "1975-2-24"
              

This is an example of CSV for the relationship type Record:

                "source id", "target id", "label", "Date"
                "1", "2", "records", "1972-8-1"
                "1", "3", "records", "1973-11-1"
              

As you can see in the examples, in the case of nodes, the CSV header must contain the field names (in this order): id, type and all the property names of the node type. In the case of relationships, the CSV header must contain the field names (in this order): source id, target id, label and all the property names of the relationship type. Note that source id and target id are the IDs of the source node and target node of the relationship, respectively.
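
If you generate your CSV files with a script, it can be handy to check these header rules before importing. The following Python sketch does that with the standard csv module. The `read_rows` helper is hypothetical and only mirrors the rules described above; it is not the check SylvaDB itself performs. The option `skipinitialspace=True` is there because the examples above have a space after each comma.

```python
import csv
import io

NODE_PREFIX = ["id", "type"]
RELATIONSHIP_PREFIX = ["source id", "target id", "label"]


def read_rows(text, required_prefix):
    """Parse CSV text and check that the header starts with the required fields."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = next(reader)
    if header[:len(required_prefix)] != required_prefix:
        raise ValueError(f"header must start with {required_prefix}, got {header}")
    return [dict(zip(header, row)) for row in reader]


albums = read_rows(
    '"id", "type", "Duration", "Name", "Number of tracks", "Release date"\n'
    '"2", "Album", "40:58", "Houses of the Holy", "8", "1973-3-28"\n'
    '"3", "Album", "1:22:15", "Physical Graffiti", "15", "1975-2-24"\n',
    NODE_PREFIX,
)
records = read_rows(
    '"source id", "target id", "label", "Date"\n'
    '"1", "2", "records", "1972-8-1"\n'
    '"1", "3", "records", "1973-11-1"\n',
    RELATIONSHIP_PREFIX,
)

# Each relationship row refers to nodes by ID; here we check the album targets.
album_ids = {row["id"] for row in albums}
print([row["target id"] in album_ids for row in records])
```

Note that every value comes back as text, including the IDs and the number of tracks, which matches the way fields are quoted in these files.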

Since a CSV file is just a plain text file, you can open it with any simple text editor (please avoid Microsoft Word-like programs). However, the most common way of dealing with CSV files is to use spreadsheet software such as LibreOffice Calc or Microsoft Excel. In the next example, we will use LibreOffice Calc.

Try opening the example for the node type Album with LibreOffice Calc. You will see a window like the following:

Importing a CSV in a LibreOffice Calc

Note that you must select the same options as shown in the "Separator options" and "Other options" sections from the above window. After clicking "OK", you will get your CSV imported:

CSV imported in LibreOffice Calc

Remember that in SylvaDB it is mandatory to have every field enclosed by double quotes. This way, all the fields are interpreted as text, even the numbers. Note that fields are enclosed by double quotes only after saving the spreadsheet as a CSV file. So, you will not see the fields enclosed by double quotes while working in the spreadsheet.

So, before adding data to the CSV, we must configure the sheet to process numbers as text fields. Select all the rows (go to the "Edit" menu and click "Select All"). Then open the "Format Cells" dialog (go to the "Format" menu and click "Cells..."). Finally, in the "Category" section, select "Text":

Configuring the spreadsheet

After clicking "OK", numbers will be processed as text, and enclosed by double quotes when saving the spreadsheet as a CSV file.

We are adding a few more albums to this spreadsheet:

Adding a few albums in the spreadsheet

Finally, save the spreadsheet in "Text CSV" format.

Now, if you open this CSV file with a simple text editor, you will see something like the following:

                "id","type","Duration","Name","Number of tracks","Release date"
                "2","Album","40:58","Houses of the Holy","8","1973-3-28"
                "3","Album","1:22:15","Physical Graffiti","15","1975-2-24"
                "4","Album","44:25","Presence","7","1976-4-31"
                "5","Album","33:04","Coda","12","1982-11-19"
              

You can see that all the fields have been enclosed by double quotes, even the ID numbers. Also, if you have some embedded double quote literal inside any field, LibreOffice Calc will escape it using a pair of double quotes, so you don't have to handle this by hand.
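
If you would rather produce these files from a script than from a spreadsheet, Python's standard csv module can quote every field and escape embedded double quotes in the same way. This is a minimal sketch; the last row is invented just to show the escaping.

```python
import csv
import io

rows = [
    ["id", "type", "Duration", "Name", "Number of tracks", "Release date"],
    ["5", "Album", "33:04", "Coda", "12", "1982-11-19"],
    # A made-up row whose name contains a comma and double quotes.
    ["6", "Album", "0:00", 'An "example", with punctuation', "0", "2000-1-1"],
]

buffer = io.StringIO()
# QUOTE_ALL encloses every field in double quotes, numbers included.
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)
print(buffer.getvalue())
```

To write to disk instead, open the file with `newline=""` and pass it to `csv.writer` in place of the buffer.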

If you plan to add some data to a blank graph using CSV files, you can get a set of CSV templates using the Export CSV templates tool. Note that this tool is only available in graphs with no data (just the schema):

Exporting CSV templates