Neo4j provides several interfaces for multiple programming languages including Python, a predominant language in bioinformatics.

The Neo4j community has contributed a range of Python drivers for working with the database, from lightweight to comprehensive packages.

We are going to use a Python package called combattbmodel. We developed combattbmodel to model the Combat-TB-NeoDB schema using py2neo, a client library and toolkit for working with Neo4j from within Python applications and from the command line. This package enables bioinformaticians to interact with Combat-TB-NeoDB using pure Python.

To install combattbmodel, run the following, where the -i option takes the URL of the package index to install from:

$ pip install -i <index-url> combattbmodel

The simplest way to try out a connection to Combat-TB-NeoDB is via the console. Once you have started a local Combat-TB-NeoDB instance, open a new Python console and enter the following code:

>>> from py2neo import Graph
>>> graph = Graph(host='localhost', password='')

If you wish to use a remote Combat-TB-NeoDB instance instead of setting up a local one, point py2neo to its host with the secure parameter set to True:

>>> graph = Graph(host='', password='', secure=True)
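
Hard-coding a host and password in a script is easy to get wrong when switching between a local and a remote instance. As a minimal sketch, the connection settings could be read from environment variables and then passed to Graph. The helper connection_settings and the variable names NEODB_HOST, NEODB_PASSWORD and NEODB_SECURE are hypothetical and are not part of py2neo or combattbmodel.

```python
import os


def connection_settings(env=None):
    """Build keyword arguments for py2neo's Graph from environment variables.

    NEODB_HOST, NEODB_PASSWORD and NEODB_SECURE are hypothetical names
    chosen for this sketch.
    """
    env = os.environ if env is None else env
    settings = {
        "host": env.get("NEODB_HOST", "localhost"),
        "password": env.get("NEODB_PASSWORD", ""),
    }
    # Only request an encrypted connection when explicitly asked for.
    if env.get("NEODB_SECURE", "").lower() in ("1", "true", "yes"):
        settings["secure"] = True
    return settings
```

With this in place, the connection shown above could be opened with graph = Graph(**connection_settings()), using the same host, password and secure keywords.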

Example Python Queries

Exploring Combat-TB-NeoDB

Node labels currently defined

The set of node labels currently defined within the graph.

>>> graph.node_labels
frozenset({'Organism', 'GOTerm', 'Pathway', 'Variant', 'TRna', 'PseudoGene', 'CallSet', 'Gene', 'RRna', 'DbXref', 'Protein', 'NCRna', 'Drug', 'InterProTerm', 'Publication', 'VariantSet', 'Author', 'Chromosome', 'Location', 'MRna', 'CDS'})
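
Because node_labels is a plain frozenset, ordinary set operations can confirm that an instance contains the labels a script depends on before any query is run. The sketch below uses a hypothetical helper, missing_labels, and an abbreviated copy of the labels listed above; 'Operon' is a made-up label used only to show a failing check.

```python
def missing_labels(available, required):
    """Return the required labels that are absent from the graph, sorted."""
    return sorted(set(required) - set(available))


# Abbreviated copy of the graph.node_labels output shown above.
available = frozenset({"Organism", "Gene", "Protein", "Variant", "Drug", "CDS"})

print(missing_labels(available, ["Gene", "Variant"]))  # []
# 'Operon' is a made-up label, used only to illustrate a missing entry.
print(missing_labels(available, ["Gene", "Operon"]))   # ['Operon']
```

Against a live instance, graph.node_labels would be passed in place of the abbreviated set.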

Relationship types currently defined

The set of relationship types currently defined within the graph.

>>> graph.relationship_types

Open Neo4j browser

Open a page pointing at the Neo4j browser for this graph.

>>> graph.open_browser()

Querying Combat-TB-NeoDB

Finding known variants from a list of genes of interest

Find known variants in the katG and gyrB genes:

>>> from combattbmodel.vcfmodel import Variant
>>> genes = ['katG', 'gyrB']
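>>> # Assumes the py2neo OGM select() API and a `name` attribute on gene objects.
>>> for v in Variant.select(graph):
...     for g in v.occurs_in:
...         if g.name in genes:
...             print(g.name, v.pos, v.consequence)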
gyrB 6620 Asp461Asn
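
Printing is convenient in the console, but downstream analysis usually needs the results in a data structure. The following sketch groups variants by gene name with a hypothetical helper, variants_by_gene. It assumes each variant exposes pos, consequence and occurs_in, and that each related gene exposes a name attribute. The objects below are stand-ins built with SimpleNamespace, not real query results; only the gyrB record is taken from the output above.

```python
from collections import defaultdict
from types import SimpleNamespace


def variants_by_gene(variants, genes):
    """Group (pos, consequence) pairs by gene name for the genes of interest."""
    wanted = set(genes)
    grouped = defaultdict(list)
    for v in variants:
        for g in v.occurs_in:
            if g.name in wanted:
                grouped[g.name].append((v.pos, v.consequence))
    return dict(grouped)


# Stand-in objects mimicking the attributes assumed above.
gyrb = SimpleNamespace(name="gyrB")
other = SimpleNamespace(name="geneX")  # placeholder gene, not in the list
sample = [
    SimpleNamespace(pos=6620, consequence="Asp461Asn", occurs_in=[gyrb]),
    SimpleNamespace(pos=1, consequence="placeholder", occurs_in=[other]),
]

print(variants_by_gene(sample, ["katG", "gyrB"]))
# {'gyrB': [(6620, 'Asp461Asn')]}
```

Converting the gene list to a set keeps the membership test cheap when the list of genes of interest grows.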


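>>> # Alternatively, filter on the gene property in the query itself.
>>> # Assumes the py2neo OGM select().where() API and a `gene` property on Variant.
>>> for gene in genes:
...     for v in list(Variant.select(graph).where(f"_.gene=~'(?i).*{gene}.*'")):
...         print(v.gene, v.pos, v.consequence)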
katG 2155168 Ser315Thr
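
The where clause above interpolates the gene name directly into a regular expression. Cypher's =~ operator must match the whole property value, which is why the name is wrapped in .* on both sides, and (?i) makes the match case-insensitive. If gene names come from user input, it may be worth validating them before building the clause. The sketch below uses a hypothetical helper, gene_where_clause, and assumes gene names consist only of letters, digits and underscores.

```python
import re

# Assumption for this sketch: gene names contain only these characters.
_GENE_NAME = re.compile(r"[A-Za-z0-9_]+")


def gene_where_clause(gene):
    """Return a where-clause string matching variants whose gene contains `gene`."""
    if not _GENE_NAME.fullmatch(gene):
        raise ValueError("unexpected characters in gene name: {!r}".format(gene))
    # (?i) = case-insensitive; .* on both sides because =~ matches the whole value.
    return "_.gene=~'(?i).*{}.*'".format(gene)


print(gene_where_clause("katG"))
# _.gene=~'(?i).*katG.*'
```

The returned string has the same form as the clause used in the loop above, and names containing quotes or regex metacharacters are rejected before they reach the query.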