Cypher
Cypher is a declarative graph query language that allows for expressive and efficient querying and updating of the graph. Cypher is designed to be simple, yet powerful; highly complicated database queries can be easily expressed, enabling you to focus on your domain, instead of getting lost in database access.
Point your browser to localhost:7474 to access the Neo4j browser.
On cold boot, Neo4j has nothing cached, and needs to go to disk for all records. Once records are cached, you will see greatly improved performance. One technique that is widely employed is to “warm the cache”.
To warm the cache, run:
CALL apoc.warmup.run
This will load all nodes and relationships into memory.
To view the schema, run:
call db.schema.visualization
Example Cypher Queries
Exploring Combat-TB-NeoDB
Label count
In Neo4j, node types are called labels. The following query counts the number of nodes per label.
MATCH(node)
RETURN head(labels(node)) AS label,
count(*) AS count
ORDER BY count DESC
Relationship type count
The following query counts the number of relationships per type
MATCH()-[rel]->()
RETURN type(rel) AS rel_type,
count(*) AS count
ORDER BY count DESC
Random relationships
The following query retrieves a random relationship of each type. The query goes through every relationship and thus may take several seconds.
MATCH()-[rel]->() WITH type(rel) AS rel_type, collect(rel) AS rels
WITH rels[toInteger(rand() * size(rels))] AS rel
RETURN startNode(rel), rel, endNode(rel)
Querying Combat-TB-NeoDB
Genes that encode protein, limiting to results to 25
The following query finds genes that encode protein.
MATCH(g:Gene)-[r:ENCODES]->(p:Protein)
RETURN g.name as gene, p.name as protein LIMIT 25
Genes that encode a protein that interacts with a known drug target
The following query finds proteins that interact with known drug targets.
MATCH p=(gene:Gene)-[:ENCODES]-(p1:Protein)-[i:INTERACTS_WITH]-(p2:Protein)-[:TARGET]-(drug:Drug)
RETURN gene.name as Gene, i.score as Score, p2.uniquename as Interactor, drug.name as Drug
ORDER BY Score DESC
Find proteins that interact with a certain protein
The following query finds proteins that interact with a protein that has O06295
as the uniquename or UniProtId.
MATCH p=(:Protein {uniquename:'O06295'})-[r:INTERACTS_WITH]-(:Protein)
RETURN p
Find the top 10 proteins that interact with a specific protein sorted by score
The following query finds the top 10 proteins that interact with a
protein that has O06295
as the uniquename or UniProtId. We the return the score,
sorted in descending order, the UniProtID, and the ProteinName.
MATCH p=(protein)-[r:INTERACTS_WITH]-(:Protein {uniquename:'O06295'})
RETURN r.score as SCORE, protein.uniquename as UniProtID, protein.name as ProteinName
ORDER BY r.score DESC LIMIT 10
Drugs that targets multiple proteins
The following query finds drugs that target multiple proteins.
MATCH(p:Protein)-[:TARGET]-(drug) WITH drug, count(p) AS ProteinSetSize,
collect(protein.uniquename) AS ProteinSet
WHERE ProteinSetSize > 1
RETURN drug.name AS DrugName, drug.accession AS DrugAcc, ProteinSet, ProteinSetSize
ORDER BY ProteinSetSize DESC
Proteins targeted by multiple drugs
The following query finds proteins targeted by multiple drugs.
MATCH(drug:Drug)-[:TARGET]-(protein) WITH protein, count(drug) as DrugCount,
collect(drug.name) as DrugNames, protein.uniquename as Proteins,
protein.function as ProteinFunctions
WHERE DrugCount > 1
RETURN Proteins, ProteinFunctions, DrugCount, DrugNames ORDER BY DrugCount DESC
Which proteins are likely to infer drug resistance (DR) if mutated
The following query finds proteins known to be associated with DR mutations
MATCH(d:Drug)--(:Variant)--(g)--(p:Protein)
RETURN distinct(p.name) as Protein, g.name as Gene, d.name as Drug
Which proteins are targeted by a specific drug (Isoniazid)
The following query finds proteins targeted by Isoniazid.
MATCH(drug:Drug {name: "Isoniazid"})-[r:TARGET]-(protein:Protein)
RETURN drug,r,protein
Which proteins are indirectly targeted by a specific drug
The following query finds proteins that are indirectly targeted by Rifampicin.
MATCH(drug:Drug {name: "Rifampicin"})--(:Variant)--(g:Gene)--(p:Protein) RETURN *