Skip to content

Latest commit

 

History

History
261 lines (225 loc) · 20.8 KB

README.md

File metadata and controls

261 lines (225 loc) · 20.8 KB

Detect and analyze Network Motifs of any network, with Python.

Unit Tests

🪱 We focus on C.elegans connectome networks but the code can be applied to any network. 🪱

We implemented and support:

Enumeration algorithms:

  • Mfinder, both an induced and non-induced version. R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan,D. Chklovskii, and U. Alon, “Network motifs: simple building blocks of complex networks.” Science, vol. 298, no. 5594, pp. 824–827, October 2002"
  • FANMOD - S. Wernicke, “Efficient detection of network motifs” 2006
  • Triadic Census - (a wrapper of networkx implementation) Vladimir Batagelj and Andrej Mrvar, "A subquadratic triad census algorithm for large sparse networks with small maximum degree" 2001

Large Motif search algorithms:

  • SIM - Single Input moudle. Shai S. Shen-Orr, Ron Milo, Shmoolik Mangan & Uri Alon, "Network motifs in the transcriptional regulation network of Escherichia coli"

Random networks algorithms:

  • Simple Markov-Chain: R. Kannan, P. Tetali, S. Vempala, Random Struct. Algorithms 14, 293 (1999).
  • Erdos Renyi: with probability p such that the average |E| of all random networks ~= |E| of the original network. (Wikipedia)
  • Barabási–Albert model: with m (# of edges to attach) = |E| / |N|. then random direction per edge is chosen. (Wikipedia)
  • Anatomical constrained Markov-Chain: allow switches based on the C.elegans nerve-ring distances, based on Zaslaver et al., 2022: "The synaptic organization in the Caenorhabditis elegans neural network suggests significant local compartmentalized computations"

Network formats:

  • simple txt format: (v1, v2, w) per line
  • worm wiring format: https://wormwiring.org/pages/adjacency.html
  • polarity format from the 2020 paper: Fenyves BG, Szilágyi GS, Vassy Z, Sőti C, Csermely P. "Synaptic polarity and sign-balance prediction using gene expression data in the Caenorhabditis elegans chemical synapse neuronal connectome network"
  • Durbin 1986 - https://www.wormatlas.org/neuronalwiring.html
  • multilayer connectome format from the 2016 paper: Bentley B, Branicky R, Barnes CL, Chew YL, Yemini E, et al. "The Multilayer Connectome of Caenorhabditis elegans"

Usage:

This repository contains two main parts: first is a generic network-motif detection software, second is a post-motif-search analysis features spread out in jupyter notebooks.

Motif detection software:

The motif_search_main.py is the main python script which can be ran via any shell / script in a command-line-argument fashion (it also includes the documentations):

Example from the terminal:

motif_search_main.py --help

optional arguments:
  -h, --help            show this help message and exit
  -rs RANDOM_SEED, --random_seed RANDOM_SEED
                        random seed for the entire program
  -lf LOG_FILE, --log_file LOG_FILE
                        file path to save log results
  -bf BIN_FILE, --bin_file BIN_FILE
                        file path to save binary results
  -it {simple_adj_txt,worm_wiring_xlsx,polarity_xlsx,durbin_txt,multilayer,graph,binary_network_file}, --input_type {simple_adj_txt,worm_wiring_xlsx,polarity_xlsx,durbin_txt,multilayer,graph,binary_network_file}
                        the type of the input network
  -inf INPUT_NETWORK_FILE, --input_network_file INPUT_NETWORK_FILE
                        file path of the input network
  -ing INPUT_NETWORK_GRAPH [INPUT_NETWORK_GRAPH ...], --input_network_graph INPUT_NETWORK_GRAPH [INPUT_NETWORK_GRAPH ...]
                        a graph: list of strings (tuples) where each is an edge. in the format: ["1 2" "2 3" ...]
  -rmc, --run_motif_criteria
                        run full motif search with motif criteria tests
  -sa {mfinder_i,mfinder_ni,fanmod,triadic_census,specific}, --sub_graph_algorithm {mfinder_i,mfinder_ni,fanmod,triadic_census,specific}
                        sub-graph enumeration algorithm
  -k K, --k K           the size of sub-graph / motif to search in the enumeration algorithm
  -sim SIM, --sim SIM   the maximum size of control size in the SIM search algorithm
  -uim, --use_isomorphic_mapping
                        run (pre motif search) isomorphic sub-graphs search
  -asl, --allow_self_loops
                        allow self loops in the (pre motif search) isomorphic sub-graphs search
  -st SYNAPSE_THRESHOLD, --synapse_threshold SYNAPSE_THRESHOLD
                        filter neurons with >= # synapses (only in neuron networks files)
  -fsy {chem,gap,all}, --filter_syn_type {chem,gap,all}
                        filter synapse type, supported in durbin and worm_wiring networks
  -fsx {herm,male}, --filter_sex_type {herm,male}
                        filter sex type, supported in durbin and worm_wiring networks
  -fnrn, --filter_nerve_ring_neurons
                        filter the neuronal connectome with only the (180) nerve ring neurons
  -fp {+,-,no pred,complex} [{+,-,no pred,complex} ...], --filter_polarity {+,-,no pred,complex} [{+,-,no pred,complex} ...]
                        polarity: filter neurons with polarity
  -fpn {Glu,GABA,ACh,0} [{Glu,GABA,ACh,0} ...], --filter_prim_nt {Glu,GABA,ACh,0} [{Glu,GABA,ACh,0} ...]
                        polarity: filter neurons with primary neurotransmitter
  -fma {dopamine,octopamine,serotonin,tyramine} [{dopamine,octopamine,serotonin,tyramine} ...], --filter_monoamines {dopamine,octopamine,serotonin,tyramine} [{dopamine,octopamine,serotonin,tyramine} ...]
                        Monoamines: filter neurons with MA transmitter
  -r {markov_chain,nerve_ring_markov_chain,erdos_renyi,barabasi}, --randomizer {markov_chain,nerve_ring_markov_chain,erdos_renyi,barabasi}
                        main randomizer algorithm in a full motif search
  -na NETWORK_AMOUNT, --network_amount NETWORK_AMOUNT
                        amount of random networks to generate in a full motif search
  -sf SWITCH_FACTOR, --switch_factor SWITCH_FACTOR
                        number of switch factors done by the markov chain randomizer
  -a ALPHA, --alpha ALPHA
                        motif criteria alpha for testing p value significance
  -ft FREQUENCY_THRESHOLD, --frequency_threshold FREQUENCY_THRESHOLD
                        motif criteria frequency threshold test
  -ut UNIQUENESS_THRESHOLD, --uniqueness_threshold UNIQUENESS_THRESHOLD
                        motif criteria uniqueness threshold test
  -uut, --use_uniq_criteria
                        whether to use the uniqueness test

Example via .bat script:

@echo off
python motif_search_main.py ^
--input_type durbin_txt ^
--input_network_file "networks/data/Durbin_1986/neurodata.txt" ^
--synapse_threshold 10 ^
--filter_sex_type herm ^
--filter_syn_type chem ^
--sub_graph_algorithm mfinder_i ^
--k 3 ^
--sim 1 ^
--bin_file "durbin_herm_chem_k4_m5.bin" ^
--run_motif_criteria

In both cases it is mandatory to run from the root folder.

Basic output example:

Filtering Neurons of: herm
Filtering Synapses of type: chem
Network file name: SI 2 Synapse adjacency matrices.xlsx

Network properties:
        Neurons in the network: 272
        Participating Neurons (in the graph): 250

        Participating Nodes are neurons with at least: 15 synapses
        Synapses in the network: 9580
        Synapses in the graph: 1587
        Gaps in the network: 0
        Gaps in the graph: 0
        Nodes: 87
        Edges: 80
        Average clustering coefficient: 0.024
        Average shortest path (undirected): 1.415
        Density: 0.011
        Degree: Mean: 1.839 Std: 1.66 Median: 1.0 Max: 12 (node: RIAR)
        In-Degree: Mean: 0.92 Std: 1.147 Median: 1.0 Max: 5 (node: AVAR)
        Out-Degree: Mean: 0.92 Std: 1.127 Median: 1.0 Max: 8 (node: RIAR)

Motif criteria:
        alpha: 0.01
        use uniqueness: False
        uniqueness threshold: 3
        frequency threshold: 0.1

Sub graph enumeration algorithm: fanmod
Search using k: 3
Using isomorphic mapping: True
SIM sub graph search using max-control size: 1
Allow self loops: False
Using nerve ring neurons only: False
Full motif search: True
Sub Graph search timer [Sec]: 0.07
SIM search timer [Sec]: 0.0
Populate polarity motifs timer [Sec]: 0.0

Randomizer: using markov_chain algorithm
Randomizer: generating 1000 random networks
Markov chain switch factor: 10
Markov chain iterations: 800
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 86.29it/s] 
Markov chain success ratio: 0.90157
average # edges of all random networks: 80.0
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [01:05<00:00, 15.22it/s] 

Motif candidates found: 7
Total number of sub graphs found: 154
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  ID  |     Name     |  Adj_mat  |  is-motif  |  N_real  |  N_rand  |  Std  |  Z_score  |  uniq  |  is-significant  |  is-freq  |  is-uniq  |  is-anti-freq  |
+======+==============+===========+============+==========+==========+=======+===========+========+==================+===========+===========+================+
|  6   |   fan_out    | [[0 1 1]  | anti-motif |    35    |  47.73   | 4.78  |   -2.66   |   3    |       True       |   False   |    n/a    |      True      |
|      |              |  [0 0 0]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [0 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  12  |   cascade    | [[0 0 1]  | anti-motif |    52    |  73.39   | 8.51  |   -2.51   |   1    |       True       |   False   |    n/a    |      True      |
|      |              |  [1 0 0]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [0 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  14  |     n/a      | [[0 1 1]  |    none    |    12    |   2.91   | 4.45  |   2.04    |   1    |      False       |   True    |    n/a    |     False      |
|      |              |  [1 0 0]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [0 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  36  |    fan_in    | [[0 0 1]  | anti-motif |    44    |  50.91   | 2.91  |   -2.37   |   8    |       True       |   False   |    n/a    |      True      |
|      |              |  [0 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [0 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  38  | feed_forward | [[0 1 1]  |   motif    |    5     |   1.15   | 1.07  |   3.59    |   1    |       True       |   True    |    n/a    |     False      |
|      |              |  [0 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [0 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  46  |     n/a      | [[0 1 1]  |    none    |    0     |   0.06   | 0.29  |   -0.21   |   0    |      False       |   False   |    n/a    |      True      |
|      |              |  [1 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [0 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  74  |     n/a      | [[0 1 0]  |    none    |    5     |   1.76   | 2.62  |   1.23    |   1    |      False       |   True    |    n/a    |     False      |
|      |              |  [1 0 0]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [1 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  78  |     n/a      | [[0 1 1]  |    none    |    0     |   0.04   | 0.25  |   -0.15   |   0    |      False       |   False   |    n/a    |      True      |
|      |              |  [1 0 0]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [1 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
|  98  |     n/a      | [[0 1 0]  |    none    |    1     |   0.19   | 0.42  |   1.92    |   1    |      False       |   True    |    n/a    |     False      |
|      |              |  [0 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [1 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
| 102  |     n/a      | [[0 1 1]  |    none    |    0     |   0.03   | 0.16  |   -0.17   |   0    |      False       |   False   |    n/a    |      True      |
|      |              |  [0 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [1 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
| 108  |     n/a      | [[0 0 1]  |    none    |    0     |   0.02   | 0.14  |   -0.14   |   0    |      False       |   False   |    n/a    |      True      |
|      |              |  [1 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [1 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
| 110  |     n/a      | [[0 1 1]  |    none    |    0     |   0.00   | 0.04  |   -0.04   |   0    |      False       |   False   |    n/a    |      True      |
|      |              |  [1 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [1 0 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+
| 238  |     n/a      | [[0 1 1]  |    none    |    0     |   0.00   | 0.00  |   0.00    |   0    |      False       |   False   |    n/a    |     False      |
|      |              |  [1 0 1]  |            |          |          |       |           |        |                  |           |           |                |
|      |              |  [1 1 0]] |            |          |          |       |           |        |                  |           |           |                |
+------+--------------+-----------+------------+----------+----------+-------+-----------+--------+------------------+-----------+-----------+----------------+

Post Motif-Search analysis' features:

These utilities are located under /post-motif-analysis folder, they need a binary output file (from the previous step) to work:

--bin_file "your_results_file.bin"
  • Plot Motifs's Z-score vs N_real: image

  • Compare multiple networks normalized and discretized Z-scores: image image image

  • Compare 2 networks Z-scores (via a scatter): image

  • Sort Motifs and Anti-Motifs based on Z-score:

image image

  • Analyze the frequency of nodes per their role in a given Motif: image

  • Analyze the frequency of specific nodes in all Motifs: image

  • Filter (and later draw) all subgraphs where node N was part of motif M in role R, e.g.,: image

  • Rank any set of subgraphs by different metrics (degree, clustering coefficient) image

  • Explore larger-than-k=3 (beasts) motifs: image