Perturbation Efficiency in A Sample Dataset

What is perturbation efficiency?

The perturbation efficiency means that the disturbance and propagation efficiency of an element in the network. In a given network not all nodes have same or similar perturbation efficiency. Changes in some nodes might propagate to whole network and for some nodes the effect might be limited to small subgraph of the network. Not only topology but also miRNA:target interaction dynamics determine perturbation efficiency.

Expression level and type of trigger element plays a crucial role. The trigger element can be an miRNA or competing target. The perturbation efficiency is affected from ratio of miRNA amount to sum of expression levels of its targets. Also, amount of competing element among whole competing elements is important since it determines distribution of miRNA. For example, if trigger is an miRNA with expression level of 1500 and if sum of expression levels of its targets is 1000000, then this miRNA will not perturb its neighborhood efficiently. So, miRNA:target ratio is important for regulation of interaction network.
In a biological system, the miRNA:target interactions does not depend solely on stoichiometry, unfortunately. miRNAs affect the targets via degradation or inhibition after the binding. The experimental studies have shown that the features of miRNA:target interactions determine the binding and degradation efficiency. For example, binding energy between miRNA and target and seed structure of miRNA determine the binding efficiency of complex. In addition, the binding region on the target affects the degradation of target.
On the other hand, interaction factors affecting binding and degradation of miRNA to its targets also have impact one efficiency of perturbation of the change. For instance, if a node has very low binding affinity to targeting miRNA, change in expression level of that node will cause weak or no perturbation.

Thus, we developed functions which can calculate perturbation efficiency of a given node or all nodes. calc_perturbation() function calculates perturbation efficiency for given trigger (e.g. Gene17 2-fold). find_node_perturbation() function screens the whole network and calculate perturbation efficiency of all nodes.

Load the midsamp

# install.packages("devtools")
# devtools::install_github("selcenari/ceRNAnetsim")
library(ceRNAnetsim)
data("midsamp")
midsamp
#>     Genes miRNAs Gene_expression miRNA_expression seeds targeting_region
#> 1   Gene1   Mir1           10000             1000  0.43             0.30
#> 2   Gene2   Mir1           10000             1000  0.43             0.01
#> 3   Gene3   Mir1            5000             1000  0.32             0.40
#> 4   Gene4   Mir1           10000             1000  0.23             0.50
#> 5   Gene4   Mir2           10000             2000  0.35             0.90
#> 6   Gene5   Mir2            5000             2000  0.05             0.40
#> 7   Gene6   Mir2           10000             2000  0.01             0.80
#> 8   Gene4   Mir3           10000             3000  0.43             0.40
#> 9   Gene6   Mir3           10000             3000  0.35             0.90
#> 10  Gene7   Mir3            7000             3000  0.23             0.30
#> 11  Gene8   Mir3            3000             3000  0.01             0.20
#> 12  Gene6   Mir4           10000             5000  0.05             0.40
#> 13  Gene9   Mir4            6000             5000  0.32             0.80
#> 14 Gene10   Mir4            2000             5000  0.43             0.20
#> 15 Gene11   Mir4            8000             5000  0.35             0.90
#> 16 Gene12   Mir4            1500             5000  0.43             0.40
#> 17 Gene13   Mir4             500             5000  0.23             0.90
#> 18 Gene14   Mir4            7000             5000  0.43             0.80
#> 19 Gene14   Mir3            7000             3000  0.43             0.90
#> 20 Gene15   Mir3            3000             3000  0.35             0.20
#> 21 Gene16   Mir3            2000             3000  0.01             0.80
#> 22 Gene17   Mir3            6000             3000  0.23             0.40
#> 23 Gene17   Mir2            6000             2000  0.35             0.90
#> 24 Gene18   Mir2            1000             2000  0.01             0.01
#> 25 Gene19   Mir2            6500             2000  0.43             0.90
#> 26 Gene20   Mir2            4800             2000  0.35             0.80
#>    Energy
#> 1     -20
#> 2     -15
#> 3     -14
#> 4     -10
#> 5     -12
#> 6     -11
#> 7     -25
#> 8      -6
#> 9     -15
#> 10    -20
#> 11    -30
#> 12    -12
#> 13    -18
#> 14    -23
#> 15    -25
#> 16    -30
#> 17    -17
#> 18    -15
#> 19    -25
#> 20    -12
#> 21    -18
#> 22    -22
#> 23     -7
#> 24    -30
#> 25    -32
#> 26    -18

How does the `calc_perturbation()` work?

This function works for a given node from network. It calculates and returns two values:

perturbation efficiency : mean of percentage of expression changes of all elements except trigger in the network
perturbed count : count of disturbed nodes for given iteration number and limit (the minimum change in expression level needed to be considered “disturbed” or “changed”).

In the example below, “Gene17” is up-regulated 3-fold in midsamp dataset where Energy and seeds columns are used for calculating affinity effect and targeting_region columns is used for calculating degradation effect. The network will be iterated over 30 times and number of disturbed nodes (as taking into account nodes that have changed more than the value of the threshold (0.1 percentage in terms of the change)) will be counted.

midsamp %>%
  priming_graph(competing_count = Gene_expression, 
                miRNA_count = miRNA_expression,
                aff_factor = c(Energy,seeds), 
                deg_factor = targeting_region) %>%
  calc_perturbation("Gene17", 3, cycle = 30, limit = 0.1)
#> Warning in priming_graph(., competing_count = Gene_expression, miRNA_count = miRNA_expression, : First column is processed as competing and the second as miRNA.
#> # A tibble: 1 x 2
#>   perturbation_efficiency perturbed_count
#>                     <dbl>           <dbl>
#> 1                   0.377               8

If you are interested in testing various fold change values of a given node, then we can use map (actually parallelized version future_map) to run function for set of input values.

First, let’s keep the primed version of graph in an object

primed_mid <- midsamp %>%
  priming_graph(competing_count = Gene_expression, 
                miRNA_count = miRNA_expression,
                aff_factor = c(Energy,seeds), 
                deg_factor = targeting_region)
#> Warning in priming_graph(., competing_count = Gene_expression, miRNA_count = miRNA_expression, : First column is processed as competing and the second as miRNA.

Now, let’s calculate perturbation efficiency caused by 2-fold to 10-fold increase in Gene17

# for parallel processing
# future::plan(multiprocess) 

seq(2,10) %>%
  rlang::set_names() %>% 
  furrr::future_map_dfr(~ primed_mid %>% calc_perturbation("Gene17", .x, cycle = 30, limit = 0.1), .id="fold_change")

If you’re interested in screening nodes instead of fold changes then you don’t have to write a complicated map command, there’s already a function available for that purpose.

How does the `find_node_perturbation()` work?

The find_node_perturbation() function calculates the perturbation efficiency and perturbed node count of each node in network.

In the example below, each node is increased 2-fold and tested for perturbation efficiency for 4 cycles with threshold of 0.1

midsamp %>%
  priming_graph(competing_count = Gene_expression,
                miRNA_count = miRNA_expression,
                aff_factor = c(Energy,seeds),
                deg_factor = targeting_region) %>%
  find_node_perturbation(how = 3, cycle = 4, limit = 0.1)%>%
  select(name, perturbation_efficiency, perturbed_count)
#> Warning in priming_graph(., competing_count = Gene_expression, miRNA_count = miRNA_expression, : First column is processed as competing and the second as miRNA.
#> # A tibble: 24 x 3
#>    name   perturbation_efficiency perturbed_count
#>    <chr>                    <dbl>           <dbl>
#>  1 Gene1                  0.0626                2
#>  2 Gene2                  0.0974                3
#>  3 Gene3                  0.0311                2
#>  4 Gene4                  0.561                10
#>  5 Gene5                  0.0366                3
#>  6 Gene6                  0.505                12
#>  7 Gene7                  0.251                 6
#>  8 Gene8                  0.00973               0
#>  9 Gene9                  0.688                 6
#> 10 Gene10                 0.510                 6
#> # ... with 14 more rows

On the other hand, some of nodes in network might not be affected from perturbation because of low expression or weak interaction factors. In this case, fast argument can be used. Argument fast calculate affected expression percent of the targets and perturbation calculation is not ran for these elements in network, if that percentage value is smaller than given fast value.

midsamp %>%
  priming_graph(competing_count = Gene_expression,
                miRNA_count = miRNA_expression,
                aff_factor = c(Energy,seeds),
                deg_factor = targeting_region) %>%
  find_node_perturbation(how = 3, cycle = 4, limit = 0.1, fast=5)%>%
  select(name, perturbation_efficiency, perturbed_count)
#> Warning in priming_graph(., competing_count = Gene_expression, miRNA_count = miRNA_expression, : First column is processed as competing and the second as miRNA.
#> Warning: `as_quosure()` requires an explicit environment as of rlang 0.3.0.
#> Please supply `env`.
#> This warning is displayed once per session.
#> Subsetting by edges
#> Warning: `cols` is now required.
#> Please use `cols = c(eff_count)`
#> # A tibble: 24 x 3
#>    name   perturbation_efficiency perturbed_count
#>    <chr>                    <dbl>           <dbl>
#>  1 Gene1                    NA                 NA
#>  2 Gene2                    NA                 NA
#>  3 Gene3                    NA                 NA
#>  4 Gene4                    NA                 NA
#>  5 Gene5                    NA                 NA
#>  6 Gene6                     1.26               8
#>  7 Gene7                    NA                 NA
#>  8 Gene8                    NA                 NA
#>  9 Gene9                     3.12               8
#> 10 Gene10                    2.90               8
#> # ... with 14 more rows

The results of the find_node_perturbation() will list effectiveness or importance of nodes in the network. This function can help selecting nodes which will have effective perturbation in network.

What are the limitations?

find_node_perturbation() runs calc_perturb on all nodes in the network in parallel with help of the future and furrr packages. In this vignette, the function is demonstrated on the midsamp data. This dataset is not comparable to actual biological miRNA:target gene datasets in size and complexity. Although find_node_perturbation() runs in parallel it might take long time to run in real huge biological datasets.

In real biological datasets, more complex interactions whether functional or non-functional could be observed. We have improved our approach with fast argument in find_node_perturbation() based on selection of elements that could be affected from perturbation. In this fucntion, fast argument specifies the percentage of the competing amount that can be affected within the initial competing amount and acts as a selection parameter. For instance, in huge_example data:

data("huge_example")
huge_example%>%
  priming_graph(competing_count = competing_counts, miRNA_count = mirnaexpression_normal)%>%
  find_node_perturbation(how=5, cycle=3, fast = 3)%>%
  select(name, perturbation_efficiency, perturbed_count)

huge_example%>%
  priming_graph(competing_count = competing_counts, miRNA_count = mirnaexpression_normal)%>%
  find_node_perturbation(how=5, cycle=3, fast = 3)%>%
  filter(!is.na(perturbation_efficiency), !is.na(perturbed_count))%>%
  select(name, perturbation_efficiency, perturbed_count)

name	perturbation_efficiency	perturbed_count
AK2	7.938457e-02	542
SLC25A5	1.540258e-01	542
MSL3	3.241137e-02	542
MYCBP2	3.141119e-02	514
VPS41	2.116591e-02	514
ADIPOR2	8.039431e-02	542
CELSR3	2.220963e-03	542
SCMH1	4.186803e-02	542
EXTL3	4.015119e-02	542
NR1H4	1.033477e-05	542

Selcen Ari

2019-10-17

What is perturbation efficiency?

Load the midsamp

How does the `calc_perturbation()` work?

How does the `find_node_perturbation()` work?

What are the limitations?

Contents

Perturbation Efficiency in A Sample Dataset

Selcen Ari

2019-10-17

What is perturbation efficiency?

Load the midsamp

How does the calc_perturbation() work?

How does the find_node_perturbation() work?

What are the limitations?

Contents

How does the `calc_perturbation()` work?

How does the `find_node_perturbation()` work?