
Thursday, November 10, 2022

network permutation

https://www.nature.com/articles/s43588-020-00009-4

Monday, September 20, 2021

ms02 randomness verification

Use a network whose random permutations are known theoretically, so that ms02 output can be compared against the expected distribution.

This direction might be too theoretical and of little practical importance.
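
One way to read the idea above: pick a network small enough that every degree-preserving rewiring can be listed by hand, then check that a randomizer visits each one with the expected frequency. A minimal sketch, assuming a toy network of 4 nodes with one edge each (so exactly 3 possible pairings) and a hypothetical stand-in randomizer `random_matching` in place of ms02, which is not shown here:

```r
set.seed(1)

# Hypothetical stand-in for ms02: shuffle the node "stubs" and pair them up.
# With every node at degree 1, no self-pairs or repeated edges can occur.
random_matching <- function(nodes) {
  s <- sample(nodes)
  pairs <- matrix(s, ncol = 2, byrow = TRUE)
  pairs <- t(apply(pairs, 1, sort))                   # sort within each pair
  pairs <- pairs[order(pairs[, 1]), , drop = FALSE]   # sort the pairs themselves
  paste(apply(pairs, 1, paste, collapse = "-"), collapse = " ")
}

n_sims <- 3000
labels <- replicate(n_sims, random_matching(1:4))
counts <- table(labels)
print(counts)

# Goodness-of-fit test against equal frequencies (the theoretical expectation).
print(chisq.test(counts))
```

A real check would replace `random_matching` with the randomizer under test and use a network whose set of valid rewirings is still small enough to enumerate.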

Monday, January 14, 2019

dang 005 GO run log

ridgeside:
col 3 run in the original folder
col 2 run in _col2 folder

ts117:
col4 run in single-core mode on /scr
col4 doMC version run

ts job run, yeastPIN ms02 Dang 0.001 percentile ~ GO

Modified from Guo's example script for ts.
ts does not have the 'foreach' and 'doMC' libraries, so I had to remove them from the R script.
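
Instead of keeping two versions of the R script, the same script could detect whether the parallel packages are installed and fall back to a sequential loop. A minimal sketch, assuming a hypothetical wrapper `run_sims` and a toy function in place of one permutation run:

```r
# Fall back to a sequential loop when foreach/doMC are not installed.
have_parallel <- requireNamespace("foreach", quietly = TRUE) &&
                 requireNamespace("doMC", quietly = TRUE)

# Hypothetical wrapper, not part of the original script.
run_sims <- function(n, one_sim, cores = 2) {
  if (have_parallel) {
    doMC::registerDoMC(cores = cores)
    `%dopar%` <- foreach::`%dopar%`
    res <- foreach::foreach(i = seq_len(n)) %dopar% one_sim(i)
    unlist(res)
  } else {
    vapply(seq_len(n), one_sim, numeric(1))
  }
}

# Toy simulation standing in for one permutation run.
out <- run_sims(5, function(i) i^2)
print(out)
```
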

-bash-4.2$ cat ts_yeastPIN_job1.pbs
#!/bin/bash -l
#$ -S /bin/bash
#$ -N yPIN.Dang.001.col4
#$ -cwd

. /etc/profile.d/modules.sh
module load shared
module load R/3.4.3

cmd="R -f yeast_Zscore_GO-DangCR.R"

$cmd
-bash-4.2$

-bash-4.2$ qstatus -a
Running jobs:

job-ID  #  name                owner  submit time
------------------------------------------------------------------
8276    1  yPIN.Dang.001.col4  hqin   01/14/2019 10:15:40

Wednesday, November 7, 2018

NetBAS running time problem, yeast PIN, GOBP ~ GOBP 4.2 hours

The following code ran for 4.2 hours on the applejack laptop using a single core. A parallel method is needed to speed this up.

```{r}
pairsBuffer = data.frame(matrix(NA, nrow = 1, ncol=3))
names(pairsBuffer) = c("name1", "name2", "tag")
for ( i in 1:length(pairs[,1])){
  print(i)
  #els1 = sort( unlist( strsplit( pairs$cat1[i], split=",") ))
  #els2 = sort( unlist( strsplit( pairs$cat2[i], split=",") ))
  sub1 = cats[ cats$id == pairs$name1[i], ]
  sub2 = cats[ cats$id == pairs$name2[i], ]

  tagbuffer = allCombinationsOfTwoVectors ( sub1$GO, sub2$GO) #all combinations

  # generate a dataframe buffer with ids
  currentBuffer = data.frame( cbind(rep(pairs$name1[i], length(tagbuffer)),
                                    rep(pairs$name2[i], length(tagbuffer)),
                                    tagbuffer ))
  names(currentBuffer) = c("name1", "name2", "tag")

  pairsBuffer = rbind( pairsBuffer, currentBuffer) #combine with dataframe buffer
}

F.obs = data.frame( table(pairsBuffer$tag))
names(F.obs) = c("tag", "freq")
F.obs
```
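
Part of the cost in a loop like this comes from `rbind` growing `pairsBuffer` on every iteration, which copies the accumulated data frame each time. A minimal sketch of an alternative: build one piece per pair in a list and bind once at the end. The toy `pairs` and `cats` tables and the stand-in for `allCombinationsOfTwoVectors` are assumptions, since the real data and function are not shown here.

```r
# Hypothetical stand-in: the original allCombinationsOfTwoVectors() is not shown,
# so this version pastes every element of v1 with every element of v2.
allCombinationsOfTwoVectors <- function(v1, v2) {
  as.vector(outer(v1, v2, paste, sep = "_"))
}

# Toy inputs (assumed shapes): one row per pair, one row per gene-GO annotation.
pairs <- data.frame(name1 = c("A", "B"), name2 = c("B", "C"),
                    stringsAsFactors = FALSE)
cats  <- data.frame(id = c("A", "A", "B", "C"),
                    GO = c("GO1", "GO2", "GO1", "GO3"),
                    stringsAsFactors = FALSE)

# Build the tag rows for a single pair.
one_pair <- function(i) {
  sub1 <- cats[cats$id == pairs$name1[i], ]
  sub2 <- cats[cats$id == pairs$name2[i], ]
  tags <- allCombinationsOfTwoVectors(sub1$GO, sub2$GO)
  data.frame(name1 = rep(pairs$name1[i], length(tags)),
             name2 = rep(pairs$name2[i], length(tags)),
             tag = tags, stringsAsFactors = FALSE)
}

# lapply() can be swapped for parallel::mclapply(..., mc.cores = n) on Linux or macOS.
pieces <- lapply(seq_len(nrow(pairs)), one_pair)
pairsBuffer <- do.call(rbind, pieces)   # bind once instead of once per iteration

F.obs <- data.frame(table(pairsBuffer$tag))
names(F.obs) <- c("tag", "freq")
print(F.obs)
```

Because each pair is handled by an independent function call, the same `one_pair` can be handed to a parallel apply without further changes.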

Thursday, November 1, 2018

yeast PIN ms02 Zscore map

An interesting curve was seen. Only 2 ms02 null networks were used in this plot.

Friday, September 7, 2018

permutation graphs

https://en.wikipedia.org/wiki/Permutation_graph

Martin Golumbic, Algorithmic Graph Theory and Perfect Graphs, 2nd Edition. Annals of Discrete Mathematics, Volume 57. North Holland, published 4 February 2004. 340 pages. Hardcover ISBN: 9780444515308; eBook ISBN: 9780080526966.

Monday, January 29, 2018

possible ms02 bug and how to fix it.

Update on 20180202: it turns out that the input network contained redundant pairs. So the input data need to be checked for consistency, controlled by a switch such as "-input-check 1".

It is possible that, in the recursive implementation, a recursive call returns pairs that already existed in previous iterations.

This may explain why permutation of large networks gives slightly different random networks.

One way to fix this is to do book-keeping on the entire network (instead of on a small chunk, as the recursive definition does). This could be implemented as a separate function call that breaks self-pairs and reassigns partners.
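
The input check mentioned in the update could look roughly like the following. This is a minimal sketch, assuming an undirected network stored as a two-column data frame; `check_pairs` is a hypothetical helper, not part of ms02, and the toy node names are made up.

```r
# Hypothetical helper: flags self-pairs and redundant pairs in a pair list.
check_pairs <- function(pairs) {
  a <- as.character(pairs[[1]])
  b <- as.character(pairs[[2]])
  # Order each pair so that A-B and B-A compare equal (undirected network assumed).
  lo <- pmin(a, b)
  hi <- pmax(a, b)
  key <- paste(lo, hi, sep = "|")
  list(self_pairs = which(a == b),
       redundant  = which(duplicated(key)),
       cleaned    = pairs[!duplicated(key) & a != b, , drop = FALSE])
}

# Toy input: row 2 reverses row 1, row 3 repeats row 1, row 4 is a self-pair.
toy <- data.frame(ORF1 = c("A", "B", "A", "C"),
                  ORF2 = c("B", "A", "B", "C"),
                  stringsAsFactors = FALSE)
res <- check_pairs(toy)
print(res$redundant)    # rows that repeat an earlier pair
print(res$self_pairs)   # rows where both partners are the same node
print(res$cleaned)
```

Running such a check before permutation separates problems in the input data from problems in the permutation code itself.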

Wednesday, November 29, 2017

p=0.001 ridgeside, rls pairwise difference in yeast biogrid PPI

rm(list=ls())
#setwd("~/github/0.network.aging.ms02/1.Fraser02")
setwd("/home/hqin/github/network.aging.configuration/1.Fraser02")
source("../network.r")
set.seed(2017)
debug = 1; 
start_time = Sys.time();

list.files(path="../data/")

## [1] "ken-RLS-byORF.csv"                             
## [2] "SummaryRegressionHetHomFactorized2015Oct13.csv"
## [3] "unique_biogrid_ScePPI.csv"

rls = read.csv("../data/ken-RLS-byORF.csv");
biogrid = read.csv("../data/unique_biogrid_ScePPI.csv");
fit = read.csv("../data/SummaryRegressionHetHomFactorized2015Oct13.csv")

ppi = biogrid[, c("Systematic.Name.Interactor.A","Systematic.Name.Interactor.B")];
names(ppi) = c("ORF1", "ORF2" )

# First, define a function to calculate the mean absolute RLS difference between paired proteins
 diff.RLS = function( inpairs ) {
   inpairs$rls1 = rls$avgLS[match( inpairs$ORF1, rls$ORF ) ];
   inpairs$rls2 = rls$avgLS[match( inpairs$ORF2, rls$ORF ) ];
   
   inpairs$essen1 = fit$essenflag[match(inpairs$ORF1, fit$orf)];
   inpairs$essen2 = fit$essenflag[match(inpairs$ORF2, fit$orf)];
   
   inpairs$rls1 = ifelse( inpairs$essen1=='essential', 0, inpairs$rls1);
   inpairs$rls2 = ifelse( inpairs$essen2=='essential', 0, inpairs$rls2);
   
   ret = mean( abs( inpairs$rls1 - inpairs$rls2 ), na.rm=T );
 }

 # calculate the observed difference in RLS
 diff.RLS.obs = diff.RLS ( ppi );
 paste( "Observed deltaRLS = ", diff.RLS.obs);

## [1] "Observed deltaRLS =  12.759632586095"

# permutation of pairs, and their difference in RLS
 Nsims = 1000; #number of permutations
 permutated.diff.RLS = numeric( Nsims ); #empty vector to store calculations

library(foreach)
library(doMC)

## Loading required package: iterators

## Loading required package: parallel

registerDoMC(cores=8) #Intel i7 has 6 cores, Xeon E5-2603 @ridgeside has 8 cores

permutated.diff.RLS = foreach(i=1:Nsims) %dopar% {
   new.pairs = ms02_singlerun(ppi ) #generate a new MS02 random network
   new.pairs = new.pairs[,1:2] #reformatting into two columns
   names(new.pairs) = c("ORF1", "ORF2")
   diff.RLS( new.pairs ); 
  }

p-value

permutated.diff.RLS = unlist(permutated.diff.RLS)

summary(permutated.diff.RLS)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   14.09   14.20   14.22   14.22   14.25   14.39

sub = permutated.diff.RLS[ permutated.diff.RLS < diff.RLS.obs ]
paste("pvalue = ", length(sub)/Nsims)

## [1] "pvalue =  0"

hist(permutated.diff.RLS)

stop_time = Sys.time()
paste( "running time = ", stop_time - start_time)

## [1] "running time =  18.9267802158992"
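
A p-value of exactly 0 here means that none of the 1000 permuted values fell below the observed one. A common convention is to report (b + 1) / (N + 1) instead, so the estimate can never be smaller than the number of permutations allows. A minimal sketch, assuming a hypothetical helper `empirical_p` and simulated null values (the mean is taken from the summary above, the standard deviation is a guess, and none of it is real ms02 output):

```r
# Hypothetical helper, not part of the original script.
empirical_p <- function(null_values, observed) {
  b <- sum(null_values <= observed)   # lower tail: nulls at least as small as observed
  (b + 1) / (length(null_values) + 1)
}

set.seed(2017)
fake_null <- rnorm(1000, mean = 14.22, sd = 0.04)  # simulated stand-in for the null
p <- empirical_p(fake_null, observed = 12.76)
print(p)
```

With 1000 permutations and no null value below the observed one, this gives 1/1001, which matches the p = 0.001 in the post title.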

Tuesday, November 28, 2017

RLS P-value evaluation using ms02 permutation

rm(list=ls())
setwd("~/github/0.network.aging.ms02/1.Fraser02")
source("../network.r")
set.seed(2017)
debug = 1;

list.files(path="../data/")

## [1] "ken-RLS-byORF.csv"                             
## [2] "SummaryRegressionHetHomFactorized2015Oct13.csv"
## [3] "unique_biogrid_ScePPI.csv"

rls = read.csv("../data/ken-RLS-byORF.csv");
biogrid = read.csv("../data/unique_biogrid_ScePPI.csv");
fit = read.csv("../data/SummaryRegressionHetHomFactorized2015Oct13.csv")

ppi = biogrid[, c("Systematic.Name.Interactor.A","Systematic.Name.Interactor.B")];
names(ppi) = c("ORF1", "ORF2" )

# First, define a function to calculate the mean absolute RLS difference between paired proteins
 diff.RLS = function( inpairs ) {
   inpairs$rls1 = rls$avgLS[match( inpairs$ORF1, rls$ORF ) ];
   inpairs$rls2 = rls$avgLS[match( inpairs$ORF2, rls$ORF ) ];
   
   inpairs$essen1 = fit$essenflag[match(inpairs$ORF1, fit$orf)];
   inpairs$essen2 = fit$essenflag[match(inpairs$ORF2, fit$orf)];
   
   inpairs$rls1 = ifelse( inpairs$essen1=='essential', 0, inpairs$rls1);
   inpairs$rls2 = ifelse( inpairs$essen2=='essential', 0, inpairs$rls2);
   
   ret = mean( abs( inpairs$rls1 - inpairs$rls2 ), na.rm=T );
 }

 # calculate the observed difference in RLS
 diff.RLS.obs = diff.RLS ( ppi );
 paste( "Observed deltaRLS = ", diff.RLS.obs);

## [1] "Observed deltaRLS =  12.759632586095"

# permutation of pairs, and their difference in RLS
 Nsims = 100; #number of permutations
 permutated.diff.RLS = numeric( Nsims ); #empty vector to store calculations

library(foreach)
library(doMC)

## Loading required package: iterators

## Loading required package: parallel

registerDoMC(cores=5) #Intel i7 has 6 cores

permutated.diff.RLS = foreach(i=1:Nsims) %dopar% {
   new.pairs = ms02_singlerun(ppi ) #generate a new MS02 random network
   new.pairs = new.pairs[,1:2] #reformatting into two columns
   names(new.pairs) = c("ORF1", "ORF2")
   diff.RLS( new.pairs ); 
  }

p-value

permutated.diff.RLS = unlist(permutated.diff.RLS)

summary(permutated.diff.RLS)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   14.14   14.20   14.23   14.23   14.25   14.31

sub = permutated.diff.RLS[ permutated.diff.RLS < diff.RLS.obs ]
paste("pvalue = ", length(sub)/Nsims)

## [1] "pvalue =  0"

hist(permutated.diff.RLS)

P-value evaluation using ms02 on ridgeside

H Qin

11/27/2017
