Tuesday, November 14, 2017

foreach on a data frame: a large data frame causes memory collapse


https://stackoverflow.com/questions/29828710/parallel-processing-in-r-for-a-data-frame

Caution:
The following foreach run on a large data frame (from cellmap.org) caused memory to collapse on my MacPro, presumably because each of the parallel workers accesses (and may copy) the large data frame.

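library(foreach)
library(doMC)

registerDoMC(4)  # Intel i7 has 4 cores
tb.gin.lenient$ordered_pairs <- NA

total <- nrow(tb.gin.lenient)  # total number of rows; the loop below only covers the first 50
x <- foreach(i = 1:50, .combine = rbind) %dopar% {
  # Take the two ORF names of row i as a plain character vector
  pairs <- as.character(unlist(tb.gin.lenient[i, c("ORF1", "ORF2")]))
  if (is.na(pairs[1]) || is.na(pairs[2])) {
    tb.gin.lenient$ordered_pairs[i] <- "NA_found"
  } else {
    ordered_pairs <- sort(pairs)
    tb.gin.lenient$ordered_pairs[i] <- paste(ordered_pairs[1], ordered_pairs[2], sep = "_")
  }
  # The assignments above only change this worker's copy, so return the updated row
  tb.gin.lenient[i, , drop = FALSE]
}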
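
One possible way around the problem, as a minimal sketch rather than a tested fix for the full cellmap.org table: the ordered pair can be built for all rows at once with base R's pmin() and pmax(), so no worker ever has to touch the data frame. The small data frame below is a made-up stand-in for tb.gin.lenient (the ORF values are hypothetical), and the sketch assumes ORF1 and ORF2 are character columns, not factors.

```r
# Toy stand-in for tb.gin.lenient (hypothetical values, not real cellmap.org data)
tb.gin.lenient <- data.frame(
  ORF1 = c("YBR010W", "YAL001C", NA),
  ORF2 = c("YAL001C", "YDR005C", "YCL001W"),
  stringsAsFactors = FALSE
)

# Element-wise min/max put each pair in alphabetical order without a loop;
# both return NA for a row if either ORF is NA
first  <- pmin(tb.gin.lenient$ORF1, tb.gin.lenient$ORF2)
second <- pmax(tb.gin.lenient$ORF1, tb.gin.lenient$ORF2)

# Same labelling as in the foreach loop: "NA_found" when an ORF is missing
tb.gin.lenient$ordered_pairs <- ifelse(
  is.na(first),
  "NA_found",
  paste(first, second, sep = "_")
)

print(tb.gin.lenient)
```

Because this works on whole columns in a single process, it avoids both the per-row indexing and the per-worker copies of the data frame. Whether it is fast enough for the full table is something to check on the real data.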
