Thursday, November 16, 2017

biogrid tab2 clean, perl code


FILE:  1.clean.tab2.20171115.pl


#! /usr/bin/perl

# usage:
# ~/pypeworks/biogrid/00_cleanup_biogrid_table.pl  BIOGRID-ORGANISM-Homo_sapiens-3.2.117.tab2.txt > BIOGRID-ORGANISM-Homo_sapiens-3.2.117.tab2.clean.txt
# can use bash to apply it to more species as needed

while ( <>) {
    chomp;
    @aux = split "\t";
    @new = ();
    foreach (@aux) {
        $blah = $_;
        $blah =~ s/\s//g;
        ($blah eq '-') && ($blah='');
        if (! $blah) {
            push @new, 'NA';
        } else {
            # somebody in BioGRID thought it would be cute to use backslash
            # as quotes
            $blah = $_;
            $blah =~ s/\\//g;
            push @new, $blah;
        }
    }
    
    print join ("\t", @new), "\n";

}

Based on 
https://github.com/ivanamihalek/biogrid/blob/master/00_cleanup_biogrid_table.pl 

No comments:

Post a Comment