Many ad hoc scripts are written for bioinformatics projects. I intend to list here some common bugs that I have often introduced myself. This is going to be a working list. Hopefully, it can lead to better ways of coding and of managing projects.
- Hard-coded paths that were not updated as projects evolved. This often leads to reading the wrong files or writing results to unintended directories.
- A hard-coded debugging variable that was not switched back off for production runs.
- Mix-ups of variable names, file names, etc.
- Compatibility problems. These can occur after software upgrades. For example, after upgrading to Perl 5.10.0, I had to reinstall BioPerl to get my existing code working.
- File format problems. With the myriad data formats in use, this problem is going to keep bugging us.
- Typos.
- Logical mistakes, which often occur in if/else statements.
Some experiences and lessons on dealing with these problems:
I spent 3 hours fixing a directory problem in a Perl script for a batch run. I noticed the job was not running in the right directory even though I called chdir $homedir at every step. The problem was that I had pasted the directory path twice into the variable $homedir, so the path did not exist. chdir failed without any message, and Perl simply stayed in the current directory. I was so sure that $homedir was correct, because I had copy-pasted it, that I did not check it. I found the mistake only when I copy-pasted the long directory path again and saw that the lengths did not match. I had spent 2 hours on this before 1am. I then decided to go to sleep and look at it fresh in the morning. The fresh morning energy helped me spot the error.
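A check on the return value of chdir would have exposed this at once. The following is only a minimal sketch: the helper name safe_chdir and the doubled path are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical helper: change directory, or die naming the offending path.
sub safe_chdir {
    my ($dir) = @_;
    chdir $dir or die "Cannot chdir to '$dir': $!\n";
    return getcwd();
}

# Made-up path, pasted twice by mistake as in the story above.
my $homedir = '/home/me/project/home/me/project';

# Unchecked: chdir returns false and the script stays where it was.
my $before = getcwd();
my $ok     = chdir $homedir;
print "unchecked chdir succeeded? ", ($ok ? "yes" : "no"), "\n";
print "still in the starting directory\n" if getcwd() eq $before;

# Checked: the failure is reported immediately, with the bad path in view.
my $done = eval { safe_chdir($homedir); 1 };
print "checked chdir failed: $@" unless $done;
```

Printing the path inside the error message matters here, because seeing the doubled directory on screen is what finally reveals the copy-paste mistake. The autodie pragma is another way to make a failed chdir fatal without writing the check by hand.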
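- Switching between different syntaxes
In Perl, qw() and an ordinary list of quoted strings use different syntax. A list of quoted strings separates its items with commas, while qw() separates them with whitespace only, so no comma is needed.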
case 1: qw(results.H0.txt results.H1C.Gblocks.model1.txt results.H2C1S1.txt);
case 2: qw(results.H0.txt, results.H1C.Gblocks.model1.txt, results.H2C1S1.txt);
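In case 2, the first file name will actually be treated as "results.H0.txt," with the comma included, and the same happens to the second name. This extra comma is an obvious mistake once spotted, but an easy one to make. With warnings enabled, Perl reports "Possible attempt to separate words with commas" for such a list.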
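The difference between the two cases can be checked with a short script. This is a sketch only: the array names @good, @bad and @suspect are hypothetical, and the qw warning is switched off inside the block purely so that the faulty list can be shown on purpose.

```perl
use strict;
use warnings;

# Case 1: items separated by whitespace, as qw expects.
my @good = qw(results.H0.txt results.H1C.Gblocks.model1.txt results.H2C1S1.txt);

# Case 2: the commas become part of the words.
my @bad;
{
    no warnings 'qw';    # silence the warning only for this demonstration
    @bad = qw(results.H0.txt, results.H1C.Gblocks.model1.txt, results.H2C1S1.txt);
}

# Brackets make a stray trailing comma easy to see.
print "case 1: [$_]\n" for @good;
print "case 2: [$_]\n" for @bad;

# Simple sanity check: flag any file name that ends with a comma.
my @suspect = grep { /,$/ } @bad;
print "suspicious name: [$_]\n" for @suspect;
```

Only the first two names in case 2 are flagged. The last one carries no comma, which makes the bug look intermittent when some files open and others do not.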