- The RLS set itself also has some problems. For example, swh1 and osh1 are two names for the same gene, YAR042W, yet the two tests report mean RLSs of 22.4 and 26.4, respectively. I found 9 such cases. For each of them we can keep only one value (perhaps the average of the two tests).
- For the essential genes, I think we may start with the DEG set, which is based on the yeast deletion collection, and require that a gene: a) does not appear in the RLS set; b) does not appear in the NxN CellMap table; and c) is not labeled nonessential in the growth-fit set. These three criteria give 70, 65, and 73 overlaps, respectively, and overall 91 genes should be removed from the DEG set. Of these genes, 70 have RLS data and 21 do not.
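The two clean-up steps above could be sketched roughly as follows. This is only a minimal illustration, not the script I actually used: the function names, the alias map, and the placeholder identifiers `ORF_A`, `ORF_B`, `ORF_C` are all made up for the example, and each data set is assumed to be already loaded as a plain Python list or set of systematic ORF names.

```python
from collections import defaultdict
from statistics import mean


def collapse_aliases(rls_records, alias_to_orf):
    """Average the mean RLS of tests that map to the same systematic ORF.

    rls_records  : iterable of (gene_name, mean_rls) pairs, one per test
    alias_to_orf : dict mapping upper-case gene names to systematic names
                   (hypothetical lookup table, e.g. built from SGD)
    """
    by_orf = defaultdict(list)
    for name, rls in rls_records:
        # Names missing from the alias map are assumed to be ORF names already.
        orf = alias_to_orf.get(name.upper(), name.upper())
        by_orf[orf].append(rls)
    return {orf: mean(values) for orf, values in by_orf.items()}


def filter_essential(deg, rls_orfs, nxn_orfs, growth_nonessential):
    """Split the DEG set into genes to keep and genes to remove.

    A gene is removed if it appears in the RLS set, appears in the NxN
    table, or is labeled nonessential in the growth-fit set.
    """
    flagged = deg & (rls_orfs | nxn_orfs | growth_nonessential)
    return deg - flagged, flagged


# Step 1: the swh1/osh1 case from the text collapses to one YAR042W entry.
averaged = collapse_aliases(
    [("swh1", 22.4), ("osh1", 26.4)],
    {"SWH1": "YAR042W", "OSH1": "YAR042W"},
)

# Step 2: placeholder ORF names, only to show how the three criteria combine.
kept, flagged = filter_essential(
    deg={"ORF_A", "ORF_B", "ORF_C"},
    rls_orfs={"ORF_A"},
    nxn_orfs={"ORF_A", "ORF_B"},
    growth_nonessential=set(),
)
```

With the real data, `flagged` would be the list of genes to remove from the DEG set, and intersecting it with the RLS set would show which of them have RLS data.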
Thursday, February 1, 2018
Yeast essential genes, quality check
Their RLSs range from ~6 to >40. Other genes, such as SOD1, are not considered essential (SOD1 is in none of the essential sets I mentioned), even though its mutation leads to an RLS of ~3.
Here's another example. Shown below are the numbers of entries for several genes in the SGA_ExE CellMap table, followed by the measured RLS of the mutant strain:
YDR364C (736, 16.4), YGR092W (5237, 9.8), YHR191C (2127, 8.0), YLR268W (2769, 23.0).
These genes also appear in the SGA_NxN table with 3k-7k entries.
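A check like this can be automated. The sketch below assumes the pairs listed above are (number of SGA_ExE entries, measured RLS), and that the ORF names of each table have already been loaded into sets. The function name `conflicting_orfs` and the demo sets are hypothetical and only show the shape of the check.

```python
# Values copied from the text: ORF -> (SGA_ExE entries, measured RLS).
exe_and_rls = {
    "YDR364C": (736, 16.4),
    "YGR092W": (5237, 9.8),
    "YHR191C": (2127, 8.0),
    "YLR268W": (2769, 23.0),
}


def conflicting_orfs(exe_orfs, nxn_orfs, rls_orfs):
    """Return ORFs from the essential (ExE) table that also carry
    evidence of being nonessential, with the reasons for each."""
    conflicts = {}
    for orf in sorted(exe_orfs):
        evidence = []
        if orf in nxn_orfs:
            evidence.append("in SGA_NxN")
        if orf in rls_orfs:
            evidence.append("has RLS")
        if evidence:
            conflicts[orf] = evidence
    return conflicts


# Demo with made-up membership sets; real sets would come from the tables.
demo = conflicting_orfs(
    exe_orfs=set(exe_and_rls),
    nxn_orfs={"YDR364C"},
    rls_orfs={"YDR364C", "YGR092W"},
)
```

For the four genes above, both conditions hold according to the text, so each of them would be reported with both reasons once the real SGA_NxN and RLS sets are passed in.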
Overall, two steps are needed to clean up the data: