Better AI in Image Analysis Software?
J.P. Duke asks: "There is an excellent research article published by the Mayo Clinic in the J Ortho Sci that compares two common software-based approaches in analyzing scanned protein gels. Among other conclusions, they found that the two most popular applications for this research had different tendencies in quantifying proteins -- and that differences in AI algorithms show clearly different results for proteins that are less-separated on gels. This implies that much major scientific research that depends on these tools might be subject to flaws very early in analysis. Being a cancer researcher at a large research institution, much of my work depends on software being able to accurately analyze scanned images of protein gels in which proteins are simply displayed as spots on the gel. Among other things, the software needs to be able to precisely calculate the density of protein in a spot as well as the number of actual proteins contained in a spot. What we choose to investigate further as potential biomarkers for cancer depends heavily on the ability of the AI built into these applications." Exactly how far has image-based AI improved in the last several years? Might some of those improvements help someone in J.P.'s situation?
"My questions for Slashdot are as follows:
- Overall, how good has research image software AI become in recent years? Have there been any key software or mathematical breakthroughs that have substantially increased the 'intelligence' of software? How far along is this technology?
- Based on your knowledge of software, what are some things researchers can do to help the software better do its job? For example, using a high quality scanner at higher resolutions generally helps results. What other things can be done to promote better results?
- Finally, all applications that I know of in this area are expensive commercial solutions. As the companies that produce the applications are for-profit, the algorithms and technology used are completely closed and proprietary. Thus it is hard to understand what the software is really doing. Does anybody know of any open source (or at least 'open algorithm') solutions? Even if they are inferior at this point in time, being able to clearly understand what the AI is doing makes us better off in several ways."
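To make "calculate the density of protein in a spot" concrete, here is a minimal sketch of what a fully open algorithm could look like. It is not what any commercial package does, as far as anyone outside those companies can tell. The function name quantify_spots, the median-as-background rule, and the threshold parameter are all assumptions made for illustration.

```python
from collections import deque
from statistics import median


def quantify_spots(image, threshold):
    """Find spots in a 2D grid of optical-density values (higher = darker).

    Returns one dict per spot, in row-major scan order, with its pixel area
    and its background-subtracted integrated density ("volume").
    """
    rows, cols = len(image), len(image[0])
    # Assumption: most of the gel is empty, so the median pixel value
    # is a reasonable stand-in for the background level.
    background = median(v for row in image for v in row)
    seen = [[False] * cols for _ in range(rows)]
    spots = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or image[r0][c0] - background <= threshold:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area = 0
            volume = 0.0
            while queue:
                r, c = queue.popleft()
                area += 1
                volume += image[r][c] - background
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and image[nr][nc] - background > threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            spots.append({"area": area, "volume": volume})
    return spots


# Toy "gel": one 2x2 spot and one single-pixel spot on an empty background.
gel = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 5, 0, 0, 0],
    [0, 5, 5, 0, 0, 0],
    [0, 0, 0, 0, 9, 0],
    [0, 0, 0, 0, 0, 0],
]
for spot in quantify_spots(gel, threshold=1):
    print(spot)
```

Note that a plain threshold like this reports two touching spots as a single spot. Poorly separated proteins are exactly the case where the choice of segmentation method starts to matter, so two packages that split such blobs differently would be expected to disagree there.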
The problem is not building some clickety tool that lets the fine biologists manually extract their data. That is easy. The problem is having an automated tool that can go through the metric ton of images a lab can produce and perform consistently across an otherwise inconsistent set of individual experimental procedures. Suppose you find the magic solution to all those problems: is there a scientific background to the image analysis techniques you used? The answer is no; anybody could argue that the results are an artifact of the software. Try explaining how a neural network does its thing. You can't for a particular image. That's probably why everyone uses NIH Image...
Protein gels are just a means to an end. They're an indirect way to obtain information about something. I suggest those fine biologists find another means to the same end. As it is, gels are extremely unreliable and tricky to get right in a standardized fashion.