
CaseyCode


Computer Science, U of Idaho

TBIN: T-RFLP Binning program

Oct 10, 2009 by Casey

May 18, 2009 - (November, 2009 : estimated)
Supervisor: Zaid Abdo
Customers: Zaid and Larry's team at the U of Idaho.



About the Project

Some biologists need to process sample data in order to recognize phylotypes, using statistics-based programs that read, filter, and analyze the data. Until now they have written these programs in Perl and/or R, but this is a pain in the neck for them because most biologists have little programming experience.
This project provides a graphical user interface so that they can manage their data simply by clicking the mouse, without knowing a programming language.



Download

TBIN Mac Version (v0.90): click here for more information.


Documents

Requirements:
Runs on PC, Mac, and Linux
Documentation for future maintenance
High performance
Provides a graphical user interface
Runs on the web (if we have enough time)
  1. Read data:
    1. Open multiple files
    2. Read files (two types of files)
  2. Process data:
    1. Filtering - standardizing (normalizing, z-scores)
    2. Binning (distance based, model based)
  3. Analyze data
  4. Save data
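
The "Process data" steps above can be illustrated with a minimal C++ sketch. It is not TBIN's actual code: the function names zScores and binByDistance, the use of the sample standard deviation, and the rule "start a new bin when a fragment length is more than maxGap from the first length in the current bin" are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Standardize values to z-scores: (x - mean) / sample standard deviation.
// Returns all zeros for fewer than two values or when the deviation is zero.
std::vector<double> zScores(const std::vector<double>& values) {
    std::vector<double> result(values.size(), 0.0);
    if (values.size() < 2) return result;

    double sum = 0.0;
    for (double v : values) sum += v;
    const double mean = sum / values.size();

    double squared = 0.0;
    for (double v : values) squared += (v - mean) * (v - mean);
    const double sd = std::sqrt(squared / (values.size() - 1));
    if (sd == 0.0) return result;

    for (std::size_t i = 0; i < values.size(); ++i)
        result[i] = (values[i] - mean) / sd;
    return result;
}

// Distance-based binning (assumed rule): sort the fragment lengths, then open
// a new bin whenever a length lies more than maxGap beyond the first length
// of the current bin.
std::vector<std::vector<double>> binByDistance(std::vector<double> lengths,
                                               double maxGap) {
    std::sort(lengths.begin(), lengths.end());
    std::vector<std::vector<double>> bins;
    for (double len : lengths) {
        if (bins.empty() || len - bins.back().front() > maxGap)
            bins.emplace_back();
        bins.back().push_back(len);
    }
    return bins;
}
```

For example, binByDistance({100.2, 150.0, 100.9, 150.4, 200.0}, 1.0) groups the five lengths into three bins. A model-based binning method would replace the fixed maxGap rule with a statistical model and is not sketched here.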

Language and Framework: C++ with Cocoa (for Mac) / Qt (for Windows, Linux), not Java and Swing
For a multi-platform GUI program, I consider Java the best candidate: in my experience, Java GUIs are reliable on all popular platforms now and are likely to remain so. Although I have heard that Qt is becoming more reliable, I still believe Java is the best choice for a multi-platform GUI program for now.
However, performance is critical for this program. According to a benchmark, C++ is roughly 10%-20% faster than Java in scientific programs, and I decided that performance matters more than GUI reliability. From what I have seen, Qt provides a reasonable level of reliability and seems easy to learn.
The processing classes will be as library- and platform-independent as possible, because the program will also run on clusters (which have many CPUs and a large amount of memory). For the Macintosh GUI, I will use Cocoa, which is Macintosh-specific.
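
One way to keep the processing classes independent of the GUI library is to let them report back through a small abstract interface. The sketch below only illustrates that idea and is not TBIN's design: ProgressListener, SampleProcessor, CountingListener, and the minimum-height filter are hypothetical names and behavior, and the code uses the C++ standard library only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical callback interface: the processing core reports progress
// through it and never includes a Qt or Cocoa header.
class ProgressListener {
public:
    virtual ~ProgressListener() = default;
    virtual void onProgress(std::size_t done, std::size_t total) = 0;
};

// Hypothetical processing core built on the standard library only.
class SampleProcessor {
public:
    explicit SampleProcessor(ProgressListener* listener = nullptr)
        : listener_(listener) {}

    // Keeps only the peaks whose height is at least minHeight in each sample,
    // notifying the listener (if any) after every sample.
    std::vector<std::vector<double>> filter(
            const std::vector<std::vector<double>>& samples,
            double minHeight) const {
        std::vector<std::vector<double>> kept;
        kept.reserve(samples.size());
        for (std::size_t i = 0; i < samples.size(); ++i) {
            std::vector<double> peaks;
            for (double h : samples[i])
                if (h >= minHeight) peaks.push_back(h);
            kept.push_back(peaks);
            if (listener_) listener_->onProgress(i + 1, samples.size());
        }
        return kept;
    }

private:
    ProgressListener* listener_;
};

// A command-line run on a cluster could use a listener that only counts
// calls; a Qt or Cocoa front end would update a progress bar instead.
class CountingListener : public ProgressListener {
public:
    void onProgress(std::size_t, std::size_t) override { ++calls; }
    std::size_t calls = 0;
};
```

With this split, the same SampleProcessor could be compiled into the Qt build, the Cocoa build, and a command-line build for the cluster, and only the listener implementation would differ.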


Risk:
Developing and deploying a program for many platforms is never easy.
The time available for development may not be long enough.



Schedule and Milestones

June-05-2009: First working prototype (Qt)
Sep-30-2009: (KC::estimated) Mac version of the program - v0.90
Nov-20-2009: (KC::estimated) Ship the first PC and Ubuntu Linux version of the program
Dec-10-2009: (KC::estimated) Finish fixing major bugs and writing documentation for future developers



Known Bugs/Features to be added

- The text box for user input behaves unexpectedly when the user enters a long number (processing is not affected)
- Saving separate files for each color option does not show a proper "file exists" warning
- If the program is used for a long time, it may become slow (potential memory leak)
- If an invalid number is chosen, a warning message should be shown (to be added)
- Preferences and program information could be added
