GeneShelf
A Web-based Visual Interface for Large Gene Expression Time-Series Data Repositories

GeneShelf User Manual

Return to GeneShelf Home   Start GeneShelf - a Java Applet

1. What is GeneShelf?
The widespread use of high-throughput gene expression analysis techniques has enabled the biomedical research community to share a huge body of gene expression datasets in many public databases on the web. However, current gene expression data repositories provide static representations of the data and support only limited interaction, which hinders biologists from effectively exploring shared datasets. Responding to the growing need for better interfaces that improve the utility of public datasets, we have designed and developed a new web-based visual interface called GeneShelf. It builds upon a zoomable grid display to represent two categorical dimensions. It also incorporates an augmented timeline with expandable time points, which shows multiple data values for the focused time point more clearly by embedding bar charts. We applied GeneShelf to one of the largest microarray datasets (about 1,000 genechips), generated to study the progression of and recovery from spinal cord injuries in mice and rats. This dataset has multiple dimensions: animal model (mouse or rat), location relative to the injury site (at, above, and below), severity of injury (sham, mild, moderate, and severe), and time after injury (7 time points for mice and 10 time points for rats). For more detail on this dataset, refer to the dataset section on the GeneShelf home page.

GeneShelf showing a relevant genetic pathway at the injury site (T9), 7 days after a mild spinal cord injury at vertebral level T9. The highlighted "hmgcs1" is very active across all experimental conditions. When users mouse over an item representing a gene in one view, the corresponding items are highlighted in red in all other views.



There are four visualization components in GeneShelf:
Top 10 pathway list view (A)
It shows the top ten pathways relevant to a user-selected experimental condition.
Pathway view (B)
It is a zoomable lightweight pathway visualization tool.
Gene list view (C)
It shows gene names and other annotations for the current pathway.
nTimeLines grid view (D)
It shows the two dimensions of the study design: the horizontal axis shows the severity of injury, and the vertical axis shows the sampling location relative to the injury site. The severity of spinal cord injury increases from left to right (sham, mild, moderate, and severe). For each severity column, there are three sampling locations from top to bottom (T8: above the injury site, T9: the injury site, and T10: below the injury site).
   
2. How to run it
GeneShelf is a Java application built on the Piccolo toolkit.
To launch GeneShelf, you need Java (Java SE Runtime Environment 6 or higher) installed on your computer. If you are prompted to install a newer version of Java after clicking on one of the links above, install it before continuing. GeneShelf requires Sun's Java Virtual Machine to run properly; you can download it from the Java SE Downloads page.

Steps to utilize GeneShelf

  1. Fetch data from database
  2. Select a pathway from the top 10 pathway list view
  3. Use the pathway view
  4. Use the gene list view
  5. Use the nTimeLines grid view
   
3. Fetch data from database
The first thing you have to do is fetch pathway data from our databases. Click on the "Fetch Data" button at the top left corner of GeneShelf.


Specify the experimental condition (the reference condition) at which significant genes are filtered: animal model, injury site, injury level, and time point. Using the "Normalized to" combo box, you specify how significant genes are filtered: by comparing the selected time point either to time 0 or to the previous time point. The significant genes are used to determine the top ten related pathways from the WikiPathways.org public pathway database using Fisher's exact test. Using the "Signal Algorithm" radio buttons, you can also specify the signal algorithm used to generate gene expression values from the gene chips. Once you click on the "Fetch Data" button, GeneShelf downloads the top 10 pathway list and the whole gene expression data of the top-ranked pathway.
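As a rough illustration of how pathways could be scored with Fisher's exact test, the sketch below computes a one-sided (over-representation) p-value from a 2x2 table of significant versus non-significant genes, inside versus outside a pathway, and then keeps the ten pathways with the smallest p-values. The function names, the one-sided variant, and the ranking by p-value are assumptions made for illustration; this manual does not state which variant GeneShelf's server uses.

```python
from math import comb


def fisher_right_tail(sig_in, sig_out, nonsig_in, nonsig_out):
    """One-sided Fisher's exact test p-value (over-representation).

    sig_in     : significant genes that belong to the pathway
    sig_out    : significant genes outside the pathway
    nonsig_in  : non-significant genes that belong to the pathway
    nonsig_out : non-significant genes outside the pathway
    """
    total = sig_in + sig_out + nonsig_in + nonsig_out
    pathway_size = sig_in + nonsig_in
    n_sig = sig_in + sig_out
    upper = min(pathway_size, n_sig)
    # Sum hypergeometric probabilities for overlaps at least as large as observed.
    tail = sum(
        comb(pathway_size, k) * comb(total - pathway_size, n_sig - k)
        for k in range(sig_in, upper + 1)
    )
    return tail / comb(total, n_sig)


def rank_pathways(tables, top_n=10):
    """tables maps a pathway name to its (sig_in, sig_out, nonsig_in, nonsig_out)."""
    scored = [(fisher_right_tail(*t), name) for name, t in tables.items()]
    return [name for _, name in sorted(scored)[:top_n]]
```

A pathway whose genes are all significant, such as the table (3, 0, 0, 3), gets a small p-value and ranks ahead of a pathway with only chance-level overlap.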

Revisit previous/next reference conditions
Right next to the "Fetch Data" button, there are two more buttons: "Previous list" and "Next list." GeneShelf keeps a history of the 10 most recently fetched reference conditions. Users can revisit those reference conditions by clicking these two buttons. When users mouse over a button, GeneShelf shows the reference condition that will be reloaded when the button is pressed.

Previous/Next list buttons
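
The previous/next behavior described above can be pictured as a bounded list with a cursor. The following is a minimal sketch under assumptions: the class name `ConditionHistory`, its methods, and the choice to simply append a new fetch to the end of the history are hypothetical and are not taken from GeneShelf's source.

```python
class ConditionHistory:
    """Bounded history of fetched reference conditions with a cursor."""

    def __init__(self, capacity=10):
        self.capacity = capacity
        self.items = []
        self.cursor = -1  # index of the condition currently shown

    def fetch(self, condition):
        # Assumption: a new fetch is appended and the oldest entry is dropped
        # once the history holds more than `capacity` conditions.
        self.items.append(condition)
        if len(self.items) > self.capacity:
            self.items.pop(0)
        self.cursor = len(self.items) - 1

    def peek_previous(self):
        # What a tooltip on the "Previous list" button could display.
        return self.items[self.cursor - 1] if self.cursor > 0 else None

    def peek_next(self):
        # What a tooltip on the "Next list" button could display.
        has_next = self.cursor + 1 < len(self.items)
        return self.items[self.cursor + 1] if has_next else None

    def previous(self):
        if self.cursor > 0:
            self.cursor -= 1
        return self.items[self.cursor] if self.items else None

    def next(self):
        if self.cursor + 1 < len(self.items):
            self.cursor += 1
        return self.items[self.cursor] if self.items else None
```

The `peek_*` methods correspond to the mouse-over preview, and `previous`/`next` correspond to pressing the buttons.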

Please note that fetching data can take about 10 to 15 seconds in a typical LAN environment.
   
4. Select a pathway from the top 10 pathway list view
Please note that GeneShelf automatically selects the top-ranked pathway and shows the whole gene expression data of that pathway. If you want to examine the gene expression data of a different pathway, just click on a pathway name in the top 10 pathway list view. Gene expression data of every pathway you have selected are stored on the local machine, so GeneShelf does not need to fetch the same data from the server again when you revisit that pathway. Such pathways are highlighted in purple in this list view.
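
The local storage of already-selected pathways behaves like a simple cache. Here is a minimal sketch, assuming a hypothetical `fetch_from_server` callable and a cache keyed by reference condition and pathway name; how GeneShelf actually stores the data is not described in this manual.

```python
class PathwayCache:
    """Keeps expression data of visited pathways so revisits skip the server."""

    def __init__(self, fetch_from_server):
        # fetch_from_server(condition, pathway) is a hypothetical callable
        # standing in for the real (slow) database request.
        self._fetch = fetch_from_server
        self._store = {}

    def get(self, condition, pathway):
        key = (condition, pathway)
        if key not in self._store:
            self._store[key] = self._fetch(condition, pathway)
        return self._store[key]

    def is_cached(self, condition, pathway):
        # A list view could use this to decide which names to show in purple.
        return (condition, pathway) in self._store
```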

top 10 pathway list view
   
5. How to use the pathway view
Pathway View
Zoom (video)
You can zoom the view in and out using the vertical slider on the right side of the view.
Pan (video)
You can pan the view by dragging the mouse with the left button held down.
Color coding
To give neuroscience researchers more context for understanding the SCI project, GeneShelf renders the boxes for genes in different colors according to the cell type information suggested in a neuroscience paper. There are three neural cell types: astrocytes, neurons, and oligodendrocytes. We color-coded the third column of the gene list view according to the neural cell type: orange for astrocyte, blue for neuron, red for oligodendrocyte, and gray for undefined. We used the same color-coding scheme throughout all other views. If a gene in the current pathway is not present in the dataset, the box for the gene is rendered in an attenuated color.
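The color-coding rule described above amounts to a small lookup table. In this minimal sketch, the color names come from this manual, while the function name `gene_color` and the returned `attenuated` flag are hypothetical.

```python
CELL_TYPE_COLORS = {
    "astrocyte": "orange",
    "neuron": "blue",
    "oligodendrocyte": "red",
}


def gene_color(cell_type, in_dataset=True):
    """Return (color_name, attenuated) for a gene box.

    Unknown or undefined cell types fall back to gray. A gene that is in the
    pathway but absent from the dataset is flagged as attenuated.
    """
    color = CELL_TYPE_COLORS.get(cell_type, "gray")
    return color, not in_dataset
```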
See the current pathway at WikiPathways.org
At the top of the pathway view, you can see the name of the current pathway. If you want to see the pathway at the WikiPathways.org site, just click on the button at the top right corner of this view.
   
6. How to use the gene list view
The gene list view shows the details of the genes in the selected pathway: probeset ID, gene name, and neural cell type.

gene list view
Color coding
To give neuroscience researchers more context for understanding the SCI project, GeneShelf has a column in the gene list view that shows the cell type information suggested in a neuroscience paper. There are three neural cell types: astrocytes, neurons, and oligodendrocytes. We color-coded the third column of the gene list view according to the neural cell type: orange for astrocyte, blue for neuron, red for oligodendrocyte, and gray for undefined. We used the same color-coding scheme throughout all other views, including the pathway view and the nTimeLines grid view.
Interactive coordination of all views
Whenever users mouse over an item representing a gene in any view, the corresponding items in all other views are also highlighted in red.
Link to NCBI to learn more about the selected gene
When users find an interesting gene in this gene list view, they can right-click on the corresponding list item to open a popup menu, where they can click on the gene name to visit the detailed information page for the selected gene at the NCBI website.
   
7. How to use the nTimeLines grid view
This view is built upon small multiples and the fisheye view. It shows the two dimensions of the study design: severity of injury and the sampling location relative to the injury site. The horizontal axis is for the severity of injury, and the vertical axis is for the sampling location relative to the injury site. For example, the second grid block of the top row shows the time-varying gene expression patterns for mild injury samples at T8 (above the injury site). The row and column titles highlighted in bold brown represent the reference condition specified in the Fetch Data dialog box. The reference time point is highlighted in bold brown in the grid block for the reference condition.

In each grid block, a gene is represented by a line graph.
Zooming in and out a grid block (video)
GeneShelf implements the fisheye view. Users click anywhere in the background of a grid block to enlarge that grid block and shrink the other grid blocks. If users click again on the enlarged grid block, all the grid blocks return to the regular layout, in which all blocks have an equal size. GeneShelf smoothly animates the enlarging and shrinking of grid blocks.
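One simple way to picture the fisheye grid is to give the focused row and column a larger share of the available space and split the remainder evenly among the others. The sketch below makes that assumption explicit; the function names and the 60% share given to the focused block (`focus_share`) are hypothetical, the animation between layouts is not modeled, and GeneShelf's actual layout rule may differ.

```python
def fisheye_sizes(n, total, focus=None, focus_share=0.6):
    """Split `total` pixels among `n` cells along one axis.

    With no focus, every cell gets an equal size. With a focus index, that
    cell gets `focus_share` of the space and the rest is shared equally.
    """
    if focus is None or n == 1:
        return [total / n] * n
    rest = total * (1 - focus_share) / (n - 1)
    return [total * focus_share if i == focus else rest for i in range(n)]


def grid_layout(width, height, cols=4, rows=3, focus=None):
    """Return (column_widths, row_heights); focus is (row, col) or None."""
    focus_row, focus_col = focus if focus is not None else (None, None)
    return (
        fisheye_sizes(cols, width, focus_col),
        fisheye_sizes(rows, height, focus_row),
    )
```

With the default of 4 severity columns and 3 location rows, `grid_layout(800, 600)` gives the regular equal-size grid, and passing `focus=(1, 1)` enlarges the block in the second row and second column.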
Expanding/Closing a time point with nTimeLines (video)
It is often hard for users to compare multiple gene expression patterns at a specific time point in a traditional timeline view. We improved the traditional timeline view to address this problem by enabling users to expand a time point. A mouse-over on a vertical line or label for a time point highlights the time point in red, indicating that this time point can be expanded. When users click on the vertical line or label for a time point, the selected time point slides open to have two vertical lines representing the same time point, between which GeneShelf shows a bar chart for the gene expression values at the time point. Parallel horizontal lines will connect the two duplicate vertical lines representing the selected time point. Each bar represents a gene. The height of each bar for a gene will exactly reach the parallel horizontal line that represents the same gene. GeneShelf shows a gray bounding box surrounding the bar chart to indicate that the whole enclosed area represents a single time point. Users can close an expanded time point either by clicking on the time axis or label for the expanded time point or by clicking on the background of the bar chart.
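To make the "slides open" behavior concrete, the sketch below computes horizontal positions for the vertical time-point lines when one time point is expanded into a pair of lines with room for a bar chart between them. It is an illustrative layout only: the function name, the evenly spaced time axis, and the `expanded_width` parameter are assumptions, and at least two time points are assumed.

```python
def time_axis_positions(n_points, width, expanded=None, expanded_width=0.0):
    """Return a (left_x, right_x) pair for each time point.

    For a normal time point both values are equal (a single vertical line).
    For the expanded time point they differ by `expanded_width`, leaving
    space between the two duplicate lines for the embedded bar chart.
    """
    reserved = expanded_width if expanded is not None else 0.0
    step = (width - reserved) / (n_points - 1)
    positions = []
    x = 0.0
    for i in range(n_points):
        if i == expanded:
            positions.append((x, x + expanded_width))
            x += expanded_width
        else:
            positions.append((x, x))
        x += step
    return positions
```

Closing the expanded time point corresponds to calling the function again with `expanded=None`, which restores the evenly spaced single lines.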
Missing value representation
When there is no gene expression data at a certain time point, GeneShelf does not show the vertical line for the time point, and the time label is attenuated from black to light gray. When drawing line graphs, missing values are interpolated, and the line segments passing through the missing time point are attenuated. GeneShelf does not display line segments at any leading time points with missing values. Likewise, GeneShelf does not show line segments for any trailing time points with missing values. For example, there is no gene expression data available at 0h, 48h, 28d, 3m, and 6m in the following figure. Labels for these time points are shown in light gray. For the leading time point (0h) and the trailing time points (28d, 3m, and 6m), GeneShelf shows neither the vertical lines for those time points nor any line segments passing through them. For 48h, the missing gene expression values are interpolated using the neighboring time points (24h and 7d), but the line segments passing through the 48h time point are rendered in a much lighter color to indicate that they are not real values.
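The missing-value rules above can be expressed compactly: connect consecutive observed time points, mark a segment as attenuated when it skips over a missing time point, and draw nothing before the first or after the last observed value. The sketch below follows that reading. The function names are hypothetical, and linear interpolation by position on an evenly spaced axis is an assumption, since this manual does not specify the interpolation method.

```python
def line_segments(values):
    """values: expression values in time order, None where data is missing.

    Returns (start_index, end_index, attenuated) tuples, one per drawn
    segment. Leading and trailing missing points produce no segments.
    """
    observed = [i for i, v in enumerate(values) if v is not None]
    segments = []
    for a, b in zip(observed, observed[1:]):
        # A segment that skips over missing time points is attenuated.
        segments.append((a, b, b - a > 1))
    return segments


def interpolate(values, i):
    """Linearly interpolate a missing interior value by index position."""
    left = max(j for j in range(i) if values[j] is not None)
    right = min(j for j in range(i + 1, len(values)) if values[j] is not None)
    t = (i - left) / (right - left)
    return values[left] + t * (values[right] - values[left])
```

For a hypothetical series `[None, 1.0, 2.0, None, 4.0, None, None]`, only two segments are drawn, and the one bridging the interior gap is flagged as attenuated.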
Sorting the bar chart at nTimeLines view
The nTimeLines view enables users to sort the bar chart embedded in the view in 5 different ways: increasing/decreasing order of gene expression values, increasing/decreasing order of gene expression values within each cell type, and the same order as in the gene list view. Users can right-click on the background of a bar chart in an nTimeLines view to select a way to sort the bar chart. For example, in the following figure, the user right-clicks on the bar chart in the grid block for moderate injury at T9 (injury site) at the 4h time point and selects the popup menu item to sort the bar chart "in increasing order for each cell type," which results in the sorted view shown in the following image (top). All corresponding bar charts in other grid blocks are arranged in the same order as the selected bar chart, which is highlighted with a green bounding box (bottom).
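The five sort orders can be sketched with stable sorting. In the example below, the mode names, the record layout (dictionaries with `name`, `cell_type`, and `value`), and the order in which cell types are grouped are all assumptions made for illustration; `apply_order` shows how the bar charts in other grid blocks could follow the order of the selected one.

```python
CELL_ORDER = ["astrocyte", "neuron", "oligodendrocyte", "undefined"]  # assumed


def sort_bars(genes, mode):
    """genes: list of dicts with 'name', 'cell_type', 'value' in list order."""
    if mode == "list":
        return list(genes)  # same order as the gene list view
    reverse = mode.startswith("decreasing")
    by_value = sorted(genes, key=lambda g: g["value"], reverse=reverse)
    if mode in ("increasing", "decreasing"):
        return by_value
    if mode in ("increasing_by_cell_type", "decreasing_by_cell_type"):
        # Stable sort: grouping by cell type keeps the value order inside
        # each group.
        return sorted(by_value, key=lambda g: CELL_ORDER.index(g["cell_type"]))
    raise ValueError("unknown sort mode: " + mode)


def apply_order(reference_sorted, other_genes):
    """Arrange another block's bars in the same gene order as the reference."""
    position = {g["name"]: i for i, g in enumerate(reference_sorted)}
    return sorted(other_genes, key=lambda g: position[g["name"]])
```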
Interactive coordination of all views
A mouse-over on a line graph or a bar in a bar chart of an expanded time point in a grid block causes the corresponding line graphs and bars in all other grid blocks to highlight in red. It also highlights corresponding gene nodes in the pathway view and the corresponding list item in the gene list view.
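Coordinated highlighting of this kind is commonly structured as a shared coordinator that forwards the hovered gene to every registered view. The following minimal sketch illustrates that pattern only; the `View` and `HighlightCoordinator` classes are hypothetical and do not describe GeneShelf's internal design.

```python
class View:
    """Stand-in for a pathway view, gene list view, or grid block."""

    def __init__(self, name):
        self.name = name
        self.highlighted = set()

    def set_highlight(self, gene):
        # A real view would repaint the matching item in red here.
        self.highlighted = {gene} if gene is not None else set()


class HighlightCoordinator:
    """Forwards mouse-over events on one view to all registered views."""

    def __init__(self, views):
        self.views = list(views)

    def mouse_over(self, gene):
        for view in self.views:
            view.set_highlight(gene)

    def mouse_out(self):
        for view in self.views:
            view.set_highlight(None)
```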
Color coding
GeneShelf adopts the same color coding for this view: orange for astrocyte, blue for neuron, red for oligodendrocyte, and gray for undefined.
Link to NCBI to learn more about the selected gene
When users find an interesting gene in this nTimeLines grid view, they can right-click on the corresponding polyline to open a popup menu, where they can click on the gene name to visit the detailed information page for the selected gene at the NCBI website.
   
8. How to get some help
Users can click on the help button at the top right corner of the GeneShelf window to open this user manual page in their default web browser. If you have any further questions, please contact us at jwseo@cse.snu.ac.kr.

If you see an error message like "DBMS is down," please let us know. Our Oracle DBMS server sometimes becomes unstable, which prevents GeneShelf from getting data. We check our DBMS every day to see if it is working well, so GeneShelf should be working properly again within a day or so.
