Basic Data Visualization

Cytoscape is an open source software platform for integrating, visualizing, and analyzing measurement data in the context of networks.

This tutorial presents a scenario of how expression and network data can be combined to tell a biological story and includes these concepts:

  • Visualizing networks using expression data.

  • Filtering networks based on expression data.

  • Assessing expression data in the context of a biological network.

Loading Network

Download the demo network session file to your current working directory by running:

[5]:
!wget https://nrnb.org/data/BasicDataVizDemo.cys
--2020-07-02 23:40:44--  https://nrnb.org/data/BasicDataVizDemo.cys
Resolving nrnb.org (nrnb.org)... 185.199.108.153
Connecting to nrnb.org (nrnb.org)|185.199.108.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1390524 (1.3M) [application/octet-stream]
Saving to: ‘BasicDataVizDemo.cys’

BasicDataVizDemo.cy 100%[===================>]   1.33M  8.48MB/s    in 0.2s

2020-07-02 23:40:45 (8.48 MB/s) - ‘BasicDataVizDemo.cys’ saved [1390524/1390524]

Now open the demo session in Cytoscape:

[6]:
import py4cytoscape as p4c
p4c.set_summary_logger(False)
p4c.open_session(file_location="./BasicDataVizDemo.cys")
Opening /content/BasicDataVizDemo.cys...
[6]:
{}

Now you should see a network like this.

[7]:
p4c.export_image(filename="BasicDataVizDemo.png")
from IPython.display import Image
Image('BasicDataVizDemo.png')
[7]:
../_images/tutorials_basic-data-visualization_5_0.png

Visualizing Expression Data on Networks

Probably the most common use of expression data in Cytoscape is to set the visual properties of the nodes (color, shape, border) in a network according to expression data. This creates a powerful visualization, portraying functional relation and experimental response at the same time. Here, we will show an example of doing this.

The data in this example are from yeast and come from an experiment that perturbed the genes Gal1, Gal4, and Gal80, which are all yeast transcription factors.

For this tutorial, the experimental data was part of the Cytoscape session file you loaded earlier, and is visible in the Node Table:

Galbrowse3

  • You can select nodes in the network by name:

[8]:
p4c.select_nodes(['YDL194W', 'YLR345W'], by_col='name')
[8]:
{'edges': [], 'nodes': [66, 61]}
  • Selecting one or more nodes in the network will update the Node Table to show only the corresponding row(s).

SelectNodes

We can now use the data to manipulate the visual properties of the network by mapping specific data columns to visual style properties:

  • The gal80Rexp expression values will be mapped to node color; nodes with low expression will be colored blue, nodes with high expression will be colored red.

  • Significance for expression values will be mapped to Node Border Width, so nodes with significant changes will appear with a thicker border.

Set Node Fill Color

  • Click on the Style tab in the Control Panel. You can then set the node fill color with:

[9]:
gal80Rexp_score_table = p4c.get_table_columns(table='node', columns='gal80Rexp')
[10]:
gal80Rexp_score_table.head()
[10]:
gal80Rexp
61 0.449
62 0.448
63 -0.232
64 0.247
65 0.94
[11]:
gal80Rexp_min = gal80Rexp_score_table.min().values[0]
gal80Rexp_max = gal80Rexp_score_table.max().values[0]
gal80Rexp_center = gal80Rexp_min + (gal80Rexp_max - gal80Rexp_min)/2
[12]:
p4c.set_node_color_mapping('gal80Rexp', [gal80Rexp_min, gal80Rexp_center, gal80Rexp_max], ['#0000FF', '#FFFFFF', '#FF0000'])
[12]:
''
  • This produces an initial gradient ranging from blue to red for expression values. Notice that the nodes in the network change color.

SetNodeFillColor
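The center value computed above sits halfway between the observed minimum and maximum, so white does not necessarily mark "no change". If the gal80Rexp values are log ratios centered on zero (an assumption here, not something the session file states), a symmetric set of breakpoints may be a useful alternative. The helper name `symmetric_breakpoints` is hypothetical, and the input is the five values from the table preview above. This is a minimal sketch, not the method used in this tutorial.

```python
def symmetric_breakpoints(values):
    """Return [low, center, high] breakpoints centered on zero.

    Hypothetical helper: `values` is any iterable of numeric expression
    values; None entries (nodes without data) are skipped.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no numeric values to derive breakpoints from")
    # Use the largest magnitude so both ends sit equally far from zero.
    limit = max(abs(v) for v in observed)
    return [-limit, 0.0, limit]


# The five values shown in the table preview, used only as sample input.
breakpoints = symmetric_breakpoints([0.449, 0.448, -0.232, 0.247, 0.94])
print(breakpoints)
```

The resulting list could be passed as the second argument of `p4c.set_node_color_mapping` in place of the min, center, and max values.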

Set Default Node Color

Some nodes in the network don’t have any data, and for those nodes, the default color applies. In our case, the default color is blue, which falls within the spectrum of our blue-red gradient. This is not ideal for data visualization, so a useful trick is to choose a color outside the gradient spectrum to distinguish nodes with no defined expression value.

  • Still in the Style tab, you can set the default node color to dark gray with:

[13]:
p4c.set_node_color_default('#666666')
[13]:
''

SetDefaultNodeColor

Set Node Border Width

You can set the node border width with:

[14]:
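# Look up the range of the gal80Rsig significance values (p-values).
gal80Rsig_score_table = p4c.get_table_columns(table='node', columns='gal80Rsig')
gal80Rsig_min = gal80Rsig_score_table.min().values[0]
gal80Rsig_max = gal80Rsig_score_table.max().values[0]
# Smaller p-values are more significant, so map the minimum to the thickest border.
p4c.set_node_border_width_mapping('gal80Rsig', table_column_values=[gal80Rsig_min, gal80Rsig_max], widths=[30, 10])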
[14]:
''

This maps node border width continuously across the range of p-values in the gal80Rsig column, as shown here:

SetNodeBorderWidth1

Double-clicking on the diagonal graph to the right of Current Mapping will bring up a window similar to the one below.

SetNodeBorderWidth2
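To build intuition for the diagonal graph, the following sketch shows how a continuous mapping with two breakpoints can turn a data value into a width by linear interpolation. The function name `interpolate_width` is hypothetical, the numbers are made up, and clamping values outside the range to the end widths is an assumption about the behavior rather than a statement of how Cytoscape implements it.

```python
def interpolate_width(value, value_range, width_range):
    """Linearly interpolate a width for `value` between two breakpoints.

    Hypothetical helper illustrating a two-point continuous mapping.
    """
    low, high = value_range
    width_low, width_high = width_range
    if high == low:
        # Degenerate range: every value gets the first width.
        return width_low
    # Assumption: values outside the range take the nearest end width.
    clamped = min(max(value, low), high)
    fraction = (clamped - low) / (high - low)
    return width_low + fraction * (width_high - width_low)


# A value halfway through the range gets a width halfway between the ends.
print(interpolate_width(0.5, (0.0, 1.0), (10, 30)))
```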

Layouts

An important aspect of network visualization is the layout, meaning the positioning of nodes and edges. Our network had a preset layout in the original file you imported, but this can be changed.

  • Let’s change the layout to the Degree Sorted Circle Layout with:

[15]:
p4c.layout_network('degree-circle')
[15]:
{}
[16]:
p4c.export_image(filename='degree-circle.png')
Image('degree-circle.png')
[16]:
../_images/tutorials_basic-data-visualization_21_0.png

In this layout, nodes are sorted by degree (connectedness), with the highest-degree node at the 6 o’clock position and the remaining nodes placed counterclockwise in order of decreasing degree.
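The ordering that this layout relies on can be sketched in plain Python. The function `nodes_by_degree` and the toy edge list are hypothetical and are not read from the demo session. Breaking ties alphabetically is an assumption made only to keep the output deterministic.

```python
from collections import Counter


def nodes_by_degree(edges):
    """Return node names sorted by decreasing degree.

    Hypothetical helper: `edges` is an iterable of (source, target) pairs.
    """
    degree = Counter()
    for source, target in edges:
        # Each edge contributes one to the degree of both endpoints.
        degree[source] += 1
        degree[target] += 1
    # Highest degree first; ties are broken by name (an assumption).
    return sorted(degree, key=lambda node: (-degree[node], node))


# Toy edge list for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
ordered = nodes_by_degree(edges)
print(ordered)
```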

For this network, a degree-sorted circle layout may not be the most effective. Let’s try a force-directed layout instead, which may work better with this network.

[17]:
p4c.layout_network('force-directed')
[17]:
{}
[18]:
p4c.export_image(filename='force-directed.png')
Image('force-directed.png')
[18]:
../_images/tutorials_basic-data-visualization_24_0.png

Cytoscape supports many different layout algorithms, described in detail in the Cytoscape manual.

Select Nodes

Cytoscape allows you to easily filter and select nodes and edges based on data attributes. Next, we will select a subset of nodes with high expression in the gal80 knockout:

  • Let’s create a column filter on the node column gal80Rexp with:

[19]:
p4c.create_column_filter('myFilter', 'gal80Rexp', 2.00, "GREATER_THAN")
No edges selected.
[19]:
{'edges': None, 'nodes': ['YBR018C', 'YBR020W', 'YBR019C']}

You should now see only a few nodes in the network selected (highlighted yellow).

[20]:
p4c.export_image(filename='column-filter.png')
Image('column-filter.png')
[20]:
../_images/tutorials_basic-data-visualization_29_0.png
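Conceptually, a GREATER_THAN column filter keeps the nodes whose column value exceeds the threshold. The sketch below mimics that predicate on a small dictionary. The function name, node names, and values are all made up for illustration, and skipping nodes with missing values is an assumption.

```python
def select_greater_than(values_by_node, threshold):
    """Return the names of nodes whose value is strictly above `threshold`.

    Hypothetical helper: nodes with a missing value (None) are skipped.
    """
    return sorted(
        node
        for node, value in values_by_node.items()
        if value is not None and value > threshold
    )


# Made-up values, not taken from the demo session.
toy_values = {"node_a": 2.4, "node_b": 0.3, "node_c": None, "node_d": 2.0}
selected = select_greater_than(toy_values, 2.0)
print(selected)
```

Note that a value exactly equal to the threshold is not selected, because the comparison is strict.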

Expand Selection and Create New Network

We have now selected only the few top expressing nodes. To see the context of these nodes in the larger network, we can expand the selection of nodes to include the nodes connecting to the selected nodes, i.e. the first neighbors. Once we have that larger selection, we can create a new network.
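Expanding a selection to its first neighbors amounts to adding every node that shares an edge with a selected node. Here is a minimal sketch on a toy edge list. The function name `expand_to_first_neighbors` and the node names are hypothetical, and edges are treated as undirected (an assumption).

```python
def expand_to_first_neighbors(selected, edges):
    """Return the selected nodes plus every node directly connected to them.

    Hypothetical helper: `edges` is an iterable of (source, target) pairs,
    treated as undirected.
    """
    selected = set(selected)
    expanded = set(selected)
    for source, target in edges:
        if source in selected:
            expanded.add(target)
        if target in selected:
            expanded.add(source)
    return expanded


# Toy network for illustration only.
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n4"), ("n5", "n1")]
expanded = expand_to_first_neighbors({"n1"}, edges)
print(sorted(expanded))
```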

  • Select the first neighbors of selected nodes by

[21]:
p4c.select_first_neighbors()
[21]:
{'edges': [], 'nodes': [239, 240, 137, 234, 235, 237, 238]}
[22]:
p4c.export_image(filename='first-neighbors.png')
Image('first-neighbors.png')
[22]:
../_images/tutorials_basic-data-visualization_32_0.png
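With the expanded selection in place, one way to create the new network is py4cytoscape's `create_subnetwork`. This is a sketch under assumptions: it assumes the function accepts the keyword `'selected'` for `nodes` and a `subnetwork_name` argument, the name `'gal80-high-expression'` is a made-up placeholder, and the call only works while Cytoscape is running with the demo session open. The import is deferred so the function can be defined without a live connection.

```python
def create_network_from_selection(name="gal80-high-expression"):
    """Create a new network from the currently selected nodes.

    Hypothetical wrapper; `name` is a placeholder network name.
    Requires py4cytoscape and a running Cytoscape instance.
    """
    import py4cytoscape as p4c  # deferred: needs a live Cytoscape session

    # 'selected' asks for the nodes currently selected in the active network.
    return p4c.create_subnetwork(nodes="selected", subnetwork_name=name)
```

Calling `create_network_from_selection()` after the first-neighbor selection should return an identifier for the new network, which then becomes the one the remaining steps operate on.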

Digging into the biology of this network, it turns out that GAL4 is repressed by GAL80. Both nodes (GAL4 and GAL11) show fairly small changes in expression, and neither change is statistically significant: they are pale blue with thin borders. These slight changes in expression suggest that the critical change affecting the red nodes might be somewhere else in the network, and not either of these nodes. GAL4 interacts with GAL80, which shows a significant level of repression: it is medium blue with a thicker border.

Note that while GAL80 shows evidence of significant repression, most nodes interacting with GAL4 show significant levels of induction: they are rendered as red rectangles. GAL11 is a general transcription co-factor with many interactions.

Putting all of this together, we see that the *transcriptional activation activity of Gal4 is repressed by Gal80*. So, repression of Gal80 increases the transcriptional activation activity of Gal4. Even though the expression of Gal4 itself did not change much, *the Gal4 transcripts were much more likely to be active transcription factors when Gal80 was repressed.* This explains why there is so much up-regulation in the vicinity of Gal4.

Summary

In summary, we have:

  • Explored a yeast interactome from a transcription factor knockout experiment

  • Created a visual style using expression value as node color and with border width mapped to significance

  • Selected high expressing genes and their neighbors and created a new network

Finally, we can export this network as a publication-quality image.

Saving Results

Cytoscape provides a number of ways to save results and visualizations:

  • As a session:

[23]:
p4c.save_session('basic-data-visualization.cys')
[23]:
{}
  • As an image:

[24]:
p4c.export_image('basic-data-visualization', type='PDF')
p4c.export_image('basic-data-visualization', type='PNG')
p4c.export_image('basic-data-visualization', type='JPEG')
p4c.export_image('basic-data-visualization', type='SVG')
p4c.export_image('basic-data-visualization', type='PS')
[24]:
{'file': '/content/basic-data-visualization.ps'}
  • To a public repository:

p4c.export_network_to_ndex('userid', 'password', True)
  • As a graph format file (formats include “CX JSON”, “Cytoscape.js JSON”, “GraphML”, “XGMML”, and “SIF”):

[25]:
p4c.export_network('basic-data-visualization', 'CX')
p4c.export_network('basic-data-visualization', 'cyjs')
p4c.export_network('basic-data-visualization', 'graphML')
p4c.export_network('basic-data-visualization', 'xGMML')
p4c.export_network('basic-data-visualization', 'SIF')
[25]:
{'file': '/content/basic-data-visualization.sif'}