Improving matrix load code and instructions (#125)
* Updating TCGA tutorial

* Adding more options for loading matrix into graph

* Adding option to load edges without vertices when loading matrix.

* Adding more matrix loading instructions

* Updating website
kellrott authored and adamstruck committed Jun 27, 2018
1 parent b37b002 commit c03a6c1
Showing 23 changed files with 301 additions and 93 deletions.
2 changes: 1 addition & 1 deletion docs/docs/databases/elastic/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/databases/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/databases/kvstore/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/databases/mongo/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/databases/sql/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/developers/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
10 changes: 5 additions & 5 deletions docs/docs/index.xml
@@ -173,11 +173,11 @@ Configuration Notes DataSourceName is a driver-specific data source name, usual

<guid>https://docs.bmeg.io/arachne/docs/tutorials/tcga-rna/</guid>
<description>Explore TCGA RNA Expression Data Create the graph
arachne create tcga-rna Load pathway information
curl -O http://www.pathwaycommons.org/archives/PC2/v9/PathwayCommons9.All.hgnc.sif.gz gunzip PathwayCommons9.All.hgnc.sif.gz python $GOPATH/src/github.com/bmeg/arachne/example/load_sif.py --db tcga-rna PathwayCommons9.All.hgnc.sif Load expression data
curl -O https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/HiSeqV2.gz gunzip HiSeqV2.gz python $GOPATH/src/github.com/bmeg/arachne/example/load_matrix.py --db tcga-rna HiSeqV2 Load clinical information
curl -O https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/BRCA_clinicalMatrix.gz gunzip BRCA_clinicalMatrix.gz python $GOPATH/src/github.com/bmeg/arachne/example/load_property_matrix.py --db tcga-rna BRCA_clinicalMatrix Query the graph
pip install &amp;quot;git+https://github.com/bmeg/arachne.git#egg=aql&amp;amp;subdirectory=aql/python/&amp;quot; import aql conn = aql.Connection(&amp;quot;http://localhost:8201&amp;quot;) O = conn.graph(&amp;quot;tcga-rna&amp;quot;) # Print out expression data of all Stage IIA samples for row in O.</description>
arachne create tcga-rna Get the data
curl -O http://download.cbioportal.org/gbm_tcga_pub2013.tar.gz tar xvzf gbm_tcga_pub2013.tar.gz Load clinical data
./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_clinical.txt --row-label &#39;Donor&#39; Load RNASeq data
./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_RNA_Seq_v2_expression_median.txt -t --index-col 1 --row-label RNASeq --row-prefix &amp;quot;RNA:&amp;quot; --exclude RNA:Hugo_Symbol Connect RNASeq data to Clinical data
./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_RNA_Seq_v2_expression_median.txt -t --index-col 1 --no-vertex --edge &#39;RNA:{_gid}&#39; rna Connect Clinical data to subtypes</description>
</item>

<item>
2 changes: 1 addition & 1 deletion docs/docs/queries/getting_started/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/queries/graphql/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/queries/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/queries/jsonpath/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/queries/operations/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/security/basic/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/security/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/tutorials/amazon/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/docs/tutorials/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
109 changes: 86 additions & 23 deletions docs/docs/tutorials/tcga-rna/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
@@ -230,44 +230,107 @@ <h3 id="explore-tcga-rna-expression-data">Explore TCGA RNA Expression Data</h3>
<pre><code>arachne create tcga-rna
</code></pre>

<p>Load pathway information</p>
<p>Get the data</p>

<pre><code>curl -O http://www.pathwaycommons.org/archives/PC2/v9/PathwayCommons9.All.hgnc.sif.gz
gunzip PathwayCommons9.All.hgnc.sif.gz
python $GOPATH/src/github.com/bmeg/arachne/example/load_sif.py --db tcga-rna PathwayCommons9.All.hgnc.sif
<pre><code>curl -O http://download.cbioportal.org/gbm_tcga_pub2013.tar.gz
tar xvzf gbm_tcga_pub2013.tar.gz
</code></pre>

<p>Load expression data</p>
<p>Load clinical data</p>

<pre><code>curl -O https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/HiSeqV2.gz
gunzip HiSeqV2.gz
python $GOPATH/src/github.com/bmeg/arachne/example/load_matrix.py --db tcga-rna HiSeqV2
<pre><code>./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_clinical.txt --row-label 'Donor'
</code></pre>

<p>Load clinical information</p>
<p>Load RNASeq data</p>

<pre><code>curl -O https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/BRCA_clinicalMatrix.gz
gunzip BRCA_clinicalMatrix.gz
python $GOPATH/src/github.com/bmeg/arachne/example/load_property_matrix.py --db tcga-rna BRCA_clinicalMatrix
<pre><code>./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_RNA_Seq_v2_expression_median.txt -t --index-col 1 --row-label RNASeq --row-prefix &quot;RNA:&quot; --exclude RNA:Hugo_Symbol
</code></pre>

<p>Query the graph</p>
<p>Connect RNASeq data to Clinical data</p>

<pre><code>pip install &quot;git+https://github.com/bmeg/arachne.git#egg=aql&amp;subdirectory=aql/python/&quot;
<pre><code>./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_RNA_Seq_v2_expression_median.txt -t --index-col 1 --no-vertex --edge 'RNA:{_gid}' rna
</code></pre>

<pre><code class="language-python">import aql
<p>Connect Clinical data to subtypes</p>

<pre><code>./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_clinical.txt --no-vertex -e &quot;{EXPRESSION_SUBTYPE}&quot; subtype --dst-vertex &quot;{EXPRESSION_SUBTYPE}&quot; Subtype
</code></pre>

<p>Load EntrezID to Hugo Symbol mapping</p>

<pre><code>./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_RNA_Seq_v2_expression_median.txt --index-col 1 --column-include Hugo_Symbol --row-label Gene
</code></pre>

<p>Load Proneural samples into a matrix</p>

<pre><code class="language-python">import pandas
import aql

conn = aql.Connection(&quot;http://localhost:8201&quot;)
O = conn.graph(&quot;tcga-rna&quot;)
genes = {}
for k, v in O.query().V().where(aql.eq(&quot;_label&quot;, &quot;Gene&quot;)).render([&quot;_gid&quot;, &quot;Hugo_Symbol&quot;]):
    genes[k] = v
data = {}
for row in O.query().V(&quot;Proneural&quot;).in_().out(&quot;rna&quot;).render([&quot;_gid&quot;, &quot;_data&quot;]):
    data[row[0]] = row[1]
samples = pandas.DataFrame(data).rename(genes).transpose().fillna(0.0)
</code></pre>
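The final DataFrame step in the snippet above can be tried without a running server. The following is a minimal sketch that assumes `genes` maps Gene vertex gids to Hugo symbols and `data` maps RNASeq vertex gids to dictionaries of expression values keyed by the same gene gids. The sample names and values are invented toy inputs, not TCGA data.

```python
import pandas

# Toy stand-ins for the two query results (hypothetical values).
# genes: Gene vertex gid -> Hugo symbol
genes = {"7157": "TP53", "1956": "EGFR"}
# data: RNASeq vertex gid -> {gene gid: expression value}
data = {
    "RNA:sample-1": {"7157": 2.0, "1956": 3.0},
    "RNA:sample-2": {"7157": 1.0},  # EGFR value is missing for this sample
}

# DataFrame(data) puts samples in columns and gene gids in the index;
# rename(genes) relabels the index, transpose() makes samples the rows,
# and fillna(0.0) replaces missing cells with zero.
samples = pandas.DataFrame(data).rename(genes).transpose().fillna(0.0)
print(samples)
```

The result has one row per sample and one column per gene symbol, with the missing EGFR cell filled with 0.0.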

<h1 id="matrix-load-project">Matrix Load project</h1>

<pre><code>usage: load_matrix.py [-h] [--sep SEP] [--server SERVER]
                      [--row-label ROW_LABEL] [--row-prefix ROW_PREFIX] [-t]
                      [--index-col INDEX_COL] [--connect]
                      [--col-label COL_LABEL] [--col-prefix COL_PREFIX]
                      [--edge-label EDGE_LABEL] [--edge-prop EDGE_PROP]
                      [--columns [COLUMNS [COLUMNS ...]]]
                      [--column-include COLUMN_INCLUDE] [--no-vertex]
                      [-e EDGE EDGE] [--dst-vertex DST_VERTEX DST_VERTEX]
                      [-x EXCLUDE] [-d]
                      db input

positional arguments:
  db                    Destination Graph
  input                 Input File

optional arguments:
  -h, --help            show this help message and exit
  --sep SEP             TSV delimiter
  --server SERVER       Server Address
  --row-label ROW_LABEL
                        Vertex Label used when loading rows
  --row-prefix ROW_PREFIX
                        Prefix added to row vertex gid
  -t, --transpose       Transpose matrix
  --index-col INDEX_COL
                        Column number to use as index (and gid for vertex
                        load)
  --connect             Switch to 'fully connected mode' and load matrix cell
                        values on edges between row and column names
  --col-label COL_LABEL
                        Column vertex label in 'connect' mode
  --col-prefix COL_PREFIX
                        Prefix added to col vertex gid in 'connect' mode
  --edge-label EDGE_LABEL
                        Edge label for edges in 'connect' mode
  --edge-prop EDGE_PROP
                        Property name for storing value when in 'connect' mode
  --columns [COLUMNS [COLUMNS ...]]
                        Rename columns in TSV
  --column-include COLUMN_INCLUDE
                        List subset of columns to use from TSV
  --no-vertex           Do not load row as vertex
  -e EDGE EDGE, --edge EDGE EDGE
                        Create an edge the connected the current row vertex
                        args: &lt;dst&gt; &lt;edgeType&gt;
  --dst-vertex DST_VERTEX DST_VERTEX
                        Create a destination vertex, args: &lt;dstVertex&gt;
                        &lt;vertexLabel&gt;
  -x EXCLUDE, --exclude EXCLUDE
                        Exclude row id
  -d                    Run in debug mode. Print actions and make no changes

# Print out expression data of all Stage IIA samples
for row in O.query().\
V().\
where(aql.and_(aql.eq(&quot;_label&quot;, &quot;Sample&quot;), aql.eq(&quot;pathologic_stage&quot;, &quot;Stage IIA&quot;))).\
out(&quot;has&quot;).\
where(aql.eq(&quot;_label&quot;, &quot;Data:Expression&quot;):
print row
</code></pre>
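The `--edge` and `--dst-vertex` options in the tutorial commands take templates such as `{EXPRESSION_SUBTYPE}` and `RNA:{_gid}`. The sketch below shows one way such templates could be expanded per row. It is an illustration only, not the code in `load_matrix.py`: the function name `expand_edges`, the edge dictionary layout, and the assumption that `_gid` holds the row's index value are all hypothetical.

```python
import csv
import io


def expand_edges(tsv_text, index_col, dst_template, edge_label):
    """Build one edge per data row by filling dst_template from the row's fields.

    Hypothetical helper: assumes the first line is a header and that `_gid`
    refers to the value in the index column.
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    edges = []
    for row in body:
        fields = dict(zip(header, row))
        fields["_gid"] = row[index_col]
        edges.append({
            "from": fields["_gid"],
            "to": dst_template.format_map(fields),
            "label": edge_label,
        })
    return edges


# Toy clinical table (invented values).
clinical = "CASE_ID\tEXPRESSION_SUBTYPE\nTCGA-01\tProneural\nTCGA-02\tClassical\n"

subtype_edges = expand_edges(clinical, 0, "{EXPRESSION_SUBTYPE}", "subtype")
rna_edges = expand_edges(clinical, 0, "RNA:{_gid}", "rna")
print(subtype_edges)
print(rna_edges)
```

Under these assumptions, the first call links each donor row to a vertex named after its subtype, and the second links each row to the vertex whose gid carries the `RNA:` prefix.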

</div>
2 changes: 1 addition & 1 deletion docs/download/index.html
@@ -3,7 +3,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
2 changes: 1 addition & 1 deletion docs/index.html
@@ -4,7 +4,7 @@
<head>
<link href="http://gmpg.org/xfn/11" rel="profile">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="generator" content="Hugo 0.31-DEV" />
<meta name="generator" content="Hugo 0.40.3" />


<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
10 changes: 5 additions & 5 deletions docs/index.xml
@@ -194,11 +194,11 @@ Configuration Notes DataSourceName is a driver-specific data source name, usual

<guid>https://docs.bmeg.io/arachne/docs/tutorials/tcga-rna/</guid>
<description>Explore TCGA RNA Expression Data Create the graph
arachne create tcga-rna Load pathway information
curl -O http://www.pathwaycommons.org/archives/PC2/v9/PathwayCommons9.All.hgnc.sif.gz gunzip PathwayCommons9.All.hgnc.sif.gz python $GOPATH/src/github.com/bmeg/arachne/example/load_sif.py --db tcga-rna PathwayCommons9.All.hgnc.sif Load expression data
curl -O https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/HiSeqV2.gz gunzip HiSeqV2.gz python $GOPATH/src/github.com/bmeg/arachne/example/load_matrix.py --db tcga-rna HiSeqV2 Load clinical information
curl -O https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/BRCA_clinicalMatrix.gz gunzip BRCA_clinicalMatrix.gz python $GOPATH/src/github.com/bmeg/arachne/example/load_property_matrix.py --db tcga-rna BRCA_clinicalMatrix Query the graph
pip install &amp;quot;git+https://github.com/bmeg/arachne.git#egg=aql&amp;amp;subdirectory=aql/python/&amp;quot; import aql conn = aql.Connection(&amp;quot;http://localhost:8201&amp;quot;) O = conn.graph(&amp;quot;tcga-rna&amp;quot;) # Print out expression data of all Stage IIA samples for row in O.</description>
arachne create tcga-rna Get the data
curl -O http://download.cbioportal.org/gbm_tcga_pub2013.tar.gz tar xvzf gbm_tcga_pub2013.tar.gz Load clinical data
./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_clinical.txt --row-label &#39;Donor&#39; Load RNASeq data
./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_RNA_Seq_v2_expression_median.txt -t --index-col 1 --row-label RNASeq --row-prefix &amp;quot;RNA:&amp;quot; --exclude RNA:Hugo_Symbol Connect RNASeq data to Clinical data
./example/load_matrix.py tcga-rna gbm_tcga_pub2013/data_RNA_Seq_v2_expression_median.txt -t --index-col 1 --no-vertex --edge &#39;RNA:{_gid}&#39; rna Connect Clinical data to subtypes</description>
</item>

<item>