wiki:BIOS_BIOSRutils

Version 1 (modified by Rick, 15 months ago)

Introduction

Metadatabase

Using the MDb

The BIOS project has generated RNA-sequencing and DNA methylation data for over 4000 individuals. Apart from these data, GoNL-imputed genotypes were generated from existing genotypes, and several phenotype and demographic variables were collected for the same set of samples. A highly flexible, sample-oriented metadatabase (MDb) was created to manage the dynamic generation of this large-scale multi-omic data set.

The MDb is a non-relational database (CouchDB, http://couchdb.apache.org/) that uses JSON to store records and JavaScript for querying. Furthermore, it has an HTTP API suitable for programmatic access to the database from the GRID, e.g. by the alignment pipeline.

Each record or document is a sample (individual) within the BIOS project and has a unique identifier. Each document has a predefined structure according to our database schema (https://git.lumc.nl/rp3/bios-schema). Custom Python scripts are used to update or modify the database (https://git.lumc.nl/rp3/bios-mdb).

Access to the metadatabase (MDb) is restricted; please contact Leon Mei or Maarten van Iterson.

Brief description of MDb content

The MDb contains as much meta-information as possible for all samples and data types: location of (raw) data on srm, md5 checksum verification, quality control information, links between the different identifiers used (person_id, dna_id, etc.) and phenotype information.

Every sample's meta-information is encoded in a CouchDB document. Each document has a unique identifier (the bios_id), which is the biobank name (CODAM, LL, LLS, NTR, RS or PAN) concatenated with the person_id, separated by a "-", e.g. CODAM-2001. This bios_id is not suitable for use in the public domain, e.g. for EGA upload; therefore a unique, non-identifying identifier has been created for each individual: the uuid.
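
The identifier convention can be sketched in Python as follows. This is only an illustration and not part of the MDb code: the function names make_bios_id and split_bios_id are hypothetical, and the sketch assumes that biobank names never contain a "-".

```python
# Biobank names as listed in the text; used to validate identifiers.
BIOBANKS = ("CODAM", "LL", "LLS", "NTR", "RS", "PAN")


def make_bios_id(biobank, person_id):
    """Concatenate biobank name and person_id with a '-' (hypothetical helper)."""
    if biobank not in BIOBANKS:
        raise ValueError("unknown biobank: %r" % (biobank,))
    return "%s-%s" % (biobank, person_id)


def split_bios_id(bios_id):
    """Split a bios_id at the first '-' into (biobank, person_id)."""
    biobank, sep, person_id = bios_id.partition("-")
    if not sep or biobank not in BIOBANKS or not person_id:
        raise ValueError("not a valid bios_id: %r" % (bios_id,))
    return biobank, person_id


print(make_bios_id("CODAM", 2001))   # CODAM-2001
print(split_bios_id("LLS-1068"))     # ('LLS', '1068')
```

Note that the person_id comes back as a string; the uuid is a separate identifier and cannot be derived from the bios_id this way.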

Every update of a sample in the database is recorded by increasing a revision number, so it is always possible to undo wrong updates. A JSON file representing a sample's information in the metadatabase is attached to this page (the content of the file can be pasted into a JSON viewer, e.g. http://jsonviewer.stack.hu).

Brief description of available views

Views are the way to extract information from a CouchDB database. Views are organized into designs; each design contains a number of views related to a particular kind of information that can be extracted from the MDb. For example, the design 'EGA' currently contains two views: 1) 'freeze1RNASeq', to extract those samples for which RNAseq data has been uploaded to EGA, and 2) 'freeze1Methylation', for the DNA methylation data.

Other relevant views are:

Design         Views
EGA            freeze1RNASeq, freeze1Methylation
Files          getFastq, getIdat
Identifiers    getIds
Phenotypes     allPhenotypes, cellCounts, minimalPhenotypes
Runs           getGenotypes, getMethylationRuns, getRNASeqRuns
Samplesheets   rnaseqSamplesheet, methylationSamplesheet
Verification   md5

Note: We can always add views if necessary; please contact Maarten van Iterson.

Accessing the MDb

Views can be downloaded as JSON documents by making a GET request. Most programming languages have utilities for making GET requests and for transforming JSON documents. Some programming languages have an API for CouchDB, e.g. Java and Python. There are also several online tools for converting JSON documents to CSV files.
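
As a rough sketch of such a GET request, the following Python code (standard library only) builds the URL of a view and prepares an authenticated request without sending it. The URL layout follows the CouchDB convention <db>/_design/<design>/_view/<view>. The host, port and database name are copied from the curl command shown elsewhere on this page. The function names view_url and view_request are hypothetical, and 'username:password' is a placeholder.

```python
import base64
import urllib.parse
import urllib.request

# Host, port and database name as they appear in the curl example on this page.
MDB = "https://metadatabase.bbmrirp3-lumc.vm.surfsara.nl:6984/bios"


def view_url(design, view, reduce=False):
    """Build the URL of a CouchDB view: <db>/_design/<design>/_view/<view>."""
    query = urllib.parse.urlencode({"reduce": "true" if reduce else "false"})
    return "%s/_design/%s/_view/%s?%s" % (
        MDB, urllib.parse.quote(design), urllib.parse.quote(view), query)


def view_request(design, view, usrpwd):
    """Prepare (but do not send) a GET request with HTTP basic authentication."""
    token = base64.b64encode(usrpwd.encode("utf-8")).decode("ascii")
    request = urllib.request.Request(view_url(design, view), method="GET")
    request.add_header("Authorization", "Basic " + token)
    return request


request = view_request("Phenotypes", "allPhenotypes", "username:password")
print(request.full_url)
# Sending it would be: urllib.request.urlopen(request).read()
```

The curl example uses -k, which skips certificate verification; urllib verifies certificates by default, so the request may fail unless the server certificate is trusted on your machine.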

Please note that it is usually better to download the view separately and work on the downloaded file. This way you only have to enter your password once and you're resilient to network connectivity problems.
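
Working on a downloaded file could look like the following Python sketch, which flattens a view into CSV. It assumes the standard CouchDB view response layout (a "rows" list whose entries carry "id", "key" and "value") and that each "value" is a flat JSON object. The two records and their field names are made up for illustration and do not come from the MDb.

```python
import csv
import io
import json

# Stand-in for the content of a downloaded view file (made-up records).
downloaded = """
{"total_rows": 2, "offset": 0, "rows": [
  {"id": "CODAM-0001", "key": "CODAM-0001", "value": {"Sex": 0, "Age": 61.5}},
  {"id": "LLS-0002", "key": "LLS-0002", "value": {"Sex": 1}}
]}
"""


def view_to_csv(text):
    """Flatten the rows of a CouchDB view response into CSV text."""
    rows = json.loads(text)["rows"]
    # Collect all field names, since not every document has every field.
    fields = sorted({name for row in rows for name in row["value"]})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["ids"] + fields, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        record = {"ids": row["id"]}
        record.update(row["value"])
        writer.writerow(record)   # missing fields are written as empty cells
    return out.getvalue()


print(view_to_csv(downloaded))
```

In practice the text would be read from the saved file instead of the inline string.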

Access the metadatabase using R

We have developed the R package BIOSRutils (https://git.lumc.nl/rp3/biosrutils) for easy access to the MDb and processed datasets. BIOSRutils is available on the VM for R version 3.2.0 (start R with the command R-3.2.0 on the command line). The current version, 0.0.1, is still a development version; several planned features are not yet fully implemented.

BIOSRutils uses a configuration file to read in your MDb username and password, so that you do not have to type them every time you use the MDb.

Create a file called .biosrutils, store it in your home directory on the VM (/home/username) and add as the first line:

usrpwd: 'username:password'

Note: if your password contains any characters that bash treats specially (' / ^ & # etc.), make sure to escape them appropriately using \ or \\.
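
To make the expected file layout concrete, here is a small Python sketch that reads such a line. It only illustrates the format shown above; it is not how BIOSRutils itself parses the file, and the function name read_usrpwd is hypothetical. It assumes that the username contains no ":" and does not handle escaped characters.

```python
def read_usrpwd(text):
    """Return (username, password) from a line like: usrpwd: 'username:password'."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "usrpwd":
            value = value.strip().strip("'\"")            # drop surrounding quotes
            username, _, password = value.partition(":")  # split at the first ':'
            return username, password
    raise ValueError("no usrpwd entry found")


print(read_usrpwd("usrpwd: 'username:password'"))  # ('username', 'password')
```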

Start R-3.2.0 and load the library:

library(BIOSRutils)

Several predefined variables are available, such as the URLs of the current MDb and Rdb, as well as the username and password you provided (USRPWD). All variable names are capitalized to minimize interference with your own code.

ls()

## [1] "BIOBANKS"   "DATASETS"   "MDB"        "PROXY"      "RDB"       
## [6] "RP3DATADIR" "SRMBASE"    "USRPWD"     "VIEWS"

The BIOSRutils package provides the function getView to extract a particular view from the MDb. All available views are stored in the global variable VIEWS. Use the regular way to get help in R, e.g.:

`?`(getView)

For example, to extract all phenotype information for all samples, use the allPhenotypes view from the design Phenotypes.

## curl -X GET https://metadatabase.bbmrirp3-lumc.vm.surfsara.nl:6984/bios/_design/Phenotypes/_view/allPhenotypes?reduce=false -u 'username:password' -k -g

## Got4600records from database

Basic R manipulations can be used to select particular information, e.g.:

LLSMalesAbove70 <- subset(phenotypes, grepl("LLS", ids) & Sex == 0 & DNA_BloodSampling_Age > 
    70)
LLSMalesAbove70[1:5, 1:5]

##           ids RNA_A260280ratio Lipids_BloodSampling_Date TotChol LDLchol
## 1809 LLS-1068               NA                2003-12-10    3.81  2.2725
## 1838 LLS-1195             2.13                2004-01-26    9.80      NA
## 1853 LLS-1265             2.15                2004-02-23    4.97  2.3010
## 1854 LLS-1279             2.13                2004-02-27    7.39  5.2850
## 1884 LLS-1361             2.13                2004-03-10    6.30  3.6815
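
For readers working on a downloaded view outside R, the same selection can be sketched in Python. The records below are made up for illustration; only the column names (ids, Sex, DNA_BloodSampling_Age) and the conditions are taken from the R example above, including its use of Sex == 0. The function name lls_males_above_70 is hypothetical.

```python
# Made-up records using the column names from the R example above.
phenotypes = [
    {"ids": "LLS-0001", "Sex": 0, "DNA_BloodSampling_Age": 72.3},
    {"ids": "LLS-0002", "Sex": 1, "DNA_BloodSampling_Age": 75.0},
    {"ids": "LL-0003", "Sex": 0, "DNA_BloodSampling_Age": 80.1},
    {"ids": "LLS-0004", "Sex": 0, "DNA_BloodSampling_Age": None},
]


def lls_males_above_70(records):
    """Keep LLS samples with Sex == 0 and a known sampling age above 70."""
    selected = []
    for record in records:
        age = record["DNA_BloodSampling_Age"]
        if ("LLS" in record["ids"] and record["Sex"] == 0
                and age is not None and age > 70):
            selected.append(record)
    return selected


print([r["ids"] for r in lls_males_above_70(phenotypes)])  # ['LLS-0001']
```

As with subset in R, records whose age is missing are dropped rather than kept.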

Prepared datasets

The BIOS gene and exon counts are stored in an R object of type SummarizedExperiment. This is a data container that can store not only feature data such as gene counts, but also annotation on the features and annotation on the samples. Furthermore, the feature annotation is a GRanges object, which has several advantages. Both data sets can be loaded into R using the data function. Use colData, rowRanges or assays to extract information from the object.

More about SummarizedExperiments: 1. package vignette 2. course material 3. BioConductor nature paper

The BIOSRutils global variable DATASETS lists all available data sets.

DATASETS

##  [1] "metabolomics_RP3RP4_overlap"         
##  [2] "methData_Betas_CODAM_F2"             
##  [3] "methData_Betas_LL_F2"                
##  [4] "methData_Betas_LLS_F2"               
##  [5] "methData_Betas_NTR_F2"               
##  [6] "methData_Betas_PAN_F2"               
##  [7] "methData_Betas_RS_F2"                
##  [8] "methData_BIOS_02042015"              
##  [9] "methData_CODAM"                      
## [10] "methData_LL"                         
## [11] "methData_LLS"                        
## [12] "methData_Mvalues_CODAM_F2"           
## [13] "methData_Mvalues_LL_F2"              
## [14] "methData_Mvalues_LLS_F2"             
## [15] "methData_Mvalues_NTR_F2"             
## [16] "methData_Mvalues_PAN_F2"             
## [17] "methData_Mvalues_RS_F2"              
## [18] "methData_NTR"                        
## [19] "methData_PAN"                        
## [20] "methData_RS"                         
## [21] "rnaSeqData_freeze1_06032015BIOS"     
## [22] "rnaSeqData_freeze1_exon_14042015BIOS"

Loading and extracting the data

Load a specific data set using data, check the name of the loaded object with ls, and view its content by typing the name in the console, which automatically calls the built-in show method.

DNA methylation data

data(methData_LLS)
ls()

##  [1] "BIOBANKS"        "DATASETS"        "LLSMalesAbove70"
##  [4] "MDB"             "methData"        "phenotypes"     
##  [7] "PROXY"           "RDB"             "RP3DATADIR"     
## [10] "SRMBASE"         "USRPWD"          "VIEWS"

methData

## Warning: The SummarizedExperiment class defined in the GenomicRanges package is
##   deprecated and being replaced with the RangedSummarizedExperiment class
##   defined in the new SummarizedExperiment package. You can use
##   updateObject() on any SummarizedExperiment object to turn it into a
##   RangedSummarizedExperiment.

## class: SummarizedExperiment 
## dim: 485512 784 
## exptData(0):
## assays(1): beta
## rownames(485512): cg00050873 cg00212031 ... ch.22.47579720R
##   ch.22.48274842R
## rowRanges metadata column names(10): addressA addressB ...
##   probeEnd probeTarget
## colnames(784): 9374343010_R04C02 8691803012_R04C02 ...
##   8667045031_R01C01 8655685028_R05C02
## colData names(27): uuid dna_id ... Basename filenames

class(methData)

## [1] "SummarizedExperiment"
## attr(,"package")
## [1] "GenomicRanges"

colData(methData)

## DataFrame with 784 rows and 27 columns
##                           uuid      dna_id  biobank_id Sample_Plate
##                    <character> <character> <character>  <character>
## 9374343010_R04C02 BIOS648CBD1C        1002         LLS           10
## 8691803012_R04C02 BIOS33DC8FBC         104         LLS            3
## 8454787132_R02C01 BIOS275BFCF8        1076         LLS            1
## 8655685041_R02C02 BIOSC7D66E13        1133         LLS            4
## 8655685197_R04C01 BIOSE7E8110D         124         LLS            2
## ...                        ...         ...         ...          ...
## 8691803077_R03C01 BIOS0CA69A11         727         LLS            3
## 8691803074_R04C01 BIOS275638C1         849         LLS            3
## 8655685009_R04C02 BIOS884EDA9D         885         LLS            2
## 8667045031_R01C01 BIOSBABA99DE         924         LLS            2
## 8655685028_R05C02 BIOS7708CCB4         997         LLS            2
##                   Sample_Well Sentrix_Barcode Sentrix_Lotnumber
##                   <character>     <character>       <character>
## 9374343010_R04C02         B05      9374343010           9374343
## 8691803012_R04C02         B05      8691803012           8691803
## 8454787132_R02C01         F02      8454787132           8454787
## 8655685041_R02C02         H10      8655685041           8655685
## 8655685197_R04C01         H11      8655685197           8655685
## ...                       ...             ...               ...
## 8691803077_R03C01         G08      8691803077           8691803
## 8691803074_R04C01         H11      8691803074           8691803
## 8655685009_R04C02         B11      8655685009           8655685
## 8667045031_R01C01         E05      8667045031           8667045
## 8655685028_R05C02         G03      8655685028           8655685
##                   Sentrix_Position    C1_Barcode C1_Lotnumber
##                        <character>   <character>  <character>
## 9374343010_R04C02           R04C02 wg2472386-xc1      9426762
## 8691803012_R04C02           R04C02 wg0511987-xc1      8784516
## 8454787132_R02C01           R02C01 wg0513402-xc1      8784516
## 8655685041_R02C02           R02C02 wg0514733-xc1      8784516
## 8655685197_R04C01           R04C01 wg0513724-xc1      8784516
## ...                            ...           ...          ...
## 8691803077_R03C01           R03C01 wg0511957-xc1      8784516
## 8691803074_R04C01           R04C01 wg0511957-xc1      8784516
## 8655685009_R04C02           R04C02 wg0513724-xc1      8784516
## 8667045031_R01C01           R01C01 wg0513714-xc1      8784516
## 8655685028_R05C02           R05C02 wg0513714-xc1      8784516
##                      C2_Barcode C2_Lotnumber   TEM_Barcode TEM_Lotnumber
##                     <character>  <character>   <character>   <character>
## 9374343010_R04C02 wg2473694-xc2      9430495 wg2482594-tem       9429751
## 8691803012_R04C02 wg0527887-xc2      8783644 wg0591558-tem       8615771
## 8454787132_R02C01 wg0527876-xc2      8783644 wg0591239-tem       8615771
## 8655685041_R02C02 wg0526031-xc2      8783644 wg0592298-tem       8615771
## 8655685197_R04C01 wg0527860-xc2      8783644 wg0597544-tem       8615771
## ...                         ...          ...           ...           ...
## 8691803077_R03C01 wg0527888-xc2      8783644 wg0597543-tem       8615771
## 8691803074_R04C01 wg0527888-xc2      8783644 wg0597543-tem       8615771
## 8655685009_R04C02 wg0527860-xc2      8783644 wg0597544-tem       8615771
## 8667045031_R01C01 wg0527877-xc2      8783644 wg0591559-tem       8615771
## 8655685028_R05C02 wg0527877-xc2      8783644 wg0591559-tem       8615771
##                     STM_Barcode STM_Lotnumber   ATM_Barcode ATM_Lotnumber
##                     <character>   <character>   <character>   <character>
## 9374343010_R04C02 wg1611625-stm       9370714 wg2519407-atm       9419844
## 8691803012_R04C02 wg1566697-stm       9269715 wg0537309-atm       8762691
## 8454787132_R02C01 wg1577012-stm       9284859 wg0537570-atm       8762691
## 8655685041_R02C02 wg1577003-stm       9284859 wg0535309-atm       8762691
## 8655685197_R04C01 wg1567608-stm       9269715 wg0537310-atm       8762691
## ...                         ...           ...           ...           ...
## 8691803077_R03C01 wg1577008-stm       9284859 wg0537569-atm       8762691
## 8691803074_R04C01 wg1577008-stm       9284859 wg0537569-atm       8762691
## 8655685009_R04C02 wg1567608-stm       9269715 wg0537310-atm       8762691
## 8667045031_R01C01 wg1566700-stm       9269715 wg0537320-atm       8762691
## 8655685028_R05C02 wg1566700-stm       9269715 wg0537320-atm       8762691
##                   Library_Date Hybridization_Date  Stain_Date   Scan_Date
##                    <character>        <character> <character> <character>
## 9374343010_R04C02   03-02-2014         04-02-2014  05-02-2014  06-02-2014
## 8691803012_R04C02   18-06-2013         20-06-2013  21-06-2013  22-06-2013
## 8454787132_R02C01   18-06-2013         20-06-2013  21-06-2013  21-06-2013
## 8655685041_R02C02   18-06-2013         24-06-2013  25-06-2013  27-06-2013
## 8655685197_R04C01   18-06-2013         20-06-2013  21-06-2013  21-06-2013
## ...                        ...                ...         ...         ...
## 8691803077_R03C01   18-06-2013         20-06-2013  21-06-2013  21-06-2013
## 8691803074_R04C01   18-06-2013         20-06-2013  21-06-2013  22-06-2013
## 8655685009_R04C02   18-06-2013         20-06-2013  21-06-2013  21-06-2013
## 8667045031_R01C01   18-06-2013         20-06-2013  21-06-2013  21-06-2013
## 8655685028_R05C02   18-06-2013         20-06-2013  21-06-2013  22-06-2013
##                             Scan_Time Scanner_Name     bios_id
##                           <character>  <character> <character>
## 9374343010_R04C02 09:47:41.7432+01:00         N140    LLS-1002
## 8691803012_R04C02 01:04:19.2928+02:00         N140     LLS-104
## 8454787132_R02C01 14:51:16.1618+02:00         N140    LLS-1076
## 8655685041_R02C02  22:25:06.015+02:00         N219    LLS-1133
## 8655685197_R04C01 19:01:55.1268+02:00         N140     LLS-124
## ...                               ...          ...         ...
## 8691803077_R03C01 21:52:23.8738+02:00         N140     LLS-727
## 8691803074_R04C01 03:08:03.7458+02:00         N140     LLS-849
## 8655685009_R04C02 20:25:29.2708+02:00         N140     LLS-885
## 8667045031_R01C01 19:28:52.5858+02:00         N140     LLS-924
## 8655685028_R05C02 02:52:41.1678+02:00         N140     LLS-997
##                                                                          Basename
##                                                                       <character>
## 9374343010_R04C02 /virdir/Scratch/RP3_data/450k//LLS/9374343010/9374343010_R04C02
## 8691803012_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8691803012/8691803012_R04C02
## 8454787132_R02C01 /virdir/Scratch/RP3_data/450k//LLS/8454787132/8454787132_R02C01
## 8655685041_R02C02 /virdir/Scratch/RP3_data/450k//LLS/8655685041/8655685041_R02C02
## 8655685197_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8655685197/8655685197_R04C01
## ...                                                                           ...
## 8691803077_R03C01 /virdir/Scratch/RP3_data/450k//LLS/8691803077/8691803077_R03C01
## 8691803074_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8691803074/8691803074_R04C01
## 8655685009_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8655685009/8655685009_R04C02
## 8667045031_R01C01 /virdir/Scratch/RP3_data/450k//LLS/8667045031/8667045031_R01C01
## 8655685028_R05C02 /virdir/Scratch/RP3_data/450k//LLS/8655685028/8655685028_R05C02
##                                                                         filenames
##                                                                       <character>
## 9374343010_R04C02 /virdir/Scratch/RP3_data/450k//LLS/9374343010/9374343010_R04C02
## 8691803012_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8691803012/8691803012_R04C02
## 8454787132_R02C01 /virdir/Scratch/RP3_data/450k//LLS/8454787132/8454787132_R02C01
## 8655685041_R02C02 /virdir/Scratch/RP3_data/450k//LLS/8655685041/8655685041_R02C02
## 8655685197_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8655685197/8655685197_R04C01
## ...                                                                           ...
## 8691803077_R03C01 /virdir/Scratch/RP3_data/450k//LLS/8691803077/8691803077_R03C01
## 8691803074_R04C01 /virdir/Scratch/RP3_data/450k//LLS/8691803074/8691803074_R04C01
## 8655685009_R04C02 /virdir/Scratch/RP3_data/450k//LLS/8655685009/8655685009_R04C02
## 8667045031_R01C01 /virdir/Scratch/RP3_data/450k//LLS/8667045031/8667045031_R01C01
## 8655685028_R05C02 /virdir/Scratch/RP3_data/450k//LLS/8655685028/8655685028_R05C02

rowRanges(methData)

## GRanges object with 485512 ranges and 10 metadata columns:
##                   seqnames               ranges strand   |    addressA
##                      <Rle>            <IRanges>  <Rle>   | <character>
##        cg00050873     chrY [ 9363356,  9363357]      *   |    32735311
##        cg00212031     chrY [21239348, 21239349]      *   |    29674443
##        cg00213748     chrY [ 8148233,  8148234]      *   |    30703409
##        cg00214611     chrY [15815688, 15815689]      *   |    69792329
##        cg00455876     chrY [ 9385539,  9385540]      *   |    27653438
##               ...      ...                  ...    ... ...         ...
##     ch.22.909671F    chr22 [46114168, 46114168]      *   |    47797398
##   ch.22.46830341F    chr22 [48451677, 48451677]      *   |    29618504
##    ch.22.1008279F    chr22 [48731367, 48731367]      *   |    49664383
##   ch.22.47579720R    chr22 [49193714, 49193714]      *   |    53733426
##   ch.22.48274842R    chr22 [49888838, 49888838]      *   |    62659432
##                      addressB channel platform percentGC
##                   <character>   <Rle>    <Rle> <numeric>
##        cg00050873    31717405     Red    HM450      0.62
##        cg00212031    38703326     Red    HM450      0.64
##        cg00213748    36767301     Red    HM450      0.56
##        cg00214611    46723459     Red    HM450      0.72
##        cg00455876    69732350     Red    HM450      0.64
##               ...         ...     ...      ...       ...
##     ch.22.909671F                Both    HM450      0.34
##   ch.22.46830341F                Both    HM450      0.46
##    ch.22.1008279F                Both    HM450      0.56
##   ch.22.47579720R                Both    HM450      0.60
##   ch.22.48274842R                Both    HM450      0.58
##                                                            sourceSeq
##                                                       <DNAStringSet>
##        cg00050873 CGGGGTCCACCCACTCCAAAAACCACCACAGTTGTGCGTTGCCTCCTCGC
##        cg00212031 CGCACGTCTTCCCGACCGCATAACTTGCTCAGTCCCTGCGGCCAACTGGG
##        cg00213748 CGCCCCCTCCTGCAGAACCTCCATCGTTAAAACGGTGCCAGGCGTTAAAA
##        cg00214611 CGCCCGCGCCACACTGCAGCCCAGCACACAAAGCGCGGCCCGGAAGCTAG
##        cg00455876 GACTCTGAGCTACCCGGCACAAGCTCCAAGGGCTTCTCGGAGGAGGCTCG
##               ...                                                ...
##     ch.22.909671F CAGCAAATCAAAAATTCACTGAAAAGAAATGCTTTTGTGTGTAAGTGGTG
##   ch.22.46830341F CAGCATCACATGTAGAAGGCATTCTGCTCAGAGAATGGCCTCCATTTTTC
##    ch.22.1008279F CAAGACTCATTCAACACAGACCCAGCCTCAGGCCCAGGAAGACTGTAGGG
##   ch.22.47579720R CAGGCAAGGGGCCTCAGAGATCACCAGCAAACCCCAGAAGCTGGAGAGAG
##   ch.22.48274842R ACTGACTGCAGGTGCTCACCAGCAACAGGGTGCTCACCCACAACAGGAAC
##                   probeType  probeStart    probeEnd probeTarget
##                       <Rle> <character> <character>   <numeric>
##        cg00050873        cg     9363308     9363357     9363356
##        cg00212031        cg    21239300    21239349    21239348
##        cg00213748        cg     8148185     8148234     8148233
##        cg00214611        cg    15815640    15815689    15815688
##        cg00455876        cg     9385491     9385540     9385539
##               ...       ...         ...         ...         ...
##     ch.22.909671F        ch    46114168    46114217    46114168
##   ch.22.46830341F        ch    48451677    48451726    48451677
##    ch.22.1008279F        ch    48731367    48731416    48731367
##   ch.22.47579720R        ch    49193714    49193763    49193714
##   ch.22.48274842R        ch    49888838    49888887    49888838
##   -------
##   seqinfo: 24 sequences from hg19 genome

assays(methData)$beta[1:5, 1:5]

##            9374343010_R04C02 8691803012_R04C02 8454787132_R02C01
## cg00050873                NA                NA                NA
## cg00212031                NA                NA                NA
## cg00213748                NA                NA                NA
## cg00214611                NA                NA                NA
## cg00455876                NA                NA                NA
##            8655685041_R02C02 8655685197_R04C01
## cg00050873                NA                NA
## cg00212031                NA                NA
## cg00213748                NA                NA
## cg00214611                NA                NA
## cg00455876                NA                NA

RNA seq data

Gene-level SummarizedExperiment

data(rnaSeqData_freeze1_06032015BIOS)
ls()

##  [1] "BIOBANKS"        "DATASETS"        "LLSMalesAbove70"
##  [4] "MDB"             "methData"        "phenotypes"     
##  [7] "PROXY"           "RDB"             "rnaSeqData"     
## [10] "RP3DATADIR"      "SRMBASE"         "USRPWD"         
## [13] "VIEWS"

rnaSeqData

## Warning: The SummarizedExperiment class defined in the GenomicRanges package is
##   deprecated and being replaced with the RangedSummarizedExperiment class
##   defined in the new SummarizedExperiment package. You can use
##   updateObject() on any SummarizedExperiment object to turn it into a
##   RangedSummarizedExperiment.

## class: SummarizedExperiment 
## dim: 46628 2116 
## exptData(0):
## assays(1): counts
## rownames(46628): ENSG00000000419 ENSG00000000457 ...
##   ENSG00000270182 ENSG00000270184
## rowRanges metadata column names(2): gc length
## colnames(2116): BD1NYRACXX-5-1 AD10W1ACXX-4-1 ... BC1KAVACXX-1-14
##   BC1KAVACXX-8-16
## colData names(140): group lib.size ...
##   fastqc_clean.R2_clean_GC_std fastqc_clean.R1_clean_GC_std

colData(rnaSeqData)

## DataFrame with 2116 rows and 140 columns
##                    group   lib.size norm.factors   rnaseq_run_id
##                 <factor>  <numeric>    <numeric>     <character>
## BD1NYRACXX-5-1     CODAM 1259404830            1  BD1NYRACXX-5-1
## AD10W1ACXX-4-1     CODAM 1632462474            1  AD10W1ACXX-4-1
## BD1NYRACXX-5-2     CODAM 1978420658            1  BD1NYRACXX-5-2
## AD10W1ACXX-4-2     CODAM 1334043187            1  AD10W1ACXX-4-2
## BD1NYRACXX-5-3     CODAM 1222613586            1  BD1NYRACXX-5-3
## ...                  ...        ...          ...             ...
## AD1NFNACXX-1-1        RS 1709905424            1  AD1NFNACXX-1-1
## AC1JV9ACXX-5-10       RS  765091757            1 AC1JV9ACXX-5-10
## AD1NFNACXX-1-20       RS 2327049556            1 AD1NFNACXX-1-20
## BC1KAVACXX-1-14       RS 2401508849            1 BC1KAVACXX-1-14
## BC1KAVACXX-8-16       RS 1710394939            1 BC1KAVACXX-8-16
##                     bios_id         uuid  biobank_id   person_id
##                 <character>  <character> <character> <character>
## BD1NYRACXX-5-1   CODAM-2001 BIOS6DB3BAD1       CODAM        2001
## AD10W1ACXX-4-1   CODAM-2002 BIOSCFA14234       CODAM        2002
## BD1NYRACXX-5-2   CODAM-2009 BIOSCA449668       CODAM        2009
## AD10W1ACXX-4-2   CODAM-2013 BIOS415A8BFB       CODAM        2013
## BD1NYRACXX-5-3   CODAM-2016 BIOSD16ED999       CODAM        2016
## ...                     ...          ...         ...         ...
## AD1NFNACXX-1-1       RS-942 BIOSCC469FF2          RS         942
## AC1JV9ACXX-5-10     RS-9420 BIOSB1058B1B          RS        9420
## AD1NFNACXX-1-20      RS-969 BIOSA2EF6C80          RS         969
## BC1KAVACXX-1-14      RS-982 BIOS027136BA          RS         982
## BC1KAVACXX-8-16      RS-984 BIOSC01C4781          RS         984
##                     nreruns   rnaseq_qc methylation_run_id    pheno_id
##                 <character> <character>        <character> <character>
## BD1NYRACXX-5-1            1           0  8667053102_R05C02        2001
## AD10W1ACXX-4-1            1           0  8667053157_R01C02        2002
## BD1NYRACXX-5-2            1           0  8667053152_R02C02        2009
## AD10W1ACXX-4-2            1           0  8655685053_R04C02        2013
## BD1NYRACXX-5-3            1           0  8655685094_R01C01        2016
## ...                     ...         ...                ...         ...
## AD1NFNACXX-1-1            1           0  8691803030_R05C01         942
## AC1JV9ACXX-5-10           1           0  8691803046_R04C02        9420
## AD1NFNACXX-1-20           1           0  8691803032_R01C01         969
## BC1KAVACXX-1-14           1           0  8454787105_R02C02         982
## BC1KAVACXX-8-16           1           0  8691803032_R06C01         984
##                     gwas_id      dna_id      rna_id     gonl_id
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1         2001        2001        2001          NA
## AD10W1ACXX-4-1         2002        2002        2002          NA
## BD1NYRACXX-5-2         2009        2009        2009          NA
## AD10W1ACXX-4-2         2013        2013        2013          NA
## BD1NYRACXX-5-3         2016        2016        2016          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1          942         942         942          NA
## AC1JV9ACXX-5-10        9420        9420        9420          NA
## AD1NFNACXX-1-20         969         969         969          NA
## BC1KAVACXX-1-14         982         982         982          NA
## BC1KAVACXX-8-16         984         984         984          NA
##                       cg_id      in_rp3 rnaseq_freeze methylation_freeze
##                 <character> <character>   <character>        <character>
## BD1NYRACXX-5-1           NA        TRUE             1                  1
## AD10W1ACXX-4-1           NA        TRUE             1                  1
## BD1NYRACXX-5-2           NA        TRUE             1                  1
## AD10W1ACXX-4-2           NA        TRUE             1                  1
## BD1NYRACXX-5-3           NA        TRUE             1                  1
## ...                     ...         ...           ...                ...
## AD1NFNACXX-1-1           NA        TRUE             1                  1
## AC1JV9ACXX-5-10          NA        TRUE             1                  1
## AD1NFNACXX-1-20          NA        TRUE             1                  1
## BC1KAVACXX-1-14          NA        TRUE             1                  1
## BC1KAVACXX-8-16          NA        TRUE             1                  1
##                 gonlv5imputed
##                   <character>
## BD1NYRACXX-5-1           TRUE
## AD10W1ACXX-4-1           TRUE
## BD1NYRACXX-5-2           TRUE
## AD10W1ACXX-4-2           TRUE
## BD1NYRACXX-5-3           TRUE
## ...                       ...
## AD1NFNACXX-1-1           TRUE
## AC1JV9ACXX-5-10          TRUE
## AD1NFNACXX-1-20          TRUE
## BC1KAVACXX-1-14          TRUE
## BC1KAVACXX-8-16          TRUE
##                                             Ascertainment_criterion
##                                                         <character>
## BD1NYRACXX-5-1  Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-1  Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-2  Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-2  Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-3  Selected for mildly increased DM2 /CVD risk factors
## ...                                                             ...
## AD1NFNACXX-1-1                                                   NA
## AC1JV9ACXX-5-10                                                  NA
## AD1NFNACXX-1-20                                                  NA
## BC1KAVACXX-1-14                                                  NA
## BC1KAVACXX-8-16                                                  NA
##                                   GWAS_Chip GWAS_DataGeneration_Date
##                                 <character>              <character>
## BD1NYRACXX-5-1  Illumina human omni express                     2012
## AD10W1ACXX-4-1  Illumina human omni express                     2012
## BD1NYRACXX-5-2  Illumina human omni express                     2012
## AD10W1ACXX-4-2  Illumina human omni express                     2012
## BD1NYRACXX-5-3  Illumina human omni express                     2012
## ...                                     ...                      ...
## AD1NFNACXX-1-1                           NA                       NA
## AC1JV9ACXX-5-10                          NA                       NA
## AD1NFNACXX-1-20                          NA                       NA
## BC1KAVACXX-1-14                          NA                       NA
## BC1KAVACXX-8-16                          NA                       NA
##                 DNA_BloodSampling_Age DNA_BloodSampling_Date
##                           <character>            <character>
## BD1NYRACXX-5-1                   77.9             2006-08-08
## AD10W1ACXX-4-1                   70.5             2006-08-09
## BD1NYRACXX-5-2                   66.3             2006-09-14
## AD10W1ACXX-4-2                   76.5             2006-09-26
## BD1NYRACXX-5-3                   71.9             2006-06-07
## ...                               ...                    ...
## AD1NFNACXX-1-1                 70.357             2011-10-05
## AC1JV9ACXX-5-10                51.535             2012-03-29
## AD1NFNACXX-1-20                68.233             2011-09-29
## BC1KAVACXX-1-14                66.379             2011-05-19
## BC1KAVACXX-8-16                68.783             2011-10-04
##                 DNA_BloodSampling_Time               DNA_Source
##                            <character>              <character>
## BD1NYRACXX-5-1                 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-1                 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-2                 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-2                 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-3                 8-11 am whole blood (buffy coat)
## ...                                ...                      ...
## AD1NFNACXX-1-1                 9:50:00                       NA
## AC1JV9ACXX-5-10                8:20:00                       NA
## AD1NFNACXX-1-20                9:30:00                       NA
## BC1KAVACXX-1-14               10:25:00                       NA
## BC1KAVACXX-8-16                9:00:00                       NA
##                 DNA_Extraction_Method DNA_Extraction_Date
##                           <character>         <character>
## BD1NYRACXX-5-1     QIAamp DNA minikit          2012-05-01
## AD10W1ACXX-4-1     QIAamp DNA minikit          2012-05-01
## BD1NYRACXX-5-2     QIAamp DNA minikit          2012-05-01
## AD10W1ACXX-4-2     QIAamp DNA minikit          2012-05-01
## BD1NYRACXX-5-3     QIAamp DNA minikit          2012-05-01
## ...                               ...                 ...
## AD1NFNACXX-1-1                     NA                  NA
## AC1JV9ACXX-5-10                    NA                  NA
## AD1NFNACXX-1-20                    NA                  NA
## BC1KAVACXX-1-14                    NA                  NA
## BC1KAVACXX-8-16                    NA                  NA
##                 DNA_QuantificationMethod DNA_A260A280ratio
##                              <character>       <character>
## BD1NYRACXX-5-1                  nanodrop               1.9
## AD10W1ACXX-4-1                  nanodrop              1.92
## BD1NYRACXX-5-2                  nanodrop              1.89
## AD10W1ACXX-4-2                  nanodrop              1.89
## BD1NYRACXX-5-3                  nanodrop              1.89
## ...                                  ...               ...
## AD1NFNACXX-1-1                        NA                NA
## AC1JV9ACXX-5-10                       NA                NA
## AD1NFNACXX-1-20                       NA                NA
## BC1KAVACXX-1-14                       NA                NA
## BC1KAVACXX-8-16                       NA                NA
##                 RNA_BloodSampling_Age RNA_Sampling_Date RNA_Sampling_Time
##                           <character>       <character>       <character>
## BD1NYRACXX-5-1                   77.9        2006-08-08           8-11 am
## AD10W1ACXX-4-1                   70.5        2006-08-09           8-11 am
## BD1NYRACXX-5-2                   66.3        2006-09-14           8-11 am
## AD10W1ACXX-4-2                   76.5        2006-09-26           8-11 am
## BD1NYRACXX-5-3                   71.9        2006-06-07           8-11 am
## ...                               ...               ...               ...
## AD1NFNACXX-1-1                 70.357        2011-10-05           9:50:00
## AC1JV9ACXX-5-10                51.535        2012-03-29           8:20:00
## AD1NFNACXX-1-20                68.233        2011-09-29           9:30:00
## BC1KAVACXX-1-14                66.379        2011-05-19          10:25:00
## BC1KAVACXX-8-16                68.783        2011-10-04           9:00:00
##                  RNA_Source RNA_Extraction_Date
##                 <character>         <character>
## BD1NYRACXX-5-1     PAX gene          2010-07-01
## AD10W1ACXX-4-1     PAX gene          2010-07-01
## BD1NYRACXX-5-2     PAX gene          2010-07-01
## AD10W1ACXX-4-2     PAX gene          2010-07-01
## BD1NYRACXX-5-3     PAX gene          2010-07-01
## ...                     ...                 ...
## AD1NFNACXX-1-1           NA                  NA
## AC1JV9ACXX-5-10          NA                  NA
## AD1NFNACXX-1-20          NA                  NA
## BC1KAVACXX-1-14          NA                  NA
## BC1KAVACXX-8-16          NA                  NA
##                             RNA_Extraction_Method     RNA_RIN
##                                       <character> <character>
## BD1NYRACXX-5-1  PAXgene blood miRNA kit (Qiacube)         9.1
## AD10W1ACXX-4-1  PAXgene blood miRNA kit (Qiacube)           9
## BD1NYRACXX-5-2  PAXgene blood miRNA kit (Qiacube)           9
## AD10W1ACXX-4-2  PAXgene blood miRNA kit (Qiacube)         8.8
## BD1NYRACXX-5-3  PAXgene blood miRNA kit (Qiacube)           9
## ...                                           ...         ...
## AD1NFNACXX-1-1                                 NA       8.539
## AC1JV9ACXX-5-10                                NA      8.1775
## AD1NFNACXX-1-20                                NA      8.1436
## BC1KAVACXX-1-14                                NA         8.5
## BC1KAVACXX-8-16                                NA      8.7492
##                 RNA_A260280ratio   BirthYear         Sex Smoking_Age
##                      <character> <character> <character> <character>
## BD1NYRACXX-5-1                 2        1928           0        77.9
## AD10W1ACXX-4-1                 2        1936           1        70.5
## BD1NYRACXX-5-2               2.2        1940           0        66.3
## AD10W1ACXX-4-2               2.2        1930           0        76.5
## BD1NYRACXX-5-3               2.1        1934           0        71.9
## ...                          ...         ...         ...         ...
## AD1NFNACXX-1-1                NA        1941           0          NA
## AC1JV9ACXX-5-10               NA        1960           0          NA
## AD1NFNACXX-1-20               NA        1943           1          NA
## BC1KAVACXX-1-14               NA        1944           0          NA
## BC1KAVACXX-8-16               NA        1942           1          NA
##                     Smoking Lipids_BloodSampling_Age
##                 <character>              <character>
## BD1NYRACXX-5-1            1                     77.9
## AD10W1ACXX-4-1            0                     70.5
## BD1NYRACXX-5-2            2                     66.3
## AD10W1ACXX-4-2            1                     76.5
## BD1NYRACXX-5-3            1                     71.9
## ...                     ...                      ...
## AD1NFNACXX-1-1           NA                   70.357
## AC1JV9ACXX-5-10          NA                   51.535
## AD1NFNACXX-1-20          NA                   68.233
## BC1KAVACXX-1-14          NA                   66.379
## BC1KAVACXX-8-16          NA                   68.783
##                 Lipids_BloodSampling_Date Lipids_BloodSampling_Time
##                               <character>               <character>
## BD1NYRACXX-5-1                 2006-08-08                   8-11 am
## AD10W1ACXX-4-1                 2006-08-09                   8-11 am
## BD1NYRACXX-5-2                 2006-09-14                   8-11 am
## AD10W1ACXX-4-2                 2006-09-26                   8-11 am
## BD1NYRACXX-5-3                 2006-06-07                   8-11 am
## ...                                   ...                       ...
## AD1NFNACXX-1-1                 2011-10-05                   9:50:00
## AC1JV9ACXX-5-10                2012-03-29                   8:20:00
## AD1NFNACXX-1-20                2011-09-29                   9:30:00
## BC1KAVACXX-1-14                2011-05-19                  10:25:00
## BC1KAVACXX-8-16                2011-10-04                   9:00:00
##                 Lipids_BloodSampling_Fasting     TotChol     HDLchol
##                                  <character> <character> <character>
## BD1NYRACXX-5-1                             1         5.6        1.28
## AD10W1ACXX-4-1                             1         4.3        1.24
## BD1NYRACXX-5-2                             1         5.4         1.4
## AD10W1ACXX-4-2                             1           6        1.08
## BD1NYRACXX-5-3                             1         5.7        1.22
## ...                                      ...         ...         ...
## AD1NFNACXX-1-1                             1         4.1         1.6
## AC1JV9ACXX-5-10                            1         5.9        1.63
## AD1NFNACXX-1-20                            1         6.6        2.22
## BC1KAVACXX-1-14                            1         5.3        0.97
## BC1KAVACXX-8-16                            1         5.9         1.7
##                 Triglycerides     LDLchol LDLcholMethod LipidsMed_Age
##                   <character> <character>   <character>   <character>
## BD1NYRACXX-5-1            1.5          NA            NA            NA
## AD10W1ACXX-4-1            1.1          NA            NA            NA
## BD1NYRACXX-5-2            0.7          NA            NA            NA
## AD10W1ACXX-4-2            2.1          NA            NA            NA
## BD1NYRACXX-5-3              1          NA            NA            NA
## ...                       ...         ...           ...           ...
## AD1NFNACXX-1-1           0.91          NA            NA            NA
## AC1JV9ACXX-5-10          0.92          NA            NA            NA
## AD1NFNACXX-1-20          1.22          NA            NA            NA
## BC1KAVACXX-1-14           1.4          NA            NA            NA
## BC1KAVACXX-8-16          0.65          NA            NA            NA
##                    LipidMed Anthropometry_Age      Height      Weight
##                 <character>       <character> <character> <character>
## BD1NYRACXX-5-1            1              77.9       175.5       76.25
## AD10W1ACXX-4-1            1              70.5         166       116.6
## BD1NYRACXX-5-2            0              66.3         170          83
## AD10W1ACXX-4-2            0              76.5         172        86.3
## BD1NYRACXX-5-3            0              71.9       174.5       74.75
## ...                     ...               ...         ...         ...
## AD1NFNACXX-1-1           NA                NA         172        87.5
## AC1JV9ACXX-5-10          NA                NA         180        99.9
## AD1NFNACXX-1-20          NA                NA         162        66.7
## BC1KAVACXX-1-14          NA                NA       183.7        84.3
## BC1KAVACXX-8-16          NA                NA       162.7        73.3
##                 CRP_BloodSampling_Age CRP_BloodSampling_Date
##                           <character>            <character>
## BD1NYRACXX-5-1                   77.9             2006-08-08
## AD10W1ACXX-4-1                   70.5             2006-08-09
## BD1NYRACXX-5-2                   66.3             2006-09-14
## AD10W1ACXX-4-2                   76.5             2006-09-26
## BD1NYRACXX-5-3                   71.9             2006-06-07
## ...                               ...                    ...
## AD1NFNACXX-1-1                     NA                     NA
## AC1JV9ACXX-5-10                    NA                     NA
## AD1NFNACXX-1-20                    NA                     NA
## BC1KAVACXX-1-14                    NA                     NA
## BC1KAVACXX-8-16                    NA                     NA
##                 CRP_BloodSampling_Time       hsCRP
##                            <character> <character>
## BD1NYRACXX-5-1                      NA        0.95
## AD10W1ACXX-4-1                      NA        4.61
## BD1NYRACXX-5-2                      NA        0.78
## AD10W1ACXX-4-2                      NA        8.48
## BD1NYRACXX-5-3                      NA        0.94
## ...                                ...         ...
## AD1NFNACXX-1-1                      NA          NA
## AC1JV9ACXX-5-10                     NA          NA
## AD1NFNACXX-1-20                     NA          NA
## BC1KAVACXX-1-14                     NA          NA
## BC1KAVACXX-8-16                     NA          NA
##                 CellCount_BloodSampling_Age CellCount_BloodSampling_Date
##                                 <character>                  <character>
## BD1NYRACXX-5-1                           NA                           NA
## AD10W1ACXX-4-1                           NA                           NA
## BD1NYRACXX-5-2                           NA                           NA
## AD10W1ACXX-4-2                           NA                           NA
## BD1NYRACXX-5-3                           NA                           NA
## ...                                     ...                          ...
## AD1NFNACXX-1-1                       70.357                   2011-10-05
## AC1JV9ACXX-5-10                      51.535                   2012-03-29
## AD1NFNACXX-1-20                      68.233                   2011-09-29
## BC1KAVACXX-1-14                      66.379                   2011-05-19
## BC1KAVACXX-8-16                      68.783                   2011-10-04
##                 CellCount_BloodSampling_Time         WBC         RBC
##                                  <character> <character> <character>
## BD1NYRACXX-5-1                            NA          NA          NA
## AD10W1ACXX-4-1                            NA          NA          NA
## BD1NYRACXX-5-2                            NA          NA          NA
## AD10W1ACXX-4-2                            NA          NA          NA
## BD1NYRACXX-5-3                            NA          NA          NA
## ...                                      ...         ...         ...
## AD1NFNACXX-1-1                       9:50:00           8        4.82
## AC1JV9ACXX-5-10                      8:20:00         6.4        5.02
## AD1NFNACXX-1-20                      9:30:00         5.5        4.32
## BC1KAVACXX-1-14                     10:25:00         6.5        5.21
## BC1KAVACXX-8-16                      9:00:00         5.6        4.55
##                         HGB         HCT         MCV         MCH
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1          8.9        0.43          90        1.85
## AC1JV9ACXX-5-10         9.8        0.46        92.5        1.94
## AD1NFNACXX-1-20         8.3         0.4        93.4        1.92
## BC1KAVACXX-1-14         9.4        0.45        87.1        1.81
## BC1KAVACXX-8-16         8.5        0.42        93.2        1.86
##                        MCHC        CHCM          CH         RDW
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1         20.5          NA          NA          NA
## AC1JV9ACXX-5-10          21          NA          NA          NA
## AD1NFNACXX-1-20        20.6          NA          NA          NA
## BC1KAVACXX-1-14        20.8          NA          NA          NA
## BC1KAVACXX-8-16          20          NA          NA          NA
##                         HDW         PLT         MPV        Neut
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA         248           8          NA
## AC1JV9ACXX-5-10          NA         345         6.7          NA
## AD1NFNACXX-1-20          NA         241         7.1          NA
## BC1KAVACXX-1-14          NA         265         7.4          NA
## BC1KAVACXX-8-16          NA         225         7.6          NA
##                       Lymph        Mono         Eos        Baso
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA          NA          NA          NA
## AC1JV9ACXX-5-10          NA          NA          NA          NA
## AD1NFNACXX-1-20          NA          NA          NA          NA
## BC1KAVACXX-1-14          NA          NA          NA          NA
## BC1KAVACXX-8-16          NA          NA          NA          NA
##                         LUC   Neut_Perc  Lymph_Perc   Mono_Perc
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA          NA        42.6         8.5
## AC1JV9ACXX-5-10          NA          NA        31.9         7.9
## AD1NFNACXX-1-20          NA          NA        29.9         8.7
## BC1KAVACXX-1-14          NA          NA        37.2         3.9
## BC1KAVACXX-8-16          NA          NA        41.6         9.7
##                    Eos_Perc   Baso_Perc    LUC_Perc  run_number
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA         125
## AD10W1ACXX-4-1           NA          NA          NA         234
## BD1NYRACXX-5-2           NA          NA          NA         125
## AD10W1ACXX-4-2           NA          NA          NA         234
## BD1NYRACXX-5-3           NA          NA          NA         125
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA          NA          NA         124
## AC1JV9ACXX-5-10          NA          NA          NA         243
## AD1NFNACXX-1-20          NA          NA          NA         124
## BC1KAVACXX-1-14          NA          NA          NA         123
## BC1KAVACXX-8-16          NA          NA          NA         123
##                 flowcell_number     machine         raw past_filter
##                     <character> <character> <character> <character>
## BD1NYRACXX-5-1               2b      SN1013    40434000    32052000
## AD10W1ACXX-4-1               3a       SN505    46622000    41461000
## BD1NYRACXX-5-2               2b      SN1013    61132000    48674000
## AD10W1ACXX-4-2               3a       SN505    36635000    33116000
## BD1NYRACXX-5-3               2b      SN1013    40919000    31762000
## ...                         ...         ...         ...         ...
## AD1NFNACXX-1-1               2a      SN1013    50351000    43968000
## AC1JV9ACXX-5-10             10a       SN505  23,696,872  21,347,978
## AD1NFNACXX-1-20              2a      SN1013    66222000    58360000
## BC1KAVACXX-1-14              1b      SN1013    63089000    57536000
## BC1KAVACXX-8-16              1b      SN1013    44481000    41267000
##                        date insert_size star.avg_deletion_length
##                 <character> <character>              <character>
## BD1NYRACXX-5-1   2013-03-29         325                     1.47
## AD10W1ACXX-4-1   2013-04-17         313                     1.42
## BD1NYRACXX-5-2   2013-03-29         325                     1.46
## AD10W1ACXX-4-2   2013-04-17         305                     1.44
## BD1NYRACXX-5-3   2013-03-29         308                     1.43
## ...                     ...         ...                      ...
## AD1NFNACXX-1-1   2013-03-29         298                     1.47
## AC1JV9ACXX-5-10  2013-07-09         304                     1.58
## AD1NFNACXX-1-20  2013-03-29         325                     1.47
## BC1KAVACXX-1-14  2013-03-19         326                     1.43
## BC1KAVACXX-8-16  2013-03-19         314                     1.45
##                 star.start_mapping_time star.pct_unique_mapped
##                             <character>            <character>
## BD1NYRACXX-5-1          Oct 10 21:39:51                  93.19
## AD10W1ACXX-4-1          Oct 07 17:40:58                   92.4
## BD1NYRACXX-5-2          Oct 11 00:56:26                  92.49
## AD10W1ACXX-4-2          Oct 16 20:38:57                  92.63
## BD1NYRACXX-5-3          Oct 10 21:55:25                   92.9
## ...                                 ...                    ...
## AD1NFNACXX-1-1          Oct 08 10:43:53                  92.68
## AC1JV9ACXX-5-10         Oct 06 12:24:09                  89.53
## AD1NFNACXX-1-20         Oct 09 03:56:44                   92.2
## BC1KAVACXX-1-14         Oct 10 05:01:36                  93.06
## BC1KAVACXX-8-16         Oct 10 08:05:07                  90.16
##                 star.num_unique_mapped star.num_splice_annotated
##                            <character>               <character>
## BD1NYRACXX-5-1                13384007                   3603631
## AD10W1ACXX-4-1                16673749                   4812839
## BD1NYRACXX-5-2                20220194                   5536547
## AD10W1ACXX-4-2                13579658                   3876835
## BD1NYRACXX-5-3                12571418                   3404880
## ...                                ...                       ...
## AD1NFNACXX-1-1                18203944                   4744204
## AC1JV9ACXX-5-10                8140547                   2414167
## AD1NFNACXX-1-20               24617486                   6577238
## BC1KAVACXX-1-14               24967281                   6457378
## BC1KAVACXX-8-16               17642273                   4919424
##                 star.num_splice_noncanonical star.pct_unmapped_other
##                                  <character>             <character>
## BD1NYRACXX-5-1                          3291                    0.06
## AD10W1ACXX-4-1                          4690                    0.05
## BD1NYRACXX-5-2                          5662                    0.06
## AD10W1ACXX-4-2                          4701                    0.05
## BD1NYRACXX-5-3                          3997                    0.05
## ...                                      ...                     ...
## AD1NFNACXX-1-1                          5975                    0.07
## AC1JV9ACXX-5-10                         3276                    0.05
## AD1NFNACXX-1-20                         6869                    0.06
## BC1KAVACXX-1-14                         7585                    0.05
## BC1KAVACXX-8-16                         5405                    0.06
##                 star.num_splice_total star.num_splice_atac
##                           <character>          <character>
## BD1NYRACXX-5-1                3631134                 3204
## AD10W1ACXX-4-1                4852111                 4039
## BD1NYRACXX-5-2                5584443                 4996
## AD10W1ACXX-4-2                3909894                 3384
## BD1NYRACXX-5-3                3433262                 3033
## ...                               ...                  ...
## AD1NFNACXX-1-1                4783352                 4325
## AC1JV9ACXX-5-10               2438071                 2838
## AD1NFNACXX-1-20               6630389                 5836
## BC1KAVACXX-1-14               6511885                 5826
## BC1KAVACXX-8-16               4962823                 4477
##                 star.num_splice_gcag star.num_input
##                          <character>    <character>
## BD1NYRACXX-5-1                 24928       14362274
## AD10W1ACXX-4-1                 32581       18044680
## BD1NYRACXX-5-2                 37530       21861429
## AD10W1ACXX-4-2                 27794       14660267
## BD1NYRACXX-5-3                 23072       13532066
## ...                              ...            ...
## AD1NFNACXX-1-1                 31893       19641011
## AC1JV9ACXX-5-10                18193        9092744
## AD1NFNACXX-1-20                46290       26699100
## BC1KAVACXX-1-14                44465       26828880
## BC1KAVACXX-8-16                33675       19567453
##                 star.rate_deletion_per_base star.pct_mapped_multiple
##                                 <character>              <character>
## BD1NYRACXX-5-1                            0                     3.92
## AD10W1ACXX-4-1                            0                     4.25
## BD1NYRACXX-5-2                            0                     4.47
## AD10W1ACXX-4-2                            0                     4.12
## BD1NYRACXX-5-3                            0                     4.11
## ...                                     ...                      ...
## AD1NFNACXX-1-1                            0                     4.19
## AC1JV9ACXX-5-10                           0                     3.69
## AD1NFNACXX-1-20                           0                     3.55
## BC1KAVACXX-1-14                           0                     3.87
## BC1KAVACXX-8-16                           0                      3.8
##                 star.rate_mismatch_per_base star.start_job_time
##                                 <character>         <character>
## BD1NYRACXX-5-1                         0.23     Oct 10 21:38:54
## AD10W1ACXX-4-1                         0.22     Oct 07 17:39:59
## BD1NYRACXX-5-2                         0.25     Oct 11 00:55:27
## AD10W1ACXX-4-2                         0.22     Oct 16 20:29:47
## BD1NYRACXX-5-3                         0.25     Oct 10 21:53:03
## ...                                     ...                 ...
## AD1NFNACXX-1-1                          0.2     Oct 08 10:41:20
## AC1JV9ACXX-5-10                        0.26     Oct 06 12:22:56
## AD1NFNACXX-1-20                         0.2     Oct 09 03:55:17
## BC1KAVACXX-1-14                        0.21     Oct 10 04:59:50
## BC1KAVACXX-8-16                        0.19     Oct 10 08:02:10
##                 star.pct_unmapped_short star.mapping_speed
##                             <character>        <character>
## BD1NYRACXX-5-1                     2.56             544.25
## AD10W1ACXX-4-1                     3.02             595.97
## BD1NYRACXX-5-2                     2.75             554.23
## AD10W1ACXX-4-2                     2.95                593
## BD1NYRACXX-5-3                     2.74             624.56
## ...                                 ...                ...
## AD1NFNACXX-1-1                     2.73             625.73
## AC1JV9ACXX-5-10                    6.35              503.6
## AD1NFNACXX-1-20                    3.85             279.41
## BC1KAVACXX-1-14                     2.7             555.08
## BC1KAVACXX-8-16                    5.71             489.19
##                 star.avg_insertion_length star.pct_mapped_many
##                               <character>          <character>
## BD1NYRACXX-5-1                        1.2                 0.26
## AD10W1ACXX-4-1                       1.19                 0.28
## BD1NYRACXX-5-2                        1.2                 0.23
## AD10W1ACXX-4-2                       1.19                 0.24
## BD1NYRACXX-5-3                        1.2                  0.2
## ...                                   ...                  ...
## AD1NFNACXX-1-1                        1.2                 0.32
## AC1JV9ACXX-5-10                      1.21                 0.38
## AD1NFNACXX-1-20                       1.2                 0.34
## BC1KAVACXX-1-14                      1.19                 0.32
## BC1KAVACXX-8-16                      1.19                 0.28
##                 star.rate_insertion_per_base star.num_splice_gtag
##                                  <character>          <character>
## BD1NYRACXX-5-1                          0.01              3599711
## AD10W1ACXX-4-1                          0.01              4810801
## BD1NYRACXX-5-2                          0.01              5536255
## AD10W1ACXX-4-2                          0.01              3874015
## BD1NYRACXX-5-3                          0.01              3403160
## ...                                      ...                  ...
## AD1NFNACXX-1-1                          0.01              4741159
## AC1JV9ACXX-5-10                            0              2413764
## AD1NFNACXX-1-20                         0.01              6571394
## BC1KAVACXX-1-14                         0.01              6454009
## BC1KAVACXX-8-16                         0.01              4919266
##                 star.num_mapped_many star.num_mapped_multiple
##                          <character>              <character>
## BD1NYRACXX-5-1                 38000                   563715
## AD10W1ACXX-4-1                 50826                   766081
## BD1NYRACXX-5-2                 49664                   976688
## AD10W1ACXX-4-2                 35799                   604307
## BD1NYRACXX-5-3                 27346                   555867
## ...                              ...                      ...
## AD1NFNACXX-1-1                 63747                   823711
## AC1JV9ACXX-5-10                34925                   335743
## AD1NFNACXX-1-20                91494                   947493
## BC1KAVACXX-1-14                85467                  1037749
## BC1KAVACXX-8-16                54001                   742645
##                 star.avg_mapped_length star.avg_input_length
##                            <character>           <character>
## BD1NYRACXX-5-1                   96.67                    97
## AD10W1ACXX-4-1                   98.17                    98
## BD1NYRACXX-5-2                   96.68                    97
## AD10W1ACXX-4-2                   98.17                    98
## BD1NYRACXX-5-3                   96.46                    96
## ...                                ...                   ...
## AD1NFNACXX-1-1                   98.17                    98
## AC1JV9ACXX-5-10                  97.49                    98
## AD1NFNACXX-1-20                  98.15                    98
## BC1KAVACXX-1-14                  98.33                    98
## BC1KAVACXX-8-16                  98.48                    98
##                   star.end_time star.pct_unmapped_mismatch
##                     <character>                <character>
## BD1NYRACXX-5-1  Oct 10 21:41:26                          0
## AD10W1ACXX-4-1  Oct 07 17:42:47                          0
## BD1NYRACXX-5-2  Oct 11 00:58:48                          0
## AD10W1ACXX-4-2  Oct 16 20:40:26                          0
## BD1NYRACXX-5-3  Oct 10 21:56:43                          0
## ...                         ...                        ...
## AD1NFNACXX-1-1  Oct 08 10:45:46                          0
## AC1JV9ACXX-5-10 Oct 06 12:25:14                          0
## AD1NFNACXX-1-20 Oct 09 04:02:28                          0
## BC1KAVACXX-1-14 Oct 10 05:04:30                          0
## BC1KAVACXX-8-16 Oct 10 08:07:31                          0
##                 bam.genome_insert_mean bam.genome_insert_std
##                            <character>           <character>
## BD1NYRACXX-5-1        288.648256495878      234.108811011201
## AD10W1ACXX-4-1        294.584736018414      245.447514390202
## BD1NYRACXX-5-2        286.413248406994      230.288439069619
## AD10W1ACXX-4-2        291.046940453097      240.342521371492
## BD1NYRACXX-5-3        275.199058717648      216.256191245783
## ...                                ...                   ...
## AD1NFNACXX-1-1        243.377767271685      174.593146049674
## AC1JV9ACXX-5-10        270.98717085555      234.077479869603
## AD1NFNACXX-1-20       271.426005179576      206.029383007847
## BC1KAVACXX-1-14       262.341694194699      188.846941324494
## BC1KAVACXX-8-16       281.656613101494      222.606885771599
##                 bam.genome_duplicates bam.exon_duplicates bam.exon_mapped
##                           <character>         <character>     <character>
## BD1NYRACXX-5-1                2661291                   0        26186566
## AD10W1ACXX-4-1                4967141                   0        33376848
## BD1NYRACXX-5-2                7305149                   0        41080270
## AD10W1ACXX-4-2                3560547                   0        27278995
## BD1NYRACXX-5-3                3668799                   0        25458963
## ...                               ...                 ...             ...
## AD1NFNACXX-1-1                6078591                   0        34995158
## AC1JV9ACXX-5-10               5175009                   0        15785461
## AD1NFNACXX-1-20               8338397                   0        47655190
## BC1KAVACXX-1-14               8125608                   0        49057971
## BC1KAVACXX-8-16               5190439                   0        34874136
##                 bam.genome_total bam.genome_mapped bam.exon_total
##                      <character>       <character>    <character>
## BD1NYRACXX-5-1          30369701          29537099       26186566
## AD10W1ACXX-4-1          38317327          37103536       33376848
## BD1NYRACXX-5-2          46554232          45220039       41080270
## AD10W1ACXX-4-2          31054670          30098674       27278995
## BD1NYRACXX-5-3          28654414          27841451       25458963
## ...                          ...               ...            ...
## AD1NFNACXX-1-1          41723982          40491606       34995158
## AC1JV9ACXX-5-10         19163800          17928954       15785461
## AD1NFNACXX-1-20         56174753          53899515       47655190
## BC1KAVACXX-1-14         56673162          55019984       49057971
## BC1KAVACXX-8-16         41274638          38906239       34874136
##                 fastqc_raw.R2_raw_GC_mean fastqc_raw.R2_raw_GC_std
##                               <character>              <character>
## BD1NYRACXX-5-1           50.6134742425783         11.9436168503436
## AD10W1ACXX-4-1           52.5497727984934         11.6764709027634
## BD1NYRACXX-5-2           51.3511595055147         11.8771215708693
## AD10W1ACXX-4-2           52.9603163325893         11.7375629125041
## BD1NYRACXX-5-3           51.6023001132948         11.7728320915493
## ...                                   ...                      ...
## AD1NFNACXX-1-1           51.1126529412655         12.1481063937884
## AC1JV9ACXX-5-10          56.1104199342259         10.8167704633456
## AD1NFNACXX-1-20          52.2642327963524         11.8520406039121
## BC1KAVACXX-1-14          52.1173262764685         11.9197420032885
## BC1KAVACXX-8-16          52.6479899386612         11.7668179866764
##                 fastqc_raw.R1_raw_GC_mean fastqc_raw.R1_raw_GC_std
##                               <character>              <character>
## BD1NYRACXX-5-1           50.2351419723759         11.8848917281142
## AD10W1ACXX-4-1           51.8356043001923         11.2889096664749
## BD1NYRACXX-5-2           51.0789575779742         11.8219118904248
## AD10W1ACXX-4-2           52.4489872339274         11.4721063444188
## BD1NYRACXX-5-3           50.9744983067781         11.6161522540398
## ...                                   ...                      ...
## AD1NFNACXX-1-1           50.6593415093687           11.84707799697
## AC1JV9ACXX-5-10           54.909296338238         10.5315750859386
## AD1NFNACXX-1-20          52.0007684863848         11.7174692643988
## BC1KAVACXX-1-14          51.9082951558929         11.7670706654902
## BC1KAVACXX-8-16          52.4404759544934         11.5887318822462
##                 fastqc_clean.R1_clean_GC_mean
##                                   <character>
## BD1NYRACXX-5-1                50.164378341129
## AD10W1ACXX-4-1               51.8220336816596
## BD1NYRACXX-5-2               51.0271786120152
## AD10W1ACXX-4-2               52.4373139201977
## BD1NYRACXX-5-3               50.9028924407951
## ...                                       ...
## AD1NFNACXX-1-1               50.5797529326033
## AC1JV9ACXX-5-10               55.634548872841
## AD1NFNACXX-1-20              51.9339059552363
## BC1KAVACXX-1-14              51.8790750218106
## BC1KAVACXX-8-16              52.3904075032816
##                 fastqc_clean.R2_clean_GC_mean fastqc_clean.R2_clean_GC_std
##                                   <character>                  <character>
## BD1NYRACXX-5-1               50.3968946725711             12.1131258203262
## AD10W1ACXX-4-1               52.0100726108017              11.811466538215
## BD1NYRACXX-5-2               51.1828528749685             12.0540137218588
## AD10W1ACXX-4-2               52.5987985271214             11.9084620443752
## BD1NYRACXX-5-3               51.1013302404673              12.082642631825
## ...                                       ...                          ...
## AD1NFNACXX-1-1               50.7719921815163             12.2156418850006
## AC1JV9ACXX-5-10              55.7497819654327             11.0483150261391
## AD1NFNACXX-1-20              52.1394176386261             11.9282296293526
## BC1KAVACXX-1-14              52.0255556861058             11.9945544323497
## BC1KAVACXX-8-16              52.5096132946266             11.7511126114672
##                 fastqc_clean.R1_clean_GC_std
##                                  <character>
## BD1NYRACXX-5-1              12.2507376336796
## AD10W1ACXX-4-1                 11.7524863028
## BD1NYRACXX-5-2              12.1621853153794
## AD10W1ACXX-4-2              11.8501461812698
## BD1NYRACXX-5-3              12.2171170446126
## ...                                      ...
## AD1NFNACXX-1-1              12.1562080492483
## AC1JV9ACXX-5-10             10.9872330877667
## AD1NFNACXX-1-20             11.8857304444009
## BC1KAVACXX-1-14             11.9727270527311
## BC1KAVACXX-8-16             11.7504095784964
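
In the output above, numeric-looking QC values such as `bam.genome_mapped` are stored as `<character>`, so they need converting before any arithmetic. The following is a minimal sketch on a toy data.frame whose values are copied from the printed output; the helper name `to_numeric_cols` is hypothetical and not part of BIOSRutils.

```r
# Toy stand-in for a few colData columns; values copied from the output above.
qc <- data.frame(
  bam.genome_insert_mean = c("288.648256495878", "294.584736018414"),
  bam.genome_mapped      = c("29537099", "37103536"),
  star.end_time          = c("Oct 10 21:41:26", "Oct 07 17:42:47"),
  row.names              = c("BD1NYRACXX-5-1", "AD10W1ACXX-4-1"),
  stringsAsFactors       = FALSE
)

# Hypothetical helper: convert a character column to numeric only when every
# non-missing value parses as a number; all other columns are left untouched.
to_numeric_cols <- function(df) {
  for (nm in names(df)) {
    x <- df[[nm]]
    if (is.character(x)) {
      parsed <- suppressWarnings(as.numeric(x))
      # Keep the conversion only if it introduced no new missing values.
      if (!any(is.na(parsed) & !is.na(x))) df[[nm]] <- parsed
    }
  }
  df
}

qc_num <- to_numeric_cols(qc)
sapply(qc_num, class)
```

Time stamps such as `star.end_time` stay character because they do not parse as numbers. Applying the same helper to `colData(rnaSeqData)` directly is an assumption that has not been checked against the BIOS data.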

# Genomic ranges of the features, with GC content and length as metadata columns
rowRanges(rnaSeqData)

## GRanges object with 46628 ranges and 2 metadata columns:
##                   seqnames                 ranges strand   |
##                      <Rle>              <IRanges>  <Rle>   |
##   ENSG00000000419    chr20 [ 49551404,  49575092]      -   |
##   ENSG00000000457     chr1 [169818772, 169863408]      -   |
##   ENSG00000000460     chr1 [169631245, 169823221]      +   |
##   ENSG00000000938     chr1 [ 27938575,  27961788]      -   |
##   ENSG00000000971     chr1 [196621008, 196716634]      +   |
##               ...      ...                    ...    ... ...
##   ENSG00000270174     chr6 [  5665218,   5695505]      -   |
##   ENSG00000270177     chr5 [133562101, 133563518]      +   |
##   ENSG00000270178     chr3 [179521851, 179522154]      +   |
##   ENSG00000270182     chr7 [ 27197963,  27198595]      +   |
##   ENSG00000270184    chr16 [ 85817988,  85821223]      +   |
##                                  gc    length
##                           <numeric> <numeric>
##   ENSG00000000419 0.397680198840099      1207
##   ENSG00000000457 0.466715435259693      2734
##   ENSG00000000460 0.430529977491303      4887
##   ENSG00000000938 0.573114565342545      3474
##   ENSG00000000971 0.361493123772102      8144
##               ...               ...       ...
##   ENSG00000270174 0.501240694789082       806
##   ENSG00000270177 0.539492242595205      1418
##   ENSG00000270178             0.375       304
##   ENSG00000270182  0.53870458135861       633
##   ENSG00000270184  0.44267053701016       689
##   -------
##   seqinfo: 93 sequences (1 circular) from hg19 genome
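
The `length` metadata column is not the same as the genomic span of each range. The sketch below checks this for the first three genes, using coordinates and lengths copied from the output above. What `length` measures exactly (for example, total exon length) is not stated here, so treat any interpretation as an assumption.

```r
# Coordinates and 'length' values copied from the rowRanges output above.
genes <- data.frame(
  start  = c(49551404, 169818772, 169631245),
  end    = c(49575092, 169863408, 169823221),
  length = c(1207, 2734, 4887),
  row.names = c("ENSG00000000419", "ENSG00000000457", "ENSG00000000460")
)

# Genomic span of each range (both ends inclusive).
genes$span <- genes$end - genes$start + 1

# Fraction of the span that is accounted for by the 'length' column.
genes$length_fraction <- genes$length / genes$span

genes
```

For all three genes the `length` value is far smaller than the span. Any length-based normalisation should therefore use the `length` column, not the width of the range.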

# Counts for the first five features and the first five samples
assays(rnaSeqData)$counts[1:5, 1:5]

##                 BD1NYRACXX-5-1 AD10W1ACXX-4-1 BD1NYRACXX-5-2
## ENSG00000000419          18910          26042          33868
## ENSG00000000457          22340          26380          30769
## ENSG00000000460           6793           6177           7889
## ENSG00000000938        1129953        1387162        1616590
## ENSG00000000971           8526           5429           8833
##                 AD10W1ACXX-4-2 BD1NYRACXX-5-3
## ENSG00000000419          16600          16277
## ENSG00000000457          17890          19584
## ENSG00000000460           5670           5630
## ENSG00000000938        1488194        1305774
## ENSG00000000971           7549           3290
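
Because sequencing depth differs between samples, counts are usually scaled before samples are compared. Below is a minimal counts-per-million sketch on a toy matrix copied from the output above. The helper `cpm_sketch` is hypothetical. By default it uses the column sums of these five genes as library sizes, which is a toy choice. On the real data a per-sample library size, such as the `lib.size` column of `colData`, could be passed instead.

```r
# First five genes and three samples, copied from the output above.
counts <- matrix(
  c(18910, 22340, 6793, 1129953, 8526,
    26042, 26380, 6177, 1387162, 5429,
    33868, 30769, 7889, 1616590, 8833),
  nrow = 5,
  dimnames = list(
    c("ENSG00000000419", "ENSG00000000457", "ENSG00000000460",
      "ENSG00000000938", "ENSG00000000971"),
    c("BD1NYRACXX-5-1", "AD10W1ACXX-4-1", "BD1NYRACXX-5-2")
  )
)

# Hypothetical helper: counts per million given per-sample library sizes.
cpm_sketch <- function(counts, lib_size = colSums(counts)) {
  stopifnot(length(lib_size) == ncol(counts))
  # sweep() divides every column by its own library size.
  sweep(counts, 2, lib_size, "/") * 1e6
}

cpm <- cpm_sketch(counts)
round(cpm, 1)
```

With the default library sizes, each column of `cpm` sums to one million by construction.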

Exon-level SummarizedExperiment

# Load the exon-level data of freeze 1 and inspect the sample-level metadata
data(rnaSeqData_freeze1_exon_14042015BIOS)
colData(rnaSeqData)

## DataFrame with 2116 rows and 140 columns
##                    group   lib.size norm.factors   rnaseq_run_id
##                 <factor>  <numeric>    <numeric>     <character>
## BD1NYRACXX-5-1     CODAM 1259404830            1  BD1NYRACXX-5-1
## AD10W1ACXX-4-1     CODAM 1632462474            1  AD10W1ACXX-4-1
## BD1NYRACXX-5-2     CODAM 1978420658            1  BD1NYRACXX-5-2
## AD10W1ACXX-4-2     CODAM 1334043187            1  AD10W1ACXX-4-2
## BD1NYRACXX-5-3     CODAM 1222613586            1  BD1NYRACXX-5-3
## ...                  ...        ...          ...             ...
## AD1NFNACXX-1-1        RS 1709905424            1  AD1NFNACXX-1-1
## AC1JV9ACXX-5-10       RS  765091757            1 AC1JV9ACXX-5-10
## AD1NFNACXX-1-20       RS 2327049556            1 AD1NFNACXX-1-20
## BC1KAVACXX-1-14       RS 2401508849            1 BC1KAVACXX-1-14
## BC1KAVACXX-8-16       RS 1710394939            1 BC1KAVACXX-8-16
##                     bios_id         uuid  biobank_id   person_id
##                 <character>  <character> <character> <character>
## BD1NYRACXX-5-1   CODAM-2001 BIOS6DB3BAD1       CODAM        2001
## AD10W1ACXX-4-1   CODAM-2002 BIOSCFA14234       CODAM        2002
## BD1NYRACXX-5-2   CODAM-2009 BIOSCA449668       CODAM        2009
## AD10W1ACXX-4-2   CODAM-2013 BIOS415A8BFB       CODAM        2013
## BD1NYRACXX-5-3   CODAM-2016 BIOSD16ED999       CODAM        2016
## ...                     ...          ...         ...         ...
## AD1NFNACXX-1-1       RS-942 BIOSCC469FF2          RS         942
## AC1JV9ACXX-5-10     RS-9420 BIOSB1058B1B          RS        9420
## AD1NFNACXX-1-20      RS-969 BIOSA2EF6C80          RS         969
## BC1KAVACXX-1-14      RS-982 BIOS027136BA          RS         982
## BC1KAVACXX-8-16      RS-984 BIOSC01C4781          RS         984
##                     nreruns   rnaseq_qc methylation_run_id    pheno_id
##                 <character> <character>        <character> <character>
## BD1NYRACXX-5-1            1           0  8667053102_R05C02        2001
## AD10W1ACXX-4-1            1           0  8667053157_R01C02        2002
## BD1NYRACXX-5-2            1           0  8667053152_R02C02        2009
## AD10W1ACXX-4-2            1           0  8655685053_R04C02        2013
## BD1NYRACXX-5-3            1           0  8655685094_R01C01        2016
## ...                     ...         ...                ...         ...
## AD1NFNACXX-1-1            1           0  8691803030_R05C01         942
## AC1JV9ACXX-5-10           1           0  8691803046_R04C02        9420
## AD1NFNACXX-1-20           1           0  8691803032_R01C01         969
## BC1KAVACXX-1-14           1           0  8454787105_R02C02         982
## BC1KAVACXX-8-16           1           0  8691803032_R06C01         984
##                     gwas_id      dna_id      rna_id     gonl_id
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1         2001        2001        2001          NA
## AD10W1ACXX-4-1         2002        2002        2002          NA
## BD1NYRACXX-5-2         2009        2009        2009          NA
## AD10W1ACXX-4-2         2013        2013        2013          NA
## BD1NYRACXX-5-3         2016        2016        2016          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1          942         942         942          NA
## AC1JV9ACXX-5-10        9420        9420        9420          NA
## AD1NFNACXX-1-20         969         969         969          NA
## BC1KAVACXX-1-14         982         982         982          NA
## BC1KAVACXX-8-16         984         984         984          NA
##                       cg_id      in_rp3 rnaseq_freeze methylation_freeze
##                 <character> <character>   <character>        <character>
## BD1NYRACXX-5-1           NA        TRUE             1                  1
## AD10W1ACXX-4-1           NA        TRUE             1                  1
## BD1NYRACXX-5-2           NA        TRUE             1                  1
## AD10W1ACXX-4-2           NA        TRUE             1                  1
## BD1NYRACXX-5-3           NA        TRUE             1                  1
## ...                     ...         ...           ...                ...
## AD1NFNACXX-1-1           NA        TRUE             1                  1
## AC1JV9ACXX-5-10          NA        TRUE             1                  1
## AD1NFNACXX-1-20          NA        TRUE             1                  1
## BC1KAVACXX-1-14          NA        TRUE             1                  1
## BC1KAVACXX-8-16          NA        TRUE             1                  1
##                 gonlv5imputed
##                   <character>
## BD1NYRACXX-5-1           TRUE
## AD10W1ACXX-4-1           TRUE
## BD1NYRACXX-5-2           TRUE
## AD10W1ACXX-4-2           TRUE
## BD1NYRACXX-5-3           TRUE
## ...                       ...
## AD1NFNACXX-1-1           TRUE
## AC1JV9ACXX-5-10          TRUE
## AD1NFNACXX-1-20          TRUE
## BC1KAVACXX-1-14          TRUE
## BC1KAVACXX-8-16          TRUE
##                                             Ascertainment_criterion
##                                                         <character>
## BD1NYRACXX-5-1  Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-1  Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-2  Selected for mildly increased DM2 /CVD risk factors
## AD10W1ACXX-4-2  Selected for mildly increased DM2 /CVD risk factors
## BD1NYRACXX-5-3  Selected for mildly increased DM2 /CVD risk factors
## ...                                                             ...
## AD1NFNACXX-1-1                                                   NA
## AC1JV9ACXX-5-10                                                  NA
## AD1NFNACXX-1-20                                                  NA
## BC1KAVACXX-1-14                                                  NA
## BC1KAVACXX-8-16                                                  NA
##                                   GWAS_Chip GWAS_DataGeneration_Date
##                                 <character>              <character>
## BD1NYRACXX-5-1  Illumina human omni express                     2012
## AD10W1ACXX-4-1  Illumina human omni express                     2012
## BD1NYRACXX-5-2  Illumina human omni express                     2012
## AD10W1ACXX-4-2  Illumina human omni express                     2012
## BD1NYRACXX-5-3  Illumina human omni express                     2012
## ...                                     ...                      ...
## AD1NFNACXX-1-1                           NA                       NA
## AC1JV9ACXX-5-10                          NA                       NA
## AD1NFNACXX-1-20                          NA                       NA
## BC1KAVACXX-1-14                          NA                       NA
## BC1KAVACXX-8-16                          NA                       NA
##                 DNA_BloodSampling_Age DNA_BloodSampling_Date
##                           <character>            <character>
## BD1NYRACXX-5-1                   77.9             2006-08-08
## AD10W1ACXX-4-1                   70.5             2006-08-09
## BD1NYRACXX-5-2                   66.3             2006-09-14
## AD10W1ACXX-4-2                   76.5             2006-09-26
## BD1NYRACXX-5-3                   71.9             2006-06-07
## ...                               ...                    ...
## AD1NFNACXX-1-1                 70.357             2011-10-05
## AC1JV9ACXX-5-10                51.535             2012-03-29
## AD1NFNACXX-1-20                68.233             2011-09-29
## BC1KAVACXX-1-14                66.379             2011-05-19
## BC1KAVACXX-8-16                68.783             2011-10-04
##                 DNA_BloodSampling_Time               DNA_Source
##                            <character>              <character>
## BD1NYRACXX-5-1                 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-1                 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-2                 8-11 am whole blood (buffy coat)
## AD10W1ACXX-4-2                 8-11 am whole blood (buffy coat)
## BD1NYRACXX-5-3                 8-11 am whole blood (buffy coat)
## ...                                ...                      ...
## AD1NFNACXX-1-1                 9:50:00                       NA
## AC1JV9ACXX-5-10                8:20:00                       NA
## AD1NFNACXX-1-20                9:30:00                       NA
## BC1KAVACXX-1-14               10:25:00                       NA
## BC1KAVACXX-8-16                9:00:00                       NA
##                 DNA_Extraction_Method DNA_Extraction_Date
##                           <character>         <character>
## BD1NYRACXX-5-1     QIAamp DNA minikit          2012-05-01
## AD10W1ACXX-4-1     QIAamp DNA minikit          2012-05-01
## BD1NYRACXX-5-2     QIAamp DNA minikit          2012-05-01
## AD10W1ACXX-4-2     QIAamp DNA minikit          2012-05-01
## BD1NYRACXX-5-3     QIAamp DNA minikit          2012-05-01
## ...                               ...                 ...
## AD1NFNACXX-1-1                     NA                  NA
## AC1JV9ACXX-5-10                    NA                  NA
## AD1NFNACXX-1-20                    NA                  NA
## BC1KAVACXX-1-14                    NA                  NA
## BC1KAVACXX-8-16                    NA                  NA
##                 DNA_QuantificationMethod DNA_A260A280ratio
##                              <character>       <character>
## BD1NYRACXX-5-1                  nanodrop               1.9
## AD10W1ACXX-4-1                  nanodrop              1.92
## BD1NYRACXX-5-2                  nanodrop              1.89
## AD10W1ACXX-4-2                  nanodrop              1.89
## BD1NYRACXX-5-3                  nanodrop              1.89
## ...                                  ...               ...
## AD1NFNACXX-1-1                        NA                NA
## AC1JV9ACXX-5-10                       NA                NA
## AD1NFNACXX-1-20                       NA                NA
## BC1KAVACXX-1-14                       NA                NA
## BC1KAVACXX-8-16                       NA                NA
##                 RNA_BloodSampling_Age RNA_Sampling_Date RNA_Sampling_Time
##                           <character>       <character>       <character>
## BD1NYRACXX-5-1                   77.9        2006-08-08           8-11 am
## AD10W1ACXX-4-1                   70.5        2006-08-09           8-11 am
## BD1NYRACXX-5-2                   66.3        2006-09-14           8-11 am
## AD10W1ACXX-4-2                   76.5        2006-09-26           8-11 am
## BD1NYRACXX-5-3                   71.9        2006-06-07           8-11 am
## ...                               ...               ...               ...
## AD1NFNACXX-1-1                 70.357        2011-10-05           9:50:00
## AC1JV9ACXX-5-10                51.535        2012-03-29           8:20:00
## AD1NFNACXX-1-20                68.233        2011-09-29           9:30:00
## BC1KAVACXX-1-14                66.379        2011-05-19          10:25:00
## BC1KAVACXX-8-16                68.783        2011-10-04           9:00:00
##                  RNA_Source RNA_Extraction_Date
##                 <character>         <character>
## BD1NYRACXX-5-1     PAX gene          2010-07-01
## AD10W1ACXX-4-1     PAX gene          2010-07-01
## BD1NYRACXX-5-2     PAX gene          2010-07-01
## AD10W1ACXX-4-2     PAX gene          2010-07-01
## BD1NYRACXX-5-3     PAX gene          2010-07-01
## ...                     ...                 ...
## AD1NFNACXX-1-1           NA                  NA
## AC1JV9ACXX-5-10          NA                  NA
## AD1NFNACXX-1-20          NA                  NA
## BC1KAVACXX-1-14          NA                  NA
## BC1KAVACXX-8-16          NA                  NA
##                             RNA_Extraction_Method     RNA_RIN
##                                       <character> <character>
## BD1NYRACXX-5-1  PAXgene blood miRNA kit (Qiacube)         9.1
## AD10W1ACXX-4-1  PAXgene blood miRNA kit (Qiacube)           9
## BD1NYRACXX-5-2  PAXgene blood miRNA kit (Qiacube)           9
## AD10W1ACXX-4-2  PAXgene blood miRNA kit (Qiacube)         8.8
## BD1NYRACXX-5-3  PAXgene blood miRNA kit (Qiacube)           9
## ...                                           ...         ...
## AD1NFNACXX-1-1                                 NA       8.539
## AC1JV9ACXX-5-10                                NA      8.1775
## AD1NFNACXX-1-20                                NA      8.1436
## BC1KAVACXX-1-14                                NA         8.5
## BC1KAVACXX-8-16                                NA      8.7492
##                 RNA_A260280ratio   BirthYear         Sex Smoking_Age
##                      <character> <character> <character> <character>
## BD1NYRACXX-5-1                 2        1928           0        77.9
## AD10W1ACXX-4-1                 2        1936           1        70.5
## BD1NYRACXX-5-2               2.2        1940           0        66.3
## AD10W1ACXX-4-2               2.2        1930           0        76.5
## BD1NYRACXX-5-3               2.1        1934           0        71.9
## ...                          ...         ...         ...         ...
## AD1NFNACXX-1-1                NA        1941           0          NA
## AC1JV9ACXX-5-10               NA        1960           0          NA
## AD1NFNACXX-1-20               NA        1943           1          NA
## BC1KAVACXX-1-14               NA        1944           0          NA
## BC1KAVACXX-8-16               NA        1942           1          NA
##                     Smoking Lipids_BloodSampling_Age
##                 <character>              <character>
## BD1NYRACXX-5-1            1                     77.9
## AD10W1ACXX-4-1            0                     70.5
## BD1NYRACXX-5-2            2                     66.3
## AD10W1ACXX-4-2            1                     76.5
## BD1NYRACXX-5-3            1                     71.9
## ...                     ...                      ...
## AD1NFNACXX-1-1           NA                   70.357
## AC1JV9ACXX-5-10          NA                   51.535
## AD1NFNACXX-1-20          NA                   68.233
## BC1KAVACXX-1-14          NA                   66.379
## BC1KAVACXX-8-16          NA                   68.783
##                 Lipids_BloodSampling_Date Lipids_BloodSampling_Time
##                               <character>               <character>
## BD1NYRACXX-5-1                 2006-08-08                   8-11 am
## AD10W1ACXX-4-1                 2006-08-09                   8-11 am
## BD1NYRACXX-5-2                 2006-09-14                   8-11 am
## AD10W1ACXX-4-2                 2006-09-26                   8-11 am
## BD1NYRACXX-5-3                 2006-06-07                   8-11 am
## ...                                   ...                       ...
## AD1NFNACXX-1-1                 2011-10-05                   9:50:00
## AC1JV9ACXX-5-10                2012-03-29                   8:20:00
## AD1NFNACXX-1-20                2011-09-29                   9:30:00
## BC1KAVACXX-1-14                2011-05-19                  10:25:00
## BC1KAVACXX-8-16                2011-10-04                   9:00:00
##                 Lipids_BloodSampling_Fasting     TotChol     HDLchol
##                                  <character> <character> <character>
## BD1NYRACXX-5-1                             1         5.6        1.28
## AD10W1ACXX-4-1                             1         4.3        1.24
## BD1NYRACXX-5-2                             1         5.4         1.4
## AD10W1ACXX-4-2                             1           6        1.08
## BD1NYRACXX-5-3                             1         5.7        1.22
## ...                                      ...         ...         ...
## AD1NFNACXX-1-1                             1         4.1         1.6
## AC1JV9ACXX-5-10                            1         5.9        1.63
## AD1NFNACXX-1-20                            1         6.6        2.22
## BC1KAVACXX-1-14                            1         5.3        0.97
## BC1KAVACXX-8-16                            1         5.9         1.7
##                 Triglycerides     LDLchol LDLcholMethod LipidsMed_Age
##                   <character> <character>   <character>   <character>
## BD1NYRACXX-5-1            1.5          NA            NA            NA
## AD10W1ACXX-4-1            1.1          NA            NA            NA
## BD1NYRACXX-5-2            0.7          NA            NA            NA
## AD10W1ACXX-4-2            2.1          NA            NA            NA
## BD1NYRACXX-5-3              1          NA            NA            NA
## ...                       ...         ...           ...           ...
## AD1NFNACXX-1-1           0.91          NA            NA            NA
## AC1JV9ACXX-5-10          0.92          NA            NA            NA
## AD1NFNACXX-1-20          1.22          NA            NA            NA
## BC1KAVACXX-1-14           1.4          NA            NA            NA
## BC1KAVACXX-8-16          0.65          NA            NA            NA
##                    LipidMed Anthropometry_Age      Height      Weight
##                 <character>       <character> <character> <character>
## BD1NYRACXX-5-1            1              77.9       175.5       76.25
## AD10W1ACXX-4-1            1              70.5         166       116.6
## BD1NYRACXX-5-2            0              66.3         170          83
## AD10W1ACXX-4-2            0              76.5         172        86.3
## BD1NYRACXX-5-3            0              71.9       174.5       74.75
## ...                     ...               ...         ...         ...
## AD1NFNACXX-1-1           NA                NA         172        87.5
## AC1JV9ACXX-5-10          NA                NA         180        99.9
## AD1NFNACXX-1-20          NA                NA         162        66.7
## BC1KAVACXX-1-14          NA                NA       183.7        84.3
## BC1KAVACXX-8-16          NA                NA       162.7        73.3
##                 CRP_BloodSampling_Age CRP_BloodSampling_Date
##                           <character>            <character>
## BD1NYRACXX-5-1                   77.9             2006-08-08
## AD10W1ACXX-4-1                   70.5             2006-08-09
## BD1NYRACXX-5-2                   66.3             2006-09-14
## AD10W1ACXX-4-2                   76.5             2006-09-26
## BD1NYRACXX-5-3                   71.9             2006-06-07
## ...                               ...                    ...
## AD1NFNACXX-1-1                     NA                     NA
## AC1JV9ACXX-5-10                    NA                     NA
## AD1NFNACXX-1-20                    NA                     NA
## BC1KAVACXX-1-14                    NA                     NA
## BC1KAVACXX-8-16                    NA                     NA
##                 CRP_BloodSampling_Time       hsCRP
##                            <character> <character>
## BD1NYRACXX-5-1                      NA        0.95
## AD10W1ACXX-4-1                      NA        4.61
## BD1NYRACXX-5-2                      NA        0.78
## AD10W1ACXX-4-2                      NA        8.48
## BD1NYRACXX-5-3                      NA        0.94
## ...                                ...         ...
## AD1NFNACXX-1-1                      NA          NA
## AC1JV9ACXX-5-10                     NA          NA
## AD1NFNACXX-1-20                     NA          NA
## BC1KAVACXX-1-14                     NA          NA
## BC1KAVACXX-8-16                     NA          NA
##                 CellCount_BloodSampling_Age CellCount_BloodSampling_Date
##                                 <character>                  <character>
## BD1NYRACXX-5-1                           NA                           NA
## AD10W1ACXX-4-1                           NA                           NA
## BD1NYRACXX-5-2                           NA                           NA
## AD10W1ACXX-4-2                           NA                           NA
## BD1NYRACXX-5-3                           NA                           NA
## ...                                     ...                          ...
## AD1NFNACXX-1-1                       70.357                   2011-10-05
## AC1JV9ACXX-5-10                      51.535                   2012-03-29
## AD1NFNACXX-1-20                      68.233                   2011-09-29
## BC1KAVACXX-1-14                      66.379                   2011-05-19
## BC1KAVACXX-8-16                      68.783                   2011-10-04
##                 CellCount_BloodSampling_Time         WBC         RBC
##                                  <character> <character> <character>
## BD1NYRACXX-5-1                            NA          NA          NA
## AD10W1ACXX-4-1                            NA          NA          NA
## BD1NYRACXX-5-2                            NA          NA          NA
## AD10W1ACXX-4-2                            NA          NA          NA
## BD1NYRACXX-5-3                            NA          NA          NA
## ...                                      ...         ...         ...
## AD1NFNACXX-1-1                       9:50:00           8        4.82
## AC1JV9ACXX-5-10                      8:20:00         6.4        5.02
## AD1NFNACXX-1-20                      9:30:00         5.5        4.32
## BC1KAVACXX-1-14                     10:25:00         6.5        5.21
## BC1KAVACXX-8-16                      9:00:00         5.6        4.55
##                         HGB         HCT         MCV         MCH
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1          8.9        0.43          90        1.85
## AC1JV9ACXX-5-10         9.8        0.46        92.5        1.94
## AD1NFNACXX-1-20         8.3         0.4        93.4        1.92
## BC1KAVACXX-1-14         9.4        0.45        87.1        1.81
## BC1KAVACXX-8-16         8.5        0.42        93.2        1.86
##                        MCHC        CHCM          CH         RDW
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1         20.5          NA          NA          NA
## AC1JV9ACXX-5-10          21          NA          NA          NA
## AD1NFNACXX-1-20        20.6          NA          NA          NA
## BC1KAVACXX-1-14        20.8          NA          NA          NA
## BC1KAVACXX-8-16          20          NA          NA          NA
##                         HDW         PLT         MPV        Neut
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA         248           8          NA
## AC1JV9ACXX-5-10          NA         345         6.7          NA
## AD1NFNACXX-1-20          NA         241         7.1          NA
## BC1KAVACXX-1-14          NA         265         7.4          NA
## BC1KAVACXX-8-16          NA         225         7.6          NA
##                       Lymph        Mono         Eos        Baso
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA          NA          NA          NA
## AC1JV9ACXX-5-10          NA          NA          NA          NA
## AD1NFNACXX-1-20          NA          NA          NA          NA
## BC1KAVACXX-1-14          NA          NA          NA          NA
## BC1KAVACXX-8-16          NA          NA          NA          NA
##                         LUC   Neut_Perc  Lymph_Perc   Mono_Perc
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA          NA
## AD10W1ACXX-4-1           NA          NA          NA          NA
## BD1NYRACXX-5-2           NA          NA          NA          NA
## AD10W1ACXX-4-2           NA          NA          NA          NA
## BD1NYRACXX-5-3           NA          NA          NA          NA
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA          NA        42.6         8.5
## AC1JV9ACXX-5-10          NA          NA        31.9         7.9
## AD1NFNACXX-1-20          NA          NA        29.9         8.7
## BC1KAVACXX-1-14          NA          NA        37.2         3.9
## BC1KAVACXX-8-16          NA          NA        41.6         9.7
##                    Eos_Perc   Baso_Perc    LUC_Perc  run_number
##                 <character> <character> <character> <character>
## BD1NYRACXX-5-1           NA          NA          NA         125
## AD10W1ACXX-4-1           NA          NA          NA         234
## BD1NYRACXX-5-2           NA          NA          NA         125
## AD10W1ACXX-4-2           NA          NA          NA         234
## BD1NYRACXX-5-3           NA          NA          NA         125
## ...                     ...         ...         ...         ...
## AD1NFNACXX-1-1           NA          NA          NA         124
## AC1JV9ACXX-5-10          NA          NA          NA         243
## AD1NFNACXX-1-20          NA          NA          NA         124
## BC1KAVACXX-1-14          NA          NA          NA         123
## BC1KAVACXX-8-16          NA          NA          NA         123
##                 flowcell_number     machine         raw past_filter
##                     <character> <character> <character> <character>
## BD1NYRACXX-5-1               2b      SN1013    40434000    32052000
## AD10W1ACXX-4-1               3a       SN505    46622000    41461000
## BD1NYRACXX-5-2               2b      SN1013    61132000    48674000
## AD10W1ACXX-4-2               3a       SN505    36635000    33116000
## BD1NYRACXX-5-3               2b      SN1013    40919000    31762000
## ...                         ...         ...         ...         ...
## AD1NFNACXX-1-1               2a      SN1013    50351000    43968000
## AC1JV9ACXX-5-10             10a       SN505  23,696,872  21,347,978
## AD1NFNACXX-1-20              2a      SN1013    66222000    58360000
## BC1KAVACXX-1-14              1b      SN1013    63089000    57536000
## BC1KAVACXX-8-16              1b      SN1013    44481000    41267000
##                        date insert_size star.avg_deletion_length
##                 <character> <character>              <character>
## BD1NYRACXX-5-1   2013-03-29         325                     1.47
## AD10W1ACXX-4-1   2013-04-17         313                     1.42
## BD1NYRACXX-5-2   2013-03-29         325                     1.46
## AD10W1ACXX-4-2   2013-04-17         305                     1.44
## BD1NYRACXX-5-3   2013-03-29         308                     1.43
## ...                     ...         ...                      ...
## AD1NFNACXX-1-1   2013-03-29         298                     1.47
## AC1JV9ACXX-5-10  2013-07-09         304                     1.58
## AD1NFNACXX-1-20  2013-03-29         325                     1.47
## BC1KAVACXX-1-14  2013-03-19         326                     1.43
## BC1KAVACXX-8-16  2013-03-19         314                     1.45
##                 star.start_mapping_time star.pct_unique_mapped
##                             <character>            <character>
## BD1NYRACXX-5-1          Oct 10 21:39:51                  93.19
## AD10W1ACXX-4-1          Oct 07 17:40:58                   92.4
## BD1NYRACXX-5-2          Oct 11 00:56:26                  92.49
## AD10W1ACXX-4-2          Oct 16 20:38:57                  92.63
## BD1NYRACXX-5-3          Oct 10 21:55:25                   92.9
## ...                                 ...                    ...
## AD1NFNACXX-1-1          Oct 08 10:43:53                  92.68
## AC1JV9ACXX-5-10         Oct 06 12:24:09                  89.53
## AD1NFNACXX-1-20         Oct 09 03:56:44                   92.2
## BC1KAVACXX-1-14         Oct 10 05:01:36                  93.06
## BC1KAVACXX-8-16         Oct 10 08:05:07                  90.16
##                 star.num_unique_mapped star.num_splice_annotated
##                            <character>               <character>
## BD1NYRACXX-5-1                13384007                   3603631
## AD10W1ACXX-4-1                16673749                   4812839
## BD1NYRACXX-5-2                20220194                   5536547
## AD10W1ACXX-4-2                13579658                   3876835
## BD1NYRACXX-5-3                12571418                   3404880
## ...                                ...                       ...
## AD1NFNACXX-1-1                18203944                   4744204
## AC1JV9ACXX-5-10                8140547                   2414167
## AD1NFNACXX-1-20               24617486                   6577238
## BC1KAVACXX-1-14               24967281                   6457378
## BC1KAVACXX-8-16               17642273                   4919424
##                 star.num_splice_noncanonical star.pct_unmapped_other
##                                  <character>             <character>
## BD1NYRACXX-5-1                          3291                    0.06
## AD10W1ACXX-4-1                          4690                    0.05
## BD1NYRACXX-5-2                          5662                    0.06
## AD10W1ACXX-4-2                          4701                    0.05
## BD1NYRACXX-5-3                          3997                    0.05
## ...                                      ...                     ...
## AD1NFNACXX-1-1                          5975                    0.07
## AC1JV9ACXX-5-10                         3276                    0.05
## AD1NFNACXX-1-20                         6869                    0.06
## BC1KAVACXX-1-14                         7585                    0.05
## BC1KAVACXX-8-16                         5405                    0.06
##                 star.num_splice_total star.num_splice_atac
##                           <character>          <character>
## BD1NYRACXX-5-1                3631134                 3204
## AD10W1ACXX-4-1                4852111                 4039
## BD1NYRACXX-5-2                5584443                 4996
## AD10W1ACXX-4-2                3909894                 3384
## BD1NYRACXX-5-3                3433262                 3033
## ...                               ...                  ...
## AD1NFNACXX-1-1                4783352                 4325
## AC1JV9ACXX-5-10               2438071                 2838
## AD1NFNACXX-1-20               6630389                 5836
## BC1KAVACXX-1-14               6511885                 5826
## BC1KAVACXX-8-16               4962823                 4477
##                 star.num_splice_gcag star.num_input
##                          <character>    <character>
## BD1NYRACXX-5-1                 24928       14362274
## AD10W1ACXX-4-1                 32581       18044680
## BD1NYRACXX-5-2                 37530       21861429
## AD10W1ACXX-4-2                 27794       14660267
## BD1NYRACXX-5-3                 23072       13532066
## ...                              ...            ...
## AD1NFNACXX-1-1                 31893       19641011
## AC1JV9ACXX-5-10                18193        9092744
## AD1NFNACXX-1-20                46290       26699100
## BC1KAVACXX-1-14                44465       26828880
## BC1KAVACXX-8-16                33675       19567453
##                 star.rate_deletion_per_base star.pct_mapped_multiple
##                                 <character>              <character>
## BD1NYRACXX-5-1                            0                     3.92
## AD10W1ACXX-4-1                            0                     4.25
## BD1NYRACXX-5-2                            0                     4.47
## AD10W1ACXX-4-2                            0                     4.12
## BD1NYRACXX-5-3                            0                     4.11
## ...                                     ...                      ...
## AD1NFNACXX-1-1                            0                     4.19
## AC1JV9ACXX-5-10                           0                     3.69
## AD1NFNACXX-1-20                           0                     3.55
## BC1KAVACXX-1-14                           0                     3.87
## BC1KAVACXX-8-16                           0                      3.8
##                 star.rate_mismatch_per_base star.start_job_time
##                                 <character>         <character>
## BD1NYRACXX-5-1                         0.23     Oct 10 21:38:54
## AD10W1ACXX-4-1                         0.22     Oct 07 17:39:59
## BD1NYRACXX-5-2                         0.25     Oct 11 00:55:27
## AD10W1ACXX-4-2                         0.22     Oct 16 20:29:47
## BD1NYRACXX-5-3                         0.25     Oct 10 21:53:03
## ...                                     ...                 ...
## AD1NFNACXX-1-1                          0.2     Oct 08 10:41:20
## AC1JV9ACXX-5-10                        0.26     Oct 06 12:22:56
## AD1NFNACXX-1-20                         0.2     Oct 09 03:55:17
## BC1KAVACXX-1-14                        0.21     Oct 10 04:59:50
## BC1KAVACXX-8-16                        0.19     Oct 10 08:02:10
##                 star.pct_unmapped_short star.mapping_speed
##                             <character>        <character>
## BD1NYRACXX-5-1                     2.56             544.25
## AD10W1ACXX-4-1                     3.02             595.97
## BD1NYRACXX-5-2                     2.75             554.23
## AD10W1ACXX-4-2                     2.95                593
## BD1NYRACXX-5-3                     2.74             624.56
## ...                                 ...                ...
## AD1NFNACXX-1-1                     2.73             625.73
## AC1JV9ACXX-5-10                    6.35              503.6
## AD1NFNACXX-1-20                    3.85             279.41
## BC1KAVACXX-1-14                     2.7             555.08
## BC1KAVACXX-8-16                    5.71             489.19
##                 star.avg_insertion_length star.pct_mapped_many
##                               <character>          <character>
## BD1NYRACXX-5-1                        1.2                 0.26
## AD10W1ACXX-4-1                       1.19                 0.28
## BD1NYRACXX-5-2                        1.2                 0.23
## AD10W1ACXX-4-2                       1.19                 0.24
## BD1NYRACXX-5-3                        1.2                  0.2
## ...                                   ...                  ...
## AD1NFNACXX-1-1                        1.2                 0.32
## AC1JV9ACXX-5-10                      1.21                 0.38
## AD1NFNACXX-1-20                       1.2                 0.34
## BC1KAVACXX-1-14                      1.19                 0.32
## BC1KAVACXX-8-16                      1.19                 0.28
##                 star.rate_insertion_per_base star.num_splice_gtag
##                                  <character>          <character>
## BD1NYRACXX-5-1                          0.01              3599711
## AD10W1ACXX-4-1                          0.01              4810801
## BD1NYRACXX-5-2                          0.01              5536255
## AD10W1ACXX-4-2                          0.01              3874015
## BD1NYRACXX-5-3                          0.01              3403160
## ...                                      ...                  ...
## AD1NFNACXX-1-1                          0.01              4741159
## AC1JV9ACXX-5-10                            0              2413764
## AD1NFNACXX-1-20                         0.01              6571394
## BC1KAVACXX-1-14                         0.01              6454009
## BC1KAVACXX-8-16                         0.01              4919266
##                 star.num_mapped_many star.num_mapped_multiple
##                          <character>              <character>
## BD1NYRACXX-5-1                 38000                   563715
## AD10W1ACXX-4-1                 50826                   766081
## BD1NYRACXX-5-2                 49664                   976688
## AD10W1ACXX-4-2                 35799                   604307
## BD1NYRACXX-5-3                 27346                   555867
## ...                              ...                      ...
## AD1NFNACXX-1-1                 63747                   823711
## AC1JV9ACXX-5-10                34925                   335743
## AD1NFNACXX-1-20                91494                   947493
## BC1KAVACXX-1-14                85467                  1037749
## BC1KAVACXX-8-16                54001                   742645
##                 star.avg_mapped_length star.avg_input_length
##                            <character>           <character>
## BD1NYRACXX-5-1                   96.67                    97
## AD10W1ACXX-4-1                   98.17                    98
## BD1NYRACXX-5-2                   96.68                    97
## AD10W1ACXX-4-2                   98.17                    98
## BD1NYRACXX-5-3                   96.46                    96
## ...                                ...                   ...
## AD1NFNACXX-1-1                   98.17                    98
## AC1JV9ACXX-5-10                  97.49                    98
## AD1NFNACXX-1-20                  98.15                    98
## BC1KAVACXX-1-14                  98.33                    98
## BC1KAVACXX-8-16                  98.48                    98
##                   star.end_time star.pct_unmapped_mismatch
##                     <character>                <character>
## BD1NYRACXX-5-1  Oct 10 21:41:26                          0
## AD10W1ACXX-4-1  Oct 07 17:42:47                          0
## BD1NYRACXX-5-2  Oct 11 00:58:48                          0
## AD10W1ACXX-4-2  Oct 16 20:40:26                          0
## BD1NYRACXX-5-3  Oct 10 21:56:43                          0
## ...                         ...                        ...
## AD1NFNACXX-1-1  Oct 08 10:45:46                          0
## AC1JV9ACXX-5-10 Oct 06 12:25:14                          0
## AD1NFNACXX-1-20 Oct 09 04:02:28                          0
## BC1KAVACXX-1-14 Oct 10 05:04:30                          0
## BC1KAVACXX-8-16 Oct 10 08:07:31                          0
##                 bam.genome_insert_mean bam.genome_insert_std
##                            <character>           <character>
## BD1NYRACXX-5-1        288.648256495878      234.108811011201
## AD10W1ACXX-4-1        294.584736018414      245.447514390202
## BD1NYRACXX-5-2        286.413248406994      230.288439069619
## AD10W1ACXX-4-2        291.046940453097      240.342521371492
## BD1NYRACXX-5-3        275.199058717648      216.256191245783
## ...                                ...                   ...
## AD1NFNACXX-1-1        243.377767271685      174.593146049674
## AC1JV9ACXX-5-10        270.98717085555      234.077479869603
## AD1NFNACXX-1-20       271.426005179576      206.029383007847
## BC1KAVACXX-1-14       262.341694194699      188.846941324494
## BC1KAVACXX-8-16       281.656613101494      222.606885771599
##                 bam.genome_duplicates bam.exon_duplicates bam.exon_mapped
##                           <character>         <character>     <character>
## BD1NYRACXX-5-1                2661291                   0        26186566
## AD10W1ACXX-4-1                4967141                   0        33376848
## BD1NYRACXX-5-2                7305149                   0        41080270
## AD10W1ACXX-4-2                3560547                   0        27278995
## BD1NYRACXX-5-3                3668799                   0        25458963
## ...                               ...                 ...             ...
## AD1NFNACXX-1-1                6078591                   0        34995158
## AC1JV9ACXX-5-10               5175009                   0        15785461
## AD1NFNACXX-1-20               8338397                   0        47655190
## BC1KAVACXX-1-14               8125608                   0        49057971
## BC1KAVACXX-8-16               5190439                   0        34874136
##                 bam.genome_total bam.genome_mapped bam.exon_total
##                      <character>       <character>    <character>
## BD1NYRACXX-5-1          30369701          29537099       26186566
## AD10W1ACXX-4-1          38317327          37103536       33376848
## BD1NYRACXX-5-2          46554232          45220039       41080270
## AD10W1ACXX-4-2          31054670          30098674       27278995
## BD1NYRACXX-5-3          28654414          27841451       25458963
## ...                          ...               ...            ...
## AD1NFNACXX-1-1          41723982          40491606       34995158
## AC1JV9ACXX-5-10         19163800          17928954       15785461
## AD1NFNACXX-1-20         56174753          53899515       47655190
## BC1KAVACXX-1-14         56673162          55019984       49057971
## BC1KAVACXX-8-16         41274638          38906239       34874136
##                 fastqc_raw.R2_raw_GC_mean fastqc_raw.R2_raw_GC_std
##                               <character>              <character>
## BD1NYRACXX-5-1           50.6134742425783         11.9436168503436
## AD10W1ACXX-4-1           52.5497727984934         11.6764709027634
## BD1NYRACXX-5-2           51.3511595055147         11.8771215708693
## AD10W1ACXX-4-2           52.9603163325893         11.7375629125041
## BD1NYRACXX-5-3           51.6023001132948         11.7728320915493
## ...                                   ...                      ...
## AD1NFNACXX-1-1           51.1126529412655         12.1481063937884
## AC1JV9ACXX-5-10          56.1104199342259         10.8167704633456
## AD1NFNACXX-1-20          52.2642327963524         11.8520406039121
## BC1KAVACXX-1-14          52.1173262764685         11.9197420032885
## BC1KAVACXX-8-16          52.6479899386612         11.7668179866764
##                 fastqc_raw.R1_raw_GC_mean fastqc_raw.R1_raw_GC_std
##                               <character>              <character>
## BD1NYRACXX-5-1           50.2351419723759         11.8848917281142
## AD10W1ACXX-4-1           51.8356043001923         11.2889096664749
## BD1NYRACXX-5-2           51.0789575779742         11.8219118904248
## AD10W1ACXX-4-2           52.4489872339274         11.4721063444188
## BD1NYRACXX-5-3           50.9744983067781         11.6161522540398
## ...                                   ...                      ...
## AD1NFNACXX-1-1           50.6593415093687           11.84707799697
## AC1JV9ACXX-5-10           54.909296338238         10.5315750859386
## AD1NFNACXX-1-20          52.0007684863848         11.7174692643988
## BC1KAVACXX-1-14          51.9082951558929         11.7670706654902
## BC1KAVACXX-8-16          52.4404759544934         11.5887318822462
##                 fastqc_clean.R1_clean_GC_mean
##                                   <character>
## BD1NYRACXX-5-1                50.164378341129
## AD10W1ACXX-4-1               51.8220336816596
## BD1NYRACXX-5-2               51.0271786120152
## AD10W1ACXX-4-2               52.4373139201977
## BD1NYRACXX-5-3               50.9028924407951
## ...                                       ...
## AD1NFNACXX-1-1               50.5797529326033
## AC1JV9ACXX-5-10               55.634548872841
## AD1NFNACXX-1-20              51.9339059552363
## BC1KAVACXX-1-14              51.8790750218106
## BC1KAVACXX-8-16              52.3904075032816
##                 fastqc_clean.R2_clean_GC_mean fastqc_clean.R2_clean_GC_std
##                                   <character>                  <character>
## BD1NYRACXX-5-1               50.3968946725711             12.1131258203262
## AD10W1ACXX-4-1               52.0100726108017              11.811466538215
## BD1NYRACXX-5-2               51.1828528749685             12.0540137218588
## AD10W1ACXX-4-2               52.5987985271214             11.9084620443752
## BD1NYRACXX-5-3               51.1013302404673              12.082642631825
## ...                                       ...                          ...
## AD1NFNACXX-1-1               50.7719921815163             12.2156418850006
## AC1JV9ACXX-5-10              55.7497819654327             11.0483150261391
## AD1NFNACXX-1-20              52.1394176386261             11.9282296293526
## BC1KAVACXX-1-14              52.0255556861058             11.9945544323497
## BC1KAVACXX-8-16              52.5096132946266             11.7511126114672
##                 fastqc_clean.R1_clean_GC_std
##                                  <character>
## BD1NYRACXX-5-1              12.2507376336796
## AD10W1ACXX-4-1                 11.7524863028
## BD1NYRACXX-5-2              12.1621853153794
## AD10W1ACXX-4-2              11.8501461812698
## BD1NYRACXX-5-3              12.2171170446126
## ...                                      ...
## AD1NFNACXX-1-1              12.1562080492483
## AC1JV9ACXX-5-10             10.9872330877667
## AD1NFNACXX-1-20             11.8857304444009
## BC1KAVACXX-1-14             11.9727270527311
## BC1KAVACXX-8-16             11.7504095784964
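
Every column in the output above is reported as `<character>`, including measurements such as `HGB` or `insert_size`, and at least one `raw` value (`23,696,872`) carries thousands separators. The following is a minimal sketch, not part of BIOSRutils, of how such columns could be coerced to numeric before analysis. It uses a small toy `data.frame` with a few values copied from the output; the helper name `to_numeric_if_possible` is hypothetical.

```r
# Toy stand-in for a few phenotype columns; values copied from the output above.
pheno <- data.frame(
  machine     = c("SN1013", "SN505", "SN505"),
  raw         = c("40434000", "46622000", "23,696,872"),
  insert_size = c("325", "313", "304"),
  HGB         = c(NA, NA, "9.8"),
  stringsAsFactors = FALSE,
  row.names = c("BD1NYRACXX-5-1", "AD10W1ACXX-4-1", "AC1JV9ACXX-5-10")
)

# Hypothetical helper: convert a character vector to numeric only when every
# non-missing value parses as a number after removing thousands separators.
# Vectors that contain genuine text (e.g. machine names) are returned unchanged.
to_numeric_if_possible <- function(x) {
  if (!is.character(x)) return(x)
  cleaned <- gsub(",", "", x, fixed = TRUE)
  parsed <- suppressWarnings(as.numeric(cleaned))
  # Conversion is safe only if it introduced no new missing values.
  if (all(is.na(parsed) == is.na(x))) parsed else x
}

# Apply column by column while keeping the data.frame structure and row names.
pheno[] <- lapply(pheno, to_numeric_if_possible)
str(pheno)
```

For the real object, one option would be to start from `as.data.frame(colData(rnaSeqData))` and apply the same helper. Date and time columns such as `date` or `star.start_job_time` do not parse as numbers and are therefore left as character by this sketch.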

rowRanges(rnaSeqData)

## GRanges object with 303544 ranges and 5 metadata columns:
##                                                                                                                                   seqnames
##                                                                                                                                      <Rle>
##                                                                                                   ENSE00001544499,ENSE00001544501     chrM
##                                                                                                   ENSE00001544498,ENSE00001544497     chrM
##                                                                                                                   ENSE00002006242     chrM
##                                                                                                                   ENSE00001435714     chrM
##                                                                                                   ENSE00001544494,ENSE00001993597     chrM
##                                                                                                                               ...      ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197    chr22
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865    chr22
##                                                                                                                   ENSE00002513195    chr22
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642    chr22
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668    chr22
##                                                                                                                                                 ranges
##                                                                                                                                              <IRanges>
##                                                                                                   ENSE00001544499,ENSE00001544501         [ 577, 1601]
##                                                                                                   ENSE00001544498,ENSE00001544497         [1602, 3229]
##                                                                                                                   ENSE00002006242         [3230, 3304]
##                                                                                                                   ENSE00001435714         [3307, 4262]
##                                                                                                   ENSE00001544494,ENSE00001993597         [4263, 4400]
##                                                                                                                               ...                  ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 [51221929, 51222091]
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 [51222185, 51222500]
##                                                                                                                   ENSE00002513195 [51223601, 51223721]
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 [51227178, 51227781]
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 [51237083, 51239737]
##                                                                                                                                   strand
##                                                                                                                                    <Rle>
##                                                                                                   ENSE00001544499,ENSE00001544501      +
##                                                                                                   ENSE00001544498,ENSE00001544497      +
##                                                                                                                   ENSE00002006242      +
##                                                                                                                   ENSE00001435714      +
##                                                                                                   ENSE00001544494,ENSE00001993597      +
##                                                                                                                               ...    ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197      -
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865      +
##                                                                                                                   ENSE00002513195      +
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642      +
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668      +
##                                                                                                                                     |
##                                                                                                                                     |
##                                                                                                   ENSE00001544499,ENSE00001544501   |
##                                                                                                   ENSE00001544498,ENSE00001544497   |
##                                                                                                                   ENSE00002006242   |
##                                                                                                                   ENSE00001435714   |
##                                                                                                   ENSE00001544494,ENSE00001993597   |
##                                                                                                                               ... ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197   |
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865   |
##                                                                                                                   ENSE00002513195   |
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642   |
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668   |
##                                                                                                                                         ENSEMBL
##                                                                                                                                     <character>
##                                                                                                   ENSE00001544499,ENSE00001544501 MT-TF,MT-RNR1
##                                                                                                   ENSE00001544498,ENSE00001544497 MT-TV,MT-RNR2
##                                                                                                                   ENSE00002006242        MT-TL1
##                                                                                                                   ENSE00001435714        MT-ND1
##                                                                                                   ENSE00001544494,ENSE00001993597   MT-TQ,MT-TI
##                                                                                                                               ...           ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197        RABL2B
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865     RPL23AP82
##                                                                                                                   ENSE00002513195     RPL23AP82
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642     RPL23AP82
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668     RPL23AP82
##                                                                                                                                            metaexondid
##                                                                                                                                            <character>
##                                                                                                   ENSE00001544499,ENSE00001544501          MT_577_1601
##                                                                                                   ENSE00001544498,ENSE00001544497         MT_1602_3229
##                                                                                                                   ENSE00002006242         MT_3230_3304
##                                                                                                                   ENSE00001435714         MT_3307_4262
##                                                                                                   ENSE00001544494,ENSE00001993597         MT_4263_4400
##                                                                                                                               ...                  ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 22_51221929_51222091
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 22_51222185_51222500
##                                                                                                                   ENSE00002513195 22_51223601_51223721
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 22_51227178_51227781
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 22_51237083_51239737
##                                                                                                                                                                                                                                                           exon_id
##                                                                                                                                                                                                                                                       <character>
##                                                                                                   ENSE00001544499,ENSE00001544501                                                                                                 ENSE00001544499,ENSE00001544501
##                                                                                                   ENSE00001544498,ENSE00001544497                                                                                                 ENSE00001544498,ENSE00001544497
##                                                                                                                   ENSE00002006242                                                                                                                 ENSE00002006242
##                                                                                                                   ENSE00001435714                                                                                                                 ENSE00001435714
##                                                                                                   ENSE00001544494,ENSE00001993597                                                                                                 ENSE00001544494,ENSE00001993597
##                                                                                                                               ...                                                                                                                             ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865                                                 ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865
##                                                                                                                   ENSE00002513195                                                                                                                 ENSE00002513195
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642                                                 ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668                                                                 ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668
##                                                                                                                                          gc
##                                                                                                                                   <numeric>
##                                                                                                   ENSE00001544499,ENSE00001544501 0.4536585
##                                                                                                   ENSE00001544498,ENSE00001544497 0.4299754
##                                                                                                                   ENSE00002006242 0.3866667
##                                                                                                                   ENSE00001435714 0.4780335
##                                                                                                   ENSE00001544494,ENSE00001993597 0.3913043
##                                                                                                                               ...       ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197 0.7484663
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865 0.6012658
##                                                                                                                   ENSE00002513195 0.5206612
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642 0.3460265
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668 0.3954802
##                                                                                                                                      length
##                                                                                                                                   <numeric>
##                                                                                                   ENSE00001544499,ENSE00001544501      1025
##                                                                                                   ENSE00001544498,ENSE00001544497      1628
##                                                                                                                   ENSE00002006242        75
##                                                                                                                   ENSE00001435714       956
##                                                                                                   ENSE00001544494,ENSE00001993597       138
##                                                                                                                               ...       ...
##   ENSE00001841640,ENSE00001685116,ENSE00001890928,ENSE00001409639,ENSE00001843702,ENSE00001910924,ENSE00001522195,ENSE00001522197       163
##                                                   ENSE00001895558,ENSE00002470266,ENSE00002282778,ENSE00001792547,ENSE00001849865       316
##                                                                                                                   ENSE00002513195       121
##                                                   ENSE00001902224,ENSE00002272967,ENSE00001944953,ENSE00001672850,ENSE00002218642       604
##                                                                   ENSE00002272638,ENSE00002218017,ENSE00002439339,ENSE00001816668      2655
##   -------
##   seqinfo: 25 sequences from an unspecified genome; no seqlengths

assays(rnaSeqData)$counts[1:5, 1:5]

##                                 BD1NYRACXX-5-1 AD10W1ACXX-4-1
## ENSE00001544499,ENSE00001544501          29110          41062
## ENSE00001544498,ENSE00001544497              0              0
## ENSE00002006242                          14504          25271
## ENSE00001435714                          72504          96329
## ENSE00001544494,ENSE00001993597              0              0
##                                 BD1NYRACXX-5-2 AD10W1ACXX-4-2
## ENSE00001544499,ENSE00001544501          47738          34911
## ENSE00001544498,ENSE00001544497              0              0
## ENSE00002006242                          25864          15013
## ENSE00001435714                          86555          65928
## ENSE00001544494,ENSE00001993597              0              0
##                                 BD1NYRACXX-5-3
## ENSE00001544499,ENSE00001544501          27028
## ENSE00001544498,ENSE00001544497              0
## ENSE00002006242                          12725
## ENSE00001435714                          82553
## ENSE00001544494,ENSE00001993597              0
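
The counts assay is an ordinary R matrix, so base R functions can be applied to it directly. The following is a minimal sketch, not part of the package: it uses a toy matrix copied from the first three columns of the output above and shows one way to drop features that have zero counts in every sample and to log-transform the rest. The pseudocount of 1 is an assumption chosen for illustration.

```r
# Toy matrix; values copied from the first three columns of the output above.
counts <- matrix(
  c(29110, 0, 14504, 72504, 0,
    41062, 0, 25271, 96329, 0,
    47738, 0, 25864, 86555, 0),
  nrow = 5,
  dimnames = list(
    c("ENSE00001544499,ENSE00001544501",
      "ENSE00001544498,ENSE00001544497",
      "ENSE00002006242",
      "ENSE00001435714",
      "ENSE00001544494,ENSE00001993597"),
    c("BD1NYRACXX-5-1", "AD10W1ACXX-4-1", "BD1NYRACXX-5-2")
  )
)

# Keep only features with at least one read across all samples.
keep <- rowSums(counts) > 0
expressed <- counts[keep, , drop = FALSE]

# log2 scale with a pseudocount of 1 (assumed here) to avoid log2(0).
logCounts <- log2(expressed + 1)
```

On the real object the same steps would start from assays(rnaSeqData)$counts instead of the toy matrix.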

Metabolomics data

Extract RP4 data from the MOLGENIS database

molgenis.connect(username, password)

## Loading required package: bitops

## Login success

## 
## Run 'ls()' to see the available functions to interact with the molgenis database!

ls()

##  [1] "BIOBANKS"                      "DATASETS"                     
##  [3] "LLSMalesAbove70"               "MDB"                          
##  [5] "methData"                      "molgenis.add"                 
##  [7] "molgenis.addAll"               "molgenis.addList"             
##  [9] "molgenis.delete"               "molgenis.env"                 
## [11] "molgenis.get"                  "molgenis.getAttributeMetaData"
## [13] "molgenis.getEntityMetaData"    "molgenis.login"               
## [15] "molgenis.logout"               "molgenis.update"              
## [17] "password"                      "phenotypes"                   
## [19] "PROXY"                         "RDB"                          
## [21] "rnaSeqData"                    "RP3DATADIR"                   
## [23] "SRMBASE"                       "username"                     
## [25] "USRPWD"                        "VIEWS"

subjects <- molgenis.get.all("subjects")

## Extracted 2000 rows...

## Extracted 3000 rows...

## Extracted 4000 rows...

## Extracted 5000 rows...

## Extracted 6000 rows...

## Extracted 7000 rows...

## Extracted 8000 rows...

## Extracted 9000 rows...

## Extracted 10000 rows...

## Extracted 11000 rows...

## Extracted 12000 rows...

## Extracted 13000 rows...

## Extracted 14000 rows...

## Extracted 15000 rows...

## Extracted 16000 rows...

## Extracted 17000 rows...

## Extracted 18000 rows...

## Extracted 19000 rows...

## Extracted 20000 rows...

## Extracted 21000 rows...

## Extracted 22000 rows...

## Extracted 23000 rows...

## Extracted 23728 rows...

dim(subjects)

## [1] 23728    40

head(subjects)

##    biobank                                              id bios_id
## 1 BIOMARCS BIOMARCS-{38788478-F7D7-4518-B57F-09F5EA6190EF}    <NA>
## 2 BIOMARCS BIOMARCS-{38B676D4-E73F-48B4-B874-4B3805D42DB5}    <NA>
## 3 BIOMARCS BIOMARCS-{395012FA-A34C-4478-A899-64B3AB3CA9B7}    <NA>
## 4 BIOMARCS BIOMARCS-{396322D1-DEF6-4AC9-AC40-3B70B9A9EFD1}    <NA>
## 5 BIOMARCS BIOMARCS-{3963929C-AFA9-4A9D-B095-C30999122838}    <NA>
## 6 BIOMARCS BIOMARCS-{39DD6E7B-1A01-4D79-8757-4A84256E3C27}    <NA>
##              date_of_birth age_bloodcollection gender pedigree_information
## 1 1933-11-29T00:00:00+0019                  NA   true                false
## 2 1940-04-10T00:00:00+0020                  NA   true                false
## 3 1939-06-18T00:00:00+0120                  NA   true                false
## 4 1960-09-18T00:00:00+0100                  NA   true                false
## 5 1937-02-26T00:00:00+0019                  NA   true                false
## 6 1952-08-28T00:00:00+0100                  NA   true                false
##   gwas_platform_used gwas_available_date dna_amount      dna_source
## 1                                              true EDTA buffy coat
## 2                                              true EDTA buffy coat
## 3                                              true EDTA buffy coat
## 4                                              true EDTA buffy coat
## 5                                              true EDTA buffy coat
## 6                                              true EDTA buffy coat
##   rna_amount rna_source        date_of_inclusion smoking
## 1      false            2011-03-28T00:00:00+0200   false
## 2      false            2011-11-01T00:00:00+0100   false
## 3      false            2010-06-03T00:00:00+0200   false
## 4      false            2011-10-20T00:00:00+0200    true
## 5      false            2009-09-12T00:00:00+0200   false
## 6      false            2010-09-06T00:00:00+0200    true
##   alcohol_consumption height weight waist_circumference hip_circumference
## 1                        167     79                  NA                NA
## 2                        167     78                  NA                NA
## 3                        189     85                  NA                NA
## 4                        189    108                  NA                NA
## 5                        181     98                  NA                NA
## 6                        176     70                  NA                NA
##   hs_crp wbc hgb hct plt neut_percentage lymph_percentage mono_percentage
## 1    7.0  NA  NA  NA  NA              NA               NA              NA
## 2     NA  NA  NA  NA  NA              NA               NA              NA
## 3    1.4  NA  NA  NA  NA              NA               NA              NA
## 4     NA  NA  NA  NA  NA              NA               NA              NA
## 5   23.0  NA  NA  NA  NA              NA               NA              NA
## 6     NA  NA  NA  NA  NA              NA               NA              NA
##   eos_percentage baso_percentage luc_percentage tot_cholesterol
## 1             NA              NA             NA              NA
## 2             NA              NA             NA              NA
## 3             NA              NA             NA             4.1
## 4             NA              NA             NA              NA
## 5             NA              NA             NA             5.1
## 6             NA              NA             NA             3.3
##   hdl_cholesterol triglycerides systolic_blood_pressure
## 1              NA            NA                     170
## 2              NA            NA                     174
## 3            1.20           0.8                     122
## 4              NA            NA                     144
## 5            0.88           0.6                     150
## 6              NA            NA                      68
##   diastolic_blood_pressure lipid_lowering_med blood_pressure_lowering_med
## 1                       NA                  1                        true
## 2                       NA                  1                        true
## 3                       78                  1                        true
## 4                       NA                  0                        true
## 5                       NA                  1                        true
## 6                       NA                  1                        true
##   metabolic_syndrome diabetes
## 1                       false
## 2                       false
## 3                       false
## 4                        true
## 5                       false
## 6                       false
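
The output above suggests that dates arrive as ISO 8601 strings with a time-zone offset and that yes/no fields arrive as the strings "true" and "false", with empty strings for missing values. Assuming that is the case (the column types are not confirmed here), a sketch of how such columns could be converted with base R is given below. The data frame subjects_toy and the helpers to_date and to_logical are hypothetical names; the values are copied from rows 1 and 4 of the output.

```r
# Toy data frame; values copied from rows 1 and 4 of head(subjects) above.
subjects_toy <- data.frame(
  date_of_birth      = c("1933-11-29T00:00:00+0019", "1960-09-18T00:00:00+0100"),
  date_of_inclusion  = c("2011-03-28T00:00:00+0200", "2011-10-20T00:00:00+0200"),
  smoking            = c("false", "true"),
  metabolic_syndrome = c("", ""),
  stringsAsFactors   = FALSE
)

# Keep only the calendar date (first 10 characters); time and offset are dropped.
to_date <- function(x) as.Date(substr(x, 1, 10), format = "%Y-%m-%d")

# Map "true"/"false" to TRUE/FALSE; anything else (e.g. "") becomes NA.
to_logical <- function(x) ifelse(x == "true", TRUE, ifelse(x == "false", FALSE, NA))

subjects_toy$date_of_birth      <- to_date(subjects_toy$date_of_birth)
subjects_toy$date_of_inclusion  <- to_date(subjects_toy$date_of_inclusion)
subjects_toy$smoking            <- to_logical(subjects_toy$smoking)
subjects_toy$metabolic_syndrome <- to_logical(subjects_toy$metabolic_syndrome)

# Approximate age at inclusion in years (Date - Date yields a difference in days).
subjects_toy$age_at_inclusion <-
  as.numeric(subjects_toy$date_of_inclusion - subjects_toy$date_of_birth) / 365.25
```

Dropping the offset is a deliberate simplification: only the calendar day is kept, which is usually enough for age calculations.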

biobanks <- molgenis.get.all("biobanks")
dim(biobanks)

## [1] 24 93

head(biobanks)

##   age_at_death age_at_inclusion antidepressant_use_blood_coll anxiety
## 1         true             true                          true   false
## 2         true             true                          true   false
## 3        false             true                         false   false
## 4        false             true                         false   false
## 5        false             true                          true   false
## 6         true             true                          true   false
##                              ascertainment_criterion caffeine_consumption
## 1                      patients with type 2 diabetes                false
## 2                             pop-based family study                 true
## 3                                   healthy controls                false
## 4                                       case control                false
## 5 population-based, stratified for ethnic background                false
## 6                                          pop-based                 true
##   chronic_migraine comorbidities     contact_person1_email
## 1            false         false     n.van_leeuwen@lumc.nl
## 2            false          true   a.demirkan@erasmusmc.nl
## 3            false         false mihai.netea@radboudumc.nl
## 4            false         false       y.f.m.ramos@lumc.nl
## 5            false          true    m.b.snijder@amc.uva.nl
## 6             true          true          Lude@ludesign.nl
##   contact_person1_name       contact_person2_email contact_person2_name
## 1       N. van Leeuwen           jm.dekker@vumc.nl          J.M. Dekker
## 2        Ayse Demirkan j.vergeer-drop@erasmusmc.nl    Jeannette Vergeer
## 3          Mihai Netea   leo.joosten@radboudumc.nl          Leo Joosten
## 4       Yolande Ramos                                                  
## 5      Marieke Snijder        k.stronks@amc.uva.nl       Karien Stronks
## 6          Lude Franke  sasha.zhernakova@gmail.com     Sasha Zhernakova
##   contact_person3_email contact_person3_name creatinine   csf
## 1                                                  true false
## 2                                                  true false
## 3                                                 false false
## 4                                                 false false
## 5                                                  true false
## 6     Llscience@umcg.nl     Salome Scholtens       true false
##   cur_use_platelet_inhibitors curr_use_ace_inhibitors
## 1                        true                    true
## 2                        true                    true
## 3                       false                   false
## 4                       false                   false
## 5                        true                    true
## 6                        true                    true
##   curr_use_beta_blockers curr_use_ra_receptor_blockers curr_use_statins
## 1                   true                          true             true
## 2                   true                          true             true
## 3                  false                         false            false
## 4                  false                         false            false
## 5                   true                          true             true
## 6                   true                          true             true
##   curr_use_vit_k_antagonist current_depression_diag
## 1                      true                   false
## 2                      true                    true
## 3                     false                   false
## 4                     false                   false
## 5                      true                   false
## 6                      true                   false
##   depression_diag_instrument depression_ids_sr depression_scale_yasr
## 1                      false             false                 false
## 2                       true             false                 false
## 3                      false             false                 false
## 4                      false             false                 false
## 5                      false             false                 false
## 6                       true             false                 false
##   depressive_symptom_score depressive_symptom_score_used diabetes_type_2
## 1                    false                         false            true
## 2                     true                          true            true
## 3                    false                         false            true
## 4                    false                         false           false
## 5                     true                          true            true
## 6                     true                          true            true
##   diabetic_complications diagnosis_by_ogtt diagnosis_oa education_level
## 1                   true             false        false            true
## 2                  false             false         true            true
## 3                  false             false        false            true
## 4                  false             false         true           false
## 5                  false             false        false            true
## 6                   true             false        false            true
##   family_history_cv_disease   ft3   ft4 hba1c hba1c_longitudinal
## 1                     false false false  true               true
## 2                      true false false false              false
## 3                      true false false false              false
## 4                     false false false false              false
## 5                      true false false  true              false
## 6                      true  true  true  true               true
##   head_circumference headache_days heart_rate history_arrhythmia
## 1              false         false       true               true
## 2              false         false       true               true
## 3              false         false      false              false
## 4              false         false      false              false
## 5              false         false       true              false
## 6               true          true       true               true
##   history_cabg history_coron_artery_disease history_device_implantation
## 1         true                         true                        true
## 2         true                         true                        true
## 3        false                        false                       false
## 4        false                        false                       false
## 5        false                         true                       false
## 6         true                         true                        true
##   history_myocardial_infarction history_pci  abbreviation intelligence
## 1                          true        true        DZS_WF        false
## 2                          true        true      ERF_ERGO         true
## 3                         false       false FUNCTGENOMICS        false
## 4                         false       false          GARP        false
## 5                          true       false        HELIUS        false
## 6                          true        true     LIFELINES        false
##   interview_data_dementia lifetime_depression lifetime_depression_diag
## 1                   false               false                    false
## 2                    true                true                     true
## 3                   false               false                    false
## 4                   false               false                    false
## 5                   false               false                    false
## 6                    true               false                     true
##    lvef lvef_method migraine_frequency migraine_with_aura
## 1 false       false              false              false
## 2  true       false              false               true
## 3 false       false              false              false
## 4 false       false              false              false
## 5 false       false              false              false
## 6  true        true              false               true
##   migraine_without_aura mmse_and_other_csf_tests mri_ct_ecg_brain
## 1                 false                    false            false
## 2                  true                     true             true
## 3                 false                    false            false
## 4                 false                    false            false
## 5                 false                    false            false
## 6                  true                     true            false
##                                  name ntprobnp_bnp_anp_level
## 1 Diabetes Zorgsysteem west-Friesland                  false
## 2                            ERF_ERGO                  false
## 3                             Radboud                  false
## 4      Genetica ARtrose en Progressie                  false
## 5    HEalthy LIfe in an Urban Setting                  false
## 6                                                      false
##   osteoarthritis osteoarthritis_longitudinal painscores_joints personality
## 1          false                       false             false       false
## 2           true                       false              true        true
## 3          false                       false             false       false
## 4           true                       false             false       false
## 5           true                       false             false       false
## 6           true                        true              true        true
##                    pi_email                                 pi_name
## 1        l.m.t_hart@lumc.nl L.M. 't Hart (LUMC) / J.M. Dekker VUmc)
## 2   c.vanduijn@erasmusmc.nl                      Cornelia van Duijn
## 3 mihai.netea@radboudumc.nl                             Mihai Netea
## 4      i.meulenbelt@lumc.nl                       Ingrid Meulenbelt
## 5 a.h.zwinderman@amc.uva.nl                         Koos Zwinderman
## 6  cisca.wijmenga@gmail.com                          Cisca Wijmenga
##   pr_interval prev_diag_aortic_aneurysm prev_diag_diast_heart_failure
## 1        true                      true                          true
## 2        true                      true                          true
## 3       false                     false                         false
## 4       false                     false                         false
## 5        true                     false                         false
## 6        true                      true                         false
##   prev_diag_heart_failure prev_diag_hemorrhagic_stroke
## 1                    true                         true
## 2                    true                         true
## 3                   false                        false
## 4                   false                        false
## 5                   false                        false
## 6                    true                        false
##   prev_diag_per_vascular_disease prev_diag_stroke
## 1                           true             true
## 2                           true             true
## 3                          false            false
## 4                          false            false
## 5                          false             true
## 6                           true             true
##   prev_diag_syst_heart_failure prev_diag_thromb_stroke
## 1                         true                    true
## 2                         true                    true
## 3                        false                   false
## 4                        false                   false
## 5                        false                   false
## 6                        false                   false
##   psychological_wellbeing pulse_rate qrs_duration qrs_voltage qt_time
## 1                   false      false         true       false   false
## 2                   false       true         true        true    true
## 3                   false      false        false       false   false
## 4                   false      false        false       false   false
## 5                   false       true         true        true    true
## 6                   false      false         true       false    true
##   qtc_time self_esteem self_rated_health   tsh use_of_angiotensin_ii
## 1     true       false             false false                  true
## 2     true       false             false false                  true
## 3    false       false              true false                 false
## 4    false       false             false false                 false
## 5     true       false              true false                  true
## 6    false       false              true  true                  true
##   use_of_anti_depressive_drugs use_of_anti_epilectic_drugs
## 1                         true                        true
## 2                         true                        true
## 3                        false                       false
## 4                        false                       false
## 5                         true                        true
## 6                         true                        true
##   use_of_anticonception use_of_beta_blockers use_of_triptans womac_scores
## 1                  true                 true            true        false
## 2                 false                 true            true        false
## 3                  true                false           false        false
## 4                 false                false           false        false
## 5                  true                 true            true        false
## 6                  true                 true            true         true
##   x_ray_photographs
## 1             false
## 2              true
## 3             false
## 4             false
## 5             false
## 6             false
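
The biobanks table appears to hold one row per biobank and one "true"/"false" flag per variable. Assuming a flag of "true" means the biobank has collected that variable (an interpretation not stated above), a sketch of how the flags could be summarised follows. The data frame biobanks_toy is a hypothetical subset; its values are copied from rows 1, 4 and 6 of the output.

```r
# Toy subset; values copied from rows 1, 4 and 6 of head(biobanks) above.
biobanks_toy <- data.frame(
  abbreviation = c("DZS_WF", "GARP", "LIFELINES"),
  hba1c        = c("true", "false", "true"),
  heart_rate   = c("true", "false", "true"),
  womac_scores = c("false", "false", "true"),
  stringsAsFactors = FALSE
)

# Turn the flag columns into a logical matrix with one row per biobank.
flag_cols <- setdiff(names(biobanks_toy), "abbreviation")
flags <- as.matrix(biobanks_toy[flag_cols]) == "true"
rownames(flags) <- biobanks_toy$abbreviation

# Number of flagged variables per biobank.
n_available <- rowSums(flags)

# Biobanks for which a given variable is flagged.
with_hba1c <- biobanks_toy$abbreviation[flags[, "hba1c"]]
```

On the full table, the free-text columns (names, e-mail addresses, ascertainment_criterion) would have to be excluded from flag_cols as well.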

samples <- molgenis.get.all("samples")

## Extracted 2000 rows...

## Extracted 3000 rows...

## Extracted 4000 rows...

## Extracted 5000 rows...

## Extracted 6000 rows...

## Extracted 7000 rows...

## Extracted 8000 rows...

## Extracted 9000 rows...

## Extracted 10000 rows...

## Extracted 11000 rows...

## Extracted 12000 rows...

## Extracted 13000 rows...

## Extracted 14000 rows...

## Extracted 15000 rows...

## Extracted 16000 rows...

## Extracted 17000 rows...

## Extracted 18000 rows...

## Extracted 19000 rows...

## Extracted 20000 rows...

## Extracted 21000 rows...

## Extracted 22000 rows...

## Extracted 23000 rows...

## Extracted 24000 rows...

## Extracted 24112 rows...

dim(samples)

## [1] 24112    10

head(samples)

##      biobank      subject_id              id          date_collection
## 1 ALPHAOMEGA ALPHAOMEGA-9265 ALPHAOMEGA-2722 2005-02-15T00:00:00+0100
## 2 ALPHAOMEGA ALPHAOMEGA-4450 ALPHAOMEGA-2736 2005-02-15T00:00:00+0100
## 3 ALPHAOMEGA ALPHAOMEGA-3401 ALPHAOMEGA-2743 2005-02-16T00:00:00+0100
## 4 ALPHAOMEGA ALPHAOMEGA-1280 ALPHAOMEGA-2752 2005-02-17T00:00:00+0100
## 5 ALPHAOMEGA ALPHAOMEGA-6748  ALPHAOMEGA-276 2002-09-26T00:00:00+0200
## 6 ALPHAOMEGA ALPHAOMEGA-5782 ALPHAOMEGA-2764 2005-02-21T00:00:00+0100
##   date_inclusion sample_matrix fasting time_handling temp_storage
## 1           <NA>   EDTA plasma   false            NA          -80
## 2           <NA>   EDTA plasma    true            NA          -80
## 3           <NA>   EDTA plasma   false            NA          -80
## 4           <NA>   EDTA plasma   false            NA          -80
## 5           <NA>   EDTA plasma    true            NA          -80
## 6           <NA>   EDTA plasma   false            NA          -80
##   time_storage
## 1          110
## 2          110
## 3          110
## 4          110
## 5          139
## 6          110

tail(samples)

##       biobank   subject_id                  id          date_collection
## 24107   VUNTR  VUNTR-A918C VUNTR-9222_66256-01 2008-10-01T00:00:00+0200
## 24108   VUNTR  VUNTR-A623C VUNTR-9226_62616-01 2008-10-01T00:00:00+0200
## 24109   VUNTR VUNTR-A1148D VUNTR-9227_62057-02 2008-10-01T00:00:00+0200
## 24110   VUNTR VUNTR-A1148C VUNTR-9228_62057-01 2008-10-01T00:00:00+0200
## 24111   VUNTR VUNTR-A1552C VUNTR-9229_60436-01 2008-10-01T00:00:00+0200
## 24112   VUNTR VUNTR-A2462C VUNTR-9230_60384-01 2008-10-01T00:00:00+0200
##       date_inclusion sample_matrix fasting time_handling temp_storage
## 24107           <NA>   EDTA plasma    true             6          -30
## 24108           <NA>   EDTA plasma    true             6          -30
## 24109           <NA>   EDTA plasma    true             6          -30
## 24110           <NA>   EDTA plasma    true             6          -30
## 24111           <NA>   EDTA plasma    true             6          -30
## 24112           <NA>   EDTA plasma    true             6          -30
##       time_storage
## 24107           NA
## 24108           NA
## 24109           NA
## 24110           NA
## 24111           NA
## 24112           NA

measurements <- molgenis.get.all("measurements")

## Extracted 2000 rows...

## Extracted 3000 rows...

## Extracted 4000 rows...

## Extracted 5000 rows...

## Extracted 6000 rows...

## Extracted 7000 rows...

## Extracted 8000 rows...

## Extracted 9000 rows...

## Extracted 10000 rows...

## Extracted 11000 rows...

## Extracted 12000 rows...

## Extracted 13000 rows...

## Extracted 14000 rows...

## Extracted 15000 rows...

## Extracted 16000 rows...

## Extracted 17000 rows...

## Extracted 18000 rows...

## Extracted 19000 rows...

## Extracted 20000 rows...

## Extracted 21000 rows...

## Extracted 22000 rows...

## Extracted 23000 rows...

## Extracted 24000 rows...

## Extracted 24072 rows...

dim(measurements)

## [1] 24072   249

measurements[1:5, 1:5]

##                       id        sample_id   acace     ace    ala
## 1 BBMRI-PROSPER.31528441 PROSPER-31528441 0.02900 0.01190 0.4562
## 2 BBMRI-PROSPER.31608168 PROSPER-31608168 0.24870 0.01826 0.1326
## 3 BBMRI-PROSPER.31618227 PROSPER-31618227 0.02947 0.02237 0.4932
## 4 BBMRI-PROSPER.31709409 PROSPER-31709409 0.07189 0.01456 0.4230
## 5 BBMRI-PROSPER.31749591 PROSPER-31749591 0.02733 0.02391 0.3537

tbl <- table(subjects$biobank)
op <- par(mar = c(5, 10, 4, 2))
barplot(tbl[order(tbl)], horiz = TRUE, las = 2)
par(op)

library(lubridate)

## 
## Attaching package: 'lubridate'

## The following object is masked from 'package:IRanges':
## 
##     %within%

## The following object is masked from 'package:base':
## 
##     date

library(ggplot2)
# Treat empty birth dates as missing before computing ages.
subjects$date_of_birth <- as.character(subjects$date_of_birth)
subjects$date_of_birth[which(subjects$date_of_birth == "")] <- NA
# Age in years as of today (not at blood collection); the time part after "T" is dropped.
subjects$age <- interval(gsub("T.*$", "", subjects$date_of_birth), Sys.Date())/duration(num = 1, 
    units = "years")
# Relabel the gender factor; its original levels are c('', 'false', 'true').
levels(subjects$gender) <- c("", "female", "male")
biobank_ordered <- with(subjects, reorder(biobank, age, median, na.rm = TRUE))
gp <- ggplot(subjects, aes(biobank_ordered, age))
gp <- gp + geom_boxplot(aes(fill = factor(gender)))
gp <- gp + theme(axis.text.x = element_text(angle = 90, hjust = 0, size = 7))
gp

## Warning: Removed 8516 rows containing non-finite values (stat_boxplot).

measurements$biobank <- gsub("-.*$", "", measurements$sample_id)
biobank_ordered <- with(measurements, reorder(biobank, lac, median, na.rm = TRUE))
gp <- ggplot(measurements, aes(biobank_ordered, lac))
gp <- gp + geom_boxplot()
gp <- gp + theme(axis.text.x = element_text(angle = 90, hjust = 0, size = 7))
gp

## Warning: Removed 18 rows containing non-finite values (stat_boxplot).

metadata <- molgenis.getEntityMetaData("measurements")
description <- lapply(metadata$attributes, function(x) x$description)
head(description)

## $id
## [1] "Identifier"
## 
## $sample_id
## [1] "Sample identifier"
## 
## $acace
## [1] "Acetoacetate"
## 
## $ace
## [1] "Acetate"
## 
## $ala
## [1] "Alanine"
## 
## $alb
## [1] "Albumin"
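
The list of descriptions can be flattened into a named character vector, which makes it easy to look up the full name of one or more measurement columns. The sketch below does not contact the server: `toy_description` is a hand-made stand-in that copies the first few entries printed above, and the name `lookup` is an assumption for illustration.

```r
# Stand-in for the 'description' list shown above (first entries only).
toy_description <- list(id = "Identifier",
                        sample_id = "Sample identifier",
                        acace = "Acetoacetate",
                        ace = "Acetate",
                        ala = "Alanine")

# Flatten to a named character vector: names are column names,
# values are descriptions. Entries that are NULL would be dropped by unlist().
lookup <- unlist(toy_description)

# Look up the descriptions of selected columns by name.
lookup[c("ace", "ala")]
```

With the real list, the same indexing should work for any column name of the measurements table that has a description.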

Linking Tables

The subject, sample and measurement tables are linked through common identifiers. Each table has a column named 'id' that holds the table's own identifier, e.g. the 'id' column of the subjects table is the 'subject_id' referenced by the samples table, and the 'id' column of the samples table is the 'sample_id' referenced by the measurements table.
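
The key relationships can be checked before merging. The following is a minimal sketch on invented toy tables (all identifiers and values are made up, and the `toy_` names are assumptions); only the column names `id`, `subject_id` and `sample_id` are taken from the tables described here.

```r
# Toy tables; every value is invented for illustration only.
toy_subjects <- data.frame(id = c("BB-S1", "BB-S2"),
                           gender = c("female", "male"),
                           stringsAsFactors = FALSE)
toy_samples <- data.frame(id = c("BB-1", "BB-2", "BB-3"),
                          subject_id = c("BB-S1", "BB-S2", "BB-S2"),
                          stringsAsFactors = FALSE)
toy_measurements <- data.frame(id = c("BBMRI-BB.1", "BBMRI-BB.2"),
                               sample_id = c("BB-1", "BB-2"),
                               lac = c(1.1, 0.9),
                               stringsAsFactors = FALSE)

# Does every sample point at a subject that exists?
all(toy_samples$subject_id %in% toy_subjects$id)

# Which samples have no measurement record?
setdiff(toy_samples$id, toy_measurements$sample_id)
```

Running the same two checks on the real tables shows how many rows a merge with `all.x = TRUE` will fill with missing values.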

You can combine the tables with the function 'merge'. For example, for all LLS data:

head(subjects[, c("id", "bios_id")])

##                                                id bios_id
## 1 BIOMARCS-{38788478-F7D7-4518-B57F-09F5EA6190EF}    <NA>
## 2 BIOMARCS-{38B676D4-E73F-48B4-B874-4B3805D42DB5}    <NA>
## 3 BIOMARCS-{395012FA-A34C-4478-A899-64B3AB3CA9B7}    <NA>
## 4 BIOMARCS-{396322D1-DEF6-4AC9-AC40-3B70B9A9EFD1}    <NA>
## 5 BIOMARCS-{3963929C-AFA9-4A9D-B095-C30999122838}    <NA>
## 6 BIOMARCS-{39DD6E7B-1A01-4D79-8757-4A84256E3C27}    <NA>

measurements[1:5, c("id", "sample_id")]

##                       id        sample_id
## 1 BBMRI-PROSPER.31528441 PROSPER-31528441
## 2 BBMRI-PROSPER.31608168 PROSPER-31608168
## 3 BBMRI-PROSPER.31618227 PROSPER-31618227
## 4 BBMRI-PROSPER.31709409 PROSPER-31709409
## 5 BBMRI-PROSPER.31749591 PROSPER-31749591

head(samples[, c("biobank", "id", "subject_id")])

##      biobank              id      subject_id
## 1 ALPHAOMEGA ALPHAOMEGA-2722 ALPHAOMEGA-9265
## 2 ALPHAOMEGA ALPHAOMEGA-2736 ALPHAOMEGA-4450
## 3 ALPHAOMEGA ALPHAOMEGA-2743 ALPHAOMEGA-3401
## 4 ALPHAOMEGA ALPHAOMEGA-2752 ALPHAOMEGA-1280
## 5 ALPHAOMEGA  ALPHAOMEGA-276 ALPHAOMEGA-6748
## 6 ALPHAOMEGA ALPHAOMEGA-2764 ALPHAOMEGA-5782

LLS <- subset(samples, grepl("LLS", biobank))
dim(LLS)

## [1] 3311   10
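
Note that `grepl` matches a substring, so the subset keeps every biobank whose name contains "LLS" rather than one exact name. A small sketch on an invented vector shows the difference; the label `LLS_SIBS` is hypothetical and only serves to show a second match.

```r
# Invented biobank labels; "LLS_SIBS" is a hypothetical name for illustration.
toy_biobank <- c("ALPHAOMEGA", "LLS_PARTOFFS", "LLS_SIBS", "VUNTR")

# Substring match: keeps every label that contains "LLS".
toy_biobank[grepl("LLS", toy_biobank)]

# Exact match: keeps a single biobank only.
toy_biobank[toy_biobank == "LLS_PARTOFFS"]
```

Calling `table(LLS$biobank)` on the real subset shows which biobank names were actually matched.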

head(LLS)

##           biobank           subject_id                id
## 3996 LLS_PARTOFFS  LLS_PARTOFFS-323011  LLS_PARTOFFS-169
## 3997 LLS_PARTOFFS LLS_PARTOFFS-2443021 LLS_PARTOFFS-1690
## 3998 LLS_PARTOFFS LLS_PARTOFFS-2443020 LLS_PARTOFFS-1691
## 3999 LLS_PARTOFFS LLS_PARTOFFS-2143020 LLS_PARTOFFS-1692
## 4000 LLS_PARTOFFS LLS_PARTOFFS-2143021 LLS_PARTOFFS-1693
## 4001 LLS_PARTOFFS LLS_PARTOFFS-2283111 LLS_PARTOFFS-1694
##               date_collection date_inclusion sample_matrix fasting
## 3996 2003-01-07T00:00:00+0100           <NA>   EDTA plasma   false
## 3997 2004-06-08T00:00:00+0200           <NA>   EDTA plasma   false
## 3998 2004-06-08T00:00:00+0200           <NA>   EDTA plasma   false
## 3999 2004-06-09T00:00:00+0200           <NA>   EDTA plasma   false
## 4000 2004-06-09T00:00:00+0200           <NA>   EDTA plasma   false
## 4001 2004-06-09T00:00:00+0200           <NA>   EDTA plasma   false
##      time_handling temp_storage time_storage
## 3996             4          -80          135
## 3997             1          -80          118
## 3998             1          -80          118
## 3999             3          -80          118
## 4000             3          -80          118
## 4001             3          -80          118

mLLS <- merge(LLS, subjects, by.x = "subject_id", by.y = "id", all.x = TRUE, 
    suffixes = c("_samples", "_subjects"))
dim(mLLS)

## [1] 3311   50

head(mLLS)

##             subject_id biobank_samples                id
## 1 LLS_PARTOFFS-1013010    LLS_PARTOFFS  LLS_PARTOFFS-582
## 2 LLS_PARTOFFS-1013020    LLS_PARTOFFS  LLS_PARTOFFS-881
## 3 LLS_PARTOFFS-1023010    LLS_PARTOFFS LLS_PARTOFFS-1397
## 4 LLS_PARTOFFS-1023011    LLS_PARTOFFS LLS_PARTOFFS-1398
## 5 LLS_PARTOFFS-1023040    LLS_PARTOFFS LLS_PARTOFFS-1126
## 6  LLS_PARTOFFS-103010    LLS_PARTOFFS   LLS_PARTOFFS-40
##            date_collection date_inclusion sample_matrix fasting
## 1 2003-08-19T00:00:00+0200           <NA>   EDTA plasma   false
## 2 2003-11-10T00:00:00+0100           <NA>   EDTA plasma   false
## 3 2004-03-15T00:00:00+0100           <NA>   EDTA plasma   false
## 4 2004-03-15T00:00:00+0100           <NA>   EDTA plasma   false
## 5 2003-12-23T00:00:00+0100           <NA>   EDTA plasma   false
## 6 2002-11-01T00:00:00+0100           <NA>   EDTA plasma   false
##   time_handling temp_storage time_storage biobank_subjects bios_id
## 1             4          -80          128     LLS_PARTOFFS        
## 2             3          -80          125     LLS_PARTOFFS        
## 3             3          -80          121     LLS_PARTOFFS    1397
## 4             3          -80          121     LLS_PARTOFFS    1398
## 5             4          -80          124     LLS_PARTOFFS        
## 6             2          -80          137     LLS_PARTOFFS      40
##              date_of_birth age_bloodcollection gender pedigree_information
## 1 1935-09-19T00:00:00+0119                  NA female                 true
## 2 1933-01-22T00:00:00+0019                  NA female                 true
## 3 1950-12-23T00:00:00+0100                  NA female                 true
## 4 1951-06-25T00:00:00+0100                  NA   male                 true
## 5 1958-03-02T00:00:00+0100                  NA female                 true
## 6 1944-04-27T00:00:00+0200                  NA female                 true
##      gwas_platform_used gwas_available_date dna_amount      dna_source
## 1          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 2          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 3          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 4 Illumina Omni-express 2012-01-01 00:00:00       true EDTA buffy coat
## 5          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 6          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
##   rna_amount            rna_source        date_of_inclusion smoking
## 1       true Whole blood (PAXgene) 2003-08-19T00:00:00+0200        
## 2       true Whole blood (PAXgene) 2003-11-10T00:00:00+0100   false
## 3       true Whole blood (PAXgene) 2004-03-15T00:00:00+0100   false
## 4       true Whole blood (PAXgene) 2004-03-15T00:00:00+0100   false
## 5       true Whole blood (PAXgene) 2003-12-23T00:00:00+0100        
## 6       true Whole blood (PAXgene) 2002-11-01T00:00:00+0100   false
##   alcohol_consumption height weight waist_circumference hip_circumference
## 1                         NA     NA                  NA                NA
## 2                true    155     51                  NA                NA
## 3                true    164     65                  NA                NA
## 4                true     NA     NA                  NA                NA
## 5                         NA     NA                  NA                NA
## 6                true    159     70                  NA                NA
##   hs_crp  wbc hgb   hct plt neut_percentage lymph_percentage
## 1   0.55 4.29 8.3 0.394 233            52.4             32.9
## 2   1.16 4.73 9.6 0.427 159            36.1             48.9
## 3   1.23 6.41 8.3 0.389 316            59.6             28.5
## 4   0.16 5.02 8.4 0.391 218            66.8             23.5
## 5   0.77 5.98 9.2 0.451 306            69.9             21.2
## 6   3.25 5.28 7.9 0.398 292            51.4             35.8
##   mono_percentage eos_percentage baso_percentage luc_percentage
## 1             8.2            2.1             0.7            3.8
## 2             9.3            1.2             1.0            3.5
## 3             5.1            4.1             0.7            2.0
## 4             4.9            2.7             0.7            1.4
## 5             5.2            1.3             0.7            1.8
## 6             7.3            2.0             0.7            2.8
##   tot_cholesterol hdl_cholesterol triglycerides systolic_blood_pressure
## 1            4.95            1.64          1.25                      NA
## 2            5.84            2.00          1.14                      NA
## 3            6.39            1.70          3.67                      NA
## 4            4.66            0.98          4.19                      NA
## 5            5.11            1.63          0.80                      NA
## 6            5.77            1.60          2.06                      NA
##   diastolic_blood_pressure lipid_lowering_med blood_pressure_lowering_med
## 1                       NA                  1                       false
## 2                       NA                  0                       false
## 3                       NA                  0                       false
## 4                       NA                  1                        true
## 5                       NA                  0                       false
## 6                       NA                  0                        true
##   metabolic_syndrome diabetes      age
## 1               <NA>    false 80.71781
## 2               <NA>    false 83.37534
## 3               <NA>    false 65.44658
## 4               <NA>     true 64.94247
## 5               <NA>          58.25205
## 6               <NA>    false 72.10685

mmLLS <- merge(mLLS, measurements, by.x = "id", by.y = "sample_id", all.x = TRUE, 
    suffixes = c("_merged", "_measurements"))

## Warning in merge.data.frame(mLLS, measurements, by.x = "id", by.y =
## "sample_id", : column name 'id' is duplicated in the result
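
The warning arises because the key column of the left table and a non-key column of the right table are both called `id`, and `merge` only applies suffixes to shared names among the non-key columns. One way to avoid the duplicate, sketched here on invented toy data, is to rename the measurement identifier before merging; the name `measurement_id` and the `toy_` objects are assumptions for illustration.

```r
# Toy stand-ins for the merged sample/subject table and the measurements table.
toy_merged <- data.frame(id = c("BB-1", "BB-2"),
                         subject_id = c("BB-S1", "BB-S2"),
                         stringsAsFactors = FALSE)
toy_meas <- data.frame(id = c("BBMRI-BB.1", "BBMRI-BB.2"),
                       sample_id = c("BB-1", "BB-2"),
                       lac = c(1.1, 0.9),
                       stringsAsFactors = FALSE)

# Rename the measurement identifier so it cannot clash with the sample 'id'.
names(toy_meas)[names(toy_meas) == "id"] <- "measurement_id"

toy_all <- merge(toy_merged, toy_meas, by.x = "id", by.y = "sample_id",
                 all.x = TRUE)

# All column names are now unique.
names(toy_all)
anyDuplicated(names(toy_all))
```

Unique column names matter later on, because selecting `mmLLS$id` or `mmLLS[, "id"]` from a data frame with two `id` columns silently returns only the first one.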

dim(mmLLS)

## [1] 3311  299

head(mmLLS)

##                  id           subject_id biobank_samples
## 1   LLS_PARTOFFS-10   LLS_PARTOFFS-53021    LLS_PARTOFFS
## 2  LLS_PARTOFFS-100  LLS_PARTOFFS-233030    LLS_PARTOFFS
## 3 LLS_PARTOFFS-1000 LLS_PARTOFFS-1173060    LLS_PARTOFFS
## 4 LLS_PARTOFFS-1001 LLS_PARTOFFS-1173120    LLS_PARTOFFS
## 5 LLS_PARTOFFS-1002 LLS_PARTOFFS-1173100    LLS_PARTOFFS
## 6 LLS_PARTOFFS-1003 LLS_PARTOFFS-1173080    LLS_PARTOFFS
##            date_collection date_inclusion sample_matrix fasting
## 1 2002-09-27T00:00:00+0200           <NA>   EDTA plasma   false
## 2 2002-12-04T00:00:00+0100           <NA>   EDTA plasma   false
## 3 2003-11-28T00:00:00+0100           <NA>   EDTA plasma   false
## 4 2003-11-28T00:00:00+0100           <NA>   EDTA plasma   false
## 5 2003-11-28T00:00:00+0100           <NA>   EDTA plasma   false
## 6 2003-11-28T00:00:00+0100           <NA>   EDTA plasma   false
##   time_handling temp_storage time_storage biobank_subjects bios_id
## 1             5          -80          139     LLS_PARTOFFS    <NA>
## 2             4          -80          136     LLS_PARTOFFS        
## 3             6          -80          125     LLS_PARTOFFS        
## 4             4          -80          125     LLS_PARTOFFS        
## 5             5          -80          125     LLS_PARTOFFS    1002
## 6             5          -80          125     LLS_PARTOFFS        
##              date_of_birth age_bloodcollection gender pedigree_information
## 1 1946-03-28T00:00:00+0100                  NA female                 true
## 2 1947-12-18T00:00:00+0100                  NA   male                 true
## 3 1938-10-21T00:00:00+0020                  NA   male                 true
## 4 1949-03-07T00:00:00+0100                  NA female                 true
## 5 1945-07-08T00:00:00+0200                  NA female                 true
## 6 1940-12-24T00:00:00+0200                  NA   male                 true
##      gwas_platform_used gwas_available_date dna_amount      dna_source
## 1 Illumina Omni-express 2012-01-01 00:00:00       true EDTA buffy coat
## 2          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 3          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 4          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 5          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
## 6          Illumina 660 2012-01-01 00:00:00       true EDTA buffy coat
##   rna_amount            rna_source        date_of_inclusion smoking
## 1       true Whole blood (PAXgene) 2002-09-27T00:00:00+0200    true
## 2       true Whole blood (PAXgene) 2002-12-04T00:00:00+0100        
## 3       true Whole blood (PAXgene) 2003-11-28T00:00:00+0100   false
## 4       true Whole blood (PAXgene) 2003-11-28T00:00:00+0100   false
## 5       true Whole blood (PAXgene) 2003-11-28T00:00:00+0100   false
## 6       true Whole blood (PAXgene) 2003-11-28T00:00:00+0100   false
##   alcohol_consumption height weight waist_circumference hip_circumference
## 1                true    174     75                  NA                NA
## 2                         NA     NA                  NA                NA
## 3                true    175     78                  NA                NA
## 4                true    167     64                  NA                NA
## 5                true    160     74                  NA                NA
## 6                true    167     83                  NA                NA
##   hs_crp  wbc  hgb   hct plt neut_percentage lymph_percentage
## 1   1.73 7.08  9.6 0.463 372            55.0             34.6
## 2   1.34 4.31  9.4 0.437 246            59.7             25.5
## 3   7.02 7.60  9.1 0.443 241            74.6             16.4
## 4   0.71 5.97  8.6 0.433 274            69.6             21.4
## 5   1.09 6.92  8.5 0.405 254            60.5             30.2
## 6   4.01 5.66 10.3 0.507 194            53.7             34.2
##   mono_percentage eos_percentage baso_percentage luc_percentage
## 1             5.4            3.0             0.6            1.4
## 2             6.6            4.2             2.9            1.2
## 3             5.6            1.4             0.4            1.6
## 4             4.6            2.3             0.5            1.7
## 5             5.1            2.1             0.5            1.6
## 6             6.6            2.0             0.7            2.8
##   tot_cholesterol hdl_cholesterol triglycerides systolic_blood_pressure
## 1            7.69            1.01          1.50                      NA
## 2            5.80            1.01          4.54                      NA
## 3            4.04            1.01          1.66                      NA
## 4            5.66            1.54          2.32                      NA
## 5            5.97            1.42          1.06                      NA
## 6            6.11            1.74          2.18                      NA
##   diastolic_blood_pressure lipid_lowering_med blood_pressure_lowering_med
## 1                       NA                  0                       false
## 2                       NA                  0                        true
## 3                       NA                  0                       false
## 4                       NA                  0                       false
## 5                       NA                  0                       false
## 6                       NA                  0                       false
##   metabolic_syndrome diabetes      age                      id    acace
## 1                       false 70.18904   BBMRI-LLS_PARTOFFS.10 0.003705
## 2               <NA>    false 68.46301  BBMRI-LLS_PARTOFFS.100 0.007047
## 3               <NA>    false 77.62740 BBMRI-LLS_PARTOFFS.1000 0.057550
## 4               <NA>    false 67.24384 BBMRI-LLS_PARTOFFS.1001 0.039930
## 5               <NA>    false 70.90959 BBMRI-LLS_PARTOFFS.1002 0.025510
## 6               <NA>    false 75.44932 BBMRI-LLS_PARTOFFS.1003 0.045990
##       ace    ala     alb apoa1   apob apob_apoa1  bohbut     cit     cla
## 1 0.02638 0.3544 0.09067 1.650 1.4050     0.8518 0.03281 0.04117 0.06249
## 2 0.02098 0.3569 0.08845 1.629 1.1800     0.7245 0.04961 0.07549 0.01911
## 3 0.03499 0.2318 0.08536 1.326 0.8481     0.6397 0.09579 0.04290 0.01517
## 4 0.03966 0.2608 0.08722 1.697 1.0380     0.6115 0.08170 0.09927 0.03234
## 5 0.03837 0.3309 0.09331 1.607 1.0040     0.6249 0.06824 0.05616 0.03013
## 6 0.02615 0.3283 0.09608 1.930 1.1000     0.5702 0.18570 0.10930 0.02280
##   cla_fa    crea      dag   dag_tg    dha dha_fa  estc falen   faw3
## 1 0.4267 0.05741 0.005182 0.003562 0.2108 1.4390 4.786 17.02 0.6226
## 2 0.1210 0.06608 0.046690 0.016250 0.1262 0.7990 2.967 17.67 0.4184
## 3 0.1562 0.05893 0.009926 0.007891 0.1245 1.2820 2.313 17.65 0.3243
## 4 0.2470 0.04218 0.035250 0.021610 0.1210 0.9248 3.275 17.38 0.3542
## 5 0.2801 0.05559 0.013520 0.018010 0.1380 1.2840 3.459 17.03 0.3864
## 6 0.1682 0.06723 0.019080 0.011930 0.1460 1.0770 3.663 18.27 0.4711
##   faw3_fa  faw6 faw6_fa freec   glc    gln    gp  hdl_c  hdl_d hdl_tg
## 1   4.251 4.689   32.01 2.001 3.862 0.4322 1.382 1.6190 10.020 0.1778
## 2   2.649 4.593   29.08 1.229 5.831 0.3969 1.649 1.1300  9.778 0.2212
## 3   3.341 3.352   34.53 1.015 4.540 0.4414 1.548 0.9519  9.555 0.1227
## 4   2.706 4.262   32.56 1.328 4.246 0.5418 1.359 1.4230  9.947 0.1457
## 5   3.594 3.698   34.39 1.428 4.901 0.4521 1.267 1.4300  9.850 0.1016
## 6   3.476 4.233   31.24 1.580 6.019 0.4272 1.357 1.7870 10.070 0.1868
##   hdl2_c hdl3_c     his  idl_c idl_c_percentage idl_ce idl_ce_percentage
## 1 0.7916 0.8271 0.04683 1.1440            63.48 0.7836             43.49
## 2 0.7628 0.3668 0.01640 0.5548            61.06 0.4241             46.68
## 3 0.5535 0.3985 0.05765 0.4923            62.47 0.3516             44.61
## 4 0.9848 0.4378 0.06723 0.6946            65.45 0.4883             46.01
## 5 0.9213 0.5084 0.05599 0.7886            65.46 0.5450             45.24
## 6 1.3100 0.4764 0.06068 0.7583            62.03 0.5429             44.41
##   idl_fc idl_fc_percentage  idl_l         idl_p idl_pl idl_pl_percentage
## 1 0.3603             20.00 1.8020 0.00000017640 0.4668             25.90
## 2 0.1307             14.39 0.9086 0.00000009239 0.2165             23.83
## 3 0.1407             17.85 0.7881 0.00000007802 0.2100             26.64
## 4 0.2063             19.44 1.0610 0.00000010340 0.2712             25.55
## 5 0.2436             20.22 1.2050 0.00000011670 0.3205             26.60
## 6 0.2154             17.62 1.2220 0.00000012160 0.3172             25.95
##    idl_tg idl_tg_percentage     ile l_hdl_c l_hdl_c_percentage l_hdl_ce
## 1 0.19120            10.610 0.04400  0.3633              53.01   0.2705
## 2 0.13730            15.110 0.07118  0.1973              43.03   0.1582
## 3 0.08584            10.890 0.05212  0.1327              42.17   0.1094
## 4 0.09552             9.001 0.08965  0.3381              48.92   0.2564
## 5 0.09553             7.931 0.03739  0.2927              46.25   0.2189
## 6 0.14700            12.030 0.05390  0.4758              45.91   0.3596
##   l_hdl_ce_percentage l_hdl_fc l_hdl_fc_percentage l_hdl_l         l_hdl_p
## 1               39.47  0.09278              13.540  0.6854 0.0000010799999
## 2               34.51  0.03908               8.524  0.4585 0.0000007434000
## 3               34.76  0.02335               7.418  0.3147 0.0000005113000
## 4               37.10  0.08168              11.820  0.6911 0.0000010890000
## 5               34.59  0.07380              11.660  0.6328 0.0000009991001
## 6               34.70  0.11620              11.210  1.0360 0.0000016460000
##   l_hdl_pl l_hdl_pl_percentage l_hdl_tg l_hdl_tg_percentage l_ldl_c
## 1   0.2640               38.52  0.05810               8.477  1.4570
## 2   0.2304               50.25  0.03080               6.719  0.6435
## 3   0.1656               52.61  0.01641               5.215  0.5803
## 4   0.3296               47.68  0.02350               3.400  0.7997
## 5   0.3230               51.05  0.01709               2.700  0.9788
## 6   0.5163               49.82  0.04422               4.267  0.8960
##   l_ldl_c_percentage l_ldl_ce l_ldl_ce_percentage l_ldl_fc
## 1              69.07   1.0620               50.37   0.3944
## 2              62.75   0.4727               46.09   0.1708
## 3              65.72   0.4018               45.50   0.1785
## 4              67.41   0.5670               47.79   0.2327
## 5              69.26   0.6902               48.84   0.2886
## 6              65.36   0.6428               46.89   0.2532
##   l_ldl_fc_percentage l_ldl_l      l_ldl_p l_ldl_pl l_ldl_pl_percentage
## 1               18.70  2.1090 0.0000002959   0.4813               22.82
## 2               16.66  1.0260 0.0000001474   0.2709               26.42
## 3               20.21  0.8831 0.0000001232   0.2424               27.45
## 4               19.61  1.1860 0.0000001657   0.3046               25.68
## 5               20.42  1.4130 0.0000001957   0.3513               24.86
## 6               18.47  1.3710 0.0000001943   0.3405               24.84
##   l_ldl_tg l_ldl_tg_percentage l_vldl_c l_vldl_c_percentage l_vldl_ce
## 1  0.17090               8.103  0.09075               31.04   0.05192
## 2  0.11110              10.830  0.24130               22.63   0.11850
## 3  0.06031               6.830  0.07841               21.73   0.04339
## 4  0.08206               6.917  0.12130               23.15   0.05801
## 5  0.08320               5.887  0.03990               24.73   0.02391
## 6  0.13450               9.807  0.11140               24.17   0.05330
##   l_vldl_ce_percentage l_vldl_fc l_vldl_fc_percentage l_vldl_l
## 1                17.76   0.03883               13.280   0.2923
## 2                11.12   0.12280               11.520   1.0660
## 3                12.02   0.03502                9.703   0.3609
## 4                11.07   0.06332               12.080   0.5241
## 5                14.82   0.01599                9.907   0.1614
## 6                11.57   0.05807               12.600   0.4608
##         l_vldl_p l_vldl_pl l_vldl_pl_percentage l_vldl_tg
## 1 0.000000004834   0.06405                21.91    0.1375
## 2 0.000000018390   0.19020                17.84    0.6347
## 3 0.000000006301   0.05993                16.60    0.2226
## 4 0.000000009043   0.08432                16.09    0.3185
## 5 0.000000002791   0.02615                16.21    0.0953
## 6 0.000000007895   0.07842                17.02    0.2711
##   l_vldl_tg_percentage    la la_fa    lac ldl_c ldl_d ldl_tg     leu
## 1                47.05 3.678 25.11 1.1470 2.858 23.61 0.2977 0.05075
## 2                59.53 4.004 25.35 1.0570 1.227 23.46 0.2143 0.05892
## 3                61.67 2.801 28.84 1.0200 1.109 23.60 0.1017 0.06372
## 4                60.76 3.644 27.84 0.9523 1.514 23.55 0.1449 0.09815
## 5                59.06 2.954 27.47 1.3060 1.890 23.58 0.1400 0.05319
## 6                58.82 3.527 26.02 1.8790 1.708 23.58 0.2397 0.07647
##   m_hdl_c m_hdl_c_percentage m_hdl_ce m_hdl_ce_percentage m_hdl_fc
## 1  0.5278              59.66   0.4139               46.78  0.11400
## 2  0.4069              45.63   0.3280               36.79  0.07887
## 3  0.3476              48.05   0.2819               38.97  0.06566
## 4  0.4606              48.48   0.3724               39.20  0.08820
## 5  0.4736              51.49   0.3697               40.20  0.10390
## 6  0.5756              47.85   0.4413               36.68  0.13430
##   m_hdl_fc_percentage m_hdl_l     m_hdl_p m_hdl_pl m_hdl_pl_percentage
## 1              12.880  0.8847 0.000002028   0.3108               35.13
## 2               8.845  0.8917 0.000002132   0.4095               45.92
## 3               9.077  0.7234 0.000001714   0.3292               45.51
## 4               9.283  0.9502 0.000002243   0.4372               46.02
## 5              11.300  0.9197 0.000002141   0.4043               43.96
## 6              11.160  1.2030 0.000002820   0.5630               46.80
##   m_hdl_tg m_hdl_tg_percentage m_ldl_c m_ldl_c_percentage m_ldl_ce
## 1  0.04601               5.201  0.8815              71.13   0.6572
## 2  0.07533               8.448  0.3658              58.95   0.2392
## 3  0.04662               6.445  0.3303              64.42   0.2146
## 4  0.05230               5.504  0.4392              65.18   0.2986
## 5  0.04179               4.544  0.5671              69.63   0.4095
## 6  0.06441               5.354  0.5028              63.81   0.3481
##   m_ldl_ce_percentage m_ldl_fc m_ldl_fc_percentage m_ldl_l       m_ldl_p
## 1               53.04   0.2243               18.10  1.2390 0.00000024270
## 2               38.55   0.1266               20.40  0.6205 0.00000012290
## 3               41.86   0.1157               22.56  0.5127 0.00000009882
## 4               44.31   0.1406               20.87  0.6738 0.00000013090
## 5               50.27   0.1576               19.35  0.8145 0.00000015810
## 6               44.18   0.1547               19.64  0.7880 0.00000015530
##   m_ldl_pl m_ldl_pl_percentage m_ldl_tg m_ldl_tg_percentage m_vldl_c
## 1   0.2769               22.34  0.08086               6.525   0.2449
## 2   0.2003               32.28  0.05442               8.770   0.4056
## 3   0.1593               31.08  0.02308               4.502   0.1934
## 4   0.2000               29.68  0.03458               5.132   0.2384
## 5   0.2115               25.97  0.03584               4.400   0.1516
## 6   0.2221               28.18  0.06308               8.006   0.2344
##   m_vldl_c_percentage m_vldl_ce m_vldl_ce_percentage m_vldl_fc
## 1               43.77   0.16310                29.16   0.08175
## 2               25.85   0.21400                13.64   0.19160
## 3               25.67   0.10760                14.28   0.08585
## 4               28.06   0.14100                16.60   0.09740
## 5               33.32   0.09792                21.53   0.05366
## 6               30.38   0.13090                16.97   0.10350
##   m_vldl_fc_percentage m_vldl_l      m_vldl_p m_vldl_pl
## 1                14.61   0.5594 0.00000001570   0.10910
## 2                12.21   1.5690 0.00000004720   0.29990
## 3                11.40   0.7534 0.00000002276   0.14160
## 4                11.46   0.8496 0.00000002548   0.15790
## 5                11.80   0.4549 0.00000001336   0.08907
## 6                13.42   0.7716 0.00000002276   0.14970
##   m_vldl_pl_percentage m_vldl_tg m_vldl_tg_percentage  mufa mufa_fa    pc
## 1                19.50    0.2054                36.72 3.454   23.58 2.368
## 2                19.12    0.8634                55.03 4.373   27.69 1.962
## 3                18.80    0.4183                55.53 2.250   23.18 1.485
## 4                18.58    0.4534                53.36 3.216   24.57 1.999
## 5                19.58    0.2143                47.10 2.704   25.14 1.851
## 6                19.40    0.3875                50.22 3.895   28.74 2.505
##       phe  pufa pufa_fa    pyr remnant_c s_hdl_c s_hdl_c_percentage
## 1 0.04166 5.311   36.26 0.1114     2.310  0.5023              49.98
## 2 0.04510 5.012   31.73 0.1030     1.839  0.3623              33.23
## 3 0.04048 3.677   37.87 0.1131     1.267  0.4052              38.49
## 4 0.05951 4.617   35.27 0.1080     1.667  0.3930              35.79
## 5 0.04440 4.085   37.99 0.1160     1.568  0.5056              45.34
## 6 0.04973 4.704   34.71 0.1418     1.748  0.4794              37.87
##   s_hdl_ce s_hdl_ce_percentage s_hdl_fc s_hdl_fc_percentage s_hdl_l
## 1   0.3848               38.30   0.1174               11.69   1.005
## 2   0.2341               21.47   0.1282               11.76   1.090
## 3   0.2868               27.25   0.1184               11.24   1.053
## 4   0.2606               23.73   0.1324               12.06   1.098
## 5   0.3934               35.28   0.1122               10.06   1.115
## 6   0.3110               24.57   0.1684               13.30   1.266
##       s_hdl_p s_hdl_pl s_hdl_pl_percentage s_hdl_tg s_hdl_tg_percentage
## 1 0.000004470   0.4430               44.09  0.05957               5.928
## 2 0.000004970   0.6441               59.08  0.08376               7.683
## 3 0.000004746   0.5960               56.62  0.05145               4.888
## 4 0.000004948   0.6542               59.57  0.05092               4.637
## 5 0.000004990   0.5730               51.39  0.03648               3.272
## 6 0.000005659   0.7301               57.67  0.05655               4.466
##   s_ldl_c s_ldl_c_percentage s_ldl_ce s_ldl_ce_percentage s_ldl_fc
## 1  0.5199              69.84   0.3936               52.87  0.12640
## 2  0.2174              52.43   0.1385               33.40  0.07889
## 3  0.1979              60.18   0.1276               38.80  0.07033
## 4  0.2748              61.15   0.1869               41.60  0.08787
## 5  0.3436              67.43   0.2489               48.84  0.09473
## 6  0.3093              60.38   0.2150               41.98  0.09426
##   s_ldl_fc_percentage s_ldl_l      s_ldl_p s_ldl_pl s_ldl_pl_percentage
## 1               16.98  0.7444 0.0000002635   0.1785               23.98
## 2               19.02  0.4147 0.0000001510   0.1485               35.82
## 3               21.38  0.3289 0.0000001155   0.1126               34.25
## 4               19.55  0.4493 0.0000001591   0.1463               32.57
## 5               18.59  0.5096 0.0000001788   0.1450               28.45
## 6               18.40  0.5122 0.0000001834   0.1607               31.38
##   s_ldl_tg s_ldl_tg_percentage s_vldl_c s_vldl_c_percentage s_vldl_ce
## 1  0.04599               6.177   0.3741               41.97    0.2456
## 2  0.04876              11.760   0.3444               31.12    0.1875
## 3  0.01833               5.573   0.2552               35.08    0.1502
## 4  0.02822               6.280   0.3024               39.95    0.1842
## 5  0.02099               4.120   0.2801               44.68    0.1773
## 6  0.04221               8.241   0.3160               40.44    0.1891
##   s_vldl_ce_percentage s_vldl_fc s_vldl_fc_percentage s_vldl_l
## 1                27.55    0.1285                14.41   0.8915
## 2                16.94    0.1569                14.18   1.1060
## 3                20.65    0.1049                14.43   0.7274
## 4                24.34    0.1182                15.61   0.7568
## 5                28.28    0.1028                16.41   0.6268
## 6                24.20    0.1269                16.24   0.7813
##        s_vldl_p s_vldl_pl s_vldl_pl_percentage s_vldl_tg
## 1 0.00000004408    0.2436                27.33    0.2737
## 2 0.00000005748    0.2453                22.17    0.5168
## 3 0.00000003721    0.1651                22.70    0.3072
## 4 0.00000003791    0.1716                22.67    0.2829
## 5 0.00000003070    0.1505                24.00    0.1963
## 6 0.00000003892    0.1826                23.37    0.2828
##   s_vldl_tg_percentage serum_c serum_tg   sfa sfa_fa     sm  tg_pg totcho
## 1                30.71   6.787   1.5180 5.882  40.16 0.6856 0.6091  2.974
## 2                46.71   4.196   3.0670 6.409  40.58 0.3414 1.4670  2.235
## 3                42.22   3.327   1.4430 3.782  38.95 0.4188 0.8645  1.799
## 4                37.38   4.603   1.7000 5.257  40.16 0.5256 0.8147  2.392
## 5                31.31   4.887   0.9563 3.965  36.87 0.4774 0.4211  2.207
## 6                36.19   5.243   1.7760 4.953  36.55 0.4250 0.6577  2.801
##    totfa totpg     tyr unsatdeg    val vldl_c vldl_d vldl_tg xl_hdl_c
## 1 14.650 2.389 0.04587    1.171 0.1545 1.1660  35.66  0.8507  0.22530
## 2 15.790 1.959 0.06946    1.085 0.1113 1.2850  39.96  2.4940  0.16310
## 3  9.709 1.455 0.05004    1.181 0.1827 0.7745  37.84  1.1320  0.06647
## 4 13.090 2.002 0.11030    1.121 0.2065 0.9724  38.25  1.3140  0.23100
## 5 10.750 1.783 0.06058    1.229 0.1339 0.7794  35.81  0.6191  0.15790
## 6 13.550 2.432 0.09906    1.218 0.1766 0.9896  37.48  1.2020  0.25610
##   xl_hdl_c_percentage xl_hdl_ce xl_hdl_ce_percentage xl_hdl_fc
## 1               52.17   0.19530                45.22   0.03002
## 2               58.40   0.12150                43.50   0.04160
## 3               67.92   0.05457                55.77   0.01189
## 4               57.73   0.17100                42.74   0.06000
## 5               54.80   0.11580                40.19   0.04207
## 6               50.70   0.18810                37.24   0.06796
##   xl_hdl_fc_percentage xl_hdl_l      xl_hdl_p xl_hdl_pl
## 1                6.951  0.43190 0.00000043410   0.19250
## 2               14.890  0.27940 0.00000027500   0.08489
## 3               12.160  0.09785 0.00000009559   0.02313
## 4               15.000  0.40000 0.00000038760   0.15010
## 5               14.600  0.28810 0.00000027880   0.12400
## 6               13.460  0.50510 0.00000049630   0.22740
##   xl_hdl_pl_percentage xl_hdl_tg xl_hdl_tg_percentage xl_vldl_c
## 1                44.56  0.014130                3.271  0.011350
## 2                30.39  0.031340               11.220  0.064820
## 3                23.64  0.008258                8.439  0.016840
## 4                37.51  0.019010                4.751  0.031350
## 5                43.04  0.006249                2.169  0.004626
## 6                45.01  0.021650                4.286  0.026650
##   xl_vldl_c_percentage xl_vldl_ce xl_vldl_ce_percentage xl_vldl_fc
## 1                17.64   0.003116                 4.845   0.008232
## 2                19.95   0.034680                10.680   0.030140
## 3                21.23   0.009798                12.350   0.007041
## 4                20.26   0.016340                10.560   0.015010
## 5                29.54   0.002569                16.410   0.002057
## 6                20.40   0.013730                10.510   0.012930
##   xl_vldl_fc_percentage xl_vldl_l       xl_vldl_p xl_vldl_pl
## 1                12.800   0.06432 0.0000000006560   0.010790
## 2                 9.278   0.32490 0.0000000033340   0.052500
## 3                 8.875   0.07933 0.0000000008151   0.011370
## 4                 9.702   0.15470 0.0000000015850   0.024540
## 5                13.140   0.01566 0.0000000001534   0.002668
## 6                 9.894   0.13070 0.0000000013330   0.022550
##   xl_vldl_pl_percentage xl_vldl_tg xl_vldl_tg_percentage xs_vldl_c
## 1                 16.77   0.042190                 65.59    0.4368
## 2                 16.16   0.207500                 63.88    0.2064
## 3                 14.33   0.051130                 64.45    0.2253
## 4                 15.86   0.098840                 63.88    0.2687
## 5                 17.04   0.008364                 53.42    0.3013
## 6                 17.26   0.081450                 62.34    0.2920
##   xs_vldl_c_percentage xs_vldl_ce xs_vldl_ce_percentage xs_vldl_fc
## 1                49.42     0.2878                 32.57    0.14890
## 2                38.75     0.1352                 25.38    0.07120
## 3                49.01     0.1495                 32.50    0.07590
## 4                49.96     0.1813                 33.72    0.08738
## 5                51.67     0.1922                 32.95    0.10910
## 6                46.90     0.1871                 30.05    0.10490
##   xs_vldl_fc_percentage xs_vldl_l     xs_vldl_p xs_vldl_pl
## 1                 16.85    0.8838 0.00000006895     0.2776
## 2                 13.36    0.5328 0.00000004427     0.1414
## 3                 16.50    0.4598 0.00000003644     0.1202
## 4                 16.25    0.5378 0.00000004230     0.1523
## 5                 18.71    0.5832 0.00000004480     0.1819
## 6                 16.85    0.6227 0.00000004922     0.1857
##   xs_vldl_pl_percentage xs_vldl_tg xs_vldl_tg_percentage xxl_vldl_c
## 1                 31.41    0.16940                 19.17   0.008250
## 2                 26.54    0.18500                 34.72   0.022160
## 3                 26.13    0.11430                 24.87   0.005346
## 4                 28.32    0.11680                 21.72   0.010200
## 5                 31.19    0.09996                 17.14   0.001834
## 6                 29.82    0.14490                 23.28   0.009104
##   xxl_vldl_c_percentage xxl_vldl_ce xxl_vldl_ce_percentage xxl_vldl_fc
## 1                 26.65    0.006399                 20.670   0.0018510
## 2                 17.89    0.011980                  9.672   0.0101800
## 3                 19.71    0.003108                 11.460   0.0022380
## 4                 16.53    0.004910                  7.962   0.0052870
## 5                 22.99    0.001438                 18.040   0.0003954
## 6                 18.26    0.004670                  9.365   0.0044330
##   xxl_vldl_fc_percentage xxl_vldl_l       xxl_vldl_p xxl_vldl_pl
## 1                  5.979   0.030960 0.00000000014470   0.0003335
## 2                  8.218   0.123800 0.00000000057710   0.0149700
## 3                  8.250   0.027130 0.00000000012590   0.0030720
## 4                  8.573   0.061680 0.00000000028780   0.0078650
## 5                  4.958   0.007975 0.00000000003677   0.0011910
## 6                  8.890   0.049870 0.00000000023120   0.0063730
##   xxl_vldl_pl_percentage xxl_vldl_tg xxl_vldl_tg_percentage
## 1                  1.077     0.02238                  72.28
## 2                 12.090     0.08672                  70.02
## 3                 11.330     0.01871                  68.97
## 4                 12.750     0.04361                  70.71
## 5                 14.940     0.00495                  62.07
## 6                 12.780     0.03439                  68.96
##   abnormal_macromolecule_a low_glucose low_glutamine_high_glutamate
## 1                    false       false                        false
## 2                    false       false                        false
## 3                    false       false                        false
## 4                    false       false                        false
## 5                    false       false                        false
## 6                    false       false                        false
##   low_protein_content high_citrate high_ethanol high_lactate high_pyruvate
## 1               false        false        false        false         false
## 2               false        false        false        false         false
## 3               false        false        false        false         false
## 4               false        false        false        false         false
## 5               false        false        false        false         false
## 6               false        false        false        false         false
##   serum_sample unidentified_small_molecule_a unidentified_small_molecule_b
## 1        false                         false                         false
## 2        false                         false                         false
## 3        false                         false                         false
## 4        false                         false                         false
## 5        false                         false                         false
## 6        false                         false                         false
##   unknown_acetylated_compound isopropyl_alcohol polysaccharides
## 1                       false             false           false
## 2                       false             false           false
## 3                       false             false           false
## 4                       false             false           false
## 5                       false             false           false
## 6                       false             false           false
##   aminocaproic_acid  fast      biobank
## 1             false false LLS_PARTOFFS
## 2             false false LLS_PARTOFFS
## 3             false false LLS_PARTOFFS
## 4             false false LLS_PARTOFFS
## 5             false false LLS_PARTOFFS
## 6             false false LLS_PARTOFFS
Using metabolomic SummarizedExperiments
data(metabolomics_RP3RP4_overlap)
metabolomicData

## class: SummarizedExperiment0 
## dim: 247 3882 
## metadata(0):
## assays(1): measurements
## rownames(247): acace ace ... aminocaproic_acid fast
## metadata column names(0):
## colnames: NULL
## colData names(51): biobank subject_id ... temp_storage
##   time_storage

colData(metabolomicData)

## DataFrame with 3882 rows and 51 columns
##       biobank  subject_id     bios_id date_of_birth age_bloodcollection
##      <factor>    <factor> <character>      <factor>           <numeric>
## 1       VUNTR  VUNTR-A20A        A20A                              56.7
## 2       VUNTR  VUNTR-A20B        A20B                              56.0
## 3       VUNTR  VUNTR-A20C        A20C                              31.0
## 4       VUNTR  VUNTR-A21A        A21A                              65.0
## 5       VUNTR  VUNTR-A21B        A21B                              59.7
## ...       ...         ...         ...           ...                 ...
## 3878    VUNTR  VUNTR-A56C        A56C                              32.6
## 3879    VUNTR VUNTR-A573C       A573C                              23.6
## 3880    VUNTR VUNTR-A573D       A573D                              23.6
## 3881    VUNTR VUNTR-A574C       A574C                              22.4
## 3882    VUNTR VUNTR-A575C       A575C                              22.8
##        gender pedigree_information
##      <factor>             <factor>
## 1        true                 true
## 2       false                 true
## 3        true                 true
## 4        true                 true
## 5       false                 true
## ...       ...                  ...
## 3878    false                 true
## 3879    false                 true
## 3880    false                 true
## 3881     true                 true
## 3882     true                 true
##                              gwas_platform_used gwas_available_date
##                                        <factor>            <factor>
## 1                              Illumina Omni 1M             06-2015
## 2                              Illumina Omni 1M             06-2015
## 3                              Illumina Omni 1M             06-2015
## 4                              Illumina Omni 1M             06-2015
## 5                              Illumina Omni 1M             06-2015
## ...                                         ...                 ...
## 3878      Illumina Omni 1M; Affymetrix 6.0 907K             06-2015
## 3879 Affymetrix Genome-Wide Human SNP Array 6.0             06-2015
## 3880 Affymetrix Genome-Wide Human SNP Array 6.0             06-2015
## 3881                        Affymetrix 6.0 907K             06-2015
## 3882                                                               
##      dna_amount dna_source rna_amount rna_source        date_of_inclusion
##        <factor>   <factor>   <factor>   <factor>                 <factor>
## 1          true      blood       true      blood 2008-10-01T00:00:00+0200
## 2          true      blood       true      blood 2008-10-01T00:00:00+0200
## 3          true      blood       true      blood 2008-11-01T00:00:00+0100
## 4          true      blood       true      blood 2008-03-01T00:00:00+0100
## 5          true      blood       true      blood 2008-03-01T00:00:00+0100
## ...         ...        ...        ...        ...                      ...
## 3878       true      blood       true      blood 2008-07-01T00:00:00+0200
## 3879       true      blood       true      blood 2012-09-01T00:00:00+0200
## 3880       true      blood       true      blood 2012-09-01T00:00:00+0200
## 3881       true      blood       true      blood 2012-04-01T00:00:00+0200
## 3882       true      blood       true      blood 2012-07-01T00:00:00+0200
##       smoking alcohol_consumption    height    weight waist_circumference
##      <factor>            <factor> <numeric> <numeric>           <numeric>
## 1        true                           183      89.5                  96
## 2       false                           178     106.2                 105
## 3        true                           189      70.5                  73
## 4       false                           176      83.3                  94
## 5       false                           169      64.6                  69
## ...       ...                 ...       ...       ...                 ...
## 3878    false                           159      54.0                  76
## 3879    false                           174      61.9                  72
## 3880    false                           174      60.5                  69
## 3881    false                           183      94.7                  89
## 3882    false                           188      68.6                  74
##      hip_circumference    hs_crp       wbc       hgb       hct       plt
##              <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
## 1                   98      2.40      11.7       9.7     0.455       221
## 2                  123      3.74       7.0       9.1     0.403       273
## 3                   94      0.67       5.0       9.3     0.435       162
## 4                  105      7.50       7.6       9.7     0.486       317
## 5                   95      6.66       6.5       8.1     0.408        97
## ...                ...       ...       ...       ...       ...       ...
## 3878                97     7.300      7.80       7.5     0.356       223
## 3879                96     2.600      5.44       8.7     0.416       196
## 3880                92     2.320      6.56       8.4     0.403       186
## 3881               109     0.344      4.52       9.8     0.465       278
## 3882                89     1.280      6.68       9.4     0.437       262
##      neut_percentage lymph_percentage mono_percentage eos_percentage
##            <numeric>        <numeric>       <numeric>      <numeric>
## 1               63.8             28.4             5.2            2.4
## 2               55.9             31.7             8.1            2.5
## 3               48.4             37.5             8.3            5.6
## 4               69.4             21.0             7.4            2.2
## 5               65.3             23.8             7.1            2.9
## ...              ...              ...             ...            ...
## 3878            72.3             23.1             3.7            0.7
## 3879            43.9             44.3             9.9            1.7
## 3880            41.0             47.9             9.5            1.5
## 3881            47.8             37.8            13.1            1.1
## 3882            57.5             28.0            10.9            3.4
##      baso_percentage luc_percentage tot_cholesterol hdl_cholesterol
##            <numeric>      <numeric>       <numeric>       <numeric>
## 1                0.2             NA            6.85            0.93
## 2                1.8             NA            5.06            1.44
## 3                0.2             NA            3.64            1.25
## 4                0.0             NA            5.20            1.50
## 5                0.9             NA            5.78            2.30
## ...              ...            ...             ...             ...
## 3878             0.2             NA            5.67            2.22
## 3879             0.2             NA            4.80            1.30
## 3880             0.2             NA            4.50            1.20
## 3881             0.2             NA            6.30            0.80
## 3882             0.1             NA            4.20            1.40
##      triglycerides systolic_blood_pressure diastolic_blood_pressure
##          <numeric>               <numeric>                <numeric>
## 1             1.81                      NA                       NA
## 2             0.96                      NA                       NA
## 3             0.42                      NA                       NA
## 4             0.85                      NA                       NA
## 5             0.79                      NA                       NA
## ...            ...                     ...                      ...
## 3878          1.31                      NA                       NA
## 3879          1.33                   128.0                     87.5
## 3880          1.16                   133.5                     81.5
## 3881          3.18                   143.0                     77.5
## 3882          1.30                   132.0                     81.5
##      lipid_lowering_med blood_pressure_lowering_med metabolic_syndrome
##               <integer>                    <factor>           <factor>
## 1                     0                       false                   
## 2                     0                       false                   
## 3                     0                       false                   
## 4                     0                       false                   
## 5                     0                       false                   
## ...                 ...                         ...                ...
## 3878                  0                       false                   
## 3879                  0                       false                   
## 3880                  0                       false                   
## 3881                  0                       false                   
## 3882                  0                       false                   
##      diabetes              real_bios_id biobank.1 subject_id.1
##      <factor>               <character>  <factor>     <factor>
## 1                NTR-A20A-NTR16223-9207     VUNTR   VUNTR-A20A
## 2                NTR-A20B-NTR16225-9208     VUNTR   VUNTR-A20B
## 3                NTR-A20C-NTR16565-9388     VUNTR   VUNTR-A20C
## 4                NTR-A21A-NTR13461-7740     VUNTR   VUNTR-A21A
## 5                         NTR-A21B-7741     VUNTR   VUNTR-A21B
## ...       ...                       ...       ...          ...
## 3878             NTR-A56C-NTR15095-8604     VUNTR   VUNTR-A56C
## 3879          NTR-A573C-NT0027615-10346     VUNTR  VUNTR-A573C
## 3880          NTR-A573D-NT0027614-10345     VUNTR  VUNTR-A573D
## 3881          NTR-A574C-NT0027387-10113     VUNTR  VUNTR-A574C
## 3882                    NTR-A575C-10252     VUNTR  VUNTR-A575C
##                sample_id          date_collection date_inclusion
##                 <factor>                 <factor>    <character>
## 1    VUNTR-9207_64109-31 2008-10-01T00:00:00+0200             NA
## 2    VUNTR-9208_64109-41 2008-10-01T00:00:00+0200             NA
## 3    VUNTR-9388_64109-02 2008-11-01T00:00:00+0100             NA
## 4    VUNTR-7740_65698-31 2008-03-01T00:00:00+0100             NA
## 5    VUNTR-7741_65698-41 2008-03-01T00:00:00+0100             NA
## ...                  ...                      ...            ...
## 3878 VUNTR-8604_68402-01 2008-07-01T00:00:00+0200             NA
## 3879 VUNTR-10346_1062613 2012-09-01T00:00:00+0200             NA
## 3880 VUNTR-10345_1062614 2012-09-01T00:00:00+0200             NA
## 3881 VUNTR-10113_1019101 2012-04-01T00:00:00+0200             NA
## 3882 VUNTR-10252_1049462 2012-07-01T00:00:00+0200             NA
##      sample_matrix  fasting time_handling temp_storage time_storage
##           <factor> <factor>     <numeric>    <numeric>    <numeric>
## 1      EDTA plasma    false             6          -30           NA
## 2      EDTA plasma    false             6          -30           NA
## 3      EDTA plasma     true             6          -30           NA
## 4      EDTA plasma     true             6          -30           NA
## 5      EDTA plasma     true             6          -30           NA
## ...            ...      ...           ...          ...          ...
## 3878   EDTA plasma     true             6          -30           NA
## 3879   EDTA plasma     true             6          -30           NA
## 3880   EDTA plasma     true             6          -30           NA
## 3881   EDTA plasma     true             6          -30           NA
## 3882   EDTA plasma     true             6          -30           NA

# Remove metabolites (rows) that are missing (NA) in every sample.
meas <- assays(metabolomicData)$measurements
remove <- apply(meas, 1, function(x) all(is.na(x)))
metabolomicData <- metabolomicData[!remove, ]
metabolomicData

## class: SummarizedExperiment0 
## dim: 231 3882 
## metadata(0):
## assays(1): measurements
## rownames(231): acace ace ... xxl_vldl_tg xxl_vldl_tg_percentage
## metadata column names(0):
## colnames: NULL
## colData names(51): biobank subject_id ... temp_storage
##   time_storage
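
The filtering step above keeps a metabolite unless it is missing in every sample. As a minimal sketch of that logic, the toy matrix below (hypothetical values and row names chosen for illustration, not BIOS data) is filtered with `rowSums(is.na(...))`, which performs the same all-missing test row by row.

```r
# Toy measurement matrix: rows are metabolites, columns are samples.
# All values are made up for illustration.
meas <- matrix(c(1.2, NA,  0.4,
                 NA,  NA,  NA,
                 0.7, 0.9, NA),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("ace", "fast", "ala"), NULL))

# A row is flagged only when every one of its entries is NA.
all_missing <- rowSums(is.na(meas)) == ncol(meas)

# drop = FALSE keeps the result a matrix even if a single row remains.
kept <- meas[!all_missing, , drop = FALSE]
rownames(kept)
```

Rows with only some missing values (such as the first and third rows here) are retained, so downstream analyses still need to handle partial missingness.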

Genotype data

The impute2 genotype files have been transformed to tabix-files. These tabix files contain dosages and are filtered on MAF 0.05 and INFO 0.04. Additionally, the rs-number, chromosome name and position are added to these files, as well as the sample identifiers (gwas_id).

Reading Impute2 tabix files

TabixFile creates a reference to a Tabix file (and its index). Internally the object tbx contains a pointer to the file. This mechanism allows us to read the data in chunks, e.g. when the whole file does not fit in memory, and perform some operation on each chunk.
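
The chunk-wise mechanism can be sketched directly with Rsamtools. The example below is an illustration only: it assumes the Rsamtools package is installed and uses the small tabix-indexed example file shipped with that package as a stand-in for a BIOS dosage file; the counters `n_chunks` and `n_records` are hypothetical names.

```r
library(Rsamtools)

# Example tabix-indexed file shipped with Rsamtools (a stand-in; the BIOS
# dosage files are not assumed to be available here).
fl <- system.file("extdata", "example.gtf.gz", package = "Rsamtools")

# yieldSize sets how many records each scanTabix() call returns.
tbx <- open(TabixFile(fl, yieldSize = 100))

n_chunks <- 0L
n_records <- 0L
# scanTabix() returns a list holding one character vector of raw lines;
# an empty vector signals that the end of the file has been reached.
while (length(records <- scanTabix(tbx)[[1]]) > 0) {
    n_chunks <- n_chunks + 1L
    n_records <- n_records + length(records)  # per-chunk operation goes here
}
close(tbx)
```

Because only one chunk is held in memory at a time, the peak memory use is governed by `yieldSize` rather than by the size of the file.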

The next code chunk shows how you can do this using plain R.

gzipped <- dir(file.path(RP3DATADIR, "GWAS_ImputationGoNLv5/dosages", BIOBANKS[1]), 
    pattern = "gz$", full.names = TRUE)
chunk <- read.dosages(gzipped[1], yieldSize = 5000)

## Reading chunk...

chunk[1:5, 1:10]

##   snp_id    rs_id position exp_freq_a1  info certainty type chr     rsid
## 1    ---  1-77560    77560       0.001 0.450     0.999    0   1  1-77560
## 2    ---  1-83516    83516       0.001 0.426     0.998    0   1  1-83516
## 3    ---  1-87885    87885       0.001 0.504     0.999    0   1  1-87885
## 4    --- 1-249389   249389       0.002 0.402     0.997    0   1 1-249389
## 5    --- 1-362911   362911       0.002 0.514     0.998    0   1 1-362911
##      pos
## 1  77560
## 2  83516
## 3  87885
## 4 249389
## 5 362911

chunk <- read.dosages(gzipped[1], yieldSize = 5000, type = "GRanges")

## Reading chunk...

chunk

## GRanges object with 5000 ranges and 3 metadata columns:
##          seqnames             ranges strand   |        rsid         ref
##             <Rle>          <IRanges>  <Rle>   | <character> <character>
##      [1]     chr1   [ 77560,  77560]      *   |     1-77560           T
##      [2]     chr1   [ 83516,  83516]      *   |     1-83516           C
##      [3]     chr1   [ 87885,  87885]      *   |     1-87885           A
##      [4]     chr1   [249389, 249389]      *   |    1-249389           A
##      [5]     chr1   [362911, 362911]      *   |    1-362911           G
##      ...      ...                ...    ... ...         ...         ...
##   [4996]     chr1 [1957299, 1957299]      *   |   rs3820007           C
##   [4997]     chr1 [1957414, 1957414]      *   |   1-1957414           T
##   [4998]     chr1 [1958532, 1958532]      *   |   1-1958532           G
##   [4999]     chr1 [1959238, 1959238]      *   | rs114869768           G
##   [5000]     chr1 [1959261, 1959261]      *   |  rs28574670           A
##                  alt
##          <character>
##      [1]           C
##      [2]           T
##      [3]           C
##      [4]           T
##      [5]           T
##      ...         ...
##   [4996]           T
##   [4997]           C
##   [4998]           A
##   [4999]           A
##   [5000]           G
##   -------
##   seqinfo: 1 sequence from an unspecified genome; no seqlengths

chunk <- read.dosages(gzipped[1], yieldSize = 5000, type = "SummarizedExperiment")

## Reading chunk...

chunk

## class: RangedSummarizedExperiment 
## dim: 5000 768 
## metadata(0):
## assays(1): dosage
## rownames: NULL
## rowRanges metadata column names(3): rsid ref alt
## colnames: NULL
## colData names(1): gwas_id

colData(chunk)

## DataFrame with 768 rows and 1 column
##         gwas_id
##     <character>
## 1          7208
## 2          6434
## 3          8640
## 4          4267
## 5          8725
## ...         ...
## 764     2477002
## 765     5399001
## 766     1457001
## 767      543002
## 768     8127001

rowRanges(chunk)

## GRanges object with 5000 ranges and 3 metadata columns:
##          seqnames             ranges strand   |        rsid         ref
##             <Rle>          <IRanges>  <Rle>   | <character> <character>
##      [1]     chr1   [ 77560,  77560]      *   |     1-77560           T
##      [2]     chr1   [ 83516,  83516]      *   |     1-83516           C
##      [3]     chr1   [ 87885,  87885]      *   |     1-87885           A
##      [4]     chr1   [249389, 249389]      *   |    1-249389           A
##      [5]     chr1   [362911, 362911]      *   |    1-362911           G
##      ...      ...                ...    ... ...         ...         ...
##   [4996]     chr1 [1957299, 1957299]      *   |   rs3820007           C
##   [4997]     chr1 [1957414, 1957414]      *   |   1-1957414           T
##   [4998]     chr1 [1958532, 1958532]      *   |   1-1958532           G
##   [4999]     chr1 [1959238, 1959238]      *   | rs114869768           G
##   [5000]     chr1 [1959261, 1959261]      *   |  rs28574670           A
##                  alt
##          <character>
##      [1]           C
##      [2]           T
##      [3]           C
##      [4]           T
##      [5]           T
##      ...         ...
##   [4996]           T
##   [4997]           C
##   [4998]           A
##   [4999]           A
##   [5000]           G
##   -------
##   seqinfo: 1 sequence from an unspecified genome; no seqlengths

assay(chunk)[1:5, 1:5]

##      [,1] [,2]  [,3] [,4] [,5]
## [1,]    0    0 0.000    0    0
## [2,]    0    0 0.001    0    0
## [3,]    0    0 0.001    0    0
## [4,]    0    0 0.001    0    0
## [5,]    0    0 0.000    0    0

With the default type = "data.frame", read.dosages returns a data.frame containing the requested chunk of the dosage file. With type = "GRanges" it returns only the genomic locations of the chunk as a GRanges object, and with type = "SummarizedExperiment" it returns the same information as the data.frame, packaged as a SummarizedExperiment.
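
As a sketch of how the SummarizedExperiment form can be used, the example below builds a small toy object with the same layout as the chunk shown above (a `dosage` assay, `rsid`, `ref` and `alt` as row metadata, and `gwas_id` as column data) and extracts the dosages of one variant. All values, variant names and sample identifiers are hypothetical, and the SummarizedExperiment package is assumed to be installed.

```r
library(SummarizedExperiment)

# Hypothetical toy chunk: 3 variants x 4 samples (values are made up).
dosage <- matrix(c(0, 1, 2, 0,
                   0, 0, 1, 1,
                   2, 2, 1, 0),
                 nrow = 3, byrow = TRUE)
ranges <- GRanges(seqnames = "chr1",
                  ranges = IRanges(start = c(100, 200, 300), width = 1),
                  rsid = c("rsA", "rsB", "rsC"),
                  ref = c("T", "C", "A"),
                  alt = c("C", "T", "G"))
toy <- SummarizedExperiment(
    assays = list(dosage = dosage),
    rowRanges = ranges,
    colData = DataFrame(gwas_id = c("s1", "s2", "s3", "s4")))

# Select one variant by its identifier and return its dosages,
# labelled with the sample identifiers stored in colData.
hit <- toy[rowRanges(toy)$rsid == "rsB", ]
d <- setNames(as.vector(assay(hit, "dosage")), colData(hit)$gwas_id)
d
```

Subsetting the object keeps the assay, the genomic locations and the sample identifiers aligned, which is the main advantage of this return type over separate data.frame columns.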

The GenomicFiles package, which builds on BiocParallel, provides functionality to read the chunks in parallel. The following code chunk shows how to use these functions.
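
A minimal sketch of the yield/map/reduce pattern with `reduceByYield` from GenomicFiles is given below. It is not the BIOSRutils workflow itself: it assumes GenomicFiles and Rsamtools are installed, uses the tabix example file shipped with Rsamtools instead of a dosage file, and the helper names `read_chunk` and `count_records` are hypothetical. Here the per-chunk summary is simply the number of records.

```r
library(GenomicFiles)
library(Rsamtools)

# Stand-in tabix file from Rsamtools; yieldSize defines the chunk size.
fl <- system.file("extdata", "example.gtf.gz", package = "Rsamtools")
tbx <- TabixFile(fl, yieldSize = 100)

# YIELD reads the next chunk, MAP summarises one chunk, and REDUCE
# combines the per-chunk summaries into a single result.
read_chunk <- function(x, ...) scanTabix(x)[[1]]
count_records <- function(value, ...) length(value)

total <- reduceByYield(tbx, YIELD = read_chunk, MAP = count_records,
                       REDUCE = `+`)
total
```

`reduceByYield` also has a `parallel` argument, left at its default here; its help page describes the conditions under which the MAP step can be distributed over workers.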

Use cases

Session info

sessionInfo()

## R version 3.2.0 (2015-04-16)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Ubuntu precise (12.04.5 LTS)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    parallel  methods   stats     graphics  grDevices utils    
## [8] datasets  base     
## 
## other attached packages:
##  [1] ggplot2_2.1.0              lubridate_1.5.6           
##  [3] rjson_0.2.15               RCurl_1.95-4.8            
##  [5] bitops_1.0-6               BIOSRutils_0.0.7          
##  [7] SummarizedExperiment_1.0.2 Biobase_2.30.0            
##  [9] GenomicRanges_1.22.4       GenomeInfoDb_1.6.3        
## [11] IRanges_2.4.8              S4Vectors_0.8.11          
## [13] BiocGenerics_0.16.1        knitr_1.13                
## [15] BiocStyle_1.8.0            BiocInstaller_1.20.3      
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.5          formatR_1.4          futile.logger_1.4.1 
##  [4] plyr_1.8.3           XVector_0.10.0       futile.options_1.0.0
##  [7] tools_3.2.0          zlibbioc_1.16.0      digest_0.6.9        
## [10] jsonlite_0.9.20      evaluate_0.9         gtable_0.2.0        
## [13] yaml_2.1.13          stringr_1.0.0        Biostrings_2.38.4   
## [16] grid_3.2.0           BiocParallel_1.4.3   rmarkdown_0.9.6.9   
## [19] lambda.r_1.1.7       magrittr_1.5         Rsamtools_1.22.0    
## [22] scales_0.4.0         htmltools_0.3.5      colorspace_1.2-6    
## [25] labeling_0.3         stringi_1.0-1        munsell_0.4.3

References

Attachments (4)
