Selected publication:
Wittelsbürger, U; Pfeifer, B; Lercher, MJ.
WhopGenome: high-speed access to whole-genome variation and sequence data in R.
Bioinformatics. 2015; 31(3): 413-415.
DOI: 10.1093/bioinformatics/btu636
[OPEN ACCESS]
- Co-authors from Med Uni Graz:
  - Pfeifer Bastian
- Abstract:
The statistical programming language R has become a de facto standard for the analysis of many types of biological data, and is well suited for the rapid development of new algorithms. However, variant call data from population-scale resequencing projects are typically too large to be read and processed efficiently with R's built-in I/O capabilities. WhopGenome can efficiently read whole-genome variation data stored in the widely used variant call format (VCF) file format into several R data types. VCF files can be accessed either on local hard drives or on remote servers. WhopGenome can associate variants with annotations such as those available from the UCSC genome browser, and can accelerate the reading process by filtering loci according to user-defined criteria. WhopGenome can also read other Tabix-indexed files and create indices to allow fast selective access to FASTA-formatted sequence files.
The WhopGenome R package is available on CRAN at http://cran.r-project.org/web/packages/WhopGenome/. A Bioconductor package has been submitted.
Contact: lercher@cs.uni-duesseldorf.de
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
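The abstract mentions speeding up reading by filtering loci according to user-defined criteria. The general idea can be illustrated with a minimal Python sketch that streams VCF text line by line and keeps only records passing a caller-supplied predicate. This is not WhopGenome's API or implementation: the names `Variant`, `read_vcf`, `EXAMPLE_VCF` and `demo` are hypothetical, the records are made up for illustration, and only the first six fixed VCF columns are parsed.

```python
import io
from typing import Callable, Iterator, NamedTuple


class Variant(NamedTuple):
    """Hypothetical container for a few fixed VCF columns."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float


def read_vcf(handle, keep: Callable[[Variant], bool]) -> Iterator[Variant]:
    """Stream variants from a VCF text handle, yielding those that pass `keep`."""
    for line in handle:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:  # skip blank or malformed lines
            continue
        chrom, pos, _id, ref, alt, qual = fields[:6]
        # A missing QUAL is written as "."; represent it as NaN
        quality = float(qual) if qual != "." else float("nan")
        variant = Variant(chrom, int(pos), ref, alt, quality)
        if keep(variant):
            yield variant


# Made-up records, for illustration only
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "1\t250\t.\tC\tT\t10\tPASS\t.\n"
    "2\t300\t.\tG\tA\t99\tPASS\t.\n"
)


def demo():
    """Keep only chromosome 1 variants with QUAL of at least 30."""
    def keep(v: Variant) -> bool:
        return v.chrom == "1" and v.qual >= 30
    return list(read_vcf(io.StringIO(EXAMPLE_VCF), keep))
```

Because the predicate is applied while streaming, rejected records are never accumulated in memory. That is the main reason filtering during reading helps with population-scale files.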
- MeSH terms (NLM indexing):
  - Algorithms
  - Genetic Variation
  - Genome, Human
  - Genomics / methods
  - Humans
  - Molecular Sequence Annotation
  - Software