Sunday, September 26, 2010

bedtools - another new version is out

It gets better with every version. Great toolbox!
Latest news (Version 2.10.0, 21-September-2010)
New annotateBed tool that annotates one BED/VCF/GFF file with the coverage and number of overlaps observed from multiple other BED/VCF/GFF files. In this way, it allows one to ask to what degree one feature coincides with multiple other feature types with a single command.
New unionBedGraphs tool that combines multiple BEDGRAPH files into a single file, so that one can compare coverage (and other text values) across multiple samples.
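To make the idea of combining BEDGRAPH files concrete, here is a minimal Python sketch of the concept. It is not how unionBedGraphs is implemented: the function names, the tuple layout, the "0" filler for samples without a value, and the choice to skip regions that no sample covers are all assumptions made for illustration.

```python
def lookup(rows, chrom, seg_start, seg_end):
    """Return the value of the row that fully contains the segment, else None."""
    for r_chrom, r_start, r_end, r_value in rows:
        if r_chrom == chrom and r_start <= seg_start and seg_end <= r_end:
            return r_value
    return None


def union_bedgraphs(samples, filler="0"):
    """Combine several samples into one table with one value column per sample.

    samples: list of lists of (chrom, start, end, value) rows, one list per file.
    """
    # Collect every interval boundary seen on each chromosome.
    breakpoints = {}
    for rows in samples:
        for chrom, start, end, _value in rows:
            breakpoints.setdefault(chrom, set()).update((start, end))

    merged = []
    for chrom in sorted(breakpoints):
        points = sorted(breakpoints[chrom])
        # Each pair of consecutive boundaries is a segment with constant values.
        for seg_start, seg_end in zip(points, points[1:]):
            values = [lookup(rows, chrom, seg_start, seg_end) for rows in samples]
            # Assumption: segments that no sample covers are left out.
            if any(v is not None for v in values):
                filled = tuple(filler if v is None else v for v in values)
                merged.append((chrom, seg_start, seg_end) + filled)
    return merged


sample_1 = [("chr1", 0, 10, "5")]
sample_2 = [("chr1", 5, 15, "2")]
for row in union_bedgraphs([sample_1, sample_2]):
    print("\t".join(str(field) for field in row))
```

Each output row is a sub-interval over which every sample has a constant value, which is what makes a side-by-side comparison possible.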
Support for writing uncompressed BAM output with the -ubam option.
New "distance feature" (-d) added to closestBed. In addition to finding the closest feature to each feature in A, the -d option will report the distance to the closest feature in B. Overlapping features have a distance of 0.
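As a rough illustration of what a "distance to the closest feature" means, here is a small Python sketch. It is my own toy version, not closestBed's code: the function names and the (chrom, start, end) tuples are hypothetical, intervals are assumed to be half-open as in BED, and directly adjacent features also get a distance of 0 here, which may differ from the convention closestBed uses.

```python
def interval_distance(a_start, a_end, b_start, b_end):
    """Gap in bases between two half-open intervals; 0 if they overlap."""
    if a_start < b_end and b_start < a_end:
        return 0  # overlapping features have a distance of 0
    if b_start >= a_end:
        return b_start - a_end  # B lies downstream of A
    return a_start - b_end  # B lies upstream of A


def closest_with_distance(a_features, b_features):
    """For each feature in A, find the nearest feature in B on the same chromosome."""
    results = []
    for chrom, start, end in a_features:
        candidates = [
            (interval_distance(start, end, b_start, b_end), (b_chrom, b_start, b_end))
            for b_chrom, b_start, b_end in b_features
            if b_chrom == chrom
        ]
        if candidates:
            distance, nearest = min(candidates)
        else:
            distance, nearest = None, None  # nothing on this chromosome in B
        results.append(((chrom, start, end), nearest, distance))
    return results


a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 10, 20)]
b = [("chr1", 150, 250), ("chr1", 700, 800)]
for feature, nearest, distance in closest_with_distance(a, b):
    print(feature, nearest, distance)
```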
New "per base depth feature" (-d) added to coverageBed. This reports the per base coverage (1-based) of each feature in file B based on the coverage of features found in file A. For example, this could report the per-base depth of sequencing reads (-a) across each capture target (-b).
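The per-base idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions (hypothetical function name, half-open BED-style coordinates, reads given as (chrom, start, end) tuples), not the way coverageBed computes it.

```python
def per_base_depth(target, reads):
    """Return (position, depth) pairs for one target, with 1-based positions
    counted from the start of the target."""
    chrom, t_start, t_end = target
    depth = [0] * (t_end - t_start)
    for r_chrom, r_start, r_end in reads:
        if r_chrom != chrom:
            continue
        # Clip the read to the target, then count every base it covers.
        lo = max(t_start, r_start)
        hi = min(t_end, r_end)
        for pos in range(lo, hi):
            depth[pos - t_start] += 1
    return [(i + 1, d) for i, d in enumerate(depth)]


target = ("chr1", 100, 105)
reads = [("chr1", 98, 102), ("chr1", 101, 104), ("chr2", 100, 105)]
for position, depth in per_base_depth(target, reads):
    print(position, depth)
```

For the toy input above, the second base of the target is the only one covered by both chr1 reads, and the last base is not covered at all.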
New groupBy tool. This is a very useful utility that mimics the "GROUP BY" clause in database systems. Given a file or stream that is sorted by the appropriate "grouping columns", groupBy will compute multiple statistics/operations on other columns in the file or stream. It works with output from all BEDTools as well as any other tab-delimited file or stream. Please see the tool's help for examples.
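For readers who think in Python rather than SQL, the grouping idea looks roughly like the sketch below. It is an analogy only: group_by is a hypothetical helper of mine, columns are 0-based tuple indices, and it relies on itertools.groupby, which, like the tool, expects the input to be sorted by the grouping columns already.

```python
from itertools import groupby


def group_by(rows, group_cols, op_col, op):
    """Apply op to the op_col values of each run of rows sharing the same
    values in group_cols. Rows must already be sorted by group_cols."""
    def key(row):
        return tuple(row[c] for c in group_cols)

    summary = []
    for group_key, group_rows in groupby(rows, key=key):
        values = [row[op_col] for row in group_rows]
        summary.append(group_key + (op(values),))
    return summary


rows = [
    ("chr1", "geneA", 5),
    ("chr1", "geneA", 7),
    ("chr1", "geneB", 5),
]
print(group_by(rows, group_cols=[0, 1], op_col=2, op=sum))
print(group_by(rows, group_cols=[0], op_col=2, op=max))
```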
New freqdesc and freqasc operations for groupBy. These compute a histogram of the values observed in a column of a file or stream, ordered by descending or ascending frequency, respectively.
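A frequency table of this kind is easy to picture with a short Python sketch. The function name, the "value:count" output format, and the alphabetical tie-breaking are my own assumptions for illustration; check the groupBy help for the format it really prints.

```python
from collections import Counter


def freq_table(values, descending=True):
    """Summarise how often each value occurs, most frequent first by default."""
    counts = Counter(values)
    sign = -1 if descending else 1
    # Sort by frequency, breaking ties by the value itself.
    ordered = sorted(counts.items(), key=lambda item: (sign * item[1], item[0]))
    return ",".join(f"{value}:{count}" for value, count in ordered)


column = ["a", "b", "a", "c", "b", "a"]
print(freq_table(column))                    # a:3,b:2,c:1
print(freq_table(column, descending=False))  # c:1,b:2,a:3
```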
Native, "mix and match" support for BED, GFF, VCF (v4.0), BAM, and BEDPE files. All input files can be "gzipped"; such files are auto-detected.
Proper support for "split" BAM alignments and "blocked" BED (aka BED12) features. By using the "-split" option, intersectBed, coverageBed, genomeCoverageBed, and bamToBed will now correctly compute overlaps/coverage solely for the "split" portions of BAM alignments or the "blocks" of BED12 features such as genes.
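To see why the blocks matter, here is a small Python sketch that compares overlap against the whole span of a BED12 feature with overlap against its blocks only. The helper names and the toy gene are made up for illustration; the only thing taken from the BED12 format itself is that block starts are given relative to the feature's start.

```python
def bed12_blocks(chrom_start, block_sizes, block_starts):
    """Turn BED12 blockSizes/blockStarts into absolute (start, end) intervals."""
    return [
        (chrom_start + rel_start, chrom_start + rel_start + size)
        for size, rel_start in zip(block_sizes, block_starts)
    ]


def overlap_bases(start, end, intervals):
    """Number of bases of [start, end) that fall inside the given intervals."""
    return sum(
        max(0, min(end, i_end) - max(start, i_start))
        for i_start, i_end in intervals
    )


# A toy two-exon "gene" spanning 1000-2000 with exons 1000-1100 and 1800-2000.
gene_start, gene_end = 1000, 2000
blocks = bed12_blocks(gene_start, block_sizes=[100, 200], block_starts=[0, 800])

# A query that sits entirely in the intron between the two exons.
query = (1200, 1700)
print(overlap_bases(*query, [(gene_start, gene_end)]))  # whole span: 500
print(overlap_bases(*query, blocks))                    # blocks only: 0
```

Without block awareness the intronic query looks like a 500-base hit on the gene; counting only the blocks shows that it touches no exon.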

Ref: Quinlan, AR and Hall, IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 6, pp. 841–842.

