scramble(1) Staden io_lib scramble(1)
NAME
scramble - Converts between the SAM, BAM and CRAM file formats.
SYNOPSIS
scramble [options] [input_file [output_file]]
DESCRIPTION
scramble converts between various next-gen sequencing alignment file
formats, including SAM, BAM and CRAM. It can act either as a pipe,
reading stdin and writing to stdout, or operate on named files.
When operating as a pipe the input type defaults to SAM or BAM, so the
-I cram option is required to indicate that the input is in CRAM
format. The output defaults to BAM, but can be changed with the
-O format option. When given filenames, the file type is chosen
automatically based on the filename suffix.
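The suffix rule above does not apply to pipes, so a wrapper script has
to pass the format explicitly. The following is a minimal sketch of one
way to do that. The helper names guess_format and to_cram_pipe are
hypothetical, and the suffix list (.sam, .bam, .cram) is an assumption;
it is not taken from scramble's own detection code.

```shell
# Hypothetical helper: map a filename suffix to a scramble format name.
# The "unknown" result is this sketch's own convention, not scramble's.
guess_format() {
    case "$1" in
        *.sam)  echo sam ;;
        *.bam)  echo bam ;;
        *.cram) echo cram ;;
        *)      echo unknown ;;
    esac
}

# Hypothetical wrapper: feed a file to scramble through stdin, naming
# the input format with -I because nothing can be inferred from a pipe.
# Usage: to_cram_pipe input_file ref.fa > out.cram
to_cram_pipe() {
    scramble -I "$(guess_format "$1")" -O cram -r "$2" < "$1"
}
```

Only the function definitions are shown; to_cram_pipe calls scramble
when it is invoked.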
OPTIONS
-I format
Selects the input format, where format is one of sam, bam or
cram. Use this when reading via a pipe to avoid input bytes
being consumed when attempting to detect if the input is in SAM
or BAM format.
-O format
Selects the output format, where format is one of sam, bam or
cram.
-1 to -9
Sets the compression level from 1 (low compression, fast) to 9
(high compression, slow) when writing in BAM or CRAM format.
This is only used during writing.
-0 or -u
Writes uncompressed data. In BAM this still uses BGZF
containers, but with no internal compression. In CRAM it stores
blocks in RAW format instead. The option has no effect on SAM
output.
-j CRAM encoding only. Add bzip2 to the list of compression codecs
potentially used during CRAM creation.
-Z CRAM encoding only. Add lzma to the list of compression codecs
potentially used during CRAM creation. Given the slow
compression speed of lzma, it is normally only used where it
gives a significant advantage over zlib or bzip2. At higher
compression levels (-7) this weighting is ignored, as lzma
decompression speed is acceptable, albeit still slower than
zlib.
-m CRAM decoding only. Generate MD:Z: and NM:I: auxiliary fields
based on the reference-based compression.
-M CRAM encoding only. Forcibly pack sequences from multiple
references into the same slice. Normally CRAM will start a new
slice when changing from one reference to another, but will
still automatically switch to multi-reference slices if the
number of sequences per slice becomes too small.
-R range
Currently for CRAM input only, but SAM/BAM support is pending.
This indicates a reference sequence name and optionally a start
and end location within that reference, using the syntax
ref_name or ref_name:start-end. For efficient operation the CRAM
file needs a .crai format index (built using the cram_index
program).
-r ref.fa
CRAM encoding only. Use this to specify the reference fasta
file. Note that if the input SAM or BAM file has a file: or
local file system based URI specified in the @SQ headers then
this option may not be necessary.
-s number
CRAM encoding only. Specifies the number of sequences per
slice. Defaults to 10000.
-S number
CRAM encoding only. Specifies the number of slices per
container. Defaults to 1.
-t BAM and CRAM only. Specifies the number of compression or
decompression threads, adaptively shared between both encoding
and decoding. Defaults to 1 (no threading).
-V version_string
CRAM encoding only. Sets the CRAM file format version.
Supported values are "2.0", "2.1" and "3.0".
-e CRAM encoding only. Embed snippets of the reference sequence in
every slice. This means the files can be decoded without
needing to specify the reference fasta file.
-x CRAM encoding only. Omit reference based compression and
instead store details of every base verbatim.
-B Experimental, encoding only. When storing quality values, bin
into 8 discrete values (plus 0), as typically used by modern
Illumina instruments. (Note that the bins may not be precisely
the same ranges.)
-! CRAM v3.0 and above decoding only. Do not check CRCs. This
option should only be used when attempting to recover from
data corruption.
EXAMPLES
To convert a BAM file from stdin to CRAM on stdout, using reference
MT.fa:
some_command | scramble -I bam -O cram -r MT.fa | some_command
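The same kind of conversion can be run on named files with stronger
compression. The sketch below combines -9, -j, -Z, -t and -r as
described under OPTIONS. It assumes that -t takes the thread count as
its argument. The function name bam_to_cram_archive and the DRY_RUN
variable are hypothetical conveniences of this example, not part of
scramble.

```shell
# Hypothetical wrapper: convert BAM to CRAM at the highest compression
# level, allowing bzip2 (-j) and lzma (-Z) as extra codecs.
# Usage: bam_to_cram_archive in.bam ref.fa out.cram [threads]
bam_to_cram_archive() {
    src=$1 ref=$2 dst=$3 threads=${4:-1}
    # Assumption: -t is followed by the number of threads.
    set -- scramble -9 -j -Z -t "$threads" -r "$ref" "$src" "$dst"
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command line instead of running it.
        echo "$*"
    else
        "$@"
    fi
}
```

Setting DRY_RUN=1 prints the command that would be run, which is a
cheap way to check the option order before starting a long conversion.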
The default CRAM output format is version 3.0, so no version needs to
be specified when converting from 2.1 to 3.0. To perform the reverse
use:
scramble -V 2.1 in.cram out.cram
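To extract a region from a CRAM file with -R, the file is first indexed
with cram_index. The following sketch builds the range argument in the
ref_name or ref_name:start-end syntax and then runs both programs. The
function names region_arg and extract_region are hypothetical, and the
invocation "cram_index file" is assumed rather than taken from this
page.

```shell
# Build the -R argument: "ref_name" alone, or "ref_name:start-end"
# when a start and end position are also given.
region_arg() {
    if [ $# -ge 3 ]; then
        printf '%s:%s-%s\n' "$1" "$2" "$3"
    else
        printf '%s\n' "$1"
    fi
}

# Hypothetical wrapper: index the CRAM file, then write one region to
# the output file (its format follows the output filename suffix).
# Usage: extract_region in.cram out.bam ref_name [start end]
extract_region() {
    cram=$1 out=$2
    shift 2
    # Assumption: cram_index takes the CRAM filename as its argument.
    cram_index "$cram" &&
    scramble -R "$(region_arg "$@")" "$cram" "$out"
}
```

For example, extract_region in.cram out.bam chr1 1000 2000 would pass
-R chr1:1000-2000 to scramble, where chr1 is a placeholder reference
name.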
AUTHOR
James Bonfield, Wellcome Trust Sanger Institute
March 19 2013 scramble(1)