DragonFly On-Line Manual Pages

y4mscaler(1)                    y4mtools manual                   y4mscaler(1)

NAME
       y4mscaler - Scale/crop/translate a YUV4MPEG2 stream

SYNOPSIS
       y4mscaler [options] < Y4Mstream > Y4Mstream

DESCRIPTION
       y4mscaler is a general-purpose video scaler which operates on YUV4MPEG2
       streams, as produced and consumed by the MJPEGtools such as lav2yuv and
       mpeg2enc(1).

       y4mscaler is meant to be used in a pipeline.  Thus, input is from
       stdin, and output is to stdout.

       The essential function of y4mscaler is to scale a specified "active"
       region of the input stream (the source) into a specified active region
       of the output stream (the target).  Pixels outside of the active region
       of the source are ignored; pixels outside of the active region of the
       target are filled with a background color.  The source may additionally
       have a matte applied to it; pixels outside the source matte are set to
       a separately specified background color.

       y4mscaler correctly handles chroma subsampling, and thus it can also
       perform chroma subsampling conversions.  The YUV4MPEG2 stream format
       supports three varieties of 4:2:0 subsampling, as well as 4:1:1, 4:2:2,
       4:4:4, a 4:4:4 modes with an alpha channel, and a monochrome luma-only
       mode.  (See "NOTES ON CHROMA MODES AND SUBSAMPLING".)

       y4mscaler can perform simple interlacing conversions:  switching from
       top-field-first to bottom-field-first and vice-versa (by lossily
       discarding the first field), and creating a progressive stream from
       interlaced by discarding every other field (effectively halving the
       vertical resolution).

       The source and target are defined by many, many parameters, but
       y4mscaler has many, many heuristics built-in to automagically set them
       appropriately.  Most source parameters are taken from the input stream
       header.  Remaining source and target parameters which are not specified
       by the user are guessed in a sane manner.

       y4mscaler includes preset parameters for a number of common target
       streams: DVD, VideoCD (VCD), SuperVCD (SVCD), associated still image
       formats, and DV.

EXAMPLES
       To create a stream appropriate for use in an SVCD:

            y4mscaler -O preset=svcd

       To create a stream for a VideoCD (a non-interlaced format), from a DV
       source (an interlaced format), shifting the input frame 4 pixels to the
       left:

            y4mscaler -I ilace=bottom-only -I active=-4+0cc -O preset=vcd

       To take a widescreen NTSC DV source, and convert it to a letterboxed
       stream, with blue bars on the top and bottom:

            y4mscaler -O sar=ntsc -O bg=RGB:0,0,255

       To take a widescreen NTSC DV source, and convert it to a "fullscreen"
       stream (i.e. the sides are clipped, just like on TV):

            y4mscaler -O sar=ntsc -O infer=clip

       To take a centered, letterboxed NTSC source, and convert it to a
       widescreen (16:9) format stream for DVD, with the black bars removed:

            y4mscaler -O preset=dvd -O sar=ntsc_wide -O infer=clip

       To take the center 100x100 pixel chunk of an NTSC DV stream, surround
       it with a 20-pixel blue border, and blow that up to a full-screen
       SuperVCD stream:

            y4mscaler -I active=140x140+0+0cc -I matte=100x100+0+0cc -I
            bg=RGB:0,0,255 -O preset=svcd

OPTIONS
       The first three options, -v, -V, and -h, are simple straightforward
       options which take either no arguments or one numeric argument.

       -v [0,1,2]
            Set verbosity level.
             0 = warnings and errors only.
             1 = add informative messages, too (default).
             2 = add chatty debugging message, too.

       -V   Show version information and exit.

       -h   Show a help message (synopsis of options).

       The -I, -O, and -S options each take one argument of the form
       parameter=value, which specify parameters for the input, output, and
       scaling, respectively.  These options can be used repeatedly to specify
       multiple parameters.  The parameter names and values are not case-
       sensitive.  Definitions of the form "parameter=[AAA|BBB|CCC]" mean that
       only one of the listed keywords AAA, BBB, or CCC may be chosen.
       Succeeding options will override earlier ones.

       -I input_parameter
            Specify parameters for the source/input stream.  All '-I'
            arguments are evaluated in order, and later arguments on the
            command-line will override earlier ones.  All '-I' arguments are
            evaluated before any '-O' arguments.

            active=WxH+X+Yaa
               Specify the active region of the source frame, which is scaled
               to fit the active region of the target frame.  The default is
               the full frame.  (The "WxH" may be omitted, and the region size
               defaults to the size of of the source frame.)  W and H are
               width and height.  X and Y are the offset of the anchor point.
               "aa" is the anchor mode (default: TL); see "NOTES ON REGION
               GEOMETRY" for details.
               Example:  active=200x180+30+24cc

            matte=WxH+X+Y
               Specify a matte region for the source frame.  All pixels
               outside of this region are set to the source background color.
               The default matte is the full frame.  (The "WxH" may be
               omitted, and the region size defaults to the size of of the
               source frame.)  W and H are width and height.  X and Y are the
               offset of the anchor point.  "aa" is the anchor mode (default:
               TL); see "NOTES ON REGION GEOMETRY" for details.
               Example:  matte=200x180+30+24cc

            bg=RGB:r,g,b
            bg=YCBCR:y,cb,cr
            bg=RGBA:r,g,b,a
            bg=YCBCRA:y,cb,cr,a
               Set the source background color.  Pixels outside of the
               source's matte region are set to this color. One can specify
               the color as either a R'G'B' or Y'CbCr triplet.  For example,
               the default color is black, specified as "bg=YCBCR:16,128,128"
               or "bg=RGB:0,0,0".  The 'A' versions will set the alpha
               (transparency) value of the color.  The alpha range is [0,255]
               for RGBA and [16,235] for YCBCRA.  The default is fully-opaque
               (255 for RGBA, 235 for YCBCRA).

            norm=[NTSC|PAL|SECAM]
               Specify the "norm" of the source stream.  This is normally
               inferred from the stream header.

            ilace=[NONE|TOP_FIRST|BOTTOM_FIRST|TOP_ONLY|BOTTOM_ONLY]
               Specify the interlacing used by the source stream.  NONE,
               TOP_FIRST, and BOTTOM_FIRST correspond to non-interlaced, top-
               field-first, and bottom-field-first.  These values are normally
               inferred from the stream header; specifying them will override
               the stream header.
               TOP_ONLY and BOTTOM_ONLY specify that only the top or bottom
               field of each frame should be used; the other field is
               discarded.  These options can only be used with an interlaced
               input, and cause the interlaced stream to be treated as a
               progressive stream with half the height.  (This is particularly
               useful in creating a VCD from a full-size interlaced input
               stream.)  These two special options can only be used when the
               source is a pure progressive stream (as opposed to a YUV4MPEG2
               "mixed-mode" stream).

            chromass=[420JPEG|420MPEG2|420PALDV|444|422|411|mono|444alpha]
               Specify the chroma subsampling mode used in the source stream.
               This parameter is inferred from the stream header, so this
               keyword should almost never be used in a source specification.
               The only useful reason to specify this keyword is to override
               one variety of 4:2:0 with another. Any other use will cause
               processing to fail.

            sar=N:D
            sar=[NTSC|PAL|NTSC_WIDE|PAL_WIDE]
               Specify the sample-aspect-ratio of the source stream.  The
               value can either be or numeric ratio (such as "10:11") or one
               of the keywords, which correspond to the CCIR-601 values for
               4:3 or 16:9 displays, respectively.  This parameter is usually
               inferred from the stream header.

       -O output_parameter
            Specify parameters for the destination/output stream.  All '-O'
            arguments are evaluated in order, and later arguments on the
            command-line will override earlier ones.  All '-O' arguments are
            evaluated after any '-I' arguments.

            size=WxH
            size=SRC
               Set the output/target frame size, as width W and height H in
               pixels.  Use the keyword SRC to specify that the target frame
               size should match the source frame size.

            active=WxH+X+Yaa
               Specify the active region of the target frame, into which the
               active region of the source frame is scaled.  The default is
               the full target frame.  (The "WxH" may be omitted, and the
               region size defaults to the size of of the target frame.)  W
               and H are width and height.  X and Y are the offset of the
               anchor point.  "aa" is the anchor mode (default: TL); see
               "NOTES ON REGION GEOMETRY" for details.
               Example:  active=200x180+30+24cc

            bg=RGB:r,g,b
            bg=YCBCR:y,cb,cr
            bg=RGBA:r,g,b,a
            bg=YCBCRA:y,cb,cr,a
               Set the target background color.  Pixels outside of the
               target's active region are set to this color. One can specify
               the color as either a R'G'B' or Y'CbCr triplet.  For example,
               the default color is black, specified as "bg=YCBCR:16,128,128"
               or "bg=RGB:0,0,0".  The 'A' versions will set the alpha
               (transparency) value of the color.  The alpha range is [0,255]
               for RGBA and [16,235] for YCBCRA.  The default is fully-opaque
               (255 for RGBA, 235 for YCBCRA).

            ilace=[NONE|TOP_FIRST|BOTTOM_FIRST]
               Specify the interlacing used by the target stream.  NONE,
               TOP_FIRST, and BOTTOM_FIRST correspond to non-interlaced, top-
               field-first, and bottom-field-first.  The default if to match
               the source stream.
               If the source and target are both interlaced, but with
               different modes (i.e. one is bottom-first, and the other is
               top-first), then y4mscaler will convert one mode to the other
               by dropping the first source field.

            chromass=[420JPEG|420MPEG2|420PALDV|444|422|411|mono|444alpha]
               Specify the chroma subsampling mode to be used in the target
               stream.  The default is to match the source mode.  See "NOTES
               ON CHROMA MODES AND SUBSAMPLING" for more information.

            sar=N:D
            sar=[SRC|NTSC|PAL|NTSC_WIDE|PAL_WIDE]
               Specify the sample-aspect-ratio of the source stream.  The
               value can either be or numeric ratio (such as "10:11") or one
               of the keywords, which correspond to the CCIR-601 values for
               4:3 or 16:9 displays, respectively.  The keyword SRC specifies
               that the target SAR should match the source.

            scale=N/D
            Xscale=N/D
            Yscale=N/D
               Set the scaling ratios, as a fraction; for example, scale=1/2.
               "scale=" sets both X and Y factors simultaneously.  "Xscale="
               and "Yscale=" can be used to set them independently.

            infer=[PAD|CLIP|PRESERVE_X|PRESERVE_Y]
               Set the mode used to infer scaling ratios from active regions
               and SAR's.  The keywords are mutually exclusive. The default is
               PAD.

            infer=[SIMPLIFY|EXACT]
               Set whether the above heuristic uses exact ratios, or whether
               it is allowed to slightly adjust active regions to simplify the
               scaling ratios.  The keywords are mutually exclusive.  The
               default is SIMPLIFY.

            align=[TL|TC|TR|CL|CC|CR|BL|BC|BR]
               Set the alignment point between the source and target active
               regions.  The keywords specify "top-left", "top-center", "top-
               right", etc.  The specified corner or point from the source
               region will be mapped to the same spot in the target region;
               and cropping or padding which is applied to the active regions
               will preserve this mapping.  The default is CC, for "center-
               center", i.e. the source and target regions are mutually
               centered.  The keywords are mutually exclusive.  The default is
               CC.  See "NOTES ON SOURCE AND TARGET ALIGNMENT" for details.

            preset=[VCD|CVD|SVCD|DVD|DVD_WIDE|DV|DV_WIDE|
                    SVCD_STILL_HI|SVCD_STILL_LO|VCD_STILL_HI|VCD_STILL_LO|
                    ATSC_720P|ATSC_1080I|ATSC_1080P]
               Use preset target parameters for several common output formats.
               Individual parameters can be overridden by following with more
               "-O" settings.  These keywords are mutually exclusive.  For the
               details of what settings these preset keywords imply, see
               "NOTES ON TARGET PRESETS".

               VCD - 352-wide VideoCD, progressive

               CVD - 352-wide (full-height) ChinaVideoDisc

               SVCD - 480-wide SuperVCD

               DVD - 720-wide DVD

               DVD_WIDE - 720-wide DVD, anamorphic pixels

               DV - 720-wide DV (bottom-field-first, 4:1:1)

               DV_WIDE - 720-wide DV, anamorphic pixels

               SVCD_STILL_HI - high-resolution SVCD still image

               SVCD_STILL_LO - low-resolution SVCD still image

               VCD_STILL_HI - high-resolution VCD still image

               VCD_STILL_LO - low-resolution SVCD still image

               ATSC_720P - ATSC 720p (progressive HDTV)

               ATSC_1080I - ATSC 1080i (interlaced HDTV)

               ATSC_1080P - ATSC 1080p (HDTV)

       -S scaling_parameter
            Specify parameters for the scaling engine.  All '-S' arguments are
            evaluated in order, and later arguments on the command-line will
            override earlier ones.

            mode=MONO
               Request monochrome scaling.  The source is treated as
               monochrome and its chroma channels are ignored.  The chroma
               channels of the output stream will be zeroed to yield a
               grayscale output.

            mode=LINESWITCH
               Request line swapping.  Effectively, the top and bottom fields
               within each frame will be swapped.  This may help with
               malformed streams that have a messed up spatial order.  This
               option is only effective on interlaced streams.

            scaler=scaler-name
               Use a particular scaling engine.  The available engines are:
                'default' - Matto's Generic Scaler (the default)

            option=scaler-option
               Specify an option for the chosen scaling engine.  To see all
               the available options, use "option=help".

               For the default engine, the available scaler-options select the
               filter kernel:

                  box - box filter

                  linear - linear interpolation

                  quadratic - quadratic interpolation

                  cubic - cubic interpolation, Mitchell-Netravali spline

                  cubicCR - cubic interpolation, Catmull-Rom spline

                  cubicB - cubic interpolation, B-spline

                  cubicK4 - Keys 4th-order cubic

                  sinc:N - sinc with Lanczos window, N cycles

               To select kernels for the x and y scaling directions
               independently, use two kernel names separated by a comma, e.g.
               option=box,quadratic.

               sinc:N will give the best quality results (least aliasing), but
               is the slowest.  The quality improves with larger values of N,
               as does processing time.  cubic is generally regarded in the
               graphics world as the 3rd-order cubic spline with the best
               trade-off between smoothing and aliasing.  box yields the worst
               quality results (most aliasing), but is the fastest.  The
               default kernel is cubicK4, which has a flatter passband and
               sharper cutoff than cubic.  (It requires the same computational
               power as sinc:4, but produces less ringing artifacts.)

NOTES ON TARGET PRESETS
       The following table details the settings provided by the various target
       "preset=" keywords.  When two values are given the primary is for NTSC
       streams; the value in {braces} is for PAL streams.  If interlace value
       is unspecified, it is inherited from the source, otherwise the
       indicated target interlacing is required.

        Preset         Frame Size    Interlace     SAR            Subsampling
       -----------------------------------------------------------------------
         VCD           352x240{288}  none          10:11{59:54}   4:2:0-JPEG
         CVD           352x480{576}  ---           20:11{59:27}   4:2:0-MPEG2
        SVCD           480x480{576}  ---           15:11{59:36}   4:2:0-MPEG2
         DVD           720x480{576}  ---           10:11{59:54}   4:2:0-MPEG2
         DVD_WIDE      720x480{576}  ---           40:33{118:81}  4:2:0-MPEG2
         DV            720x480{576}  bottom-first  10:11{59:54}   4:1:1
         DV_WIDE       720x480{576}  bottom-first  40:33{118:81}  4:1:1
        SVCD_STILL_HI  704x480{576}  none          10:11{59:54}   4:2:0-MPEG2
        SVCD_STILL_LO  480x480{576}  none          15:11{59:36}   4:2:0-MPEG2
         VCD_STILL_HI  704x480{576}  none          10:11{59:54}   4:2:0-JPEG
         VCD_STILL_LO  352x240{288}  none          10:11{59:54}   4:2:0-JPEG
        ATSC_720p         1280x720   none          1:1            4:2:0-MPEG2
        ATSC_1080i        1920x1080  (required)    1:1            4:2:0-MPEG2
        ATSC_1080p        1920x1080  none          1:1            4:2:0-MPEG2

NOTES ON REGION GEOMETRY
       Active and matte regions are specified using a geometry string of the
       form "WxH+X+Yaa".  The "WxH" part specifies the size of the region, as
       a Width and Height in pixels.  (In some cases, the "WxH" may be
       omitted, and the region size defaults to the full frame size.)  The
       "+X+Y" specifies the position of the region, as an offset relative to
       the anchor point specified by "aa".

       The "aa" code can be one of TL, TC, TR, CL, CC, CR, BL, BC, or BR.
       These stand for "top-left", "top-center", ..., "bottom-center",
       "bottom-right".  These codes are not case-sensitive.

       The "+X+Y" specifies the offset of the region's anchor point from the
       frame's anchor point.  For example, "+20+30TL" means that the top-left
       corner of the region will be offset 20 pixels to the right and 30
       pixels down from the top-left corner of the frame.

       The offset values can also be negative.  For example, "-4+0CC" means
       that the center (vertical and horizontal) of the region is offset 4
       pixels to the left of the center of the frame.

       The default anchoring point for geometry strings is TL, i.e. the top-
       left corner.

NOTES ON SOURCE AND TARGET ALIGNMENT
       Often, the source and target active regions do not match exactly.  This
       happens when, using the given or calculated scaling ratios, the source
       region scales to a different size or shape than the target region.  In
       this case, the source and target regions are mutually clipped, so that
       only the portion of the source which fits will be scaled into the
       target.

       Before any clipping or padding, the source and target regions are
       aligned so that the points specified via the "align=aa" parameter
       coincide.  The "aa" code specifies an anchor point as described above.

       For example, "align=BC" specifies that the bottom-center of the source
       region should get mapped to the bottom-center of the target region.  In
       other words, the source region will be horizontally centered and
       vertically aligned to the bottom of the target region before clipping:

               ----------------  source
               |abcdefghijklmn|
            ---|opqrstuvwxyz01|---  target      ----------------
            |  |234567890ABCDE|  |              |234567890ABCDE|
            |  |FGHIJKLMNOPQRS|  |              |FGHIJKLMNOPQRS|
            |  |TUVWXYZabcdefg|  |              |TUVWXYZabcdefg|
            ----------------------              ----------------
                   Before                       Mutually Clipped

       If instead "align=TR" were centered, the source would be clipped in a
       different place, and scaled into a different region of the target
       frame:

            ----------------------                 ----------------
            |     |abcdefghijklmn|                 |abcdefghijklmn|
            |     |opqrstuvwxyz01|                 |opqrstuvwxyz01|
            |     |234567890ABCDE|                 |234567890ABCDE|
            ------|FGHIJKLMNOPQRS|                 ----------------
         target   |TUVWXYZabcdefg| source
                  ----------------
                   Before                       Mutually Clipped

       The default alignment mode is "CC", that is, the source and target are
       mutually centered.

NOTES ON SCALE FACTOR INFERENCE
       If the X and Y scaling factors are not explicitly provided, y4mscaler
       will infer the factors from the source and target active regions and
       sample aspect ratios (SAR's).

       If the active regions are not compatible shape-wise (given the SAR's),
       the source and target regions will be clipped or padded according to
       one of four policies.  The policy is selected using the "infer="
       parameter and one of the keywords PAD, CLIP, PRESERVE_X, or PRESERVE_Y.
       (The default is PAD.)

          PAD
             Pick scaling factors which will pad the source, but ensure that
             all of the source image content ends up in the target.

          CLIP
             Pick scaling factors which will clip the source, but which will
             fill the target region as much as possible.

          PRESERVE_X
             Pick scaling factors which preserve as much of the horizontal
             source content as possible.

          PRESERVE_Y
             Pick scaling factors which preserve as much of the vertical
             source content as possible.

       The policy is further affected by a choice of two other keywords,
       SIMPLIFY, or EXACT.  (The default is SIMPLIFY.)

          EXACT
             Calculate exact scaling factors.

          SIMPLIFY
             Adjust the active regions and scaling factors (within 10% or so),
             to simplify the ratios as much as possible.  (For example, crop
             or pad slightly to achieve a ratio of 2/1 rather than 45/22.)

NOTES ON CHROMA MODES AND SUBSAMPLING
       y4mscaler can convert streams from one chroma subsampling mode to
       another.  Such conversions are always lossy operations, even if the
       overall frame is undergoing 1/1 scaling.

       y4mscaler will infer the source's subsampling mode from tags in the
       input stream header.  The target presets ("preset=XXX") will attempt to
       set the target subsampling mode appropriately.  Otherwise, by default
       the target subsampling mode will match the source.  One can explicitly
       set the subsampling mode for the source and/or the target by using the
       "chromass=" parameter.

       y4mscaler is capable of reading and writing streams in the 4:4:4,
       4:2:2, 4:1:1, and 4:2:0 (all three varieties) subsampling modes.  The
       first three, however, are a relatively new addition to the YUV4MPEG2
       standard, and many MJPEGtools will fail to process them correctly, if
       at all.  smil2yuv and raw2yuv can produce native 4:1:1 streams from
       NTSC DV video, which can then be converted to 4:2:0 by y4mscaler before
       further processing by other tools.

       If the source has an alpha-channel (i.e. 444ALPHA mode) and the target
       does not, the alpha channel will simply be discarded.  On the other
       hand, if the target has an alpha-channel but the source does not, a
       constant alpha-channel will be created using the alpha-value of the
       target's background color (as set by "-O bg=").  The default is fully-
       opaque.

       Similarly, if the target has chroma channels but the source does not
       (i.e. a luma-only MONO stream), then the chroma channels in the output
       will be set according to the background color.

NOTES ON ANOMALOUS INTERLACE MIXTURES
       The YUV4MPEG2 format allows for "mixed-mode interlacing" streams, which
       may contain a mixture of progressive and interlaced frames.  Each frame
       is tagged as temporally interlaced or progressive, and vertically-
       subsampled frames (4:2:0 formats) are further tagged as spatially
       interlaced or not.  Unfortunately, this allows for the possibility of
       anomalous frames, which happen to be temporally interlaced (fields
       sampled at different times) but spatially progressive (subsampling
       performed across entire frame), or vice-versa.  The only reasonable
       thing to do with such anomalous frames is to vertically-upsample the
       chroma, essentially making to problem go away as quickly as possible.

       y4mscaler will only process such frames if the target output format is
       non-vertically-subsampled (e.g. 4:4:4, 4:2:2, etc.) and no other
       vertical processing is required.  Otherwise y4mscaler will bail on
       processing in midstream when it encounters an anomalous frame.  If
       there is any possibility of encountering such an error, y4mscaler will
       print a warning when processing begins.

EXIT STATUS
       0      Successful program execution.

       1      Usage, syntax, or operational error.

AUTHOR
       This manual page is copyright 2005 by Matthew Marjanovic.
       Feel free to direct any questions, remarks, problems, or bug reports
       concerning this tool to <dmg @ mir.com>.

       For more info, see our website at:
              <http://www.mir.com/DMG/> <http://www.mir.com/DMG/>

       For more information on MJPEGtools, consult:
              <http://mjpeg.sourceforge.net/> <http://mjpeg.sourceforge.net/>

SEE ALSO
       mjpegtools(1), yuv2lav(1), mpeg2enc(1), ppmtoy4m(1), raw2yuv(1),
       smil2yuv(1), yuvplay(1), yuvscaler(1)

y4mtools                       February 14, 2003                  y4mscaler(1)