Color Spaces and Pixel Formats 3: Pixel Formats in FFmpeg

This is an original article by the author. Please credit the source when reposting: http://www.rzrgm.cn/leisure_chn/p/18610070.html

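The articles in the "Color Spaces and Pixel Formats" series are:
[1]. Color Spaces and Pixel Formats 1: Color Space Basics
[2]. Color Spaces and Pixel Formats 2: RGB/YUV Color Spaces
[3]. Color Spaces and Pixel Formats 3: Pixel Formats in FFmpeg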

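3. Pixel Formats in FFmpeg

FFmpeg uses the term "pixel format". The definition of each pixel format covers the color space, the sampling scheme, the storage layout, the bit depth, and related information.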

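3.1 Basic concepts

The basic concepts related to pixel formats are as follows:

pixel_format: the pixel format, i.e. how the pixels of an image are arranged in memory. A pixel format covers the color space, sampling scheme, storage layout, and bit depth; the storage layout is the most important of these. See the discussion of storage layouts in Section 2 (part 2 of this series).

bit_depth: the bit depth, i.e. the number of bits occupied by a single sample of each component (Y, U, V, R, G, B, and so on).

For example, in yuv420p (bit depth 8) every Y sample, U sample, and V sample is 8 bits wide; the U and V components simply have half as many samples as Y in both the horizontal and the vertical direction. bpp (bits per pixel), by contrast, spreads the total number of bits in the image across all pixels to give the average number of bits per pixel. yuv420p has a bpp of 12, meaning that each pixel takes 12 bits on average (8 for Y, 2 for U, 2 for V), even though every individual U and V sample is 8 bits wide, not 2.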
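The bpp arithmetic above can be checked with a few lines of C. This is only a sketch: it assumes a three-component planar YUV format in which all components share the same bit depth, and the function name planar_yuv_bpp is made up for this illustration (it is not part of FFmpeg; for real descriptors libavutil provides av_get_bits_per_pixel()).

```c
/* Average bits per pixel of a three-component planar YUV format.
 * depth:         bit depth shared by Y, U and V
 * log2_chroma_w: log2 of the horizontal chroma subsampling factor
 * log2_chroma_h: log2 of the vertical chroma subsampling factor
 * Hypothetical helper for illustration only. */
int planar_yuv_bpp(int depth, int log2_chroma_w, int log2_chroma_h)
{
    /* One U sample and one V sample are shared by 2^(w+h) pixels. */
    int pixels_per_chroma = 1 << (log2_chroma_w + log2_chroma_h);
    int luma_bits   = depth;                          /* Y: one sample per pixel */
    int chroma_bits = 2 * depth / pixels_per_chroma;  /* U and V together */
    return luma_bits + chroma_bits;
}
```

For yuv420p this is called as planar_yuv_bpp(8, 1, 1): 8 bits of luma plus 16 / 4 bits of chroma.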

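plane: a region of memory that stores one or more components of the image. In a planar storage layout at least one component has a plane to itself: yuv420p has three planes (Y, U, V), nv12 has two (Y, UV), and gbrap has four (G, B, R, A). In a packed storage layout the samples of all components are interleaved, so there is only one plane.

slice: a slice is an internal concept in FFmpeg that comes up frequently in codecs and filters. It usually means a run of consecutive rows of the image, so that a frame is split into several pieces. Note that slicing applies to the image, not to a plane: a frame has multiple planes, and a slice likewise spans multiple planes.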

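stride/pitch: the width, in bytes, of one row of data in a plane. It is subject to alignment requirements. The formula is:

stride = image width / horizontal subsampling factor * number of components * sample storage width / 8

Here the image width is measured in pixels: a 1280x720 image is 1280 pixels wide and 720 pixels high. The horizontal subsampling factor is the number of pixels, in the horizontal direction, per chroma sample; luma is not subsampled, so its factor is always 1. The number of components is the number of components stored in the current plane; the single plane of rgb24, for example, holds three components (R, G, B). The sample storage width is the number of bits one sample of a component actually occupies in memory after alignment: a bit depth of 8 occupies 8 bits, while a bit depth of 10 actually occupies 16 bits. The alignment is platform dependent.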

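Note that stride describes one row of a plane. In yuv420p the Y component is fully sampled, so a row of Y holds as many samples as the image is wide. The horizontal subsampling factor of U and V is 2 (one U sample and one V sample for every two pixels horizontally), so a row of U and a row of V each hold half as many samples as the image is wide. The vertical subsampling factor of U and V is also 2, so the U and V planes have only half as many rows as the image is high. Vertical subsampling does not affect the stride of a plane, however, because by definition stride depends only on the horizontal sampling rate.

If the source image is yuv420p, it has three planes (Y, U, V) and a bit depth of 8 (every Y, U, and V sample is 8 bits wide). At a resolution of 1280x720, one row of the Y plane holds 1280 Y samples in 1280 bytes, so its stride is 1280; one row of the U plane holds 640 U samples in 640 bytes, so its stride is 640; one row of the V plane holds 640 V samples in 640 bytes, so its stride is 640.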
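To make the role of vertical subsampling concrete, the sketch below adds up the bytes of all three planes of an 8-bit planar YUV frame: the stride of each plane depends only on the horizontal factor, while the number of rows depends on the vertical one. The helper name is hypothetical, and the code assumes tightly packed rows with no alignment padding, which real FFmpeg buffers do not guarantee.

```c
#include <stddef.h>

/* Bytes needed for an 8-bit planar YUV frame with one component per plane
 * and no padding at the end of each row.
 * Hypothetical helper for illustration only. */
size_t yuv_planar8_frame_bytes(int width, int height,
                               int log2_chroma_w, int log2_chroma_h)
{
    /* Chroma plane dimensions, rounded up for odd image sizes. */
    int cw = (width  + (1 << log2_chroma_w) - 1) >> log2_chroma_w;
    int ch = (height + (1 << log2_chroma_h) - 1) >> log2_chroma_h;

    size_t y_bytes = (size_t)width * (size_t)height;  /* Y stride * Y rows */
    size_t c_bytes = (size_t)cw * (size_t)ch;         /* U stride * U rows (V is the same) */
    return y_bytes + 2 * c_bytes;
}
```

For a 1280x720 yuv420p frame this gives 1280 * 720 + 2 * (640 * 360) bytes, i.e. 1.5 bytes per pixel, matching the bpp of 12.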

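If the source image is yuv420p10, it has three planes (Y, U, V), a bit depth of 10, and a sample storage width of 16 (after alignment each sample occupies 16 bits). At the same 1280x720 resolution, the stride of the Y plane is 1280 x 16 / 8 = 2560, the stride of the U plane is 1280 / 2 x 16 / 8 = 1280, and the stride of the V plane is 1280 / 2 x 16 / 8 = 1280.

If the source image is yuv420p16le, it has three planes (Y, U, V), a bit depth of 16, and a sample storage width of 16 as well. At the same 1280x720 resolution, the stride of the Y plane is 1280 x 16 / 8 = 2560, the stride of the U plane is 1280 / 2 x 16 / 8 = 1280, and the stride of the V plane is 1280 / 2 x 16 / 8 = 1280.

If the source image is p010le, it has two planes (Y, UV), a bit depth of 10, and a sample storage width of 16 (after alignment each sample occupies 16 bits). At the same 1280x720 resolution, the stride of the Y plane is 1280 x 16 / 8 = 2560, and the stride of the UV plane is 1280 / 2 x 2 x 16 / 8 = 2560.

If the source image is bgr24, it has a single BGR plane and a bit depth of 8, again at 1280x720. bgr24 is a packed storage layout: the R, G, and B samples of each pixel are interleaved, and memory is laid out as BGRBGR..., so the format can be regarded as having only one plane. One row of that plane holds 1280 R samples, 1280 G samples, and 1280 B samples, and its stride is 1280 x 3 x 8 / 8 = 3840.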
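The stride formula and the worked examples above translate directly into a small C function. This is a minimal sketch: plane_stride and its parameter names are invented for this illustration, and the result is the minimum stride without any extra row alignment (in FFmpeg itself, av_image_get_linesize() in libavutil/imgutils.h answers the same question for a real pixel format).

```c
/* Minimum stride in bytes of one plane, following the formula above.
 * width:        image width in pixels
 * hsub:         horizontal subsampling factor of the plane (1 for luma)
 * ncomp:        number of components interleaved in the plane
 * storage_bits: bits one sample occupies in memory (8 or 16 in the examples)
 * Hypothetical helper; it ignores any row alignment padding. */
int plane_stride(int width, int hsub, int ncomp, int storage_bits)
{
    int samples_per_row = (width + hsub - 1) / hsub;  /* round up for odd widths */
    return samples_per_row * ncomp * storage_bits / 8;
}
```

The UV plane of p010le at 1280x720, for instance, is plane_stride(1280, 2, 2, 16), and the single plane of bgr24 is plane_stride(1280, 1, 3, 8).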

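3.2 Data structures

3.2.1 AVPixelFormat

AVPixelFormat defines the pixel format IDs. It is used in FFmpeg code to identify a particular pixel format. A few representative pixel formats are listed below; the others are omitted: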

enum AVPixelFormat {
    AV_PIX_FMT_NONE = -1,
    AV_PIX_FMT_YUV420P,   ///< planar YUV 4:2:0, 12bpp, (1 Cr & Cb sample per 2x2 Y samples)
    AV_PIX_FMT_YUYV422,   ///< packed YUV 4:2:2, 16bpp, Y0 Cb Y1 Cr
    AV_PIX_FMT_RGB24,     ///< packed RGB 8:8:8, 24bpp, RGBRGB...
    AV_PIX_FMT_BGR24,     ///< packed RGB 8:8:8, 24bpp, BGRBGR...
    AV_PIX_FMT_YUV422P,   ///< planar YUV 4:2:2, 16bpp, (1 Cr & Cb sample per 2x1 Y samples)
    AV_PIX_FMT_YUV444P,   ///< planar YUV 4:4:4, 24bpp, (1 Cr & Cb sample per 1x1 Y samples)

    ......

    AV_PIX_FMT_NV12,      ///< planar YUV 4:2:0, 12bpp, 1 plane for Y and 1 plane for the UV components, which are interleaved (first byte U and the following byte V)
    AV_PIX_FMT_NV21,      ///< as above, but U and V bytes are swapped

    AV_PIX_FMT_ARGB,      ///< packed ARGB 8:8:8:8, 32bpp, ARGBARGB...
    AV_PIX_FMT_RGBA,      ///< packed RGBA 8:8:8:8, 32bpp, RGBARGBA...
    AV_PIX_FMT_ABGR,      ///< packed ABGR 8:8:8:8, 32bpp, ABGRABGR...
    AV_PIX_FMT_BGRA,      ///< packed BGRA 8:8:8:8, 32bpp, BGRABGRA...

    ......

    AV_PIX_FMT_YUV420P10BE,///< planar YUV 4:2:0, 15bpp, (1 Cr & Cb sample per 2x2 Y samples), big-endian
    AV_PIX_FMT_YUV420P10LE,///< planar YUV 4:2:0, 15bpp, (1 Cr & Cb sample per 2x2 Y samples), little-endian

    ......

    AV_PIX_FMT_NV16,         ///< interleaved chroma YUV 4:2:2, 16bpp, (1 Cr & Cb sample per 2x1 Y samples)
    AV_PIX_FMT_NV20LE,       ///< interleaved chroma YUV 4:2:2, 20bpp, (1 Cr & Cb sample per 2x1 Y samples), little-endian
    AV_PIX_FMT_NV20BE,       ///< interleaved chroma YUV 4:2:2, 20bpp, (1 Cr & Cb sample per 2x1 Y samples), big-endian

    ......

    /**
     *  HW acceleration through QSV, data[3] contains a pointer to the
     *  mfxFrameSurface1 structure.
     */
    AV_PIX_FMT_QSV,

    ......

    AV_PIX_FMT_P010LE, ///< like NV12, with 10bpp per component, data in the high bits, zeros in the low bits, little-endian
    AV_PIX_FMT_P010BE, ///< like NV12, with 10bpp per component, data in the high bits, zeros in the low bits, big-endian

    ......

    AV_PIX_FMT_NB         ///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions
};

3.2.2 AVPixFmtDescriptor

AVPixFmtDescriptor describes how image data is organized in memory. This structure defines the implementation details of a pixel format.

/**
 * Descriptor that unambiguously describes how the bits of a pixel are
 * stored in the up to 4 data planes of an image. It also stores the
 * subsampling factors and number of components.
 *
 * @note This is separate of the colorspace (RGB, YCbCr, YPbPr, JPEG-style YUV
 *       and all the YUV variants) AVPixFmtDescriptor just stores how values
 *       are stored not what these values represent.
 */
typedef struct AVPixFmtDescriptor {
    const char *name;
    uint8_t nb_components;  ///< The number of components each pixel has, (1-4)

    /**
     * Amount to shift the luma width right to find the chroma width.
     * For YV12 this is 1 for example.
     * chroma_width = AV_CEIL_RSHIFT(luma_width, log2_chroma_w)
     * The note above is needed to ensure rounding up.
     * This value only refers to the chroma components.
     */
    uint8_t log2_chroma_w;

    /**
     * Amount to shift the luma height right to find the chroma height.
     * For YV12 this is 1 for example.
     * chroma_height= AV_CEIL_RSHIFT(luma_height, log2_chroma_h)
     * The note above is needed to ensure rounding up.
     * This value only refers to the chroma components.
     */
    uint8_t log2_chroma_h;

    /**
     * Combination of AV_PIX_FMT_FLAG_... flags.
     */
    uint64_t flags;

    /**
     * Parameters that describe how pixels are packed.
     * If the format has 1 or 2 components, then luma is 0.
     * If the format has 3 or 4 components:
     *   if the RGB flag is set then 0 is red, 1 is green and 2 is blue;
     *   otherwise 0 is luma, 1 is chroma-U and 2 is chroma-V.
     *
     * If present, the Alpha channel is always the last component.
     */
    AVComponentDescriptor comp[4];

    /**
     * Alternative comma-separated names.
     */
    const char *alias;
} AVPixFmtDescriptor;

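const char *name

The name of the pixel format. For example, the name of AV_PIX_FMT_YUV420P is yuv420p: AV_PIX_FMT_YUV420P is used in FFmpeg code, whereas the name yuv420p is used on the FFmpeg command line.

uint8_t nb_components

The number of components in the image, in the range 1 to 4. For example, AV_PIX_FMT_GRAY8 has a single component (Y), AV_PIX_FMT_YUV420P has three (Y, U, V), AV_PIX_FMT_NV12 also has three (Y, U, V), and AV_PIX_FMT_ARGB has four (A, R, G, B).

uint8_t log2_chroma_w

The width shift: the number of bits by which the luma width must be shifted right to obtain the chroma width. This value is the base-2 logarithm of the horizontal chroma subsampling factor. For example, for a yuv420p image at 1280 x 720, the luma width (number of luma samples per row) is 1280, the chroma width (number of chroma samples per row) is 1280/2 = 640, and log2_chroma_w is 1 (shift right by 1 bit).

uint8_t log2_chroma_h

The height shift: the number of bits by which the luma height must be shifted right to obtain the chroma height. This value is the base-2 logarithm of the vertical chroma subsampling factor. For example, for a yuv420p image at 1280 x 720, the luma height (number of luma samples per column) is 720, the chroma height (number of chroma samples per column) is 720/2 = 360, and log2_chroma_h is 1 (shift right by 1 bit).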
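The structure comment requires a round-up shift (AV_CEIL_RSHIFT) so that odd luma dimensions still get enough chroma samples. The sketch below shows that rounding with plain integers; ceil_rshift and chroma_size are hypothetical stand-ins written for this article, not FFmpeg functions.

```c
/* Right shift that rounds up instead of down, the rounding that the
 * AV_CEIL_RSHIFT note in AVPixFmtDescriptor asks for.
 * Hypothetical stand-in for the macro; a must be non-negative. */
int ceil_rshift(int a, int b)
{
    return (a + (1 << b) - 1) >> b;
}

/* Chroma plane width and height derived from the luma dimensions. */
void chroma_size(int luma_w, int luma_h, int log2_chroma_w, int log2_chroma_h,
                 int *chroma_w, int *chroma_h)
{
    *chroma_w = ceil_rshift(luma_w, log2_chroma_w);
    *chroma_h = ceil_rshift(luma_h, log2_chroma_h);
}
```

With log2_chroma_w = log2_chroma_h = 1, a 1280x720 image yields a 640x360 chroma plane, while a 1281x721 image yields 641x361 rather than 640x360.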

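uint64_t flags

A combination of pixel format flags, of the form AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_HWACCEL. For example, the flag AV_PIX_FMT_FLAG_BE marks a big-endian format, and AV_PIX_FMT_FLAG_HWACCEL marks a pixel format used for hardware acceleration such as hardware decoding or encoding.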
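Flags combined with | are tested with a bitwise AND. The sketch below illustrates the idiom only: the FLAG_* names and their numeric values are assumptions made for this example, so real code should use the AV_PIX_FMT_FLAG_* constants from libavutil/pixdesc.h rather than these definitions.

```c
#include <stdint.h>

/* Illustrative flag bits. The names echo FFmpeg's AV_PIX_FMT_FLAG_*
 * constants, but the numeric values are assumptions for this sketch. */
#define FLAG_BE      (UINT64_C(1) << 0)
#define FLAG_HWACCEL (UINT64_C(1) << 3)
#define FLAG_PLANAR  (UINT64_C(1) << 4)
#define FLAG_RGB     (UINT64_C(1) << 5)

/* Returns 1 if every bit of mask is set in flags, 0 otherwise. */
int has_flags(uint64_t flags, uint64_t mask)
{
    return (flags & mask) == mask;
}
```

A descriptor whose flags are FLAG_BE | FLAG_HWACCEL passes has_flags(flags, FLAG_BE) but fails has_flags(flags, FLAG_PLANAR).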

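AVComponentDescriptor comp[4]

This member is very important. Each element of the array describes one component. Note that this is a component, not a plane: one plane may contain several components. The AVComponentDescriptor structure defines how the sample data of each component is laid out in memory; see Section 3.2.3 for details.

const char *alias

A comma-separated list of alternative names. In the definition of the av_pix_fmt_descriptors[] array, the AV_PIX_FMT_GRAY8 pixel format has the name "gray" and the alias value "gray8,y8".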
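Looking a name up in such a comma-separated alias string takes only a short loop. The function below is a hypothetical helper written for this article to show the idea; it is not how FFmpeg itself is claimed to implement the lookup.

```c
#include <string.h>

/* Returns 1 if name is exactly one of the entries of a comma-separated
 * list such as "gray8,y8", and 0 otherwise.
 * Hypothetical helper for illustration only. */
int name_in_list(const char *name, const char *list)
{
    size_t len = strlen(name);
    while (*list) {
        const char *end = strchr(list, ',');
        size_t n = end ? (size_t)(end - list) : strlen(list);
        if (n == len && memcmp(list, name, n) == 0)
            return 1;
        list += n;          /* skip this entry */
        if (*list == ',')
            list++;         /* skip the separator */
    }
    return 0;
}
```

Entries are compared whole, so "y8" matches the list "gray8,y8" while "gray" and "y" do not.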

3.2.3 AVComponentDescriptor

AVComponentDescriptor describes how each component is actually organized in memory, down to every detail.

typedef struct AVComponentDescriptor {
    /**
     * Which of the 4 planes contains the component.
     */
    int plane;

    /**
     * Number of elements between 2 horizontally consecutive pixels.
     * Elements are bits for bitstream formats, bytes otherwise.
     */
    int step;

    /**
     * Number of elements before the component of the first pixel.
     * Elements are bits for bitstream formats, bytes otherwise.
     */
    int offset;

    /**
     * Number of least significant bits that must be shifted away
     * to get the value.
     */
    int shift;

    /**
     * Number of bits in the component.
     */
    int depth;

#if FF_API_PLUS1_MINUS1
    /** deprecated, use step instead */
    attribute_deprecated int step_minus1;

    /** deprecated, use depth instead */
    attribute_deprecated int depth_minus1;

    /** deprecated, use offset instead */
    attribute_deprecated int offset_plus1;
#endif
} AVComponentDescriptor;

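int plane

The plane in which the current component is stored.

For example, p010 has three components (Y, U, V) and two planes (Y, UV). The Y plane has the form YYYY... and the UV plane has the form UVUVUV.... The plane value of the Y component is 0, and the plane value of both the U and the V component is 1: U and V samples are interleaved in plane 1.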
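Because several components can share a plane, the number of planes is the number of distinct plane values among the components, not nb_components. The sketch below counts them using a cut-down stand-in for AVComponentDescriptor; the struct and function names are hypothetical (libavutil provides av_pix_fmt_count_planes() for real formats).

```c
/* Minimal stand-in for AVComponentDescriptor: only the field used here. */
struct comp_plane { int plane; };

/* Number of planes = number of distinct plane indices used by the
 * components. Plane indices are assumed to lie in 0..3. */
int count_planes(const struct comp_plane *comp, int nb_components)
{
    int used[4] = { 0, 0, 0, 0 };
    int i, n = 0;
    for (i = 0; i < nb_components; i++)
        used[comp[i].plane] = 1;   /* mark the plane this component lives in */
    for (i = 0; i < 4; i++)
        n += used[i];
    return n;
}
```

With plane values {0, 1, 1} (p010) the result is 2, with {0, 1, 2} (yuv420p) it is 3, and with {0, 0, 0, 0} (argb) it is 1.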

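int step

The step: the distance in memory between two horizontally adjacent samples of the current component. If the pixel format is a bitstream format (the AV_PIX_FMT_FLAG_BITSTREAM flag is set) the value is in bits; otherwise it is in bytes.

Take p010 as an example. The Y plane has the form YYYY..., the UV plane has the form UVUVUV..., and the bit depth is 10. After alignment every Y, every U, and every V sample occupies 2 bytes, so the step of the Y component is 2 (two Y samples are 2 bytes apart), the step of the U component is 4 (two U samples are 4 bytes apart), and the step of the V component is also 4 (two V samples are 4 bytes apart).

int offset

The offset: the amount of data in the current plane that precedes the first sample of the current component. If the pixel format is a bitstream format (the AV_PIX_FMT_FLAG_BITSTREAM flag is set) the value is in bits; otherwise it is in bytes.

Take p010 as an example. Every U or V sample occupies 2 bytes, and the 2 bytes before the first V sample are taken by a U sample, so the offset of the U component is 0 and the offset of the V component is 2. This field therefore determines the order of U and V within the plane.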
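Taken together, step and offset give the byte position of any sample within a row of its plane. The one-line helper below is a sketch for non-bitstream formats only; the name sample_byte_pos is invented for this illustration.

```c
#include <stddef.h>

/* Byte position, within one row of a plane, of sample number x of a
 * component. Valid for non-bitstream formats, where step and offset
 * are expressed in bytes. Hypothetical helper for illustration only. */
size_t sample_byte_pos(size_t x, int step, int offset)
{
    return x * (size_t)step + (size_t)offset;
}
```

For p010, U sample number 3 starts at byte 3 * 4 + 0 = 12 of the UV row and V sample number 3 at byte 3 * 4 + 2 = 14, while Y sample number 3 starts at byte 3 * 2 + 0 = 6 of the Y row.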

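int shift

The right shift: the number of bits by which the value read from memory must be shifted right to obtain the actual sample value.

Take p010 as an example. The bit depth is 10, but after alignment every Y, U, and V sample occupies 16 bits. Whether the 10 bits of data sit in the high 10 bits or the low 10 bits of the 16-bit memory word is determined by shift. In p010 every component has a shift of 6, which means the data sits in the high 10 bits. Reading the first Y sample from the Y plane of a p010le image looks roughly like this:

uint8_t y_plane[1280*2];                                    // one row of Y, 2 bytes per sample
uint16_t w0 = (uint16_t)(y_plane[0] | (y_plane[1] << 8));  // assemble the little-endian 16-bit word
uint16_t y0 = w0 >> 6;                                      // discard the 6 unused low bits

In yuv420p10le every component has a shift of 0, which means the data sits in the low 10 bits. Reading the first Y sample from the Y plane looks roughly like this:

uint8_t y_plane[1280*2];                                    // one row of Y, 2 bytes per sample
uint16_t w0 = (uint16_t)(y_plane[0] | (y_plane[1] << 8));  // assemble the little-endian 16-bit word
uint16_t y0 = w0 >> 0;                                      // the value is already in the low 10 bits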

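int depth

The width in bits of each sample of the current component, i.e. the bit depth.

To summarize these parameters: plane is the index of the plane that holds the component; offset gives the order of the components when several of them are interleaved in the same plane (in the UV plane of p010, U comes first and V second); step, shift, and depth relate to memory alignment. In p010, for example, depth is 10 (bits), step is 2 (bytes), and shift is 6 (bits), which means every 10-bit sample occupies a 16-bit (2-byte) memory word whose low 6 bits are unused (the data is aligned to the high bits, i.e. left-aligned).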
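The five fields are enough to read any sample generically. The sketch below does so for little-endian, non-bitstream formats with samples stored in one or two bytes. The struct comp_desc is a cut-down stand-in for AVComponentDescriptor and read_sample_le is a hypothetical name; neither is FFmpeg API.

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal stand-in for the AVComponentDescriptor fields discussed above. */
struct comp_desc { int plane, step, offset, shift, depth; };

/* Read sample number x of one component from a row of its plane.
 * Assumes a little-endian, non-bitstream format whose samples occupy
 * 1 byte (depth <= 8) or 2 bytes (depth 9..16). */
unsigned read_sample_le(const uint8_t *row, const struct comp_desc *c, size_t x)
{
    const uint8_t *p = row + x * (size_t)c->step + (size_t)c->offset;
    unsigned word = p[0];
    if (c->depth > 8)
        word |= (unsigned)p[1] << 8;   /* second byte of the 16-bit word */
    /* Drop the unused low bits, then keep exactly depth bits. */
    return (word >> c->shift) & ((1u << c->depth) - 1u);
}
```

For the U component of p010le the descriptor would be { 1, 4, 0, 6, 10 } and for V { 1, 4, 2, 6, 10 }; the caller is responsible for passing a row of the plane named by the plane field.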

Section 3.3 explains in detail what each of these parameters means for several commonly used pixel formats.

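3.2.4 av_pix_fmt_descriptors[]

av_pix_fmt_descriptors is the array in which FFmpeg defines the actual layout of every pixel format. It is very important: when the implementation details of a particular pixel format are unclear, its definition in this array settles the question. Several common, representative pixel formats are analyzed in the next section.

The definition of the yuv420p pixel format is shown below. Each entry of .comp lists plane, step, offset, shift, and depth, followed by the three deprecated members step_minus1, depth_minus1, and offset_plus1: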

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    [AV_PIX_FMT_YUV420P] = {
        .name = "yuv420p",
        .nb_components = 3,
        .log2_chroma_w = 1,
        .log2_chroma_h = 1,
        .comp = {
            { 0, 1, 0, 0, 8, 0, 7, 1 },        /* Y */
            { 1, 1, 0, 0, 8, 0, 7, 1 },        /* U */
            { 2, 1, 0, 0, 8, 0, 7, 1 },        /* V */
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    ......
}

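3.3 Analysis of typical pixel formats

All pixel formats in FFmpeg are defined in the av_pix_fmt_descriptors[] array. Only the most commonly used ones are analyzed below:

3.3.1 Pixel format yuv420p

yuv420p has three components, Y, U, and V, stored in plane 0, plane 1, and plane 2 respectively.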

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    [AV_PIX_FMT_YUV420P] = {
        .name = "yuv420p",
        .nb_components = 3,    // three components in total: Y, U, V
        .log2_chroma_w = 1,    // horizontal subsampling factor is 2, pow(2, 1)
        .log2_chroma_h = 1,    // vertical subsampling factor is 2, pow(2, 1)
        .comp = {
            { 0, 1, 0, 0, 8, 0, 7, 1 },        /* Y */
            { 1, 1, 0, 0, 8, 0, 7, 1 },        /* U */
            { 2, 1, 0, 0, 8, 0, 7, 1 },        /* V */
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    ......
}

3.3.2 Pixel format yuv422p

yuv422p has three components, Y, U, and V, stored in plane 0, plane 1, and plane 2 respectively.

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    ......
    [AV_PIX_FMT_YUV422P] = {
        .name = "yuv422p",
        .nb_components = 3,    // three components in total: Y, U, V
        .log2_chroma_w = 1,    // horizontal subsampling factor is 2, pow(2, 1)
        .log2_chroma_h = 0,    // vertical subsampling factor is 1, pow(2, 0)
        .comp = {
            { 0, 1, 0, 0, 8, 0, 7, 1 },        /* Y */
            { 1, 1, 0, 0, 8, 0, 7, 1 },        /* U */
            { 2, 1, 0, 0, 8, 0, 7, 1 },        /* V */
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    ......
}

3.3.3 Pixel format yuv444p

yuv444p has three components, Y, U, and V, stored in plane 0, plane 1, and plane 2 respectively.

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    ......
    [AV_PIX_FMT_YUV444P] = {
        .name = "yuv444p",
        .nb_components = 3,    // three components in total: Y, U, V
        .log2_chroma_w = 0,    // horizontal subsampling factor is 1, pow(2, 0)
        .log2_chroma_h = 0,    // vertical subsampling factor is 1, pow(2, 0)
        .comp = {
            { 0, 1, 0, 0, 8, 0, 7, 1 },        /* Y */
            { 1, 1, 0, 0, 8, 0, 7, 1 },        /* U */
            { 2, 1, 0, 0, 8, 0, 7, 1 },        /* V */
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    ......
}

3.3.4 Pixel formats nv12 and nv21

nv12 has three components, Y, U, and V. The Y component is stored in plane 0, and U and V are interleaved in plane 1.

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    ......
    [AV_PIX_FMT_NV12] = {
        .name = "nv12",
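        .nb_components = 3,    // three components in total: Y, U, V
        .log2_chroma_w = 1,    // horizontal subsampling factor is 2, pow(2, 1)
        .log2_chroma_h = 1,    // vertical subsampling factor is 2, pow(2, 1)
        .comp = {
            {                  // Y component
              0,               // plane:  plane 0, YYYYYYYY...
              1,               // step:   two Y samples are 1 byte apart
              0,               // offset: 0
              0,               // shift:  0
              8,               // depth:  8 bits
              0, 7, 1 },
            {                  // U component
              1,               // plane:  plane 1, UVUVUVUV...
              2,               // step:   two U samples are 2 bytes apart
              0,               // offset: U comes first, V second
              0,               // shift:  0
              8,               // depth:  8 bits
              1, 7, 1 },
            {                  // V component
              1,               // plane:  plane 1, UVUVUVUV...
              2,               // step:   two V samples are 2 bytes apart
              1,               // offset: U comes first, so 1 byte of U precedes the first V
              0,               // shift:  0
              8,               // depth:  8 bits
              1, 7, 2 },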
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    ......
}

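nv21 differs from nv12 in only one respect: the order of U and V in plane 1. Comparing the two definitions below, only the .comp.offset values (and the deprecated offset_plus1 values derived from them) differ: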

    [AV_PIX_FMT_NV12] = {
        .name = "nv12",
        .nb_components = 3,
        .log2_chroma_w = 1,
        .log2_chroma_h = 1,
        .comp = {
            { 0, 1, 0, 0, 8, 0, 7, 1 },        /* Y */
            { 1, 2, 0, 0, 8, 1, 7, 1 },        /* U */
            { 1, 2, 1, 0, 8, 1, 7, 2 },        /* V */
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    [AV_PIX_FMT_NV21] = {
        .name = "nv21",
        .nb_components = 3,
        .log2_chroma_w = 1,
        .log2_chroma_h = 1,
        .comp = {
            { 0, 1, 0, 0, 8, 0, 7, 1 },        /* Y */
            { 1, 2, 1, 0, 8, 1, 7, 2 },        /* U */
            { 1, 2, 0, 0, 8, 1, 7, 1 },        /* V */
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
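Since only the offsets differ, a single routine can separate the interleaved chroma plane of either format into U and V rows, which is the core of converting nv12 or nv21 chroma to the yuv420p layout. The function below is a sketch with a hypothetical name; it hard-codes the step of 2 bytes from the descriptors above and takes the offset of the U component as a parameter.

```c
#include <stdint.h>
#include <stddef.h>

/* Split one row of an interleaved 8-bit chroma plane into separate U and
 * V rows. u_offset is the .offset of the U component: 0 for nv12, 1 for
 * nv21. The V offset is then 1 - u_offset, and both components have a
 * step of 2 bytes. Hypothetical helper for illustration only. */
void split_chroma_row(const uint8_t *uv, size_t chroma_w, int u_offset,
                      uint8_t *u_out, uint8_t *v_out)
{
    size_t x;
    for (x = 0; x < chroma_w; x++) {
        u_out[x] = uv[2 * x + (size_t)u_offset];
        v_out[x] = uv[2 * x + (size_t)(1 - u_offset)];
    }
}
```

Given the row bytes 10, 20, 11, 21, treating them as nv12 yields U = 10, 11 and V = 20, 21; treating the same bytes as nv21 swaps the two results.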

3.3.5 Pixel format p010le

p010le is similar to nv12, except that its bit depth is 10.

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    ......
    [AV_PIX_FMT_P010LE] = {
        .name = "p010le",
        .nb_components = 3,    // three components in total: Y, U, V
        .log2_chroma_w = 1,    // horizontal subsampling factor is 2, pow(2, 1)
        .log2_chroma_h = 1,    // vertical subsampling factor is 2, pow(2, 1)
        .comp = {
            {                  // Y component
              0,               // plane:  plane 0, YYYYYYYY...
              2,               // step:   two Y samples are 2 bytes apart
              0,               // offset: 0
              6,               // shift:  10-bit data aligned to the high bits, low 6 bits unused
              10,              // depth:  10 bits
              1, 9, 1 },
            {                  // U component
              1,               // plane:  plane 1, UVUVUVUV...
              4,               // step:   two U samples are 4 bytes apart
              0,               // offset: U comes first, V second
              6,               // shift:  10-bit data aligned to the high bits, low 6 bits unused
              10,              // depth:  10 bits
              3, 9, 1 },
            {                  // V component
              1,               // plane:  plane 1, UVUVUVUV...
              4,               // step:   two V samples are 4 bytes apart
              2,               // offset: U comes first, so 2 bytes of U precede the first V
              6,               // shift:  10-bit data aligned to the high bits, low 6 bits unused
              10,              // depth:  10 bits
              3, 9, 3 },
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    ......
}

3.3.6 Pixel format yuv420p10le

yuv420p10le has three components, Y, U, and V, each stored in its own plane.

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    ......
    [AV_PIX_FMT_YUV420P10LE] = {
        .name = "yuv420p10le",
        .nb_components = 3,    // three components in total: Y, U, V
        .log2_chroma_w = 1,
        .log2_chroma_h = 1,
        .comp = {
            {                  // Y component
              0,               // plane:  plane 0, YYYYYYYY...
              2,               // step:   two Y samples are 2 bytes apart
              0,               // offset: 0
              0,               // shift:  10-bit data aligned to the low bits
              10,              // depth:  10 bits
              1, 9, 1 },
            {                  // U component
              1,               // plane:  plane 1, UUUUUUUU...
              2,               // step:   two U samples are 2 bytes apart
              0,               // offset: 0
              0,               // shift:  10-bit data aligned to the low bits
              10,              // depth:  10 bits
              1, 9, 1 },
            {                  // V component
              2,               // plane:  plane 2, VVVVVVVV...
              2,               // step:   two V samples are 2 bytes apart
              0,               // offset: 0
              0,               // shift:  10-bit data aligned to the low bits
              10,              // depth:  10 bits
              1, 9, 1 },
        },
        .flags = AV_PIX_FMT_FLAG_PLANAR,
    },
    ......
}
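The only per-sample difference between the 10-bit words of p010 and yuv420p10 is where the data sits inside the 16-bit word, so converting one sample is a single shift. The two helpers below are a sketch with hypothetical names; they operate on native uint16_t values, so byte order is assumed to have been handled when the word was loaded, and the plane rearrangement (interleaved UV versus separate U and V) is not covered.

```c
#include <stdint.h>

/* Convert one 16-bit sample word between the two 10-bit layouts.
 * p010:      value in the high 10 bits (shift = 6)
 * yuv420p10: value in the low 10 bits  (shift = 0)
 * Hypothetical helpers for illustration only. */
uint16_t p010_to_yuv420p10(uint16_t w)
{
    return (uint16_t)(w >> 6);
}

uint16_t yuv420p10_to_p010(uint16_t w)
{
    return (uint16_t)((w & 0x3FFu) << 6);   /* keep 10 bits, move them up */
}
```

The maximum 10-bit value 0x3FF is stored as 0xFFC0 in p010 and as 0x03FF in yuv420p10.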

3.3.7 Pixel format argb

argb has four components, A, R, G, and B, interleaved in plane 0, so argb is a packed storage layout.

static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
    ......
    [AV_PIX_FMT_ARGB] = {
        .name = "argb",
        .nb_components = 4,
        .log2_chroma_w = 0,
        .log2_chroma_h = 0,
        .comp = {
            { 0, 4, 1, 0, 8, 3, 7, 2 },        /* R */
            { 0, 4, 2, 0, 8, 3, 7, 3 },        /* G */
            { 0, 4, 3, 0, 8, 3, 7, 4 },        /* B */
            { 0, 4, 0, 0, 8, 3, 7, 1 },        /* A */
        },
        .flags = AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_ALPHA,
    },
    ......
}

Note that the .comp.offset values of the R, G, B, and A components are 1, 2, 3, and 0 respectively, which means the components are laid out in memory in the order ARGBARGBARGB...
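Those offsets, together with the step of 4 bytes, are all that is needed to address a packed argb pixel. The sketch below writes one pixel into a row; the enum constants and the function name are hypothetical and simply restate the descriptor values.

```c
#include <stdint.h>

/* Byte offsets of each component inside one 4-byte argb pixel, taken from
 * the .offset values of the descriptor, plus the common step of 4 bytes. */
enum { ARGB_OFF_A = 0, ARGB_OFF_R = 1, ARGB_OFF_G = 2, ARGB_OFF_B = 3, ARGB_STEP = 4 };

/* Write pixel number x of a packed argb row.
 * Hypothetical helper for illustration only. */
void argb_put(uint8_t *row, int x, uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    uint8_t *p = row + x * ARGB_STEP;   /* start of pixel x */
    p[ARGB_OFF_A] = a;
    p[ARGB_OFF_R] = r;
    p[ARGB_OFF_G] = g;
    p[ARGB_OFF_B] = b;
}
```

Writing pixel 1 with A = 0xFF, R = 0x11, G = 0x22, B = 0x33 fills bytes 4 to 7 of the row with FF 11 22 33 and leaves pixel 0 untouched.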

posted @ 2024-12-16 14:28 葉余
